Learn how to use regression models, the most important statistical analysis tool in the data scientist’s toolkit. This is the seventh course in the Johns Hopkins Data Science Specialization.
Linear models, as their name implies, relate an outcome to a set of predictors of interest using linear assumptions. Regression models, a subset of linear models, are the most important statistical analysis tool in a data scientist’s toolkit. This course covers regression analysis, least squares, and inference using regression models. Special cases of the regression model, ANOVA and ANCOVA, will be covered as well, and the course will examine residuals and variability. It will also cover modern thinking on model selection and novel uses of regression models, including scatterplot smoothing.
In this course, students will learn how to fit regression models, how to interpret coefficients, and how to investigate residuals and variability. Students will also learn special cases of regression models, including the use of dummy variables and multivariable adjustment. Extensions to generalized linear models, especially Poisson and logistic regression, will be reviewed.
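To give a sense of what fitting a regression model, interpreting its coefficients, and investigating residuals can look like in practice, here is a minimal sketch of simple least squares with one predictor. It is written in plain Python for illustration only. The data values and the helper name `fit_simple_ols` are made up for this example and are not taken from the course materials.

```python
# Illustrative data only (hypothetical values, not from the course).
x = [1.0, 2.0, 3.0, 4.0, 5.0]   # predictor
y = [2.1, 3.9, 6.2, 7.8, 10.1]  # outcome


def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least squares line for one predictor.

    Hypothetical helper name, used here for illustration only.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sum of squares about the means.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = fit_simple_ols(x, y)

# Fitted values and residuals (observed minus fitted).
fitted = [intercept + slope * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
print("residuals:", [round(r, 3) for r in residuals])
```

The slope is read as the estimated change in the outcome for a one-unit increase in the predictor, and the intercept as the estimated outcome when the predictor is zero. With an intercept in the model, the least squares residuals sum to zero up to rounding error, which makes a quick sanity check on a fit.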
There will be weekly video lectures, quizzes, and peer assessments.
As part of this class, you will be required to set up a GitHub account. GitHub is a tool for collaborative code sharing and editing. During this course and other courses in the Specialization, you will submit links to files that you place publicly in your GitHub account as part of peer evaluation. If you are concerned about preserving your anonymity, you will need to set up an anonymous GitHub account and take care not to include any information that you do not want made available to peer evaluators.