Welcome to Advanced Linear Models for Data Science Class 2: Statistical Linear Models. This class introduces least squares from a linear algebraic and mathematical perspective. Before beginning the class, make sure that you have the following:
– A basic understanding of linear algebra and multivariate calculus.
– A basic understanding of statistics and regression models.
– At least a little familiarity with proof-based mathematics.
– Basic knowledge of the R programming language.
After taking this course, students will have a firm foundation in a linear algebraic treatment of regression modeling. This will greatly augment applied data scientists’ general understanding of regression models.
Introduction and expected values
In this module, we cover the basics of the course as well as the prerequisites. We then cover the basics of expected values for multivariate vectors. We conclude with the moment properties of the ordinary least squares estimates.
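As a brief illustration of the moment properties mentioned above, the following sketch assumes the usual linear model notation, which is not spelled out in this description: a response vector $Y$, a full column rank design matrix $X$, coefficients $\beta$, and errors $\varepsilon$ with mean zero and variance $\sigma^2 I$. It relies only on the standard rules $E[AY] = A\,E[Y]$ and $\mathrm{Var}(AY) = A\,\mathrm{Var}(Y)\,A^\top$ for a fixed matrix $A$.

```latex
\begin{align*}
Y &= X\beta + \varepsilon, \qquad E[\varepsilon] = 0, \qquad \mathrm{Var}(\varepsilon) = \sigma^2 I, \\
\hat\beta &= (X^\top X)^{-1} X^\top Y, \\
E[\hat\beta] &= (X^\top X)^{-1} X^\top E[Y] = (X^\top X)^{-1} X^\top X \beta = \beta, \\
\mathrm{Var}(\hat\beta) &= (X^\top X)^{-1} X^\top \left(\sigma^2 I\right) X (X^\top X)^{-1} = \sigma^2 (X^\top X)^{-1}.
\end{align*}
```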
The multivariate normal distribution
In this module, we build up the multivariate and singular normal distributions by starting with iid normals.
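The construction from iid normals can be sketched in a few lines. The example below is a minimal illustration rather than course material: it uses Python's standard library only, and the names `cholesky`, `sample_mvn`, `mu`, and `sigma`, as well as the numeric values, are assumptions chosen for the example. It draws iid standard normals `z` and forms `mu + A z`, where `A` is a lower-triangular factor with `A Aᵀ = sigma`, so the result has mean `mu` and covariance `sigma`.

```python
import math
import random


def cholesky(sigma):
    """Return lower-triangular A with A A^T = sigma (sigma must be positive definite)."""
    n = len(sigma)
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(a[i][k] * a[j][k] for k in range(j))
            if i == j:
                a[i][j] = math.sqrt(sigma[i][i] - s)
            else:
                a[i][j] = (sigma[i][j] - s) / a[j][j]
    return a


def sample_mvn(mu, a, rng):
    """Draw one vector mu + A z, where z holds iid standard normals."""
    z = [rng.gauss(0.0, 1.0) for _ in mu]
    return [m + sum(a[i][k] * z[k] for k in range(len(z))) for i, m in enumerate(mu)]


# Hypothetical mean vector and covariance matrix, chosen only for illustration.
mu = [1.0, -2.0]
sigma = [[4.0, 1.2], [1.2, 1.0]]

a = cholesky(sigma)
rng = random.Random(0)  # fixed seed so repeated runs give the same draws
draws = [sample_mvn(mu, a, rng) for _ in range(50000)]

# Empirical mean and covariance, for comparison with mu and sigma.
n = len(draws)
mean = [sum(d[i] for d in draws) / n for i in range(2)]
cov = [
    [sum((d[i] - mean[i]) * (d[j] - mean[j]) for d in draws) / (n - 1) for j in range(2)]
    for i in range(2)
]

print("sample mean:", mean)
print("sample covariance:", cov)
```

The same recipe with a rank-deficient matrix in place of `A` yields a singular normal distribution, although the Cholesky routine above would not be the way to obtain such a factor.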
Distributional results
In this module, we build the basic distributional results that we see in multivariable regression.
Residuals
In this module, we revisit residuals and consider their distributional results. We also consider the so-called PRESS residuals and show how they can be calculated without re-fitting the model.
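To make the claim about PRESS residuals concrete, here is a minimal sketch, not taken from the course, for the special case of simple linear regression with an intercept. It uses the standard identity that the leave-one-out prediction error equals the ordinary residual divided by one minus the leverage, and checks it against brute-force refitting. The function names and the small data set are hypothetical, and plain Python is used so the example has no dependencies.

```python
def fit_line(x, y):
    """Least squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1


def press_residuals(x, y):
    """PRESS residuals from a single fit: e_i / (1 - h_ii)."""
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b0, b1 = fit_line(x, y)
    out = []
    for xi, yi in zip(x, y):
        e = yi - (b0 + b1 * xi)                  # ordinary residual
        h = 1.0 / n + (xi - xbar) ** 2 / sxx     # leverage in simple regression
        out.append(e / (1.0 - h))
    return out


def press_residuals_by_refitting(x, y):
    """Same quantities computed the slow way: drop point i, refit, predict y_i."""
    out = []
    for i in range(len(x)):
        b0, b1 = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        out.append(y[i] - (b0 + b1 * x[i]))
    return out


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 6.3]

shortcut = press_residuals(x, y)
brute = press_residuals_by_refitting(x, y)
press_statistic = sum(r * r for r in shortcut)

print("largest disagreement:", max(abs(s - b) for s, b in zip(shortcut, brute)))
print("PRESS statistic:", press_statistic)
```

In the general multivariable setting the leverage `h` is the corresponding diagonal entry of the hat matrix; the closed form used here applies only to the one-predictor case.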