(This post is part of my 20 Minutes of Reading Every Day project. My current book of choice is An Introduction to Statistical Learning, and today I’m continuing with Chapter 3: Linear Regression.)

Linear regression is one of the simplest approaches to supervised learning. It's also a major feature of the CFA Level 2 material, so it's something with which I'm rather familiar. From my reading so far, though, I think this book does a much better job of explaining linear regression, especially the intuition, than the CFA text. I wish I'd come across this book first!

The book also makes a point that despite its simplicity, having a good understanding of linear regression is very important. Many of the fancier approaches that we’ll see in the later chapters are generalizations or extensions of the ideas of linear regression.

In particular, the previous chapter pointed out that the assumption of linearity brings better interpretability. For example, in the context of the book's Advertising data set, it's relatively straightforward to answer questions such as the following:

- Is there a relationship between the predictors and the response?
- How strong is the relationship, if it does exist?
- Which predictor contributes the most to the response?
- How accurate is our estimation of the effect of each predictor?
- How accurately can we predict future responses?
- Is the relationship linear?
- Is there synergy among the advertising media?

### Simple Linear Regression

This is a very simple case of one predictor and one response, i.e.:

Y ≈ ß0 + ß1X

(Ha! I just found out that in Mac OS you can type ≈ using Option + x, and ß using Option + s.)

Here, ß0 is the intercept and ß1 is the slope of the line. Carrying out the linear regression means estimating both, i.e. we come up with estimated values of ß0 and ß1. Our goal, of course, is to come up with estimates that produce a line that matches the data points as closely as possible.

We’ve seen two measures of closeness so far: MSE (Mean Squared Error) for regression, and the Error Rate for classification. For simple linear regression, we use a measure called the Residual Sum of Squares (RSS): the sum of the squared differences between each observed response and the value the fitted line predicts for it.
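To make this concrete, here's a minimal sketch of fitting a simple linear regression and computing the RSS, using NumPy and simulated data (the data set and the true coefficients 2.0 and 3.0 are my own made-up example, not from the book). The slope and intercept formulas are the standard closed-form least-squares estimates:

```python
import numpy as np

# Simulated toy data: one predictor X, one response Y with known
# "true" intercept 2.0 and slope 3.0 plus some Gaussian noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=50)
Y = 2.0 + 3.0 * X + rng.normal(0, 1, size=50)

# Closed-form least-squares estimates for simple linear regression:
#   slope     = sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
#   intercept = mean(y) - slope * mean(x)
x_bar, y_bar = X.mean(), Y.mean()
beta1_hat = np.sum((X - x_bar) * (Y - y_bar)) / np.sum((X - x_bar) ** 2)
beta0_hat = y_bar - beta1_hat * x_bar

# RSS: sum of squared residuals (observed minus fitted values).
residuals = Y - (beta0_hat + beta1_hat * X)
rss = np.sum(residuals ** 2)
print(beta0_hat, beta1_hat, rss)
```

With this setup, the estimated intercept and slope should land close to the true values of 2.0 and 3.0, and the least-squares line is, by construction, the line that makes the RSS as small as possible.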