Training

Learn and practice the linear regression model.

Implement training

Now we want to write code that implements the first part of linear regression. Given a bunch of examples (X and Y), it finds a line with weight w that approximates them. Can we think of a way to do that? Feel free to stop reading for a minute and think about it. It’s a fun problem to solve.

We might think that there is one simple way to find w by using math. After all, there must be some formula that takes a list of points and comes up with a line that approximates them. We could Google for that formula and maybe even find a library that implements it.

As it turns out, such a formula does indeed exist, but we wouldn’t use it because it would be a dead end. If we use a formula to approximate these points with a straight line, then we’ll get stuck later when we tackle datasets that require twisty model functions. We’d do better to look for a more generic solution that works for any model.

So much for the mathematician’s approach. Let’s look at a programmer’s approach instead.

How wrong are we?

Let’s discuss one strategy for finding the best line that approximates the examples. Imagine we have a function that takes the examples (X and Y) and a line’s weight (w), and measures the line’s error. The better the line approximates the examples, the lower the error. If we have such a function, we can use it to evaluate multiple lines until we find one with a low enough error.
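As a rough sketch of this search idea (all the names and the error measure here are my own illustration, not from the course), a function that tries several candidate weights and keeps the one with the lowest error could look like this:

```python
def best_line(X, Y, candidates, error):
    # Try each candidate weight and keep the one
    # whose error over the examples is lowest.
    return min(candidates, key=lambda w: error(X, Y, w))


# A toy error measure for the line y = x * w:
# the sum of absolute prediction errors over the examples.
def abs_error(X, Y, w):
    return sum(abs(x * w - y) for x, y in zip(X, Y))


X = [1.0, 2.0, 3.0]
Y = [2.1, 3.9, 6.0]
# Scan weights from 0.0 to 4.0 in steps of 0.1.
w = best_line(X, Y, [i / 10 for i in range(0, 41)], abs_error)
print(w)  # 2.0
```

This brute-force scan is only meant to make the idea concrete; it works for one weight but scales poorly, which is why we will soon want a smarter way to search.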

Except that ML programmers have another name for this function: instead of error, they call it the loss.

Here is how we can write a loss function. Assume that we have come up with a random value of w ...
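A minimal sketch of such a loss function, assuming the mean-squared-error form that is conventional for linear regression (the code and function names are my own illustration):

```python
import numpy as np


def predict(X, w):
    # The model: a line through the origin with slope w.
    return X * w


def loss(X, Y, w):
    # Mean squared error: the average of the squared differences
    # between the line's predictions and the ground truth.
    return np.average((predict(X, w) - Y) ** 2)


# Toy check: when Y is exactly 2 * X, the line with w = 2 has zero loss.
X = np.array([1.0, 2.0, 3.0])
Y = np.array([2.0, 4.0, 6.0])
print(loss(X, Y, 2.0))  # 0.0
print(loss(X, Y, 1.0))  # average of (1, 4, 9), about 4.67
```

Squaring the differences keeps the loss positive and punishes large errors more than small ones, which is one common reason this form is preferred over a plain absolute difference.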
