Training
Learn and practice the linear regression model.
Implement training
Now we want to write code that implements the first part of linear regression: given a bunch of examples (X and Y), it finds a line with weight w that approximates them. Can we think of a way to do that? Feel free to stop reading for a minute and think about it. It’s a fun problem to solve.
We might think that there is one simple way to find w: by using math. After all, there must be some formula that takes a list of points and comes up with a line that approximates them. We could Google for that formula, and maybe even find a library that implements it.
As it turns out, such a formula does indeed exist, but we won’t use it, because it would be a dead end. If we use a formula to approximate these points with a straight line, then we’ll get stuck later, when we tackle datasets that require twisty model functions. We’d do better to look for a more generic solution that works for any model.
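For the curious, that closed-form approach is what libraries like NumPy implement. Here is a minimal sketch using `np.polyfit`; the data values are made up for illustration:

```python
import numpy as np

# Hypothetical example data: inputs X and observed outputs Y,
# generated roughly along the line y = 2x.
X = np.array([1, 2, 3, 4, 5], dtype=float)
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1], dtype=float)

# np.polyfit solves the least-squares fit in one shot.
# Degree 1 means "fit a straight line"; it returns [slope, intercept].
slope, intercept = np.polyfit(X, Y, 1)
```

This gives a slope close to 2, as expected. It works, but only for straight lines, which is exactly why we’re about to take a different road.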
So much for the mathematician’s approach. Let’s look at a programmer’s approach instead.
How wrong are we?
Let’s discuss one strategy to find the best line that approximates the examples. Imagine that we have a function that takes the examples (X and Y) and a line’s weight w, and measures the line’s error: the better the line approximates the examples, the lower the error. If we had such a function, we could use it to evaluate multiple lines until we find one with a low enough error.
Except that, instead of error, ML programmers have another name for this function: they call it the loss.
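As a preview, here is a minimal sketch of what such a loss function could look like. It assumes a line through the origin (ŷ = x·w) and measures error as the mean squared error, i.e., the average of the squared differences between predictions and actual values; both choices are assumptions for illustration:

```python
import numpy as np

def predict(X, w):
    # Assumed model: a line through the origin with slope w.
    return X * w

def loss(X, Y, w):
    # Mean squared error: average squared distance between
    # the line's predictions and the actual values.
    # Squaring keeps errors positive and punishes big misses more.
    return np.average((predict(X, w) - Y) ** 2)
```

The closer w gets to the true slope of the data, the lower the value this function returns; a perfect fit yields a loss of zero.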
Here is how we can write a loss function. Assume that we have come up with a random value of w...