Linear Regression
Learn to fit a function to the available data through linear regression.
Function approximation
Approximating a function means estimating the values of its parameters. Consider the SSE function we discussed in the previous lesson.
Approximating SSE means estimating the parameter vector that nearly satisfies the linear system. This estimate is also called the linear least-squares solution.
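As a minimal sketch of the idea above, the snippet below estimates a parameter vector that nearly satisfies an overdetermined linear system and reports the resulting SSE. The data values and the names `A`, `w`, and `y` are assumptions made for illustration; they are not taken from the previous lesson.

```python
import numpy as np

# Hypothetical data, chosen only for illustration (not from the lesson).
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 4.0, 8.0])

# Design matrix for the assumed model y ≈ w[0] + w[1] * x:
# a column of ones (intercept) next to a column of x values (slope).
A = np.column_stack([np.ones_like(x), x])

# Least-squares solution of the overdetermined system A w = y.
w, _, _, _ = np.linalg.lstsq(A, y, rcond=None)

# Sum of squared errors for the estimated parameters.
sse = float(np.sum((A @ w - y) ** 2))
print("w =", w)
print("SSE =", sse)
```

No vector satisfies all four equations here exactly, so the SSE stays above zero; the least-squares estimate is the one that makes it as small as possible.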
Formal definition
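Consider a data set, $D = \{(x_i, y_i)\}_{i=1}^{n}$, where each entry is a pair, $x_i$ and $y_i$, of objects (scalars, vectors, matrices, and so on). Function approximation seeks a function, $f$, such that:
$$f(x_i) \approx y_i \quad \text{for every } i = 1, \dots, n$$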
Example
Let $D = \{(-3, 9), (4, 1)\}$. The function $f(x) = -\frac{8}{7}x + \frac{39}{7}$ represents a line passing through the two data points in the plane. However, infinitely many non-linear curves pass through the same data points. A few of these curves are shown in the figure below.
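The line through the two points plotted in this lesson, (-3, 9) and (4, 1), can be worked out directly. The sketch below uses exact fractions to avoid rounding; the names `slope`, `intercept`, and `line` are illustrative choices.

```python
from fractions import Fraction

# The two data points used in the lesson's plot.
(x1, y1), (x2, y2) = (-3, 9), (4, 1)

# Slope is rise over run between the two points.
slope = Fraction(y2 - y1, x2 - x1)

# Solve y1 = slope * x1 + intercept for the intercept.
intercept = y1 - slope * x1

def line(x):
    """Evaluate the line through the two data points at x."""
    return slope * x + intercept

print("slope =", slope, "intercept =", intercept)
```

Evaluating `line(-3)` and `line(4)` gives back 9 and 1, which confirms that this line fits both data points exactly.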
In the figure, the green points are from the data set. A red line and two different curves, one black and one blue, pass through the data points. As we can see, all three functions (the red line, the blue curve, and the black curve) fit the data exactly. In this case, we can find the exact function(s) that fit the data. When an exact fit is hard to find, we may rely on an approximate fit, that is, a curve that passes near the data points.
import matplotlib.pyplot as plt

plt.scatter([-3, 4], [9, 1], c='g', linewidths=10)
# Red line
plt.plot([-3, 4], [9, 1], 'r')
# Blue curve
plt.plot([-5, -3, -2, 1, 2, 3, 3.5, 4, 7], [11, 9, 4, 5, -1, 5, 4, 1, 6], 'b')
# Black curve
plt.plot([-5, -3, -2, 1, 2, 3, 3.5, 4, 7], [2, 9, -4, 5, 2, 10, -4, 1, -1], 'k')
Note: The term “approximation” may become more relevant when considering several data points!
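To illustrate the note, the sketch below adds three extra points to the two points from the plot and compares two lines: the one passing exactly through (-3, 9) and (4, 1), and a least-squares line over all five points. The extra points are hypothetical values assumed for illustration, and `sse` is a helper introduced here.

```python
import numpy as np

# The lesson's two points followed by three hypothetical extra points.
x = np.array([-3.0, 4.0, -1.0, 1.0, 2.0])
y = np.array([9.0, 1.0, 5.0, 5.0, 2.0])

def sse(slope, intercept):
    """Sum of squared errors of the line slope * x + intercept on the data."""
    return float(np.sum((slope * x + intercept - y) ** 2))

# Line that passes exactly through (-3, 9) and (4, 1).
sse_two_point_line = sse(-8 / 7, 39 / 7)

# Least-squares line over all five points (degree-1 polynomial fit);
# np.polyfit returns coefficients from highest degree to lowest.
slope, intercept = np.polyfit(x, y, 1)
sse_least_squares = sse(slope, intercept)

print("SSE of the two-point line:", sse_two_point_line)
print("SSE of the least-squares line:", sse_least_squares)
```

Because the five points do not lie on a single line, neither SSE is zero. The least-squares line only comes near the points, which is what an approximate fit means here.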
Regression vs. classification
In the dataset ...