Generalized Linear Models
Learn about generalized linear models for regression and classification through interactive code.
Generalized linear model for regression
A regression model that is linear in the parameters $\mathbf{w}$, but not necessarily linear in the input features $\mathbf{x}$, is known as a generalized linear model for regression.
Note: A generalized linear model is linear in the transformed features $\phi(\mathbf{x})$ and is typically nonlinear in the input features $\mathbf{x}$.
Nonlinear transformations in regression
Suppose we want to predict the total marks a student will obtain in an exam based on the number of hours they studied and the number of times they attended class. We collected data on 20 students and recorded the number of hours they studied, the number of times they attended class, and their obtained marks.
To build a linear regression model, we can start with a simple linear equation of the form:

$$y = w_0 + w_1 x_1 + w_2 x_2$$

Here, $y$ is the obtained marks, $x_1$ is the number of hours studied, and $x_2$ is the number of times attended class.
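As a minimal sketch of fitting the simple linear equation above, the snippet below uses NumPy's least-squares solver. The 20 data points are synthetic, the generating weights (20, 4, 1) are made up for illustration, and the helper name `fit_linear` is hypothetical.

```python
import numpy as np

def fit_linear(X, y):
    """Fit y = w0 + w1*x1 + w2*x2 by ordinary least squares."""
    # Prepend a column of ones so that w0 acts as the intercept.
    A = np.column_stack([np.ones(len(X)), X])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

# Synthetic data for 20 students (assumed values, not real measurements).
rng = np.random.default_rng(0)
hours = rng.uniform(0, 10, size=20)        # x1: hours studied
attendance = rng.integers(0, 30, size=20)  # x2: classes attended
X = np.column_stack([hours, attendance])
y = 20 + 4 * hours + 1 * attendance        # noise-free marks

w = fit_linear(X, y)
print(np.round(w, 3))  # estimated [w0, w1, w2]
```

Because the synthetic marks contain no noise, the estimated weights should match the generating ones up to floating-point error. With real data they would only approximate the underlying relationship.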
However, we might find that the relationship between the input features and the marks obtained isn’t quite linear. For example, the effect of the number of times attended class on obtained marks might depend on the number of hours studied. In this case, we can use a generalized linear model for regression.
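One way to do this is to transform the input features using a set of basis functions $\phi_j(\mathbf{x})$. For example, to capture the interaction described above, we could use basis functions such as:

$$\phi_1(\mathbf{x}) = x_1, \quad \phi_2(\mathbf{x}) = x_2, \quad \phi_3(\mathbf{x}) = x_1 x_2$$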
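Then, we can use a linear model in the transformed feature space:

$$y = w_0 + \sum_{j=1}^{M} w_j \phi_j(\mathbf{x})$$

Here, $M$ is the number of basis functions. Note that this model is linear in the parameters $\mathbf{w}$, but might be nonlinear in the input features $\mathbf{x}$ due to the nonlinear basis functions.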
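The distinction between linearity in the parameters and nonlinearity in the inputs can be checked numerically. The sketch below assumes one possible basis (the two raw features plus their product) and arbitrary illustrative parameter values. The names `phi` and `predict` are hypothetical.

```python
import numpy as np

def phi(x1, x2):
    """Assumed basis functions: x1, x2, and the interaction x1*x2."""
    return np.array([1.0, x1, x2, x1 * x2])  # leading 1 pairs with w0

def predict(w, x1, x2):
    """Linear model in the transformed feature space: w . phi(x)."""
    return float(np.dot(w, phi(x1, x2)))

w = np.array([1.0, 2.0, 3.0, 0.5])  # arbitrary illustrative parameters

# Linear in the parameters: scaling w by 2 scales the prediction by 2.
base = predict(w, 4.0, 6.0)
scaled_w = predict(2 * w, 4.0, 6.0)
print(scaled_w == 2 * base)

# Nonlinear in the inputs: the effect of attending one more class
# depends on how many hours were studied.
effect_at_2h = predict(w, 2.0, 7.0) - predict(w, 2.0, 6.0)
effect_at_8h = predict(w, 8.0, 7.0) - predict(w, 8.0, 6.0)
print(effect_at_2h, effect_at_8h)
```

In a model that is linear in the inputs, the two effects would be identical. Here they differ because of the product term.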
Using this model, we can fit the parameters to the training data using a method such as least squares regression. Then, we can use the model to predict the marks of a new student based on the number of hours they studied and the number of times they attended class.
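The fit-then-predict workflow described above might look like the following sketch. It assumes a basis made of the two raw features and their product, uses synthetic noise-free data for 20 students, and relies on made-up generating parameters. The helper names `design_matrix`, `fit`, and `predict` are hypothetical.

```python
import numpy as np

def design_matrix(X):
    """Map raw features (x1, x2) to [1, x1, x2, x1*x2] (assumed basis)."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones(len(X)), x1, x2, x1 * x2])

def fit(X, y):
    """Least-squares estimate of the parameters in the transformed space."""
    w, *_ = np.linalg.lstsq(design_matrix(X), y, rcond=None)
    return w

def predict(w, X):
    """Evaluate the fitted model on raw input features."""
    return design_matrix(X) @ w

# Synthetic training data for 20 students (assumed, noise-free).
rng = np.random.default_rng(1)
X_train = np.column_stack([
    rng.uniform(0, 10, size=20),   # hours studied
    rng.integers(0, 30, size=20),  # classes attended
])
true_w = np.array([20.0, 2.0, 0.5, 0.1])  # made-up generating parameters
y_train = design_matrix(X_train) @ true_w

w = fit(X_train, y_train)

# Predict the marks of a new student: 6 hours studied, 25 classes attended.
new_student = np.array([[6.0, 25.0]])
print(predict(w, new_student))
```

The fitting step is ordinary linear least squares even though the model is nonlinear in the inputs, because the nonlinearity is confined to `design_matrix`.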
Generalized linear model for classification
A classification model that has a linear classification boundary in the transformed features is known as a generalized linear model for classification.
Note: A function that separates data points of different classes is referred to as a classification boundary. If a classification problem has only two classes, it’s known as a binary classification problem.
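To illustrate a boundary that is linear in the transformed features but not in the inputs, the sketch below squares each input feature and applies a linear rule to the result. The transformation, the hand-picked weights, and the names `phi` and `classify` are assumptions for illustration. Nothing here is learned from data.

```python
def phi(x1, x2):
    """Assumed nonlinear transformation: square each input feature."""
    return (x1 ** 2, x2 ** 2)

def classify(x1, x2, w=(1.0, 1.0), w0=-1.0):
    """Binary classifier with boundary w . phi(x) + w0 = 0.

    With the default weights, the boundary is a straight line in the
    transformed features and the unit circle in the input features.
    """
    z1, z2 = phi(x1, x2)
    score = w[0] * z1 + w[1] * z2 + w0
    return 1 if score > 0 else 0

print(classify(0.2, 0.3))   # point inside the unit circle
print(classify(1.5, -0.5))  # point outside the unit circle
```

Points inside the circle receive class 0 and points outside receive class 1, so two classes that no straight line in the input space could separate become linearly separable after the transformation.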