Implementation of polynomial regression

The following steps demonstrate the process of training and visualizing linear and polynomial regression models using the provided dataset.

Step 1 - Importing the libraries

In the first step, we import the necessary libraries.

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

Step 2 - Importing the dataset

After importing libraries, we load the dataset from a CSV file.

dataset = pd.read_csv('Data.csv')
x = dataset.iloc[:, 1:2].values
y = dataset.iloc[:, 2].values

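Here, we use the pandas iloc indexer to select columns by position: x receives the feature column (index 1) and y receives the target column (index 2). The slice 1:2 keeps x two-dimensional, which scikit-learn expects for features, while the single index 2 returns y as a one-dimensional array.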
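The shapes produced by these two selections can be checked with a small sketch. Data.csv is not reproduced here, so the DataFrame below is a made-up stand-in; its column names and values are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical stand-in for Data.csv; column names and values are made up.
demo = pd.DataFrame({
    "Position": ["Analyst", "Manager", "Director"],
    "Level": [1, 2, 3],
    "Salary": [45000, 60000, 90000],
})

x_demo = demo.iloc[:, 1:2].values  # slice 1:2 keeps the column axis, giving a 2-D array
y_demo = demo.iloc[:, 2].values    # a single index drops the column axis, giving a 1-D array

print(x_demo.shape)  # (3, 1)
print(y_demo.shape)  # (3,)
```

Writing dataset.iloc[:, 1] instead of dataset.iloc[:, 1:2] would give a one-dimensional x, which scikit-learn estimators reject as a feature matrix.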

Step 3 - Training the linear regression model

In this step, we train the linear regression model on the entire dataset.

from sklearn.linear_model import LinearRegression
reg = LinearRegression()
reg.fit(x,y)

Step 4 - Training the polynomial regression model

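Here we train the polynomial regression model on the entire dataset. Polynomial regression is ordinary linear regression fitted on polynomial features: PolynomialFeatures expands x into its powers up to the chosen degree, and a second LinearRegression model is trained on the expanded matrix.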

from sklearn.preprocessing import PolynomialFeatures
poly = PolynomialFeatures(degree = 4)
x_poly = poly.fit_transform(x)
reg2 = LinearRegression()
reg2.fit(x_poly,y)
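To see what the transformer passes to LinearRegression, here is a minimal sketch on two made-up input values. The names demo_x, demo_poly, and expanded are illustrative and are not part of the original script.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Two made-up feature values, shaped as a column (2 samples, 1 feature).
demo_x = np.array([[2.0], [3.0]])

# With the default include_bias=True, the output columns are 1, x, x^2, x^3, x^4.
demo_poly = PolynomialFeatures(degree=4)
expanded = demo_poly.fit_transform(demo_x)

print(expanded)
# [[ 1.  2.  4.  8. 16.]
#  [ 1.  3.  9. 27. 81.]]
```

Each row of the expanded matrix is then treated as an ordinary set of features, so the fitted model is linear in its coefficients even though the resulting curve is not linear in x.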

Step 5 - The visualization of linear regression results

After training the model, we visualize the linear regression results by creating a scatter plot of the actual data points and then plotting the regression line using reg.predict(x) to predict y based on x.

plt.scatter(x, y, color= 'cadetblue')
plt.plot(x, reg.predict(x), color = 'gray')
plt.title('Linear regression')
plt.xlabel('Position level')
plt.ylabel('Salary')
plt.show()

Step 6 - The visualization of polynomial regression results

Here, we plot the polynomial regression results on a finer grid. Instead of predicting only at the original data points, we generate values between the minimum and maximum of x in steps of 0.1, so that the plotted curve is smooth rather than a series of straight segments.

visual = np.arange(x.min(), x.max(), 0.1)
visual = visual.reshape((len(visual), 1))
plt.scatter(x, y, color= 'cadetblue')
plt.plot(visual, reg2.predict(poly.transform(visual)), color = 'gray')
plt.title('Polynomial regression')
plt.xlabel('Position level')
plt.ylabel('Salary')
plt.show()

Step 7 - A new result prediction with linear regression

In this step, we use the reg.predict() method of the trained linear regression model to predict the salary for a new position level of 6.5. The value is passed as [[6.5]] because predict() expects a two-dimensional array with one row per sample.

reg.predict([[6.5]])

Step 8 - A new result prediction with polynomial regression

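Similarly, we predict the salary for a position level of 6.5 with the trained polynomial regression model. The new value must first be expanded into the same polynomial features that were used for training, which is done with the poly object fitted earlier.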

reg2.predict(poly.transform([[6.5]]))
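Expanding the features by hand before every prediction is easy to forget. One possible alternative, which the original script does not use, is to chain the two steps with scikit-learn's make_pipeline. The sketch below runs on synthetic data (salary set to exactly 1000 times the squared level, an assumption made only so that the example is self-contained), not on Data.csv.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in for the dataset: levels 1..10, salary = 1000 * level^2 (made up).
levels = np.arange(1, 11, dtype=float).reshape(-1, 1)
salaries = 1000.0 * levels.ravel() ** 2

# The pipeline expands the features and fits the linear model in one object.
model = make_pipeline(PolynomialFeatures(degree=4), LinearRegression())
model.fit(levels, salaries)

# predict() applies the polynomial expansion internally before the linear model.
prediction = model.predict([[6.5]])
print(prediction)
```

With this design, the raw value [[6.5]] goes straight into predict(), and the same pipeline object could also be used to draw the smooth curve in Step 6.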

Code

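The complete script, polynomial_regression.py, is shown below. It expects Data.csv in the working directory and an existing output directory for the saved figures.
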
# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# Importing the dataset
dataset = pd.read_csv('Data.csv')
x = dataset.iloc[:, 1:2].values
y = dataset.iloc[:, 2].values
# Fitting linear regression to the dataset
from sklearn.linear_model import LinearRegression
reg = LinearRegression()
reg.fit(x,y)
# Fitting polynomial regression to the dataset
from sklearn.preprocessing import PolynomialFeatures
poly = PolynomialFeatures(degree = 4)
x_poly = poly.fit_transform(x)
reg2 = LinearRegression()
reg2.fit(x_poly,y)
# Visualising the linear regression results
plt.scatter(x, y, color= 'cadetblue')
plt.plot(x, reg.predict(x), color = 'gray')
plt.title('Linear regression')
plt.xlabel('Position level')
plt.ylabel('Salary')
plt.savefig('output/linear.png')
plt.show()
plt.clf()
# Visualising the polynomial regression results
visual = np.arange(x.min(), x.max(), 0.1)
visual = visual.reshape((len(visual), 1))
plt.scatter(x, y, color= 'cadetblue')
plt.plot(visual, reg2.predict(poly.transform(visual)), color = 'gray')
plt.title('Polynomial regression')
plt.xlabel('Position level')
plt.ylabel('Salary')
plt.savefig('output/polynomial.png')
plt.show()
# Predicting a new result with linear regression
print(reg.predict([[6.5]]))
# Predicting a new result with polynomial regression
print(reg2.predict(poly.transform([[6.5]])))

Polynomial regression with varying degrees

We create four different polynomial regression models with increasing complexity by modifying the degree parameter in the PolynomialFeatures constructor to 2, 3, 4, and 5. By comparing the results of these models, we can evaluate how different degrees of polynomials capture the patterns in the data.

As the degree of the polynomial increases, the model becomes more flexible and can capture more complex relationships between the independent and dependent variables. However, a higher degree can also lead to overfitting, where the model follows the noise in the training data too closely and performs poorly on new, unseen data.
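The comparison across degrees can be sketched as a loop over the degree parameter. The data below is synthetic (a cubic trend plus random noise, generated with a fixed seed), since Data.csv is not reproduced here; the names x_demo, y_demo, and train_r2 are illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in for the dataset: a cubic trend with added noise (made up).
rng = np.random.default_rng(0)
x_demo = np.arange(1, 11, dtype=float).reshape(-1, 1)
y_demo = 1000.0 * x_demo.ravel() ** 3 + rng.normal(0, 20000, size=10)

train_r2 = {}
for degree in (2, 3, 4, 5):
    # Expand the features to the current degree and fit a linear model on them.
    poly_d = PolynomialFeatures(degree=degree)
    x_d = poly_d.fit_transform(x_demo)
    reg_d = LinearRegression().fit(x_d, y_demo)
    # score() returns R^2 on the data it is given, here the training data.
    train_r2[degree] = reg_d.score(x_d, y_demo)
    print(f"degree={degree}: training R^2 = {train_r2[degree]:.4f}")
```

The training score does not decrease as the degree grows, because each higher-degree model contains the lower-degree ones as special cases. For that reason a rising training score says nothing about overfitting by itself; detecting it requires scoring the models on data that was held out from training.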
