
Nonparametric Regression

Explore nonparametric regression methods that fit data locally rather than globally. Learn how nearest-neighbor and kernel regression work, understand distance metrics, and see how weighting neighbors or data points can improve predictions. This lesson will deepen your grasp of regression approaches useful in data science interviews and practical modeling.


We worked on parametric approaches in the previous lesson. This lesson is dedicated to nonparametric approaches to regression.

With appropriate parameter values, a parametric model can fit the data. But what if the data cannot be fit well by a single global model?

See the example below.

We can try fitting a simple regression line to this data. It will look similar to this.

However, this is not ideal. Perhaps we can try a higher-order polynomial. See the example fit below.

It is better than the previous solution, but it still does not capture the underlying pattern well. What if we fit the data locally (in small regions) instead of globally?

This gives a reasonable fit. The main limitation is that we need enough data in each region to fit these local models.
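As a rough sketch of global versus local fitting (using synthetic, sine-shaped data as a stand-in for the plotted example, since the exact dataset is not given), we can compare a global line, a global higher-order polynomial, and a separate line fit in each small region:

```python
import numpy as np

# Synthetic stand-in data: a wavy pattern no single line fits well
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 10.0, 60))
y = np.sin(x) + 0.1 * rng.standard_normal(60)

# Global fits: a straight line and a degree-5 polynomial
line_coef = np.polyfit(x, y, deg=1)
poly_coef = np.polyfit(x, y, deg=5)

# Local fits: split the x-axis into small regions and fit a line in each
edges = np.linspace(x.min(), x.max(), 6)  # 5 regions
local_coefs = []
for lo, hi in zip(edges[:-1], edges[1:]):
    mask = (x >= lo) & (x <= hi)
    local_coefs.append(np.polyfit(x[mask], y[mask], deg=1))
```

Each local model only needs to be accurate within its own region, which is why having enough data per region matters.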

Nearest-neighbor regression

In this technique, we predict a value based on the nearest training data. We find the K points most similar to the new point and predict its value from their target values, using either a simple or a weighted average. Consider the example below.

Assume that all the blue points are training data and we have to predict the value at the green point. We can use nearest-neighbor regression to do this. Consider K=1: we simply predict the target value of the single nearest neighbor.

This is called 1-nearest neighbor regression.

Instead of a single nearest neighbor, K neighbors are considered. This gives a more generalized prediction and some protection from outliers. We can also assign weights to the neighbors, for example by inverse distance, to get weighted nearest-neighbor regression, as sketched below.
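A minimal NumPy sketch of this idea (the function name `knn_regress` and the toy data are illustrative, not from the lesson): it finds the K closest training points by Euclidean distance and averages their targets, optionally weighting by inverse distance.

```python
import numpy as np

def knn_regress(X_train, y_train, x_query, k=3, weighted=False):
    """Predict the target at x_query from its K nearest training points."""
    # Euclidean distance from the query point to every training point
    dists = np.linalg.norm(X_train - x_query, axis=1)
    nearest = np.argsort(dists)[:k]                # indices of the K closest points
    if not weighted:
        return y_train[nearest].mean()             # simple average of neighbor targets
    w = 1.0 / (dists[nearest] + 1e-12)             # inverse-distance weights
    return np.average(y_train[nearest], weights=w)

# Toy example: five training points, one query point
X_train = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y_train = np.array([1.1, 1.9, 3.2, 3.9, 5.1])
print(knn_regress(X_train, y_train, np.array([2.4]), k=1))                 # 1-NN
print(knn_regress(X_train, y_train, np.array([2.4]), k=3, weighted=True))  # weighted 3-NN
```

With k=1 this reduces to 1-nearest-neighbor regression; larger k smooths the prediction at the cost of more local detail.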

Distance metrics

We choose neighbors based on their distance from the new point, and different distance metrics can be used to measure it: Euclidean distance, cosine similarity, correlation-based measures, rank-based measures, and so on.
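As a sketch, here is how three of these metrics could be computed directly with NumPy (the function names are our own, for illustration):

```python
import numpy as np

def euclidean(a, b):
    # Straight-line distance between two feature vectors
    return np.linalg.norm(a - b)

def cosine_distance(a, b):
    # 1 - cosine similarity: 0 when the vectors point in the same direction
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def correlation_distance(a, b):
    # 1 - Pearson correlation: insensitive to shifts and scaling of the features
    return 1.0 - np.corrcoef(a, b)[0, 1]

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 2.5, 4.0])
print(euclidean(a, b), cosine_distance(a, b), correlation_distance(a, b))
```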

Quiz: Value of K in KNN

1. What are the advantages of K-nearest neighbors (K > 1) compared to 1-nearest neighbor?

A. Suitable for noisy data
B. No discontinuity
C. Faster prediction

Quiz: Complexity of KNN

1. If we keep increasing the amount of data, what happens to the model complexity of K-nearest neighbors?

A. It increases as the data grows
B. It decreases as the data grows
C. It remains constant as the data grows
2. What distance metrics can be used in K-nearest-neighbor regression?

Kernel regression

While we weight only the K nearest neighbors in K-nearest-neighbor regression, in kernel regression we weight all the data points in the dataset:

$$\hat{y}_q = \frac{\sum_{i=1}^{N} c_{qi}\, y_i}{\sum_{i=1}^{N} c_{qi}} = \frac{\sum_{i=1}^{N} \text{Kernel}\big(\text{distance}(x_i, x_q)\big)\, y_i}{\sum_{i=1}^{N} \text{Kernel}\big(\text{distance}(x_i, x_q)\big)}$$

Here $x_q$ is the query point, and $c_{qi} = \text{Kernel}(\text{distance}(x_i, x_q))$ is the weight the kernel assigns to training point $x_i$.
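A minimal sketch of this estimator, assuming a Gaussian kernel and a hand-picked bandwidth (both the kernel choice and the bandwidth value are illustrative assumptions, not prescribed by the lesson):

```python
import numpy as np

def kernel_regress(X_train, y_train, x_query, bandwidth=1.0):
    """Kernel-weighted average over ALL training points (Nadaraya-Watson form)."""
    # Distance from the query point to every training point
    dists = np.linalg.norm(X_train - x_query, axis=1)
    # Gaussian kernel: nearby points get weights near 1, distant points near 0
    weights = np.exp(-(dists ** 2) / (2 * bandwidth ** 2))
    # Weighted average of all targets, normalized by the total weight
    return np.sum(weights * y_train) / np.sum(weights)

X_train = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y_train = np.array([1.1, 1.9, 3.2, 3.9, 5.1])
print(kernel_regress(X_train, y_train, np.array([2.4]), bandwidth=0.5))
```

The bandwidth plays a role similar to K in nearest-neighbor regression: a small bandwidth makes the fit more local, while a large one smooths over more of the data.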