The k-nearest neighbors (KNN) algorithm is a supervised machine learning algorithm.
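KNN assumes that similar things exist in close proximity. In data science terms, similar data points lie close to each other in feature space. KNN measures similarity by the distance between points (commonly the Euclidean distance) and predicts the label of a new point from the labels of its k nearest neighbors in the training data.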
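To make the idea concrete, here is a minimal sketch of the prediction step using only the Python standard library. It is not scikit-learn's implementation: it assumes Euclidean distance and a plain majority vote, and the function name knn_predict and the toy data are hypothetical, made up for illustration.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points closest to query."""
    # Pair the distance from each training point to the query with that point's label
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Sort by distance only, so the closest points come first
    distances.sort(key=lambda pair: pair[0])
    # Keep the labels of the k nearest points and take a majority vote
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy data: two small clusters (invented for this sketch)
X_toy = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]]
y_toy = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(X_toy, y_toy, [1.1, 1.0], k=3))  # query near the first cluster
print(knn_predict(X_toy, y_toy, [5.1, 5.0], k=3))  # query near the second cluster
```

Each query is assigned the label of the cluster it sits in, because all three of its nearest neighbors carry that label. A library implementation would typically add faster neighbor search and other distance metrics.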
The KNeighborsClassifier class can be imported from the sklearn library (module sklearn.neighbors). Its constructor takes n_neighbors as a parameter, which specifies the value of k. The example below demonstrates the algorithm on the Iris dataset.
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris

# Loading data
irisData = load_iris()

# Create feature and target arrays
X = irisData.data
y = irisData.target

# Split into training and test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

knn = KNeighborsClassifier(n_neighbors=7)  # k = 7
knn.fit(X_train, y_train)

# Calculate the accuracy of the model
print("Accuracy:", knn.score(X_test, y_test))
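Since n_neighbors sets k, one way to see how sensitive the model is to that choice is to compare several values with cross-validation. The following is a sketch under stated assumptions, not a recommended tuning procedure: the candidate values in candidate_ks and the use of 5-fold cross-validation are arbitrary choices made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Load the Iris features and labels
X, y = load_iris(return_X_y=True)

# Candidate values of k (an arbitrary, illustrative choice)
candidate_ks = [1, 3, 5, 7, 9, 11]

mean_scores = {}
for k in candidate_ks:
    knn = KNeighborsClassifier(n_neighbors=k)
    # Mean accuracy over 5 cross-validation folds for this k
    scores = cross_val_score(knn, X, y, cv=5)
    mean_scores[k] = scores.mean()
    print(f"k={k}: mean accuracy = {mean_scores[k]:.3f}")

# Pick the k with the highest mean cross-validated accuracy
best_k = max(mean_scores, key=mean_scores.get)
print("Best k:", best_k)
```

Small values of k tend to follow noise in the training data, while large values smooth over local structure, so comparing a few candidates like this is a common starting point.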