What is a support vector machine (SVM)?

A support vector machine (SVM) is a supervised machine learning algorithm. In supervised learning, the algorithm is trained on labeled data: for each data point, say x, we have a target value, y. During training, the algorithm learns patterns in the data, which it then uses to predict the output for unseen data.

Key terms

Let's explore some important terms associated with support vector machines.

Support vectors

In SVMs, support vectors are the data points closest to the decision boundary (hyperplane) and have the most significant impact on determining it.

Hyperplane

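A hyperplane is the decision boundary an SVM uses to divide data points into classes. It is represented by a mathematical equation. With a linear kernel, the boundary is a straight line (or flat plane) in the original input space; with a nonlinear kernel, it is still flat in the kernel's feature space but can appear curved in the original input space.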

Margin

The margin is the distance between the hyperplane and the data points closest to it (the support vectors). To generalize better to unseen data, SVMs try to maximize the margin.
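
To make the terms so far concrete, here is a minimal Python sketch that measures how far each point lies from a given hyperplane, takes the smallest distance as the margin, and reports the points at that distance as the support vectors. The hyperplane coefficients `w` and `b` and the four points are made-up values for illustration; a real SVM would learn `w` and `b` from the data.

```python
import math

# Hypothetical hyperplane w.x + b = 0 in 2D (values chosen for illustration).
w = (1.0, 1.0)
b = -3.0

# Hypothetical labeled points: (features, label), where label is +1 or -1.
points = [
    ((3.0, 2.0), 1),
    ((4.0, 3.0), 1),
    ((1.0, 0.0), -1),
    ((0.0, 0.0), -1),
]

def distance_to_hyperplane(x, w, b):
    """Perpendicular distance from point x to the hyperplane w.x + b = 0."""
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    return abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm_w

distances = [distance_to_hyperplane(x, w, b) for x, _ in points]

# The margin is the distance from the hyperplane to the closest point(s).
margin = min(distances)

# The support vectors are the points that sit exactly at that distance.
support_vectors = [
    x for (x, _), d in zip(points, distances) if math.isclose(d, margin)
]

print("margin:", margin)
print("support vectors:", support_vectors)
```

In this toy setup, only the two points nearest the line end up as support vectors; moving either of the other two points slightly would not change the margin.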

Kernel function

A kernel function is a mathematical function that computes the inner product between two data points in a higher-dimensional feature space without explicitly mapping the points into that space.
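
As a small illustration of this idea, the sketch below compares a degree-2 polynomial kernel, evaluated directly on 2D points, with the inner product of an explicit 3D feature map. The function names (`poly_kernel`, `feature_map`, `dot`) and the sample points are chosen here for illustration only.

```python
import math

def poly_kernel(x, z):
    """Degree-2 homogeneous polynomial kernel: K(x, z) = (x . z)^2."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2

def feature_map(x):
    """Explicit 2D -> 3D map whose inner product equals poly_kernel."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

def dot(u, v):
    """Ordinary inner product of two equal-length tuples."""
    return sum(ui * vi for ui, vi in zip(u, v))

x = (1.0, 2.0)
z = (3.0, 4.0)

# Computed in the original 2D space, without building any 3D vectors.
kernel_value = poly_kernel(x, z)

# Computed by mapping both points to 3D first, then taking the inner product.
explicit_value = dot(feature_map(x), feature_map(z))

print(kernel_value, explicit_value)
```

Both values agree up to floating-point rounding, which is why an SVM can work with the kernel alone and never needs to construct the higher-dimensional vectors.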

How do SVMs work?

Consider a scenario where we have two classes: red and blue. We want to train an SVM classifier to distinguish between these two classes based on two features: $x$ and $y$.

Our labeled training data can be plotted on a 2D plane, with each point colored according to its class.

The data points are fed to a support vector machine, which generates a hyperplane (the line in two dimensions) that best separates them. Everything on one side of it will be classified as blue, and anything on the other as red.
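
The following is a minimal sketch of how such a separating line could be found, assuming a simplified training procedure: subgradient descent on the regularized hinge loss. This is one textbook way to fit a linear SVM, not necessarily what any particular library does internally. The toy data, the function names, and the hyperparameters (`lr`, `lam`, `epochs`) are all assumptions made for illustration. The two features are called `x[0]` and `x[1]` in the code, and the label `y` is +1 for blue and -1 for red.

```python
# Toy training set: (features, label), with +1 = blue and -1 = red.
training_data = [
    ((1.0, 1.0), 1), ((2.0, 2.0), 1), ((2.0, 1.0), 1), ((1.0, 2.0), 1),
    ((-1.0, -1.0), -1), ((-2.0, -2.0), -1), ((-2.0, -1.0), -1), ((-1.0, -2.0), -1),
]

def train_linear_svm(data, lr=0.1, lam=0.01, epochs=200):
    """Fit w and b by subgradient descent on the regularized hinge loss."""
    w = [0.0, 0.0]
    b = 0.0
    n = len(data)
    for _ in range(epochs):
        # Gradient of the regularization term (lam / 2) * ||w||^2.
        grad_w = [lam * w[0], lam * w[1]]
        grad_b = 0.0
        for x, y in data:
            score = w[0] * x[0] + w[1] * x[1] + b
            if y * score < 1:  # point is inside the margin or misclassified
                grad_w[0] -= y * x[0] / n
                grad_w[1] -= y * x[1] / n
                grad_b -= y / n
        w[0] -= lr * grad_w[0]
        w[1] -= lr * grad_w[1]
        b -= lr * grad_b
    return w, b

def predict(x, w, b):
    """Classify a point by which side of the hyperplane it falls on."""
    score = w[0] * x[0] + w[1] * x[1] + b
    return "blue" if score > 0 else "red"

w, b = train_linear_svm(training_data)
print("w =", w, "b =", b)
print(predict((3.0, 3.0), w, b))
print(predict((-3.0, -2.0), w, b))
```

Only points that violate the margin contribute to each update, which mirrors the idea that the points nearest the boundary are the ones that determine it.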

This case is straightforward because the two classes can be separated by a straight line, but what about cases where the pattern is too complex for a single line to separate them?

Nonlinear SVM

In the case of a nonlinear SVM, the data cannot be separated by a straight line. However, the data may still have an obvious pattern, for example when the boundary that separates the classes is a circle.

Let's understand how nonlinear SVM works with a simple example:

Imagine a 2D dataset where the pink and blue points form concentric circles. In the original 2D space, drawing a straight line to separate the two classes is impossible. However, using a nonlinear SVM with a polynomial kernel, the data can be implicitly mapped to a higher-dimensional space where a flat hyperplane separates the classes. Viewed from the original 2D space, that hyperplane corresponds to a curved decision boundary.
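
The concentric-circles example can be sketched in a few lines of Python. Instead of a full kernel SVM, this sketch applies one explicit mapping, adding the squared distance from the origin as a third coordinate (a term that also appears among degree-2 polynomial features), and then shows that a flat plane in 3D separates the two rings. The radii, the number of points, and the names `make_circle`, `lift`, and `classify` are assumptions made for illustration.

```python
import math

def make_circle(radius, n_points):
    """Return n_points evenly spaced 2D points on a circle of the given radius."""
    return [
        (radius * math.cos(2 * math.pi * k / n_points),
         radius * math.sin(2 * math.pi * k / n_points))
        for k in range(n_points)
    ]

inner = make_circle(1.0, 8)  # e.g. the blue class
outer = make_circle(3.0, 8)  # e.g. the pink class

def lift(p):
    """Map a 2D point to 3D by adding its squared distance from the origin."""
    x, y = p
    return (x, y, x * x + y * y)

# In the lifted space, the third coordinate alone separates the two rings.
inner_z = [lift(p)[2] for p in inner]
outer_z = [lift(p)[2] for p in outer]

# Place a flat plane z = threshold halfway between the two classes.
threshold = (max(inner_z) + min(outer_z)) / 2

def classify(p):
    """Classify a 2D point by which side of the plane its lifted image lies on."""
    return "outer" if lift(p)[2] > threshold else "inner"

print("threshold:", threshold)
print(classify((0.5, 0.5)), classify((2.5, -2.0)))
```

The plane at height `threshold` in 3D corresponds to a circle in the original 2D space, which is the curved boundary a straight line could not provide.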

When to use SVMs?

Let's understand the ideal circumstances for applying support vector machines:

SVMs are suitable when dealing with binary or multiclass classification problems.
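SVMs can remain effective in high-dimensional spaces, including when the number of features is larger than the number of samples.
SVMs work well with small to medium-sized datasets. They are less suitable when the dataset is very large, because training becomes slow, or when it is very noisy, for example when the classes overlap heavily.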

Note: SVMs can be used for both classification and regression tasks. However, they are more commonly associated with classification problems.

Applications of SVMs

SVMs have numerous and diverse applications. In this section, we will explore some of their significant applications:

Image classification: SVMs can classify images and are widely used in face recognition systems.
Text classification: SVMs can categorize text documents into groups. They can be employed, for instance, in sentiment analysis to determine whether a text conveys a positive or negative sentiment.
Bioinformatics: SVMs can classify samples or forecast the course of a disease by analyzing biological data, such as gene expression data. In areas like personalized medicine, they aid researchers in understanding genetic patterns and making predictions.
Finance and stock market prediction: SVMs can be used to forecast stock market movements or evaluate financial risks. By analyzing historical data, they can assist investors in making informed decisions about buying or selling stocks.

Conclusion

SVMs remain a widely used machine learning technique. They can handle both binary and multiclass classification, are effective on high-dimensional data, and, because they maximize the margin, tend to be less prone to overfitting. However, they can be computationally expensive to train, especially on large datasets, and they rely on careful hyperparameter tuning for good performance, which can be time-consuming. These limitations should be weighed when deciding whether to apply an SVM.
