Introduction to SVM
Gain an understanding of SVM and the concepts of signed and unsigned distance.
Support vector machine (SVM) is a popular and powerful supervised learning algorithm for classification and regression problems. It works by finding the best possible boundary between different classes of data points. In this lesson, we’ll cover the basic concepts and principles behind SVMs and see how they can be applied in practice.
What is SVM?
Suppose a person works for a bank, and their job is to decide whether to approve or reject loan applications based on the applicant’s financial history. They have a loan dataset with various features such as credit score, income, and debt-to-income ratio, along with past approval and rejection records. The task is to use SVM to build a predictive model for future loan applications.
First, they map each loan application into a feature space based on its features and label it as either “approved” or “rejected,” which creates two classes in the dataset. Next, they try to find a decision boundary that separates the data linearly.
SVM finds the best hyperplane that separates the two classes, that is, the decision boundary between the approved and rejected loan applications. The best hyperplane is the one that maximizes the margin: the distance between the hyperplane and the closest loan applications from each class. In other words, they want the hyperplane to be as far as possible from the closest approved and rejected applications.
However, they don’t need to consider every loan application when determining the hyperplane. Only the loan applications that lie closest to the hyperplane, called support vectors, are used to determine its position. This approach makes SVM memory-efficient and lets it handle a large number of features.
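As a concrete illustration, here is a minimal sketch of this workflow, assuming scikit-learn is available and using an invented loan dataset whose feature values (credit score, income, debt-to-income ratio) are made up purely for demonstration:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical loan data: [credit_score, annual_income, debt_to_income_ratio]
X = np.array([
    [720, 85_000, 0.20],   # approved
    [680, 60_000, 0.35],   # approved
    [750, 95_000, 0.15],   # approved
    [590, 40_000, 0.55],   # rejected
    [610, 45_000, 0.50],   # rejected
    [560, 30_000, 0.65],   # rejected
])
y = np.array([1, 1, 1, 0, 0, 0])  # 1 = approved, 0 = rejected

# Scale the features so credit score and income don't dominate the margin
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# A linear SVM finds the maximum-margin hyperplane between the two classes
clf = SVC(kernel="linear")
clf.fit(X_scaled, y)

# Predict the label of a new (hypothetical) application
new_app = scaler.transform([[700, 70_000, 0.30]])
print(clf.predict(new_app))  # e.g., [1] means approved
```

Scaling matters in this sketch because income and debt-to-income ratio live on very different numeric scales; without it, one feature would dominate the distance calculations that define the margin.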
The plot above shows two classifiers that separate the positive and negative classes of a dataset. The blue line represents the SVM classifier, and the green line represents the other classifier. The points on the dotted lines are called support vectors because they’re the closest to the hyperplane. The distance between the blue dotted lines is the margin, which SVM maximizes to get the best possible classifier. The green line is not an SVM hyperplane because it doesn’t have a maximum margin.
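To connect the picture to code, the sketch below fits a linear SVM on a small synthetic two-class dataset (generated with scikit-learn’s make_blobs, an assumption not tied to the plot above) and reads off the support vectors and the margin width, which for a linear SVM equals 2 / ||w||:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two well-separated 2-D clusters stand in for the positive and negative classes
X, y = make_blobs(n_samples=60, centers=2, cluster_std=0.8, random_state=0)

clf = SVC(kernel="linear")
clf.fit(X, y)

# Only the points closest to the hyperplane are kept as support vectors
print("Support vectors:\n", clf.support_vectors_)

# For a linear SVM, the margin width is 2 / ||w||, where w is the learned weight vector
w = clf.coef_[0]
print("Margin width:", 2 / np.linalg.norm(w))
```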
Note: SVM can be thought of as a generalized linear discriminant with maximum margin.
Signed & unsigned distance
In SVM, the hyperplane is defined by a weight vector $\mathbf{w}$ and a bias term $b$. The hyperplane equation can be written as $\mathbf{w}^T \mathbf{x} + b = 0$.
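The signed distance from a point $\mathbf{x}$ to this hyperplane is the standard quantity $(\mathbf{w}^T \mathbf{x} + b) / \lVert \mathbf{w} \rVert$: its sign indicates which side of the hyperplane the point lies on, and its absolute value is the unsigned (geometric) distance. Here is a minimal NumPy sketch, assuming hypothetical values for $\mathbf{w}$ and $b$:

```python
import numpy as np

# Hypothetical hyperplane parameters for w^T x + b = 0
w = np.array([2.0, -1.0])   # weight vector
b = -3.0                    # bias term

def signed_distance(x, w, b):
    # Positive on one side of the hyperplane, negative on the other
    return (np.dot(w, x) + b) / np.linalg.norm(w)

x = np.array([4.0, 1.0])
d = signed_distance(x, w, b)
print("Signed distance:", d)         # sign tells which side of the hyperplane x lies on
print("Unsigned distance:", abs(d))  # plain geometric distance to the hyperplane
```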