Hard-Margin SVM

Learn how to implement and optimize the hard-margin SVM.

Hard-margin SVM is a type of support vector machine that finds the maximum-margin hyperplane separating the two classes without allowing any misclassification. It's useful when the dataset is linearly separable and contains no outliers.
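As a quick illustration (a sketch of mine, not from the source): scikit-learn's `SVC` approximates a hard-margin SVM when the penalty parameter `C` is made very large, since `C` → ∞ leaves no room for margin violations. The synthetic dataset below is an assumption for demonstration.

```python
# Illustrative sketch (assumptions: scikit-learn's SVC, a synthetic dataset).
# A very large C approximates the hard margin, which forbids misclassification.
import numpy as np
from sklearn.svm import SVC

# Two well-separated Gaussian blobs in R^2 -> linearly separable.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 0.5, (20, 2)), rng.normal(2, 0.5, (20, 2))])
y = np.array([-1] * 20 + [1] * 20)

clf = SVC(kernel="linear", C=1e10).fit(X, y)
print("training accuracy:", clf.score(X, y))  # 1.0 when the data is separable
```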

Linearly separable case

Given a binary classification dataset $D=\{(\bold x_1, y_1), (\bold x_2, y_2), \dots, (\bold x_n, y_n)\}$, where $\bold x_i \in \R^d$ and $y_i \in \{-1, 1\}$, if a hyperplane $\bold w^T\phi(\bold x)=0$ exists that separates the two classes, the dataset is said to be linearly separable in the feature space defined by the mapping $\phi$. We'll assume for now that the dataset $D$ is linearly separable. The goal is to find the hyperplane with the maximum margin by optimizing the following objective:

$$\max_{\bold w}\bigg[\frac{1}{\|\bold w\|}\min_{i}\big(y_i\bold w^T\phi(\bold x_i)\big)\bigg]$$
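To make the objective concrete, here is a small numerical sketch (an illustration of mine, assuming the identity feature map $\phi(\bold x)=\bold x$ and toy data): the bracketed quantity is the distance from the hyperplane to the closest point, and a positive value certifies that $\bold w$ linearly separates the data.

```python
# Illustrative sketch (assumption: phi(x) = x, i.e., the identity feature map).
import numpy as np

def margin(w, X, y):
    # (1 / ||w||) * min_i y_i * w^T x_i -- positive iff w separates the data.
    return np.min(y * (X @ w)) / np.linalg.norm(w)

# Toy linearly separable dataset.
X = np.array([[2.0, 1.0], [1.0, 2.0], [-2.0, -1.0], [-1.0, -2.0]])
y = np.array([1, 1, -1, -1])
print(margin(np.array([1.0, 1.0]), X, y))  # ~2.12: margin of this candidate w
```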

Optimizing this objective directly is complex. Here's a derivation of an equivalent optimization problem that is easier to solve:

$$
\begin{aligned}
\max_{\bold w}\bigg[\frac{1}{\|\bold w\|}\min_{i}\big(y_i\bold w^T\phi(\bold x_i)\big)\bigg]
&=\max_{\bold w}\frac{1}{\|\bold w\|}\Big[\min_{i}\big(y_i\bold w^T\phi(\bold x_i)\big)\Big]\\
&=\max_{\bold w,\gamma}\frac{\gamma}{\|\bold w\|} &&\text{s.t.}\quad y_i\bold w^T\phi(\bold x_i)\ge\gamma\quad\forall i\\
&=\max_{\bold w,\gamma}\frac{1}{\|\bold w/\gamma\|} &&\text{s.t.}\quad y_i(\bold w/\gamma)^T\phi(\bold x_i)\ge 1\quad\forall i\\
&=\max_{\bold w}\frac{1}{\|\bold w\|} &&\text{s.t.}\quad y_i\bold w^T\phi(\bold x_i)\ge 1\quad\forall i\\
&=\min_{\bold w}\|\bold w\| &&\text{s.t.}\quad y_i\bold w^T\phi(\bold x_i)\ge 1\quad\forall i
\end{aligned}
$$

The second equality replaces the inner minimum with a variable $\gamma$ constrained to lower-bound every $y_i\bold w^T\phi(\bold x_i)$; the next two steps use the scale invariance of the objective to rescale $\bold w$ by $\gamma$, fixing the functional margin of the closest point to $1$, and then rename $\bold w/\gamma$ back to $\bold w$; finally, maximizing $1/\|\bold w\|$ is the same as minimizing $\|\bold w\|$.
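This final form is a convex quadratic program, so an off-the-shelf solver handles it directly. Below is a minimal sketch using cvxpy (the solver choice and the toy data are assumptions for illustration, with $\phi$ taken to be the identity map); minimizing $\frac{1}{2}\|\bold w\|^2$ instead of $\|\bold w\|$ is equivalent and keeps the objective quadratic.

```python
# Minimal sketch (assumptions: cvxpy as the QP solver, phi(x) = x, toy data).
import cvxpy as cp
import numpy as np

# Toy linearly separable dataset.
X = np.array([[2.0, 1.0], [1.0, 2.0], [-2.0, -1.0], [-1.0, -2.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

w = cp.Variable(X.shape[1])
objective = cp.Minimize(0.5 * cp.sum_squares(w))   # (1/2)||w||^2
constraints = [cp.multiply(y, X @ w) >= 1]         # y_i * w^T x_i >= 1 for all i
cp.Problem(objective, constraints).solve()

print("w* =", w.value)
print("margin =", 1 / np.linalg.norm(w.value))
```

At the optimum, the geometric margin is $1/\|\bold w^*\|$, and the training points whose constraints hold with equality are the support vectors.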