Support Vector Machines

Finding the classification boundary with the widest possible margin to the nearest points of each class โ€” and the kernel trick that lets it draw curved boundaries.

SVM chooses the separating line with the widest margin

Margin rule

The dashed lines touch the nearest points. Those touching points are the support vectors.

Wider margin means the classifier is less fragile to small movement in the data.

Definition

A Support Vector Machine (SVM) classifies data by finding the boundary (a line in 2D, a hyperplane in general) that separates two classes with the widest possible margin โ€” the largest distance to the nearest point of either class. Unlike logistic regression, which finds any separating line that fits well, SVM specifically seeks the one that maximizes the buffer zone around it.

The points closest to the boundary โ€” the ones that "support" where the margin sits โ€” are called support vectors. Remarkably, only these points determine the boundary; every other point could move around (without crossing the margin) with no effect on the result.

Why maximize the margin?

Imagine two clusters of points with many possible separating lines. A line that barely squeezes between the closest points of each class is fragile โ€” a new point close to the boundary could easily land on the wrong side. The maximum-margin line is the most "confident" choice: it stays as far as possible from both classes.

Try it

If you removed a point that's not a support vector and far from the boundary, would the SVM's decision boundary change?

Solution

No. The boundary is determined entirely by the support vectors (the closest points). A point far from the margin contributes nothing to where the maximum-margin boundary sits, so removing it leaves the boundary unchanged.

Related concepts