Margins & distance to a hyperplane

A linear classifier's decision boundary is the line θ·x + θ₀ = 0. The two margin boundaries θ·x + θ₀ = ±1 each sit a perpendicular distance 1/‖θ‖ from it, so ‖θ‖ alone sets how wide the margin is. Drag the point P = x₀ (or move the sliders) and watch the distance d = |θ·x₀+θ₀| / ‖θ‖ change. Slide ‖θ‖ to see the margins breathe while the decision line stays put.

L : θ·x + θ₀ = 0  ·  margins: θ·x + θ₀ = ±1  ·  d = |θ·x₀+θ₀| / ‖θ‖
Legend: boundary θ·x+θ₀=0  ·  margins θ·x+θ₀=±1  ·  θ (normal)  ·  point P & distance d

Controls

Tip: drag P anywhere on the plot, or type its coordinates. Slide ‖θ‖ and watch only the margins move — the decision line doesn't budge.

Why d = |θ·x₀+θ₀| / ‖θ‖ — the derivation

1 · Distance from a point to the line

θ is normal (perpendicular) to the line, so the unit normal is θ̂ = θ/‖θ‖. Starting at the point P = x₀, walk straight toward the line along that normal:

x(t) = x₀ − t · θ̂

You hit the line L the moment the score is zero. Substitute and solve for t:

θ·x(t) + θ₀ = (θ·x₀ + θ₀) − t (θ·θ̂) = (θ·x₀ + θ₀) − t‖θ‖ = 0  ⟹  t = (θ·x₀ + θ₀) / ‖θ‖

Here we used θ·θ̂ = θ·(θ/‖θ‖) = ‖θ‖²/‖θ‖ = ‖θ‖. Because θ̂ has unit length, the distance walked is |t|, giving the answer from the review problem:

d = |θ·x₀ + θ₀| / ‖θ‖

and the foot of the perpendicular (the projection of P onto the line) is P′ = x₀ − t·θ̂ = x₀ − ((θ·x₀ + θ₀) / ‖θ‖²) θ. That subtracts off exactly the component of P that points away from the line.
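
To check the derivation numerically, here is a minimal Python sketch of the two results above: d = |θ·x₀ + θ₀| / ‖θ‖ and P′ = x₀ − ((θ·x₀ + θ₀) / ‖θ‖²) θ. The helper names dot and distance_and_foot and the sample values for θ, θ₀, and P are assumptions chosen for illustration; they are not taken from the interactive plot.

```python
import math

def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))

def distance_and_foot(theta, theta0, x0):
    """Return (d, foot) for the line theta.x + theta0 = 0 and the point x0.

    d is the perpendicular distance; foot is P', the projection of x0
    onto the line. Hypothetical helper, named here for illustration only.
    """
    norm = math.sqrt(dot(theta, theta))
    if norm == 0:
        raise ValueError("theta must be nonzero")
    score = dot(theta, x0) + theta0      # theta.x0 + theta0
    t = score / norm                     # signed step length along the unit normal
    foot = tuple(xi - (score / norm**2) * thi for xi, thi in zip(x0, theta))
    return abs(t), foot

# Illustrative values (assumed): theta = (3, 4), theta0 = -5, P = (3, 4)
theta, theta0, P = (3.0, 4.0), -5.0, (3.0, 4.0)
d, foot = distance_and_foot(theta, theta0, P)

print("distance d:", d)                                  # 20 / 5 = 4.0
print("foot P':", foot)                                  # close to (0.6, 0.8)
print("score at P':", dot(theta, foot) + theta0)         # close to 0: P' lies on the line
```

The last line is the useful sanity check: the score at P′ comes out (numerically) zero, which confirms that the walk along the normal stops on the decision line.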

2 · Why the margins are 1/‖θ‖ apart

A margin boundary is just another line: θ·x + θ₀ = 1. Drop any point of it into the same distance formula — its numerator is |1| — so its distance from the decision line is

|1| / ‖θ‖ = 1 / ‖θ‖,   and the same for the −1 side, so the full band has width 2 / ‖θ‖.

Here's the key move: scaling θ and θ₀ by the same nonzero constant c leaves the decision line unchanged (θ·x+θ₀=0 ⟺ cθ·x+cθ₀=0) but multiplies ‖θ‖ by |c|, and therefore divides the margin by |c|. That's why, once every training point is required to lie on or outside the margin boundaries, maximizing the margin is the same as minimizing ‖θ‖, the heart of the support-vector machine. Try it: slide ‖θ‖ and only the dashed lines move.
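
The scaling argument can be sketched in a few lines of Python. This is an illustration under assumed values (θ = (3, 4), θ₀ = −5, and a point (3, −1) chosen to lie on the decision line); the function names score and margin_width are hypothetical. For each scale factor c it rescales θ and θ₀ together, then reports the score of the on-line point and the band width 2/‖θ‖.

```python
import math

def score(theta, theta0, x):
    """Raw score theta.x + theta0."""
    return sum(t * xi for t, xi in zip(theta, x)) + theta0

def margin_width(theta):
    """Full width of the band between the +1 and -1 margin boundaries: 2 / ||theta||."""
    norm = math.sqrt(sum(t * t for t in theta))
    return 2.0 / norm

# Assumed example values, not taken from the plot.
theta, theta0 = (3.0, 4.0), -5.0
on_line = (3.0, -1.0)          # 3*3 + 4*(-1) - 5 = 0, so this point is on the decision line

results = []
for c in (0.5, 1.0, 2.0):
    scaled_theta = tuple(c * t for t in theta)
    scaled_theta0 = c * theta0
    results.append((c, score(scaled_theta, scaled_theta0, on_line), margin_width(scaled_theta)))

for c, s, w in results:
    # The score stays 0 (same decision line) while the width shrinks as c grows.
    print(f"c={c}: score on line = {s}, margin width = {w:.3f}")
```

The point stays on the decision line for every c, while the width goes as 2/(c‖θ‖): halving c doubles the band and doubling c halves it.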

3 · Functional vs. geometric margin

The raw score θ·x₀ + θ₀ is the functional margin: it grows if you just scale θ and θ₀ up, so it isn't a real distance. Dividing by ‖θ‖ removes that freedom and turns it into the geometric margin, the signed distance whose absolute value is the d drawn on the plot. A point sitting exactly on the +1 boundary has functional margin 1 and geometric margin 1/‖θ‖.
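
A small Python sketch makes the contrast concrete. It assumes θ = (3, 4), θ₀ = −5, and the point x = (2, 0), which sits on the +1 boundary of that classifier; the function names are hypothetical. Rescaling θ and θ₀ by 10 changes the functional margin but not the geometric one.

```python
import math

def functional_margin(theta, theta0, x):
    """Raw score theta.x + theta0 (changes when theta and theta0 are rescaled)."""
    return sum(t * xi for t, xi in zip(theta, x)) + theta0

def geometric_margin(theta, theta0, x):
    """Score divided by ||theta||: the signed distance from x to the decision line."""
    norm = math.sqrt(sum(t * t for t in theta))
    return functional_margin(theta, theta0, x) / norm

# Assumed example values, not taken from the plot.
theta, theta0 = (3.0, 4.0), -5.0
x = (2.0, 0.0)                      # 3*2 + 4*0 - 5 = 1, so x is on the +1 boundary

c = 10.0                            # rescale theta and theta0 together
big_theta = tuple(c * t for t in theta)
big_theta0 = c * theta0

f_small = functional_margin(theta, theta0, x)
f_big = functional_margin(big_theta, big_theta0, x)
g_small = geometric_margin(theta, theta0, x)
g_big = geometric_margin(big_theta, big_theta0, x)

print("functional margin:", f_small, "->", f_big)   # scales with c
print("geometric margin: ", g_small, "->", g_big)   # stays at 1 / ||theta|| = 1/5
```

After rescaling, x no longer lies on the +1 boundary of the rescaled classifier (its score is now 10), yet its distance to the decision line is the same, which is exactly why only the geometric margin counts as a distance.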