Gradient

The vector of all partial derivatives — pointing in the direction of steepest ascent and encoding how a multivariable function changes locally.

[Figure: Partial derivatives. Fix one input, then measure the slope as the other input changes. Example: $f(x,y) = x^2 + y^2$ at $(x_0, y_0) = (1.0, 1.0)$. Panel "Hold y fixed": $f(x, 1.0) = x^2 + 1.00$, slope $\partial f/\partial x = 2.00$. Panel "Hold x fixed": $f(1.0, y) = 1.00 + y^2$, slope $\partial f/\partial y = 2.00$. Together: $\nabla f(x_0, y_0) = (\partial f/\partial x, \partial f/\partial y) = (2.00, 2.00)$. The red tangent in each panel is the one-variable slope of that slice at $(1.0, 1.0)$. Changing the held input shifts the other slice vertically because it only adds a constant term.]
Definition

The gradient of a scalar function $f(x_1, \ldots, x_n)$ is the vector of all its partial derivatives:

$$\nabla f = \begin{pmatrix} \partial f/\partial x_1 \\ \partial f/\partial x_2 \\ \vdots \\ \partial f/\partial x_n \end{pmatrix}$$
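The definition can be checked numerically. The following is a minimal sketch, not a production routine: `numerical_gradient` and `bowl` are hypothetical names introduced here, and the central-difference step `h` is an assumed default. Each component nudges one input while holding the others fixed, which mirrors how a partial derivative is defined.

```python
from typing import Callable, List, Sequence


def numerical_gradient(
    f: Callable[[Sequence[float]], float],
    point: Sequence[float],
    h: float = 1e-6,  # assumed step size; not tuned
) -> List[float]:
    """Approximate each partial derivative with a central difference."""
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h   # nudge only the i-th input up
        backward[i] -= h  # and down, holding every other input fixed
        grad.append((f(forward) - f(backward)) / (2 * h))
    return grad


def bowl(p: Sequence[float]) -> float:
    # f(x, y) = x^2 + y^2, the function used in the partial-derivatives illustration
    return p[0] ** 2 + p[1] ** 2


grad_at_1_1 = numerical_gradient(bowl, [1.0, 1.0])
print(grad_at_1_1)  # close to [2.0, 2.0], up to floating-point error
```

The result has one entry per input variable, which is why the gradient lives in the same space as the input.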

Key facts:

  • $\nabla f$ points in the direction of steepest ascent of $f$
  • $-\nabla f$ points in the direction of steepest descent
  • $\|\nabla f\|$ measures the steepness

At a local maximum or minimum of a differentiable function, $\nabla f = \mathbf{0}$ (a critical point).
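One way to make the steepest-ascent claim concrete is to sample many unit directions around a point and see which one raises $f$ the most over a tiny step. This is only an illustrative sketch: the quadratic, the step size, and the 0.1-degree sampling grid are all assumptions chosen for the example.

```python
import math


def f(x: float, y: float) -> float:
    # Illustrative quadratic (an assumption for this sketch): x^2 + 2y^2
    return x ** 2 + 2 * y ** 2


def grad_f(x: float, y: float) -> tuple:
    # Analytic gradient of the quadratic above: (2x, 4y)
    return (2 * x, 4 * y)


x0, y0 = 1.0, 1.0
step = 1e-4  # small step so the first-order behaviour dominates

best_angle = None
best_increase = -math.inf
for k in range(3600):  # unit directions sampled every 0.1 degree
    angle = 2 * math.pi * k / 3600
    ux, uy = math.cos(angle), math.sin(angle)
    increase = f(x0 + step * ux, y0 + step * uy) - f(x0, y0)
    if increase > best_increase:
        best_increase, best_angle = increase, angle

gx, gy = grad_f(x0, y0)
gradient_angle = math.atan2(gy, gx)

# The two angles should agree to within the sampling resolution.
print(math.degrees(best_angle), math.degrees(gradient_angle))
```

The best sampled direction lines up with the direction of the gradient, and the opposite direction gives the largest decrease.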

Key properties
  • The gradient is perpendicular to the function's level sets (contour lines) at every point
  • $\|\nabla f\| = 0$ exactly at critical points: maxima, minima, and saddle points
  • Scaling the function scales the gradient linearly: $\nabla(cf) = c\nabla f$
  • The gradient of a sum is the sum of gradients: $\nabla(f+g) = \nabla f + \nabla g$
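The linearity and perpendicularity properties can be spot-checked with hand-derived gradients. In this sketch the functions $f(x,y) = x^2 + y^2$ and $g(x,y) = xy$, the constant `c`, and the test point are all assumptions picked for illustration. The level set of $f$ through a point is a circle centred at the origin, so its tangent direction at $(x, y)$ is $(-y, x)$.

```python
def grad_f(x: float, y: float) -> tuple:
    # f(x, y) = x^2 + y^2  ->  gradient (2x, 2y)
    return (2 * x, 2 * y)


def grad_g(x: float, y: float) -> tuple:
    # g(x, y) = x * y  ->  gradient (y, x)
    return (y, x)


def grad_combo(x: float, y: float, c: float) -> tuple:
    # Gradient of c*f + g, differentiated directly from c*(x^2 + y^2) + x*y
    return (2 * c * x + y, 2 * c * y + x)


x, y, c = 1.5, -0.5, 3.0
fx, fy = grad_f(x, y)
gx, gy = grad_g(x, y)

# Linearity: combine the separate gradients using the scaling and sum rules.
by_rules = (c * fx + gx, c * fy + gy)
direct = grad_combo(x, y, c)
print(by_rules, direct)  # the two tuples should match

# Perpendicularity: the tangent to the circular contour of f at (x, y) is (-y, x).
tangent = (-y, x)
dot = fx * tangent[0] + fy * tangent[1]
print(dot)  # zero, so the gradient is perpendicular to the contour
```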
Common mistakes
  • Assuming $\nabla f = \mathbf{0}$ means a minimum: critical points can also be maxima or saddle points; the Hessian's definiteness is what distinguishes them
  • Confusing the gradient (a vector) with the directional derivative (a scalar): $\nabla f$ has the same dimension as the input; for a unit vector $\mathbf{u}$, $D_{\mathbf{u}}f = \nabla f \cdot \mathbf{u}$ is always a single number
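Both mistakes can be illustrated with a saddle. The sketch below uses $f(x,y) = x^2 - y^2$, which is an assumed example rather than one from the text, and a hypothetical helper `directional_derivative` that normalises the direction before taking the dot product.

```python
import math


def f(x: float, y: float) -> float:
    # Saddle surface: curves up along x and down along y
    return x ** 2 - y ** 2


def grad_f(x: float, y: float) -> tuple:
    # Analytic gradient of x^2 - y^2
    return (2 * x, -2 * y)


def directional_derivative(grad: tuple, u: tuple) -> float:
    # Normalise u to unit length, then take the dot product with the gradient.
    norm = math.hypot(u[0], u[1])
    return (grad[0] * u[0] + grad[1] * u[1]) / norm


# Mistake 1: a zero gradient does not imply a minimum.
grad_origin = grad_f(0.0, 0.0)
higher = f(0.1, 0.0)  # moving along x raises f above f(0, 0)
lower = f(0.0, 0.1)   # moving along y drops f below f(0, 0)
print(higher > f(0.0, 0.0) > lower)  # True: the origin is a saddle

# Mistake 2: the gradient is a vector, the directional derivative is a number.
grad_at_1_1 = grad_f(1.0, 1.0)                               # a 2-component vector
slope_along_x = directional_derivative(grad_at_1_1, (1.0, 0.0))  # a single float
print(grad_at_1_1, slope_along_x)
```

The Hessian of this function is constant with eigenvalues 2 and -2, so it is indefinite, which is the signature of a saddle point.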
Gradient of a simple function

$f(x,y) = x^2 + 2y^2$.

$\nabla f = (2x, 4y)^T$.

At $(1, 1)$: $\nabla f = (2, 4)^T$. This vector points away from the minimum at $(0,0)$; walking in this direction increases $f$ most steeply.
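For readers who want the intermediate steps, the computation can be spelled out as follows. Each partial derivative treats the other variable as a constant; the norm on the last line is an extra detail added here to quantify the steepness at $(1, 1)$.

```latex
\begin{aligned}
\frac{\partial f}{\partial x} &= \frac{\partial}{\partial x}\left(x^2 + 2y^2\right) = 2x, \\
\frac{\partial f}{\partial y} &= \frac{\partial}{\partial y}\left(x^2 + 2y^2\right) = 4y, \\
\nabla f(1,1) &= (2, 4)^T, \qquad
\|\nabla f(1,1)\| = \sqrt{2^2 + 4^2} = \sqrt{20} = 2\sqrt{5} \approx 4.47.
\end{aligned}
```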

Try it

Find $\nabla f$ for $f(x,y,z) = x^2 + y^2 + z^2$ at the point $(1,2,3)$. What is $\|\nabla f\|$?

Solution

$\nabla f = (2x, 2y, 2z)^T$. At $(1,2,3)$: $\nabla f = (2, 4, 6)^T$.

$\|\nabla f\| = \sqrt{4+16+36} = \sqrt{56} = 2\sqrt{14} \approx 7.48$.

Note: $\nabla f = 2\mathbf{x}$, so the gradient of $\|\mathbf{x}\|^2$ points radially outward.
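The arithmetic in this solution can be double-checked with a few lines of code. This is a small sketch using the analytic gradient derived above; the names `grad_f`, `gradient`, and `norm` are introduced here for illustration only.

```python
import math


def grad_f(x: float, y: float, z: float) -> tuple:
    # Gradient of f(x, y, z) = x^2 + y^2 + z^2 is (2x, 2y, 2z)
    return (2 * x, 2 * y, 2 * z)


point = (1.0, 2.0, 3.0)
gradient = grad_f(*point)
norm = math.sqrt(sum(component ** 2 for component in gradient))

print(gradient)        # (2.0, 4.0, 6.0)
print(round(norm, 2))  # 7.48

# Radial check: the gradient is exactly twice the position vector.
is_radial = all(g == 2 * p for g, p in zip(gradient, point))
print(is_radial)       # True
```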

Related concepts