In optimization, a descent direction is a vector [math]\displaystyle{ \mathbf{p}\in\mathbb R^n }[/math] that points towards a local minimum [math]\displaystyle{ \mathbf{x}^* }[/math] of an objective function [math]\displaystyle{ f:\mathbb R^n\to\mathbb R }[/math]. Iterative methods for computing [math]\displaystyle{ \mathbf{x}^* }[/math], such as line search methods, define a descent direction at the [math]\displaystyle{ k }[/math]th iterate [math]\displaystyle{ \mathbf{x}_k }[/math] to be any vector [math]\displaystyle{ \mathbf{p}_k\in\mathbb R^n }[/math] such that [math]\displaystyle{ \langle\mathbf{p}_k,\nabla f(\mathbf{x}_k)\rangle \lt 0 }[/math], where [math]\displaystyle{ \langle \cdot , \cdot \rangle }[/math] denotes the inner product. The motivation for such an approach is that, by Taylor's theorem, a sufficiently small step along [math]\displaystyle{ \mathbf{p}_k }[/math] is guaranteed to reduce [math]\displaystyle{ f }[/math].
Using this definition, the negative of a non-zero gradient is always a descent direction, as [math]\displaystyle{ \langle -\nabla f(\mathbf{x}_k), \nabla f(\mathbf{x}_k) \rangle = -\langle \nabla f(\mathbf{x}_k), \nabla f(\mathbf{x}_k) \rangle \lt 0 }[/math].
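The following is a minimal sketch in Python/NumPy (chosen here since the article contains no code; the quadratic objective and all names are illustrative assumptions). It checks the descent condition and verifies that a small step along the negative gradient reduces the objective.

```python
import numpy as np

def f(x):
    # Illustrative convex quadratic objective: f(x) = x_1^2 + 4*x_2^2
    return x[0]**2 + 4.0 * x[1]**2

def grad_f(x):
    # Gradient of the objective above
    return np.array([2.0 * x[0], 8.0 * x[1]])

def is_descent_direction(p, g):
    # p is a descent direction at x iff <p, grad f(x)> < 0
    return np.dot(p, g) < 0.0

x_k = np.array([1.0, 1.0])
g_k = grad_f(x_k)
p_k = -g_k                             # negative gradient

print(is_descent_direction(p_k, g_k))  # True: -<g, g> < 0 for g != 0

alpha = 1e-2                           # a small step size
print(f(x_k + alpha * p_k) < f(x_k))   # True: a small step along p_k reduces f
```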
Numerous methods exist to compute descent directions, each with differing merits; examples include gradient descent and the conjugate gradient method.
More generally, if [math]\displaystyle{ P }[/math] is a positive definite matrix, then [math]\displaystyle{ \mathbf{p}_k = -P \nabla f(\mathbf{x}_k) }[/math] is a descent direction at [math]\displaystyle{ \mathbf{x}_k }[/math], since [math]\displaystyle{ \langle \mathbf{p}_k, \nabla f(\mathbf{x}_k) \rangle = -\nabla f(\mathbf{x}_k)^\top P \nabla f(\mathbf{x}_k) \lt 0 }[/math] whenever [math]\displaystyle{ \nabla f(\mathbf{x}_k) \neq 0 }[/math].[1] This generality is used in preconditioned gradient descent methods.
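Continuing the quadratic example above, the sketch below shows that any positive definite matrix [math]\displaystyle{ P }[/math] yields a descent direction; the diagonal preconditioner used here is an arbitrary illustrative choice, not a prescribed one.

```python
import numpy as np

def grad_f(x):
    # Gradient of the example objective f(x) = x_1^2 + 4*x_2^2
    return np.array([2.0 * x[0], 8.0 * x[1]])

P = np.diag([1.0, 0.25])         # positive definite preconditioner (illustrative)
x_k = np.array([1.0, 1.0])
g_k = grad_f(x_k)
p_k = -P @ g_k                   # preconditioned direction p_k = -P * grad f(x_k)

# Descent condition: <p_k, g_k> = -g_k^T P g_k < 0 for g_k != 0
print(np.dot(p_k, g_k) < 0.0)    # True
```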