14 Oct 17

## Unconstrained optimization on the plane: sufficient conditions

Based on what we know about the one-dimensional case, we should expect that the first-order and second-order conditions jointly are sufficient for a function to have an extremum inside the domain. It remains to find out what those conditions are. Previous posts also provide convincing evidence that one has to work with the Taylor approximation:

(1) $f(x+h)\approx f(x)+Df(x)h+\frac{1}{2}h^TD^2f(x)h.$
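To make (1) concrete, here is a quick numerical check; the function $f(x_1,x_2)=x_1^2+3x_1x_2+2x_2^2$ and the point are my own choices for illustration, and for a quadratic function the second-order Taylor approximation is in fact exact.

```python
import numpy as np

# Illustration of approximation (1) for f(x1, x2) = x1^2 + 3*x1*x2 + 2*x2^2
# (a function chosen only for this example; any smooth f works).
def f(x):
    return x[0]**2 + 3*x[0]*x[1] + 2*x[1]**2

def Df(x):  # gradient: vector of first-order partial derivatives
    return np.array([2*x[0] + 3*x[1], 3*x[0] + 4*x[1]])

D2f = np.array([[2.0, 3.0],   # Hessian; constant because f is quadratic
                [3.0, 4.0]])

x = np.array([1.0, -1.0])
h = np.array([0.01, 0.02])

taylor = f(x) + Df(x) @ h + 0.5 * h @ D2f @ h
# For a quadratic f the approximation is exact, so the gap is 0 up to rounding.
print(abs(f(x + h) - taylor))
```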

Approximation (1) has been derived here. The necessity of the first-order conditions for a function to have an extremum

(2) $\frac{\partial f(x)}{\partial x_1}=0,\ \frac{\partial f(x)}{\partial x_2}=0$

has been shown here. Under these conditions (1) gives

(3) $f(x+h)\approx f(x)+\frac{1}{2}h^TD^2f(x)h.$

Case of a minimum. This approximation makes it clear that to have $f(x+h)>f(x)$ for all small nonzero $h$ we need to have $\frac{1}{2}h^TD^2f(x)h>0$ for all such $h$. (We require $h$ to be nonzero because otherwise we get $x+h=x$, and we want it to be small because we are looking for a local minimum.)

Case of a maximum. Similarly, to have $f(x+h)<f(x)$ for all small nonzero $h$ we need to have $\frac{1}{2}h^TD^2f(x)h<0$ for all such $h$.

These observations lead us to the following definition.

Definition. Let $A$ be a square matrix and let $h$ be a column-vector of compatible dimension. 1) We say that $A$ is positive definite if the quadratic form $h^TAh$ is positive for all $h\ne0$. 2) We say that $A$ is negative definite if the quadratic form $h^TAh$ is negative for all $h\ne0$.
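As a small illustration of this definition (the matrix below is my own example), one can evaluate the quadratic form at many sampled vectors; sampling of course illustrates the definition rather than proving definiteness.

```python
import numpy as np

# For A = [[2, 1], [1, 2]] the quadratic form is
#   h^T A h = 2*h1^2 + 2*h1*h2 + 2*h2^2 = h1^2 + h2^2 + (h1 + h2)^2,
# which is positive for every h != 0, so A is positive definite.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

rng = np.random.default_rng(0)
samples = rng.standard_normal((1000, 2))   # random nonzero vectors h
values = [h @ A @ h for h in samples]
print(min(values) > 0)   # all sampled quadratic-form values are positive
```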

Note that in this definition we don't require $h$ to be small because in general the matrix $A$ is not related to any optimization problem. The definition is stated in terms of the quadratic form, which is a function of $h$, and is therefore difficult to apply directly. The question is: how do you know whether a matrix is positive or negative definite? The answer is given by Sylvester's criterion. For reasons we cannot discuss here, it applies only to symmetric matrices, which by definition satisfy $A^T=A$. As an exercise, check that a Hessian is symmetric (assuming the mixed second-order partials are continuous).

Denote

$m_1=a_{11},\ m_2=\det\left(\begin{array}{cc}a_{11}&a_{12}\\a_{21}&a_{22}\end{array}\right),$ $m_3=\det\left(\begin{array}{ccc}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{array}\right),...$

These are the determinants of the upper-left square submatrices of $A$. They are called the leading principal minors.

Sylvester's criterion. Let $A$ be symmetric. 1) $A$ is positive definite if and only if all leading principal minors are positive. 2) $A$ is negative definite if and only if the leading principal minors alternate in sign, starting with a minus: $m_1<0,\ m_2>0,\ m_3<0,...$.
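The criterion is easy to implement; here is a minimal sketch (the helper names and the test matrix are mine) that computes the leading principal minors directly.

```python
import numpy as np

def is_positive_definite(A):
    """Sylvester's criterion: all leading principal minors are positive."""
    n = A.shape[0]
    return all(np.linalg.det(A[:k, :k]) > 0 for k in range(1, n + 1))

def is_negative_definite(A):
    """Minors alternate in sign starting with a minus: m1<0, m2>0, ..."""
    n = A.shape[0]
    return all((-1)**k * np.linalg.det(A[:k, :k]) > 0 for k in range(1, n + 1))

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
print(is_positive_definite(A))    # m1 = 2 > 0, m2 = 3 > 0
print(is_negative_definite(-A))   # m1 = -2 < 0, m2 = 3 > 0
```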

Conclusion. 1) If the first-order conditions (2) are satisfied and the Hessian is positive definite (second-order condition), then $x$ is a local minimum point. 2) If the first-order conditions (2) are satisfied and the Hessian is negative definite (second-order condition), then $x$ is a local maximum point.
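Putting both steps together, here is a worked example of my own: for $f(x_1,x_2)=x_1^2-x_1x_2+x_2^2$ the first-order conditions $2x_1-x_2=0$, $-x_1+2x_2=0$ have the unique solution $x=(0,0)$, and the Hessian is constant, so the sign pattern of its minors classifies the critical point.

```python
import numpy as np

# Hessian of f(x1, x2) = x1^2 - x1*x2 + x2^2, which is constant in x.
H = np.array([[ 2.0, -1.0],
              [-1.0,  2.0]])

m1 = H[0, 0]            # first leading principal minor
m2 = np.linalg.det(H)   # second leading principal minor (= 3 here)

# m1 > 0 and m2 > 0, so H is positive definite by Sylvester's criterion
# and the critical point (0, 0) is a local minimum.
print(m1 > 0 and m2 > 0)
```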