Lecture 27, Oct 17

Lagrange method: case of many equality constraints

Consider the problem

(1) maximize $f(x_1,...,x_S)$ subject to $g_1(x_1,...,x_S)=0,...,g_s(x_1,...,x_S)=0$

(the number of constraints $s$ is smaller than the number of variables $S$).

The idea is to understand the proof in the case of one constraint and to obtain the form of the result for many constraints by analogy.

Implicit function theorem

We saw previously that the condition

(2) $\frac{\partial g}{\partial y}\neq 0$

is sufficient for the existence of the implicit function $y(x)$ defined by $g(x,y(x))=0.$ This statement is asymmetric. What if we want to find $x$ as a function of $y$? The condition

(3) $\left(\frac{\partial g}{\partial x},\frac{\partial g}{\partial y}\right)\neq0$

is more general than (2). If $\frac{\partial g}{\partial x}\neq 0$, then the constraint defines $x$ as a function of $y$, and if $\frac{\partial g}{\partial y}\neq 0$, then $y$ can be found as a function of $x.$ For example, for the circle $x^2+y^2=1$, near points with $x\neq 0$ we can find $x=x(y)$, and near points with $y\neq 0$ we can find $y=y(x)$; together these two cases cover the whole circle.
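The circle example can be checked numerically. A minimal Python sketch (the function names are ours) verifying that the explicit branch $y(x)=\sqrt{1-x^2}$ satisfies the constraint wherever $\frac{\partial g}{\partial y}=2y\neq 0$:

```python
import math

def g(x, y):
    # constraint defining the unit circle
    return x**2 + y**2 - 1

def y_of_x(x):
    # explicit branch y(x) = sqrt(1 - x^2), valid where dg/dy = 2y != 0
    return math.sqrt(1 - x**2)

# check that g(x, y(x)) = 0 along the upper half of the circle
for x in [-0.9, -0.5, 0.0, 0.5, 0.9]:
    assert abs(g(x, y_of_x(x))) < 1e-12
```

At $x=\pm 1$ the branch cannot be continued since $\frac{\partial g}{\partial y}=0$ there; these are exactly the points where one solves for $x(y)$ instead.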

In the case of the more general implicit function existence condition (3), our derivation of the necessary and sufficient conditions goes through (if $\frac{\partial g}{\partial x}\neq 0$ holds instead of $\frac{\partial g}{\partial y}\neq 0$, one can simply swap the roles of $x$ and $y$).

To obtain the multi-dimensional generalization of (3), denote

$x=(x_1,...,x_S),\ G(x)=G(x_1,...,x_S)=\left(\begin{array}{c} g_1(x_1,...,x_S)\\...\\g_s(x_1,...,x_S)\end{array}\right)$

and

$\frac{\partial G}{\partial x}=\left(\begin{array}{ccc}\frac{\partial g_1}{\partial x_1}& ... &\frac{\partial g_1}{\partial x_S}\\... & ... & ...\\\frac{\partial g_s}{\partial x_1}&...&\frac{\partial g_s}{\partial x_S}\end{array}\right)=\left(\begin{array}{c}Dg_1(x)\\...\\Dg_s(x)\end{array}\right).$

The last equation follows from our notation for the gradient $Df(x).$ Problem (1) becomes

maximize $f(x)$ subject to $G(x)=0.$

The required generalization of (3) can now be stated as follows:

(4) The matrix $\frac{\partial G}{\partial x}$ has rank $s$ or, equivalently, the vectors $Dg_1(x),...,Dg_s(x)$ are linearly independent.

Remark. A convenient way to check this condition is to verify that $\det\left(\frac{\partial G}{\partial x}\left(\frac{\partial G}{\partial x}\right)^T\right)\neq0$. Note the order of the factors: $\frac{\partial G}{\partial x}\left(\frac{\partial G}{\partial x}\right)^T$ is an $s\times s$ matrix, which is nonsingular exactly when $\frac{\partial G}{\partial x}$ has rank $s$; the product in the opposite order is $S\times S$ of rank at most $s<S$, so its determinant is always zero.
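As an illustration, here is a sketch checking condition (4) for two constraints of our own choosing in $R^3$ (the unit sphere and a plane, $s=2<S=3$), testing that $\frac{\partial G}{\partial x}\left(\frac{\partial G}{\partial x}\right)^T$ is nonsingular:

```python
import numpy as np

# Illustrative constraints (not from the notes):
#   g1(x) = x1^2 + x2^2 + x3^2 - 1   (unit sphere)
#   g2(x) = x1 - x2                  (a plane)
def G_x(x):
    # Jacobian dG/dx: rows are the gradients Dg1(x), Dg2(x)
    x1, x2, x3 = x
    return np.array([[2*x1, 2*x2, 2*x3],
                     [1.0, -1.0, 0.0]])

x = np.array([0.6, 0.6, np.sqrt(1 - 2*0.36)])  # a point satisfying both constraints
J = G_x(x)
print(np.linalg.matrix_rank(J))           # 2, i.e. full rank s
print(np.linalg.det(J @ J.T))             # nonzero, so condition (4) holds
```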

First and second order conditions

Denote $\lambda=(\lambda_1,...,\lambda_s)$ (a row vector with as many components as there are constraints). The Lagrangian is defined by

$L(x,\lambda )=f(x)+\lambda G(x).$

The first-order conditions become

(5) $\frac{\partial L}{\partial x}=0,$ $\frac{\partial L}{\partial \lambda }=0.$

The Hessian is

$D^2L=\left(\begin{array}{ccc}\frac{\partial^2L}{\partial x_1^2}&...&\frac{\partial^2L}{\partial x_1\partial x_S}\\...&...&...\\ \frac{\partial^2L}{\partial x_S\partial x_1}&...&\frac{\partial^2L}{\partial x_S^2}\end{array}\right).$

Recall that the refined sufficient condition was based on the set $Z.$ Now it becomes

$Z=\{a\in R^S:\frac{\partial G}{\partial x}a=0\}.$

Finally, we can state the sufficient condition:

Theorem. Assume the implicit function existence condition (4) and let $x$ be a critical point of the Lagrangian (that is, a point satisfying (5)). a) If at that point $h^TD^2Lh>0$ for every nonzero $h\in Z$, then $x$ is a minimum point. b) If at that point $h^TD^2Lh<0$ for every nonzero $h\in Z$, then $x$ is a maximum point.
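To illustrate the whole procedure, here is a sketch on a small example of our own (not from the notes): minimize $f(x)=x_1^2+x_2^2+x_3^2$ subject to $g_1(x)=x_1+x_2+x_3-1=0$ and $g_2(x)=x_1-x_2=0$. Since the constraints are linear, the first-order conditions (5) form a linear system in $(x_1,x_2,x_3,\lambda_1,\lambda_2)$, and the second-order condition is checked on a basis of $Z$:

```python
import numpy as np

# First-order conditions (5): Df(x) + lambda * dG/dx = 0 together with G(x) = 0,
# written as a 5x5 linear system in (x1, x2, x3, lambda1, lambda2)
A = np.array([[2, 0, 0, 1,  1],
              [0, 2, 0, 1, -1],
              [0, 0, 2, 1,  0],
              [1, 1, 1, 0,  0],
              [1, -1, 0, 0, 0]], dtype=float)
b = np.array([0, 0, 0, 1, 0], dtype=float)
sol = np.linalg.solve(A, b)
x, lam = sol[:3], sol[3:]
print(x)  # the critical point (1/3, 1/3, 1/3)

# Second-order check: D^2 L = 2I (the constraints are linear), so
# h^T D^2 L h = 2 ||h||^2 > 0 for every nonzero h in Z: a minimum.
J = np.array([[1.0, 1.0, 1.0], [1.0, -1.0, 0.0]])   # dG/dx, rank 2
_, _, Vt = np.linalg.svd(J)
h = Vt[2]                        # basis vector of Z (dim Z = 3 - 2 = 1)
D2L = 2 * np.eye(3)
print(h @ D2L @ h > 0)           # True: the theorem gives a minimum point
```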