Oct 17

## Lagrange method: case of many equality constraints

Consider the problem

(1) maximize $f(x_1,...,x_S)$ subject to $g_1(x_1,...,x_S)=0,...,g_s(x_1,...,x_S)=0$

(the number of constraints $s$ is smaller than the number of variables $S$).

The idea is to understand the proof in the case of one constraint and then develop the statement of the result for many constraints by analogy.

## Implicit function theorem

We saw previously that the condition

(2) $\frac{\partial g}{\partial y}\neq 0$

is sufficient for existence of the implicit function $y(x)$ defined by $g(x,y(x))=0.$ This statement is asymmetric. What if we want to find $x$ as a function of $y$? The condition

(3) $\left(\frac{\partial g}{\partial x},\frac{\partial g}{\partial y}\right)\neq0$

is more general than (2). If $\frac{\partial g}{\partial x}\neq 0$, then the constraint defines $x$ as a function of $y$, and if $\frac{\partial g}{\partial y}\neq 0$, then $y$ can be found as a function of $x$. For example, for the circle $x^2+y^2=1$: where $x\neq 0$ we can find $x=x(y)$, where $y\neq 0$ we can find $y=y(x)$, and together these two cases cover the whole circle.
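The circle example can be checked numerically. The sketch below (a made-up verification, not part of the argument) takes the upper semicircle branch, where $\frac{\partial g}{\partial y}=2y\neq 0$, and confirms both that $g(x,y(x))=0$ and that implicit differentiation gives $y'(x)=-\frac{\partial g/\partial x}{\partial g/\partial y}=-x/y$:

```python
import math

# g(x, y) = x^2 + y^2 - 1 defines the unit circle.
def g(x, y):
    return x**2 + y**2 - 1

# Where dg/dy = 2y != 0 (upper half), y can be solved as a function of x.
def y_of_x(x):
    return math.sqrt(1 - x**2)

x0 = 0.6
y0 = y_of_x(x0)                    # point on the circle with y0 = 0.8 > 0
assert abs(g(x0, y0)) < 1e-12      # g(x, y(x)) = 0 along the branch

# Implicit differentiation gives y'(x) = -(dg/dx)/(dg/dy) = -x/y.
h = 1e-6
numeric = (y_of_x(x0 + h) - y_of_x(x0 - h)) / (2 * h)
implicit = -x0 / y0
assert abs(numeric - implicit) < 1e-6
```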

Under the more general implicit function existence condition (3), our derivation of the necessary and sufficient conditions goes through: if $\frac{\partial g}{\partial x}\neq 0$ holds instead of $\frac{\partial g}{\partial y}\neq 0$, one simply swaps the roles of $x$ and $y$.

To obtain the multi-dimensional generalization of (3), denote $x=(x_1,...,x_S),\ G(x)=G(x_1,...,x_S)=\left(\begin{array}{c}g_1(x_1,...,x_S) \\... \\g_s(x_1,...,x_S) \end{array}\right)$

and $\frac{\partial G}{\partial x}=\left(\begin{array}{ccc}\frac{\partial g_1}{\partial x_1}& ... &\frac{\partial g_1}{\partial x_S} \\ ... & ... & ... \\ \frac{\partial g_s}{\partial x_1}&...&\frac{\partial g_s}{\partial x_S}\end{array}\right)=\left(\begin{array}{c}Dg_1(x) \\... \\Dg_s(x) \end{array}\right).$

The last equality uses our notation $Df(x)$ for the gradient (a row vector). Problem (1) becomes

maximize $f(x)$ subject to $G(x)=0.$

The required generalization of (3) can now be stated as follows:

(4) The matrix $\frac{\partial G}{\partial x}$ has rank $s$ or, equivalently, the vectors $Dg_1(x),...,Dg_s(x)$ are linearly independent.

Remark. A convenient way to check this condition is to verify that $\det\left(\frac{\partial G}{\partial x}\left(\frac{\partial G}{\partial x}\right)^T\right)\neq0$. Note the order of the factors: since $\frac{\partial G}{\partial x}$ is $s\times S$ with $s<S$, the reverse product $\left(\frac{\partial G}{\partial x}\right)^T\frac{\partial G}{\partial x}$ is an $S\times S$ matrix of rank at most $s$, so its determinant is always zero.
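Condition (4) and the Gram-determinant check can be illustrated numerically. The constraints below are a made-up example: $g_1=x_1^2+x_2^2+x_3^2-1$ (a sphere) and $g_2=x_1+x_2+x_3$ (a plane), evaluated at the feasible point $x=(1,-1,0)/\sqrt2$:

```python
import numpy as np

x = np.array([1.0, -1.0, 0.0]) / np.sqrt(2)   # a point with g1(x) = g2(x) = 0

# Rows of dG/dx are the gradients Dg1(x) = 2x and Dg2(x) = (1, 1, 1).
J = np.vstack([2 * x, np.ones(3)])            # shape (s, S) = (2, 3)

# Condition (4): rank s, checked via the s x s Gram determinant.
gram = J @ J.T                                # here [[4, 0], [0, 3]]
assert np.linalg.matrix_rank(J) == 2
assert abs(np.linalg.det(gram)) > 1e-12      # det = 12, so rank is s = 2
```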

## First and second order conditions

Denote $\lambda=(\lambda_1,...,\lambda_s)$ (this is a row vector with as many components as restrictions). The Lagrangian is defined by $L(x,\lambda )=f(x)+\lambda G(x).$

The first-order conditions become

(5) $\frac{\partial L}{\partial x}=0,$ $\frac{\partial L}{\partial \lambda }=0.$
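For a quadratic $f$ with affine constraints, (5) is a linear system in $(x,\lambda)$ and can be solved directly. The sketch below uses a made-up example, minimizing $f(x)=|x|^2$ subject to $g_1=x_1+x_2+x_3-1=0$ and $g_2=x_1-x_2=0$; the solution is $x=(1/3,1/3,1/3)$:

```python
import numpy as np

# Hypothetical example: minimize f(x) = |x|^2 subject to
# g1 = x1 + x2 + x3 - 1 = 0 and g2 = x1 - x2 = 0.
A = np.array([[1.0, 1.0, 1.0],
              [1.0, -1.0, 0.0]])              # dG/dx (constant here)
b = np.array([1.0, 0.0])

# First-order conditions (5): 2x + A^T lambda = 0 and A x = b,
# assembled as one linear system in (x, lambda).
K = np.block([[2 * np.eye(3), A.T],
              [A, np.zeros((2, 2))]])
sol = np.linalg.solve(K, np.concatenate([np.zeros(3), b]))
x, lam = sol[:3], sol[3:]

assert np.allclose(x, [1/3, 1/3, 1/3])
assert np.allclose(2 * x + A.T @ lam, 0)      # dL/dx = 0 at the solution
```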

The Hessian of $L$ with respect to $x$ is $D^2L=\left(\begin{array}{ccc}\frac{\partial^2L}{\partial x_1^2} &... & \frac{\partial^2L}{\partial x_1\partial x_S} \\...&...&... \\ \frac{\partial^2L}{\partial x_S\partial x_1} &... &\frac{\partial^2L}{\partial x_S^2} \end{array}\right).$

Recall that the refined sufficient condition was based on the set $Z.$ Now it becomes $Z=\{a\in R^S:\frac{\partial G}{\partial x}a=0\}.$

Finally, we can state the sufficient condition:

Theorem. Assume the implicit function existence condition (4) and consider a critical point $x$ of the Lagrangian (one that satisfies (5)). a) If at that point $h^TD^2Lh>0$ for every nonzero $h\in Z$, then that point is a local minimum point. b) If at that point $h^TD^2Lh<0$ for every nonzero $h\in Z$, then that point is a local maximum point.
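The theorem's test only requires definiteness on $Z$, not on all of $R^S$. As a numerical sketch, take the hypothetical constraints $g_1=x_1+x_2+x_3-1$ and $g_2=x_1-x_2$ (so $\frac{\partial G}{\partial x}$ is the constant matrix $A$ below) together with a made-up Hessian that is indefinite on $R^3$ yet positive definite on $Z$:

```python
import numpy as np

# dG/dx for the constraints g1 = x1 + x2 + x3 - 1, g2 = x1 - x2;
# Z = {h : A h = 0} is the null space of A, here spanned by (1, 1, -2).
A = np.array([[1.0, 1.0, 1.0],
              [1.0, -1.0, 0.0]])

# Orthonormal basis of Z from the SVD: rows of Vt beyond rank(A).
_, _, Vt = np.linalg.svd(A)
N = Vt[2:].T                                 # 3 x 1 basis matrix of Z

# A Hessian that is indefinite on R^3 ...
H = np.diag([2.0, 2.0, -0.5])
assert np.any(np.linalg.eigvalsh(H) < 0)

# ... but positive definite on Z: h^T H h > 0 for all nonzero h in Z,
# checked via the eigenvalues of the reduced matrix N^T H N.
reduced = N.T @ H @ N
assert np.all(np.linalg.eigvalsh(reduced) > 0)
```

So a point with this Hessian of the Lagrangian would pass test a) of the theorem even though $D^2L$ itself is not positive definite.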