8 Nov 17

Using convexity in optimization

Using graphical analysis

Here we look at the solution of Example 6.4 from the mt3042 Optimization Guide by M. Baltovic:

maximize $f(x_1,x_2)=p_1x_1+p_2x_2$

subject to $h_1(x_1,x_2)=x_1\geq 0,$ $h_2(x_1,x_2)=x_2\geq0$, $h_3(x_1,x_2)=x_1^2+x_2^2\leq 1$.

$p_1,p_2$ are assumed to be positive. This could be a problem of maximizing revenue from selling two goods, and those familiar with graphical analysis in economics can immediately see that the maximum is achieved where the level line $p_1x_1+p_2x_2=\text{const}$ is tangent to the circle $x_1^2+x_2^2=1.$ The first two constraints are not binding. The proof of the Kuhn-Tucker theorem then shows that we can work with the Lagrangian $L=p_1x_1+p_2x_2+\lambda(x_1^2+x_2^2-1)$, which omits the nonbinding constraints. The implicit function existence condition required for application of the Lagrange theorem, $\left(\frac{\partial h_3}{\partial x_1},\frac{\partial h_3}{\partial x_2}\right)=\left(2x_1,2x_2\right)\neq 0$, is satisfied everywhere except at the origin. The FOC's are

$\frac{\partial L}{\partial x_1}=p_1+2\lambda x_1=0$, $\frac{\partial L}{\partial x_2}=p_2+2\lambda x_2=0$, $\frac{\partial L}{\partial\lambda }=x_1^2+x_2^2-1=0.$
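If one prefers to let a computer algebra system do the work, the three FOC's can be handed to SymPy (assuming it is available). The positivity assumptions on $x_1,x_2$ select the maximizer; with this sign convention of the Lagrangian, $\lambda$ comes out negative, so its sign is left unrestricted. A sketch:

```python
import sympy as sp

p1, p2 = sp.symbols('p1 p2', positive=True)
x1, x2 = sp.symbols('x1 x2', positive=True)   # interior of the first quadrant
lam = sp.Symbol('lam', real=True)             # sign not assumed

# The three first-order conditions from the Lagrangian above.
focs = [p1 + 2*lam*x1, p2 + 2*lam*x2, x1**2 + x2**2 - 1]
sols = sp.solve(focs, [x1, x2, lam], dict=True)
print(sols)
# expect x1 = p1/sqrt(p1**2 + p2**2), x2 = p2/sqrt(p1**2 + p2**2)
```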

I leave it to the reader to find the solution: $(x_1,x_2)=\left(\frac{p_1}{\sqrt{p_1^2+p_2^2}},\frac{p_2}{\sqrt{p_1^2+p_2^2}}\right) .$
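As a quick numerical sanity check (with illustrative prices $p_1=3$, $p_2=4$, my choice, not from the guide), the closed-form candidate can be compared with a brute-force scan of the feasible set:

```python
import math

# Illustrative prices; any positive p1, p2 would do.
p1, p2 = 3.0, 4.0

# Closed-form maximizer from the FOC's above.
norm = math.sqrt(p1**2 + p2**2)
x1, x2 = p1 / norm, p2 / norm

# The candidate lies on the unit circle, so the third constraint binds.
assert abs(x1**2 + x2**2 - 1) < 1e-12

# Its objective value beats a dense sample of feasible points
# (polar coordinates over the first quadrant of the unit disk).
best = max(
    p1 * r * math.cos(t) + p2 * r * math.sin(t)
    for r in [i / 100 for i in range(101)]
    for t in [j * math.pi / 200 for j in range(101)]
)
print(x1, x2, p1 * x1 + p2 * x2)  # the maximum value equals sqrt(p1^2 + p2^2)
```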

Graphical analysis in a more educated way

A set is called convex if, together with any two of its points, it contains the whole segment connecting them. A square and a disk are convex; a heart shape is not (all sets are considered together with their interiors and boundaries).

For a function $f$, the set $\text{epi}f=\left\{(x,y):y\geq f(x)\right\}$ of points lying on and above the graph $y=f(x)$ is called the epigraph of $f$. $f$ is called convex if its epigraph is convex. Examples of convex functions: a linear function $y=p_1x_1+p_2x_2$, a paraboloid $y=x_1^2+x_2^2$, an exponential function $y=e^x$. Examples of functions that are not convex: $y=\log x$ and a parabola $y=ax^2+bx+c$ with $a<0$.
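These examples can be checked numerically via the midpoint form of the convexity inequality, $f\left(\frac{a+b}{2}\right)\le\frac{f(a)+f(b)}{2}$ (a sketch; the sample grid is my own choice):

```python
import math

def midpoint_convex(f, xs):
    """Check f((a+b)/2) <= (f(a)+f(b))/2 for all sample pairs from xs."""
    return all(f((a + b) / 2) <= (f(a) + f(b)) / 2 + 1e-12
               for a in xs for b in xs)

xs = [0.1 * k for k in range(1, 50)]       # sample points 0.1, 0.2, ..., 4.9
print(midpoint_convex(math.exp, xs))       # True: e^x is convex
print(midpoint_convex(math.log, xs))       # False: log x is concave, not convex
```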

Maximum principle. A convex function on a compact convex domain attains its maximum on the boundary (and a nonconstant convex function attains it only there).

Geometric interpretation of the gradient: let $y=f(x_1,x_2)$ be differentiable. Then the gradient $\left(\frac{\partial f}{\partial x_1},\frac{\partial f}{\partial x_2}\right)$, at any given point $(x_1,x_2),$ shows the direction in which the growth of the function is the fastest.
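For the linear objective this is easy to see numerically: the directional derivative of $f$ along a unit vector $(\cos t,\sin t)$ is $p_1\cos t+p_2\sin t$, and it is largest when the direction is parallel to the gradient $(p_1,p_2)$. A sketch with illustrative prices (my choice):

```python
import math

# f(x1, x2) = p1*x1 + p2*x2 with illustrative prices.
p1, p2 = 3.0, 4.0

# Scan unit directions (cos t, sin t) and pick the one with the largest
# directional derivative p1*cos t + p2*sin t.
best_t = max((j * 2 * math.pi / 1000 for j in range(1000)),
             key=lambda t: p1 * math.cos(t) + p2 * math.sin(t))

grad_t = math.atan2(p2, p1)  # direction of the gradient (p1, p2)
print(best_t, grad_t)        # nearly equal, up to the grid step
```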

In our case both the function and the domain are convex, so by the maximum principle the maximum is attained on the boundary. Since the gradient is $(p_1,p_2)$ with positive components, the function increases as we move off the coordinate axes into the first quadrant, so the maximum cannot lie on the axes. Thus, only the third constraint can be binding. The rest of the solution employs the Lagrangian as above.
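The same conclusion can be confirmed with an off-the-shelf solver (here SciPy's `minimize`, assuming SciPy is installed; the prices are again illustrative): at the numerical optimum both coordinates are strictly positive, so $h_1,h_2$ are slack, while $x_1^2+x_2^2\approx 1$, so $h_3$ binds.

```python
from scipy.optimize import minimize

p1, p2 = 3.0, 4.0  # illustrative prices

res = minimize(lambda x: -(p1 * x[0] + p2 * x[1]),  # maximize via negation
               x0=[0.5, 0.5],
               bounds=[(0, None), (0, None)],
               constraints=[{"type": "ineq",       # SciPy convention: fun(x) >= 0
                             "fun": lambda x: 1 - x[0]**2 - x[1]**2}])
print(res.x)  # approximately (0.6, 0.8), matching the closed form
```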

Compare the above solutions with the three-page solution from Baltovic (pp.93-96).

Main lessons: (1) Don't pedantically go through all possible combinations of binding constraints. That is worthwhile only in a multidimensional problem where you have no geometric guidance. (2) Use geometry, convexity, or any other means to figure out which constraints are binding. Then the proof of the Kuhn-Tucker theorem tells you that you can work with the Lagrangian that includes only the binding constraints. Check the constraint qualification (which I call the implicit function existence condition) only for the binding constraints.