Oct 17

Importance of implicit function theorem for optimization

The ultimate goal is to study the Lagrange method for optimization with equality constraints. However, this is impossible to do without the implicit function theorem. For background, on the way to the Lagrange method, we first consider a home-made method for solving the problem:

minimize f(x,y) subject to g(x,y)=0.

Geometry. The constraint defines a surface \{(x,y,z):g(x,y)=0\} in 3D space (a vertical surface over the curve g(x,y)=0 in the plane). The intersection of that surface with the graph \{(x,y,z):z=f(x,y)\} is the curve along which we need to minimize f.

Implicit function theorem

Example 1. f(x,y)=x^2+y^2, g(x,y)=x+y. We solve this problem by reducing it to a one-dimensional case.

Namely, the constraint x+y=0 can be solved for y giving

(1) y(x)=-x.

Geometrically, the constraint defines a straight line on the plane (x,y). The set of points (x,y,z) in the 3D space is a vertical plane through the line y=-x, because the constraint does not contain any restrictions on z and, for each (x,y), z can be arbitrary. This vertical plane cuts the surface z=f(x,y) along a curve which we need to minimize.

Plugging (1) in f, we obtain a function of one variable:

F(x)=f(x,y(x))=x^2+(-x)^2=2x^2.

Obviously, this is minimized at x=0, which gives y=0. Thus, the point (0,0) is the solution.
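The substitution method of Example 1 can be sketched numerically. This is a minimal illustration (the names F, xs, x_star are mine, not from the text): we plug y(x)=-x into f and minimize the resulting one-variable function over a grid.

```python
# Example 1: minimize f(x, y) = x^2 + y^2 subject to x + y = 0,
# by substituting the implicit function y(x) = -x from equation (1).

def f(x, y):
    return x**2 + y**2

def F(x):
    # One-variable function obtained by plugging y(x) = -x into f;
    # algebraically F(x) = 2*x**2.
    return f(x, -x)

# Scan a grid and pick the minimizer.
xs = [i / 1000 for i in range(-2000, 2001)]
x_star = min(xs, key=F)
print(x_star, -x_star)  # the constrained minimum is at (0, 0)
```

The grid search is, of course, only a sanity check; the exact answer x=0 follows from the algebra above.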

Let's think again about the solution presented. Did you see an implicit function in that solution? Solving an equation g(x,y)=0 for y means exactly finding y as an implicit function of x. In our case it was easy to find (1). We need a condition that would guarantee the existence of an implicit function in the general case.

Example 2. Let g(x,y)=x. Can we find y from x=0? Obviously, no, because y is not in the equation. We need a condition that makes sure that g(x,y) indeed depends on y in a nontrivial way.

(2) \frac{\partial g}{\partial y}\neq 0

is such a condition, according to the next theorem.

Implicit function theorem. Suppose that g is continuously differentiable, that g(x_0,y_0)=0, and that (2) holds at the point (x_0,y_0). Then the equation

(3) g(x,y)=0

defines y as an implicit function of x, in some neighborhood of (x_0,y_0).
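The theorem is not merely an existence statement; near such a point the implicit function can actually be computed. As a sketch (my own illustration, using the circle of Example 3 below), Newton's method in y solves g(x,y)=0 for each fixed x close to x_0, precisely because \frac{\partial g}{\partial y}\neq 0 keeps the Newton step well defined:

```python
# Computing the implicit function y(x) defined by g(x, y) = 0
# via Newton's method in y, valid near a point where dg/dy != 0.

def g(x, y):
    return x**2 + y**2 - 1.0  # unit circle; g(0, 1) = 0

def dg_dy(x, y):
    return 2.0 * y

def implicit_y(x, y_guess, tol=1e-12):
    """Solve g(x, y) = 0 for y by Newton iteration, starting from y_guess."""
    y = y_guess
    for _ in range(50):
        step = g(x, y) / dg_dy(x, y)  # well defined while dg/dy != 0
        y -= step
        if abs(step) < tol:
            break
    return y

# Near (x0, y0) = (0, 1) the theorem applies since dg/dy = 2 != 0 there.
print(implicit_y(0.6, 1.0))  # approx sqrt(1 - 0.36) = 0.8
```

Starting from the branch y>0, the iteration stays on that branch and recovers y(x)=\sqrt{1-x^2} numerically.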

Example 3. The equation

(4) x^2+y^2=1

describes a circle of radius 1 centered at the origin. In this case g(x,y)=x^2+y^2-1, \frac{\partial g}{\partial y}=2y. Thus, when y\neq 0, we can solve (4) for y. For y>0 we have y=\sqrt{1-x^{2}} and for y<0 we have y=-\sqrt{1-x^2}.

Remark. Condition (2) also allows us to find the derivative y^{\prime }(x). When the implicit function determined by (3) exists, (3) in fact becomes

(5) g(x,y(x))=0.

Differentiating both sides with respect to x and using the 2D version of the chain rule we get \frac{\partial g}{\partial x}x^\prime+\frac{\partial g}{\partial y}y^\prime(x)=\frac{\partial g}{\partial x}+\frac{\partial g}{\partial y}y^\prime(x)=0.

Because of (2), from here we can find the derivative

(6) y^\prime(x)=-\frac{\partial g}{\partial x}/\frac{\partial g}{\partial y}.

In particular, for Example 3 for y>0 we have y^\prime(x)=-\frac{2x}{2y}=-\frac{x}{\sqrt{1-x^2}}, which could be found directly from our solution for y.
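Formula (6) is easy to check numerically. The following sketch (function names are mine) compares (6) on the branch y>0 of Example 3 with a central finite-difference approximation of the derivative of the explicit solution y=\sqrt{1-x^2}:

```python
import math

def y(x):
    return math.sqrt(1.0 - x * x)  # explicit branch of (4) for y > 0

def y_prime_formula(x):
    # Formula (6): y' = -(dg/dx)/(dg/dy) = -(2x)/(2y) = -x/sqrt(1 - x^2)
    return -(2.0 * x) / (2.0 * y(x))

def y_prime_numeric(x, h=1e-6):
    # Central-difference approximation of y'(x)
    return (y(x + h) - y(x - h)) / (2.0 * h)

x = 0.3
print(y_prime_formula(x), y_prime_numeric(x))  # both approx -x/sqrt(1 - x^2)
```

The two values agree to many digits, confirming that (6) reproduces the derivative obtained by differentiating the explicit solution directly.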
