The Kuhn-Tucker theorem ins and outs
We capitalize on previous posts on constrained optimization to explain the format of the Kuhn-Tucker theorem. So far all proofs for one constraint have been done in full. Here we indicate how they can be generalized to the case of many constraints, without going into technicalities.
Problem statement
Consider the maximization problem with several inequality constraints
(1) maximize subject to
.
Suppose is a maximizing point and
out of
constraints are binding. Renumbering the constraints, if necessary, we can assume that it is the first
constraints that are binding:
(2)
The full Lagrangian is defined by where
The purpose is to formulate conditions which the Lagrangian, Lagrangian multipliers and constraints should satisfy.
The implicit function existence
Let us separate the equality constraints by including them in
Equality constraints are supposed to determine some of the variables as implicit functions of the others. The number of the implicit functions should be equal to the number of constraints. This results in the condition that the matrix
should have rank The number of "the other variables" should be positive, which leads to the condition
Reduction of inequality-constrained problem to equality-constrained one
If you compare my posts with the book by Sundaram, R.K. (A First Course in Optimisation Theory, Cambridge: Cambridge University Press, 1996) or the MT3042 Optimisation Theory Guide by M. Balcovic, you will notice that they assume the domain of to be an open set, while I everywhere implicitly assumed it to be the whole
This difference doesn't matter, as long as the extreme point is an internal point of the domain, which I also implicitly assumed. Another assumption that I have been skipping is that the functions have continuous derivatives of appropriate orders (one or two, depending on the context). Now I have to explicitly make and use such assumptions.
Smoothness conditions. The objective and constraint functions have continuous first derivatives.
Differentiable functions are known to be continuous. Therefore the set determined by the nonbinding constraints is open. Since by assumption the maximizing point belongs to the set (2), it will not change if we replace the original domain of
, whatever it was, by
. What we have achieved by this trick is that the nonbinding constraints have been subsumed in the domain and can be dropped from the optimization problem. The resulting problem
(3) maximize subject to
has only equality constraints and admits application of the result we already have.
The Kuhn-Tucker theorem derived from constrained optimization
For (3) we form a partial Lagrangian with
The first-order conditions are
(4) ,
.
Let us set the remaining components of to zero:
(5)
For this reason the partial Lagrangian is the same as the full one, and the first equation from (4) is the same as
or, in terms of gradients,
(A)
From (1), obviously,
(B) .
Next we need to establish
(C) .
The first equations follow from (2) and the last
follow from (5). Finally, the Kuhn-Tucker theorem states that
(D)
The last inequalities follow from (5). The first
inequalities actually express the property: If a constraint is efficient, then the Lagrange multiplier is nonnegative. This property was established by perturbing one equality constraint. Since in the present case each equality constraint can be perturbed independently of the others, the same proof goes through.
Kuhn-Tucker theorem. If the implicit function existence and smoothness conditions are satisfied, then an internal maximum must satisfy (A)-(D).
Remark. Karush, in his master's thesis, discovered this theorem 12 years before Kuhn and Tucker published it.