Review of MT3042 Optimization Guide by M. Baltovic
by Kairat Mynbaev, professor, ISE, KBTU, Almaty, Kazakhstan
This is a one-semester course in Optimization. It covers the following topics:
- Introduction and review of relevant parts from real analysis, with emphasis on higher dimensions.
- Weierstrass' Theorem on continuous functions on compact sets.
- Review with added rigour of unconstrained optimization of differentiable functions.
- Lagrange's Theorem on equality-constrained optimization.
- The Kuhn-Tucker Theorem on inequality-constrained optimization.
- Finite and infinite horizon dynamic programming.
The course is based mainly on R.K. Sundaram's book A First Course in Optimization Theory (Cambridge University Press, 1996).
The course is given to fourth-year students. I evaluate the guide on two points: how well it expands the students' theoretical horizons and how well it prepares them for the challenges they may face while applying their knowledge in their careers, whether in industry or in academia.
- The exposition is overly formal. Experience shows that most students do not understand the epsilon-delta definition of continuity. Right after giving it on p.18, the author gives an alternative definition in terms of sequences. It would be better to omit the epsilon-delta definition altogether and rephrase everything in terms of sequences; after all, the notion of compactness also relies on sequences.
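For reference, the sequential definition is short enough to state in one line:

```latex
f \text{ is continuous at } x
\iff
\forall (x_n):\ x_n \to x \implies f(x_n) \to f(x).
```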
- The differentiability definition 2.21 on p.19 is simply indigestible. It is in fact the Fréchet derivative, which is designed for the infinite-dimensional case. Who needs it here? Why not define the derivative as the matrix of partial derivatives, which students know from previous courses?
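The suggested alternative is the familiar Jacobian: for f from R^n to R^m, the derivative at x is the m-by-n matrix of partial derivatives,

```latex
Df(x) =
\begin{pmatrix}
\dfrac{\partial f_1}{\partial x_1}(x) & \cdots & \dfrac{\partial f_1}{\partial x_n}(x)\\
\vdots & \ddots & \vdots\\
\dfrac{\partial f_m}{\partial x_1}(x) & \cdots & \dfrac{\partial f_m}{\partial x_n}(x)
\end{pmatrix}.
```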
- The solution to Example 3.2 is overblown. A professional mathematician never thinks like that. A pro would explain the idea as follows: because of Condition 2, the function is close to zero in some neighborhood of infinity, so the maximum can be sought on a closed bounded set. Since this set is compact, the Weierstrass theorem applies. With a proper graphical illustration, the students need nothing else. Similar remarks apply to many solutions in the guide. As a result, the students don't see the forest for the trees.
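The argument just sketched takes two lines (assuming, for concreteness, a continuous f on R^n with f(x) tending to 0 as ||x|| tends to infinity and f(x_0) > 0 at some point; these assumptions are mine, standing in for Condition 2):

```latex
\begin{aligned}
&\exists R > 0:\ \lVert x \rVert > R \implies f(x) < f(x_0),
\quad\text{so}\quad \sup_{\mathbb{R}^n} f = \sup_{\lVert x \rVert \le R} f.\\
&\text{The ball } \{x : \lVert x \rVert \le R\} \text{ is closed and bounded, hence compact,}\\
&\text{so by Weierstrass' theorem the supremum is attained.}
\end{aligned}
```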
- Regarding the theoretical conditions, the author refers to Sundaram without explaining why they are imposed. The explanations in Sundaram are far from intuitive (see the first of them on p.107). In all cases, for n = 2, it is possible to give relatively simple intuitive explanations. Specifically,
- For first-order conditions see
- For second-order conditions see
- For the constraint qualification condition, see
The explanation on pp.58-60 is hard to understand because of the high dimensionality.
- For the Lagrange method see
(case of many constraints) and
- In the case of the Kuhn-Tucker theorem, the most important point is that, once the binding constraints have been determined, the nonbinding ones can be omitted from the analysis.
The proof of nonnegativity of the Lagrange multiplier for binding constraints is less than one page:
- For first-order conditions see
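The nonnegativity argument mentioned above can be compressed even further; here is a sketch for one binding constraint (the notation is mine, not the guide's):

```latex
\begin{aligned}
&\text{Maximize } f(x) \text{ subject to } g(x) \ge 0; \text{ the constraint binds: } g(x^*) = 0,\\
&\text{with } \nabla g(x^*) \ne 0 \text{ and the first-order condition }
\nabla f(x^*) + \lambda \nabla g(x^*) = 0.\\
&\text{The direction } d = \nabla g(x^*) \text{ points into the feasible set, since for } t \ge 0\\
&\qquad g(x^* + t d) \approx g(x^*) + t\,\nabla g(x^*) \cdot d
  = t\,\lVert \nabla g(x^*) \rVert^2 \ge 0.\\
&\text{Since } x^* \text{ is a maximum, } f \text{ cannot increase along } d:\\
&\qquad 0 \ge \nabla f(x^*) \cdot d = -\lambda\,\lVert \nabla g(x^*) \rVert^2
  \implies \lambda \ge 0.
\end{aligned}
```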
- In solutions that rely on the Kuhn-Tucker theorem, the author suggests checking the constraint qualification condition for all possible combinations of constraints. Not only is this time-consuming, it is also misleading: often it is possible to determine the binding constraints directly and then use the Lagrange method instead of the Kuhn-Tucker theorem.
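A hypothetical illustration of this point (the example is mine, not from the guide): maximize f(x, y) = x + y subject to x^2 + y^2 <= 1, x >= 0, y >= 0. Geometry makes it clear that only the disc constraint binds, so the Lagrange method on the circle x^2 + y^2 = 1 is enough:

```python
import math

# Hypothetical example (not from the guide): maximize f(x, y) = x + y
# subject to x^2 + y^2 <= 1, x >= 0, y >= 0.
# Guess that only the disc constraint binds and solve the Lagrange system
#   1 = 2*lam*x,  1 = 2*lam*y,  x^2 + y^2 = 1.
# The first two equations give x = y; the constraint then gives:
x = y = 1.0 / math.sqrt(2.0)
lam = 1.0 / (2.0 * x)  # multiplier from 1 = 2*lam*x

# Verify the guess a posteriori: the other constraints hold strictly,
# and the multiplier of the binding constraint is positive, as Kuhn-Tucker requires.
assert x > 0 and y > 0
assert abs(x**2 + y**2 - 1.0) < 1e-12
assert lam > 0

print(f"maximizer: ({x:.4f}, {y:.4f}), value: {x + y:.4f}, lambda: {lam:.4f}")
```

The a posteriori check is what justifies skipping the case-by-case constraint qualification analysis: if the guess about the binding constraints were wrong, one of the checks would fail.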
- Minor remark: In Example 5.2, the function is obviously nonnegative.
- There are many optimization methods not covered in Sundaram’s book. One of them, Pontryagin’s maximum principle, is more general than the Bellman approach, see
It may be too advanced for this course. However, convexity is covered by Sundaram yet omitted by Baltovic, even though it can be successfully applied to solve some problems from the guide, see
- Another important area covered neither in the book nor in the guide is numerical optimization. It is as important as the Maximum Likelihood method in Statistics, because every software implementation of ML relies on numerical optimization. Students need to learn at least one numerical optimization method to understand the error messages their software produces.
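As a minimal sketch of one such method (gradient descent, applied here to a hypothetical smooth convex function chosen only for illustration):

```python
# Minimal gradient descent sketch on a hypothetical smooth convex function
# f(x, y) = (x - 1)^2 + 2*(y + 3)^2, whose minimizer is (1, -3).
def grad(x, y):
    return 2.0 * (x - 1.0), 4.0 * (y + 3.0)

def gradient_descent(x, y, step=0.1, tol=1e-8, max_iter=10_000):
    for k in range(max_iter):
        gx, gy = grad(x, y)
        if gx * gx + gy * gy < tol * tol:  # gradient small: converged
            return (x, y), k, True
        x, y = x - step * gx, y - step * gy
    # Hitting the iteration limit is exactly the situation behind typical
    # solver error messages ("maximum number of iterations exceeded").
    return (x, y), max_iter, False

(xs, ys), iters, ok = gradient_descent(0.0, 0.0)
print(f"minimizer: ({xs:.6f}, {ys:.6f}) after {iters} iterations, converged={ok}")
```

Seeing once how the step size and the stopping tolerance interact is enough for a student to interpret non-convergence warnings in any optimization software.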
- Speaking of applications, I would eliminate all big exercises that have nothing to do with Economics or Finance, such as Exercise 6.2.
- In the solution to Exercise 6.1, the author produces a three-page analysis and then, in one line, discovers that it was all for nothing because the profit is infinite. What is the pedagogical value of such an exposition?
Conclusion. This is a badly written guide for a badly designed course. I suggest removing it from our program.