26
Nov 17

## Cauchy-Schwarz inequality and optimization 2

Here we consider a problem from mt3042 Optimization Guide by M. Baltovic:

Example 8.1. Given the initial state $s_{1}=W$ for some $W>0$ and a discount factor $\delta\in(0,1),$ we need to maximize the discounted stream of future rewards

(1) $\sum_{i=1}^{\infty }\delta^t\sqrt{a_t}.$

The maximization is over actions $(a_1,a_2,...)$ (you can think of them as consumptions). The initial state $s_1=W$ is the initial wealth; at each step, the consumption is chosen between zero and the wealth left from the previous period, $0\leq a_t\leq s_t.$ This reduces the amount available for consumption in the next period:

(2) $s_{t+1}=s_t-a_t,$ $t\geq 1.$

Step 1. Here we reformulate the problem to remove the states from the constraints. From (2) we have $a_t=s_t-s_{t+1}.$ Fixing an integer $T>0$ then we obtain

(3) $a_1+a_2+...+a_{T-1}+a_T=s_1-s_2+s_2-s_3+...+s_{T-1}-s_T+s_T-s_{T+1}=s_1-s_{T+1}.$

Since in each state $s_t$ consumption $a_t$ is nonnegative, the states can only decrease over time. On the other hand, they cannot be negative. Hence, the sequence $\{s_T\}$ is nonincreasing and bounded from below by zero. Therefore the limit $S=\lim_{T\rightarrow\infty}s_T$ exists and is nonnegative. Letting $T\rightarrow\infty$ in (3) we get $\sum_{t=1}^{\infty}a_t=W-S\leq W.$

The problem becomes: maximize (1) subject to

(4) $\sum_{t=1}^{\infty}a_t\leq W.$

Step 2. Getting rid of square roots. Denoting $x_t=\sqrt{a_t},$ we have to maximize $\sum_{i=1}^{\infty }\delta^tx_t$ subject to $\sum_{t=1}^{\infty }x_t^2\leq W.$

Step 3. Applying the result on the Cauchy-Schwarz inequality. The result from the previous post does not depend on the number of variables. For our present needs it is formulated as follows: Let $a_1,a_2,... \ge 0$ be fixed numbers and let $x_1,x_2,... \ge 0$ be variables satisfying the restriction $x_1^2+x_2^2+... \le W.$ Then the infinite weighted sum $a_1x_1+a_2x_2+...$ is maximized at the point $(x_1,x_2,...)=\frac{W^{1/2}}{\left( a_1^2+a_2^2+... \right)^{1/2}}(a_1,a_2,...)$

and the maximized value is $a_1x_1+a_2x_2+...=W^{1/2} \left(a_1^2+a_2^2+... \right)^{1/2}.$

Letting $a_{t}=\delta^t$ we get $a_1^2+a_2^2+...=1/(1-\delta^2)$ and $(x_1,x_2,...)=[(1-\delta^2)W]^{1/2}(1,\delta,\delta^2,...)$ with the maximized value $\sqrt{W/(1-\delta^2)}.$

And no clumsy terminology and notation from the Infinite Horizon Dynamic Programming method!