Here we look at the solution of
Example 7.5 (mt3042 Optimization Guide by M. Baltovic)
Suppose a consumer, let's call her Ms Thrifty, has an initial wealth of $w_0 > 0$. In any period $t = 0, 1, \dots, T$, denote the wealth she has at the start of the period by $w_t$, so that her spending $c_t$ in that period satisfies $0 \le c_t \le w_t$, giving her a utility of $c_t$. However, the wealth she does not consume in that period will be stored in a bank account where it earns an interest rate of $r > 0$ by the beginning of the next period. Thus, she will enter the next time period with wealth $w_{t+1} = (1 + r)(w_t - c_t)$.
Actions of Ms Thrifty can be identified with her spendings $c_t$, and the same spendings are her rewards. Thus, a strategy is a sequence of spendings $(c_0, c_1, \dots, c_T)$ and the value of that strategy is the sum of rewards $c_0 + c_1 + \dots + c_T$. To obtain the solution, we use backward induction.
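To get a feel for how strategies are valued, here is a minimal Python sketch. The function name `plan_value`, the numbers, and the idea of describing a plan by the share of current wealth spent in each period are my own illustrative assumptions; the sketch also assumes that unspent wealth is multiplied by one plus the interest rate before the next period.

```python
def plan_value(w0, r, shares):
    """Sum of rewards of a spending plan.

    shares[t] is the fraction of current wealth spent in period t
    (a hypothetical parametrization that keeps every plan feasible).
    """
    wealth = w0
    total = 0.0
    for share in shares:
        if not 0.0 <= share <= 1.0:
            raise ValueError("each share must lie in [0, 1]")
        spend = share * wealth               # the action, which is also the reward
        total += spend
        wealth = (1 + r) * (wealth - spend)  # unspent wealth earns interest
    return total


w0, r = 100.0, 0.25  # illustrative numbers only
print(plan_value(w0, r, [1.0, 0.0, 0.0]))  # spend everything at once
print(plan_value(w0, r, [0.5, 0.5, 1.0]))  # spend half, half, then the rest
print(plan_value(w0, r, [0.0, 0.0, 1.0]))  # save twice, then spend everything
```

Comparing a few plans this way only suggests which one is best; it does not prove optimality.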
Period $T$. The reward $c_T$ is obviously maximized at $c_T = w_T$, so that the value function in the last period is

$$V_T(w_T) = w_T. \qquad (1)$$
Period $T-1$. According to the Bellman equations, at each state $w_t$ we need to select an action $c_t$ that maximizes the sum

$$c_t + V_{t+1}(w_{t+1}). \qquad (2)$$
I deliberately don't use the full notation because I prefer the following verbal description: $c_t$ is the immediate reward for taking action $c_t$ at state $w_t$ (for taking one step leading us to the next state $w_{t+1}$). Whatever the next state is, $V_{t+1}(w_{t+1})$ is the value of an optimal strategy leading us from that state to the final destination. For period $T-1$ we use (1) to write (2) as

$$c_{T-1} + w_T = c_{T-1} + (1+r)(w_{T-1} - c_{T-1}) = (1+r)w_{T-1} - r c_{T-1}. \qquad (3)$$
Since $r > 0$, (3) is maximized at $c_{T-1} = 0$, giving $V_{T-1}(w_{T-1}) = (1+r)w_{T-1}$.
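Period $T-2$. Here (2) is

$$c_{T-2} + V_{T-1}(w_{T-1}) = c_{T-2} + (1+r)^2 (w_{T-2} - c_{T-2}) = (1+r)^2 w_{T-2} - \left[(1+r)^2 - 1\right] c_{T-2}.$$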
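This is maximized at $c_{T-2} = 0$, and the value function for the last three actions is $V_{T-2}(w_{T-2}) = (1+r)^2 w_{T-2}$. This easily generalizes, leading to the solution

$$c_0 = c_1 = \dots = c_{T-1} = 0, \qquad c_T = w_T = (1+r)^T w_0.$$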
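The generalization can be spelled out as a single induction step. In the sketch below the notation is my assumption: $w_t$ is the wealth at the start of period $t$, $c_t$ the spending, $r > 0$ the interest rate, $T$ the last period, and $V_t$ the value function. Suppose it is already known that $V_{t+1}(w) = (1+r)^{T-t-1} w$ for some $t < T$, which holds for $t = T-1$ because spending everything is optimal in the last period.

```latex
\begin{align*}
c_t + V_{t+1}(w_{t+1})
  &= c_t + (1+r)^{T-t-1}\,(1+r)(w_t - c_t) \\
  &= (1+r)^{T-t}\, w_t - \left[(1+r)^{T-t} - 1\right] c_t .
\end{align*}
% The bracket is positive when r > 0 and t < T, so the maximum over
% 0 <= c_t <= w_t is attained at c_t = 0, and
% V_t(w_t) = (1+r)^{T-t} w_t .
```

Setting $t = 0$ gives the total value $(1+r)^T w_0$, all of it collected in the last period.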
Ms Thrifty fasts for $T$ periods, then gorges in the last period and dies a happy death. Math is a cruel science.
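For the skeptical reader, the backward induction can be checked by brute force. The following is only a sketch under my own assumptions: periods are numbered 0 to `T`, spending is restricted to a coarse grid of shares of current wealth, and the function name `solve` and all numbers are hypothetical. It applies the Bellman recursion directly, without using the closed-form answer.

```python
def solve(t, w, r, T, shares):
    """Return (value, best share) at wealth w in period t.

    Brute-force Bellman recursion: try every share of current wealth,
    add the immediate reward to the optimal value of the next state.
    """
    best_value, best_share = float("-inf"), None
    for s in shares:
        spend = s * w
        if t == T:
            future = 0.0  # nothing comes after the last period
        else:
            next_wealth = (1 + r) * (w - spend)
            future, _ = solve(t + 1, next_wealth, r, T, shares)
        if spend + future > best_value:
            best_value, best_share = spend + future, s
    return best_value, best_share


shares = [i / 10 for i in range(11)]  # spend 0%, 10%, ..., 100% of wealth
w0, r, T = 100.0, 0.25, 3             # illustrative numbers only

value, first_share = solve(0, w0, r, T, shares)
print(value, first_share)             # optimal value and share spent in period 0
print((1 + r) ** T * w0)              # closed-form value for comparison
print(solve(T, 50.0, r, T, shares))   # last period: value and share
```

The recursion is exponential in the number of periods, so it is meant for tiny examples only.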