# Dynamic Programming Recap

algorithmDynamic Programming, aka DP, “solves the problem by breaking it into sub-problems and then recursively finding the optimal solutions to the sub-problems is said to have optimal substructure”(see Wikipedia). More concretely, the problem suffice the Principle of Optimality:

Principle of Optimality: An optimal policy has the property that whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision. (See Bellman, 1957, Chap. III.3.)

Or mathematically, to solve the problem from `t`

to `T`

optimally, we
can solve a sub-problem from `t`

to `s`

given `t < s < T`

and extend `s`

to `T`

.

It is relatively intuitive to reason the optimal substructure for most problems, we will only focus how to build up the optimal solution in this post.

## Top-down vs. bottom-up

We are trained to solve the problem in the top-down approach: break down a big problem to several smaller problems, recursively doing so until the smaller problems can be comfortably tackled. CLRS points out that the time complexity of top-down with memoization is the same as the bottom-up approach, but the latter is preferred as the top-down approach requires $O(n)$ callstacks which caps the problem scope.

Using the #322 Coin Change problem as one example:

Given the

`amount`

with differentcoin denominations, find the fewest coins to make up the amount.

In the top-down approach, the problem can be decomposed as:

`f(amount) = min(f(amount - c) for c in coins) + 1`

We can convert the recursion to a forward loop with the lookup table, `dp`

:

```
def coin_change(coins, amount):
dp = [float('inf')] * (amount + 1)
dp[0] = 0
for index in range(1, amount + 1):
for c in coins:
if index >= c:
dp[index] = min(dp[index], dp[index - c] + 1)
return dp[-1] if dp[-1] < float('inf') else -1
```

## Look back vs. Look forward

We solve the DP problems by looking back the sub-problems; on the other hand, some problems require us to look forward: the solution for the current sub-problem depends on the sub-problems in the future. Technically, they SHOULD be categorized as BFS, — the solution space is explored with the breadth first search. They are discussed here just for the reference.

For example, the #55 Jump Game:

Given an array of non-negative integers, each element specifies the maximum jumps at that position, determine whether you are able to reach last index from the first index*.

If we consider positions as a nodes, and the jump as the directed connection;
the problem is transformed whether there exists a path from `start`

to the
`end`

.

```
from collections import deque
def can_jump_bfs(nums):
if len(nums) <= 1:
return True
# Push the first position to the queue, and mark it visited.
queue = deque([0])
bfs = [False] * len(nums)
bfs[0] = True
while queue:
pos = queue.pop() # optimization: pop from the end of queue
for step in range(1, nums[pos] + 1):
jump_to = pos + step
if jump_to == len(nums) - 1:
return True
# Append to the queue if not visited.
if not bfs[jump_to]:
bfs[jump_to] = True
queue.append(jump_to)
return False
```

## Multi-dimension DP

In practice, I’ve only seen the two types of problems attacked by the two-dimension DP:

- the problem space is defined as $m \times n$, such as the Smith-Waterman algorithm.
- the problem space is linear, but the sub-problem MUST be solved from any
arbitrary
`(i, j)`

range, such as multiply matrix product.

In this section, let’s discuss the second case with the #312. Burst Balloons:

Given n balloons with designated values, burst balloon i will get

`nums[left] * nums[i] * nums[right]`

coins, try to maximize the gain.

If we add two balloons with value of `1`

on both sides, the
original problem become: with `n + 2`

balloons from `0`

to `n + 1`

and
`balloons[0] == balloons[n + 1] == 1`

, burst balloons from `1`

to `n`

,
maximize the profit.

If we burst `balloons[i]`

first, the `balloons[i - 1]`

and
`balloons[i + 1]`

become adjacent. The solution to the sub-problem
grows complicated due to the bookkeeping of adjacent balloons. Instead,
we can divide-and-conquer the problem by focusing on the *last balloon*
to burst: if `balloons[i]`

is the last balloon to burst, the accumulated
coins are:

$f(0, i) + f(i, n + 1) + nums[0] \times nums[i] \times nums[n + 1]$

- The
`ith`

balloons is the last balloon too burst, so the last trio is (0, i, n + 1). - Recall the
`f`

method will*NOT*burst the balloon on the boundary, so only`balloon[1]`

to`balloons[i - 1]`

are bursted in`f(0, i)`

. - The same rule applies to
`f(i , n + 1)`

.

The problem can be solved by two sub-problems in the top-down fashion. If we build the DP table from the bottom up:

- initialize the
`dp`

matrix $(n + 2) \times (n + 2)$ - we can expect minimum 3 balloons for any
`dp[i][j]`

, apply the logic. - extend the range of
`(i, j)`

until we get`(0, n + 1)`

.

```
def max_coins(nums):
n = len(nums)
nums = [1] + nums + [1]
# dp is (n + 2) * (n + 2) matrix
dp = [ [0] * (n + 2) for i in range(n + 2)]
# With nums[0] and nums[n + 1] present, we can expect 3 balloons as minimum
for gap in range(2, n + 2):
for first in range(0, n + 2 - gap):
last = first + gap
if last - first == 2:
dp[first][last] = nums[first] * nums[first + 1] * nums[last]
else:
dp[first][last] = max(
dp[first][j] + dp[j][last] + nums[first] * nums[j] * nums[last]
for j in range(first + 1, last)
)
return dp[0][n + 1]
```

## Multiple-pass DP

In some special case, we can optimize the multiple-dimension DP to multiple-pass DP. For example, #123 Best Time to Buy and Sell Stock III:

Given the stock price for each day, maximize the profit with

twotransactions.

Consider we keep tracks the `balance[i]`

, we can maximize the balance by:

- find the time to buy low, aka maximize
`balances[i] - prices[i]`

- find the time to sell high, aka maximize
`balances[i] + prices[i]`

We can repeat `k`

times(k = 2 in this case) to fully exploit the market.

```
def max_profits(prices):
balances = [0] * len(prices)
for t in range(2):
best_buy = float('-inf')
best_profit = 0
for i, price in enumerate(prices):
best_buy = max(best_buy, balances[i] - prices[i])
best_profit = max(best_profit, best_buy + prices[i])
balances[i] = best_profit
return balances[-1]
```

## Conclusion

In this post, we demonstrate several use cases of the dynamic programming from different angles for a better understanding. The cornerstone of DP is how to break down the problem to sub-problems for the optimal solution, then we can build the solution in the bottom-up fashion.