I've always wondered about this. And no books state this explicitly.
Backtracking is exploring all possibilities until we figure out that one possibility cannot lead to a valid solution, in which case we drop it.
Dynamic programming, as I understand it, is characterized by overlapping sub-problems. So, can dynamic programming be stated as backtracking with a cache (for previously explored paths)?
Thanks
This is one face of dynamic programming, but there's more to it.
For a trivial example, take Fibonacci numbers:
F (n) =
    n = 0: 0
    n = 1: 1
    else: F (n - 2) + F (n - 1)
We can call the above code "backtracking" or "recursion".
Let us transform it into "backtracking with cache" or "recursion with memoization":
F (n) =
    n in Fcache: Fcache[n]
    n = 0: 0, and cache it as Fcache[0]
    n = 1: 1, and cache it as Fcache[1]
    else: F (n - 2) + F (n - 1), and cache it as Fcache[n]
Still, there is more to it.
If a problem can be solved by dynamic programming, there is a directed acyclic graph of states and dependencies between them.
There is a state that interests us.
There are also base states for which we know the answer right away.
We can traverse that graph from the vertex that interests us to all its dependencies, from them to all their dependencies in turn, etc., stopping to branch further at the base states.
This can be done via recursion.
A directed acyclic graph can be viewed as a partial order on vertices. We can topologically sort that graph and visit the vertices in sorted order.
Additionally, you can find some simple total order which is consistent with your partial order.
Also note that we can often observe some structure on states.
For example, the states can often be expressed as integers or tuples of integers.
So, instead of using generic caching techniques (e.g., associative arrays to store state->value pairs), we may be able to preallocate a regular array which is easier and faster to use.
Back to our Fibonacci example, the partial order relation is just that state n >= 2 depends on states n - 1 and n - 2.
The base states are n = 0 and n = 1.
A simple total order consistent with this order relation is the natural order: 0, 1, 2, ....
Here is what we start with:
Preallocate array F with indices 0 to n, inclusive
F[0] = 0
F[1] = 1
Fine, we have the order in which to visit the states.
Now, what's a "visit"?
There are again two possibilities:
(1) "Backward DP": When we visit a state u, we look at all its dependencies v and calculate the answer for that state u:
for u = 2, 3, ..., n:
    F[u] = F[u - 1] + F[u - 2]
(2) "Forward DP": When we visit a state u, we look at all states v that depend on it and account for u in each of these states v:
for u = 1, 2, 3, ..., n - 1:
    add F[u] to F[u + 1]
    add F[u] to F[u + 2]
Note that in the former case, we still use the formula for Fibonacci numbers directly.
However, in the latter case, the imperative code cannot be readily expressed by a mathematical formula.
Still, in some problems, the "forward DP" approach is more intuitive (no good example for now; anyone willing to contribute it?).
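To make the two visit orders concrete, here is a minimal Python sketch of both for Fibonacci (the function names are mine):

def fib_backward(n):
    # (1) "Backward DP": each state pulls the answer from its dependencies.
    if n < 2:
        return n
    F = [0] * (n + 1)
    F[1] = 1
    for u in range(2, n + 1):
        F[u] = F[u - 1] + F[u - 2]
    return F[n]

def fib_forward(n):
    # (2) "Forward DP": each state pushes its value into the states that depend on it.
    if n < 2:
        return n
    F = [0] * (n + 1)
    F[1] = 1
    for u in range(1, n):
        F[u + 1] += F[u]
        if u + 2 <= n:
            F[u + 2] += F[u]
    return F[n]

Both visit the states in the same total order; they differ only in the direction in which values flow along the edges of the dependency graph.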
One more use of dynamic programming which is hard to express as backtracking is the following: Dijkstra's algorithm can be considered DP, too.
In the algorithm, we construct the optimal paths tree by adding vertices to it.
When we add a vertex, we use the fact that the whole path to it - except the very last edge in the path - is already known to be optimal.
So, we actually use an optimal solution to a subproblem - which is exactly the thing we do in DP.
Still, the order in which we add vertices to the tree is not known in advance.
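To illustrate, a minimal Python sketch of Dijkstra viewed this way; the graph encoding (a dict mapping each vertex to a list of (neighbor, weight) pairs) is my assumption:

import heapq

def dijkstra(graph, source):
    # dist[v] is the answer to the subproblem "shortest distance from
    # source to v"; the order in which subproblems are finalized is
    # discovered on the fly via the priority queue.
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in graph.get(u, []):
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w  # reuse the optimal subproblem dist[u]
                heapq.heappush(heap, (d + w, v))
    return dist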
No. Or rather sort of.
In backtracking, you go down and then back up each path. However, dynamic programming works bottom-up, so you only get the going-back-up part, not the original going-down part. Furthermore, the order in dynamic programming is more breadth-first, whereas backtracking is usually depth-first.
On the other hand, memoization (dynamic programming's very close cousin) does very often work as backtracking with a cache, as you described.
Yes and no.
Dynamic Programming is basically an efficient way to implement a recursive formula, and top-down DP is often actually done with recursion + cache:

cache = {}

def f(x):
    if x in cache:
        return cache[x]
    res = ...  # do something with f(x - k)
    cache[x] = res
    return res
Note that bottom-up DP is implemented completely differently; however, it still pretty much follows the basic principles of the recursive approach, and at each step 'calculates' the recursive formula on the smaller (already known) sub-problems.
However, in order to be able to use DP, the problem needs certain characteristics, mainly that an optimal solution to the problem consists of optimal solutions to its sub-problems. An example where this holds is the shortest-path problem (an optimal path from s to t that goes through u must consist of an optimal path from s to u).
This property does not hold for some other problems, such as Vertex Cover or the Boolean satisfiability problem, and thus you cannot replace the backtracking solution for them with DP.
No. What you call backtracking with cache is basically memoization.
In dynamic programming, you go bottom-up. That is, you start from a place where you don't need any subproblems. In particular, when you need to calculate the nth step, all the n-1 steps are already calculated.
This is not the case for memoization. Here, you start off from the kth step (the step you want) and go on solving the previous steps wherever required. And obviously keep these values stored somewhere so that you may access these later.
All that said, there is no asymptotic difference in running time between memoization and dynamic programming.
I am trying to implement the Held-Karp algorithm for the Traveling Salesman Problem by following this pseudocode:
(which I found here: https://en.wikipedia.org/wiki/Held%E2%80%93Karp_algorithm#Example.5B4.5D )
I can do the algorithm by hand but am having trouble actually implementing it in code. It would be great if someone could provide an easy-to-follow explanation.
I also don't understand this:
I thought this part was for setting the distance from the starting city to its connected cities. If that were the case, wouldn't it be C({1}, k) := d(1,k) and not C({k}, k) := d(1,k)? Am I just completely misunderstanding this?
I have also heard that this algorithm does not perform very well past about 15-20 cities so for around 40 cities, what would be a good alternative?
Held-Karp is a dynamic programming approach.
In dynamic programming, you break the task into subtasks and use "dynamic function" to solve larger subtasks using already computed results of smaller subtasks, until you finally solve your task.
To understand a DP algorithm it's imperative to understand how it defines subtask and dynamic function.
In the case of Held-Karp, the subtask is the following:
For a given set of vertices S and a vertex k (1 ∉ S, k ∈ S)
C(S,k) is the minimal length of the path that starts with vertex 1, traverses all vertices in S and ends with the vertex k.
Given this subtask definition, it's clear why initialization is:
C({k}, k) := d(1,k)
The minimal length of the path from 1 to k, traversing through {k}, is just the edge from 1 to k.
Next, the "dynamic function".
As a side note, a DP algorithm can be written as top-down or bottom-up. This pseudocode is bottom-up, meaning it computes smaller tasks first and uses their results for larger tasks. To be more specific, it computes tasks in order of increasing size of the set S, starting from |S| = 1 and going up to |S| = n-1 (i.e. S containing all vertices except 1).
Now, consider a task defined by some S, k. Remember, it corresponds to a path from 1, through S, ending in k.
We break it into:
path from 1, through all vertices in S except k (S\k), which ends in the vertex m (m ∈ S, m ≠ k): C(S\k, m)
an edge from m to k
It's easy to see that if we look through all possible ways to break C(S, k) like this, and find the minimal path among them, we'll have the answer for C(S, k).
Finally, having computed all C(S, k) for |S| = n-1, we check all of them, completing the cycle with the missing edge from k to 1: d(1,k). The minimal cycle is the final result.
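For reference, here is a hedged Python sketch of that bottom-up computation (vertex 1 of the pseudocode is index 0 here, and dist is assumed to be a full n×n distance matrix):

from itertools import combinations

def held_karp(dist):
    n = len(dist)
    # C[(S, k)]: minimal length of a path that starts at vertex 0,
    # traverses all vertices in frozenset S (0 not in S), and ends at k.
    C = {}
    for k in range(1, n):
        C[(frozenset([k]), k)] = dist[0][k]  # C({k}, k) := d(1, k)
    for size in range(2, n):
        for subset in combinations(range(1, n), size):
            S = frozenset(subset)
            for k in S:
                C[(S, k)] = min(C[(S - {k}, m)] + dist[m][k]
                                for m in S if m != k)
    full = frozenset(range(1, n))
    # complete the cycle with the missing edge back to vertex 0
    return min(C[(full, k)] + dist[k][0] for k in range(1, n))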
Regarding:
I have also heard that this algorithm does not perform very well past about 15-20 cities so for around 40 cities, what would be a good alternative?
Held-Karp has algorithmic complexity of Θ(n^2 * 2^n). 40^2 * 2^40 ≈ 1.75 * 10^15, which, I would say, is infeasible to compute on a single machine in reasonable time.
As David Eisenstat suggested, there are approaches using mixed integer programming that can solve this problem fast enough for N=40.
For example, see this blog post, and this project that builds upon it.
Two criteria for a problem to be solvable by the dynamic programming technique are:
Sub-problems should be independent.
Sub-problems should overlap.
I think I understand what overlapping means. It basically means that the sub-problems have sub-sub-problems that may be the same. So instead of solving a sub-sub-problem over and over again, we solve it once, put it in a hashtable or array, and look it up the next time it is required. But what does point 1, i.e. independence of sub-problems, mean here? If they have some common sub-sub-problems, how can we call them independent? It sounds very counterintuitive to me at this stage.
Edit: These criteria are actually given in the famous book Introduction to Algorithms by CLRS, in the Dynamic Programming chapter.
Please tell us where you are reading that DP applies to problems with overlapping and independent sub-problems. I don't think that's correct, for the same intuitive reason you give-- if the problems overlap, they aren't independent.
I usually see independent sub-problems given as a criterion for Divide-And-Conquer style algorithms, while I see overlapping sub-problems and optimal sub-structure given as criteria for the Dynamic Programming family. (Intuitively, optimal substructure means that the best solution of a larger problem is composed of the best solutions of sub-problems. The classic example is the shortest path in a graph problem: If you know that the shortest path from A to B goes through C, then you also know that the part of the shortest path from A to B that goes through C happens to be the shortest path from A to C.)
UPDATE: Oh, I see-- yes, I guess they do mention independence. But I don't read that with the same emphasis that you are. Meaning, they mention independence in the context of, or as a way of understanding, the larger and more important concept of optimal substructure.
What they mean specifically by independence is that even if two problems overlap, they are "independent" in the sense that they don't interact-- the solution to one does not really depend on the solution to the other. They actually use the same example I did, the shortest path. Sub-problems of the shortest path problem are smaller shortest path problems that are independent: If the shortest path from A to B goes through C, then the shortest path from A to C doesn't use any edges in the shortest path from C to B. The longest path problem, by contrast, does not share that independence of sub-problems.
I don't think CLRS are wrong to bring up independence, but I do think the language they're using is a little ambiguous.
As offered in CLRS, the authors address the distinction between independent and overlapping properties of subproblems. They write,
"It may seem strange that dynamic programming relies on subproblems being both independent and overlapping. Although these requirements may sound contradictory, they describe two different notions, rather than two points on the same axis. Two subproblems of the same problem are independent if they do not share resources. Two subproblems are overlapping if they are really the same subproblem that occurs as a subproblem of different problems" (CLRS 3rd edition, 386).
I think these criteria have been worded badly because overlapping and independent have sort of a clashing meaning.
Anyway, to be able to use a DP approach effectively, you need to have:
a problem that can be defined recursively in terms of simpler problems
a concept of partial solution in which the solution to the remaining part doesn't depend on how you got to the current point
Example: if you want to compute the maximum-sum path when moving through a matrix, starting from the first row and stepping to the next row into the same or an adjacent column at each step, you can use (current row, current column) as "state", because for the rest of the solution it doesn't matter what path was used to get to the current position.
1 4 [3] 2 1 4 9
2 1 [3] 1 2 3 1
9 [8] 3 0 1 2 9
0 [0] 2 4 1 6 3
1 2 [6] 3 0 4 1
In the schema above, this path has a sum of 3+3+8+0+6. To maximize the sum, observe that the maximum for paths passing through a certain point can be obtained as the maximum for getting there plus the maximum for going from there to the end of the matrix. The solution can therefore be split into independent subproblems, and you can cache the maximum sum from a given point of the matrix to the end (independently of how you got to that point).
matrix = [[1, 4, 3, 2, 1, 4, 9],
          [2, 1, 3, 1, 2, 3, 1],
          [9, 8, 3, 0, 1, 2, 9],
          [0, 0, 2, 4, 1, 6, 3],
          [1, 2, 6, 3, 0, 4, 1]]
height, width = len(matrix), len(matrix[0])
cache = {}

def maxsum(x, y):
    if (x, y) in cache:
        return cache[(x, y)]
    if y == height - 1:
        return matrix[y][x]
    if x == 0:
        left = -1  # off the left edge: not a valid move
    else:
        left = matrix[y][x] + maxsum(x - 1, y + 1)
    center = matrix[y][x] + maxsum(x, y + 1)
    if x == width - 1:
        right = -1  # off the right edge: not a valid move
    else:
        right = matrix[y][x] + maxsum(x + 1, y + 1)
    result = cache[(x, y)] = max(left, center, right)
    return result
If I add the rule that no more than three "9"s may be collected, however, you cannot use just the coordinates as state, because the following subproblem (going to the end) will be influenced by the previous one (i.e. by how many "9"s you already collected while getting to the intermediate position).
You can still use a dynamic programming approach, but with a larger state space by for example adding the numbers of collected "9" to the current state representation.
def maxsum(x, y, number_of_nines):
    if (x, y, number_of_nines) in cache:
        return cache[(x, y, number_of_nines)]
    ...
My understanding is that a subproblem should be solvable independently of the bigger parent problem. In backtracking, by contrast, the subproblems do depend on the solutions you picked in the bigger problem.
In dynamic programming, the subproblems are independent.
That independence is not there in divide and conquer.
For example, in mergesort:
The subproblems are merged after dividing, which means the solution has common subproblems. Everything needs to be merged, and no single path gives the answer.
Every subproblem shares sub-subproblems which need to be solved in order to get the final answer.
        (1, 4)
       /      \
   (1, 2)    (3, 4)
   /    \    /    \
(1,1) (2,2) (3,3) (4,4)
   \    /    \    /
   (1, 2)    (3, 4)
       \      /
        (1, 4)
I don't think subproblems have to be independent. In fact, it would be great if subproblems were independent, but it's not necessary.
A good example for a dp problem with dependent subproblems is here:
Paint Houses - Algorithmic problems (paint house problem)
Here, the solution to subproblems depends on the color of the previous house. That dependency can be solved by adding a dimension to the dp array and building the solution based on the color of the previous house.
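A minimal sketch of that idea, assuming costs[i][c] is the cost of painting house i with color c and adjacent houses must differ in color:

def min_paint_cost(costs):
    # dp[c]: cheapest way to paint houses 0..i with house i painted color c.
    # The extra color dimension is what resolves the dependency on the
    # previous house.
    dp = costs[0][:]
    for i in range(1, len(costs)):
        dp = [costs[i][c] + min(dp[p] for p in range(len(dp)) if p != c)
              for c in range(len(dp))]
    return min(dp)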
I have a simple algorithmic question. I would be grateful if you could help me.
We have some 2-dimensional points, each with an associated positive weight (a sample problem is attached). We want to select a subset of them which maximizes the total weight, such that no two selected points overlap each other (for example, in the attached file, we cannot select both A and C because they are in the same row, and in the same way we cannot select both A and B, because they are in the same column). Is there any greedy (or dynamic) approach I can use? I'm aware of the non-overlapping interval selection algorithm, but I cannot use it here, because my problem is 2-dimensional.
Any reference or note is appreciated.
Regards
Attachment:
A simple sample of the problem:
A (30$) -------- B (10$)
|
|
|
|
C (8$)
If you are OK with a good solution, and do not demand the best solution, you can use heuristic algorithms to solve this.
Let S be the set of points, and w(s) the weight function.
Create a weight function W : 2^S -> R (from the subsets of S to the real numbers):
W(U) = -INFINITY if the selection is not feasible
       Σ w(u) over all u in U otherwise
Also create a function next : 2^S -> 2^(2^S) (a function that gets a subset of S and returns a set of subsets of S):
next(U) = { V : V can be obtained from U by adding/removing one element to/from U }
Now, given that data - you can invoke any optimization algorithm in the Artificial Intelligence book, such as Genetic Algorithm or Hill Climbing.
For example, Hill Climbing with random restarts, will be something like that:
1. best<- -INFINITY
2. while there is more time
3. choose a random subset s
4. NEXT <- next(s)
5. if max{ W(v) | v in NEXT } < W(s): //s is a local maximum
5.1. if W(s) > best: best <- W(s) //if s is better than the previous result - store it.
5.2. go to 2. //restart the hill climbing from a different random point.
6. else:
6.1. s <- argmax{ W(v) | v in NEXT }
6.2. goto 4.
7. return best //when out of time, return the best solution found so far.
The above algorithm is anytime - meaning it will produce better results if given more time.
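A rough Python translation of that pseudocode for this particular problem; the (row, col, weight) encoding of points is my assumption:

import random

def W(subset):
    # -INFINITY if two points share a row or column, sum of weights otherwise
    pts = list(subset)
    for i in range(len(pts)):
        for j in range(i + 1, len(pts)):
            if pts[i][0] == pts[j][0] or pts[i][1] == pts[j][1]:
                return float("-inf")
    return sum(p[2] for p in pts)

def hill_climb(points, restarts=100):
    best = float("-inf")
    for _ in range(restarts):
        s = {p for p in points if random.random() < 0.5}  # random subset
        while True:
            # next(U): all subsets reachable by toggling one point
            neighbors = [s ^ {p} for p in points]
            candidate = max(neighbors, key=W)
            if W(candidate) <= W(s):   # s is a local maximum
                best = max(best, W(s))
                break                  # restart from a new random point
            s = candidate
    return best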
This can be treated as a linear assignment problem, which can be solved using an algorithm like the Hungarian algorithm. The algorithm tries to minimize the sum of costs, so just negate your weights, and use them as the costs. The assignment of rows to columns will give you the subset of points that you need. There are sparse variants for cases where not every (row,column) pair has an associated point, but you can also just use a large positive cost for these.
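For illustration, a sketch using SciPy's linear_sum_assignment on the sample instance; here I give the empty (row, column) cells a cost of 0 rather than a large penalty, so that assigning a row to an empty cell simply means selecting no point there:

import numpy as np
from scipy.optimize import linear_sum_assignment

# Sample instance: A = (row 0, col 0, $30), B = (row 0, col 1, $10),
# C = (row 1, col 0, $8). Weights are negated to turn maximization
# into the solver's minimization.
cost = np.zeros((2, 2))
cost[0, 0] = -30  # A
cost[0, 1] = -10  # B
cost[1, 0] = -8   # C

rows, cols = linear_sum_assignment(cost)
print(-cost[rows, cols].sum())  # 30.0: picking A alone beats B + C (18)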
Well, you can think of this as a binary constraint optimization problem, and there are various algorithms. The easiest algorithm for this problem is backtracking with arc propagation. However, it takes exponential time in the worst case. I am not sure if there are any specific algorithms that take advantage of the geometrical nature of the problem.
This can be solved by a pretty straightforward dynamic programming approach with an exponential time complexity:
s = {A, B, C, ...}
getMaxSum(s) = max( A.value + getMaxSum(compatibleSubSet(s, A)),
                    B.value + getMaxSum(compatibleSubSet(s, B)),
                    ... )
where compatibleSubSet(s, A) gets the subset of s that does not overlap with A.
To optimize it, you can memoize the result for each subset.
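A possible memoized version in Python (my own encoding: points as (row, col, weight) tuples, subsets as frozensets):

from functools import lru_cache

def max_weight(points):
    @lru_cache(maxsize=None)
    def best(s):
        if not s:
            return 0
        total = 0
        for p in s:
            # compatibleSubSet(s, p): points sharing no row/column with p
            compatible = frozenset(q for q in s
                                   if q != p and q[0] != p[0] and q[1] != p[1])
            total = max(total, p[2] + best(compatible))
        return total
    return best(frozenset(points))

With the sample instance, max_weight([(0, 0, 30), (0, 1, 10), (1, 0, 8)]) returns 30.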
One way to do it:
Write a function that generates subsets ordered from the subset of maximum weight to the subset of minimum weight, while ignoring the constraints.
Then call this function repeatedly until a subset that honors the constraints pops up.
In order to improve the performance, you can write a not so dumb generator function that for instance honors the not-on-the-same-row constraint but that ignores the not-on-the-same-column one.
I recently had this problem on a test: given a set of points m (all on the x-axis) and a set n of lines with endpoints [l, r] (again on the x-axis), find the minimum subset of n such that all points are covered by a line. Prove that your solution always finds the minimum subset.
The algorithm I wrote for it was something to the effect of:
(say lines are stored as arrays with the left endpoint in position 0 and the right in position 1)
algorithm coverPoints(set[] m, set[][] n):
    chosenLines = []
    while m is not empty:
        minX = min(m)
        bestLine = n[0]
        for i=1 to length of n:
            if n[i][0] <= minX and n[i][1] > bestLine[1] then
                bestLine = n[i]
        add bestLine to chosenLines
        for i=0 to length of m:
            if m[i] <= bestLine[1] then delete m[i] from m
    return chosenLines
I'm just not sure if this always finds the minimum solution. It's a simple greedy algorithm, so my gut tells me it won't, but one of my friends, who is much better at this than I am, says that for this problem a greedy algorithm like this always finds the minimal solution. To prove mine always finds the minimal solution, I did a very hand-wavy proof by contradiction where I made an assumption that probably isn't true at all. I forget exactly what I did.
If this isn't a minimal solution, is there a way to do it in less than something like O(n!) time?
Thanks
Your greedy algorithm IS correct.
We can prove this by showing that ANY other covering can only be improved by replacing it with the cover produced by your algorithm.
Let C be a valid covering for a given input (not necessarily an optimal one), and let S be the covering according to your algorithm. Now let's inspect the points p1, p2, ..., pk that represent the min points you deal with at each iteration step. The covering C must cover them all as well. Observe that no segment in C covers two of these points; otherwise, your algorithm would have chosen that segment! Therefore, |C| >= k. And what is the cost (segment count) of your algorithm? |S| = k.
That completes the proof.
Two notes:
1) Implementation: Initializing bestLine with n[0] is incorrect, since the loop may fail to improve it, and n[0] does not necessarily cover minX.
2) Actually this problem is a simplified version of the Set Cover problem. While the original is NP-complete, this variation turns out to be polynomial.
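For completeness, a Python sketch of the same greedy with the fix from note 1 applied (bestLine is chosen only among segments that actually cover minX):

def cover_points(points, segments):
    # points: x-coordinates; segments: (l, r) pairs
    points = sorted(points)
    chosen = []
    i = 0
    while i < len(points):
        min_x = points[i]
        # among segments covering min_x, take the one reaching farthest right
        best = max((s for s in segments if s[0] <= min_x <= s[1]),
                   key=lambda s: s[1], default=None)
        if best is None:
            raise ValueError("point %r cannot be covered" % min_x)
        chosen.append(best)
        while i < len(points) and points[i] <= best[1]:
            i += 1
    return chosen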
Hint: first try proving your algorithm works for sets of size 0, 1, 2... and see if you can generalise this to create a proof by induction.
What is dynamic programming?
How is it different from recursion, memoization, etc?
I have read the wikipedia article on it, but I still don't really understand it.
Dynamic programming is when you use past knowledge to make solving a future problem easier.
A good example is solving the Fibonacci sequence for n=1,000,002.
This will be a very long process, but what if I give you the results for n=1,000,000 and n=1,000,001? Suddenly the problem just became more manageable.
Dynamic programming is used a lot in string problems, such as the string edit problem. You solve a subset of the problem first and then use that information to solve the more difficult original problem.
With dynamic programming, you store your results in some sort of table generally. When you need the answer to a problem, you reference the table and see if you already know what it is. If not, you use the data in your table to give yourself a stepping stone towards the answer.
The Cormen Algorithms book has a great chapter about dynamic programming. AND it's free on Google Books! Check it out here.
Dynamic programming is a technique used to avoid computing multiple times the same subproblem in a recursive algorithm.
Let's take the simple example of the Fibonacci numbers: finding the nth Fibonacci number, defined by
F(n) = F(n-1) + F(n-2), with F(0) = 0 and F(1) = 1
Recursion
The obvious way to do this is recursive:
def fibonacci(n):
    if n == 0:
        return 0
    if n == 1:
        return 1
    return fibonacci(n - 1) + fibonacci(n - 2)
Dynamic Programming
Top Down - Memoization
The recursion does a lot of unnecessary calculations because a given Fibonacci number will be calculated multiple times. An easy way to improve this is to cache the results:
cache = {}

def fibonacci(n):
    if n == 0:
        return 0
    if n == 1:
        return 1
    if n in cache:
        return cache[n]
    cache[n] = fibonacci(n - 1) + fibonacci(n - 2)
    return cache[n]
Bottom-Up
A better way to do this is to get rid of the recursion altogether by evaluating the results in the right order:
cache = {}

def fibonacci(n):
    cache[0] = 0
    cache[1] = 1
    for i in range(2, n + 1):
        cache[i] = cache[i - 1] + cache[i - 2]
    return cache[n]
We can even use constant space and store only the necessary partial results along the way:
def fibonacci(n):
    if n < 2:
        return n
    fi_minus_2 = 0
    fi_minus_1 = 1
    for i in range(2, n + 1):
        fi = fi_minus_1 + fi_minus_2
        fi_minus_1, fi_minus_2 = fi, fi_minus_1
    return fi
How to apply dynamic programming?
Find the recursion in the problem.
Top-down: store the answer for each subproblem in a table to avoid having to recompute them.
Bottom-up: Find the right order to evaluate the results so that partial results are available when needed.
Dynamic programming generally works for problems that have an inherent left to right order such as strings, trees or integer sequences. If the naive recursive algorithm does not compute the same subproblem multiple times, dynamic programming won't help.
I made a collection of problems to help understand the logic: https://github.com/tristanguigue/dynamic-programing
Memoization is when you store previous results of a function call (a pure function always returns the same thing, given the same inputs). Before any stored results are reused, it makes no difference to algorithmic complexity.
Recursion is the method of a function calling itself, usually with a smaller dataset. Since most recursive functions can be converted to similar iterative functions, this doesn't make a difference for algorithmic complexity either.
Dynamic programming is the process of solving easier-to-solve sub-problems and building up the answer from that. Most DP algorithms will be in the running times between a Greedy algorithm (if one exists) and an exponential (enumerate all possibilities and find the best one) algorithm.
DP algorithms could be implemented with recursion, but they don't have to be.
DP algorithms can't be sped up by memoization, since each sub-problem is only ever solved (or the "solve" function called) once.
It's an optimization of your algorithm that cuts running time.
While a naive algorithm may run multiple times over the same set of data, Dynamic Programming avoids this pitfall through a deeper understanding of the partial results that must be stored to help build the final solution.
A simple example is traversing a tree or a graph only through the nodes that would contribute with the solution, or putting into a table the solutions that you've found so far so you can avoid traversing the same nodes over and over.
Here's an example of a problem that's suited for dynamic programming, from UVA's online judge: Edit Steps Ladder.
I'm going to give a quick briefing of the important part of this problem's analysis, taken from the book Programming Challenges; I suggest you check it out.
Take a good look at that problem: if we define a cost function telling us how far apart two strings are, we have to consider the three natural types of changes:
Substitution - change a single character from pattern "s" to a different character in text "t", such as changing "shot" to "spot".
Insertion - insert a single character into pattern "s" to help it match text "t", such as changing "ago" to "agog".
Deletion - delete a single character from pattern "s" to help it match text "t", such as changing "hour" to "our".
When we set each of these operations to cost one step, we define the edit distance between two strings. So how do we compute it?
We can define a recursive algorithm using the observation that the last character in the string must be either matched, substituted, inserted or deleted. Chopping off the characters involved in the last edit operation leaves a pair of smaller strings. Let i and j be the last characters of the relevant prefixes of s and t, respectively. There are three pairs of shorter strings after the last operation, corresponding to the strings after a match/substitution, insertion or deletion. If we knew the cost of editing the three pairs of smaller strings, we could decide which option leads to the best solution and choose that option accordingly. We can learn this cost through the awesome thing that's recursion:
#define MATCH 0   /* enumerated type symbol for match */
#define INSERT 1  /* enumerated type symbol for insert */
#define DELETE 2  /* enumerated type symbol for delete */

/* Cost helpers as defined in Programming Challenges; strings are
   1-indexed as in the book (position 0 is unused padding). */
int match(char c, char d) { return (c == d) ? 0 : 1; }
int indel(char c) { return 1; }

int string_compare(char *s, char *t, int i, int j)
{
    int k;           /* counter */
    int opt[3];      /* cost of the three options */
    int lowest_cost; /* lowest cost */

    if (i == 0) return (j * indel(' '));
    if (j == 0) return (i * indel(' '));

    opt[MATCH] = string_compare(s, t, i - 1, j - 1) + match(s[i], t[j]);
    opt[INSERT] = string_compare(s, t, i, j - 1) + indel(t[j]);
    opt[DELETE] = string_compare(s, t, i - 1, j) + indel(s[i]);

    lowest_cost = opt[MATCH];
    for (k = INSERT; k <= DELETE; k++)
        if (opt[k] < lowest_cost) lowest_cost = opt[k];

    return (lowest_cost);
}
This algorithm is correct, but is also impossibly slow.
Running on our computer, it takes several seconds to compare two 11-character strings, and the computation disappears into never-never land on anything longer.
Why is the algorithm so slow? It takes exponential time because it recomputes values again and again and again. At every position in the string, the recursion branches three ways, meaning it grows at a rate of at least 3^n – indeed, even faster since most of the calls reduce only one of the two indices, not both of them.
So how can we make the algorithm practical? The important observation is that most of these recursive calls are computing things that have already been computed before. How do we know? Well, there can only be |s| · |t| possible unique recursive calls, since there are only that many distinct (i, j) pairs to serve as the parameters of recursive calls.
By storing the values for each of these (i, j) pairs in a table, we can avoid recomputing them and just look them up as needed.
The table is a two-dimensional matrix m where each of the |s|·|t| cells contains the cost of the optimal solution of this subproblem, as well as a parent pointer explaining how we got to this location:
typedef struct {
int cost; /* cost of reaching this cell */
int parent; /* parent cell */
} cell;
cell m[MAXLEN+1][MAXLEN+1]; /* dynamic programming table */
The dynamic programming version has three differences from the recursive version.
First, it gets its intermediate values using table lookup instead of recursive calls.
Second, it updates the parent field of each cell, which will enable us to reconstruct the edit sequence later.
Third, it is instrumented using a more general goal_cell() function instead of just returning m[|s|][|t|].cost. This will enable us to apply this routine to a wider class of problems.
Here, a very particular analysis of what it takes to gather the most optimal partial results, is what makes the solution a "dynamic" one.
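For comparison with the recursive version above, here is a minimal bottom-up sketch in Python with the same unit costs (parent pointers and the goal_cell() generalization are omitted):

def edit_distance(s, t):
    m, n = len(s), len(t)
    # cost[i][j]: cost of editing s[:i] into t[:j]
    cost = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        cost[i][0] = i  # i deletions
    for j in range(n + 1):
        cost[0][j] = j  # j insertions
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost[i][j] = min(cost[i - 1][j - 1] + (s[i - 1] != t[j - 1]),
                             cost[i][j - 1] + 1,   # insert
                             cost[i - 1][j] + 1)   # delete
    return cost[m][n]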
Here's an alternate, full solution to the same problem. It's also a "dynamic" one even though its execution is different. I suggest you check out how efficient the solution is by submitting it to UVA's online judge. I find amazing how such a heavy problem was tackled so efficiently.
The key bits of dynamic programming are "overlapping sub-problems" and "optimal substructure". These properties of a problem mean that an optimal solution is composed of the optimal solutions to its sub-problems. For instance, shortest path problems exhibit optimal substructure. The shortest path from A to C is the shortest path from A to some node B followed by the shortest path from that node B to C.
In greater detail, to solve a shortest-path problem you will:
find the distances from the starting node to every node touching it (say from A to B and C)
find the distances from those nodes to the nodes touching them (from B to D and E, and from C to E and F)
we now know the shortest path from A to E: it is the shortest sum of A-x and x-E for some node x that we have visited (either B or C)
repeat this process until we reach the final destination node
Because we are working bottom-up, we already have solutions to the sub-problems when it comes time to use them, since we have memoized them.
Remember, dynamic programming problems must have both overlapping sub-problems, and optimal substructure. Generating the Fibonacci sequence is not a dynamic programming problem; it utilizes memoization because it has overlapping sub-problems, but it does not have optimal substructure (because there is no optimization problem involved).
Dynamic Programming
Definition
Dynamic programming (DP) is a general algorithm design technique for solving problems with overlapping sub-problems. This technique was invented by the American mathematician Richard Bellman in the 1950s.
Key Idea
The key idea is to save answers of overlapping smaller sub-problems to avoid recomputation.
Dynamic Programming Properties
An instance is solved using the solutions for smaller instances.
The solutions for a smaller instance might be needed multiple times,
so store their results in a table.
Thus each smaller instance is solved only once.
Additional space is used to save time.
I am also very new to dynamic programming (a powerful technique for a particular type of problem).
In the simplest words, just think of dynamic programming as a recursive approach that uses previous knowledge.
Previous knowledge is what matters most here: keep track of the solutions to the sub-problems you already have.
Consider this most basic example of DP, from Wikipedia: finding the Fibonacci sequence.
function fib(n)  // naive implementation
    if n <= 1 return n
    return fib(n − 1) + fib(n − 2)
Let's break down the function calls with, say, n = 5:
fib(5)
fib(4) + fib(3)
(fib(3) + fib(2)) + (fib(2) + fib(1))
((fib(2) + fib(1)) + (fib(1) + fib(0))) + ((fib(1) + fib(0)) + fib(1))
(((fib(1) + fib(0)) + fib(1)) + (fib(1) + fib(0))) + ((fib(1) + fib(0)) + fib(1))
In particular, fib(2) was calculated three times from scratch. In larger examples, many more values of fib, or sub-problems, are recalculated, leading to an exponential time algorithm.
Now, let's try it by storing the values we have already found in a data structure, say a map:
var m := map(0 → 0, 1 → 1)
function fib(n)
    if key n is not in map m
        m[n] := fib(n − 1) + fib(n − 2)
    return m[n]
Here we are saving the solutions of sub-problems in the map if we don't have them already. This technique of saving values which we have already calculated is termed memoization.
Finally, for a problem, first try to find its states (the possible sub-problems), and then think of a good recursion approach so that you can use the solutions of previous sub-problems in further ones.
Dynamic programming is a technique for solving problems with overlapping subproblems. A dynamic programming algorithm solves every subproblem just once and then saves its answer in a table (array), avoiding the work of recomputing the answer every time the subproblem is encountered.
The underlying idea of dynamic programming is:
Avoid calculating the same stuff twice, usually by keeping a table of known results of sub problems.
The seven steps in the development of a dynamic programming algorithm are as follows:
Establish a recursive property that gives the solution to an instance of the problem.
Develop a recursive algorithm based on that recursive property.
See if the same instance of the problem is being solved again and again in recursive calls.
Develop a memoized recursive algorithm
See the pattern in storing the data in the memory
Convert the memoized recursive algorithm into iterative algorithm
Optimize the iterative algorithm by using the storage as required (storage optimization)
In short, the difference between recursion, memoization and dynamic programming:
Dynamic programming, as the name suggests, uses previously calculated values to dynamically construct the next new solution.
Where to apply dynamic programming: if your solution is based on optimal substructure and overlapping sub-problems, then using the earlier calculated values will be useful so you do not have to recompute them. It is a bottom-up approach. Suppose you need to calculate fib(n); all you need to do is add the previously calculated values of fib(n-1) and fib(n-2).
Recursion: basically, subdividing your problem into smaller parts to solve it with ease, but keep in mind that it does not avoid recomputation if the same value was calculated previously in another recursive call.
Memoization: storing previously calculated recursion values in a table is known as memoization; it avoids recomputation if a value has already been calculated by some previous call, so any value is calculated only once. Before calculating, we check whether the value has already been calculated; if so, we return it from the table instead of recomputing. It is a top-down approach.
Here is a simple python code example of Recursive, Top-down, Bottom-up approach for Fibonacci series:
Recursive: O(2^n)
def fib_recursive(n):
    if n == 1 or n == 2:
        return 1
    else:
        return fib_recursive(n-1) + fib_recursive(n-2)

print(fib_recursive(40))
Top-down: O(n) Efficient for larger input
def fib_memoize_or_top_down(n, mem):
    if mem[n] != 0:
        return mem[n]
    else:
        mem[n] = fib_memoize_or_top_down(n-1, mem) + fib_memoize_or_top_down(n-2, mem)
        return mem[n]

n = 40
mem = [0] * (n+1)
mem[1] = 1
mem[2] = 1
print(fib_memoize_or_top_down(n, mem))
Bottom-up: O(n) For simplicity and small input sizes
def fib_bottom_up(n):
    if n == 1 or n == 2:
        return 1
    mem = [0] * (n+1)
    mem[1] = 1
    mem[2] = 1
    for i in range(3, n+1):
        mem[i] = mem[i-1] + mem[i-2]
    return mem[n]
print(fib_bottom_up(40))