I heard the only difference between dynamic programming and back tracking is DP allows overlapping of sub problems, e.g.
fib(n) = fib(n-1) + fib (n-2)
Is it right? Are there any other differences?
Also, I would like know some common problems solved using these techniques.
There are two typical implementations of Dynamic Programming approach: bottom-to-top and top-to-bottom.
Top-to-bottom Dynamic Programming is nothing else than ordinary recursion, enhanced with memorizing the solutions for intermediate sub-problems. When a given sub-problem arises second (third, fourth...) time, it is not solved from scratch, but instead the previously memorized solution is used right away. This technique is known under the name memoization (no 'r' before 'i').
This is actually what your example with Fibonacci sequence is supposed to illustrate. Just use the recursive formula for Fibonacci sequence, but build the table of fib(i) values along the way, and you get a Top-to-bottom DP algorithm for this problem (so that, for example, if you need to calculate fib(5) second time, you get it from the table instead of calculating it again).
In Bottom-to-top Dynamic Programming the approach is also based on storing sub-solutions in memory, but they are solved in a different order (from smaller to bigger), and the resultant general structure of the algorithm is not recursive. LCS algorithm is a classic Bottom-to-top DP example.
Bottom-to-top DP algorithms are usually more efficient, but they are generally harder (and sometimes impossible) to build, since it is not always easy to predict which primitive sub-problems you are going to need to solve the whole original problem, and which path you have to take from small sub-problems to get to the final solution in the most efficient way.
Dynamic problems also requires "optimal substructure".
According to Wikipedia:
Dynamic programming is a method of
solving complex problems by breaking
them down into simpler steps. It is
applicable to problems that exhibit
the properties of 1) overlapping
subproblems which are only slightly
smaller and 2) optimal substructure.
Backtracking is a general algorithm
for finding all (or some) solutions to
some computational problem, that
incrementally builds candidates to the
solutions, and abandons each partial
candidate c ("backtracks") as soon as
it determines that c cannot possibly
be completed to a valid solution.
For a detailed discussion of "optimal substructure", please read the CLRS book.
Common problems for backtracking I can think of are:
Eight queen puzzle
Map coloring
Sudoku
DP problems:
This website at MIT has a good collection of DP problems with nice animated explanations.
A chapter from a book from a professor at Berkeley.
One more difference could be that Dynamic programming problems usually rely on the principle of optimality. The principle of optimality states that an optimal sequence of decision or choices each sub sequence must also be optimal.
Backtracking problems are usually NOT optimal on their way! They can only be applied to problems which admit the concept of partial candidate solution.
Say that we have a solution tree, whose leaves are the solutions for the original problem, and whose non-leaf nodes are the suboptimal solutions for part of the problem. We try to traverse the solution tree for the solutions.
Dynamic programming is more like BFS: we find all possible suboptimal solutions represented the non-leaf nodes, and only grow the tree by one layer under those non-leaf nodes.
Backtracking is more like DFS: we grow the tree as deep as possible and prune the tree at one node if the solutions under the node are not what we expect.
Then there is one inference derived from the aforementioned theory: Dynamic programming usually takes more space than backtracking, because BFS usually takes more space than DFS (O(N) vs O(log N)). In fact, dynamic programming requires memorizing all the suboptimal solutions in the previous step for later use, while backtracking does not require that.
DP allows for solving a large, computationally intensive problem by breaking it down into subproblems whose solution requires only knowledge of the immediate prior solution. You will get a very good idea by picking up Needleman-Wunsch and solving a sample because it is so easy to see the application.
Backtracking seems to be more complicated where the solution tree is pruned is it is known that a specific path will not yield an optimal result.
Therefore one could say that Backtracking optimizes for memory since DP assumes that all the computations are performed and then the algorithm goes back stepping through the lowest cost nodes.
IMHO, the difference is very subtle since both (DP and BCKT) are used to explore all possibilities to solve a problem.
As for today, I see two subtelties:
BCKT is a brute force solution to a problem. DP is not a brute force solution. Thus, you might say: DP explores the solution space more optimally than BCKT. In practice, when you want to solve a problem using DP strategy, it is recommended to first build a recursive solution. Well, that recursive solution could be considered also the BCKT solution.
There are hundreds of ways to explore a solution space (wellcome to the world of optimization) "more optimally" than a brute force exploration. DP is DP because in its core it is implementing a mathematical recurrence relation, i.e., current value is a combination of past values (bottom-to-top). So, we might say, that DP is DP because the problem space satisfies exploring its solution space by using a recurrence relation. If you explore the solution space based on another idea, then that won't be a DP solution. As in any problem, the problem itself may facilitate to use one optimization technique or another, based on the problem structure itself. The structure of some problems enable to use DP optimization technique. In this sense, BCKT is more general though not all problems allow BCKT too.
Example: Sudoku enables BCKT to explore its whole solution space. However, it does not allow to use DP to explore more efficiently its solution space, since there is no recurrence relation anywhere that can be derived. However, there are other optimization techniques that fit with the problem and improve brute force BCKT.
Example: Just get the minimum of a classic mathematical function. This problem does not allow BCKT to explore the state space of the problem.
Example: Any problem that can be solved using DP can also be solved using BCKT. In this sense, the recursive solution of the problem could be considered the BCKT solution.
Hope this helps a bit.
In a very simple sentence I can say: Dynamic programming is a strategy to solve optimization problem. optimization problem is about minimum or maximum result (a single result). but in, Backtracking we use brute force approach, not for optimization problem. it is for when you have multiple results and you want all or some of them.
Depth first node generation of state space tree with bounding function is called backtracking. Here the current node is dependent on the node that generated it.
Depth first node generation of state space tree with memory function is called top down dynamic programming. Here the current node is dependant on the node it generates.
Related
Let’s use as an example the problem LeetCode 322. Coin Change
I know it is best solved by using Dynamic Programming, but I want to focus on my Brute Force solution:
class Solution:
def coinChange(self, coins: List[int], amount: int) -> int:
curr_min = float('inf')
def helper(amount):
nonlocal curr_min
if amount < 0:
return float('inf')
if amount == 0:
return 0
for coin in coins:
curr_min = min(curr_min, helper(amount-coin) + 1)
return curr_min
ans = helper(amount)
return -1 if ans == float('inf') else ans
The Recursion Tree looks like: Recursion Tree
I can say it is Divide and Conquer: We are dividing the problem into smaller sub-problems, solving individually and using those individual results to construct the result for the original problem.
I can also say it is Backtracking: we are enumerating all combinations of coin frequencies which satisfy the constraints.
I know both are implemented via Recursion, but I would like to know which paradigm my Brute Force solution belongs to: Divide and Conquer or Backtracking.
A complication in categorizing your algorithm is that there aren’t clear, well-defined boundaries between different classes of algorithms and different people might have slightly different definitions in mind.
For example, generally speaking, divide-and-conquer algorithms involve breaking the problem apart into non-overlapping subproblems. (See, for example, mergesort, quicksort, binary search, closest pair of points, etc.) In that sense, your algorithm doesn’t nicely map onto the divide-and-conquer paradigm, since the subproblems you’re considering involve some degree of overlap in the subproblems they solve. (Then again, not all divide-and-conquer algorithms have this property. See, for example, stoogesort.)
Similarly, backtracking algorithms usually, but not always, work by committing to a decision, recursively searching to see whether a solution exists given that decision, then unwinding the choice if it turns out not to lead to a solution. Your algorithm doesn’t have this property, since it explores all options and then takes the best. (When I teach intro programming, I usually classify algorithms this way. But my colleagues sometimes describe what you’re doing as backtracking!)
I would classify your algorithm as belonging to a different family of exhaustive search. The algorithm you’ve proposed essentially works by enumerating all possible ways of making change, then returning the one that uses the fewest coins. Exhaustive search algorithms are ones that work by trying all possible options and returning the best, and I think that’s the best way of classifying your strategy.
To me this doesn't fit with either paradigm.
Backtracking to me is associated with reaching a point where the candidate cannot be further developed, but here we develop it to it's end, infinity, and we don't throw it away, we use it in comparisons.
Divide and conquer I associate with a division into a relatively small number of candidate groups (the classic example is two, like binary search). To call each path in a recursion a group for the sake of Divide and Conquer would lose the latter's meaning.
The most practical answer is it doesn't matter.
Safest answer recursion. My best interpretation is that its backtracking.
I think the options here are recursion, backtracking, divide-and-conquer, and dynamic programming.
Recursion being the most general and encapsulating of backtracking, D&C, and DP. If indeed it has backtracking and D&C algorithms then recursion would be the best answer as it contains both.
In Skiena's ADM (Section 5.3.1), it says:
A typical divide-and-conquer algorithm breaks a given problem into a smaller pieces, each of which is of size n/b.
By this interpretation is doesn't meet the as we divide our solution by coins and each coin amount being a different size.
In Erickson's Algorithms (section 1.6), it says:
divide and conquer:
Divide the given instance of the problem into several independent smaller instances of exactly the same problem.
So in this case, according to the recursion tree, are not always independent (they overlap).
Which leaves backtracking. Erickson defines the 'recursive strategy' as:
A backtracking algorithm tries to construct a solution to a computational problem incrementally, one small piece at a time.
Which seems general enough to fit all DP problems under it. The provided code can be said it backtracks when a solution path fails.
Additionally, according to Wikipedia:
It is often the most convenient technique for parsing, for the knapsack problem and other combinatorial optimization problems.
Coin Change being an Unbounded Knapsack type problem, then it fits into the description of backtracking.
I just read this short post about mental models for Recursive Memoization vs Dynamic Programming, written by professor Krishnamurthi. In it, Krishnamurthi represents memoization's top-down structure as a recursion tree, and DP's bottom-up structure as a DAG where the source vertices are the first – likely smallest – subproblems solved, and the sink vertex is the final computation (essentially the graph is the same as the aforementioned recursive tree, but with all the edges flipped). Fair enough, that makes perfect sense.
Anyways, towards the end he gives a mental exercise to the reader:
Memoization is an optimization of a top-down, depth-first computation
for an answer. DP is an optimization of a bottom-up, breadth-first
computation for an answer.
We should naturally ask, what about
top-down, breadth-first
bottom-up, depth-first
Where do they fit into
the space of techniques for avoiding recomputation by trading off
space for time?
Do we already have names for them? If so, what?, or
Have we been missing one or two important tricks?, or
Is there a reason we don't have names for these?
However, he stops there, without giving his thoughts on these questions.
I'm lost, but here goes:
My interpretation is that a top-down, breadth-first computation would require a separate process for each function call. A bottom-up, depth-first approach would somehow piece together the final solution, as each trace reaches the "sink vertex". The solution would eventually "add up" to the right answer once all calls are made.
How off am I? Does anyone know the answer to his three questions?
Let's analyse what the edges in the two graphs mean. An edge from subproblem a to b represents a relation where a solution of b is used in the computation of a and must be solved before it. (The other way round in the other case.)
Does topological sort come to mind?
One way to do a topological sort is to perform a Depth First Search and on your way out of every node, process it. This is essentially what Recursive memoization does. You go down Depth First from every subproblem until you encounter one that you haven't solved (or a node you haven't visited) and you solve it.
Dynamic Programming, or bottom up - breadth first problem solving approach involves solving smaller problems and constructing solutions to larger ones from them. This is the other approach to doing a topological sort, where you visit the node with a in-degree of 0, process it, and then remove it. In DP, the smallest problems are solved first because they have a lower in-degree. (Smaller is subjective to the problem at hand.)
The problem here is the generation of a sequence in which the set of subproblems must be solved. Both top-down breadth-first and bottom-up depth-first can't do that.
Top-down Breadth-first will still end up doing something very similar to the depth-first counter part even if the process is separated into threads. There is an order in which the problems must be solved.
A bottom-up depth-first approach MIGHT be able to partially solve problems but the end result would still be similar to the breadth first counter part. The subproblems will be solved in a similar order.
Given that these approaches have almost no improvements over the other approaches, do not translate well with analogies and are tedious to implement, they aren't well established.
#AndyG's comment is pretty much on the point here. I also like #shebang's answer, but here's one that directly answers these questions in this context, not through reduction to another problem.
It's just not clear what a top-down, breadth-first solution would look like. But even if you somehow paused the computation to not do any sub-computations (one could imagine various continuation-based schemes that might enable this), there would be no point to doing so, because there would be sharing of sub-problems.
Likewise, it's unclear that a bottom-up, depth-first solution could solve the problem at all. If you proceed bottom-up but charge all the way up some spine of the computation, but the other sub-problems' solutions aren't already ready and lying in wait, then you'd be computing garbage.
Therefore, top-down, breadth-first offers no benefit, while bottom-up, depth-first doesn't even offer a solution.
Incidentally, a more up-to-date version of the above blog post is now a section in my text (this is the 2014 edition; expect updates.
This is a paragraph of the book: Introduction to Algorithms, 3rd Edition. p.336
"These two approaches yield algorithms with the same asymptotic running time,
expect in unusual circumstances where the top-down approach does not actually
recurse to examine all possible subproblems. The bottom-up approach often has
much better constant factors, since it has less overhead for procedure calls."
The Context : two approaches are first top-down + memoization(DP) and second
bottom-up method.
I got a question for you one more. Does 'overhead' of function call mean every function call needs time? Even if we solve all subproblems, top-down takes more time because of the 'overhead'?
A bottom-up approach to dynamic programming means solving all the small problems first, and then using them to find answers to the next smallest, and so on. So, for instance, if the solution to a problem of length n depends only on answers to problems of length n-1, you might start by putting in all the solutions for length 0, then you'd iteratively fill in solutions to length 1, 2, 3, and so on, each time using the answers you'd already calculated at the previous level. It is efficient in that it means you don't end up solving a sub-problem twice.
A top-down with memoization approach would look at it the other way. If you want the solution to a problem of length 10, then you do so recursively. You notice that it relies on (say) three problems of length 9, so you recursively solve them, and then you know the answer of length 10. But whenever you solve a sub-problem, you remember the answer, and whenever you need the answer to a sub-problem, you look first to see whether you've already solved it, and if you have, you return the cached answer.
The bottom-up approach is good in that it can be written iteratively (using for loops) rather than recursively, which means you don't run out of stack space on large problems, and loops are also faster. Its disadvantage is that you solve all the sub-problems, and you might not need them all to be solved in order to solve the large problem you want the answer to.
The top-down approach is slower if you need all the sub-problems solved anyway, because of the recursion overhead. But it is faster if the problem you're solving only needs a smallish subset of the sub-problems to be solved, because it only solves the ones that it needs.
It is essentially the same as the difference between eager evaluation (bottom up) and lazy evaluation (top down).
What is the main difference between dynamic programming and greedy approach in terms of usage?
As far as I understood, the greedy approach sometimes gives an optimal solution; in other cases, the dynamic programming approach gives an optimal solution.
Are there any particular conditions which must be met in order to use one approach (or the other) to obtain an optimal solution?
Based on Wikipedia's articles.
Greedy Approach
A greedy algorithm is an algorithm that follows the problem solving heuristic of making
the locally optimal choice at each stage with the hope of finding a global optimum. In
many problems, a greedy strategy does not in general produce an optimal solution, but nonetheless a greedy heuristic may yield locally optimal solutions that approximate a global optimal solution in a reasonable time.
We can make whatever choice seems best at the moment and then solve the subproblems that arise later. The choice made by a greedy algorithm may depend on choices made so far but not on future choices or all the solutions to the subproblem. It iteratively makes one greedy choice after another, reducing each given problem into a smaller one.
Dynamic programming
The idea behind dynamic programming is quite simple. In general, to solve a given problem, we need to solve different parts of the problem (subproblems), then combine the solutions of the subproblems to reach an overall solution. Often when using a more naive method, many of the subproblems are generated and solved many times. The dynamic programming approach seeks to solve each subproblem only once, thus reducing the number of computations: once the solution to a given subproblem has been computed, it is stored or "memo-ized": the next time the same solution is needed, it is simply looked up. This approach is especially useful when the number of repeating subproblems grows exponentially as a function of the size of the input.
Difference
Greedy choice property
We can make whatever choice seems best at the moment and then solve the subproblems that arise later. The choice made by a greedy algorithm may depend on choices made so far but not on future choices or all the solutions to the subproblem. It iteratively makes one greedy choice after another, reducing each given problem into a smaller one. In other words, a greedy algorithm never reconsiders its choices.
This is the main difference from dynamic programming, which is exhaustive and is guaranteed to find the solution. After every stage, dynamic programming makes decisions based on all the decisions made in the previous stage, and may reconsider the previous stage's algorithmic path to solution.
For example, let's say that you have to get from point A to point B as fast as possible, in a given city, during rush hour. A dynamic programming algorithm will look into the entire traffic report, looking into all possible combinations of roads you might take, and will only then tell you which way is the fastest. Of course, you might have to wait for a while until the algorithm finishes, and only then can you start driving. The path you will take will be the fastest one (assuming that nothing changed in the external environment).
On the other hand, a greedy algorithm will start you driving immediately and will pick the road that looks the fastest at every intersection. As you can imagine, this strategy might not lead to the fastest arrival time, since you might take some "easy" streets and then find yourself hopelessly stuck in a traffic jam.
Some other details...
In mathematical optimization, greedy algorithms solve combinatorial problems having the properties of matroids.
Dynamic programming is applicable to problems exhibiting the properties of overlapping subproblems and optimal substructure.
I would like to cite a paragraph which describes the major difference between greedy algorithms and dynamic programming algorithms stated in the book Introduction to Algorithms (3rd edition) by Cormen, Chapter 15.3, page 381:
One major difference between greedy algorithms and dynamic programming is that instead of first finding optimal solutions to subproblems and then making an informed choice, greedy algorithms first make a greedy choice, the choice that looks best at the time, and then solve a resulting subproblem, without bothering to solve all possible related smaller subproblems.
Difference between greedy method and dynamic programming are given below :
Greedy method never reconsiders its choices whereas Dynamic programming may consider the previous state.
Greedy algorithm is less efficient whereas Dynamic programming is more efficient.
Greedy algorithm have a local choice of the sub-problems whereas Dynamic programming would solve the all sub-problems and then select one that would lead to an optimal solution.
Greedy algorithm take decision in one time whereas Dynamic programming take decision at every stage.
In simple words we can say that in Dynamic Programming (having problem sending message on network) one can first examine the path which takes the shortest time and then start journey,
On the other hand Greedy algorithm take the optimal decision on the spot without thinking for the next step and on the next step change its decision again and so on...
Notes: Dynamic programming is reliable while Greedy Algorithms is not reliable always.
With the reference of Biswajit Roy:
Dynamic Programming firstly plans then Go.
and
Greedy algorithm uses greedy choice, it firstly Go then continuously Plans.
the major difference between greedy method and dynamic programming is in greedy method only one optimal decision sequence is ever generated and in dynamic programming more than one optimal decision sequence may be generated.
There are many problems that can be solved using Dynamic programming e.g. Longest increasing subsequence. This problem can be solved by using 2 approaches
Memoization (Top Down) - Using recursion to solve the sub-problem and storing the result in some hash table.
Tabulation (Bottom Up) - Using Iterative approach to solve the problem by solving the smaller sub-problems first and then using it during the execution of bigger problem.
My question is which is better approach in terms of time and space complexity?
Short answer: it depends on the problem!
Memoization usually requires more code and is less straightforward, but has computational advantages in some problems, mainly those which you do not need to compute all the values for the whole matrix to reach the answer.
Tabulation is more straightforward, but may compute unnecessary values. If you do need to compute all the values, this method is usually faster, though, because of the smaller overhead.
First understand what is dynamic programming?
If a problem at hand can be broken down to sub-problems whose solutions are also optimal and can be combined to reach solution of original/bigger problem. For such problems, we can apply dynamic programming.
It's way of solving a problem by storing the results of sub-problems in program memory and reuse it instead of recalculating it at later stage.
Remember the ideal case of dynamic programming usage is, when you can reuse the solutions of sub-problems more than one time, otherwise, there is no point in storing the result.
Now, dynamic programming can be applied in bottom-up approach(Tabulation) and top-down approach(Memoization).
Tabulation: We start with calculating solutions to smallest sub-problem and progress one level up at a time. Basically follow bottom-up approach.
Here note, that we are exhaustively finding solutions for each of the sub-problems, without knowing if they are really neeeded in future.
Memoization: We start with the original problem and keep breaking it one level down till the base case whose solution we know. In most cases, such breaking down(top-down approach) is recursive. Hence, time taken is slower if problem is using each steps sub-solutions due to recursive calls. But, in case when all sub-solutions are not needed then, Memoization performs better than Tabulation.
I found this short video quite helpful: https://youtu.be/p4VRynhZYIE
Asymptotically a dynamic programming implementation that is top down is the same as going bottom up, assuming you're using the same recurrence relation. However, bottom up is generally more efficient because of the overhead of recursion which is used in memoization.
If the problem has overlapping sub-problems property then use Memoization, else it depends on the problem