Recursive vs iterative to get more performance

In terms of performance, which approach is better for the knapsack problem: iterative or recursive?
Limited to 1 second, I need to work out which of 40 items the knapsack should be filled with to get the most valuable load, a typical knapsack problem.
I know that a brute force that tries every combination gives me 2^41 - 1 subproblems to solve, so it is clearly impractical to use this solution, but is there a way to cut down the unneeded branches and make it as efficient as the iterative form?
On the other hand, if the weight capacity is very large, the DP matrix would be enormous, and that approach becomes as inefficient as the recursive one.

With that kind of problem, asking "iterative or recursive" doesn't get you anywhere. What you need to do is write code, measure what it is doing, and start understanding what takes time and why; as your understanding of the problem grows, you'll find more effective ways of attacking it.
The problem is NP-complete, which means there are at least pathological cases that cannot be solved quickly. But in practice, many instances can be solved quickly. You want to pick items with a high value/weight ratio, and items that fill the knapsack well. And you don't want to try all possibilities: you want to find one good solution first, and then use it to reject large sets of possibilities quickly.
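That last idea is essentially branch and bound. A minimal sketch of it, assuming items are (value, weight) pairs with positive weights; the fractional bound and the helper names are my own framing:

```python
# Branch and bound for 0/1 knapsack: keep the best solution found so far
# and prune any branch whose optimistic bound cannot beat it.
def knapsack_bb(items, capacity):
    # Sort by value/weight ratio so the fractional bound is easy to compute.
    items = sorted(items, key=lambda vw: vw[0] / vw[1], reverse=True)
    best = 0

    def bound(i, cap):
        # Optimistic estimate: fill the remaining capacity fractionally.
        total = 0.0
        for v, w in items[i:]:
            if w <= cap:
                cap -= w
                total += v
            else:
                return total + v * cap / w
        return total

    def search(i, cap, value):
        nonlocal best
        best = max(best, value)
        if i == len(items) or value + bound(i, cap) <= best:
            return  # even the optimistic bound cannot beat the incumbent
        v, w = items[i]
        if w <= cap:
            search(i + 1, cap - w, value + v)  # take item i
        search(i + 1, cap, value)              # skip item i

    search(0, capacity, 0)
    return best
```

The better the first incumbent, the more of the 2^40 subtrees get cut off without being explored.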

If it's a typical knapsack problem, isn't it possible to use dynamic programming, building up the results iteratively in a matrix and using the recursive formula only to evaluate each new entry from the previous ones?
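Yes. A minimal sketch of that bottom-up table, assuming integer weights; dp[c] holds the best value achievable with capacity c:

```python
def knapsack_dp(items, capacity):
    # items: list of (value, weight) pairs with integer weights.
    # dp[c] = best total value achievable with capacity c.
    dp = [0] * (capacity + 1)
    for value, weight in items:
        # Iterate capacities downwards so each item is used at most once.
        for c in range(capacity, weight - 1, -1):
            dp[c] = max(dp[c], dp[c - weight] + value)
    return dp[capacity]
```

This runs in O(n * capacity) time, which is exactly the caveat raised above: a huge capacity makes the table enormous.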

Related

When to use bottom-up DP and when to use top-down DP

I have learnt the two ways of doing DP, but I am confused now. How do we choose between them in different conditions? I find that most of the time top-down is more natural for me. Can anyone tell me how to make the choice?
PS: I have read this older post but am still confused. Please don't flag my question as a duplicate; the questions are different. I want to know how to choose, and when to approach a problem top-down versus bottom-up.
To keep it simple, I will explain based on my summary from several sources.
Top-down: the solution looks like a(n) = a(n-1) + a(n-2). With this equation, you can implement it in about 4-5 lines of code by making the function call itself. Its advantage, as you said, is that it is quite intuitive for most developers, but it costs more space (the call stack) to execute.
Bottom-up: you first calculate a(0), then a(1), and save them to some array (for instance); then you repeatedly save a(i) = a(i-1) + a(i-2). With this approach, you can significantly improve the performance of your code, and for big n you avoid stack overflow.
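A minimal sketch of both styles for the a(n) = a(n-1) + a(n-2) example above, in Python, with lru_cache standing in for a hand-rolled cache:

```python
from functools import lru_cache

# Top-down: the function calls itself; the cache avoids recomputation,
# but every call still consumes a stack frame.
@lru_cache(maxsize=None)
def a_top_down(n):
    if n < 2:
        return n
    return a_top_down(n - 1) + a_top_down(n - 2)

# Bottom-up: fill an array from the base cases upwards; no recursion,
# so no risk of stack overflow for large n.
def a_bottom_up(n):
    table = [0, 1] + [0] * max(0, n - 1)
    for i in range(2, n + 1):
        table[i] = table[i - 1] + table[i - 2]
    return table[n]
```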
A slightly longer answer, but I have tried to explain my own approach to dynamic programming and what I have come to understand after solving such questions. I hope future users find it helpful. Please do feel free to comment and discuss:
A top-down solution comes more naturally when thinking about a dynamic programming problem. You start with the end result and try to figure out the ways you could have gotten there. For example, for fib(n), we know that we could have gotten here only through fib(n-1) and fib(n-2). So we call the function recursively again to calculate the answer for these two cases, which goes deeper and deeper into the tree until the base case is reached. The answer is then built back up until all the stacks are popped off and we get the final result.
To reduce duplicate calculations, we use a cache that stores a new result and returns it if the function tries to calculate it again. So, if you imagine a tree, the function call does not have to go all the way down to the leaves, it already has the answer and so it returns it. This is called memoization and is usually associated with the top-down approach.
Now, one important point I think for the bottom-up approach is that you must know the order in which the final solution has to be built. In the top-down case, you just keep breaking one thing down into many, but in the bottom-up case, you must know the number and order of states that need to be involved in a calculation to go from one level to the next. In some simpler problems (e.g. fib(n)), this is easy to see, but for more complex cases, it does not lend itself naturally. The approach I usually follow is to think top-down, break the final case into previous states, and try to find a pattern or order to then be able to build it back up.
Regarding when to choose either of those, I would suggest the approach above to identify how the states are related to each other and being built. One important distinction you can find this way is how many calculations are really needed and how many might just be redundant. In the bottom-up case, you have to fill an entire level before you go to the next. However, in the top-down case, an entire subtree can be skipped if it is not needed, and in this way a lot of extra calculations can be saved.
Hence, the choice obviously depends on the problem, but also on the inter-relation between states. It is usually the case that bottom-up is recommended because it saves you stack space as compared to the recursive approach. However, if you feel the recursion isn't too deep but is very wide and tabulation would lead to a lot of unnecessary calculations, you can then go for the top-down approach with memoization.
For example, in this question: https://leetcode.com/problems/partition-equal-subset-sum/, if you see the discussions, it is mentioned that top-down is faster than bottom-up, basically, the binary tree approach with a cache versus the knapsack bottom up build-up. I leave it as an exercise to understand the relation between the states.
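For reference, a rough sketch of that top-down-with-cache approach to the partition problem; the reachable helper and its signature are my own framing:

```python
from functools import lru_cache

def can_partition(nums):
    total = sum(nums)
    if total % 2:
        return False

    @lru_cache(maxsize=None)
    def reachable(i, target):
        # Is there a subset of nums[i:] summing exactly to target?
        if target == 0:
            return True
        if i == len(nums) or target < 0:
            return False
        # As soon as one branch succeeds, the other subtree is skipped.
        return reachable(i + 1, target - nums[i]) or reachable(i + 1, target)

    return reachable(0, total // 2)
```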
Bottom-up and top-down DP approaches have the same time and space complexity for many problems. The differences are that bottom-up is a little bit faster, because you don't pay the overhead of recursion, and, yes, top-down is more intuitive and natural.
But the real advantage of the top-down approach shows on tasks where you don't need to calculate the answers to all the smaller subtasks! You can reduce the time complexity in these cases.
For example, you can use the top-down approach with memoization for finding the N-th Fibonacci number, where the sequence is defined as a[n] = a[n-1] + a[n-2]; both approaches take O(N) time (I am not comparing with the O(log N) solution for this number). But look at the sequence a[n] = a[n/2] + a[n/2-1] with some edge cases for small N. In the bottom-up approach you can't do better than O(N), whereas the top-down algorithm works in O(log N) (or perhaps some poly-logarithmic complexity, I am not sure).
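A sketch of that second sequence with memoization; the base cases for small N are assumed here, since they aren't specified above:

```python
from functools import lru_cache

# Hypothetical sequence a[n] = a[n//2] + a[n//2 - 1]; the base cases for
# small n are assumed, as the original doesn't pin them down.
@lru_cache(maxsize=None)
def a(n):
    if n < 2:
        return 1
    return a(n // 2) + a(n // 2 - 1)

# Top-down only ever touches indices reachable by repeated halving, so the
# number of distinct subproblems is poly-logarithmic in n, while a
# bottom-up table would have to fill all n entries.
```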
To add on to the previous answers,
Optimal time:
if all sub-problems need to be solved
→ bottom-up approach
else
→ top-down approach
Optimal space:
Bottom-up approach
The question Nikhil_10 linked (i.e. https://leetcode.com/problems/partition-equal-subset-sum/) doesn't require all subproblems to be solved, hence the top-down approach is the better choice there.
If top-down feels more natural to you, use it, provided you know you can implement it. Bottom-up is faster than top-down, and often easier as well. Make your decision depending on your situation.

The greedy algorithm and implementation

Hello, I've just started learning greedy algorithms, and I first looked at the classic coin changing problem. I could understand the greediness in that algorithm (i.e., choosing a locally optimal solution in the hope of reaching a global optimum), since I am choosing the highest-value coin such that sum + value of chosen coin <= total value. Then I started to solve some greedy algorithm problems on various sites. I could solve most of the problems, but couldn't figure out exactly where the greediness was applied. I coded the only solution I could think of for each problem and got it accepted. The editorials show the same way of solving the problems, but I still could not see the application of the greedy paradigm in the algorithm.
Are greedy algorithms the only way of solving a particular range of problems? Or are they one way of solving problems that can be more efficient?
Could you give me pseudocode for the same problem with and without the application of the greedy paradigm?
There are lots of real-life examples of greedy algorithms. One of the most obvious is the coin changing problem: to make change in a certain currency, we repeatedly dispense the largest denomination. Thus, to give out seventeen dollars and sixty-one cents in change, we give out a ten-dollar bill, a five-dollar bill, two one-dollar bills, two quarters, one dime, and one penny. By doing this, we are guaranteed to minimize the number of bills and coins. This algorithm does not work in all monetary systems, though.
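In code, that greedy dispensing loop looks roughly like this (denominations in cents, US-style; the function name is mine):

```python
def make_change(amount_cents,
                denominations=(1000, 500, 100, 25, 10, 5, 1)):
    # Greedy: repeatedly dispense the largest denomination that still fits.
    # Optimal for US currency, but not for every monetary system.
    result = {}
    for coin in sorted(denominations, reverse=True):
        count, amount_cents = divmod(amount_cents, coin)
        if count:
            result[coin] = count
    return result

# $17.61 -> a ten, a five, two ones, two quarters, a dime, a penny
print(make_change(1761))  # {1000: 1, 500: 1, 100: 2, 25: 2, 10: 1, 1: 1}
```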
I think that there is always another way to solve a problem, but sometimes, as you've stated, it will probably be less efficient.
For example, you can always check all the options (all coin combinations), store the results, and choose the best, but of course the efficiency is terrible.
Hope it helps.
Greedy algorithms are just a class of algorithms that iteratively construct/improve a solution.
Imagine the most famous problem: TSP. You can formulate it as an Integer Linear Programming problem, give it to an ILP solver, and it will give you the globally optimal solution (if it has enough time). But you could also do it in a greedy way: construct some solution (e.g. randomly), then look for changes (e.g. swapping the order of two cities) that improve your solution, and keep applying such changes until no improving change is possible.
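A minimal sketch of that construct-then-improve loop, using city swaps as the improving move; dist is assumed to be a matrix of pairwise distances:

```python
import random

def tour_length(tour, dist):
    # Total length of the closed tour.
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]]
               for i in range(len(tour)))

def local_search_tsp(dist):
    # Construct a random tour, then keep applying improving swaps of two
    # cities until no single swap shortens the tour (a local optimum).
    n = len(dist)
    tour = list(range(n))
    random.shuffle(tour)
    best = tour_length(tour, dist)
    improved = True
    while improved:
        improved = False
        for i in range(n):
            for j in range(i + 1, n):
                tour[i], tour[j] = tour[j], tour[i]
                length = tour_length(tour, dist)
                if length < best:
                    best, improved = length, True
                else:
                    tour[i], tour[j] = tour[j], tour[i]  # undo the swap
    return tour, best
```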
So the bottom line is: greedy algorithms are only one method of solving hard problems efficiently (in time, though not necessarily in the quality of the solution); there are other classes of algorithms for such problems.
For coins, the greedy algorithm is also the optimal one, so the "greediness" is not as visible as with some other problems.
In some cases you prefer a solution which is not the best one, but which you can compute much faster (computing the real best solution can take years, for example).
Then you choose a heuristic that should give you the best results based on the average input data, its structure, and what you want to accomplish.
On Wikipedia, there is a good example: finding the path with the biggest sum of numbers in a tree.
Imagine that the tree has, for example, 2^1000 nodes. To find the optimal solution, you would have to visit each node once; no personal computer today could do this in your lifetime, so you want some heuristic. The greedy algorithm, however, finds a solution in just 1000 steps (which takes no more than a millisecond).
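That greedy descent might look like the following hedged sketch; the Node type is just for illustration:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    value: int
    left: "Optional[Node]" = None
    right: "Optional[Node]" = None

def greedy_path_sum(root: Optional[Node]) -> int:
    # At every node, descend into whichever child holds the larger value.
    # For a tree of depth 1000 this takes 1000 steps, but it can miss
    # the path with the truly largest sum.
    total = 0
    node = root
    while node is not None:
        total += node.value
        children = [c for c in (node.left, node.right) if c is not None]
        node = max(children, key=lambda c: c.value, default=None)
    return total
```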

Algorithm without dynamic programming, less efficient solution

There is a problem asked in a contest. I have already solved it with dynamic programming, and its complexity is O(n^2). But I am looking for a less efficient solution. What would the complexity of this less efficient way be? Thanks for the help.
There is a general way to make any dynamic programming solution less efficient. The essence of dynamic programming is to store solutions to sub-problems for reuse.
To make it less efficient in a somewhat reasonable way, get rid of the sub-problem result storage. Instead, recalculate each sub-problem solution whenever you need it.
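For example, with Fibonacci: the memoized version below does O(n) work, and simply deleting the storage turns the same recurrence into an exponential-time computation:

```python
from functools import lru_cache

# With storage: each sub-problem is solved once -> O(n) work overall.
@lru_cache(maxsize=None)
def fib_stored(n):
    if n < 2:
        return n
    return fib_stored(n - 1) + fib_stored(n - 2)

# Storage removed: overlapping sub-problems are recomputed every time,
# so the same recurrence now takes exponential time.
def fib_recomputed(n):
    if n < 2:
        return n
    return fib_recomputed(n - 1) + fib_recomputed(n - 2)
```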
Using an inefficient data structure with the same algorithm can get you to O(n^3). Storing the towns in a linked list instead of an array makes the algorithm one order less efficient.
To make it even less efficient, it is easier to change the algorithm. For example, checking all harbinger-changing combinations and using the minimal one, which is exponential in time.

Memoization or Tabulation approach for Dynamic programming

There are many problems that can be solved using dynamic programming, e.g. longest increasing subsequence. Such problems can be solved by two approaches:
Memoization (Top Down) - Using recursion to solve the sub-problem and storing the result in some hash table.
Tabulation (Bottom Up) - Using Iterative approach to solve the problem by solving the smaller sub-problems first and then using it during the execution of bigger problem.
My question is which is better approach in terms of time and space complexity?
Short answer: it depends on the problem!
Memoization usually requires more code and is less straightforward, but has computational advantages in some problems, mainly those which you do not need to compute all the values for the whole matrix to reach the answer.
Tabulation is more straightforward, but may compute unnecessary values. If you do need to compute all the values, this method is usually faster, though, because of the smaller overhead.
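For the longest increasing subsequence example from the question, both versions end up computing every state, so the gap is mostly the recursion overhead; a sketch of both (helper names are mine):

```python
from functools import lru_cache

def lis_tabulation(nums):
    # Bottom-up: dp[i] = length of the longest increasing subsequence
    # ending at index i; every entry is filled in order.
    dp = [1] * len(nums)
    for i in range(len(nums)):
        for j in range(i):
            if nums[j] < nums[i]:
                dp[i] = max(dp[i], dp[j] + 1)
    return max(dp, default=0)

def lis_memoization(nums):
    # Top-down: same recurrence, computed on demand through a cache.
    @lru_cache(maxsize=None)
    def ending_at(i):
        return 1 + max((ending_at(j) for j in range(i) if nums[j] < nums[i]),
                       default=0)
    return max((ending_at(i) for i in range(len(nums))), default=0)
```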
First, understand what dynamic programming is.
If the problem at hand can be broken down into sub-problems whose solutions are also optimal and can be combined to reach the solution of the original/bigger problem, then we can apply dynamic programming.
It's a way of solving a problem by storing the results of sub-problems in memory and reusing them instead of recalculating them at a later stage.
Remember, the ideal case for dynamic programming is when you can reuse the solutions of sub-problems more than once; otherwise, there is no point in storing the results.
Now, dynamic programming can be applied bottom-up (tabulation) or top-down (memoization).
Tabulation: we start by calculating solutions to the smallest sub-problems and progress one level up at a time, following the bottom-up approach.
Note that here we exhaustively find solutions for every sub-problem, without knowing whether they are really needed later.
Memoization: we start with the original problem and keep breaking it down one level at a time until we hit the base cases whose solutions we know. In most cases, such breaking down (the top-down approach) is recursive, so it is slower when the problem actually uses every sub-solution, because of the recursive calls. But when not all sub-solutions are needed, memoization performs better than tabulation.
I found this short video quite helpful: https://youtu.be/p4VRynhZYIE
Asymptotically, a top-down dynamic programming implementation is the same as a bottom-up one, assuming you're using the same recurrence relation. However, bottom-up is generally more efficient in practice because of the overhead of the recursion that memoization relies on.
If the problem has the overlapping sub-problems property, then use memoization; otherwise, it depends on the problem.

What classes of algorithms can be used for timetabling?

Yes, there are many such questions like this on SO. I saw genetic algorithms were the most common answers.
Making a timetable schedule and
Algorithm for creating a school timetable
However, I am worried about these characteristics of a GA:
the termination condition of the program is hard to define
it can't escape local maxima easily
I expect the program to be pushed towards conflicting criteria and impossible solutions all too readily by its users.
Hence I want a method that:
is decisive: guaranteed to reach a near-optimal solution, or to report that it cannot reach one
can take both hard (inviolable) limits and soft limits
elegantly takes in user-input constraints; if a user-input constraint doesn't work, support for it can be added to the code without breaking anything
There are only 100,000 possible timetables in total, so exhaustive enumeration is feasible.
I searched around and saw that metaheuristic algorithms like simulated annealing are a good candidate. What about dynamic programming algorithms?
Is a brute-force approach okay for such a data set?
What is a good algorithm that can fit the criteria?
For small input, with only 100,000 possibilities, I'd go with a plain brute-force solution: just check all the possibilities and choose the best of them. For modern machines, running your scoring function on 100,000 candidates is not computationally hard and will most likely take just a few seconds.
GA and other AI algorithms are usually used for much larger inputs [billions of possibilities and more], so they might not be the best solution in your case.
Unlike those approaches, the brute-force solution guarantees you the optimal answer, and it terminates once it has exhausted all possible solutions.
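A brute-force skeleton could look like the following, where candidates, score, and hard_ok are placeholders for your enumeration, soft-constraint scoring, and hard-constraint check:

```python
def best_timetable(candidates, score, hard_ok):
    # Exhaustive search: with only ~100,000 candidate timetables, scoring
    # each one and keeping the best is fast and guarantees the optimum.
    best, best_score = None, float("-inf")
    for timetable in candidates:
        if not hard_ok(timetable):      # hard constraints are inviolable
            continue
        s = score(timetable)            # soft constraints feed the score
        if s > best_score:
            best, best_score = timetable, s
    return best, best_score
```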
(*) Note: you can modify GA and steepest-ascent hill climbing to overcome the 2nd problem you mention [escaping local maxima] by enforcing random restarts whenever the solution has not improved for k steps, but even then you will have no idea how close you are to the optimal solution at any point.
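A sketch of that random-restart idea; the random_solution, random_neighbor, and score callbacks are placeholders for problem-specific code:

```python
def hill_climb_with_restarts(random_solution, random_neighbor, score,
                             restarts=20, k=100):
    # Stochastic hill climbing with random restarts: try random neighbours,
    # and restart from a fresh random solution once no improvement has been
    # seen for k consecutive steps.
    best, best_score = None, float("-inf")
    for _ in range(restarts):
        current = random_solution()
        current_score = score(current)
        stale = 0
        while stale < k:
            candidate = random_neighbor(current)
            candidate_score = score(candidate)
            if candidate_score > current_score:
                current, current_score = candidate, candidate_score
                stale = 0
            else:
                stale += 1
        if current_score > best_score:
            best, best_score = current, current_score
    return best, best_score
```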
