Dynamic Programming - complexity - algorithm

I have a homework problem that I have been trying to figure out for some time now, and I can't figure it out for the life of me.
I have a sheet of size X*Y, and a set of patterns of smaller sizes, each with a price value associated with it. I can cut the sheet either horizontally or vertically, and I have to find the optimal cutting pattern to get the greatest profit from the sheet.
As far as I can tell there should be about (X*Y)*(X+Y+#ofPatterns) recursive operations, yet the complexity is supposed to be exponential. Can someone please explain why?
The pseudo-code I have come up with is as follows:
Optimize( w, h ) {
    best_price = 0
    for (Pattern p : all patterns) {
        if (p fits into a w x h piece && p's price > best_price) { best_price = p's price }
    }
    for (i = 1 … w-1) {                  // vertical cuts
        L = Optimize( i, h )
        R = Optimize( w-i, h )
        if (L + R > best_price) { best_price = L + R }
    }
    for (i = 1 … h-1) {                  // horizontal cuts
        T = Optimize( w, i )
        B = Optimize( w, h-i )
        if (T + B > best_price) { best_price = T + B }
    }
    return best_price
}

The recursion is exponential because at the start you can choose to cut your sheet anywhere from 1 up to the maximum width, or from 1 up to the maximum height, and then you recursively cut each of the remaining pieces in the same way.
This problem sounds like a more interesting, two-dimensional version of the rod cutting problem.
http://www.radford.edu/~nokie/classes/360/dp-rod-cutting.html
is a good guide. Reading it should put you on the right track without blatantly answering your homework.
The relevant portion explaining why it is exponential when recursing:
This recursive algorithm uses the formula above and is slow
Code
-- price array p, length n
Cut-Rod(p, n)
  if n = 0 then
    return 0
  end if
  q := MinInt
  for i in 1 .. n loop
    q := max(q, p(i) + Cut-Rod(p, n-i))
  end loop
  return q
Recursion tree (shows subproblems): 4/[3,2,1,0]//[2,1,0],[1,0],0//[1,0],0,0//0
Performance: Let T(n) = the number of calls to Cut-Rod(x, n), for any x
T(0) = 1
T(n) = 1 + Σ_{i=1..n} T(n−i) = 1 + Σ_{j=0..n−1} T(j)
Solution: T(n) = 2^n
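For contrast, here is a minimal bottom-up version of the same rod-cutting recurrence (my own sketch, not code from the linked guide); by filling a table from small lengths upward, each subproblem is solved once and the running time drops from O(2^n) to O(n^2):

#include <algorithm>
#include <vector>

// Bottom-up rod cutting: best[len] = best price obtainable from a rod of length len.
// p[i] is the price of a piece of length i (p[0] is unused). Runs in O(n^2).
int cutRodBottomUp(const std::vector<int>& p, int n) {
    std::vector<int> best(n + 1, 0);
    for (int len = 1; len <= n; ++len) {
        for (int i = 1; i <= len; ++i) {
            best[len] = std::max(best[len], p[i] + best[len - i]);
        }
    }
    return best[n];
}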

When calculating the complexity of a dynamic programming algorithm, we can decompose it into two parts: counting the number of distinct substates, and working out the time needed to solve one particular substate.
But it's true that when you don't use memoization, an algorithm that is polynomial in nature blows up to exponential time, because you are not re-using information you have already calculated. (I'm pretty sure you understand this part from your dynamic programming course.)
No matter whether you solve a dynamic programming problem using the memoization method or the bottom-up approach, the time complexity stays the same. I think the trouble you are having is that you are trying to draw the function call graph in your head. Instead, let's try to estimate the number of function calls this way.
You are saying that there are (X*Y)(X+Y+#ofPatterns) recursive calls.
Well, yes and no.
It's true that when you use memoization there are only that many recursive calls, because once you have called and computed a certain Optimize(w0,h0), the value is stored, and the next time some other call Optimize(w1,h1) needs Optimize(w0,h0), it won't redo that work. That is what makes the time complexity polynomial.
But in your current implementation, each subproblem Optimize(w0,h0) is recomputed through many redundant function calls, which means the number of recursive calls in your algorithm is not polynomial at all (for a simple example, try to draw the call graph of the naive recursive Fibonacci algorithm).
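To make that concrete, here is a minimal memoized sketch of the Optimize routine from the question (C++ rather than pseudocode; bestPatternPrice is a hypothetical helper standing in for the pattern loop). With the cache, each (w, h) pair is solved only once, which is what gives the polynomial bound; without it, the identical recursion is exponential:

#include <algorithm>
#include <vector>

// Hypothetical helper: best price of any single pattern that fits a w x h piece (0 if none fits).
int bestPatternPrice(int w, int h);

// cache[w][h] holds the best price for a w x h piece, or -1 if not yet computed.
std::vector<std::vector<int>> cache;

int optimize(int w, int h) {
    int& memo = cache[w][h];
    if (memo != -1) return memo;               // reuse a previously solved subproblem

    int best = bestPatternPrice(w, h);
    for (int i = 1; i < w; ++i)                // vertical cuts
        best = std::max(best, optimize(i, h) + optimize(w - i, h));
    for (int i = 1; i < h; ++i)                // horizontal cuts
        best = std::max(best, optimize(w, i) + optimize(w, h - i));
    return memo = best;
}

// Before the first call: cache.assign(X + 1, std::vector<int>(Y + 1, -1));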

Related

Differential equations vs algorithm complexity

I don't know if this is the right place to ask, because my question is about how to calculate the complexity of an algorithm using the differential-equation growth and decay method.
The algorithm I would like to analyze is binary search on a sorted array, which has complexity log2(n).
The algorithm says: if the target value we are searching for is equal to the mid element, return its index. If it's less, search the left sub-array; if it's greater, search the right sub-array.
As you can see, N(t) (the number of remaining elements at time t) is halved at each step. Therefore, we can say that it takes O(log2(n)) to find an element.
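For concreteness, here is a minimal iterative binary-search sketch of the step being modeled (an illustration under the usual assumption that the array a is sorted):

#include <vector>

// Returns the index of target in the sorted array a, or -1 if it is absent.
// Each iteration halves the remaining range, which is where the log2(n) bound comes from.
int binarySearch(const std::vector<int>& a, int target) {
    int lo = 0, hi = static_cast<int>(a.size()) - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;          // avoids overflow of lo + hi
        if (a[mid] == target) return mid;
        if (a[mid] < target) lo = mid + 1;     // discard the left half
        else hi = mid - 1;                     // discard the right half
    }
    return -1;
}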
Now using differential equation growth and decay method.
dN(t)/dt = N(t)/2
dN(t): How fast the number of elements is increasing or decreasing
dt: With respect to time
N(t): Number of elements at time t
The above equation says that the number of elements is being divided by 2 over time.
Solving the above equations gives us:
dN(t)/N(t) = dt/2
ln(N(t)) = t/2 + c
t = 2*ln(N(t)) + d
Even though we got t proportional to ln(N(t)) and not log2(N(t)), we can still say that it's logarithmic, since the two differ only by a constant factor.
Unfortunately, the above method, even though it seems to make sense for binary search, turns out not to work for all algorithms. Here's a counterexample:
Searching an array linearly: O(n)
dN(t)/dt = N(t)
dN(t)/N(t) = dt
t = ln(N(t)) + d
So according to this method, the complexity of searching linearly takes O(ln(n)) which is NOT true of course.
This differential-equation method is called growth and decay and it's very popular. So I would like to know whether this method can be applied to computer science algorithms like the ones I picked, and if yes, what did I do wrong to get an incorrect result for the linear search? Thank you.
The time an algorithm takes to execute is proportional to the number of steps it performs (here, the number of elements it eliminates).
In your linear searching of the array, you have assumed that dN(t)/dt = N(t).
Incorrect Assumption :-
dN(t)/dt = N(t)
dN(t)/N(t) = dt
t = ln(N(t)) + d
Going by your earlier reasoning, binary search halves the number of remaining terms in each pass (half of the terms are eliminated from the traversal every time, so the number of search terms is reduced by half). So your model dN(t)/dt = N(t)/2 was fine. But when you search an array linearly, you access one element per pass, so the number of remaining search terms decreases by exactly one item in each pass. So how can your assumption dN(t)/dt = N(t) be true?
Correct Assumption :-
dN(t)/dt = 1
dN(t)/1 = dt
t = N(t) + d
I hope you got my point. The array elements are accessed sequentially, one per pass (iteration), so the count changes at a constant rate of 1, not at a rate proportional to N(t). That is why the result comes out of order N(t), i.e. O(n).

Subset-Sum in Linear Time

This was a question on our Algorithms final exam. It's verbatim because the prof let us take a copy of the exam home.
(20 points) Let I = {r1,r2,...,rn} be a set of n arbitrary positive integers and the values in I are distinct. I is not given in any sorted order. Suppose we want to find a subset I' of I such that the total sum of all elements in I' is exactly 100*ceil(n^.5) (each element of I can appear at most once in I'). Present an O(n) time algorithm for solving this problem.
As far as I can tell, it's basically a special case of the knapsack problem, otherwise known as the subset-sum problem ... both of which are in NP and in theory impossible to solve in linear time?
So ... was this a trick question?
This SO post basically explains that a pseudo-polynomial (linear) time approximation can be done if the weights are bounded, but in the exam problem the weights aren't bounded and either way given the overall difficulty of the exam I'd be shocked if the prof expected us to know/come up with an obscure dynamic optimization algorithm.
There are two things that make this problem possible:
The input can be truncated to size O(sqrt(n)). There are no negative inputs, so you can discard any numbers greater than 100*sqrt(n), and all inputs are distinct so we know there are at most 100*sqrt(n) inputs that matter.
The playing field has size O(sqrt(n)). Although there are O(2^sqrt(n)) ways to combine the O(sqrt(n)) inputs that matter, you don't have to care about combinations that either leave the 100*sqrt(n) range or redundantly hit a target you can already reach.
Basically, this problem screams dynamic programming with each input being checked against each part of the 'reached number' space somehow.
The solution ends up being a matter of ensuring numbers don't reach off of themselves (by scanning in the right direction), of only looking at each number once, and of giving ourselves enough information to reconstruct the solution afterwards.
Here's some C# code that should solve the problem in the given time:
int[] FindSubsetToImpliedTarget(int[] inputs) {
    var target = 100*(int)Math.Ceiling(Math.Sqrt(inputs.Length));
    // build up how-X-was-reached table
    var reached = new int?[target+1];
    reached[0] = 0; // the empty set reaches 0
    foreach (var e in inputs) {
        // we go backwards to avoid reaching off of ourselves
        for (var i = target; i >= e; i--) {
            if (reached[i-e].HasValue) {
                reached[i] = e;
            }
        }
    }
    // was target even reached?
    if (!reached[target].HasValue) return null;
    // build result by back-tracking via the logged reached values
    var result = new List<int>();
    for (var i = target; reached[i] != 0; i -= reached[i].Value) {
        result.Add(reached[i].Value);
    }
    return result.ToArray();
}
I haven't actually tested the above code, so beware typos and off-by-ones.
The typical DP algorithm for the subset-sum problem already yields an O(n) time algorithm here. We use dp[i][k] (boolean) to indicate whether the first i items contain a subset with sum k; the transition equation is
dp[i][k] = (dp[i-1][k-v[i]] || dp[i-1][k]),
which gives O(N*M) time, where N is the size of the set and M is the target sum. Since the elements are distinct and the sum must equal 100*ceil(n^.5), we only need to consider the items that are at most 100*ceil(n^.5) (anything larger can be discarded), and distinctness means there are at most 100*ceil(n^.5) of them. So N <= 100*ceil(n^.5) and M = 100*ceil(n^.5).
The DP algorithm is O(N*M) = O(100*ceil(n^.5) * 100*ceil(n^.5)) = O(n).
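A minimal C++ sketch of that table (my own illustration of the transition above; values is assumed to already be filtered down to the elements that are at most the target):

#include <vector>

// Returns true if some subset of `values` sums exactly to `target`.
// dp[i][k] == true  <=>  the first i items contain a subset with sum k. Runs in O(N*M).
bool subsetSum(const std::vector<int>& values, int target) {
    int n = static_cast<int>(values.size());
    std::vector<std::vector<bool>> dp(n + 1, std::vector<bool>(target + 1, false));
    dp[0][0] = true;                                    // the empty set sums to 0
    for (int i = 1; i <= n; ++i) {
        int v = values[i - 1];                          // v[i] in the recurrence (1-based)
        for (int k = 0; k <= target; ++k) {
            dp[i][k] = dp[i - 1][k] || (k >= v && dp[i - 1][k - v]);
        }
    }
    return dp[n][target];
}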
OK, the following is a simple solution in O(n) time.
Since the required sum S is of the order O(n^0.5), if we formulate an algorithm of complexity S^2 then we are fine, because the algorithm is then effectively O(n).
Iterate once over all the elements and check whether each value is at most S. If it is, push it into a new array. This array contains at most S elements (O(n^0.5)).
Sort this array in descending order in O(sqrt(n)*log(n)) time, which is less than O(n) because log(n) <= sqrt(n) for all natural numbers. https://math.stackexchange.com/questions/65793/how-to-prove-log-n-leq-sqrt-n-over-natural-numbers
Now this problem is a 1D knapsack problem with W = S and number of elements = S (upper bound).
Maximize the total weight of items and see if it equals S.
It can be solved using dynamic programming in linear time (linear wrt W ~ S).

Maximum non-overlapping intervals in an interval tree

Given a list of time intervals, I need to find a maximum-size set of non-overlapping intervals.
For example,
if we have the following intervals:
[0600, 0830], [0800, 0900], [0900, 1100], [0900, 1130],
[1030, 1400], [1230, 1400]
It is also given that times have to be in the range [0000, 2400].
The maximum non-overlapping set of intervals is [0600, 0830], [0900, 1130], [1230, 1400].
I understand that maximum set packing is NP-Complete. I want to confirm if my problem (with intervals containing only start and end time) is also NP-Complete.
And if so, is there a way to find an optimal solution in exponential time, but with smarter preprocessing and pruning of the data? Or is there a relatively easy-to-implement fixed-parameter tractable algorithm? I don't want to go for an approximation algorithm.
This is not an NP-complete problem. I can think of an O(n * log(n)) algorithm using dynamic programming to solve it.
Suppose we have n intervals. Suppose the given range is S (in your case, S = [0000, 2400]). Either suppose all intervals are within S, or eliminate all intervals not within S in linear time.
Pre-process:
Sort all intervals by their begin points. Suppose we get an array A[n] of n intervals.
This step takes O(n * log(n)) time
For each interval's end point, find the index of the first interval (in A) whose begin point is not earlier than that end point. Suppose we get an array Next[n] of n integers.
If no such begin point exists for the end point of interval i, assign n to Next[i].
We can do this in O(n * log(n)) time by enumerating the n end points of all intervals and using binary search to find the answer. Maybe a linear approach exists, but it doesn't matter, because the previous step already takes O(n * log(n)) time.
DP:
Let f[i] be the maximum number of non-overlapping intervals we can choose from the range [A[i].begin, S.end]. Then f[0] is the answer we want.
Also define f[n] = 0.
State transition equation:
f[i] = max{f[i+1], 1 + f[Next[i]]}
It is quite obvious that the DP step takes linear time.
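Here is a minimal C++ sketch of this DP (my own illustration of the scheme above; the Interval struct and the sort/binary-search preprocessing are spelled out so it is self-contained):

#include <algorithm>
#include <vector>

struct Interval { int begin, end; };

// Maximum number of non-overlapping intervals, via the f[i] recurrence described above.
int maxNonOverlapping(std::vector<Interval> a) {
    int n = static_cast<int>(a.size());
    // Pre-process: sort by begin point, then build Next[] with binary search.
    std::sort(a.begin(), a.end(),
              [](const Interval& x, const Interval& y) { return x.begin < y.begin; });
    std::vector<int> begins(n);
    for (int i = 0; i < n; ++i) begins[i] = a[i].begin;

    // next[i] = index of the first interval whose begin point is >= a[i].end (or n if none).
    std::vector<int> next(n);
    for (int i = 0; i < n; ++i)
        next[i] = static_cast<int>(
            std::lower_bound(begins.begin(), begins.end(), a[i].end) - begins.begin());

    // DP: f[i] = max{ f[i+1], 1 + f[next[i]] }, computed right to left, with f[n] = 0.
    std::vector<int> f(n + 1, 0);
    for (int i = n - 1; i >= 0; --i)
        f[i] = std::max(f[i + 1], 1 + f[next[i]]);
    return f[0];
}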
The above solution is the one I came up with at first glance. After that, I also thought of a greedy approach which is simpler (but not faster in terms of big-O notation):
(With the same notation and assumptions as the DP approach above)
Pre-process: Sort all intervals by their end points. Suppose we get an array B[n] of n intervals.
Greedy:
int ans = 0, cursor = S.begin;
for (int i = 0; i < n; i++) {
    if (B[i].begin >= cursor) {
        ans++;
        cursor = B[i].end;
    }
}
The above two solutions came out of my own head, but your problem is also referred to as the activity selection problem, which can be found on Wikipedia: http://en.wikipedia.org/wiki/Activity_selection_problem.
Also, Introduction to Algorithms discusses this problem in depth in Section 16.1.

Finding time complexity of partition by quicksort method

Here is an algorithm for finding kth smallest number in n element array using partition algorithm of Quicksort.
small(a, i, j, k)
{
    if (i == j) return a[i];
    else
    {
        m = partition(a, i, j);
        if (m == k) return a[m];
        else
        {
            if (m > k) return small(a, i, m-1, k);
            else return small(a, m+1, j, k);
        }
    }
}
Here i and j are the starting and ending indices of the array (j - i = n, the number of elements in the array), and k is the kth smallest number to be found.
I want to know the best case and the average case of the above algorithm, and briefly how they are derived. I know we should not count the termination condition in the best case, and also that the partition algorithm takes O(n). I do not want asymptotic notation but an exact mathematical result if possible.
First of all, I'm assuming the array is sorted - something you didn't mention - because that code wouldn't otherwise work. And, well, this looks to me like a regular binary search.
Anyway...
The best case scenario is when either the array is one element long (you return immediately because i == j), or, for large values of n, if the middle position, m, is the same as k; in that case, no recursive calls are made and it returns immediately as well. That makes it O(1) in best case.
For the general case, consider that T(n) denotes the time taken to solve a problem of size n using your algorithm. We know that:
T(1) = c
T(n) = T(n/2) + c
Where c is a constant time operation (for example, the time to compare if i is the same as j, etc.). The general idea is that to solve a problem of size n, we consume some constant time c (to decide if m == k, if m > k, to calculate m, etc.), and then we consume the time taken to solve a problem of half the size.
Expanding the recurrence can help you derive a general formula, although it is pretty intuitive that this is O(log(n)):
T(n) = T(n/2) + c = T(n/4) + c + c = T(n/8) + c + c + c = ... = T(1) + c*log(n) = c*(log(n) + 1)
That should be the exact mathematical result. The algorithm runs in O(log(n)) time. An average case analysis is harder because you need to know the conditions in which the algorithm will be used. What is the typical size of the array? The typical value of k? What is the most likely position of k in the array? If it's in the middle, for example, the average case may be O(1). It really depends on how you use it.

Do iterative and recursive versions of an algorithm have the same time complexity?

Say, for example, the iterative and recursive versions of the Fibonacci series. Do they have the same time complexity?
The answer depends strongly on your implementation. For the example you gave, there are several possible solutions, and I would say that the naive way to implement a solution has better complexity when implemented iteratively. Here are the two implementations:
int iterative_fib(int n) {
    if (n <= 2) {
        return 1;
    }
    int a = 1, b = 1, c;
    for (int i = 0; i < n - 2; ++i) {
        c = a + b;
        b = a;
        a = c;
    }
    return a;
}

int recursive_fib(int n) {
    if (n <= 2) {
        return 1;
    }
    return recursive_fib(n - 1) + recursive_fib(n - 2);
}
In both implementations I assumed a correct input i.e. n >= 1. The first code is much longer but its complexity is O(n) i.e. linear, while the second implementation is shorter but has exponential complexity O(fib(n)) = O(φ^n) (φ = (1+√5)/2) and thus is much slower.
One can improve the recursive version by introducing memoization (i.e. remembering the return values of the function for inputs you have already computed). This is usually done with an array where you store the values. Here is an example:
int mem[1000]; // initialize this array with some invalid value. Usually 0 or -1
               // memset can be used for that: memset(mem, -1, sizeof(mem));
int mem_fib(int n) {
    if (n <= 2) {
        return mem[n] = 1;
    }
    if (mem[n-1] == -1) {
        mem_fib(n-1);
    }
    if (mem[n-2] == -1) {
        mem_fib(n-2);
    }
    return mem[n] = mem[n-1] + mem[n-2];
}
Here the complexity of the recursive algorithm is linear, just like the iterative solution. The solution above is the top-down approach to the dynamic programming solution of your problem. The bottom-up approach leads to something very similar to the iterative solution I introduced above.
There are a lot of articles on dynamic programming, including on Wikipedia.
In my experience, some problems are much harder to solve with the bottom-up approach (i.e. the iterative solution), while others are hard to solve top-down.
However, the theory states that every problem that has an iterative solution has a recursive one with the same computational complexity (and vice versa).
Hope this answer helps.
The particular recursive algorithm for calculating the Fibonacci series is less efficient.
Consider the following situation of finding fib(4) through the recursive algorithm:
int fib(n):
    if (n == 0 || n == 1)
        return n;
    else
        return fib(n-1) + fib(n-2);
Now when the above algorithm executes for n=4
fib(4)
├── fib(3)
│   ├── fib(2)
│   │   ├── fib(1)
│   │   └── fib(0)
│   └── fib(1)
└── fib(2)
    ├── fib(1)
    └── fib(0)
It's a tree. It says that for calculating fib(4) you need to calculate fib(3) and fib(2) and so on.
Notice that even for the small value 4, fib(2) is calculated twice and fib(1) is calculated three times. The amount of duplicated work grows quickly for larger numbers.
It can be shown that the number of additions required to calculate fib(n) this way is
fib(n+1) - 1
so this duplication is what causes the reduced performance of this particular algorithm.
The iterative algorithm for the Fibonacci series is considerably faster since it does not recompute these redundant values.
It may not be the same case for all the algorithms though.
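As a quick sanity check of the fib(n+1) - 1 additions claim above, here is a small sketch (my own, not from the answer) that counts the additions performed by the naive recursion and compares the count against fib(n+1) - 1:

#include <cstdio>

long long additions = 0;   // counts the '+' operations performed by naiveFib

long long naiveFib(int n) {
    if (n == 0 || n == 1) return n;
    ++additions;                                    // one addition per internal node of the call tree
    return naiveFib(n - 1) + naiveFib(n - 2);
}

int main() {
    for (int n = 2; n <= 20; ++n) {
        additions = 0;
        long long f = naiveFib(n);
        long long count = additions;                // additions spent computing fib(n)
        long long expected = naiveFib(n + 1) - 1;   // fib(n+1) - 1, recomputed (counter ignored)
        std::printf("n=%2d fib=%7lld additions=%7lld fib(n+1)-1=%7lld\n", n, f, count, expected);
    }
    return 0;
}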
If you take some recursive algorithm, you can convert it to an iterative one by storing all function-local variables in an array, effectively simulating the call stack on the heap. If done like this, there is no difference between iterative and recursive.
Note that there are (at least) two recursive Fibonacci algorithms, so for the example to be exact you need to specify which recursive algorithm you're talking about.
Yes, every iterative algorithm can be transformed into a recursive version and vice versa: one way by passing continuations, the other by implementing an explicit stack structure. This is done without an increase in time complexity.
If you can optimize tail recursion, then every iterative algorithm can be transformed into a recursive one without increasing asymptotic memory complexity.
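To illustrate the tail-recursion point, here is a tail-recursive Fibonacci sketch (my own example; a compiler that performs tail-call optimization can turn it into a loop with O(1) stack usage):

// Tail-recursive Fibonacci: the recursive call is the last thing the function does,
// so the accumulators a and b carry all the state and nothing is left to do after the call.
int fib_tail(int n, int a = 0, int b = 1) {
    if (n == 0) return a;
    return fib_tail(n - 1, b, a + b);   // a tail call that can be optimized into a jump
}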
Yes, if you use exactly the same ideas underlying the algorithm, it does not matter. However, recursion is often easier to use than iteration. For instance, writing a recursive version of the Towers of Hanoi is quite easy, while transforming it into a corresponding iterative version is difficult and error prone, even though it can be done. Actually, there is a theorem stating that every recursive algorithm can be transformed into an equivalent iterative one (doing this requires mimicking the recursion iteratively, using one or more stack data structures to hold the parameters passed to the recursive invocations).
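For example, the recursive Towers of Hanoi mentioned above fits in a few lines (a minimal sketch; pegs are named by single characters):

#include <cstdio>

// Move n disks from peg `from` to peg `to`, using peg `via` as scratch space.
void hanoi(int n, char from, char to, char via) {
    if (n == 0) return;
    hanoi(n - 1, from, via, to);                    // move the n-1 smaller disks out of the way
    std::printf("move disk %d: %c -> %c\n", n, from, to);
    hanoi(n - 1, via, to, from);                    // put them back on top of the largest disk
}

// Example: hanoi(3, 'A', 'C', 'B') prints the 7 moves needed for 3 disks.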
