Finding running time of a simple linear recursive algorithm using recurrences - performance

If I had a simple recursive algorithm, such as:
numberOfMatches(A, x, i): // A is an array of values, x is a single value
    // the algorithm searches the array from A[1] to A[i]
    count = 0
    if i == 0:
        return 0
    if A[i] == x:
        count = numberOfMatches(A, x, i-1) + 1
    else:
        count = numberOfMatches(A, x, i-1)
    return count
How would I go about finding the running time (which I know from common sense is O(n)) using recurrences?
I have got T(n) = T(n-1) because the list to be searched shrinks by 1 each time; however, I don't think this is right.
I also need to solve the recurrence by expanding it, and I don't even know where to start with that.

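T(n) = T(n-1) + 1, with T(0) = c for some constant c. The +1 is the constant work each call does besides the recursive call on a range that is one element shorter.
Expanding gives T(n) = T(n-2) + 2 = T(n-3) + 3 = ... = T(0) + n, which is O(n). By induction you can prove it easily.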
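To see the expansion concretely, here is a minimal Python sketch of the same algorithm that also counts how many calls are made. The `calls` counter and the 0-based indexing (`a[i - 1]` standing in for the question's `A[i]`) are assumptions added for illustration.

```python
def number_of_matches(a, x, i, calls):
    """Count occurrences of x among the first i elements of a."""
    calls[0] += 1                      # one unit of work per call
    if i == 0:
        return 0
    if a[i - 1] == x:                  # a[i - 1] plays the role of A[i]
        return number_of_matches(a, x, i - 1, calls) + 1
    return number_of_matches(a, x, i - 1, calls)


if __name__ == "__main__":
    data = [3, 1, 3, 3, 2]
    calls = [0]
    print(number_of_matches(data, 3, len(data), calls), calls[0])
```

For an input of size n the counter ends at n + 1 (one call for each of i = n, n-1, ..., 0), which matches T(n) = T(0) + n.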

Related

Finding the asymptotic amount of comparisons performed by an algorithm

I have been given the following algorithm:
Supersort(A, i, j):
    if(j = i): return
    if(j = i + 1):
        if(A[i] > A[j]):
            swap(A[i], A[j])
    else:
        k = floor of ( (j-i+1)/3 )
        Supersort(A, i, j-k) // sort first two thirds
        Supersort(A, i+k, j) // sort last two thirds
        Supersort(A, i, j-k) // sort first two thirds again
And I really am not sure how to analyze how many comparisons this algorithm would make in the worst case. I don't want the answer given to me; I just don't even know how to get started solving this problem. Thanks for any help.
Typically, when you have a recursive function, the first thing you do is obtain a recurrence relation. In your case, let T(n) be the cost of Supersort when the input is of size n. What is that equal to? The first two if statements cost only a constant, and the else branch costs T(2n/3)+T(n/3)+T(2n/3), so
T(n)=2T(2n/3)+T(n/3)+C
Then you solve that recurrence.
Correction:
All three recursive calls work on two thirds of the input; I had thought one of them used one third. In that case the recurrence is even simpler and can be solved using the master theorem:
T(n)=3T(2n/3)+C
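As a sanity check on the recurrence, the following Python sketch transcribes Supersort and counts the element comparisons it makes. The `counter` argument is an assumption added for illustration; it is not part of the original algorithm.

```python
def supersort(a, i, j, counter):
    """Sort a[i..j] in place (inclusive bounds), counting comparisons."""
    if j == i:
        return
    if j == i + 1:
        counter[0] += 1                # the only element comparison
        if a[i] > a[j]:
            a[i], a[j] = a[j], a[i]
    else:
        k = (j - i + 1) // 3
        supersort(a, i, j - k, counter)   # first two thirds
        supersort(a, i + k, j, counter)   # last two thirds
        supersort(a, i, j - k, counter)   # first two thirds again


if __name__ == "__main__":
    for n in (2, 3, 4, 9, 27):
        counter = [0]
        supersort(list(range(n, 0, -1)), 0, n - 1, counter)
        print(n, counter[0])
```

The count depends only on n, not on the values, because the recursion never branches on the data. For T(n) = 3T(2n/3) + C the master theorem gives growth on the order of n^(log 3 / log 1.5), roughly n^2.71.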

Dynamic Programming : True or False

I have a conceptual doubt regarding Dynamic Programming:
In a dynamic programming solution, the space requirement is always at least as big as the number of unique sub problems.
I thought of it in terms of Fibonacci numbers:
f(n) = f(n-1) + f(n-2)
Here there are two subproblems, the space required will be at least O(n) if input is n.
Right?
But, the answer is False.
Can someone explain this?
The answer is indeed false.
For example, for your Fibonacci series, you can use dynamic programming with O(1) space by remembering only the last 2 numbers:
fib(n):
    prev = current = 1
    i = 2
    while i < n:
        next = prev + current
        prev = current
        current = next
        i = i + 1 // advance the counter; without this the loop never terminates
    return current
This is a common practice where you don't need all smaller subproblems to solve the bigger one, and you can discard most of them and save some space.
If you implement the Fibonacci calculation using bottom-up DP, you can discard earlier results that you no longer need. This is an example:
fib = [0, 1]
for i in range(n):
    fib = [fib[1], fib[0] + fib[1]]
print(fib[1])
As this example shows, you only need to remember the last two elements of the array.
The statement is not correct, but it's almost correct.
Generally, a dynamic programming solution needs at most O(number of subproblems) space. In other words, if there is a dynamic programming solution to the problem, it can be implemented using O(number of subproblems) memory.
In your particular problem, calculating Fibonacci numbers, if you write down the straightforward dynamic programming solution:
Integer F(Integer n) {
    if (n == 0 || n == 1) return 1;
    if (memorized[n]) return memorized_value[n];
    memorized_value[n] = F(n - 1) + F(n - 2);
    memorized[n] = true;
    return memorized_value[n];
}
it will use O(number of subproblems) memory. But as you mentioned, by analyzing the recurrence you can come up with a more efficient solution that uses O(1) memory.
P.S. The recurrence for Fibonacci numbers that you've mentioned has n + 1 subproblems. Usually, by subproblems people mean all the f values you need to calculate in order to calculate a particular f value. Here you need to calculate f(0), f(1), f(2), ..., f(n) in order to compute f(n).
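The contrast between the two space bounds can be sketched in Python. Both functions below use the convention f(0) = f(1) = 1 from the memoized code above; the names `fib_memo` and `fib_rolling` are made up for this illustration.

```python
def fib_memo(n, memo=None):
    """Top-down DP: stores one value per subproblem, so O(n) space."""
    if memo is None:
        memo = {}
    if n < 2:
        return 1
    if n not in memo:
        memo[n] = fib_memo(n - 1, memo) + fib_memo(n - 2, memo)
    return memo[n]


def fib_rolling(n):
    """Bottom-up DP: keeps only the last two values, so O(1) space."""
    prev, current = 1, 1
    for _ in range(n - 1):
        prev, current = current, prev + current
    return current


if __name__ == "__main__":
    print(fib_memo(10), fib_rolling(10))
```

Both versions solve the same n + 1 subproblems; the second one simply forgets each result once no later subproblem depends on it.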

How to write a recurrence relation for a given piece of code

In my algorithms and data structures class we were given a few recurrence relations, either to solve or to read off the complexity of an algorithm.
At first, I thought that the sole purpose of these relations was to jot down the complexity of a recursive divide-and-conquer algorithm. Then I came across a question in the MIT assignments where one is asked to provide a recurrence relation for an iterative algorithm.
How would I actually come up with a recurrence relation myself, given some code? What are the necessary steps?
Is it actually correct that I can express any case, i.e. worst, best, or average case, with such a relation?
Could someone possibly give a simple example of how a piece of code is turned into a recurrence relation?
Cheers,
Andrew
Okay, so in algorithm analysis, a recurrence relation is a function relating the amount of work needed to solve a problem of size n to that needed to solve smaller problems (this is closely related to its meaning in math).
For example, consider a Fibonacci function below:
Fib(a)
{
    if(a==1 || a==0)
        return 1;
    return Fib(a-1) + Fib(a-2);
}
This does three operations (comparison, comparison, addition), and also calls itself recursively. So the recurrence relation is T(n) = 3 + T(n-1) + T(n-2). To solve this, you would use the iterative method: start expanding the terms until you find the pattern. For this example, you would expand T(n-1) to get T(n) = 6 + 2*T(n-2) + T(n-3). Then expand T(n-2) to get T(n) = 12 + 3*T(n-3) + 2*T(n-4). One more time, expand T(n-3) to get T(n) = 21 + 5*T(n-4) + 3*T(n-5). Notice that the coefficient of the first T term follows the Fibonacci numbers, and the constant term is three times their running sum: looking it up, that is 3*(Fib(n+2)-1). More importantly, we notice that the sequence increases exponentially; that is, the complexity of the algorithm is O(2^n).
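The exponential growth can be checked numerically. The sketch below counts how many calls the recursive Fib makes; `count_calls` and `fib` are helper names introduced here, and counting calls (rather than the three operations per call) is a simplifying assumption.

```python
def count_calls(n):
    """Number of calls the naive recursive Fib makes for input n."""
    if n == 0 or n == 1:
        return 1
    return 1 + count_calls(n - 1) + count_calls(n - 2)


def fib(n):
    """Iterative Fibonacci with fib(0) = fib(1) = 1, matching Fib above."""
    prev, current = 1, 1
    for _ in range(n - 1):
        prev, current = current, prev + current
    return current


if __name__ == "__main__":
    for n in (5, 10, 15, 20):
        print(n, count_calls(n), 2 * fib(n) - 1)
```

The two printed columns agree: the call count is 2*Fib(n) - 1, so it grows at the same exponential rate as the Fibonacci numbers themselves, which is within the O(2^n) bound.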
Then consider this function for merge sort:
Merge(ary)
{
    n = length(ary);
    if(n <= 1)
        return ary; // base case: nothing left to split
    ary_start = Merge(ary[0:n/2]);
    ary_end = Merge(ary[n/2:n]);
    return MergeArrays(ary_start, ary_end);
}
This function calls itself on half the input twice, then merges the two halves (using O(n) work). That is, T(n) = T(n/2) + T(n/2) + O(n) = 2T(n/2) + O(n). To solve recurrence relations of this type, you should use the Master Theorem. By this theorem, the recurrence solves to T(n) = O(n log n).
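A runnable Python sketch of the same structure, with a counter for the merge work, shows where the n log n comes from. The `work` counter and the inlined merge loop are assumptions for illustration; the pseudocode above leaves MergeArrays unspecified.

```python
def merge_sort(ary, work):
    """Return a sorted copy of ary; work[0] accumulates elements merged."""
    n = len(ary)
    if n <= 1:
        return ary
    left = merge_sort(ary[:n // 2], work)
    right = merge_sort(ary[n // 2:], work)
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    work[0] += n                       # the O(n) term of the recurrence
    return merged


if __name__ == "__main__":
    work = [0]
    print(merge_sort([8, 7, 6, 5, 4, 3, 2, 1], work), work[0])
```

For n = 8 the counter reaches 24 = 8 * log2(8): each of the three levels of recursion merges all 8 elements once.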
Finally, consider this function to calculate Fibonacci:
Fib2(n)
{
    two = one = 1;
    for(i from 2 to n)
    {
        temp = two + one;
        one = two;
        two = temp;
    }
    return two;
}
This function never calls itself, and it iterates O(n) times. Therefore, its recurrence relation is T(n) = O(n). This is the case you asked about. It is a special case of recurrence relations with no recurrence; therefore, it is very easy to solve.
To find the running time of an algorithm, we first need to be able to write an expression for the algorithm that gives the running time of each step. So you need to walk through each of the steps of the algorithm to find that expression.
For example, suppose we defined a predicate, isSorted, which would take as input an array a and the size, n, of the array and would return true if and only if the array was sorted in increasing order.
bool isSorted(int *a, int n) {
    if (n == 1)
        return true; // a 1-element array is always sorted
    for (int i = 0; i < n-1; i++) {
        if (a[i] > a[i+1]) // found two adjacent elements out of order
            return false;
    }
    return true; // everything's in sorted order
}
Clearly, the size of the input here will simply be n, the size of the array. How many steps will be performed in the worst case, for input n?
The first if statement counts as 1 step.
The for loop will execute n−1 times in the worst case (assuming the internal test doesn't kick us out), for a total of n−1 steps for the loop test and the increment of the index.
Inside the loop, there's another if statement, which will be executed once per iteration, for a total of n−1 steps at worst.
The last return will be executed once.
So, in the worst case, we'll have done 1+(n−1)+(n−1)+1 computations, for a total run time T(n)≤1+(n−1)+(n−1)+1=2n, and so we have the timing function T(n)=O(n).
So, in brief, what we have done is:
1. For a parameter n that gives the size of the input, we assume that each simple statement executed once takes constant time; for simplicity, assume one step.
2. Iterative statements such as loops, together with their bodies, take a variable amount of time depending on the input.
3. So your task is to go step by step and write down the function in terms of n to calculate the time complexity.
For recursive algorithms, you do the same thing, only this time you add the time taken by each recursive call, expressed as a function of the time it takes on its input.
For example, let's rewrite isSorted as a recursive algorithm:
bool isSorted(int *a, int n) {
    if (n == 1)
        return true;
    if (a[n-2] > a[n-1]) // are the last two elements out of order?
        return false;
    else
        return isSorted(a, n-1); // is the initial part of the array sorted?
}
In this case we still walk through the algorithm, counting: 1 step for the first if plus 1 step for the second if, plus the time isSorted will take on an input of size n−1, which will be T(n−1), giving a recurrence relation
T(n)≤1+1+T(n−1)=T(n−1)+O(1)
Which has solution T(n)=O(n), just as with the non-recursive version, as it happens.
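To check the recurrence against an actual run, here is a Python sketch of the recursive version that counts one step per if test, as in the walk-through. The `steps` counter is an assumption added for illustration.

```python
def is_sorted(a, n, steps):
    """Recursive check that a[0..n-1] is in increasing order (n >= 1)."""
    steps[0] += 1                      # the n == 1 test
    if n == 1:
        return True
    steps[0] += 1                      # the comparison of the last two elements
    if a[n - 2] > a[n - 1]:
        return False
    return is_sorted(a, n - 1, steps)


if __name__ == "__main__":
    for n in (1, 5, 50):
        steps = [0]
        is_sorted(list(range(n)), n, steps)
        print(n, steps[0])
```

On an already sorted input (the worst case) the count is 2(n−1)+1 = 2n−1, which is what expanding T(n) = T(n−1) + 2 with T(1) = 1 gives.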
Simple enough! Practice writing recurrence relations for various algorithms, keeping in mind how many times each step of the algorithm is executed.

Finding time complexity of partition by quick sort method

Here is an algorithm for finding kth smallest number in n element array using partition algorithm of Quicksort.
small(a,i,j,k)
{
    if(i==j) return(a[i]);
    else
    {
        m=partition(a,i,j);
        if(m==k) return(a[m]);
        else
        {
            if(m>k) return small(a,i,m-1,k);
            else return small(a,m+1,j,k);
        }
    }
}
Here i and j are the starting and ending indices of the array (j-i+1 = n, the number of elements in the array) and k is the position of the kth smallest number to be found.
I want to know the best case and average case of the above algorithm, and briefly how they are derived. I know we should not count the termination condition in the best case, and also that the partition algorithm takes O(n). I do not want asymptotic notation but an exact mathematical result, if possible.
First of all, I'm assuming the array is sorted - something you didn't mention - because that code wouldn't otherwise work. And, well, this looks to me like a regular binary search.
Anyway...
The best case scenario is when either the array is one element long (you return immediately because i == j), or, for large values of n, if the middle position, m, is the same as k; in that case, no recursive calls are made and it returns immediately as well. That makes it O(1) in best case.
For the general case, consider that T(n) denotes the time taken to solve a problem of size n using your algorithm. We know that:
T(1) = c
T(n) = T(n/2) + c
Where c is a constant time operation (for example, the time to compare if i is the same as j, etc.). The general idea is that to solve a problem of size n, we consume some constant time c (to decide if m == k, if m > k, to calculate m, etc.), and then we consume the time taken to solve a problem of half the size.
Expanding the recurrence can help you derive a general formula, although it is pretty intuitive that this is O(log(n)):
T(n) = T(n/2) + c = T(n/4) + c + c = T(n/8) + c + c + c = ... = T(1) + c*log(n) = c*(log(n) + 1)
That should be the exact mathematical result. The algorithm runs in O(log(n)) time. An average case analysis is harder because you need to know the conditions in which the algorithm will be used. What is the typical size of the array? The typical size of k? What is the most likely position for k in the array? If it's in the middle, for example, the average case may be O(1). It really depends on how you use this.
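Note that the question says partition is Quicksort's partition and takes O(n), so a recurrence with only a constant term per level may understate the cost. Under that assumption, and if every partition happened to split the range in half, the recurrence would be T(n) = T(n/2) + cn with T(1) = c, which expands to cn + cn/2 + cn/4 + ... < 2cn, i.e. linear rather than logarithmic. The Python sketch below is one way to make the code runnable; the Lomuto-style `partition` and the 0-based k are assumptions, since the question does not show its partition routine.

```python
def partition(a, i, j):
    """Lomuto partition of a[i..j] around a[j]; returns the pivot's final index."""
    pivot = a[j]
    store = i
    for t in range(i, j):              # one pass over the range: O(j - i) work
        if a[t] < pivot:
            a[t], a[store] = a[store], a[t]
            store += 1
    a[store], a[j] = a[j], a[store]
    return store


def small(a, i, j, k):
    """Return the element that would sit at index k if a[i..j] were sorted."""
    if i == j:
        return a[i]
    m = partition(a, i, j)
    if m == k:
        return a[m]
    if m > k:
        return small(a, i, m - 1, k)
    return small(a, m + 1, j, k)


if __name__ == "__main__":
    data = [9, 4, 7, 1, 8, 2]
    print([small(list(data), 0, len(data) - 1, k) for k in range(len(data))])
```

With this partition the input does not need to be sorted, and even the best case (the first pivot lands on index k) still pays for one full partition pass over the n elements.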

Given a sorted array, find the maximum subarray of repeated values

Yet another interview question asked me to find the maximum possible subarray of repeated values given a sorted array in shortest computational time possible.
Let input array be A[1 ... n]
Find an array B of consecutive integers in A such that:
for x in range(len(B)-1):
    B[x] == B[x+1]
I believe that the best algorithm is to divide the array in half, start from the middle, and compare the integers outwards from there to find the longest run of the same integer around the middle. Then I would call the method recursively on the two halves.
My interviewer said my algorithm is good but my analysis that the algorithm is O(logn) is incorrect but never got around to telling me what the correct answer is. My first question is what is the Big-O analysis of this algorithm? (Show as much work as possible please! Big-O is not my forte.) And my second question is purely for my curiosity whether there is an even more time efficient algorithm?
The best you can do for this problem is an O(n) solution, so your algorithm cannot possibly be both correct and O(lg n).
Consider for example, the case where the array contains no repeated elements. To determine this, one needs to examine every element, and examining every element is O(n).
This is a simple algorithm that will find the longest subsequence of a repeated element:
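start = end = 0
maxLength = 0
i = 0
while i + maxLength < a.length:
    if a[i] == a[i + maxLength]:
        while i + maxLength < a.length and a[i] == a[i + maxLength]:
            maxLength += 1
        start = i
        end = i + maxLength
        i += maxLength # jump to the first element after this run
    else:
        i += 1 # a longer run may still start inside the next maxLength elements
return a[start:end]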
If you have reason to believe the subsequence will be long, you can set the initial value of maxLength to some heuristically selected value to speed things along, and then only look for shorter sequences if you don't find one (i.e. you end up with end == 0 after the first pass.)
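A runnable Python sketch of this probing idea follows. It is an interpretation, not the poster's exact code: on a failed probe it advances by a single element, so that a longer run starting inside the next window is not skipped over. The function name `longest_run_probing` is made up for this example.

```python
def longest_run_probing(a):
    """Longest run of equal values in a sorted list, probing max_length ahead."""
    start = end = 0
    max_length = 0
    i = 0
    while i + max_length < len(a):
        if a[i] == a[i + max_length]:
            # The run starting at i is longer than the best so far; extend it.
            while i + max_length < len(a) and a[i] == a[i + max_length]:
                max_length += 1
            start, end = i, i + max_length
            i += max_length            # first element after the run
        else:
            i += 1                     # cannot beat the best from here; step on
    return a[start:end]


if __name__ == "__main__":
    print(longest_run_probing([1, 1, 2, 3, 3, 3]))
```

Because the input is sorted, a single probe at distance max_length is enough to tell whether the run starting at i beats the current best, so each element costs a constant amount of work and the total stays O(n).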
I think we all agree that in the worst case scenario, where all of A is unique or where all of A is the same, you have to examine every element in the array to either determine there are no duplicates or determine all the array contains one number. Like the other posters have said, that's going to be O(N). I'm not sure divide & conquer helps you much with algorithmic complexity on this one, though you may be able to simplify the code a bit by using recursion. Divide & conquer really helps cut down on Big O when you can throw away large portions of the input (e.g. Binary Search), but in the case where you potentially have to examine all the input, it's not going to be much different.
I'm assuming the result here is you're just returning the size of the largest B you've found, though you could easily modify this to return B instead.
So on the algorithm front, given that A is sorted, I'm not sure there's going to be any faster or simpler answer than just walking through the array in order. It seems like the simplest answer is to have 2 pointers, one starting at index 0 and one starting at index 1. Compare them and then increment them both; each time they're the same you tick a counter upward to give you the current size of B, and when they differ you reset that counter to zero. You also keep around a variable for the max size of a B you've found so far and update it every time you find a bigger B.
In this algorithm, n elements are visited with a constant number of calculations per each visited element, so the running time is O(n).
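The linear scan described above can be sketched in Python as follows. The function name `longest_run` and the choice to return a (start, length) pair are assumptions for illustration; the run length is tracked through the index where the current run began, rather than through a counter reset to zero.

```python
def longest_run(a):
    """Return (start, length) of the longest run of equal adjacent values."""
    if not a:
        return 0, 0
    best_start, best_len = 0, 1
    run_start = 0
    for i in range(1, len(a)):
        if a[i] != a[i - 1]:
            run_start = i              # values differ: a new run begins here
        run_len = i - run_start + 1
        if run_len > best_len:
            best_start, best_len = run_start, run_len
    return best_start, best_len


if __name__ == "__main__":
    a = [1, 1, 2, 3, 3, 3]
    s, length = longest_run(a)
    print(a[s:s + length])
```

Each element is compared with its predecessor exactly once, so the loop does n−1 comparisons and the running time is O(n).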
Given sorted array A[1..n]:
max_start = max_end = 1
max_length = 1
start = end = 1
while start <= n
    while end <= n && A[start] == A[end]
        end++
    if end - start > max_length
        max_start = start
        max_end = end - 1
        max_length = end - start
    start = end
Assuming that the longest run of equal integers has length 1, you'll be scanning through the entire array A of n items. Thus, the complexity is not in terms of n alone, but in terms of len(B).
Not sure if the complexity is O(n/len(B)).
Checking the edge cases:
- When n == len(B), you get an instant result (only checking A[0] and A[n-1]).
- When len(B) == 1, you get O(n), checking all elements.
- For the normal case, I'm too lazy to write the algo to analyze...
Edit
Given that len(B) is not known in advance, we must take the worst case, i.e. O(n)
