Finding the asymptotic number of comparisons performed by an algorithm

I have been given the following algorithm:
Supersort(A, i, j):
    if (j = i): return
    if (j = i + 1):
        if (A[i] > A[j]):
            swap(A[i], A[j])
    else:
        k = floor of ( (j-i+1)/3 )
        Supersort(A, i, j-k) // sort first two thirds
        Supersort(A, i+k, j) // sort last two thirds
        Supersort(A, i, j-k) // sort first two thirds again
I'm really not sure how to analyze how many comparisons this algorithm makes in the worst case. I don't want the answer given to me; I just don't even know how to get started on this problem. Thanks for any help.

Typically, when you have a recursive function, the first thing you do is obtain a recurrence relation. In your case, let T(n) be the cost of Supersort when the input is of size n. What is that equal to? The first two if statements cost only a constant, and the else branch costs T(2n/3) + T(n/3) + T(2n/3), so
T(n)=2T(2n/3)+T(n/3)+C
Then you solve that recurrence.
Correction:
In all three recursive calls you use 2/3 of the input; I had thought one of them was on 1/3. In that case the recurrence is even simpler and can be solved using the Master Theorem:
T(n)=3T(2n/3)+C
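If you want to sanity-check whatever bound you derive, here is a small sketch (mine, in Python, assuming 0-based indexing, which the pseudocode leaves unspecified) that counts the element comparisons Supersort performs, so you can compare the measured growth against the recurrence:

def supersort_count(A, i, j):
    # Returns the number of element comparisons made while sorting A[i..j].
    if j == i:
        return 0
    if j == i + 1:
        if A[i] > A[j]:
            A[i], A[j] = A[j], A[i]
        return 1                           # one comparison of A[i] and A[j]
    k = (j - i + 1) // 3
    c = supersort_count(A, i, j - k)       # first two thirds
    c += supersort_count(A, i + k, j)      # last two thirds
    c += supersort_count(A, i, j - k)      # first two thirds again
    return c

for n in (3, 9, 27, 81):
    print(n, supersort_count(list(range(n, 0, -1)), 0, n - 1))

Plotting those counts against n on a log-log scale shows the exponent that solving the recurrence should give you.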

Related

Find an element that occurs at least k times in a sorted array in log(n) time

Given a sorted array of n elements and a number k, is it possible to find an element that occurs more than k times, in log(n) time? If there is more than one number that occurs more than k times, any of them are acceptable.
If yes, how?
Edit:
I'm able to solve the problem in linear time, and I'm happy to post that solution here - but it's fairly straightforward to solve it in O(n). I'm completely stumped when it comes to making it work in log(n), though, and that's what my question is about.
Here is an O((n/k) log(k)) solution:
i = 0
while i+k-1 < n: //don't get out of bounds
    if arr[i] == arr[i+k-1]:
        produce arr[i] as dupe
        i = min { j | arr[j] > arr[i] } //binary search
    else:
        c = min { j | arr[j] == arr[i+k-1] } //binary search
        i = c
The idea is, you check the element at index i+k-1; if it matches the element at index i - good, it's a dupe. Otherwise, you don't need to check all the elements between i and i+k-1, only the one with the same value as arr[i+k-1].
You do need to look back for the earliest index of this element, but you are guaranteed to exceed the index i+k by the next iteration, making the total number of iterations of this algorithm O(n/k), each taking O(log k) time.
This is asymptotically better than a linear-time algorithm, especially for large values of k (the algorithm approaches O(log n) when k is in O(n), for example when looking for an element that repeats with frequency at least 0.1).
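Here is a rough Python sketch of the same idea (my illustration, not the poster's code; it assumes the input list is sorted and uses the standard bisect module for the two binary searches):

import bisect

def at_least_k_times(arr, k):
    # Yields each value that occurs at least k times in the sorted list arr.
    n = len(arr)
    i = 0
    while i + k - 1 < n:
        if arr[i] == arr[i + k - 1]:
            yield arr[i]                                   # arr[i] has at least k copies
            i = bisect.bisect_right(arr, arr[i])           # first index holding a larger value
        else:
            i = bisect.bisect_left(arr, arr[i + k - 1])    # first occurrence of arr[i+k-1]

print(list(at_least_k_times([1, 1, 2, 2, 2, 3, 4, 4, 4, 4], 3)))   # prints [2, 4]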
Not in general. For example, if k=2, no algorithm can guarantee to find a duplicate without inspecting every element of the array in the worst case.

how to write a recurrence relation for a given piece of code

In my algorithms and data structures class we were given a few recurrence relations, either to solve or to use to determine the complexity of an algorithm.
At first, I thought that the sole purpose of these relations was to jot down the complexity of a recursive divide-and-conquer algorithm. Then I came across a question in the MIT assignments where one is asked to provide a recurrence relation for an iterative algorithm.
How would I actually come up with a recurrence relation myself, given some code? What are the necessary steps?
Is it actually correct that I can express any case, i.e. worst, best, or average case, with such a relation?
Could someone possibly give a simple example of how a piece of code is turned into a recurrence relation?
Cheers,
Andrew
Okay, so in algorithm analysis, a recurrence relation is a function relating the amount of work needed to solve a problem of size n to that needed to solve smaller problems (this is closely related to its meaning in math).
For example, consider a Fibonacci function below:
Fib(a)
{
    if (a == 1 || a == 0)
        return 1;
    return Fib(a-1) + Fib(a-2);
}
This does three operations (comparison, comparison, addition), and also calls itself recursively. So the recurrence relation is T(n) = 3 + T(n-1) + T(n-2). To solve this, you would use the iterative method: start expanding the terms until you find the pattern. For this example, you would expand T(n-1) to get T(n) = 6 + 2*T(n-2) + T(n-3). Then expand T(n-2) to get T(n) = 12 + 3*T(n-3) + 2*T(n-4). One more time, expand T(n-3) to get T(n) = 21 + 5*T(n-4) + 3*T(n-5).
Notice that the coefficient of the first T term follows the Fibonacci numbers, and the constant term is the sum of them times three: looking it up, that is 3*(Fib(n+2)-1). More importantly, we notice that the sequence increases exponentially; that is, the complexity of the algorithm is O(2^n).
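To see that exponential blow-up concretely, here is a small sketch (my addition, in Python) that returns both the value and the number of calls the naive recursion makes:

def fib_calls(a):
    # Mirrors the recursive Fib above, also counting how many calls were made.
    if a == 0 or a == 1:
        return 1, 1
    v1, c1 = fib_calls(a - 1)
    v2, c2 = fib_calls(a - 2)
    return v1 + v2, c1 + c2 + 1

for n in (5, 10, 20, 25):
    value, calls = fib_calls(n)
    print(n, value, calls)        # the call count grows exponentially with n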
Then consider this function for merge sort:
Merge(ary)
{
    ary_start = Merge(ary[0:n/2]);
    ary_end = Merge(ary[n/2:n]);
    return MergeArrays(ary_start, ary_end);
}
This function calls itself on half the input twice, then merges the two halves (using O(n) work). That is, T(n) = T(n/2) + T(n/2) + O(n). To solve recurrence relations of this type, you should use the Master Theorem. By this theorem, this expands to T(n) = O(n log n).
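To spell out that Master Theorem step (assuming the standard statement of the theorem): in T(n) = 2T(n/2) + O(n) there are a = 2 subproblems, each of size n/b with b = 2, so n^(log_b a) = n^1, which is the same order as the O(n) merge work. That is the "balanced" case of the theorem, and it yields T(n) = O(n log n).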
Finally, consider this function to calculate Fibonacci:
Fib2(n)
{
    two = one = 1;
    for (i from 2 to n)
    {
        temp = two + one;
        one = two;
        two = temp;
    }
    return two;
}
This function calls itself no times, and it iterates O(n) times. Therefore, its recurrence relation is T(n) = O(n). This is the case you asked about. It is a special case of recurrence relations with no recurrence; therefore, it is very easy to solve.
To find the running time of an algorithm, we first need to be able to write an expression for the algorithm, where that expression tells us the running time of each step. So you need to walk through the steps of the algorithm to find that expression.
For example, suppose we defined a predicate, isSorted, which would take as input an array a and the size, n, of the array and would return true if and only if the array was sorted in increasing order.
bool isSorted(int *a, int n) {
    if (n == 1)
        return true;           // a 1-element array is always sorted
    for (int i = 0; i < n-1; i++) {
        if (a[i] > a[i+1])     // found two adjacent elements out of order
            return false;
    }
    return true;               // everything's in sorted order
}
Clearly, the size of the input here will simply be n, the size of the array. How many steps will be performed in the worst case, for input n?
The first if statement counts as 1 step
The for loop will execute n−1 times in the worst case (assuming the internal test doesn't kick us out), for a total time of n−1 for the loop test and the increment of the index.
Inside the loop, there's another if statement which will be executed once per iteration for a total of n−1 time, at worst.
The last return will be executed once.
So, in the worst case, we'll have done 1 + (n−1) + (n−1) + 1 computations, for a total running time T(n) ≤ 1 + (n−1) + (n−1) + 1 = 2n, and so we have the timing function T(n) = O(n).
So, in brief, what we have done is:
1. For a parameter n that gives the size of the input, we assume that each simple statement executed once takes constant time; for simplicity, assume one unit.
2. Iterative statements like loops, together with the bodies inside them, take a variable amount of time depending on the input.
3. So your task is to go step by step and write down the function in terms of n to calculate the time complexity.
For recursive algorithms, you do the same thing, only this time you add the time taken by each recursive call, expressed as a function of the time it takes on its input.
For example, let's rewrite, isSorted as a recursive algorithm:
bool isSorted(int *a, int n) {
    if (n == 1)
        return true;
    if (a[n-2] > a[n-1])         // are the last two elements out of order?
        return false;
    else
        return isSorted(a, n-1); // is the initial part of the array sorted?
}
In this case we still walk through the algorithm, counting: 1 step for the first if plus 1 step for the second if, plus the time isSorted will take on an input of size n−1, which will be T(n−1), giving a recurrence relation
T(n)≤1+1+T(n−1)=T(n−1)+O(1)
Which has solution T(n)=O(n), just as with the non-recursive version, as it happens.
Simple enough! Practice writing the recurrence relations of various algorithms, keeping in mind how many times each step in the algorithm is executed.

Finding running time of a simple linear recursive algorithm using recurrences

If I had a simple recursive algorithm, such as:
numberOfMatches(A, x, i): // A is an array of values, x is a single value
    // the algorithm searches the array from A[1] to A[i]
    count = 0
    if i == 0:
        return 0
    if A[i] = x:
        count = numberOfMatches(A, x, i-1) + 1
    else:
        count = numberOfMatches(A, x, i-1)
    return count
How would I go about finding the running time (which I know from common sense is O(n)) using recurrences?
I have got T(n) = T(n-1), because the list to be searched decreases by 1 each time; however, I don't think this is right.
I also need to solve the recurrence by expanding it, and I don't even know where to start with that.
T(n)=T(n-1)+1
By induction you can prove it easily.
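Since you also asked about solving it by expansion, here is what that looks like (my addition; T(0) is the constant cost of the i == 0 base case):
T(n) = T(n-1) + 1 = T(n-2) + 2 = T(n-3) + 3 = ... = T(0) + n
which is O(n), matching what you expected from common sense.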

Finding time complexity of partition by the quicksort method

Here is an algorithm for finding the kth smallest number in an n-element array, using the partition algorithm of Quicksort.
small(a, i, j, k)
{
    if (i == j) return a[i];
    else
    {
        m = partition(a, i, j);
        if (m == k) return a[m];
        else
        {
            if (m > k) return small(a, i, m-1, k);
            else return small(a, m+1, j, k);
        }
    }
}
Where i and j are the starting and ending indices of the array (j-i = n, the number of elements in the array), and k is the kth smallest number to be found.
I want to know the best case and average case of the above algorithm, and briefly how to derive them. I know we should not count the termination condition in the best case, and also that the partition algorithm takes O(n). I do not want asymptotic notation, but an exact mathematical result if possible.
First of all, I'm assuming the array is sorted - something you didn't mention - because that code wouldn't otherwise work. And, well, this looks to me like a regular binary search.
Anyway...
The best case scenario is when either the array is one element long (you return immediately because i == j), or, for large values of n, the middle position m happens to equal k; in that case, no recursive calls are made and it returns immediately as well. That makes it O(1) in the best case.
For the general case, consider that T(n) denotes the time taken to solve a problem of size n using your algorithm. We know that:
T(1) = c
T(n) = T(n/2) + c
Where c is a constant time operation (for example, the time to compare if i is the same as j, etc.). The general idea is that to solve a problem of size n, we consume some constant time c (to decide if m == k, if m > k, to calculate m, etc.), and then we consume the time taken to solve a problem of half the size.
Expanding the recurrence can help you derive a general formula, although it is pretty intuitive that this is O(log(n)):
T(n) = T(n/2) + c = T(n/4) + c + c = T(n/8) + c + c + c = ... = T(1) + c*log(n) = c*(log(n) + 1)
That should be the exact mathematical result. The algorithm runs in O(log(n)) time. An average case analysis is harder, because you need to know the conditions in which the algorithm will be used. What is the typical size of the array? The typical size of k? What is the most likely position for k in the array? If it's in the middle, for example, the average case may be O(1). It really depends on how you use this.
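If you want to experiment with the best and average case empirically, here is a runnable sketch of the selection routine from the question (my code in Python, not the answerer's; it assumes 0-based indexing and a Lomuto-style partition with a random pivot, neither of which the question specifies):

import random

def partition(a, i, j):
    # Lomuto partition around a random pivot; returns the pivot's final index.
    p = random.randint(i, j)
    a[p], a[j] = a[j], a[p]
    store = i
    for t in range(i, j):
        if a[t] < a[j]:
            a[t], a[store] = a[store], a[t]
            store += 1
    a[store], a[j] = a[j], a[store]
    return store

def small(a, i, j, k):
    # Returns the element that would sit at index k if a[i..j] were sorted.
    if i == j:
        return a[i]
    m = partition(a, i, j)
    if m == k:
        return a[m]
    if m > k:
        return small(a, i, m - 1, k)
    return small(a, m + 1, j, k)

a = [7, 2, 9, 4, 1, 5, 8]
print(small(a, 0, len(a) - 1, 2))   # third smallest -> 4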

Guards and demand

You have N guards in a line each with a demand of coins. You can skip paying a guard only if his demand is less than what you have totally paid before reaching him. Find the least number of coins you spend to cross all guards.
I think it's a DP problem, but I can't come up with a formula. Another approach would be to binary search on the answer, but how do I verify whether a given number of coins is a possible answer?
This is indeed a dynamic programming problem.
Consider the function f(i, j), which is true (one) if there is an assignment for the first i guards that gives you cost j. You can arrange the function f(i, j) in a table of size n x S, where S is the sum of all the guards' demands.
Let us denote d_i as the demand of guard i.
You can easily compute column f(i+1) if you have f(i) by simply scanning f(i): whenever f(i, j) is true, set f(i+1, j + d_{i+1}) to one (you pay guard i+1), and also set f(i+1, j) to one if j > d_{i+1} (you skip guard i+1, which is allowed because its demand is less than what you have already paid).
This runs in O(nS) time and O(S) space (you only need to keep two columns at a time), which is only pseudo-polynomial (and quadratic-like if the demands are somehow bounded and do not grow with n).
A common trick to reduce the complexity of a DP problem is to get an upper bound B on the value of the optimal solution. This way, you can prune unnecessary rows, obtaining a time complexity of O(nB) (well, even S is an upper-bound, but a very naïve one).
It turns out that, in our case, B = 2M, where M is the maximum demand of a guard.
In fact, consider the function best_assignment(i), which gives you the minimum amount of coins to pass the first i guards.
Let j be the guard with demand M. If best_assignment(j - 1) > M, then obviously the best assignment for the whole sequence is to pay for the best assignment of the first j-1 guards and skip all the remaining ones; otherwise, the upper bound is given by best_assignment(j - 1) + M <= 2M.
But how large can best_assignment(j - 1) be in the first case? It cannot be more than 2M.
This can be proven by contradiction. Let us suppose that best_assignment(j - 1) > 2M. In this assignment, is guard j-1 paid? No, because what has been paid before reaching him is more than 2M - d_{j-1} >= d_{j-1}, so he does not need to be paid. The same argument holds for j-2, j-3, ..., 1, thus no guard is paid, which is absurd unless M = 0 (a trivial case to be checked separately).
Since the upper-bound is proved to be 2M, the DP illustrated above with n columns and 2M rows solves the problem, with time complexity O(nM) and space complexity O(M).
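A minimal sketch of that DP in Python (my illustration of the idea above, not code from the answer; it keeps a set of reachable totals instead of an explicit boolean column and prunes totals above the 2M bound):

def min_coins(demands):
    # reachable holds every total j such that the guards seen so far can be
    # crossed having paid exactly j coins.
    if not demands:
        return 0
    bound = 2 * max(demands)          # the upper bound argued above
    reachable = {0}
    for d in demands:
        nxt = set()
        for j in reachable:
            if j + d <= bound:
                nxt.add(j + d)        # pay this guard
            if d < j:
                nxt.add(j)            # skip: demand is less than what was paid so far
        reachable = nxt
    return min(reachable)

print(min_coins([10, 5, 11]))   # 15: pay 10 and 5, then skip the guard demanding 11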
function crossCost(amtPaidAlready, curIdx, demands){
    // base case: we are at the end of the line
    if (curIdx >= demands.size()){
        return amtPaidAlready;
    }
    costIfWePay = crossCost(amtPaidAlready + demands[curIdx], curIdx+1, demands);
    // can we skip paying the guard?
    if (demands[curIdx] < amtPaidAlready){
        costIfWeDontPay = crossCost(amtPaidAlready, curIdx+1, demands);
        return min(costIfWePay, costIfWeDontPay);
    }
    // can't skip paying
    else{
        return costIfWePay;
    }
}
This runs in O(2^N) time because it may call itself twice per execution. It's a good candidate for memoization, because it is a pure function with no side effects.
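For instance, a memoized version along these lines (a Python sketch, not the poster's code, caching on the (index, amount paid) pair):

from functools import lru_cache

def cross_cost(demands):
    demands = tuple(demands)

    @lru_cache(maxsize=None)
    def go(idx, paid):
        # Least total spent to get past guards idx..N-1, having paid `paid` so far.
        if idx == len(demands):
            return paid
        cost_if_we_pay = go(idx + 1, paid + demands[idx])
        if demands[idx] < paid:                      # skipping this guard is allowed
            return min(cost_if_we_pay, go(idx + 1, paid))
        return cost_if_we_pay

    return go(0, 0)

print(cross_cost([10, 5, 11]))   # 15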
Here's my approach:
int guards[N];
int minSpent;

void func(int pos, int current_spent){
    if(pos == N){ // all guards passed: record the best total
        if(current_spent < minSpent)
            minSpent = current_spent;
        return;
    }
    if(guards[pos] < current_spent)  // If the current guard can be skipped
        func(pos+1, current_spent);  // just skip it and move to the next guard
    func(pos+1, current_spent+guards[pos]); // In either case, try paying the current guard
}
Used in this way:
minSpent = MAX_NUM;
func(1,guards[0]);
This will try all possibilities; it's O(2^N). Hope this helps.
