Recurrence relations: analyzing an algorithm

Alright, I have some difficulty fully understanding recurrence relations.
So, for example, suppose I'm analyzing quicksort in the worst case using Θ-notation, given an input array of unsorted positive numbers:
In the base case, n <= 3, I sort the small array using insertion sort.
Time: O(1) or O(n^2)?
I use linear search to set the pivot as the median of all elements.
Time: Θ(n)
I partition into the parts left and right of the pivot and recurse on them.
Time: 2 * T(n/2), I think.
Would the recurrence then be
T(n) = O(1) + Θ(n) + 2 * T(n/2)?
Something tells me the base case should instead count as O(n^2), because if the input itself is small enough, insertion sort's worst case is what matters.
Would that then give me the recurrence
T(n) = O(n^2) + Θ(n) + 2 * T(n/2)?
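For concreteness, here is a rough Python sketch of the algorithm described above (not anyone's reference implementation; the "median pivot" is taken from a sorted copy here just to make it runnable, and the answer below discusses why a plain linear search is not enough to find the median):

    def insertion_sort(a, lo, hi):
        # Sort a[lo..hi] in place; O(1) when hi - lo <= 2.
        for i in range(lo + 1, hi + 1):
            j = i
            while j > lo and a[j - 1] > a[j]:
                a[j - 1], a[j] = a[j], a[j - 1]
                j -= 1

    def quicksort(a, lo=0, hi=None):
        if hi is None:
            hi = len(a) - 1
        n = hi - lo + 1
        if n <= 3:                        # base case: n <= 3, use insertion sort
            insertion_sort(a, lo, hi)
            return
        # "Pivot = median of all elements" (here obtained by sorting a copy).
        pivot = sorted(a[lo:hi + 1])[n // 2]
        # Partition into elements <, ==, > the pivot, then recurse on both sides.
        left = [x for x in a[lo:hi + 1] if x < pivot]
        mid = [x for x in a[lo:hi + 1] if x == pivot]
        right = [x for x in a[lo:hi + 1] if x > pivot]
        a[lo:hi + 1] = left + mid + right
        quicksort(a, lo, lo + len(left) - 1)
        quicksort(a, lo + len(left) + len(mid), hi)

    data = [9, 3, 7, 1, 8, 2, 6, 4, 5]
    quicksort(data)
    print(data)   # [1, 2, 3, 4, 5, 6, 7, 8, 9]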

In the base case, n <= 3, I sort the small array using insertion sort.
Always O(1): the subarray has at most 3 elements, so insertion sort on it takes constant time.
I use linear search to set the pivot as the median of all elements.
You may want to clarify this: finding the median to use as the pivot is not a simple matter of doing a linear search. The usual ways are (i) the quickselect algorithm (average O(N)), (ii) sorting the subpartition (O(N log N)), or (iii) the median-of-medians algorithm (worst-case O(N)).
I partition into the parts left and right of the pivot and recurse on them.
2F(N/2) + N (partitioning is linear, plus the two recursive calls on the halves)
Putting it all together (assuming that the pivot is always the median and that you take O(N) to find the pivot each time):
Best case = worst case (since the pivot is always the median)
F(3) = F(2) = F(1) = 1
F(N) = 2F(N/2) + 2N
     = 2(2F(N/2^2) + 2(N/2)) + 2N
     = 2^2 F(N/2^2) + 2N + 2N
     = 2^3 F(N/2^3) + 3(2N)
     ...
     = 2^(log N) F(N/N) + (2N) log N
     = O(N log N)
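(As a quick numeric sanity check of that expansion, in Python, assuming integer halving and the base cases F(1) = F(2) = F(3) = 1:)

    import math
    from functools import lru_cache

    @lru_cache(maxsize=None)
    def F(n):
        # F(N) = 2F(N/2) + 2N, with F(1) = F(2) = F(3) = 1
        return 1 if n <= 3 else 2 * F(n // 2) + 2 * n

    for n in (2**6, 2**10, 2**14, 2**18):
        print(n, round(F(n) / (n * math.log2(n)), 3))   # ratio settles near 2, i.e. Theta(N log N)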

Related

Calculate the time complexity of recurrence relation f(n) = f(n/2) + f(n/3)

How do I calculate the time complexity of the recurrence relation f(n) = f(n/2) + f(n/3)? We have base cases at n = 1 and n = 0.
How do I calculate the time complexity for the general case, i.e. f(n) = f(n/x) + f(n/y), where x < n and y < n?
Edit 1 (after first answer posted): every number considered is an integer.
Edit 2 (after first answer posted): I like the answer given by MBo, but is it possible to answer this without using any fancy theorem like the master theorem, e.g. by building a recursion tree? However, users are free to answer the way they like and I will try to understand.
In "layman terms" you can get dependence with larger coefficient:
T(n) = T(n/2) + T(n/2) + O(1)
build call tree for n=2^k and see that the last tree level contains 2^k items, higher level 2^k-1 items, next one 2^k-2 and so on. Sum of sequence (geometric progression)
2^k + 2^k-1 + 2^k-2 + ... + 1 = 2^(k+1) = 2*n
so complexity for this dependence is linear too.
Now get dependence with smaller (zero) second coefficient:
T(n) = T(n/2) + O(1)
and ensure in linear complexity too.
Seems clear that complexity of recurrence in question lies between complexities for these simpler examples, and is linear.
In the general case, recurrences with complex branching can be solved with the Akra-Bazzi method (a more general approach than the master theorem).
I assume that the dependence is
T(n) = T(n/2) + T(n/3) + O(n)
In this case g(u) = u; to find p we should numerically solve
(1/2)^p + (1/3)^p = 1
and get p ~ 0.79. Then, applying the Akra-Bazzi formula T(x) = Theta(x^p * (1 + Int[1..x](g(u)/u^(p+1)) du)), we integrate:
T(x) = Theta(x^0.79 * (1 + Int[1..x]((1/u^0.79) du)))
     = Theta(x^0.79 * (1 + 4.8*x^0.21 - 4.8))
     = Theta(x^0.79 + 4.8*x)
     = Theta(x)
So the complexity is linear.
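(A small numeric check of that conclusion, under the same +O(n) assumption, with integer division and T(0) = T(1) = 1 assumed as base cases:)

    from functools import lru_cache

    @lru_cache(maxsize=None)
    def T(n):
        # T(n) = T(n/2) + T(n/3) + n, with T(0) = T(1) = 1 assumed
        return 1 if n <= 1 else T(n // 2) + T(n // 3) + n

    for n in (10**3, 10**5, 10**7):
        print(n, round(T(n) / n, 3))   # T(n)/n stays bounded, consistent with Theta(n)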

Time complexity of one recursive algorithm

Here we have an algorithm:
T(n) = n-1 + T(i-1) + T(n-i)
T(1) = 1
How do I calculate its time complexity?
Here i is between 1 and n.
I recognise this as the quicksort algorithm (randomized quicksort). I am fairly sure the question somehow dropped a summation (averaging over i).
Okay, you can use the substitution method here: check with O(n^2) and you will see that O(n^2) is the worst-case time complexity.
The average case is a bit trickier, because the pivot can be any element from 1 to n; analyse it the same way. Here too you can apply substitution, with T(n) = O(n log n).
I think we should solve it like this:
If i = 2, then we have
T(n) = n + T(n-2) = Theta(n^2)
If i = n/2, then we have
T(n) = n-1 + T(n/2 - 1) + T(n/2) = Theta(n log n)
So we have the upper bound O(n^2), and the algorithm is in the order of O(n^2).
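(A quick numeric check of those two cases, assuming the extra base case T(0) = 0 in addition to T(1) = 1:)

    import math

    def t_skewed(n):
        # Pivot always at i = 2, so T(n) = n + T(n - 2); computed iteratively.
        total = 0
        while n > 1:
            total += n          # (n - 1) partition cost plus T(1) = 1
            n -= 2
        return total + (1 if n == 1 else 0)

    def t_median(n):
        # Pivot always at i = n // 2, so T(n) = (n - 1) + T(i - 1) + T(n - i).
        if n <= 1:
            return 1 if n == 1 else 0
        i = n // 2
        return (n - 1) + t_median(i - 1) + t_median(n - i)

    for n in (64, 256, 1024, 4096):
        print(n,
              round(t_skewed(n) / n**2, 3),                 # ~0.25, i.e. Theta(n^2)
              round(t_median(n) / (n * math.log2(n)), 3))   # roughly constant, i.e. Theta(n log n)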

Running time of merge sort on linked lists?

I came across this piece of code that performs merge sort on a linked list.
The author claims that it runs in time O(n log n).
Here is the link for it:
http://www.geeksforgeeks.org/merge-sort-for-linked-list/
My claim is that it takes at least O(n^2) time, and here is my argument.
You divide the list (be it an array or a linked list) log n times (refer to the recursion tree). During each partition call, given a list of size i = n, n/2, ..., n/2^k, we take O(i) time to partition the original/already-divided list. Since sigma O(i) = O(n), we can say (sloppily) that we take O(n) time per partition call. The question now is how many partition calls happen in total: the number of partitions at level i is 2^i, so summing 2^0 + 2^1 + ... + 2^(lg n) gives roughly 2n, i.e. on the order of n calls to partition. So, with O(n) per call and about n calls, the complexity is at least big-Omega of n^2.
If I am wrong, please let me know where. Thanks :)
Then, after some retrospection, I applied the master method to the recurrence relation, where I replaced the Theta(1) term of conventional merge sort on arrays with Theta(n) for this kind of merge sort (because the divide and combine operations each take Theta(n) time), and the running time turned out to be Theta(n lg n).
I also noticed that the cost at each level is n (because 2^i * (n / 2^i) = n is the time taken at each level), so it is Theta(n) per level times lg n levels, implying Theta(n lg n). Did I just solve my own question? Please help, I am kind of confused myself.
The recursive complexity definition for an input list of size n is
T(n) = O(n) + 2 * T(n / 2)
Expanding this we get:
T(n) = O(n) + 2 * (O(n / 2) + 2 * T(n / 4))
= O(n) + O(n) + 4 * T(n / 4)
Expanding again we get:
T(n) = O(n) + O(n) + O(n) + 8 * T(n / 8)
Clearly there is a pattern here. Since we can repeat this expansion exactly O(log n) times, we have
T(n) = O(n) + O(n) + ... + O(n) (O(log n) terms)
= O(n log n)
You are performing the sum twice for some reason.
To split and then merge a linked list of size n takes O(n) time, and the depth of the recursion is O(log n).
Your argument was that a splitting step takes O(i) time, that the split steps sum to O(n), and then you call that the time taken to perform only one split.
Instead, consider this: a problem of size n forms two n/2 problems, four n/4 problems, eight n/8 problems, and so on, until 2^log n subproblems of size n/2^log n are formed. Summing these up, you get O(n log n) to perform the splits.
Another O(n log n) goes to combining the subproblems.
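For reference, here is a minimal merge sort on a singly linked list (a sketch in Python, not the GeeksforGeeks code): splitting with slow/fast pointers costs O(n) per level and merging costs O(n) per level, over O(log n) levels.

    class Node:
        def __init__(self, value, next=None):
            self.value = value
            self.next = next

    def split(head):
        # Find the middle with slow/fast pointers and cut the list in two; O(n).
        slow, fast = head, head.next
        while fast and fast.next:
            slow = slow.next
            fast = fast.next.next
        middle = slow.next
        slow.next = None
        return head, middle

    def merge(a, b):
        # Standard merge of two sorted lists; O(n).
        dummy = tail = Node(None)
        while a and b:
            if a.value <= b.value:
                tail.next, a = a, a.next
            else:
                tail.next, b = b, b.next
            tail = tail.next
        tail.next = a or b
        return dummy.next

    def merge_sort(head):
        if head is None or head.next is None:
            return head
        left, right = split(head)
        return merge(merge_sort(left), merge_sort(right))

    # Usage: build a small list, sort it, print it.
    head = None
    for v in [5, 1, 4, 2, 3]:
        head = Node(v, head)
    head = merge_sort(head)
    while head:
        print(head.value, end=" ")
        head = head.next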

Worst Case Performance of Quicksort

I am trying to prove the following worst-case scenario for the Quicksort algorithm but am having some trouble. Initially, we have an array of size n, where n = ij. The idea is that at every partition step of Quicksort, you end up with two sub-arrays where one is of size i and the other is of size i(j-1). i in this case is an integer constant greater than 0. I have drawn out the recursive tree of some examples and understand why this is a worst-case scenario and that the running time will be theta(n^2). To prove this, I've used the iteration method to solve the recurrence equation:
T(n) = T(ij) = m if j = 1
T(n) = T(ij) = T(i) + T(i(j-1)) + cn if j > 1
T(i) = m
T(2i) = m + m + c*2i = 2m + 2ci
T(3i) = m + 2m + 2ci + 3ci = 3m + 5ci
So it looks like the recurrence works out to:
T(n) = jm + ci * ((sum from k=1 to j of k) - 1) = jm + ci * (j(j+1)/2 - 1)
At this point, I'm a bit lost as to what to do. It looks like the summation at the end will result in roughly j^2 when expanded out, but I need to show that it somehow equals n^2. Any explanation on how to continue with this would be appreciated.
Pay attention: the quicksort worst-case scenario is when you have two subproblems of size 0 and n-1 at every step. In that scenario you have these recurrence equations, one per level of the tree (the partition step on a subarray of size s costs about s):
T(n)   = T(n-1) + T(0) + (n-1)   <-- at the first level of the tree
T(n-1) = T(n-2) + T(0) + (n-2)   <-- at the second level of the tree
T(n-2) = T(n-3) + T(0) + (n-3)   <-- at the third level of the tree
...
The sum of the costs over all levels is an arithmetic series:
T(n) = sum from k=1 to n-1 of k = n(n-1)/2 ~ n^2 (for n -> +inf)
It is O(n^2).
It's a problem of simple mathematics. The complexity, as you have correctly calculated, is
O(jm + ij^2)
What you have found is a parameterized complexity. The standard O(n^2) is contained in it as follows: assuming i = 1 you have a standard base case, so m = O(1) and j = n, and therefore we get O(n^2). If you substitute ij = n you get O(nm/i + n^2/i). Now remember that m is a function of i, depending on which algorithm you use for the base case, so m = f(i), and you are left with O(n*f(i)/i + n^2/i). Note also that since there is no linear-time algorithm for general (comparison-based) sorting, f(i) = Omega(i log i), which gives O(n log i + n^2/i). So you have only one degree of freedom, namely i. Check that for no value of i can you reduce this below n log n, which is the best bound for comparison-based sorting.
What I am confused about is that you are doing a worst-case analysis of quicksort, and this is not the way it is usually done. Talking about the worst case usually implies randomized pivoting, in which case the worst case will always be i = 1, and hence the worst-case bound is O(n^2). An elegant way to do this is explained in the Randomized Algorithms book by Motwani and Raghavan; alternatively, if you are a programmer, look at Cormen (CLRS).
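One way to finish the substitution the question asks about (assuming i and m are constants and n = ij, so j = n/i):
T(n) = jm + ci * (j(j+1)/2 - 1)
     = (n/i)*m + c*i * ((n/i)(n/i + 1)/2 - 1)
     = Theta(n) + Theta(n^2 / i)
     = Theta(n^2)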

Average Runtime of Quickselect

Wikipedia states that the average runtime of the quickselect algorithm (link) is O(n). However, I could not clearly understand how this is so. Could anyone explain to me (via a recurrence relation and the master method) how the average runtime is O(n)?
Because
we already know which partition our desired element lies in,
we do not need to sort (by partitioning) all the elements, but only operate on the partition we need.
As in quicksort, we partition into halves*, and then into halves of a half, but this time we only need to do the next round of partitioning in the single partition (half) of the two in which the element is expected to lie.
It is like (not very accurately)
n + (1/2)n + (1/4)n + (1/8)n + ... < 2n
So it is O(n).
* "Half" is used for convenience; the actual partition is not exactly 50%.
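A minimal quickselect sketch in Python (assuming a Lomuto-style partition with a random pivot, and descending into only the side that can contain the k-th smallest element):

    import random

    def partition(a, lo, hi):
        # Lomuto partition around a randomly chosen pivot; returns the pivot's final index.
        p = random.randint(lo, hi)
        a[p], a[hi] = a[hi], a[p]
        pivot, store = a[hi], lo
        for i in range(lo, hi):
            if a[i] < pivot:
                a[i], a[store] = a[store], a[i]
                store += 1
        a[store], a[hi] = a[hi], a[store]
        return store

    def quickselect(a, k):
        # Returns the k-th smallest element of a (k is 0-based).
        lo, hi = 0, len(a) - 1
        while True:
            p = partition(a, lo, hi)
            if p == k:
                return a[p]
            elif p < k:
                lo = p + 1    # keep only the right part
            else:
                hi = p - 1    # keep only the left part

    print(quickselect([7, 2, 9, 4, 1, 5], 2))   # 4, the 3rd smallest element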
To do an average-case analysis of quickselect, one has to consider, for every pair of elements, how likely it is that the two are compared during the algorithm, assuming random pivoting. From this we can derive the average number of comparisons. Unfortunately the analysis I will show requires some longer calculations, but it is a clean average-case analysis, as opposed to the current answers.
Let's assume the array we want to select the k-th smallest element from is a random permutation of [1,...,n]. The pivot elements we choose during the course of the algorithm can also be seen as a given random permutation. During the algorithm we then always pick the next feasible pivot from this permutation, so the pivots are chosen uniformly at random, as every element has the same probability of occurring as the next feasible element in the random permutation.
There is one simple, yet very important, observation: we compare two elements i and j (with i < j) if and only if one of them is chosen as the first pivot element from the range [min(k,i), max(k,j)]. If another element from this range is chosen first, they will never be compared, because we continue searching in a sub-array that no longer contains at least one of i and j.
Because of the above observation and the fact that the pivots are chosen uniform at random the probability of a comparison between i and j is:
2/(max(k,j) - min(k,i) + 1)
(Two events out of max(k,j) - min(k,i) + 1 possibilities.)
We split the analysis in three parts:
max(k,j) = k, therefore i < j <= k
min(k,i) = k, therefore k <= i < j
min(k,i) = i and max(k,j) = j, therefore i < k < j
In the third case the less-equal signs are omitted because we already consider those cases in the first two cases.
Now let's get our hands a little dirty on calculations. We just sum up all the probabilities as this gives the expected number of comparisons.
Case 1
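(A sketch of the Case 1 sum, using the probability from above; for each i there are k - i choices of j, each contributing 2/(k - i + 1):)
sum over i < j <= k of 2/(k - i + 1)
    = sum from i=1 to k-1 of 2(k - i)/(k - i + 1)
    <= 2(k - 1)
    <= 2n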
Case 2
Similar to case 1 so this remains as an exercise. ;)
Case 3
We use H_r for the r-th harmonic number which grows approximately like ln(r).
Conclusion
All three cases need a linear number of expected comparisons. This shows that quickselect indeed has an expected runtime in O(n). Note that - as already mentioned - the worst case is in O(n^2).
Note: The idea of this proof is not mine. I think that's roughly the standard average case analysis of quickselect.
If there are any errors please let me know.
In quickselect, as specified, we apply recursion on only one half of the partition.
Average Case Analysis:
First step: T(n) = cn + T(n/2)
where cn is the time to perform the partition (c is some constant; its exact value doesn't matter) and T(n/2) is the recursion on one half of the partition. Since it's the average case, we assume the pivot was the median.
As we keep recursing, we get the following set of equations:
T(n/2) = cn/2 + T(n/4)
T(n/4) = cn/4 + T(n/8)
...
T(2) = 2c + T(1)
T(1) = c
Summing the equations and cancelling like terms produces a linear result:
c(n + n/2 + n/4 + ... + 2 + 1) ~ 2cn   // sum of a GP
Hence, it's O(n).
I also felt very conflicted at first when I read that the average time complexity of quickselect is O(n) while we break the list in half each time (like binary search or quicksort). It turns out that breaking the search space in half each time doesn't guarantee an O(log n) or O(n log n) runtime. What makes quicksort O(n log n) and quickselect O(n) is that in quicksort we always need to explore all branches of the recursive tree, while in quickselect we explore only a single branch. Let's compare the time complexity recurrence relations of quicksort and quickselect to prove my point.
Quicksort:
T(n) = n + 2T(n/2)
= n + 2(n/2 + 2T(n/4))
= n + 2(n/2) + 4T(n/4)
= n + 2(n/2) + 4(n/4) + ... + n(n/n)
= 2^0(n/2^0) + 2^1(n/2^1) + ... + 2^log2(n)(n/2^log2(n))
= n (log2(n) + 1) (since we are adding n to itself log2(n) + 1 times)
Quickselect:
T(n) = n + T(n/2)
= n + n/2 + T(n/4)
= n + n/2 + n/4 + ... + n/n
= n(1 + 1/2 + 1/4 + ... + 1/2^log2(n))
<= n * (1/(1 - 1/2)) = 2n (by the geometric series)
I hope this convinces you why the average runtime of quickselect is O(n).
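A tiny numeric comparison of the two recurrences (assuming T(1) = 1 and integer halving):

    import math

    def t_quicksort(n):
        # T(n) = n + 2*T(n/2), T(1) = 1
        return 1 if n <= 1 else n + 2 * t_quicksort(n // 2)

    def t_quickselect(n):
        # T(n) = n + T(n/2), T(1) = 1
        return 1 if n <= 1 else n + t_quickselect(n // 2)

    for n in (2**8, 2**12, 2**16):
        print(n,
              round(t_quicksort(n) / (n * math.log2(n)), 3),   # hovers near 1, i.e. Theta(n log n)
              round(t_quickselect(n) / n, 3))                  # hovers near 2, i.e. Theta(n)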
