Recurrence of Merge-Sort - need explanation

This is the recurrence for the worst-case running time T(n) of the Merge-Sort procedure:
T(n) = Θ(1) if n = 1
T(n) = 2T(n/2) + Θ(n) if n > 1
What is T?
Why 2T(n/2)?
For which operation is the Θ(n)?

For simplicity, assume that n is a power of 2 so that each divide step yields two subproblems, both of size exactly n/2.
The base case occurs when n = 1.
When n ≥ 2, the time for the merge sort steps breaks down as follows:
Divide: Just compute q as the average of p and r, which takes constant time, i.e. Θ(1).
Conquer: Recursively solve 2 subproblems, each of size n/2, which is 2T(n/2).
Combine: MERGE on an n-element subarray takes Θ(n) time.
Summed together, the divide and combine costs give a function that is linear in n, i.e. Θ(n). Therefore, the recurrence for the merge sort running time is T(n) = 2T(n/2) + Θ(n) for n > 1, with T(1) = Θ(1), which solves to Θ(n lg n).
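For concreteness, here is a minimal Python sketch of the same structure (an illustrative implementation, not the textbook MERGE-SORT pseudocode), with comments marking which step contributes each term of the recurrence:

import heapq

def merge_sort(a):
    # Returns a sorted copy of a; mirrors T(n) = 2T(n/2) + Θ(n).
    if len(a) <= 1:                        # base case: Θ(1)
        return a
    mid = len(a) // 2                      # Divide: compute the midpoint, Θ(1)
    left = merge_sort(a[:mid])             # Conquer: T(n/2)
    right = merge_sort(a[mid:])            # Conquer: T(n/2)
    return list(heapq.merge(left, right))  # Combine: merging two sorted halves is Θ(n)

print(merge_sort([5, 2, 4, 7, 1, 3, 2, 6]))   # [1, 2, 2, 3, 4, 5, 6, 7]

(The slicing in the divide step also costs Θ(n), but that only adds to the existing Θ(n) term and does not change the shape of the recurrence.)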

Related

What is the time complexity of the merge step of mergesort?

I know this algorithm has a time complexity of O(n log n), but if we speak only about the merge step, is it still O(n log n)? Or is it reduced to O(log n)? I believe the second is the answer, but since we still have to touch every element in the array, I suspect the complexity remains the same.
Cheers!
The "split" step is the one that takes O(log n), and the merge one is O(n); I just realized that via a comment.
The split step of Merge Sort will take O(n) instead of O(log(n)).
If we write down the runtime function of the split step:
T(n) = 2T(n/2) + O(1)
where T(n) is the runtime for input size n, 2 is the number of new subproblems, n/2 is the size of each new subproblem, and O(1) is the constant time needed to split an array in half.
We also have the base cases T(4) = O(1) and T(3) = O(1).
We might come up with (not entirely rigorous):
T(n) = n/2 * O(1) = O(n/2) = O(n)
Moreover, to understand the time complexity of the merge step (the "finger" algorithm), we should look at the number of sub-arrays.
The number of sub-arrays in the worst case is O(n/2 + 1) = O(n).
The finger algorithm grows linearly with the number of sub-arrays: it loops through each sub-array, O(n), and in the worst case needs at most 2 extra iterations per sub-array, so the time complexity of the merge step (finger algorithm) is O(2n) = O(n).
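To see the linear cost of the merge step concretely, here is a small Python sketch (the name finger_merge is made up for illustration) that merges two sorted lists with a two-finger scan and counts its steps; the count never exceeds the total number of elements:

def finger_merge(left, right):
    # Merge two sorted lists; returns (merged, steps).
    merged, steps = [], 0
    i = j = 0
    while i < len(left) and j < len(right):   # advance one "finger" per step
        steps += 1
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    while i < len(left):                      # copy any leftovers from the left run
        steps += 1; merged.append(left[i]); i += 1
    while j < len(right):                     # copy any leftovers from the right run
        steps += 1; merged.append(right[j]); j += 1
    return merged, steps

print(finger_merge([1, 3, 5, 7], [2, 4, 6, 8]))   # ([1, 2, 3, 4, 5, 6, 7, 8], 8) -- steps equals the total length

Each element is appended exactly once, so merging sub-arrays whose lengths sum to n takes O(n) time.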

Logarithmic function in time complexity

How does a program's worst-case or average-case running time depend on the log function? How does the base of the log come into play?
The log factor appears when you split your problem into k parts of size n/k each, and then "recurse" (or mimic recursion) on some of them.
A simple example is the following loop:
def foo(n):
    # prints n, n/2, n/4, ..., 1 -- the loop runs O(log n) times
    while n > 0:
        print(n)
        n = n // 2
The above will print n, n/2, n/4, ..., 1, and there are O(log n) such values.
The complexity of the above program is O(log n), since each print takes a constant amount of time, and the number of values n takes along the way is O(log n).
If you are looking for "real life" examples: in quicksort (and for simplicity let's assume splitting into exactly two halves), you split the array of size n into two subarrays of size n/2, and then you recurse on both of them, invoking the algorithm on each half.
This gives the recurrence:
T(n) = 2T(n/2) + O(n)
By the master theorem, this is in Θ(n log n).
Similarly, in binary search you split the problem into two parts and recurse on only one of them:
T(n) = T(n/2) + 1
which is in Θ(log n).
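For example, a standard recursive binary search (a generic sketch, not code from the question) follows exactly this recurrence:

def binary_search(a, target, lo, hi):
    # Search sorted a[lo..hi] for target; each call does O(1) work and halves the range.
    if lo > hi:
        return -1                                       # not found
    mid = (lo + hi) // 2                                # the "+ 1" term: constant work per call
    if a[mid] == target:
        return mid
    if a[mid] < target:
        return binary_search(a, target, mid + 1, hi)    # recurse on one half: T(n/2)
    return binary_search(a, target, lo, mid - 1)        # recurse on the other half: T(n/2)

print(binary_search([1, 3, 5, 7, 9], 7, 0, 4))   # 3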
The base is not a factor in big O complexity, because
log_k(n) = log_2(n)/log_2(k)
and log_2(k) is constant, for any constant k.
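A quick numeric check of that identity (the values of n and k here are arbitrary):

import math

n, k = 1_000_000, 8
print(math.log(n, k))                    # log base k of n
print(math.log(n, 2) / math.log(k, 2))   # same value (up to floating-point rounding)
# The two differ only by the constant factor log_2(k), so the base never shows up in big-O.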

Recurrence for the Worst-Case Running Time of Quicksort

Assume we constructed a quicksort in which finding the pivot value takes linear time. Find the recurrence for the worst-case running time.
My answer:
T(n) = T(n-1) + T(1) + Θ(n)
Worst case occurs when the subarrays are completely unbalanced.
There is 1 element in one subarray and (n-1) elements in the other subarray.
Θ(n) because it takes linear time to find the pivot.
Am I doing this correctly?
Your recurrence is mostly correct, but you don't actually make two recursive calls. In the worst case for quicksort, the pivot is the largest or smallest element in the array, so you recur on one giant array of size n - 1. The other subarray has length 0, so no recursive call is made for it. On top of that, the total work done is Θ(n) per level, so the recurrence relation would more appropriately be
T(n) = T(n - 1) + Θ(n)
This in turn solves to Θ(n²).
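To see where the Θ(n²) comes from, expand the recurrence, writing cn for the Θ(n) term (a back-of-the-envelope expansion, ignoring the base case constant):
T(n) = cn + T(n - 1)
     = cn + c(n - 1) + T(n - 2)
     = ...
     = c(n + (n - 1) + ... + 1) + T(0)
     = c · n(n + 1)/2 + T(0) = Θ(n²)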
Hope this helps!
You cannot determine the exact running time in general, because the recurrence is T(n) = T(n-k) + T(k-1) + n, and we cannot evaluate it exactly until we know the value of k.
Equivalently, T(n) = T(an/(a+b)) + T(bn/(a+b)) + n, where a/(a+b) and b/(a+b) are the fractions of the array under consideration.

Merge sort time complexity vs my algorithm. Big O

Here is an algorithm I am trying to analyse (see below). I do not understand why it has O(n) time complexity when merge sort has O(n log n); they both seem to be doing the same thing.
Both appear to have the same per-level cost: if you let j be the level of the recursion tree, then 2^j × c(n/2^j) = cn, and both have about log n levels, where n is the number of elements.
Algorithm: BinarySum(A, i, n)
Input: An array A and integers i and n.
Output: The sum of the n integers in A starting at index i.
if n = 1 then
    return A[i]
return BinarySum(A, i, [n/2]) + BinarySum(A, i + [n/2], [n/2])
thanks,
daniel
You are processing each member of the array in constant time. No matter how you do this, the resulting complexity will be O(n). By the way, if you trace a simple example with pen and paper, you'll see that you are in fact visiting the array's elements exactly in the order they appear, which means this algorithm is equivalent to simple iterative summation.
A formal proof of the O(n) complexity follows directly from the master theorem. The recurrence relation for your algorithm is
T(n) = 2T(n/2) + O(1)
which is covered by case 1 of the theorem. From this, the complexity is calculated as
O(n^(log_2 2)) = O(n)
As for merge sort, its recurrence relation is
T(n) = 2 T(n/2) + O(n)
which is a totally different story - case 2 of the Master theorem.
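One quick way to convince yourself of the earlier point that this is equivalent to iterative summation is a direct Python transcription of the question's pseudocode (using n - n/2 as the size of the second half so that odd n also works):

def binary_sum(A, i, n):
    # Sum the n integers in A starting at index i: T(n) = 2T(n/2) + O(1) = O(n).
    if n == 1:
        return A[i]
    half = n // 2
    return binary_sum(A, i, half) + binary_sum(A, i + half, n - half)

A = [3, 1, 4, 1, 5, 9, 2, 6]
print(binary_sum(A, 0, len(A)), sum(A))   # 31 31 -- the same result as iterative summation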
The recurrence formula for your algorithm is
T(n) = 2T(n/2) + O(1), which solves to O(n),
whereas the recurrence formula for merge sort is
T(n) = 2T(n/2) + O(n), which solves to O(n log n),
as there are two recursive calls plus a call to a merge function which takes O(n). Your function just makes two recursive calls; check out the breakdown:
http://www.cs.virginia.edu/~luebke/cs332.fall00/lecture3/sld004.htm
Consider the following pseudocode:
1 MergeSort(a, p, r)
2   if p < r                       // check for base case
3     then q = FLOOR((p + r) / 2)  // Divide
4       MergeSort(a, p, q)         // Conquer
5       MergeSort(a, q + 1, r)     // Conquer
6       Merge(a, p, q, r)          // Merge
Now the complexity will be as follows:
Line 3: O(1), since it takes constant time.
Line 4: T(n/2), because it operates on half of the elements.
Line 5: T(n/2), because it operates on half of the elements.
Line 6: O(n), because it operates on all of the elements.
Now, using the recurrence relation as mentioned by #Lunar, we can state that the time complexity is O(n lg n).

Is worst case analysis not equal to asymptotic bounds

Can someone explain to me why this is true? I heard a professor mention this in his lecture.
The two notions are orthogonal.
You can have worst-case asymptotics. If f(n) denotes the worst-case time taken by a given algorithm with input n, you can have, e.g., f(n) = O(n^3) or other asymptotic upper bounds of the worst-case time complexity.
Likewise, you can have g(n) = O(n^2 log n) where g(n) is the average time taken by the same algorithm with (say) uniformly distributed (random) inputs of size n.
Or you can have h(n) = O(n) where h(n) is the average time taken by the same algorithm with particularly distributed random inputs of size n (e.g. almost-sorted sequences for a sorting algorithm).
Asymptotic notation is a "measure". You have to specify what you want to count: worst case, best case, average, etc.
Sometimes, you are interested in stating asymptotic lower bounds of (say) the worst case complexity. Then you write f(n) = Omega(n^2) to state that in the worst case, the complexity is at least n^2. The big-Omega notation is opposite to big-O: f = Omega(g) if and only if g = O(f).
Take quicksort as an example. Each recursive call of quicksort on an input of size n has a running time T(n) of
T(n) = O(n) + 2T((n-1)/2)
in the 'best case', if the unsorted input list is split into two equal sublists of size (n-1)/2 in each call. Solving for T(n) gives O(n log n) in this case. If the partition is not perfect and the two sublists are not of equal size, i.e.
T(n) = O(n) + T(k) + T(n - 1 - k),
we still obtain O(n log n) as long as k is a constant fraction of n, just with a larger constant factor. This is because the recursion depth stays logarithmic as long as each call splits off a constant fraction of the elements.
However, in the 'worst case' no real division of the input list takes place, i.e.:
T(n) = O(n) + T(0) + T(n - 1) = O(n) + O(n - 1) + O(n - 2) + ... + O(1).
This happens, e.g., if we take the first element of a sorted list as the pivot element.
Here, T(0) means that one of the resulting sublists is empty and therefore takes no computing time (since it has zero elements). All the remaining work T(n - 1) is needed for the second sublist. In this case, we obtain O(n²).
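To make the contrast concrete, here is a small illustrative Python sketch (not a production quicksort) that always picks the first element as pivot and counts comparisons; a sorted input exhibits the quadratic worst case, while a shuffled input stays close to n log n:

import random

def quicksort(a, counter):
    # Return a sorted copy of a; counter[0] accumulates the number of comparisons.
    if len(a) <= 1:
        return a
    pivot = a[0]                                   # first element as pivot: worst case for sorted input
    counter[0] += len(a) - 1                       # count one comparison per non-pivot element, as a single-pass partition would
    left = [x for x in a[1:] if x < pivot]
    right = [x for x in a[1:] if x >= pivot]
    return quicksort(left, counter) + [pivot] + quicksort(right, counter)

n = 500
worst = [0]
quicksort(list(range(n)), worst)                   # already sorted: T(n) = O(n) + T(0) + T(n - 1)
print("sorted input:", worst[0])                   # n(n-1)/2 = 124750 comparisons

avg = [0]
data = list(range(n)); random.shuffle(data)
quicksort(data, avg)                               # random input: close to n log n comparisons
print("shuffled input:", avg[0])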
If an algorithm had no worst-case scenario, it would not only be O(f(n)) but also o(f(n)) (big-O vs. little-o notation).
The asymptotic bound is the expected behaviour as the number of operations goes to infinity; mathematically, it is just the limit as n goes to infinity. Worst-case behaviour, however, applies to a finite number of operations.
