Big Oh Complexity of Merge Sort - algorithm

I had a lecture on Big Oh for Merge Sort and I'm confused.
What was shown is:
0 Merges [<----- n -------->] = n
1 Merge [<--n/2--][-n/2--->] = (n/2 + n/2) = n
2 Merges [n/4][n/4][n/4][n/4] = 2(n/4 + n/4) = n
....
log(n) merges = n
Total = (n + n + n + ... + n) = lg n
= O(n log n)
I don't understand why (n + n + ... + n) can also be expressed as log base 2 of n and how they got for 2 merges = 2(n/4 + n/4)

In the case of 1 merge, you have two sub arrays to be sorted where each sub-array will take time proportional to n/2 to be sorted. In that sense, to sort those two sub-arrays you need a time proportional to n.
Similarly, when you are doing 2 merges, there are 4 sub arrays to be sorted where each will be taking a time proportional to n/4 which will again sum up to n.
Similarly, if you have n merges, it will take a time proportional to n to sort all the sub-arrays. In that sense, we can write the time taken by merge sort as follows.
T(n) = 2 * T(n/2) + n
You will understand that this recursive call can go deep (say to a height of h) until n/(2^h) = 1. By taking log here, we get h=log(n). That is how log(n) came to the scene. Here log is taken from base 2.
Since you have log(n) steps where each step takes a time proportional to n, total time taken can be expressed as,
n * log(n)
In big O notations, we give this as an upper bound O(nlog(n)). Hope you got the idea.
Following image of the recursion tree will enlighten you further.

The last line of the following part written in your question,
0 Merges [<----- n -------->] = n
1 Merge [<--n/2--][-n/2--->] = (n/2 + n/2) = n
2 Merges [n/4][n/4][n/4][n/4] = 2(n/4 + n/4) = n
....
n merges = n --This line is incorrect!
is wrong. You will not have total n merges of size n, but Log n merges of size n.
At every level, you divide the problem size into 2 problems of half the size. As you continue diving, the total divisions that you can do is Log n. (How? Let's say total divisions possible is x. Then n = 2x or x = Log2n.)
Since at each level you do a total work of O(n), therefore for Log n levels, the sum total of all work done will be O(n Log n).

You've got a deep of log(n) and a width of n for your tree. :)

The log portion is the result of "how many times can I split my data in two before I have only one element left?" This is the depth of your recursion tree. The multiple of n comes from the fact that for each of those levels in the tree you'll look at every element in your data set once after all merge steps at that level.
recurse downwards:
n unsorted elements
[n/2][n/2] split until singletons...
...
merge n elements at each step when recursing back up
[][][]...[][][]
[ ] ... [ ]
...
[n/2][n/2]
n sorted elements

It's very simple. Each merge takes O(n) as you demonstrated. The number of merges you need to do is log n (base 2), because each merge doubles the size of the sorted sections.

Related

If stack operations are constant time O(1), what is the time complexity of this algorithm?

BinaryConversion:
We are inputting a positive integer n with the output being a binary representation of n on a stack.
What would the time complexity here be? I'm thinking it's O(n) as the while loop halves every time, meaning the iterations for a set of inputs size 'n' decrease to n/2, n/4, n/8 etc.
Applying sum of geometric series whereby n = a and r = 1/2, we get 2n.
Any help appreciated ! I'm still a noob.
create empty stack S
while n > 0 do
push (n mod 2) onto S
n = floor(n / 2)
end while
return S
If the loop was
while n>0:
for i in range n:
# some action
n = n/2
Then the complexity would have been O(n + n/2 + n/4 ... 1) ~ O(n), and your answer would have been correct.
while n > 0 do
# some action
n = n / 2
Here however, the complexity will should be the number of times the outer loop runs, since the amount of work done in each iteration is O(1). So the answer will be O(log(n)) (since n is getting halved each time).
The number of iterations is the number of times you have to divide n by 2 to get 0, which is O(log n).

What if we split the array in merge sort into 4 parts?? or eight parts?

I came across this question in one of the slides of Stanford, that what would be the effect on the complexity of the code of merge sort if we split the array into 4 or 8 instead of 2.
It would be the same: O(n log n). You will have a shorter tree and the base of the logarithm will change, but that doesn't matter for big-oh, because a logarithm in a base a differs from a logarithm in base b by a constant:
log_a(x) = log_b(x) / log_b(a)
1 / log_b(a) = constant
And big-oh ignores constants.
You will still have to do O(n) work per tree level in order to merge the 4 or 8 or however many parts, which, combined with more recursive calls, might just make the whole thing even slower in practice.
In general, you can split your array into equal size subarrays of any size and then sort the subarrays recursively, and then use a min-heap to keep extracting the next smallest element from the collection of sorted subarrays. If the number of subarrays you break into is constant, then the execution time for each min-heap per operation is constant, so you arrive at the same O(n log n) time.
Intuitively it would be the same as there is no much difference between splitting the array into two parts and then doing it again or splitting it to 4 parts from the beginning.
A more official proof by induction based on this (I'll assume that the array is split into k):
Definitions:
Let T(N) - number of array stores to mergesort of input of size N
Then mergesort recurrence T(N) = k*T(N/k) + N (for N > 1, T(1) = 0)
Claim:
If T(N) satisfies the recurrence above then T(N) = Nlg(N)
Note - all the logarithms below are on base k
Proof:
Base case: N=1
Inductive hypothesis: T(N) = NlgN
Goal: show that T(kN) = kN(lg(kN))
T(kN) = kT(N) + kN [mergesort recurrence]
= kNlgN + kN [inductive hypothesis]
= kN(lg[(kN/k)] [algebra]
= kN(lg(kN) - lgk) [algebra]
= kN(lg(kN) - 1) + kN [algebra - for base k, lg(k )= 1]
= kNlg(kN) [QED]

Worst case time complexity for this stupid sort?

The code looks like:
for (int i = 1; i < N; i++) {
if (a[i] < a[i-1]) {
swap(i, i-1);
i = 0;
}
}
After trying out a few things i figure the worst case is when the input array is in descending order. Then looks like the compares will be maximum and hence we will consider only compares. Then it seems it would be a sum of sums, i.e sum of ... {1+2+3+...+(n-1)}+{1+2+3+...+(n-2)}+{1+2+3+...+(n-3)}+ .... + 1 if so what would be O(n) ?
If I am not on the right path can someone point out what O(n) would be and how can it be derived? cheers!
For starters, the summation
(1 + 2 + 3 + ... + n) + (1 + 2 + 3 + ... + n - 1) + ... + 1
is not actually O(n). Instead, it's O(n3). You can see this because the sum 1 + 2 + ... + n = O(n2, and there are n copies of each of them. You can more properly show that this summation is Θ(n3) by looking at the first n / 2 of these terms. Each of those terms is at least 1 + 2 + 3 + ... + n / 2 = Θ(n2), so there are n / 2 copies of something that's Θ(n2), giving a tight bound of Θ(n3).
We can upper-bound the total runtime of this algorithm at O(n3) by noting that every swap decreases the number of inversions in the array by one (an inversion is a pair of elements out of place). There can be at most O(n2) inversions in an array and a sorted array has no inversions in it (do you see why?), so there are at most O(n2) passes over the array and each takes at most O(n) work. That collectively gives a bound of O(n3).
Therefore, the Θ(n3) worst-case runtime you've identified is asymptotically tight, so the algorithm runs in time O(n3) and has worst-case runtime Θ(n3).
Hope this helps!
It does one iteration of the list per swap. The maximum number of swaps necessary is O(n * n) for a reversed list. Doing each iteration is O(n).
Therefore the algorithm is O(n * n * n).
This is one half of the infamous Bubble Sort, which has a O(N^2). This partial sort has O(N) because the For loop goes from 1 to N. After one iteration, you will end up with the largest element at the end of the list and the rest of the list in some changed order. To be a proper Bubble Sort, it needs another loop inside this one to iterate j from 1 to N-i and do the same thing. The If goes inside the inner loop.
Now you have two loops, one inside the other, and they both go from 1 to N (sort of). You will have N * N or N^2 iterations. Thus O(N^2) for the Bubble Sort.
Now you have take your next step as a programmer: Finish writing the Bubble Sort and make it work correctly. Try it with different lengths of list a and see how long it takes. Then never use it again. ;-)

Running time of merge sort on linked lists?

i came across this piece of code to perform merge sort on a link list..
the author claims that it runs in time O(nlog n)..
here is the link for it...
http://www.geeksforgeeks.org/merge-sort-for-linked-list/
my claim is that it takes atleast O(n^2) time...and here is my argument...
look, you divide the list(be it array or linked list), log n times(refer to recursion tree), during each partition, given a list of size i=n, n/2, ..., n/2^k, we would take O(i) time to partition the original/already divided list..since sigma O(i)= O(n),we can say , we take O(n) time to partition for any given call of partition(sloppily), so given the time taken to perform a single partition, the question now arises as to how many partitions are going to happen all in all, we observe that the number of partitions at each level i is equal to 2^i , so summing 2^0+2^1+....+2^(lg n ) gives us [2(lg n)-1] as the sum which is nothing but (n-1) on simplification , implying that we call partition n-1, (let's approximate it to n), times so , the complexity is atleast big omega of n^2..
if i am wrong, please let me know where...thanks:)
and then after some retrospection , i applied master method to the recurrence relation where i replaced theta of 1 which is there for the conventional merge sort on arrays with theta of n for this type of merge sort (because the divide and combine operations take theta of n time each), the running time turned out to be theta of (n lg n)...
also i noticed that the cost at each level is n (because 2 power i * (n/(2pow i)))...is the time taken for each level...so its theta of n at each level* lg n levels..implying that its theta of (n lg n).....did i just solve my own question??pls help i am kinda confused myself..
The recursive complexity definition for an input list of size n is
T(n) = O(n) + 2 * T(n / 2)
Expanding this we get:
T(n) = O(n) + 2 * (O(n / 2) + 2 * T(n / 4))
= O(n) + O(n) + 4 * T(n / 4)
Expanding again we get:
T(n) = O(n) + O(n) + O(n) + 8 * T(n / 8)
Clearly there is a pattern here. Since we can repeat this expansion exactly O(log n) times, we have
T(n) = O(n) + O(n) + ... + O(n) (O(log n) terms)
= O(n log n)
You are performing a sum twice for some weird reason.
To split and merge a linked list of size n, it takes O(n) time. The depth of recursion is O(log n)
Your argument was that a splitting step takes O(i) time and sum of split steps become O(n) And then you call it the time taken to perform only one split.
Instead, lets consider this, a problem of size n forms two n/2 problems, four n/4 problems eight n/8 and so on until 2^log n n/2^logn subproblems are formed. Sum these up you get O(nlogn) to perform splits.
Another O(nlogn) to combine sub problems.

Average Runtime of Quickselect

Wikipedia states that the average runtime of quickselect algorithm (Link) is O(n). However, I could not clearly understand how this is so. Could anyone explain to me (via recurrence relation + master method usage) as to how the average runtime is O(n)?
Because
we already know which partition our desired element lies in.
We do not need to sort (by doing partition on) all the elements, but only do operation on the partition we need.
As in quick sort, we have to do partition in halves *, and then in halves of a half, but this time, we only need to do the next round partition in one single partition (half) of the two where the element is expected to lie in.
It is like (not very accurate)
n + 1/2 n + 1/4 n + 1/8 n + ..... < 2 n
So it is O(n).
Half is used for convenience, the actual partition is not exact 50%.
To do an average case analysis of quickselect one has to consider how likely it is that two elements are compared during the algorithm for every pair of elements and assuming a random pivoting. From this we can derive the average number of comparisons. Unfortunately the analysis I will show requires some longer calculations but it is a clean average case analysis as opposed to the current answers.
Let's assume the field we want to select the k-th smallest element from is a random permutation of [1,...,n]. The pivot elements we choose during the course of the algorithm can also be seen as a given random permutation. During the algorithm we then always pick the next feasible pivot from this permutation therefore they are chosen uniform at random as every element has the same probability of occurring as the next feasible element in the random permutation.
There is one simple, yet very important, observation: We only compare two elements i and j (with i<j) if and only if one of them is chosen as first pivot element from the range [min(k,i), max(k,j)]. If another element from this range is chosen first then they will never be compared because we continue searching in a sub-field where at least one of the elements i,j is not contained in.
Because of the above observation and the fact that the pivots are chosen uniform at random the probability of a comparison between i and j is:
2/(max(k,j) - min(k,i) + 1)
(Two events out of max(k,j) - min(k,i) + 1 possibilities.)
We split the analysis in three parts:
max(k,j) = k, therefore i < j <= k
min(k,i) = k, therefore k <= i < j
min(k,i) = i and max(k,j) = j, therefore i < k < j
In the third case the less-equal signs are omitted because we already consider those cases in the first two cases.
Now let's get our hands a little dirty on calculations. We just sum up all the probabilities as this gives the expected number of comparisons.
Case 1
Case 2
Similar to case 1 so this remains as an exercise. ;)
Case 3
We use H_r for the r-th harmonic number which grows approximately like ln(r).
Conclusion
All three cases need a linear number of expected comparisons. This shows that quickselect indeed has an expected runtime in O(n). Note that - as already mentioned - the worst case is in O(n^2).
Note: The idea of this proof is not mine. I think that's roughly the standard average case analysis of quickselect.
If there are any errors please let me know.
In quickselect, as specified, we apply recursion on only one half of the partition.
Average Case Analysis:
First Step: T(n) = cn + T(n/2)
where, cn = time to perform partition, where c is any constant(doesn't matter). T(n/2) = applying recursion on one half of the partition.Since it's an average case we assume that the partition was the median.
As we keep on doing recursion, we get the following set of equation:
T(n/2) = cn/2 + T(n/4) T(n/4) = cn/2 + T(n/8) .. . T(2) = c.2 + T(1) T(1) = c.1 + ...
Summing the equations and cross-cancelling like values produces a linear result.
c(n + n/2 + n/4 + ... + 2 + 1) = c(2n) //sum of a GP
Hence, it's O(n)
I also felt very conflicted at first when I read that the average time complexity of quickselect is O(n) while we break the list in half each time (like binary search or quicksort). It turns out that breaking the search space in half each time doesn't guarantee an O(log n) or O(n log n) runtime. What makes quicksort O(n log n) and quickselect is O(n) is that we always need to explore all branches of the recursive tree for quicksort and only a single branch for quickselect. Let's compare the time complexity recurrence relations of quicksort and quickselect to prove my point.
Quicksort:
T(n) = n + 2T(n/2)
= n + 2(n/2 + 2T(n/4))
= n + 2(n/2) + 4T(n/4)
= n + 2(n/2) + 4(n/4) + ... + n(n/n)
= 2^0(n/2^0) + 2^1(n/2^1) + ... + 2^log2(n)(n/2^log2(n))
= n (log2(n) + 1) (since we are adding n to itself log2 + 1 times)
Quickselect:
T(n) = n + T(n/2)
= n + n/2 + T(n/4)
= n + n/2 + n/4 + ... n/n
= n(1 + 1/2 + 1/4 + ... + 1/2^log2(n))
= n (1/(1-(1/2))) = 2n (by geometric series)
I hope this convinces you why the average runtime of quickselect is O(n).

Resources