In a top-down merge sort the recursive functions are called in this fashion:
void mergesort(Item a[], int l, int r) {
    if (r <= l) return;
    int m = (r + l) / 2;
    mergesort(a, l, m);
    mergesort(a, m + 1, r);
    merge(a, l, m, r);
}
The textbook says the space complexity of this strategy is O(n). But if we look at the recursion closely: first, we pass a pointer to the array in the recursive calls, so the array itself is never copied. Second, the recursion is resolved bottom-up, merging child ranges into their parent range, so at any moment there are only O(log n) variables (or O(log n) frames) on the stack. So how is it that the space complexity is O(n) in spite of in-place merging techniques?
So how is it that the space complexity is O(n) in spite of in-place merging techniques?
Because the implementation given in your book probably doesn't use an in-place merging technique. If an O(1)-space, O(n log n)-time sort is required, heapsort is usually preferred to merge sort since it is much easier to get right. Only when you're talking about sorting lists does an O(1)-space merge sort make sense... and then it is easy to do. Merge sort on, e.g., a linked list can be O(1) extra space and O(n log n) time.
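To illustrate why lists are the easy case: merging two already-sorted singly linked lists needs no scratch buffer at all, only pointer relinking. A minimal sketch, assuming a hypothetical Node type (a full list merge sort would split the list and call this merge on the halves):
struct Node {
    int value;
    Node* next;
};

// Merge two sorted lists by relinking nodes: O(1) extra space.
Node* merge(Node* a, Node* b) {
    Node dummy{0, nullptr};   // temporary head node to simplify relinking
    Node* tail = &dummy;
    while (a != nullptr && b != nullptr) {
        if (a->value <= b->value) { tail->next = a; a = a->next; }
        else                      { tail->next = b; b = b->next; }
        tail = tail->next;
    }
    tail->next = (a != nullptr) ? a : b;   // append whichever run is left over
    return dummy.next;
}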
The fundamental misunderstanding here seems to be this: complexities apply to algorithms, not to the problems they solve. I can write an O(n^3) merge sort if I want... that just means my algorithm is O(n^3), and it says nothing about your O(n log n) merge sort. This is a little different from computational complexity theory, where we talk about, e.g., problems being in P: a problem is in P if there is a polynomial-time algorithm for it. However, problems in P can also be solved by non-polynomial-time algorithms, and if you think about it, it's trivial to construct such an algorithm. The same goes for space complexities.
You are right that the space taken up by the recursive calls is O(log n).
But the space taken by the array itself is O(n).
The total space complexity is O(n) + O(log n).
This is O(n), because n + log n is bounded above by 2n.
How are you even going to store n items in log n space? That doesn't make sense. If you're sorting n items, O(n) space is the best you're going to get.
Since you are not allocating any space inside the mergesort function besides a few constants, the space used by the recursion itself is O(lg(n)). But your merge procedure will allocate memory in the array case, so keeping that in mind it becomes O(lg(n)) + O(n) = O(n). If you use a linked list you can avoid the scratch space inside the merge procedure, arriving at O(lg(n)) at best.
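For concreteness, a typical array merge looks something like the sketch below (the textbook's exact merge may differ); the temporary buffer aux is where the O(n) extra space comes from:
#include <vector>

// Merge the sorted runs a[l..m] and a[m+1..r]; uses O(r - l + 1) scratch space.
void merge(std::vector<int>& a, int l, int m, int r) {
    std::vector<int> aux(a.begin() + l, a.begin() + r + 1);   // copy the whole range
    int i = 0, j = m - l + 1, k = l;   // i walks the left run, j the right run (indices into aux)
    while (i <= m - l && j <= r - l)
        a[k++] = (aux[i] <= aux[j]) ? aux[i++] : aux[j++];
    while (i <= m - l) a[k++] = aux[i++];   // drain whichever run remains
    while (j <= r - l) a[k++] = aux[j++];
}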
Related
I am currently a CS major, and I am practising different algorithmic questions. I make sure I try to analyse time and space complexity for every question.
However, I have a doubt:
If there are two steps (steps which call recursive functions on inputs of varying size) in the algorithm, e.g.
int a = findAns(arr1);
int b = findAns(arr2);
return max(a, b);
Would the worst-case time complexity of this be O(N1) + O(N2), or simply O(max(N1, N2))? I ask because at any given time we would be calling the function with only a single input array.
Similarly, while calculating worst-case space complexity, if it comes out to be O(N) + O(logN): since N > logN, do we simply discard the O(logN) term and say the worst-case space complexity is O(N), even though logN also depends on N?
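(For what it's worth, the two ways of writing the time bound coincide, since for non-negative N1 and N2:
max(N1, N2) <= N1 + N2 <= 2 * max(N1, N2)
so O(N1) + O(N2) = O(N1 + N2) = O(max(N1, N2)). The same squeeze with logN <= N is why the O(logN) term can be dropped from O(N) + O(logN).)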
I have to answer the following question:
What sorting algorithm is recommended if the first n-m part
is already sorted and the remaining part m is unsorted? Are there any algorithms that take O(n log m) comparisons? What about O(m log n) comparisons?
I just can't find the solution.
My first idea was insertion sort, because it is O(n) for an almost-sorted sequence. But since we don't know the size of m, the runtime is very likely to be O(n^2) even though the sequence is partly sorted already, isn't it?
Then I thought perhaps it's quicksort, because it takes (sum from k=1 to n) Cavg(1-m) + Cavg(n-m) comparisons. But after ignoring the n-m part of the sequence, the remaining sequence in quicksort is 1-m and not m.
Merge sort and heapsort should have a runtime of O(m log m) for the remaining sequence of length m, I would say.
Does anyone have an idea or can give me some advice?
Greetings
Have you tried sorting the remaining part m separately with O(m log(m)) complexity (using any algorithm you like: MergeSort, HeapSort, QuickSort, ...) and then merging it with the already-sorted part? You won't even need to fully implement MergeSort - just a single pass of its merge step, which combines two sorted sequences.
That results in O(m*log(m) + n + m) = O(m*log(m) + n) complexity. I don't believe it is possible to find a better asymptotic complexity on a single-core CPU, although it will require an additional O(n+m) memory for the merged result array.
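A sketch of that idea (not from the original post; here the whole array has n elements and the last m of them are the unsorted part):
#include <algorithm>
#include <iterator>
#include <vector>

// Sort the unsorted tail, then merge the two sorted runs in one pass.
// Cost: O(m log m) comparisons for the sort plus O(n) for the merge.
void sortMostlySorted(std::vector<int>& a, std::size_t m) {
    std::size_t split = a.size() - m;          // a[0 .. split) is already sorted
    std::sort(a.begin() + split, a.end());     // sort the tail: O(m log m)

    std::vector<int> merged;                   // O(n) scratch space for the result
    merged.reserve(a.size());
    std::merge(a.begin(), a.begin() + split,   // sorted prefix
               a.begin() + split, a.end(),     // freshly sorted tail
               std::back_inserter(merged));    // single merge pass: O(n)
    a.swap(merged);
}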
Which sort algorithm works best on mostly sorted data?
Sounds like insertion sort and bubble sort are good candidates. You are free to implement as many as you want and then test which is faster (or does fewer operations) by supplying them with partially sorted data.
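For reference, insertion sort is the usual choice here because each element only moves as far as it is out of place, so nearly-sorted input costs close to O(n). A minimal sketch:
#include <vector>

// Insertion sort: O(n + d) where d is the number of inversions,
// so it is close to linear on mostly sorted data.
void insertionSort(std::vector<int>& a) {
    for (std::size_t i = 1; i < a.size(); ++i) {
        int key = a[i];
        std::size_t j = i;
        while (j > 0 && a[j - 1] > key) {   // shift larger elements one slot right
            a[j] = a[j - 1];
            --j;
        }
        a[j] = key;
    }
}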
Given a sequence S of n integer elements, I need a function min(i,j) that finds the minimum element of the sequence between index i and index j (both inclusive) such that:
Initialization takes O(n);
Memory space O(n);
min(i,j) takes O(log(n)).
Please suggest an algorithm for this.
A segment tree is what you need, because it fulfils all your requirements:
Initialisation takes O(n) with Segment Tree
Memory is also O(n)
Queries can be done in O(log n)
Besides this, the tree is dynamic and supports updates in O(log n). This means one can modify the value at some index i in O(log n) and still retrieve the minimum efficiently.
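A minimal iterative segment-tree sketch along those lines (0-based indices; this is one common layout, not the only way to build it):
#include <algorithm>
#include <limits>
#include <vector>

// Iterative segment tree for range-minimum queries.
// Build: O(n), memory: O(n), query and update: O(log n).
struct MinSegmentTree {
    std::size_t n;
    std::vector<int> tree;   // tree[n .. 2n) holds the leaves, tree[1] is the root

    explicit MinSegmentTree(const std::vector<int>& a)
        : n(a.size()), tree(2 * a.size(), std::numeric_limits<int>::max()) {
        std::copy(a.begin(), a.end(), tree.begin() + n);
        for (std::size_t i = n; i-- > 1; )            // build parents bottom-up
            tree[i] = std::min(tree[2 * i], tree[2 * i + 1]);
    }

    // Minimum of a[i..j], both inclusive (assumes 0 <= i <= j < n).
    int min(std::size_t i, std::size_t j) const {
        int res = std::numeric_limits<int>::max();
        for (std::size_t l = i + n, r = j + n + 1; l < r; l /= 2, r /= 2) {
            if (l & 1) res = std::min(res, tree[l++]);
            if (r & 1) res = std::min(res, tree[--r]);
        }
        return res;
    }

    // Set a[i] = value and restore the min invariant on the path to the root.
    void update(std::size_t i, int value) {
        i += n;
        tree[i] = value;
        for (i /= 2; i >= 1; i /= 2)
            tree[i] = std::min(tree[2 * i], tree[2 * i + 1]);
    }
};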
This TopCoder tutorial discusses an <O(n), O(1)> approach to your problem in more detail. In that notation, <f(n), g(n)> means the approach takes f(n) complexity to set up and g(n) complexity per query.
Also, this post works through the algorithm again: Range Minimum Query <O(n), O(1)> approach (from tree to restricted RMQ).
Hope these clarify your question :)
A segment tree is just what you need (it can be built in O(n) time and one query takes O(log n) time).
Here is an article about it: http://wcipeg.com/wiki/Segment_tree.
Even though there is an algorithm that uses O(n) time for initialization and O(1) time per query, a segment tree can be a good choice because it is much simpler.
I was going through the book "Cracking the Coding Interview" and came across the question
"Write a program to sort a stack in ascending order. You may use additional stacks to hold items, but you may not copy the elements into any other data structures (such as an array). The stack supports the following operations: push, pop, peek, isEmpty."
The book gave an answer with O(n^2) time complexity and O(n) space.
However, I came across this blog providing an answer with O(n log n) time complexity using a quicksort approach.
What I was wondering is: is the space complexity O(n^2), though? Each call to the method involves initializing another two stacks, along with making another two recursive calls.
I'm still a little shaky on space complexity. I'm not sure if this would be O(n^2) space with the new stacks spawned from each recursive call being smaller than the ones a level up.
If anyone could give a little explanation behind their answer, that would be great.
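For context, the O(n^2)-time, O(n)-space solution mentioned in the question is typically an insertion-style sort using a single auxiliary stack, roughly like the sketch below (not the book's exact code):
#include <stack>

// Sort so that the smallest element ends up on top of `input`.
// Worst-case time O(n^2); extra space O(n) for the helper stack.
void sortStack(std::stack<int>& input) {
    std::stack<int> helper;                       // kept sorted, largest on top
    while (!input.empty()) {
        int tmp = input.top();
        input.pop();
        while (!helper.empty() && helper.top() > tmp) {
            input.push(helper.top());             // move larger elements back until tmp fits
            helper.pop();
        }
        helper.push(tmp);
    }
    while (!helper.empty()) {                     // pour back: smallest ends up on top
        input.push(helper.top());
        helper.pop();
    }
}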
The space complexity is also O(n log n) in the average case. If the space complexity were O(n^2), then how could the time complexity be O(n log n), since each allocated cell needs at least one access?
So, in the average case, assuming the stack is divided in half each time, at the i-th depth of recursion each sub-stack has size O(n/2^i) and there are 2^i branches at that depth.
So the total size allocated at the i-th depth is O(n/2^i) * 2^i = O(n).
Since the maximum depth is log n, the overall space complexity is O(n log n).
However, in the worst case, the space complexity is O(n^2).
In this method of quicksort, the space complexity follows the time complexity exactly - the reason is quite simple. You are dividing the sub-stacks recursively (around the pivot) until each element sits in a stack of size one. That gives a recursion of depth x with 2^x = n, i.e. log n levels, and at the end you have n stacks, each of size one. Hence the total space complexity is O(n log n).
Keep in mind that in this case the space complexity follows the time complexity exactly because we are literally occupying new space at each step. So, in the worst case, the space complexity is O(n^2).
I am reviewing my data structures and algorithm analysis lessons, and I have a question about how to determine the space complexity of the merge sort and quicksort algorithms:
The depth of recursion is only O(log n) for linked list merge-sort
The amount of extra storage space needed for contiguous quick sort is O(n).
My thoughts:
Both use a divide-and-conquer strategy, so I would guess the space complexity of linked-list merge sort should be the same as that of contiguous quicksort. Actually I would opt for O(log n), because before every recursive call the list is divided in half.
Thanks for any pointers.
The worst case depth of recursion for quicksort is not (necessarily) O(log n), because quicksort doesn't divide the data "in half", it splits it around a pivot which may or may not be the median. It's possible to implement quicksort to address this[*], but presumably the O(n) analysis was of a basic recursive quicksort implementation, not an improved version. That would account for the discrepancy between what you say in the blockquote, and what you say under "my thoughts".
Other than that I think your analysis is sound - neither algorithm uses any extra memory other than a fixed amount per level of recursion, so depth of recursion dictates the answer.
Another possible way to account for the discrepancy, I suppose, is that the O(n) analysis is just wrong. Or, "contiguous quicksort" isn't a term I've heard before, so if it doesn't mean what I think it does ("quicksorting an array"), it might imply a quicksort that's necessarily space-inefficient in some sense, such as returning an allocated array instead of sorting in-place. But it would be silly to compare quicksort and mergesort on the basis of the depth of recursion of the mergesort vs. the size of a copy of the input for the quicksort.
[*] Specifically, instead of calling the function recursively on both parts, you put it in a loop. Make a recursive call on the smaller part, and loop around to do the bigger part, or equivalently push (pointers to) the larger part onto a stack of work to do later, and loop around to do the smaller part. Either way, you ensure that the depth of the stack never exceeds log n, because each chunk of work not put on the stack is at most half the size of the chunk before it, down to a fixed minimum (1 or 2 if you're sorting purely with quicksort).
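In code, that trick looks roughly like the sketch below (using the common Lomuto partition scheme; any correct partition works):
#include <utility>
#include <vector>

// Partition a[lo..hi] around a[hi]; returns the pivot's final index.
int partition(std::vector<int>& a, int lo, int hi) {
    int pivot = a[hi], i = lo;
    for (int j = lo; j < hi; ++j)
        if (a[j] < pivot) std::swap(a[i++], a[j]);
    std::swap(a[i], a[hi]);
    return i;
}

// Recurse only on the smaller part and loop on the larger part,
// so the stack depth is bounded by O(log n) even with bad pivots.
void quicksort(std::vector<int>& a, int lo, int hi) {
    while (lo < hi) {
        int m = partition(a, lo, hi);
        if (m - lo < hi - m) {           // left part is smaller
            quicksort(a, lo, m - 1);     // recurse on it ...
            lo = m + 1;                  // ... and loop around for the right part
        } else {                         // right part is smaller (or equal)
            quicksort(a, m + 1, hi);
            hi = m - 1;                  // loop around for the left part
        }
    }
}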
I'm not really familiar with the term "contiguous quicksort". But quicksort can have either O(n) or O(log n) space complexity depending on how it is implemented.
If it is implemented as follows:
quicksort(start, stop) {
    m = partition(start, stop);
    quicksort(start, m - 1);
    quicksort(m + 1, stop);
}
Then the worst-case space complexity is O(n), not O(log n) as is sometimes assumed.
This is because the extra space is the depth of the recursion stack (the first recursive call returns before the second one starts, so the two calls never occupy the stack at the same time). If the partitioning divides the array into two equal parts every time (best case), the depth satisfies the recurrence
S(n) = S(n/2) + O(1)
which gives S(n) = O(log n). But if the pivot always ends up at one end of the range (worst case), the depth satisfies S(n) = S(n-1) + O(1), which gives S(n) = O(n).
If we replace the second quicksort call with tail recursion (a loop) and always recurse on the smaller part, as described in the previous answer, the depth - and hence the stack space - is O(log n) even in the worst case.
Perhaps the "contiguous quicksort" refers to the first implementation because the two quicksort calls are next to each other, in which case the space complexity is O(n).