Sort Stack Ascending Order (Space Analysis)

I was going through the book "Cracking the Coding Interview" and came across the question
"Write a program to sort a stack in ascending order. You may use additional stacks to hold items, but you may not copy the elements into any other data structures (such as an array). The stack supports the following operations: push, pop, peek, isEmpty."
The book gave an answer with O(n^2) time complexity and O(n) space.
However, I came across this blog, which provides an answer with O(n log n) time complexity using a quicksort approach.
What I was wondering is whether the space complexity is O(n^2), since each call to the method initializes another two stacks and makes another two recursive calls.
I'm still a little shaky on space complexity. I'm not sure whether this works out to O(n^2) space, given that the new stacks spawned by each recursive call are smaller than the ones a level up.
If anyone could give a little explanation behind their answer, that would be great.

The space complexity is also O(n log n) in the average case. If the space complexity were O(n^2), the time complexity could not be O(n log n), since each unit of space allocated needs at least one access.
In the average case, assuming the stack is divided in half each time, at the i-th depth of recursion each sub-stack has size O(n/2^i), and there are 2^i recursion branches at that depth.
So the total size allocated at the i-th depth is O(n/2^i) * 2^i = O(n).
Since the maximum depth is log n, the overall space complexity is O(n log n).
However, in the worst case, the space complexity is O(n^2).

In this method of quicksort, the space complexity exactly follows the time complexity, and the reason is quite simple. You divide the sub-stacks recursively (around the pivot) until each element sits in a stack of size one. That takes x = log n levels of division (since 2^x = n), and at the end you have n stacks, each of size one. Hence the total space complexity is O(n log n).
Keep in mind that in this case the space complexity follows the time complexity exactly because we are literally occupying new space at each level. So, in the worst case, the space complexity is O(n^2).
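For concreteness, here is a minimal sketch (in Java, not the blog's code; the helper names and pivot choice are my own) of the kind of recursive quicksort on stacks being analyzed. Each call pops a pivot, partitions the rest into two freshly allocated auxiliary stacks, sorts those recursively, and pushes everything back; the two new stacks per call are exactly the allocations the question is about:
import java.util.ArrayDeque;
import java.util.Deque;

public class StackQuicksort {
    // Sorts the stack so that popping yields ascending order when
    // smallestOnTop is true, and descending order when it is false.
    static void quicksort(Deque<Integer> s, boolean smallestOnTop) {
        if (s.size() <= 1) return;
        int pivot = s.pop();                       // pivot = current top
        Deque<Integer> less = new ArrayDeque<>();  // two auxiliary stacks
        Deque<Integer> geq  = new ArrayDeque<>();  // allocated per call
        while (!s.isEmpty()) {                     // partition around the pivot
            int x = s.pop();
            if (x < pivot) less.push(x); else geq.push(x);
        }
        // Sort the halves in the opposite orientation so that popping them
        // pushes the elements back onto s in the right order.
        quicksort(less, !smallestOnTop);
        quicksort(geq, !smallestOnTop);
        if (smallestOnTop) {
            while (!geq.isEmpty()) s.push(geq.pop());   // largest ends up at the bottom
            s.push(pivot);
            while (!less.isEmpty()) s.push(less.pop()); // smallest ends up on top
        } else {
            while (!less.isEmpty()) s.push(less.pop()); // smallest ends up at the bottom
            s.push(pivot);
            while (!geq.isEmpty()) s.push(geq.pop());   // largest ends up on top
        }
    }
}
All the calls at one level of the recursion tree work on disjoint parts of the data, so the stacks they allocate hold O(n) elements in total per level, which is the "O(n) per depth, log n depths on average" accounting used in the answers above.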

How to know which function has complexity of log n

I was learning about time complexities and got stuck at O(log n), because I was unable to identify which functions have a complexity of log n, whereas complexities such as O(n), O(n^2), or O(n^3) can be easily identified by counting the number of for loops in the function.
You want to look at two things:
How many times does the loop iterate? (depth)
How much of the array does it access during each iteration? (breadth)
For the depth:
If each iteration divides the number of remaining iterations by some amount (often 2), there are probably log(n) iterations, so the depth is O(log(n)). The exact value it's divided by doesn't matter for big O, since log_2(n), log_e(n), log_10(n), etc. are all constant multiples of each other.
If it iterates a fixed number of times, it's O(1).
If it iterates n times (or a constant multiple of that), it's O(n).
For the breadth, ask how many elements of the original array it needs to look at each iteration.
If that number doesn't depend on the size at all, breadth is O(1) (e.g. in a binary search, we only look at a single element each iteration, regardless of the array size).
If you look at the entire array, or some constant fraction of it, e.g. n/2, the breadth is O(n). This is often the case for the good sorting algorithms. (These generally work by recursion rather than looping, which means the depth is depth of recursion rather than number of iterations. For these, you ask how much of the array is accessed collectively by all recursive calls at the same layer. If you haven't learned recursion yet, feel free to ignore this parenthetical for now.)
Once you have the big O estimates of breadth and depth, just multiply them together.
Binary search of a sorted array has depth O(log(n)) and breadth O(1), so it's O(log(n)).
Merge sort has depth O(log(n)) and breadth O(n), so it's O(n log(n)).
Adding all the numbers in an array has depth O(n) and breadth O(1), so it's O(n).
Adding two numbers has depth O(1) and breadth O(1), so it's O(1).
There are complications in practice, of course (usually for the recursive cases), but these heuristics will get you started. A technique that might be useful for the more complicated cases is sketching out the elements that are accessed by each iteration/recursive call. Depth vertically, breadth horizontally. As long as you don't have multiple function calls accessing the same memory in the same row, you can usually see what's happening well enough to add things up.
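As a concrete illustration of the depth-times-breadth heuristic, here is a standard iterative binary search (a generic Java sketch, not tied to any particular code above): the loop halves the remaining range each time, so the depth is O(log(n)), and each iteration inspects a single element, so the breadth is O(1), giving O(log(n)) overall.
public class BinarySearch {
    // Returns the index of target in the sorted array a, or -1 if it is absent.
    static int indexOf(int[] a, int target) {
        int lo = 0, hi = a.length - 1;
        while (lo <= hi) {                     // depth: the range halves, so O(log n) iterations
            int mid = lo + (hi - lo) / 2;      // written this way to avoid int overflow
            if (a[mid] == target) return mid;  // breadth: one element examined per iteration
            if (a[mid] < target) lo = mid + 1; // discard the left half
            else hi = mid - 1;                 // discard the right half
        }
        return -1;
    }
}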

What is the time complexity of clearing a heap?

I have googled lots of websites, and they all say "the time complexity of clearing a heap is O(n log n)." The reason given is:
Swapping the tail node with the root costs O(1).
Sifting the new root down to a suitable place costs O(level) = O(log n).
So deleting a node (the root) costs O(log n).
So deleting all n nodes costs O(n log n).
In my opinion, the answer is right but not "tight", because:
The heap (and hence its height) becomes smaller during deletion.
As a result, the cost of sifting the new root down to a suitable place becomes smaller.
The aforementioned reasoning for O(n log n) does not account for this change.
The time complexity of creating a heap is proven to be O(n) here.
I tend to believe that the time complexity of clearing a heap is O(n) as well, because creating and clearing are very similar: both involve moving nodes to suitable positions while the heap size changes.
However, if clearing a heap took only O(n) time, there would be a contradiction:
By creating and then clearing a heap, it would be possible to sort an array in O(n) time.
The lower bound on the time complexity of comparison-based sorting is Ω(n log n).
I have thought about this question for a whole day but am still confused.
What does clearing a heap actually cost, and why?
As you correctly observe, the time taken is O(log n + log(n-1) + ... + log 2 + log 1). That's the same as O(log(n!)), which is the same as O(n log n) (proof in many places, but for example: What is O(log(n!)) and O(n!) and Stirling Approximation).
So you're right that the argument given for the time complexity of removing every element of a heap being O(n log n) is not tight as stated, but the result is still right.
Your claimed equivalence between creating and "clearing" the heap is wrong. When you create the heap, there's a lot of slack, because the heap invariant allows many valid arrangements at every level, and this happens to mean that it's possible to find a valid ordering of the elements in O(n) time. When "clearing" the heap, there's no such slack: the output order is completely determined, and the standard proof that comparison sorts need at least n log n time shows that clearing (after an O(n) build) cannot be done in O(n).
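To make the cost concrete, here is a sketch of clearing an array-based max-heap (my own helper names, assuming the usual "root at index 0, children at 2i+1 and 2i+2" layout): each round swaps the root with the last element of the shrinking heap and sifts the new root down, which costs O(log of the current size), giving exactly the log n + log(n-1) + ... + log 1 sum discussed above.
public class HeapClear {
    // Repeatedly removes the maximum of a binary max-heap stored in a[0..size-1].
    // Afterwards the array is sorted ascending (this is the extraction phase of heapsort).
    static void clear(int[] a, int size) {
        while (size > 1) {
            int t = a[0]; a[0] = a[size - 1]; a[size - 1] = t; // move the max to the end
            size--;                                            // the heap shrinks by one
            siftDown(a, 0, size);                              // O(log(current size))
        }
    }

    // Moves a[i] down until the max-heap property holds again within a[0..size-1].
    static void siftDown(int[] a, int i, int size) {
        while (2 * i + 1 < size) {
            int child = 2 * i + 1;                                     // left child
            if (child + 1 < size && a[child + 1] > a[child]) child++;  // pick the larger child
            if (a[i] >= a[child]) break;                               // heap property restored
            int t = a[i]; a[i] = a[child]; a[child] = t;
            i = child;
        }
    }
}
Run on a valid max-heap, this leaves the array sorted; combined with the O(n) heap construction, an O(n) clearing step would therefore give an O(n) comparison sort, contradicting the lower bound mentioned above.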

Difference in Space Complexity of different sorting algorithms

I am trying to understand Space Complexities of different sorting algorithms.
http://bigocheatsheet.com/?goback=.gde_98713_member_241501229
From the above link I found that the complexity of
bubble sort, insertion sort, and selection sort is O(1),
whereas quicksort is O(log(n)) and merge sort is O(n).
We are not actually allocating extra memory in any of these algorithms.
Then why are the space complexities different when we are using the same array to sort in each case?
When you run code, memory is assigned in two ways:
Implicitly, as you set up function calls.
Explicitly, as you create chunks of memory.
Quicksort is a good example of implicit use of memory. While I'm doing a quicksort, I'm recursively calling myself O(n) times in the worst case and O(log(n)) times in the average case. Those recursive calls each take O(1) space to keep track of, leading to an O(n) worst case and an O(log(n)) average case.
Mergesort is a good example of explicit use of memory. I take two blocks of sorted data, create a place to put the merge, and then merge from those two blocks into that new space. Creating a place to put the merge means allocating O(n) data.
To get down to O(1) memory you need to both not assign memory, AND not call yourself recursively. This is true of all of bubble, insertion and selection sorts.
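To make the explicit case concrete, here is a sketch of the merge step used by a typical top-down mergesort (generic Java, not taken from any particular library): the temporary buffer allocated inside it is the "place to put the merge" and is where the O(n) extra space goes.
public class MergeStep {
    // Merges the sorted ranges a[lo..mid] and a[mid+1..hi] back into a.
    static void merge(int[] a, int lo, int mid, int hi) {
        int[] buf = new int[hi - lo + 1];            // explicit extra memory
        int i = lo, j = mid + 1, k = 0;
        while (i <= mid && j <= hi)                  // take the smaller head element each time
            buf[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
        while (i <= mid) buf[k++] = a[i++];          // copy whatever is left of the first run
        while (j <= hi)  buf[k++] = a[j++];          // copy whatever is left of the second run
        System.arraycopy(buf, 0, a, lo, buf.length); // write the merged run back in place
    }
}
The recursive quicksort calls, by contrast, allocate nothing like this buffer; their memory cost is only the implicit stack frame per call described above.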
It's important to keep in mind that there are a lot of different ways to implement each of these algorithms, and each different implementation has a different associated space complexity.
Let's start with merge sort. The most common implementation of mergesort on arrays works by allocating an external buffer in which to perform the merges of the individual ranges. This requires space to hold all the elements of the array, which takes extra space Θ(n). However, you could alternatively use an in-place merge for each merge, which means that the only extra space you'd need would be space for the stack frames of the recursive calls, dropping the space complexity down to Θ(log n), but increasing the runtime of the algorithm by a large constant factor. You could alternatively do a bottom-up mergesort using in-place merging, which requires only O(1) extra space but with a higher constant factor.
On the other hand, if you're merge sorting linked lists, then the space complexity is going to be quite different. You can merge linked lists in space O(1) because the elements themselves can easily be rewired. This means that the space complexity of merge sorting linked lists is Θ(log n) from the space needed to store the stack frames for the recursive calls.
Let's look at quicksort as another example. Quicksort doesn't normally allocate any external memory, but it does need space for the stack frames it uses. A naive implementation of quicksort might need space Θ(n) in the worst case for stack frames if the pivots always end up being the largest or smallest element of the array, since in that case you keep recursively calling the function on arrays of size n - 1, n - 2, n - 3, etc. However, there's a standard optimization you can perform that's essentially tail-call elimination: you recursively invoke quicksort on the smaller of the two halves of the array, then reuse the stack space from the current call for the larger half. This means that you only allocate new memory for a recursive call on subarrays of size at most n / 2, then n / 4, then n / 8, etc. so the space usage drops to O(log n).
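Here is a sketch of that optimization (generic code with my own names, using a standard Lomuto partition for concreteness): the function loops on the larger side and only recurses into the smaller side, so each recursive call handles at most half of the current range and the stack depth stays O(log n) even when the pivots are bad.
public class QuicksortSmallerFirst {
    // Sorts a[lo..hi]. Recursing only into the smaller partition bounds the
    // recursion depth by O(log n); the larger partition is handled by the loop.
    static void sort(int[] a, int lo, int hi) {
        while (lo < hi) {
            int p = partition(a, lo, hi);   // pivot ends up at index p
            if (p - lo < hi - p) {
                sort(a, lo, p - 1);         // smaller side: recurse
                lo = p + 1;                 // larger side: continue in this frame
            } else {
                sort(a, p + 1, hi);         // smaller side: recurse
                hi = p - 1;                 // larger side: continue in this frame
            }
        }
    }

    // Lomuto partition: uses a[hi] as the pivot and returns its final position.
    static int partition(int[] a, int lo, int hi) {
        int pivot = a[hi], i = lo;
        for (int j = lo; j < hi; j++) {
            if (a[j] < pivot) { int t = a[i]; a[i] = a[j]; a[j] = t; i++; }
        }
        int t = a[i]; a[i] = a[hi]; a[hi] = t;
        return i;
    }
}
The while loop plays the role of the eliminated tail call, so no new stack frame is spent on the larger half.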
I'll assume the array we're sorting is passed by reference, and I'm assuming the space for the array does not count in the space complexity analysis.
The space complexity of quicksort can be made O(n) (and expected O(log n) for randomized quicksort) with clever implementation: e.g. don't copy the whole sub-arrays, but just pass on indexes.
The O(n) for quicksort comes from the fact that the number of "nested" recursive calls can be O(n): think of what happens if you keep making unlucky choices for the pivot. While each stack frame takes O(1) space, there can be O(n) stack frames. The expected depth (i.e. expected stack space) is O(log n) if we're talking about randomized quicksort.
For merge sort I'd expect the space complexity to be O(log n) because you make at most O(log n) "nested" recursive calls.
The results you're citing also count the space taken by the arrays: in that case the space complexity of merge sort is O(log n) for stack space plus O(n) for the array, which gives O(n) total space complexity. For quicksort it is O(n) + O(n) = O(n).

Analysis of speed and memory for heapsort

I tried googling and wiki'ing these questions but can't seem to find concrete answers. Most of what I found involved using proofs with the master theorem, but I'm hoping for something in plain English that can be more intuitively remembered. Also I am not in school and these questions are for interviewing.
MEMORY:
What exactly does it mean to determine big-O in terms of memory usage? For example, why is heapsort considered to run with O(1) memory when you have to store all n items? Is it because you are creating only one structure for the heap? Or is it because you know its size and so you can create it on the stack, which is always constant memory usage?
SPEED:
How is the creation of the heap done in O(n) time if adding an element is done in O(1) but percolating is done in O(log n)? Wouldn't that mean you do n inserts at O(1), making it O(n), and percolate after each insert at O(log n), so O(n) * O(log n) = O(n log n) in total? I also noticed most implementations of heap sort use a heapify function instead of percolating to create the heap. Since heapify does n comparisons at O(log n), that would be O(n log n), and with n inserts at O(1) we would get O(n) + O(n log n) = O(n log n)? Wouldn't the first approach yield better performance than the second for small n?
I kind of assumed this above, but is it true that doing an O(1) operation n times would result in O(n) time? Or does n * O(1) = O(1)?
So I found some useful info about building a binary heap from wikipedia: http://en.wikipedia.org/wiki/Binary_heap#Building_a_heap.
I think my main source of confusion was how "inserting" into a heap can be both O(1) and O(log n), even though the first shouldn't really be called an insertion, just a build step or something. So you wouldn't use heapify any more after you've already created your heap; instead you'd use the O(log n) insertion method.
The method of adding items iteratively while maintaining the heap property runs in O(n log n). Creating the heap without respecting the heap property and then heapifying it actually runs in O(n); the reason isn't very intuitive and requires a proof, so I was wrong about that.
The removal step to get the ordered items is the same cost, O(n log n), after either method has produced a heap that respects the heap property.
So in the end you'd have either O(1) + O(n) + O(n log n) = O(n log n) for the build-heap method, or O(n log n) + O(n log n) = O(n log n) for the insertion method. Obviously the first is preferable, especially for small n.
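A sketch of the two build strategies being compared (array-based max-heap, helper names mine): buildByInsertion treats each successive element as a new insert and sifts it up, which is O(n log n) in the worst case, while heapify takes the unordered array as-is and sifts down from the last internal node, which totals O(n).
public class HeapBuild {
    // O(n log n) worst case: grow the heap one element at a time, sifting each new element up.
    static void buildByInsertion(int[] a) {
        for (int size = 2; size <= a.length; size++) {
            int i = size - 1;                          // index of the newly added element
            while (i > 0 && a[(i - 1) / 2] < a[i]) {   // sift up while bigger than its parent
                int t = a[i]; a[i] = a[(i - 1) / 2]; a[(i - 1) / 2] = t;
                i = (i - 1) / 2;
            }
        }
    }

    // O(n) total: treat the whole array as an unordered heap and sift down from the
    // last internal node; most nodes sit near the bottom, so most sift-downs are cheap.
    static void heapify(int[] a) {
        for (int i = a.length / 2 - 1; i >= 0; i--) siftDown(a, i, a.length);
    }

    static void siftDown(int[] a, int i, int size) {
        while (2 * i + 1 < size) {
            int child = 2 * i + 1;
            if (child + 1 < size && a[child + 1] > a[child]) child++; // pick the larger child
            if (a[i] >= a[child]) break;
            int t = a[i]; a[i] = a[child]; a[child] = t;
            i = child;
        }
    }
}
Either way, the removal phase that produces the sorted output still costs O(n log n), so the asymptotic total for heapsort is O(n log n) in both cases, as noted above.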

How to compute the algorithmic space complexity

I am reviewing my data structures and algorithm analysis lessons, and I have a question about how to determine the space complexity of the merge sort and quicksort algorithms.
The depth of recursion is only O(log n) for linked list merge-sort
The amount of extra storage space needed for contiguous quick sort is O(n).
My thoughts:
Both use a divide-and-conquer strategy, so I would guess that the space complexity of linked-list merge sort should be the same as that of contiguous quicksort. Actually, I opt for O(log n), because before every iteration or recursive call the list is divided in half.
Thanks for any pointers.
The worst-case depth of recursion for quicksort is not (necessarily) O(log n), because quicksort doesn't divide the data "in half"; it splits it around a pivot, which may or may not be the median. It's possible to implement quicksort to address this[*], but presumably the O(n) analysis was of a basic recursive quicksort implementation, not an improved version. That would account for the discrepancy between what you say in the blockquote and what you say under "my thoughts".
Other than that I think your analysis is sound - neither algorithm uses any extra memory other than a fixed amount per level of recursion, so depth of recursion dictates the answer.
Another possible way to account for the discrepancy, I suppose, is that the O(n) analysis is just wrong. Or, "contiguous quicksort" isn't a term I've heard before, so if it doesn't mean what I think it does ("quicksorting an array"), it might imply a quicksort that's necessarily space-inefficient in some sense, such as returning an allocated array instead of sorting in-place. But it would be silly to compare quicksort and mergesort on the basis of the depth of recursion of the mergesort vs. the size of a copy of the input for the quicksort.
[*] Specifically, instead of calling the function recursively on both parts, you put it in a loop. Make a recursive call on the smaller part, and loop around to do the bigger part, or equivalently push (pointers to) the larger part onto a stack of work to do later, and loop around to do the smaller part. Either way, you ensure that the depth of the stack never exceeds log n, because each chunk of work not put on the stack is at most half the size of the chunk before it, down to a fixed minimum (1 or 2 if you're sorting purely with quicksort).
I'm not really familiar with the term "contiguous quicksort". But quicksort can have either O(n) or O(log n) space complexity depending on how it is implemented.
If it is implemented as follows:
quicksort(start, stop) {
    if (start >= stop) return;     // base case: zero or one element
    m = partition(start, stop);    // pivot ends up at index m
    quicksort(start, m - 1);       // first recursive call
    quicksort(m + 1, stop);        // second recursive call
}
Then the space complexity is O(n), not O(log n) as is commonly believed.
This is because you are pushing onto the stack twice at each level, so the space complexity is determined by the recurrence:
T(n) = 2*T(n/2) + O(1)
assuming the partitioning divides the array into two equal parts (the best case). The solution to this, according to the Master Theorem, is T(n) = O(n).
If we eliminate the second quicksort call in the snippet above (tail-call elimination, i.e. replace it with a loop), we get T(n) = T(n/2) + O(1) and therefore T(n) = O(log n) (by case 2 of the Master Theorem).
Perhaps the "contiguous quicksort" refers to the first implementation because the two quicksort calls are next to each other, in which case the space complexity is O(n).
