How to compute the algorithmic space complexity

I am reviewing my data structures and algorithm analysis lessons, and I have a question: how do you determine the space complexity of the merge sort and quicksort algorithms?
The depth of recursion is only O(log n) for linked list merge-sort
The amount of extra storage space needed for contiguous quick sort is O(n).
My thoughts:
Both use a divide-and-conquer strategy, so I guess the space complexity of linked-list merge sort should be the same as that of contiguous quicksort. I would say O(log n), because before every recursive call the list is divided in half.
Thanks for any pointers.

The worst case depth of recursion for quicksort is not (necessarily) O(log n), because quicksort doesn't divide the data "in half", it splits it around a pivot which may or may not be the median. It's possible to implement quicksort to address this[*], but presumably the O(n) analysis was of a basic recursive quicksort implementation, not an improved version. That would account for the discrepancy between what you say in the blockquote, and what you say under "my thoughts".
Other than that I think your analysis is sound - neither algorithm uses any extra memory other than a fixed amount per level of recursion, so depth of recursion dictates the answer.
Another possible way to account for the discrepancy, I suppose, is that the O(n) analysis is just wrong. Or, "contiguous quicksort" isn't a term I've heard before, so if it doesn't mean what I think it does ("quicksorting an array"), it might imply a quicksort that's necessarily space-inefficient in some sense, such as returning an allocated array instead of sorting in-place. But it would be silly to compare quicksort and mergesort on the basis of the depth of recursion of the mergesort vs. the size of a copy of the input for the quicksort.
[*] Specifically, instead of calling the function recursively on both parts, you put it in a loop. Make a recursive call on the smaller part, and loop around to do the bigger part, or equivalently push (pointers to) the larger part onto a stack of work to do later, and loop around to do the smaller part. Either way, you ensure that the depth of the stack never exceeds log n, because each chunk of work not put on the stack is at most half the size of the chunk before it, down to a fixed minimum (1 or 2 if you're sorting purely with quicksort).
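A minimal Python sketch of that scheme, assuming a Lomuto-style partition (the names here are my own, not from the quoted material): the function recurses only into the smaller part and loops on the larger part, so the stack depth stays O(log n).

import random

def quicksort(a, lo=0, hi=None):
    """In-place quicksort whose recursion depth is bounded by O(log n)."""
    if hi is None:
        hi = len(a) - 1
    while lo < hi:
        # Lomuto partition around the last element as pivot.
        pivot = a[hi]
        i = lo
        for j in range(lo, hi):
            if a[j] < pivot:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[i], a[hi] = a[hi], a[i]
        # Recurse into the smaller part, loop around to handle the larger part.
        if i - lo < hi - i:
            quicksort(a, lo, i - 1)
            lo = i + 1
        else:
            quicksort(a, i + 1, hi)
            hi = i - 1

data = [random.randint(0, 99) for _ in range(20)]
quicksort(data)
print(data == sorted(data))  # True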

I'm not really familiar with the term "contiguous quicksort". But quicksort can have either O(n) or O(log n) space complexity depending on how it is implemented.
If it is implemented as follows:
quicksort(start, stop) {
    if (start >= stop) return;        // zero or one element: nothing to do
    m = partition(start, stop);       // place a pivot in its final position m
    quicksort(start, m - 1);
    quicksort(m + 1, stop);
}
Then the worst-case space complexity is O(n), even though O(log n) is often quoted.
Note that the two recursive calls run one after the other, not simultaneously, so the extra space is governed by the maximum depth of nested calls, not by the total number of calls. If the partition splits the array into two roughly equal parts every time, the depth satisfies S(n) = S(n/2) + O(1), which is O(log n). But if the pivot keeps landing at one end of the range (for example, an already-sorted input with the last element as pivot), the depth satisfies S(n) = S(n-1) + O(1), which is O(n).
If we turn the second quicksort call into a loop (tail-call elimination) and always recurse into the smaller partition first, the recursion depth is bounded by O(log n) even in the worst case, because each nested call handles at most half of the current range.
Perhaps "contiguous quicksort" simply refers to quicksort on a contiguous array (as opposed to a linked list), in which case the quoted O(n) is the worst-case stack depth of the plain recursive implementation above.
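To see the worst case concretely, here is a rough sketch (Lomuto partition, names my own) that records the deepest nesting of the plain two-call version above: an already-sorted input drives the depth to roughly n, while a random input stays around log n.

import random

def naive_quicksort(a, lo, hi, depth=1, stats=None):
    """Plain two-call quicksort; stats['max_depth'] records the peak nesting."""
    if stats is not None:
        stats['max_depth'] = max(stats['max_depth'], depth)
    if lo >= hi:
        return
    pivot = a[hi]                       # Lomuto partition, last element as pivot
    i = lo
    for j in range(lo, hi):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]
    naive_quicksort(a, lo, i - 1, depth + 1, stats)
    naive_quicksort(a, i + 1, hi, depth + 1, stats)

for label, data in [("sorted", list(range(400))),
                    ("random", random.sample(range(400), 400))]:
    stats = {'max_depth': 0}
    naive_quicksort(data, 0, len(data) - 1, stats=stats)
    print(label, stats['max_depth'])    # roughly 400 vs. roughly 20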


Best runtime for n-1 comparisons?

If an algorithm must make n-1 comparisons to find a certain element, then can we assume that the best possible runtime of the algorithm is O(n)?
I know that the lower bound for sorting algorithms is n log n, but since we only return the one element found, I figured it might be possible to do better in terms of run time?
Thanks!
To find a certain element in an unsorted list you need O(n).
But if you sort the array (which takes O(n log n) in general) you can find a certain element in O(log n).
So if you often want to find elements in the same list, it is most likely worth sorting it once so that you can then find elements much more efficiently.
If your array is unsorted and you search for some element in it, then in the worst case linear search makes n-1 comparisons and the time complexity is O(n).
But if you want to reduce the time complexity, first sort your array and then use binary search, which takes O(log n) in the worst case.
So binary search is more efficient than linear search once the array is sorted.
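A short sketch of the two approaches (the helper names are mine; bisect is Python's standard binary-search module):

from bisect import bisect_left

def linear_search(items, target):
    """O(n): scan every element until the target is found."""
    for i, x in enumerate(items):
        if x == target:
            return i
    return -1

def binary_search(sorted_items, target):
    """O(log n): repeatedly halve the search range of a sorted list."""
    i = bisect_left(sorted_items, target)
    if i < len(sorted_items) and sorted_items[i] == target:
        return i
    return -1

data = [42, 7, 19, 3, 88, 51]
print(linear_search(data, 88))          # works on unsorted data, O(n) per lookup
print(binary_search(sorted(data), 88))  # needs the O(n log n) sort first, then O(log n) per lookup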
For unsorted elements, the worst case is when you have to go over all the elements, i.e., O(N). If you need many look-ups then you have several pre-processing alternatives that speed up all future accesses.
Option 1: put the elements in a standard hash table. Creating the hash table costs O(N) on average, and you later pay O(1) on average for each lookup. This assumes that a reasonable hash function can be created for this type of element.
Most languages/libraries implement bucket-based hash-tables, which in pathological cases can put all elements in one bucket, costing O(N) per lookup.
Option 2: there are other hash-table implementations that don't suffer from pathological O(N) cases. The Robin Hood hashing (Wikipedia) (more at Programming.Guide) guarantees O(log N) lookup in the worst case, with average of O(1).
Option 3: another option is to sort the elements in O(N log N) once, and then use binary search to look up in O(log N). Usually this is slower than Robin Hood hashing (option 2).
Option 4: If the values are simple integers with limited range, with max-min around N, then it is possible to put the values in an array (list), such that array[value-min] will contain a count of how many times the value appears in the input. It costs O(N) to construct, and O(1) to lookup. Better, the constants for both preprocessing and lookup are significantly lower than in any other method.
Note: I didn't mention the O(N) counting-sort as an alternative to the general case of O(N log N) sorting (option 3), since if max(value)-min(value) is small enough for counting-sort, then option 4 is relevant and is simpler and faster.
If applicable, choose option 4; otherwise, if you wish to invest time and code, choose option 2. If 4 isn't applicable and 2 is not worth the effort in your case, then choose option 1, provided you don't mind the pathological worst case (never choose option 1 when an adversary may want to harm you in a DoS attack).
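A rough Python sketch of options 1, 3 and 4 under the assumptions above (option 2 depends on a specialized hash-table implementation, so it is left out):

from bisect import bisect_left

data = [5, 3, 9, 3, 7, 1]

# Option 1: standard hash table -- O(N) to build, O(1) average per lookup.
seen = set(data)
print(7 in seen)

# Option 3: sort once in O(N log N), then binary-search each lookup in O(log N).
s = sorted(data)
i = bisect_left(s, 7)
print(i < len(s) and s[i] == 7)

# Option 4: small integer range -- counting array, O(N) to build, O(1) per lookup.
lo, hi = min(data), max(data)
counts = [0] * (hi - lo + 1)
for v in data:
    counts[v - lo] += 1
print(counts[7 - lo] > 0)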
Your question has nothing to do with sorting, let alone linear search.
If you claim that n-1 comparisons are mandated, then your problem has certainly complexity Ω(n). But with that information alone, you can't guarantee O(n) because it is not said that these n-1 comparisons are sufficient, nor that the algorithm does not perform extra operations, for instance to decide which comparisons to perform. It could turn out that your algorithm is O(n³) with no chance to do better, but we can't tell.
Best case complexity: Ω(n).
Worst case complexity: unknown.

How to know which function has complexity of log n [duplicate]

I was learning about time complexities and got stuck at O(log n), because I was unable to identify which functions have a complexity of log n, as opposed to complexities such as O(n), O(n^2) or O(n^3), which can easily be identified by counting the number of for loops in the function.
You want to look at two things:
How many times does the loop iterate? (depth)
How much of the array does it access during each iteration? (breadth)
For the depth:
If each iteration divides the number of remaining iterations by some amount (often 2), there are probably log(n) iterations, so the depth is O(log(n)). The exact value it's divided by doesn't matter for big O, since log_2(n), log_e(n), log_10(n), etc. are all constant multiples of each other.
If it iterates a fixed number of times, it's O(1).
If it iterates n times (or a constant multiple of that), it's O(n).
For the breadth, ask how many elements of the original array it needs to look at each iteration.
If that number doesn't depend on the size at all, breadth is O(1) (e.g. in a binary search, we only look at a single element each iteration, regardless of the array size).
If you look at the entire array, or some constant fraction of it, e.g. n/2, the breadth is O(n). This is often the case for the good sorting algorithms. (These generally work by recursion rather than looping, which means the depth is depth of recursion rather than number of iterations. For these, you ask how much of the array is accessed collectively by all recursive calls at the same layer. If you haven't learned recursion yet, feel free to ignore this parenthetical for now.)
Once you have the big O estimates of breadth and depth, just multiply them together.
Binary search of a sorted array has depth O(log(n)) and breadth O(1), so it's O(log(n)).
Merge sort has depth O(log(n)) and breadth O(n), so it's O(n log(n)).
Adding all the numbers in an array has depth O(n) and breadth O(1), so it's O(n).
Adding two numbers has depth O(1) and breadth O(1), so it's O(1).
There are complications in practice, of course (usually for the recursive cases), but these heuristics will get you started. A technique that might be useful for the more complicated cases is sketching out the elements that are accessed by each iteration/recursive call. Depth vertically, breadth horizontally. As long as you don't have multiple function calls accessing the same memory in the same row, you can usually see what's happening well enough to add things up.
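As a minimal illustration of the depth heuristic, a loop that halves its remaining work each iteration runs O(log n) times (a toy sketch, not tied to any particular problem):

def count_halvings(n):
    """Depth: about log2(n) iterations. Breadth: O(1) work per iteration."""
    steps = 0
    while n > 1:
        n //= 2          # the remaining work is halved each time
        steps += 1
    return steps

print(count_halvings(1_000_000))  # about 20, since 2**20 is roughly 1e6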

Difference in Space Complexity of different sorting algorithms

I am trying to understand Space Complexities of different sorting algorithms.
http://bigocheatsheet.com/?goback=.gde_98713_member_241501229
From the above link I found that the space complexity of
bubble sort, insertion sort and selection sort is O(1),
whereas quicksort is O(log(n)) and merge sort is O(n).
We are not actually allocating extra memory in any of these algorithms,
so why are the space complexities different when we are sorting within the same array?
When you run code, memory is assigned in two ways:
Implicitly, as you set up function calls.
Explicitly, as you create chunks of memory.
Quicksort is a good example of implicit use of memory. While I'm doing a quicksort, the recursion goes O(n) levels deep in the worst case and O(log(n)) levels deep in the average case. Each of those nested calls takes O(1) space to keep track of, leading to O(n) space in the worst case and O(log(n)) in the average case.
Mergesort is a good example of explicit use of memory. I take two blocks of sorted data, create a place to put the merged result, and then merge from those two blocks into it. Creating that place requires O(n) extra memory.
To get down to O(1) memory you need both to not allocate memory AND to not call yourself recursively. This is true of all of bubble, insertion and selection sort.
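For example, the explicit allocation in a typical array merge step looks roughly like this (a sketch; the surrounding merge sort driver is omitted):

def merge(left, right):
    """Explicit memory: the output buffer holds all len(left) + len(right) elements."""
    out = []                              # the O(n) extra space is allocated here
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out

print(merge([1, 4, 9], [2, 3, 8]))  # [1, 2, 3, 4, 8, 9]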
It's important to keep in mind that there are a lot of different ways to implement each of these algorithms, and each different implementation has a different associated space complexity.
Let's start with merge sort. The most common implementation of mergesort on arrays works by allocating an external buffer in which to perform the merges of the individual ranges. This requires space to hold all the elements of the array, which takes extra space Θ(n). However, you could alternatively use an in-place merge for each merge, which means that the only extra space you'd need would be space for the stack frames of the recursive calls, dropping the space complexity down to Θ(log n), but increasing the runtime of the algorithm by a large constant factor. You could alternatively do a bottom-up mergesort using in-place merging, which requires only O(1) extra space but with a higher constant factor.
On the other hand, if you're merge sorting linked lists, then the space complexity is going to be quite different. You can merge linked lists in space O(1) because the elements themselves can easily be rewired. This means that the space complexity of merge sorting linked lists is Θ(log n) from the space needed to store the stack frames for the recursive calls.
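A sketch of such an O(1)-space merge of two sorted singly linked lists (the Node class here is just for illustration): nodes are relinked rather than copied, so the only extra storage is a few pointers.

class Node:
    def __init__(self, value, next=None):
        self.value, self.next = value, next

def merge_lists(a, b):
    """Merge two sorted singly linked lists by rewiring pointers: O(1) extra space."""
    dummy = tail = Node(None)
    while a and b:
        if a.value <= b.value:
            tail.next, a = a, a.next
        else:
            tail.next, b = b, b.next
        tail = tail.next
    tail.next = a or b          # append whatever remains of the other list
    return dummy.next

# Build 1 -> 3 -> 5 and 2 -> 4, then merge them.
merged = merge_lists(Node(1, Node(3, Node(5))), Node(2, Node(4)))
while merged:
    print(merged.value, end=' ')  # 1 2 3 4 5
    merged = merged.next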
Let's look at quicksort as another example. Quicksort doesn't normally allocate any external memory, but it does need space for the stack frames it uses. A naive implementation of quicksort might need space Θ(n) in the worst case for stack frames if the pivots always end up being the largest or smallest element of the array, since in that case you keep recursively calling the function on arrays of size n - 1, n - 2, n - 3, etc. However, there's a standard optimization you can perform that's essentially tail-call elimination: you recursively invoke quicksort on the smaller of the two halves of the array, then reuse the stack space from the current call for the larger half. This means that you only allocate new memory for a recursive call on subarrays of size at most n / 2, then n / 4, then n / 8, etc. so the space usage drops to O(log n).
I'll assume the array we're sorting is passed by reference, and I'm assuming the space for the array does not count in the space complexity analysis.
The space complexity of quicksort can be kept to O(n) in the worst case (and expected O(log n) for randomized quicksort) with a careful implementation: e.g. don't copy the sub-arrays, just pass indexes.
The O(n) for quicksort comes from the fact that the number of "nested" recursive calls can be O(n): think of what happens if you keep making unlucky choices for the pivot. While each stack frame takes O(1) space, there can be O(n) stack frames. The expected depth (i.e. expected stack space) is O(log n) if we're talking about randomized quicksort.
For merge sort I'd expect the space complexity to be O(log n) because you make at most O(log n) "nested" recursive calls.
The results you're citing also count the space taken by the array: then the space complexity of merge sort is O(log n) for stack space plus O(n) for the array, which means O(n) total space complexity. For quicksort it is O(n) + O(n) = O(n).

Sort Stack Ascending Order (Space Analysis)

I was going through the book "Cracking the Coding Interview" and came across the question
"Write a program to sort a stack in ascending order. You may use additional stacks to hold items, but you may not copy the elements into any other data structures (such as an array). The stack supports the following operations: push, pop, peek, isEmpty."
The book gave an answer with O(n^2) time complexity and O(n) space.
However, I came across this blog providing an answer with O(n log n) time complexity using a quicksort approach.
What I was wondering is: isn't the space complexity O(n^2), since each call to the method initializes another two stacks and makes another two recursive calls?
I'm still a little shaky on space complexity. I'm not sure whether this would be O(n^2) space, given that the new stacks spawned by each recursive call are smaller than the ones a level up.
If anyone could give a little explanation behind their answer, that would be great.
The space complexity is also O(n log n) in the average case. If the space complexity were O(n^2), the time complexity could not be O(n log n), since each allocated cell needs at least one access.
In the average case, assuming the stack is divided in half each time, at the i-th depth of recursion each sub-stack has size O(n/2^i) and there are 2^i recursion branches at that depth.
So the total size allocated at the i-th depth is O(n/2^i) * 2^i = O(n).
Since the maximum depth is log n, the overall space complexity is O(n log n).
In the worst case, however, the space complexity is O(n^2).
In this method of quicksort, the space complexity follows the time complexity exactly, and the reason is quite simple: you divide the sub-stacks recursively (around the pivot) until each element sits in a stack of size one. That takes about log n levels of division (since 2^x = n gives x = log n), and at the end you have n stacks each of size one. Hence the total space complexity is O(n log n).
Keep in mind that the space complexity follows the time complexity exactly here because we are literally occupying new space at each level. So, in the worst case, the space complexity is O(n^2).
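For reference, the book's O(n^2)-time, O(n)-space approach mentioned in the question looks roughly like this (a sketch using Python lists as stacks, restricted to push/pop/peek-style operations):

def sort_stack(stack):
    """Sort a stack using one auxiliary stack: O(n^2) time, O(n) extra space."""
    aux = []
    while stack:
        tmp = stack.pop()
        # Move larger elements back to the input stack so tmp lands in order.
        while aux and aux[-1] > tmp:
            stack.append(aux.pop())
        aux.append(tmp)
    return aux                   # sorted ascending from bottom to top

print(sort_stack([5, 1, 4, 2, 3]))  # [1, 2, 3, 4, 5]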

Heapsort. How is it possible to simulate a worst-case scenario?

I am rather clear on how to program it, but I am not sure about the definition, e.g. how to write it down in mathematical terms.
A normal heapsort is done with N elements, so in O notation that would be O(log(n))?
I just started with heapsort, so I might be a little bit off here.
But how can I, for example, look for a random element when there are N elements?
And then pick that random element and delete it?
I was thinking that in a worst-case situation it has to go through the whole tree (because the element could be either in the first place or in the last place, i.e. highest or lowest).
But how can I write that down in mathematical terms?
Heapsort's worst case performance is O(n log n), and to quote alestanis:
Max in max-heap: O(1). Min in min-heap: O(1). Opposite cases in O(n).
Here's an SO-answer explaining how to do the opposite cases in O(1) if you create the heap yourself.
Building the max-heap from the array is O(n) in the worst case, and each max-heapify (sift-down) is O(log n) in the worst case, so heapsort's worst case is O(n log n).
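A compact sketch of that breakdown (an in-place max-heap heapsort; the helper names are mine):

def sift_down(a, i, end):
    """Restore the max-heap property below index i: O(log n) in the worst case."""
    while True:
        left, right, largest = 2 * i + 1, 2 * i + 2, i
        if left < end and a[left] > a[largest]:
            largest = left
        if right < end and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest

def heapsort(a):
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):   # build the max-heap: O(n) overall
        sift_down(a, i, n)
    for end in range(n - 1, 0, -1):       # n - 1 extractions, each O(log n)
        a[0], a[end] = a[end], a[0]       # move the current maximum to the end
        sift_down(a, 0, end)

data = [9, 4, 7, 1, 8, 2]
heapsort(data)
print(data)  # [1, 2, 4, 7, 8, 9]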
