Proposition. In the resizing array implementation of Stack,
the average number of array accesses for any sequence of operations starting from
an empty data structure is constant in the worst case.
Proof sketch: For each push() that causes the array to grow (say from size N to
size 2N), consider the N/2 - 1 push() operations that most recently caused the
stack size to grow to k, for k from N/2 + 2 to N. Averaging the 4N array accesses to
grow the array with N/2 array accesses (one for each push), we get an average cost
of 9 array accesses per operation. Proving that the number of array accesses used by
any sequence of M operations is proportional to M is more intricate.
(Algorithms 4th Edition Chapter 1.4)
I didn't completely understand the proof sketch. Please help me understand it.
I think this is a sort of amortized analysis, where you charge requests like push() for work that isn't directly due to them, and then show that nobody has to pay too high a bill, which means that the average cost of the work done is small.
In this case you have to copy the entire array when you run out of space, but you double the size when you do this, so you don't copy very often - e.g. at size 1, 2, 4, 8, 16... Here we bill each array copy to the push() operations which have been done since the last array copy. This means that if you do nothing but push() then each push() gets the bill only for the first array copy that occurs after it, so if the bill (split over a number of push() operations) is small per push() then the amortized cost is small.
If the array is of size N before it runs out of space and gets doubled, the book says this costs 4N array accesses, which sounds reasonable, and we don't care about constant factors much anyway. This gets split over all the push() operations since the last doubling. The last doubling was from size N/2 to size N, so there are about N/2 of them. That's 4N accesses split over N/2 push() operations, so each push gets a shared bill of 8. Don't forget that a push() involves an array write whether or not it triggers a size-doubling, and you get an average cost of 9 array accesses per push().
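To make the billing concrete, here is a minimal sketch of the push() side of a resizing-array stack (illustrative class and field names, not the book's exact code), with the array-access accounting noted in comments:

    // Illustrative push() side of a resizing-array stack (hypothetical names),
    // annotated with the array accesses each step costs.
    public class ResizingArrayStack {
        private String[] a = new String[1]; // underlying array
        private int n = 0;                  // number of items on the stack

        public void push(String item) {
            if (n == a.length) resize(2 * a.length); // rare, expensive step
            a[n++] = item;                           // 1 array write on every push
        }

        private void resize(int capacity) {
            // Growing from size N to 2N costs about 4N array accesses in the
            // book's model: 2N writes to initialize the new array, plus N reads
            // of the old array and N writes into the new one.
            String[] copy = new String[capacity];
            for (int i = 0; i < n; i++)
                copy[i] = a[i];
            a = copy;
        }
    }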
At first look, it makes sense that merge sort has a space complexity of O(n), because to sort the unsorted array I'm splitting it and creating subarrays, but the sum of the sizes of all the subarrays will be n.
Question: My main concern is the memory allocation of the mergeSort() function during recursion. I have a main call stack, and each (recursive) call to mergeSort() will be pushed onto it. Each recursively called mergeSort() gets its own stack frame. So, say we have made 5 recursive calls to mergeSort(): the call stack will contain 5 frames, and each frame will have its own local variables, such as the left and right subarrays that the call creates. Hence each of the 5 frames should hold its own subarrays in memory. So shouldn't the space grow with the number of recursive calls?
Memory should be linear
Each call to mergeSort triggers two recursive calls, so it makes sense to talk about and draw the binary tree of recursive calls. However, only one of those two recursive calls is performed at a time; the first call ends before the second call starts. Hence, at any given time, only one branch of the tree is being explored. The "call stack" represents this branch.
The depth of the recursion tree is at most log(n), therefore the height of the call stack is at most log(n).
How much memory does it take to explore one branch? In other words, how much memory is allocated on the call stack, at most, at any given time?
At the bottom of the call stack, there is an array of size n.
On top of that is an array of size n/2.
On top of that is an array of size n/4.
Etc...
So the total memory allocated on the call stack at any given time is at most n + n/2 + n/4 + ... < 2n.
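To see the "one branch at a time" point in code, here is an illustrative top-down merge sort (not any particular textbook's version) in which each call copies its two halves into freshly allocated subarrays, as described in the question:

    import java.util.Arrays;

    // Illustrative top-down merge sort; each call allocates its two half-size
    // subarrays, matching the question's description.
    public class MergeSortSketch {

        public static void mergeSort(int[] a) {
            if (a.length <= 1) return;
            int mid = a.length / 2;
            int[] left  = Arrays.copyOfRange(a, 0, mid);        // ~n/2 extra ints
            int[] right = Arrays.copyOfRange(a, mid, a.length); // ~n/2 extra ints

            mergeSort(left);   // the entire left branch finishes...
            mergeSort(right);  // ...before the right branch allocates anything
            merge(a, left, right);
            // When this call returns, left and right become garbage, so the live
            // temporaries are only those of the calls currently on the stack:
            // n + n/2 + n/4 + ... < 2n extra space in total.
        }

        private static void merge(int[] a, int[] left, int[] right) {
            int i = 0, j = 0, k = 0;
            while (i < left.length && j < right.length)
                a[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
            while (i < left.length)  a[k++] = left[i++];
            while (j < right.length) a[k++] = right[j++];
        }
    }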
Possible memory leak
If your implementation of merge sort allocates a new array at every recursive call, and you forget to free those arrays at the end of the call, then the total allocated memory becomes the total memory required for the whole tree, instead of just one branch.
Consider all the nodes at a given depth in the tree. The subarrays of these nodes add up to form the whole array. For instance, the root of the tree has an array of length n; one level below that, there are two subarrays representing the two halves of the original array; one level below that, there are four subarrays representing four fourths of the original array; etc. Hence each level of the tree requires memory n. There are log(n) levels in the tree, so the total amount of memory allocated for the whole tree would be n log(n).
Conclusion
If merge sort has no memory leaks, then its space complexity is linear O(n). In addition, it is possible (although not always desirable) to implement merge sort in-place, in which case the space complexity is constant O(1) (all operations are performed directly inside the input array).
However, if your implementation of merge sort has a memory leak, i.e., you keep allocating new arrays in recursive calls, but do not free them when the recursive call returns, then it could easily have space complexity O(n log n).
Assuming we are given k sorted arrays (each of size n), in which case is using a priority heap better than a traditional merge (similar to the one used in merge-sort) and vice-versa?
Priority Queue Approach: In this approach, we have a min heap of size k (initially, the first element from each of the arrays is added to the heap). We now remove the min element (from one of the input arrays), put this in the final array and insert a new element from that same input array. This approach takes O(kn log k) time and O(kn) space. Note: It takes O(kn) space because that's the size of the final array and this dominates the size of the heap while calculating the asymptotic space complexity.
Traditional Merge: In this approach, we merge the first 2 arrays to get a sorted array of size 2n. We repeat this for all the input arrays and after the first pass, we obtain k/2 sorted arrays each of size 2n. We repeat this process until we get the final array. Each pass has a time complexity of O(kn) since one element will be added to the corresponding output array after each comparison. And we have log k passes. So, the total time complexity is O(kn log k). And since we can delete the input arrays after each pass, the space used at any point is O(kn).
As we can see, the asymptotic time and space complexities are exactly the same in both the approaches. So, when exactly do we prefer one over the other? I understand that for an external sort the Priority Queue approach is better because you only need O(k) in-memory space and you can read and write each element from and back to disk. But how do these approaches stack up against each other when we have enough memory?
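For reference, here is a sketch of the priority-queue approach (illustrative code using java.util.PriorityQueue; assumes Java 16+ for the record syntax):

    import java.util.PriorityQueue;

    // Illustrative k-way merge of k sorted int arrays using a min-heap of size k.
    public class KWayMerge {

        // Heap entry: which input array the value came from and its index there.
        private record Entry(int arrayIndex, int elementIndex) {}

        public static int[] merge(int[][] arrays) {
            int total = 0;
            for (int[] a : arrays) total += a.length;

            PriorityQueue<Entry> heap = new PriorityQueue<>(
                (x, y) -> Integer.compare(arrays[x.arrayIndex()][x.elementIndex()],
                                          arrays[y.arrayIndex()][y.elementIndex()]));

            for (int i = 0; i < arrays.length; i++)          // seed: k entries
                if (arrays[i].length > 0) heap.add(new Entry(i, 0));

            int[] result = new int[total];
            int out = 0;
            while (!heap.isEmpty()) {                        // kn iterations
                Entry e = heap.poll();                       // O(log k)
                result[out++] = arrays[e.arrayIndex()][e.elementIndex()];
                if (e.elementIndex() + 1 < arrays[e.arrayIndex()].length)
                    heap.add(new Entry(e.arrayIndex(), e.elementIndex() + 1)); // O(log k)
            }
            return result;                                   // O(kn log k) time, O(kn) space
        }
    }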
The total number of operations, compares + moves, is about the same either way. A k-way merge does more compares but fewer moves. My system has an 8-way associative cache (Intel 3770K, 3.5 GHz), which in the case of a 4-way merge sort allows 4 cache lines for the 4 input runs and 1 cache line for the merged output run. In 64-bit mode there are 16 registers that can be used for working variables, 8 of them used for pointers to the current and end positions of each "run" (a compiler optimization).
On my system I compared a 4-way merge (no heap, ~3 compares per element moved) against a 2-way merge (~1 compare per move, but twice as many passes). The 4-way merge does 1.5 times as many compares but 0.5 times as many moves, so essentially the same number of operations, yet it is about 15% faster due to cache behavior.
I don't know whether 16 registers are enough for a 6-way merge to be a tiny bit faster, and 16 registers are not enough for an 8-way merge (some of the working variables would be memory/cache based). Trying to use a heap probably wouldn't help, as the heap would be memory/cache based (not register based).
A k-way merge is mostly useful for external sorts, where compare time is ignored due to the much larger overhead of moves.
The number of operations required to implement merge sort is:
6n(log n + 1) = 6n log n + 6n
log n + 1 is the number of levels in merge sort. What is 6n here?
In the case of a crude merge sort: two reads to compare two elements, one read and one write to copy the smaller element to a working array, then later another read and another write to copy elements back to the original array, for a total of 6 memory accesses per element (except for boundary cases like reaching the end of a run, in which case the remainder of the other run is just copied without compares). A more optimized merge sort avoids the copy back step by alternating the direction of merge depending on the merge pass if bottom up, or the recursion level if top down, reducing the 6 to a 4. If an element fits in a register, then after a compare, the element will be in a register and will not have to be re-read, reducing the 6 to a 3.
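Here is a sketch of one pass of that crude merge (illustrative code, not a specific implementation), with the accesses marked in comments:

    public class CrudeMerge {
        // Merge a[lo..mid) and a[mid..hi) into work[], then copy back.
        // The comments count the memory accesses per element moved.
        static void merge(int[] a, int[] work, int lo, int mid, int hi) {
            int i = lo, j = mid;
            for (int k = lo; k < hi; k++) {
                if (i < mid && (j >= hi || a[i] <= a[j]))   // compare: 2 reads (a[i], a[j])
                    work[k] = a[i++];                       // copy out: 1 read + 1 write
                else
                    work[k] = a[j++];                       // (same cost on this side)
            }
            for (int k = lo; k < hi; k++)                   // copy back:
                a[k] = work[k];                             // 1 read + 1 write
        }                                                   // total: ~6 accesses per element
    }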
I'm not sure what you mean by "what is 6n". If you are asking about the complexity of your algorithm (merge sort), it can be reduced to n log(n). You can ignore the coefficients in your problem, as they are negligible when accounting for big-O complexity. When calculating n log(n) + n, you can also ignore n, as it grows much more slowly than n log(n). This leaves you with a complexity of n log(n).
I was reading the javadocs on HashSet when I came across the interesting statement:
This class offers constant time performance for the basic operations (add, remove, contains and size)
This confuses me greatly, as I don't understand how one could possibly get constant time, O(1), performance for a comparison operation. Here are my thoughts:
If this is true, then no matter how much data I'm dumping into my HashSet, I will be able to access any element in constant time. That is, if I put 1 element in my HashSet, it will take the same amount of time to find it as if I had a googolplex of elements.
However, this wouldn't be possible if I had a constant number of buckets, or a consistent hash function, since for any fixed number of buckets, the number of elements in that bucket will grow linearly (albeit slowly, if the number is big enough) with the number of elements in the set.
Then, the only way for this to work is to have a changing hash function every time you insert an element (or every few times). A simple hash function that never has any collisions would satisfy this need. One toy example for strings could be: take the ASCII values of the string's characters and concatenate them together (because adding them could result in a collision).
However, this hash function, and any other hash function of this sort will likely fail for large enough strings or numbers etc. The number of buckets that you can form is immediately limited by the amount of stack/heap space you have, etc. Thus, skipping locations in memory can't be allowed indefinitely, so you'll eventually have to fill in the gaps.
But if at some point there's a recalculation of the hash function, this can only be as fast as finding a polynomial which passes through N points, or O(n log n).
Thus arrives my confusion. While I will believe that the HashSet can access elements in O(n/B) time, where B is the number of buckets it has decided to use, I don't see how a HashSet could possibly perform add or get functions in O(1) time.
Note: This post and this post both don't address the concerns I listed.
The number of buckets is dynamic, approximately 2n, where n is the number of elements in the set.
Note that HashSet gives amortized and average time performance of O(1), not worst case. This means we can suffer an O(n) operation from time to time.
So, when the bins are too packed up, we just create a new, bigger array, and copy the elements to it.
This costs O(n) operations, but it happens only after roughly n/2 new elements have been added since the previous resize, so the average cost of this operation is bounded by n/(n/2) = 2 per insertion, which is a constant.
Additionally, the number of collisions a HashSet encounters is also constant on average.
Assume you are adding an element x. The probability that the bucket h(x) already holds one element is ~n/2n = 1/2. The probability that it holds two elements is ~(n/2n)^2 = 1/4 (for large values of n), and so on.
This gives you an average running time of 1 + 1/2 + 1/4 + 1/8 + .... Since this sum converges to 2, it means this operation takes constant time on average.
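Here is a stripped-down sketch of that scheme (a toy separate-chaining set, not how java.util.HashSet is actually implemented) showing where the occasional O(n) rehash comes from and why the per-operation average stays constant:

    import java.util.LinkedList;

    // Toy separate-chaining set of ints, resized so the number of buckets
    // stays at roughly twice the number of elements.
    public class TinyHashSet {
        private LinkedList<Integer>[] buckets = newTable(4);
        private int n = 0;

        public void add(int x) {
            if (contains(x)) return;
            if (n >= buckets.length / 2)                 // load limit reached:
                resize(2 * buckets.length);              // rare O(n) rehash
            buckets[index(x, buckets.length)].add(x);    // usual O(1) step
            n++;
        }

        public boolean contains(int x) {
            // Expected chain length is constant when buckets ~ 2n.
            return buckets[index(x, buckets.length)].contains(x);
        }

        private void resize(int capacity) {
            LinkedList<Integer>[] next = newTable(capacity);
            for (LinkedList<Integer> bucket : buckets)   // rehash all n elements
                for (int x : bucket)
                    next[index(x, capacity)].add(x);
            buckets = next;
        }

        // The "hash function" itself does not change on resize; only the
        // table-size parameter does.
        private static int index(int x, int m) { return Math.floorMod(x, m); }

        @SuppressWarnings("unchecked")
        private static LinkedList<Integer>[] newTable(int m) {
            LinkedList<Integer>[] t = new LinkedList[m];
            for (int i = 0; i < m; i++) t[i] = new LinkedList<>();
            return t;
        }
    }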
What I know about hashed structures is that to keep O(1) complexity for insertion and removal, you need a good hash function to avoid collisions, and the structure should not be full (if the structure is full, you will have collisions).
Normally hashed structures define a kind of fill limit (load factor), for example 70%.
When the number of objects fills the structure beyond this limit, you should extend its size to stay below the limit and guarantee performance. Generally you double the size of the structure when reaching the limit, so that the structure's size grows faster than the number of elements and the number of resize/maintenance operations stays small.
Resizing is a maintenance operation that consists of rehashing all elements contained in the structure to redistribute them in the resized structure. Of course this has a cost, O(n) where n is the number of elements stored in the structure, but this cost is not counted against the single add call that triggers the maintenance operation.
I think this is what is confusing you.
I also learned that the hash function generally takes the size of the structure as a parameter (there was something about using a prime number for the table size to reduce the probability of collisions, or something like that), meaning that you don't change the hash function itself; you just change one of its parameters.
To answer your comment: if buckets 0 and 1 were filled, there is no guarantee that, when you resize to 4 buckets, new elements will go into buckets 2 and 3. Perhaps resizing to 4 makes elements A and B end up in buckets 0 and 3.
Of course, all of the above is theoretical; in real life you don't have infinite memory, collisions do happen, and maintenance has a cost. That's why you should have an idea of the number of objects you will store, and trade that off against the available memory to choose an initial size for the hashed structure that limits the need for maintenance operations and lets you stay at O(1) performance.
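On the last point, java.util.HashSet exposes exactly that trade-off through its (initialCapacity, loadFactor) constructor; for example, if you expect about a million elements:

    import java.util.HashSet;

    public class PresizedSet {
        public static void main(String[] args) {
            // Expecting ~1,000,000 elements: an initial capacity of roughly
            // n / loadFactor keeps the set below its resize threshold, so no
            // intermediate rehash passes happen while it fills up.
            HashSet<Integer> set = new HashSet<>(1_400_000, 0.75f);
            for (int i = 0; i < 1_000_000; i++) set.add(i);
            System.out.println(set.size()); // 1000000
        }
    }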
http://en.wikipedia.org/wiki/Dynamic_array#Performance
What exactly does it mean?
I thought inserting at the end would be O(n), as you'd have to allocate, say, twice the space of the original array, then move all the items to that location, and finally insert the item. How is this O(1)?
Amortized O(1) efficiency means that the sum of the runtimes of n insertions will be O(n), even if any individual operation may take a lot longer.
You are absolutely correct that appending an element can take O(n) time because of the work required to copy everything over. However, because the array is doubled each time it is expanded, expensive doubling steps happen exponentially less and less frequently. As a result, the total work done in n inserts comes out to be O(n) rather than O(n^2).
To elaborate: suppose you want to insert a total of n elements. The total amount of work done copying elements when resizing the vector will be at most
1 + 2 + 4 + 8 + ... + n ≤ 2n - 1
This is because first you copy one element, then twice that, then twice that, etc., and in the absolute worst case copy over all n elements. The sum of this geometric series works out to 2n - 1, so at most O(n) elements get moved across all copy steps. Since you do n inserts and only O(n) total work copying across all of them, the amortized efficiency is O(1) per operation. This doesn't say each operation takes O(1) time, but that n operations takes O(n) time total.
For a graphical intuition behind this, as well as a rationale for doubling the array versus just increasing it by a small amount, you might want to check out these lecture slides. The pictures toward the end might be very relevant.
Hope this helps!
Each reallocation in isolation is O(N), yes. But then on the next N insertions, you don't need to do anything. So the "average" cost per insertion is O(1). We say that "the cost is amortized across multiple operations".
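A quick numeric sanity check of the amortized bound (an illustrative simulation, not real ArrayList code): count the element copies a doubling array performs over n appends.

    // Simulates a doubling dynamic array and counts element copies over n appends.
    public class DoublingCost {
        public static void main(String[] args) {
            int n = 1_000_000;
            long capacity = 1, size = 0, copies = 0;
            for (int i = 0; i < n; i++) {
                if (size == capacity) {   // array is full: reallocate and copy all
                    copies += size;
                    capacity *= 2;
                }
                size++;                   // the append itself
            }
            // Prints a total below 2n, so the average per append is a constant.
            System.out.println(copies + " element copies for " + n + " appends");
        }
    }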