Complexity of cache-oblivious stacks and queues

I have read that a cache-oblivious stack can be implemented using a doubling array.
Can someone please explain how the analysis shows that each push and pop has an O(1/B) amortized I/O complexity?

A stack supports the following operations:
Push
Pop
Both operations can be performed with a singly-linked list in O(1) time by pushing to and popping from the front of the list, but this approach has poor cache behavior, since the list nodes are dispersed through memory.
We can instead use a dynamic array as our data structure, pushing to and popping from the end of the array. (We keep track of the last filled position in the array as our index, and update it as we push and pop elements.)
Popping will be O(1) since we don't need to resize the array.
If there is free space at the end of the array, pushing will also be O(1).
The problem arises when we try to push an element and there is no space left. In this case we allocate a new array that is twice as large (size 2n), copy each of the n existing elements over, and then push the new element.
Suppose we have an array of size n that starts empty.
If we push n+1 elements onto the array, the first n pushes take O(1)*n = O(n) time.
The (n+1)-th push takes O(n) time, since it must build a new copy of the array.
So pushing n+1 elements takes O(2n) time, but we can drop the constant and say it is O(n), i.e. linear in the number of elements.
So while a single push may occasionally take more than constant time, pushing a large number of elements takes only a linear amount of work: each push is O(1) amortized.
The dynamic array is cache-friendly since all elements are as close to each other as possible, so consecutive elements tend to share cache lines.
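For concreteness, here is a minimal sketch of such a doubling-array stack in Java (the class name, initial capacity, and growth policy are illustrative choices, not taken from the post above):

// Minimal doubling-array stack sketch (illustrative only).
public class ArrayStack<T> {
    private Object[] data = new Object[8]; // small initial capacity
    private int size = 0;                  // index of the next free slot

    public void push(T item) {
        if (size == data.length) {                      // no free slot: double the capacity
            Object[] bigger = new Object[2 * data.length];
            System.arraycopy(data, 0, bigger, 0, size); // O(n) copy, paid for by the n pushes before it
            data = bigger;
        }
        data[size++] = item;                            // O(1) otherwise
    }

    @SuppressWarnings("unchecked")
    public T pop() {
        if (size == 0) throw new java.util.NoSuchElementException("stack is empty");
        T item = (T) data[--size];
        data[size] = null;                              // let the old slot be garbage collected
        return item;
    }

    public boolean isEmpty() { return size == 0; }
}

Because consecutive pushes and pops touch adjacent slots of one contiguous array, a run of k operations touches only about k/B distinct cache lines, which is where the O(1/B) amortized I/O bound comes from.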

I would think standard stacks are cache-oblivious. You fault on only about 1/B of the accesses, because any sequence of pushes and pops touches adjacent addresses, so a new cache line is brought in at most once every B operations. (Note: the argument requires a cache of at least 2 lines to prevent thrashing.)
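Sketched slightly more formally (my own back-of-the-envelope bound, not taken from the post; B is the number of stack elements per cache line and k is the length of a run of pushes/pops):

% The top of the stack moves by one slot per operation inside a contiguous array,
% so over k operations it visits a window of at most k+1 adjacent slots:
\[
  \text{faults}(k) \le \left\lceil \frac{k+1}{B} \right\rceil + 1 = \frac{k}{B} + O(1).
\]
% Each doubling resize scans n contiguous elements, costing O(n/B) I/Os, and happens
% only after at least n/2 new pushes, so it adds another O(1/B) per push. Dividing by k
% gives O(1/B) amortized I/Os per operation, assuming the cache holds at least two lines
% (so the line containing the top and its neighbor can both stay resident).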

Related

Why is Merge sort space complexity O(n)?
At first glance, it makes sense that merge sort has space complexity O(n): to sort the unsorted array I'm splitting it and creating subarrays, but the sum of the sizes of all the subarrays will be n.
Question: my main concern is the memory allocation of the mergeSort() function during recursion. There is a call stack, and each (recursive) call to mergeSort() is pushed onto it. Each recursively called mergeSort() has its own stack frame with its own local variables, such as the left and right subarrays the call creates. So if we have made 5 recursive calls to mergeSort(), the call stack contains 5 frames, and each frame holds its own subarrays in memory. Shouldn't the space therefore grow with the number of recursive calls?
Memory should be linear
Each call to mergeSort triggers two recursive calls, so it makes sense to talk about and draw the binary tree of recursive calls. However, only one of those two recursive calls is performed at a time; the first call ends before the second call starts. Hence, at any given time, only one branch of the tree is being explored. The "call stack" represents this branch.
The depth of the recursion tree is at most log(n), therefore the height of the call stack is at most log(n).
How much memory does it take to explore one branch? In other words, how much memory is allocated on the call stack, at most, at any given time?
At the bottom of the call stack, there is an array of size n.
On top of that is an array of size n/2.
On top of that is an array of size n/4.
Etc...
So the total memory allocated on the call stack at any given time is at most n + n/2 + n/4 + ... < 2n, which is O(n).
Possible memory leak
If your implementation of merge sort allocates a new array at every recursive call, and you forget to free those arrays at the end of the call, then the total allocated memory becomes the total memory required for the whole tree, instead of just one branch.
Consider all the nodes at a given depth in the tree. The subarrays of these nodes add up to form the whole array. For instance, the root of the tree has an array of length n; one level below that, there are two subarrays representing the two halves of the original array; one level below that, there are four subarrays representing the four quarters of the original array; etc. Hence each level of the tree requires memory n. There are log(n) levels in the tree, so the total amount of memory allocated for the whole tree would be n log(n).
Conclusion
If merge sort has no memory leaks, then its space complexity is linear O(n). In addition, it is possible (although not always desirable) to implement merge sort in-place, in which case the space complexity is constant O(1) (all operations are performed directly inside the input array).
However, if your implementation of merge sort has a memory leak, i.e., you keep allocating new arrays in recursive calls, but do not free them when the recursive call returns, then it could easily have space complexity O(n log n).

How does merge sort have space complexity O(n) for worst case?

O(n) space complexity means that, in the worst case, merge sort uses an amount of memory proportional to the number of elements in the initial array. But doesn't it create new arrays while making the recursive calls? How is that space not counted?
A worst-case implementation of top-down merge sort could take more space than the original array, if it allocates both halves of the array in mergesort() before making the recursive calls to itself.
A more efficient top-down merge sort uses an entry function that does a one-time allocation of a temp buffer, passing the temp buffer's address as a parameter to one of a pair of mutually recursive functions that generate indices and merge data between the two arrays.
In the case of a bottom-up merge sort, a temp array 1/2 the size of the original array could be used: merge both halves of the array so that the first half of the data ends up in the temp array and the second half in the original array, then do a final merge back into the original array.
However the space complexity is O(n) in either case, since constants like 2 or 1/2 are ignored for big O.
Merge sort needs only a single extra buffer of the same size as the original array.
In the usual version, you merge from the array into the extra buffer and then copy the result back into the array.
In a more advanced version, you alternate: merge from the array into the extra buffer at one level, and from the buffer back into the array at the next.
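To illustrate the single-buffer idea, here is a rough top-down sketch in Java, using the simpler copy-back variant (names and details are mine, not from the answer above):

import java.util.Arrays;

// Single-buffer top-down merge sort sketch (copy-back variant; illustrative only).
public class MergeSortDemo {

    public static void mergeSort(int[] a) {
        int[] temp = new int[a.length];      // the only extra O(n) allocation
        sort(a, temp, 0, a.length - 1);
    }

    private static void sort(int[] a, int[] temp, int lo, int hi) {
        if (lo >= hi) return;                // base case: zero or one element
        int mid = lo + (hi - lo) / 2;
        sort(a, temp, lo, mid);              // sort left half
        sort(a, temp, mid + 1, hi);          // sort right half
        merge(a, temp, lo, mid, hi);         // merge the two halves via temp
    }

    private static void merge(int[] a, int[] temp, int lo, int mid, int hi) {
        int i = lo, j = mid + 1;
        for (int k = lo; k <= hi; k++) {     // merge into temp...
            if (i > mid)           temp[k] = a[j++];
            else if (j > hi)       temp[k] = a[i++];
            else if (a[i] <= a[j]) temp[k] = a[i++];
            else                   temp[k] = a[j++];
        }
        System.arraycopy(temp, lo, a, lo, hi - lo + 1); // ...then copy back
    }

    public static void main(String[] args) {
        int[] data = {5, 2, 9, 1, 5, 6};
        mergeSort(data);
        System.out.println(Arrays.toString(data)); // [1, 2, 5, 5, 6, 9]
    }
}

The only O(n) allocation is the temp array created once in mergeSort(); the recursive calls themselves add only O(log n) stack frames.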
Note: This answer is wrong, as was pointed out to me in the comments. I leave it here as I believe it is helpful to most people who want to understand these things, but remember that this algorithm is actually called in-place mergesort and can have a different runtime complexity than pure mergesort.
Merge sort is easy to implement so that it uses the same array for everything, without creating new arrays. Just pass the bounds in each recursive call. Something like this (in pseudocode):
mergesort(array) ->
    mergesort'(array, 0, length of array - 1)

mergesort'(array, start, end) ->
    if start >= end: return            // base case: zero or one element
    mid = (start + end) / 2
    mergesort'(array, start, mid)
    mergesort'(array, mid+1, end)
    merge(array, start, mid, mid+1, end)

merge(array, start1, end1, start2, end2) ->
    // This function merges the two partitions
    // by just moving elements inside array
In merge sort, total space complexity is always Ω(n), since you have to store the elements somewhere. The additional space complexity is O(n) in an implementation using arrays and O(1) in linked-list implementations. In practice, list implementations need additional space for the list pointers, so unless you already have the data in a list the O(1) advantage hardly matters. Edit: if you count stack frames, then it's O(n) + O(log n), so still O(n) in the case of arrays. In the case of lists it's O(log n) additional memory.
That's why in merge-sort complexity analysis people mention 'additional space requirement' or things like that. It's obvious that you have to store the elements somewhere, but it's always better to mention 'additional memory' to keep purists at bay.

Deque algorithm

Write four O(1)-time procedures to insert elements into and delete elements from both ends of a deque constructed from an array.
In my implementation I maintain 4 pointers: front1, rear1, front2, rear2.
Do you have any other algorithm with fewer pointers and O(1) complexity? Please explain.
There are two common ways to implement a deque:
Doubly linked list: You implement a doubly linked list and maintain pointers to the front and the back of the list. It is easy to both insert and remove at the start/end of the list in O(1) time.
A circular dynamic array: Here, you have an array that is treated as circular (so the elements at index arr.length-1 and index 0 are regarded as adjacent).
In this implementation you hold the index of the "head" and of the "tail". Adding an element at the head is done by writing it to index head-1 (moving the head backward), and adding an element at the tail is done by writing it to index tail+1 (moving the tail forward).
This method is amortized O(1) and has better constants than the linked-list implementation. However, it is not a strict worst-case O(1): if the number of elements exceeds the size of the array, you need to allocate a new array and move the elements from the old one to the new one. This takes O(n) time, but it is needed only after at least Ω(n) operations, so it is O(1) under amortized analysis, even though individual operations can still cost O(n) from time to time.
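A minimal sketch of the circular-array approach in Java (my own illustrative code, with a doubling resize when the array fills up):

// Circular dynamic-array deque sketch (illustrative; resize policy and names are mine).
public class ArrayDequeSketch<T> {
    private Object[] buf = new Object[8];
    private int head = 0;   // index of the first element
    private int size = 0;

    public void addFirst(T x) {
        growIfFull();
        head = (head - 1 + buf.length) % buf.length; // move head backward, wrapping around
        buf[head] = x;
        size++;
    }

    public void addLast(T x) {
        growIfFull();
        buf[(head + size) % buf.length] = x;         // slot just past the current tail
        size++;
    }

    @SuppressWarnings("unchecked")
    public T removeFirst() {
        if (size == 0) throw new java.util.NoSuchElementException();
        T x = (T) buf[head];
        buf[head] = null;
        head = (head + 1) % buf.length;
        size--;
        return x;
    }

    @SuppressWarnings("unchecked")
    public T removeLast() {
        if (size == 0) throw new java.util.NoSuchElementException();
        int tail = (head + size - 1) % buf.length;
        T x = (T) buf[tail];
        buf[tail] = null;
        size--;
        return x;
    }

    private void growIfFull() {
        if (size < buf.length) return;
        Object[] bigger = new Object[2 * buf.length];
        for (int i = 0; i < size; i++)               // unroll the circular layout into the new array
            bigger[i] = buf[(head + i) % buf.length];
        buf = bigger;
        head = 0;
    }
}

Note that only two indices are needed here (head plus size, or equivalently head and tail) rather than four pointers.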

EDIT: Never mind

EDIT: I'm so sorry, I somehow confused the LinkedList and ArrayList columns in the second table; I didn't sleep much. At least one answer did help me in other ways, with a detailed explanation, so this post wasn't a total waste.
I did find some topics about this but there were contradictions in posts, so I wanted confirmation on who was correct.
This is the topic I found:
When to use LinkedList over ArrayList?
The most upvoted answer says:
"For LinkedList
get is O(n)
add is O(1)
remove is O(n)
Iterator.remove is O(1)
For ArrayList
get is O(1)
add is O(1) amortized, but O(n) worst-case since the array must be resized and copied
remove is O(n)"
But then someone else posted a link here that says:
http://leepoint.net/notes-java/algorithms/big-oh/bigoh.html
Algorithm          ArrayList   LinkedList
access front       O(1)        O(1)
access back        O(1)        O(1)
access middle      O(1)        O(N)
insert at front    O(N)        O(1)
insert at back     O(1)        O(1)
insert in middle   O(N)        O(1)
There is no contradiction between the two sources cited in the question.
First a few thoughts about LinkedLists:
In a linked list, we need to move a pointer through the list to reach any particular element, whether to delete it, examine it, or insert a new element before it. Since the java.util.LinkedList implementation keeps a reference to both the front and the back of the list, we have immediate access to the front and the back, which explains why any operation involving the front or back of the list is O(1). If an operation is done using an Iterator, then the pointer is already where you need it to be. So removing an element from the middle takes O(n) time, but if the Iterator has already spent O(n) operations getting to the middle, then iter.remove() can execute in O(1).
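For example, removing every other element from a LinkedList via its Iterator (a small illustrative snippet, not from the answer above):

import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

public class IteratorRemoveDemo {
    public static void main(String[] args) {
        List<Integer> list = new LinkedList<>(List.of(1, 2, 3, 4, 5, 6));
        Iterator<Integer> it = list.iterator();
        while (it.hasNext()) {
            int value = it.next();   // the iterator walks the list once: O(n) total
            if (value % 2 == 0) {
                it.remove();         // unlinking at the current position is O(1)
            }
        }
        System.out.println(list);    // [1, 3, 5]
    }
}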
Now consider ArrayList:
Under the hood, ArrayList stores its data in a primitive array. So while we can access any element in O(1) time, adding or removing an element in the middle requires that all later elements be shifted by one position, which takes O(n) time. If we are adding or removing the last element, no shifting is required, so this runs in O(1).
This means that calling list.add(newItem) takes O(1), but occasionally there is no room at the end of the list, so the entire list needs to be copied into new memory before ArrayList can perform the add. However, since every time ArrayList resizes itself it doubles the previous capacity, this copy operation only happens log2 n times when adding n elements. So we still say that add runs in O(1) time. If you know how many elements you will be adding when the ArrayList is created, you can give it an initial capacity to improve performance by avoiding the copy operation.
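For instance (illustrative snippet; the constant 1_000_000 is arbitrary):

import java.util.ArrayList;
import java.util.List;

public class PreSizeDemo {
    public static void main(String[] args) {
        int n = 1_000_000;
        // Pre-sizing avoids the intermediate resize-and-copy steps during the adds.
        List<Integer> list = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            list.add(i);   // amortized O(1); no capacity doubling occurs here
        }
        System.out.println(list.size()); // 1000000
    }
}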

Big Oh notation - push and pop

I think I am starting to understand at least the theory behind big Oh notation, i.e. it is a way of measuring the rate at which an algorithm's running time grows. In other words, big O quantifies an algorithm's efficiency. But applying it is something else.
For example, in the best-case scenario push and pop operations will be O(1), because the number of steps it takes to remove from or add to the stack is fixed. Regardless of the value, the process will be the same.
I'm trying to envision how a sequence of events such as pushes and pops can degrade performance from O(1) to O(n^2). If I have an array of capacity n/2, n push and pop operations, and a dynamic array that doubles or halves its capacity when full or half full, how is it possible that the order in which these operations occur can affect the speed at which the program completes? Since push and pop work on the top element of the stack, I'm having trouble seeing how efficiency goes from constant to O(n^2).
Thanks in advance.
You're assuming that the dynamic array does its resize operations quite intelligently. If this is not the case, you might end up with O(n^2) runtime: suppose the array does not double its size when full but is simply resized to size+1. Also, suppose it starts with size 1. You'd insert the first element in O(1). When inserting the second element, the array would need to be resized to size 2, requiring it to copy the previous value. When inserting element k, it would currently have size k-1 and need to be resized to size k, resulting in k-1 elements that need to be copied, and so on.
Thus, for inserting n elements, you'd end up resizing the array n-1 times: O(n) resizes. The copy operations also depend linearly on n, since the more elements have been inserted, the more need to be copied: O(n) copies per resize. This results in O(n*n) = O(n^2) runtime complexity.
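A small illustrative simulation (my own sketch, not from the answer above) that counts how many element copies each growth policy performs over n pushes:

// Counts element copies under "grow by +1" versus "doubling" growth policies (sketch).
public class GrowthPolicyDemo {
    static long copiesWithGrowth(int n, boolean doubling) {
        long copies = 0;
        int capacity = 1, size = 0;
        for (int i = 0; i < n; i++) {
            if (size == capacity) {
                copies += size;                          // every existing element is copied on resize
                capacity = doubling ? capacity * 2 : capacity + 1;
            }
            size++;
        }
        return copies;
    }

    public static void main(String[] args) {
        int n = 100_000;
        System.out.println("grow by +1: " + copiesWithGrowth(n, false)); // roughly n^2/2 copies
        System.out.println("doubling:   " + copiesWithGrowth(n, true));  // fewer than 2n copies
    }
}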
If I implement a stack as (say) a linked list, then pushes and pops will always be constant time (i.e. O(1)).
I would not choose a dynamic array implementation for a stack unless runtime wasn't an issue for me, I happened to have a dynamic array ready-built and available to use, and I didn't have a more efficient stack implementation handy. However, if I did use an array that resizes up or down when it becomes full or half-empty respectively, its runtime would be O(1) as long as the pushes and pops do not trigger a resize, and O(n) when a resize occurs (hence O(n) in the worst case).
I can't think of a case where a dynamic array used as a stack could deliver performance as bad as O(n^2) unless there was a bug in its implementation.
