I'm learning data structures from "Fundamentals of Data Structures in C" by Sahni. In the topic "Circular Queue Using Dynamic Array", the author makes the point below:
Let capacity be the initial capacity of the circular queue. We must
first increase the size of the array using realloc; this will copy a
maximum of capacity elements onto the new array. To get a proper
circular queue configuration, we must slide the elements in the right
segment (i.e., elements A and B) to the right end of the array (refer
to diagram 3.7.d). The array doubling and the slide to the right together
copy at most 2 * capacity - 2 elements.
I understand that array doubling copies at most capacity elements. But how do the array doubling and the slide to the right together copy at most 2 * capacity - 2 elements?
Let us try to justify the worst-case scenario:
For a queue with capacity = N, there are at most N-1 elements present in the queue.
So when we double the queue size, we need to copy all these N-1 elements to the new queue, and at most there can be N-1 shifts (of elements).
So in total: 2*(N-1) = 2*N - 2.
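To make the copy accounting concrete, here is a minimal Python sketch of the doubling-plus-slide step (my own illustration following the book's conventions, not Sahni's actual C code): front is one slot before the first element, rear is the slot of the last element, and a queue with capacity slots holds at most capacity - 1 elements.

def double_capacity(arr, front, rear):
    capacity = len(arr)
    copies = 0

    # Array doubling: realloc copies the whole old array, i.e. at most
    # `capacity` element copies.
    new_arr = arr + [None] * capacity
    copies += capacity

    if rear < front:
        # The queue wraps around, so slide the right segment (indices
        # front+1 .. capacity-1) to the right end of the doubled array.
        # Because the queue wraps, front >= 1, so this segment holds at
        # most capacity - 2 elements.
        for i in range(capacity - 1, front, -1):
            new_arr[i + capacity] = new_arr[i]
            new_arr[i] = None
            copies += 1
        front += capacity

    print(copies, "copies; bound is", 2 * capacity - 2)
    return new_arr, front, rear

With capacity = 8, front = 1 and rear = 0 (a full, wrapped queue of 7 elements), this performs 8 + 6 = 14 = 2*8 - 2 copies, matching the book's bound.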
I have N bins (homogeneous or heterogeneous in size, depending on the variant of the task) in which I am trying to fit M items (always heterogeneous in size). Items can be larger than a single bin and are allowed to overflow into the next bin(s) (but no wrap-around from bin N-1 to 0).
The more bins an item spans, the higher its allocation cost.
I want to minimize the overall allocation cost of fitting all M into N bins.
Everything is static. It is guaranteed that all M fit in N.
I think I am looking for a variant of the Bin Packing algorithm. But any hints towards an existing solution/approximation/good heuristic are appreciated.
My current approach looks like this:
sort items by size
for i in items:
    for b in bins:
        try allocation of i starting at b
        if allocation valid:
            record cost
    do allocation of i in b with lowest recorded cost
    update all b fill levels
So basically a greedy-by-size approach with O(M×N×C) runtime, where C ≈ "longest allocation across bins" (try allocation takes C time).
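For concreteness, here is a runnable Python sketch of this greedy (my own reconstruction of the pseudocode above; it assumes, as the "fill level" suggests, that each bin can be summarized by its remaining free capacity, that an item placed in bin b overflows into b+1, b+2, ... only through completely empty bins, and that the allocation cost is the number of bins spanned):

def try_alloc(free, caps, b, item):
    # Try placing `item` starting in bin b: it consumes bin b's free
    # space first, then overflows rightwards through empty bins.
    # Returns a list of (bin, amount) pairs, or None if invalid.
    if free[b] == 0:
        return None
    remaining, used, k = item, [], b
    while remaining > 0:
        if k == len(free) or (k > b and free[k] != caps[k]):
            return None              # ran off the end, or blocked
        take = min(free[k], remaining)
        used.append((k, take))
        remaining -= take
        k += 1
    return used

def greedy_pack(caps, items):
    free = list(caps)                # remaining space per bin
    plan = []
    for item in sorted(items, reverse=True):    # greedy by size
        best = None                  # cheapest valid allocation found
        for b in range(len(free)):
            used = try_alloc(free, caps, b, item)
            if used is not None and (best is None or len(used) < len(best)):
                best = used          # cost = number of bins spanned
        if best is None:
            raise ValueError("item does not fit")
        for k, take in best:         # update the fill levels
            free[k] -= take
        plan.append((item, best))
    return plan

print(greedy_pack([4, 4, 4], [6, 3, 2]))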
I would suggest dynamic programming for the exact solution. To make it easier to visualize, assume each bin's size is the length of an array of cells. Since they are contiguous, you can visualize them as a contiguous set of arrays, e.g.
|xxx|xx|xxx|xxxx|
The delimiters of the arrays are | and the positions in the arrays are given by x. So this has 4 arrays: arr_0 of size 3, arr_1 of size 2, and so on.
If you place an item at position i, it will occupy positions i to i+(h-1), where h is the size of the item. E.g. if items are of size h=5 and you place an item at position 1, you would get
|xbb|bb|bxx|xxxx|
One trick to use: if we introduce the additional constraint that the items must be inserted "in order", i.e. the first inserted item is the leftmost inserted item, the second is the second-leftmost inserted item, etc., then this problem has the same optimal solution as the original one (since we can just take the optimal solution and insert it in order).
Let pos(k, i) be the optimal position at which to insert the k-th object, given that the (k-1)-th object ended at position i-1, and let opt(k, i) be the optimal extra cost of inserting objects k…N-1 under that same condition.
Then pos(N-1, i) can be calculated easily by running through the cells and computing the extra cost of each position (and even more easily by noting that at least one of the item's borders should line up with a bin border, but to make the analysis easier we will evaluate the extra cost at each of the positions i…NumCells-h). opt(N-1, i) equals this minimal extra cost.
Similarly,
pos(N-2, i) = argmin_x extra_cost(x, i, pos(N-1, x+h)) + opt(N-1, x+h)
where extra_cost(x, i, j) is the extra cost of inserting at x, given that the previously inserted object ended at i-1 and the next inserted object will start at j.
And by substituting x = pos(N-2, i):
opt(N-2, i) = extra_cost(pos(N-2, i), i, pos(N-1, pos(N-2, i)+h)) + opt(N-1, pos(N-2, i)+h)
By induction, for all 0 <= W < N-1:
pos(W, i) = argmin_x extra_cost(x, i, pos(W+1, x+h)) + opt(W+1, x+h)
And
opt(W, i) = extra_cost(pos(W, i), i, pos(W+1, pos(W, i)+h)) + opt(W+1, pos(W, i)+h)
And your final result is given by the minimum over i of opt(0, i).
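Here is a memoized Python sketch of this DP (my own reconstruction; it assumes the placement cost is the number of bins an item spans, and that items are inserted in a fixed left-to-right order as described above, so for truly heterogeneous sizes you would also need to search over orderings):

from functools import lru_cache

def solve(bin_sizes, item_sizes):
    # Concatenate the bins into one row of cells and record, for each
    # cell, which bin it belongs to.
    total = sum(bin_sizes)
    bin_of = []
    for b, size in enumerate(bin_sizes):
        bin_of.extend([b] * size)

    def cost(x, h):
        # Bins spanned by an item of size h placed at cell x.
        return bin_of[x + h - 1] - bin_of[x] + 1

    @lru_cache(maxsize=None)
    def opt(k, i):
        # Minimal extra cost of placing items k..N-1, given that the
        # previous item ended at cell i-1. Assumes everything fits.
        if k == len(item_sizes):
            return 0
        h = item_sizes[k]
        return min(cost(x, h) + opt(k + 1, x + h)
                   for x in range(i, total - h + 1))

    return opt(0, 0)

print(solve([3, 2, 3, 4], [5, 3]))   # 3: the size-5 item spans two bins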
I am a bit confused. If I have an array, I have to build a tree. To compare the children, I have to know how large my array is; in this case it's N = 6, so I divide it by 2 and get 3. That means I start from index 3 and compare with the parent node: if the child is greater than the parent node, I have to swap it, otherwise I don't. Then I go to index 2 and compare with the parent; if the child is greater than the parent node, I have to swap it. Then at index 1 I compare with the children and swap if needed. So I have created a max-heap. But now I don't get it: why do I have to exchange A[1] with A[6], then A[1] with A[5]? In the end, don't I get a min-heap instead of a max-heap? What does heapify mean?
Thanks a lot, I appreciate every answer!
One of my exercises is: Illustrate the steps of Heapsort by filling in the arrays and the tree representations.
There are many implementations of a heap data structure, but this one is talking about a specific implicit binary heap. Heap-sort is done in-place, so it uses this design. Binary heaps require a complete binary tree, so the heap can be represented as an implicit structure built out of the array: for every A[n] in a zero-based array,
A[0] is the root; if n != 0, A[floor((n-1)/2)] is the parent;
if 2n+1 is in the range of the array, then A[2n+1] is the left child, or else it is a leaf node;
if 2n+2 is in the range of the array, then A[2n+2] is the right child.
Say one's array is [10,14,19,21,23,31]. It is represented implicitly, using the above rules, as:
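        10
       /  \
     14    19
    /  \   /
  21   23 31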
This does not follow the max-heap invariant, so one must heapify, probably using Floyd's heap construction, which uses sift-down and runs in O(n). Now you have a heap and a sorted array of length zero, ([31,23,19,21,14,10],[]) (this is all implicit: since the heap takes no extra memory, it's just an array in memory). The visualisation of the heap at this stage:
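        31
       /  \
     23    19
    /  \   /
  21   14 10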
We pop off the maximum element of the heap by swapping it with the last element of the heap, then use sift-down to restore the heap shape. Now the heap is one smaller, and we've taken the maximum element and unshifted it onto our sorted array, ([23,21,19,10,14],[31]),
repeat, ([21,14,19,10],[23,31]),
([19,14,10],[21,23,31]),
([14,10],[19,21,23,31]),
([10],[14,19,21,23,31]),
The heap size is one, so one's final sorted array is [10,14,19,21,23,31]. If one used a min-heap and the same algorithm, then the array would be sorted the other way.
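Here is a minimal zero-based Python sketch of exactly this process, Floyd's construction followed by repeated pops (my own illustration of the description above, not code from any particular source):

def sift_down(a, root, end):
    # Restore the max-heap property for the subtree rooted at `root`,
    # considering only the heap region a[0:end].
    while True:
        child = 2 * root + 1                 # left child
        if child >= end:
            return                           # root is a leaf
        if child + 1 < end and a[child + 1] > a[child]:
            child += 1                       # pick the larger child
        if a[root] >= a[child]:
            return                           # heap property holds
        a[root], a[child] = a[child], a[root]
        root = child

def heapsort(a):
    n = len(a)
    # Floyd's heap construction: sift down each internal node, O(n).
    for i in range(n // 2 - 1, -1, -1):
        sift_down(a, i, n)
    # Repeatedly swap the maximum a[0] with the last heap element,
    # shrink the heap by one, and restore the heap shape.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(a, 0, end)

a = [10, 14, 19, 21, 23, 31]
heapsort(a)
print(a)    # [10, 14, 19, 21, 23, 31]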
Heap sort is a two-phase process. In the first phase, you turn the array into a heap with the maximum value at the top, A[1]. After this phase, the heap occupies the array from index 1 to 6, and the biggest value is at index 1, in A[1].
In the second phase we sort the values. This is a multi-step process in which we repeatedly extract the biggest value from the heap and put it in its place in the sorted array.
The heap is on the left side of the array and shrinks toward the left. The sorted array is on the right side of the array and grows to the left.
At each step we swap the top of the heap, A[1], which contains the biggest value of the heap, with the last value of the heap. The sorted array has then grown one position to the left. Since the value that has been put in A[1] is not the biggest, we have to restore the heap; this operation is called max-heapify. After it, A[1] again contains the biggest value of the heap, whose size has been reduced by one element.
By repeatedly extracting the biggest value left in the heap, we can sort the values in the array.
The drawing of the binary tree is very confusing. Its size should shrink at each step, because the size of the heap shrinks.
I think that when inserting a new node into a heap, the number of nodes it might pass is log N. Why is it (1 + log N)? Where does the 1 come from?
This is necessary to account for the border case when the number of nodes is 2^n. A heap of n levels holds 2^n - 1 objects, so adding one more object starts a new level:
Black squares represent the seven elements of a three-level heap. The red element is number eight. If your search takes you to the location of this last element, you end up with four comparisons, even though log2(8) is three.
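To see the +1 numerically, here is a tiny Python sketch (my own illustration, zero-based indexing) that counts the nodes on the root-to-slot path for the n-th inserted element:

def path_nodes(n):
    # Count the nodes on the path from the root to the slot where the
    # n-th element lands (zero-based index n - 1).
    i, count = n - 1, 1
    while i > 0:
        i = (i - 1) // 2       # step up to the parent
        count += 1
    return count

print(path_nodes(7))   # 3: the seventh element sits on level 3
print(path_nodes(8))   # 4: the eighth element starts level 4 = 1 + log2(8)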
I was reading Elementary Data Structures in CLRS, and while reading about the Queue ADT I came across this:
When Q.head = Q.tail + 1, the queue is full, and if we attempt to enqueue an element, then the queue overflows.
Is it always true? Because if Q.tail equals Q.length, then we set Q.tail = 1 according to the text. Therefore, if we completely fill the queue, Q.tail and Q.head will point to the same position (index 1), and the above condition will not hold. What am I missing here? Please point out where I am misinterpreting the text. Thanks in advance.
Here the attribute Q.head indexes, or points to, the queue's head. The attribute Q.tail indexes the next location at which a newly arriving element will be inserted into the queue.
As mentioned in the same paragraph in CLRS,
to implement a queue of at most n-1 elements using an array Q[1...n].
which means one position is left unused. It's for checking whether the queue is full: if we used all the array positions, the empty-queue condition and the full-queue condition would be the same, namely Q.head = Q.tail. @siddstuff has explained the wrap-around feature; Q.head = Q.tail + 1 means there is only one empty position left, so the queue is full.
Wrap-around feature of the queue:
You need to understand the fact that location 1 in the array immediately follows location n in circular order.
For example, the predecessor of element g at index 1 is f at index 11. The tail pointer always points to the next empty location, where a new element will be inserted. In the enqueue operation, before inserting the element, we check for the overflow condition: if Q.tail + 1 = Q.head, the tail has reached the head location, which means there is no free space, which means the queue is full.
NOTE: a queue of length n-1 can be created with an array of length n.
It's been six years, but none of these answers point out the fact that in circular buffers there is no clean way to differentiate the buffer-full case from the buffer-empty case: in both cases head = tail.
Most workarounds hinder readability and introduce complexity, so when implementing circular buffers we make a few assumptions that solve this problem and maintain simplicity.
We deliberately use only N-1 elements in the N-element buffer.
head = tail means the buffer is empty.
tail + 1 = head means the buffer is full.
Here is a good read on implementing circular buffers.
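A minimal zero-based Python sketch of these three conventions (my own illustration; the modulo arithmetic implements the wrap-around):

class CircularQueue:
    def __init__(self, n):
        self.buf = [None] * n        # n slots, at most n-1 elements
        self.head = 0                # index of the oldest element
        self.tail = 0                # index of the next free slot

    def is_empty(self):
        return self.head == self.tail

    def is_full(self):
        return (self.tail + 1) % len(self.buf) == self.head

    def enqueue(self, x):
        if self.is_full():
            raise OverflowError("queue overflow")
        self.buf[self.tail] = x
        self.tail = (self.tail + 1) % len(self.buf)

    def dequeue(self):
        if self.is_empty():
            raise IndexError("queue underflow")
        x = self.buf[self.head]
        self.head = (self.head + 1) % len(self.buf)
        return x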
I don't have your book, but this is how I would implement a cyclic buffer: the condition head = tail + 1 means that if an element were inserted, tail would be increased by one and then tail = head. But if head is equal to tail, the queue is considered empty.
Just to clarify: the reason you can't allow the array to be completely filled is that there would then be no way to determine whether it's full or empty.
To check if it's empty, Q.head = Q.tail is the only way, for you can't rely on something like Q.head = 1 and Q.tail = 1, since the queue could be empty at any position, not just position 1.
That's the reason a queue created with an array of length n can only hold up to n-1 elements, and to check if it's full we test Q.tail + 1 = Q.head (or Q.tail mod n + 1 = Q.head, to account for the case where Q.tail points at position n).
When the queue is implemented with an array Q[1..n], it can hold up to n-1 elements. But the condition for is_full must be head == (tail mod n) + 1, not just head == tail + 1.
All this confusion arises from the fact that CLRS takes array indices starting at 1, not 0.
So we have to implement a queue of capacity n-1 using n slots because, if we had n elements and computed n mod n, we would get zero, and there is no such index as per CLRS. So we restrict ourselves to a capacity of n-1, so that the modulo always gives at least 1.
Given a stream of numbers how would you keep track of the 1,000,000th largest one?
I was asked this in an interview.
One way would be to keep a min-heap and restrict the heap's size to 1,000,000. While the heap hasn't reached 1,000,000 items, we'll add each new item from the stream into our heap. When the heap gets full, we'll compare each new item from the stream to the minimum in the heap, and if it's bigger than the minimum, we'll eject the minimum and insert the new item. This way, the heap's minimum item is always the 1,000,000th-largest value.
Pseudo code example:
Handle_Stream_Item(item):
    if (MinHeap.size < 1000000):
        MinHeap.insert(item)
    else if (item > MinHeap.min()):
        MinHeap.extractMin()
        MinHeap.insert(item)
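The same logic as runnable Python, using the standard heapq module (my translation of the pseudocode above; heapq maintains a min-heap over a plain list):

import heapq

K = 1_000_000
heap = []                      # min-heap holding the K largest items so far

def handle_stream_item(item):
    if len(heap) < K:
        heapq.heappush(heap, item)
    elif item > heap[0]:       # heap[0] is the heap's minimum
        heapq.heapreplace(heap, item)    # eject the min, insert the item

def kth_largest():
    # Valid once at least K items have been seen: the heap's minimum
    # is the K-th largest value in the stream so far.
    return heap[0]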
As each number is read from the stream, add it to a B-tree structure.
https://en.wikipedia.org/wiki/B-tree
Starting with the million-and-first number, after adding each new number, remove the left-most (i.e. smallest) one from the B-tree, so that the tree always holds the 1,000,000 largest numbers seen so far.
At any one time, the left-most number in the B-tree will be your desired number.