Which data structure to use as Stack with random position insertion ability? - data-structures

Card game, where used cards go back to the deck, but are not inserted at the bottom; instead they are inserted at a random position within the second half of the deck.
For example, given a deck with cards [c5, c6, c7, c8, c9],
the next pop operation should return c5, and after c5 is used, it should be reinserted into the deck within the second half,
i.e. before c8, after c8, or after c9.
How would you design an efficient data structure for this deck?
The number of cards is constant.

If the number of cards is really always 52, then an array makes a fine data structure. Since the number of elements is constant, all operations are constant time. Moving elements in consecutive memory locations is extremely fast in modern systems. You'll have a hard time doing better with a more sophisticated data structure.
If the number of cards can vary arbitrarily, I suggest an order statistic tree. In such trees, each node tracks the number of nodes in the subtree of which it is the root. This enables insertion and removal at any ordinal position. Effectively, this gives you an array where insertion and removal are O(log n) if the tree is balanced.
Happily, since you are re-inserting randomly, you get expected O(log n) performance "for free" if you initially create a balanced tree for the deck. If you want worst-case O(log n), you can make the tree self-balancing (red-black, AVL, etc.).
To deal from the top of the deck, remove position 0. To re-insert, pick a random ordinal position within the second half of the deck (26-51 for a full 52-card deck) and insert there.
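Here is a minimal Python sketch of the array approach for the constant-size case; the class and method names are mine, not from the original post, and it simply models "pop from the top, reinsert into the second half":

import random

class Deck:
    # Array-backed deck: pop from the top, reinsert into the second half.
    # Minimal sketch of the array approach described above.

    def __init__(self, cards):
        self.cards = list(cards)          # index 0 is the top of the deck

    def pop(self):
        return self.cards.pop(0)          # O(n) shift, but n is a small constant (e.g. 52)

    def reinsert(self, card):
        n = len(self.cards)
        pos = random.randint(n // 2, n)   # random slot in the second half (or at the bottom)
        self.cards.insert(pos, card)

deck = Deck(["c5", "c6", "c7", "c8", "c9"])
top = deck.pop()        # 'c5'
deck.reinsert(top)      # goes back before c8, after c8, or after c9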

Related

Best sorting algorithm - Partially sorted linked list

Problem: Given a sorted doubly linked list and two numbers C and K, you need to decrease the info of the node with data K by C and insert the resulting node at its correct position so that the list remains sorted.
I would think of insertion sort for such a problem, because at any instant insertion sort looks like a hand of cards that is partially sorted. For insertion sort, the number of swaps equals the number of inversions, and the number of compares equals the number of exchanges + (N-1).
So, in the given problem, if the node with data K is decreased by C, the sorted linked list becomes partially sorted, and insertion sort seems like the best fit.
Another point: when selecting a sorting algorithm, if a sorting strategy is the best fit for the array representation of the data, then the same strategy should also be the best fit for the linked-list representation of the same data.
For this problem, is my thought process correct in choosing insertion sort?
Maybe you mean something else, but insertion sort is not the best algorithm, because you actually don't need to sort anything. If there is only one element with value K then it doesn't make a big difference, but otherwise it does.
So I would suggest the following O(n) algorithm, ignoring edge cases for simplicity (a sketch in Python follows the steps):
Go forward in the list until the value of the current node is > K - C.
Save this node, all the reduced nodes will be inserted before this one.
Continue to go forward while the value of the current node is < K.
While the value of the current node is K, remove the node, set its value to K - C, and insert it before the saved node. This could be optimized further so that you only do one remove and one insert operation for the whole sublist of nodes that had value K.
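A minimal Python sketch of these steps, assuming a plain doubly linked Node class (the names are mine) and skipping the same edge cases (for example, every node being smaller than K - C):

class Node:
    def __init__(self, val):
        self.val = val
        self.prev = None
        self.next = None

def decrease_and_reposition(head, K, C):
    # 1. Walk forward to the first node whose value is > K - C;
    #    the reduced nodes will be inserted just before it.
    anchor = head
    while anchor.val <= K - C:
        anchor = anchor.next

    # 2. Keep walking until we reach the nodes holding value K.
    cur = anchor
    while cur.val < K:
        cur = cur.next

    if cur is anchor:
        # The K nodes directly follow the nodes <= K - C, so they are
        # already in the right place: just decrease them in place.
        while cur is not None and cur.val == K:
            cur.val = K - C
            cur = cur.next
    else:
        # 3. Unlink every node with value K and splice it back in before anchor.
        while cur is not None and cur.val == K:
            nxt = cur.next
            cur.prev.next = cur.next
            if cur.next is not None:
                cur.next.prev = cur.prev
            cur.val = K - C
            cur.prev = anchor.prev
            cur.next = anchor
            if anchor.prev is not None:
                anchor.prev.next = cur
            anchor.prev = cur
            cur = nxt

    # The head may have changed if nodes were spliced in front of it.
    while head.prev is not None:
        head = head.prev
    return head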
If these decrease operations can be batched up before the sorted list must be available, then you can simply remove all the decremented nodes from the list. Then, sort them, and perform a two-way merge into the list.
If the list must be maintained in order after each node decrement, then there is little choice but to remove the decremented node and re-insert in order.
Doing this with a linear search for a deck of cards is probably acceptable, unless you're running some monstrous Monte Carlo simulation involving cards that runs for hours or days, so that the optimization counts.
Otherwise the way we would deal with the need to maintain order would be to use an ordered sequence data structure: balanced binary tree (red-black, splay) or a skip list. Take the node out of the structure, adjust value, re-insert: O(log N).
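As an illustration of the batched approach described above (remove the decremented nodes, apply the decrements, sort them, then two-way merge them back), here is a short Python sketch; it uses plain lists as a stand-in for the linked list, and the function name and the decrements mapping are assumptions of mine:

def batched_decrease(sorted_vals, decrements):
    # decrements maps a value K to the amount C it is decreased by.
    kept, pulled = [], []
    for v in sorted_vals:
        if v in decrements:
            pulled.append(v - decrements[v])   # apply the decrement now
        else:
            kept.append(v)
    pulled.sort()                              # small batch, cheap to sort

    # Two-way merge of the two sorted sequences.
    merged, i, j = [], 0, 0
    while i < len(kept) and j < len(pulled):
        if kept[i] <= pulled[j]:
            merged.append(kept[i])
            i += 1
        else:
            merged.append(pulled[j])
            j += 1
    merged.extend(kept[i:])
    merged.extend(pulled[j:])
    return merged

print(batched_decrease([1, 3, 5, 5, 8], {5: 3}))   # [1, 2, 2, 3, 8]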

IOI Qualifier INOI task 2

I can't figure out how to solve question 2 in the following link in an efficient manner:
http://www.iarcs.org.in/inoi/2012/inoi2012/inoi2012-qpaper.pdf
You can do this in O(n log n) time. (Or linear if you really care to.) First, pad the input array out to the next power of two using some really big negative number. Now, build an interval-tree-like data structure; recursively partition your array by dividing it in half. Each node in the tree represents a subarray whose length is a power of two and which begins at a position that is a multiple of its length, and each non-leaf node has a "left half" child and a "right half" child.
Compute, for each node in your tree, what happens when you add 0, 1, 2, 3, ... to that subarray and take the maximum element. Notice that this is trivial for the leaves, which represent subarrays of length 1. For internal nodes, it is simply max(left child's value, right child's value + length/2), since the right half starts length/2 positions later. So you can build this tree in linear time.
Now we want to run a sequence of n queries on this tree and print out the answers. The queries are of the form "what happens if I add k, k+1, ..., n, 1, 2, ..., k-1 to the array and report the maximum?"
Notice that, when we add that sequence to the whole array, the break between n and 1 either occurs at the beginning/end, or smack in the middle, or somewhere in the left half, or somewhere in the right half. So, partition the array into the k,k+1,k+2,...,n part and the 1,2,...,k-1 part. If you identify all of the nodes in the tree that represent subarrays lying completely inside one of the two sequences but whose parents either don't exist or straddle the break-point, you will have O(log n) nodes. You need to look at their values, add various constants, and take the maximum. So each query takes O(log n) time.
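A hedged Python sketch of this approach (0-based indexing; the function name and layout are mine, and the exact input/output format of the linked task is not reproduced):

def max_after_rotating_additions(A):
    # best[v] = max over the node's range [lo, hi) of A[i] + (i - lo),
    # i.e. "add 0, 1, 2, ... across the subarray and take the maximum".
    n = len(A)
    size = 1
    while size < n:
        size *= 2
    NEG = float("-inf")
    padded = A + [NEG] * (size - n)        # pad to a power of two with -infinity

    best = [NEG] * (2 * size)
    for i in range(size):                  # leaves: subarrays of length 1
        best[size + i] = padded[i]
    length = 2
    v = size // 2
    while v >= 1:                          # internal nodes: max(left, right + len/2)
        for node in range(v, 2 * v):
            best[node] = max(best[2 * node], best[2 * node + 1] + length // 2)
        length *= 2
        v //= 2

    def query(node, lo, hi, q_lo, q_hi, start_val):
        # max of A[i] + start_val + (i - q_lo) over i in [q_lo, q_hi)
        if q_hi <= lo or hi <= q_lo:
            return NEG
        if q_lo <= lo and hi <= q_hi:
            return best[node] + start_val + (lo - q_lo)
        mid = (lo + hi) // 2
        return max(query(2 * node, lo, mid, q_lo, q_hi, start_val),
                   query(2 * node + 1, mid, hi, q_lo, q_hi, start_val))

    answers = []
    for k in range(1, n + 1):
        b = n - k + 1                      # 0-based positions 0..b-1 receive k, k+1, ..., n
        ans = query(1, 0, size, 0, b, k)
        if b < n:                          # positions b..n-1 receive 1, 2, ..., k-1
            ans = max(ans, query(1, 0, size, b, n, 1))
        answers.append(ans)
    return answers                         # answers[k-1] is the maximum for shift k

Building best is O(n); each query decomposes its two runs into O(log n) canonical nodes, which matches the O(log n)-per-query bound described above.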

Find the pair of bitstrings with the largest number of common set bits

I want to find an algorithm to find the pair of bitstrings in an array that have the largest number of common set bits (among all pairs in the array). I know it is possible to do this by comparing all pairs of bitstrings in the array, but this is O(n²). Is there a more efficient algorithm? Ideally, I would like the algorithm to work incrementally by processing one incoming bitstring in each iteration.
For example, suppose we have this array of bitstrings (of length 8):
B1:01010001
B2:01101010
B3:01101010
B4:11001010
B5:00110001
The best pair here is B2 and B3, which have four common set bits.
I found a paper that appears to describe such an algorithm (S. Taylor & T. Drummond (2011); "Binary Histogrammed Intensity Patches for Efficient and Robust Matching"; Int. J. Comput. Vis. 94:241–265), but I don't understand this description from page 252:
This can be incrementally updated in each iteration as the only [bitstring] overlaps that need recomputing are those for the new parent feature and any other [bitstrings] in the root whose “most overlapping feature” was one of the two selected for combination. This avoids the need for the O(N²) overlap comparison in every iteration and allows a forest for a typically-sized database of 700 features to be built in under a second.
As far as I can tell, Taylor & Drummond (2011) do not purport to give an O(n) algorithm for finding the pair of bitstrings in an array with the largest number of common set bits. They sketch an argument that a record of the best such pairs can be updated in O(n) after a new bitstring has been added to the array (and two old bitstrings removed).
Certainly the explanation of the algorithm on page 252 is not very clear, and I think their sketch argument that the record can be updated in O(n) is incomplete at best, so I can see why you are confused.
Anyway, here's my best attempt to explain Algorithm 1 from the paper.
Algorithm
The algorithm takes an array of bitstrings and constructs a lookup tree. A lookup tree is a binary forest (set of binary trees) whose leaves are the original bitstrings from the array, whose internal nodes are new bitstrings, and where if node A is a parent of node B, then A & B = A (that is, all the set bits in A are also set in B).
For example, given a small array of input bitstrings, the output is the lookup tree built from them (the diagrams from the original answer are omitted here).
The algorithm as described in the paper proceeds as follows (a rough Python sketch follows the steps):
Let R be the initial set of bitstrings (the root set).
For each bitstring f1 in R that does not yet have a recorded partner, find and record its partner (the bitstring f2 in R − {f1} which has the largest number of set bits in common with f1) and record the number of bits they have in common.
If there is no pair of bitstrings in R with any common set bits, stop.
Let f1 and f2 be the pair of bitstrings in R with the largest number of common set bits.
Let p = f1 & f2 be the parent of f1 and f2.
Remove f1 and f2 from R; add p to R.
Go to step 2.
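Here is my rough Python sketch of those steps (not the authors' code; bitstrings are represented as Python ints, and the all-pairs partner search of step 2 is simply redone each iteration, so this is the straightforward O(n³)-style version analysed below):

from itertools import combinations

def build_lookup_forest(bitstrings):
    # Returns the (parent, child1, child2) merges, which is enough to
    # reconstruct the forest.
    R = list(bitstrings)
    merges = []
    while True:
        # Steps 2-4: find the pair in R with the most common set bits.
        best_pair, best_common = None, 0
        for f1, f2 in combinations(R, 2):
            common = bin(f1 & f2).count("1")
            if common > best_common:
                best_pair, best_common = (f1, f2), common
        if best_pair is None:          # step 3: no pair shares any set bit
            break
        f1, f2 = best_pair
        p = f1 & f2                    # step 5: the parent keeps the shared bits
        merges.append((p, f1, f2))
        R.remove(f1)                   # step 6: replace f1 and f2 by p
        R.remove(f2)
        R.append(p)
    return merges

# Example from the question (B1..B5); the first merge pairs B2 and B3.
forest = build_lookup_forest([0b01010001, 0b01101010, 0b01101010,
                              0b11001010, 0b00110001])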
Analysis
Suppose that the array contains n bitstrings of fixed length. Then the algorithm as described is O(n³): step 2 is O(n²), and there are O(n) iterations because at each iteration we remove two bitstrings from R and add one.
The paper contains an argument that step 2 is Ω(n²) only on the first time around the loop, and on other iterations it is O(n) because we only have to find the partner of p "and any other bitstrings in R whose partner was one of the two selected for combination." However, this argument is not convincing to me: it is not clear that there are only O(1) other such bitstrings. (Maybe there's a better argument?)
We could bring the algorithm down to O(n²) by storing the number of common set bits between every pair of bitstrings. This requires O(n²) extra space.
Reference
S. Taylor & T. Drummond (2011). "Binary Histogrammed Intensity Patches for Efficient and Robust Matching". Int. J. Comput. Vis. 94:241–265.
Well, for each bit position you could maintain two sets: the bitstrings with that position set and those with it clear. The sets could be stored in two binary trees, for example.
Then you just intersect the "bit set" sides: first the sets for all eight bit positions, then every combination of 7 positions, and so on, until some intersection contains at least two elements.
The complexity here grows exponentially in the bit size, but if it is small and fixed this isn't a problem.
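A small Python sketch of my reading of this idea (I interpret the set operation as an intersection; the function name, the use of indices, and the default of 8-bit strings are my assumptions):

from itertools import combinations

def best_pair_by_bit_sets(bitstrings, k=8):
    # bit_sets[b] holds the indices of all bitstrings with bit b set.
    bit_sets = [set() for _ in range(k)]
    for idx, s in enumerate(bitstrings):
        for b in range(k):
            if s >> b & 1:
                bit_sets[b].add(idx)

    # Try combinations of bit positions from largest to smallest; the first
    # combination whose sets intersect in two or more bitstrings gives a
    # best pair and the maximum number of common set bits.
    for size in range(k, 0, -1):
        for combo in combinations(range(k), size):
            shared = set.intersection(*(bit_sets[b] for b in combo))
            if len(shared) >= 2:
                i, j = sorted(shared)[:2]
                return i, j, size
    return None

print(best_pair_by_bit_sets([0b01010001, 0b01101010, 0b01101010,
                             0b11001010, 0b00110001]))   # (1, 2, 4): B2 and B3

The loop over combinations of bit positions is what makes the cost exponential in the bitstring length, so this is only sensible when that length is small and fixed, as the answer says.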
Another way to do it might be to look at the n k-bit strings as n points in a kD space, and your task is to find the two points closest together. There are a number of geometric algorithms to do this.

d-heaps deletion algorithm

Binary heaps are so simple that they are almost always used when
priority queues are needed. A simple generalization is a d-heap, which
is exactly like a binary heap except that all nodes have d children
(thus, a binary heap is a 2-heap).
Notice that a d-heap is much more shallow than a binary heap,
improving the running time of inserts to O(log_d n). However,
the delete_min operation is more expensive, because even though the
tree is shallower, the minimum of d children must be found, which
takes d - 1 comparisons using a standard algorithm. This raises the
time for this operation to O(d log_d n). If d is a constant, both
running times are, of course, O(log n).
My question is: for d children shouldn't we need d comparisons? How did the author conclude that d - 1 comparisons suffice using a standard algorithm?
Thanks!
You need one comparison fewer than the number of children.
E.g. for two children a1 and a2 you compare a1 with a2 only once to find the smaller one.
For three children a1, a2, a3 you compare once to find the smaller of a1 and a2 and a second time to compare the smaller one to a3.
By induction you see that for each additional child you need an additional comparison, comparing the minimum of the previous list with the newly added child.
Thus, in general for d children you need d-1 comparisons to find the minimum.
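A tiny Python illustration of that induction (the helper name is mine, not from any particular d-heap implementation):

def min_child_index(children):
    # Find the index of the smallest of d children with exactly d - 1
    # comparisons: compare the running minimum against each additional child once.
    best = 0
    comparisons = 0
    for i in range(1, len(children)):
        comparisons += 1
        if children[i] < children[best]:
            best = i
    return best, comparisons

print(min_child_index([7, 3, 9, 5]))   # (1, 3): 4 children, 3 comparisons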

Build data structure to perform reverse and report queries in poly-logn time

Given a sequence of n numbers, {a1, a2, a3, …, an}. Build a data structure such that the following operations can be performed in poly-logn time.
Reverse(i, j):
Reverse all the elements in the range i to j, as shown below:
Original sequence: <…, ai-1, ai, ai+1, …, aj-1, aj, aj+1, …>
Sequence after reverse: <…, ai-1, aj, aj-1, …, ai+1, ai, aj+1, …>
Report(i):
Report the i-th element in the sequence, i.e. ai.
Here, poly-log n means some power of log n; for example, log(n) · log(n) would be acceptable.
[Note: Thanks to Prof. Baswana for asking this question.]
I was thinking of using a binary tree, with a node augmented with a Left|Right indicator and the number of elements in this sub-tree.
If the indicator is set to Left then begin by reading the left child, then read the right one
Else (set to Right) then begin by reading the right child, then read the left one
The Report is fairly obvious: O(log n)
The Revert is slightly more complicated, and I am unsure if it'd really work.
The idea would be to "isolate" the sequence of elements to reverse in a particular sub-tree (the lowest possible). This subtree contains range [a..b] including [i..j]
Reverse the minimum sub-tree that contains this sequence (change of the indicator)
Apply the Revert operation to [a..i-1] and [j+1..b]
Not sure it really works though :/
EDIT:
The previous solution does not work :) I can't come up with a solution that does not rearrange the tree, and the solutions that do rearrange it do not respect the complexity requirements.
I'll leave this here in case it gives someone else an idea, and I'll delete it afterwards unless I find a solution myself.
Splay trees + your decorations get O(log n) amortized. The structural problems Matthieu encountered are dealt with by the fact that in O(log n) amortized time, we can change the root to any node we like.
(Note: this data structure is an important piece of local search algorithms for the Traveling Salesman Problem, where people have found that two- and three-level trees with high arity are more efficient in practice.)
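The answer above suggests splay trees with the asker's decorations. As a hedged illustration of the same idea, here is a Python sketch that substitutes a treap with implicit keys, subtree sizes and a lazy reverse flag (giving O(log n) expected rather than amortized bounds); all names are mine and indexing is 0-based:

import random

class Node:
    def __init__(self, val):
        self.val = val
        self.prio = random.random()   # heap priority -> expected O(log n) height
        self.size = 1                 # size of the subtree rooted here
        self.rev = False              # lazy "reverse my subtree" flag
        self.left = None
        self.right = None

def size(t):
    return t.size if t else 0

def push(t):
    # Push the lazy reverse flag down one level (the asker's Left|Right indicator).
    if t and t.rev:
        t.left, t.right = t.right, t.left
        if t.left:
            t.left.rev ^= True
        if t.right:
            t.right.rev ^= True
        t.rev = False

def update(t):
    t.size = 1 + size(t.left) + size(t.right)

def split(t, k):
    # Split t into (first k elements, the rest), by implicit position.
    if t is None:
        return None, None
    push(t)
    if size(t.left) < k:
        a, b = split(t.right, k - size(t.left) - 1)
        t.right = a
        update(t)
        return t, b
    a, b = split(t.left, k)
    t.left = b
    update(t)
    return a, t

def merge(a, b):
    # Concatenate two treaps (all of a comes before all of b).
    if a is None:
        return b
    if b is None:
        return a
    if a.prio > b.prio:
        push(a)
        a.right = merge(a.right, b)
        update(a)
        return a
    push(b)
    b.left = merge(a, b.left)
    update(b)
    return b

def reverse(t, i, j):
    # Reverse positions i..j (0-based, inclusive); assumes 0 <= i <= j < size(t).
    left, rest = split(t, i)
    mid, right = split(rest, j - i + 1)
    mid.rev ^= True                   # just flag the subtree covering [i..j]
    return merge(merge(left, mid), right)

def report(t, i):
    # Return the i-th element (0-based); assumes 0 <= i < size(t).
    push(t)
    if i < size(t.left):
        return report(t.left, i)
    if i == size(t.left):
        return t.val
    return report(t.right, i - size(t.left) - 1)

def build(values):
    t = None
    for v in values:
        t = merge(t, Node(v))
    return t

t = build([1, 2, 3, 4, 5, 6])
t = reverse(t, 1, 4)
print([report(t, i) for i in range(6)])   # [1, 5, 4, 3, 2, 6]

Both operations just split out the range, flip or read, and merge back; the lazy rev flag plays the role of the asker's Left|Right indicator and is pushed one level down whenever a split, merge, or report passes through a node, so nothing is rebuilt eagerly.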

Resources