Having trouble understanding the K-way merge algorithm (Counter example given) - algorithm

In the K-way merge, the heap-based solution essentially maintains a heap and repeatedly extracts the max from it. I have a counterexample that seems to show why this won't work.
5 -> 1 -> 0
4 -> 2 -> 1
3 -> 2 -> 0
Suppose we initialize our heap. It contains {5, 4, 3}.
We run extract max, we obtain 5 and add that into our new list (that represents the final solution). Our heap now looks like {4,3}. We then refill our heap with the head of list that we extracted the max element from.
This implies that we get something like this: {4, 3, 1}.
This doesn't make sense to me. This heap doesn't represent the top K elements anymore. 1 shouldn't be used to refill the heap, it should have been 2. So, this O(nlgk) method doesn't make much sense to me.
I hope someone can shed light on how this algorithm works because I'm stuck here.

The max heap always contains the current max element (the head) of each of the k lists (or arrays). For your 'counter' example:
5 -> 1 -> 0
4 -> 2 -> 1
3 -> 2 -> 0
The heap {5, 4, 3} contains the max elements of these three lists.
Now you extract 5 from the heap, which means you also remove 5 from the first list:
5-->1-->0: after extracting 5, the list is 1-->0, so 1 is now the head of that list.
The new heap is {4, 3, 1}, which still contains the max element of each list.
Let's continue your example. The heap after extracting 5 and re-heapifying is:
{4, 3, 1}
Extract 4 from the heap, which means you also remove 4 from the second list:
4-->2-->1: after removing 4 you have 2-->1, so 2 is now the head of that list.
The new heap is
{3, 2, 1}
Keep doing this and you get what you want (a descending list). The heap never needs to hold the top K elements overall; it only needs the current head of each list, because the largest remaining element is always one of those heads.
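A minimal sketch of the heap-based merge walked through above, assuming every input list is already sorted in descending order. Python's heapq is a min-heap, so values are negated to simulate extract-max; the function name k_way_merge_descending is only illustrative.

```python
import heapq


def k_way_merge_descending(lists):
    """Merge lists that are each sorted in descending order."""
    # One heap entry per non-empty list: (negated head, list index, position).
    heap = [(-lst[0], i, 0) for i, lst in enumerate(lists) if lst]
    heapq.heapify(heap)
    merged = []
    while heap:
        neg_value, i, pos = heapq.heappop(heap)
        merged.append(-neg_value)
        # Refill from the same list the extracted element came from.
        if pos + 1 < len(lists[i]):
            heapq.heappush(heap, (-lists[i][pos + 1], i, pos + 1))
    return merged


print(k_way_merge_descending([[5, 1, 0], [4, 2, 1], [3, 2, 0]]))
# [5, 4, 3, 2, 2, 1, 1, 0, 0]
```

The heap never holds more than k entries, so each of the n elements costs O(log k) to push and pop, which is where the O(n log k) bound comes from.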

Related

Convert the permutation sequence A to B by selecting a set in A then reversing that set and inserting that set at the beginning of A

Given the sequence A and B consisting of N numbers that are permutations of 1,2,3,...,N. At each step, you choose a set S in sequence A in order from left to right (the numbers selected will be removed from A), then reverse S and add all elements in S to the beginning of the sequence A. Find a way to transform A into B in log2(n) steps.
Input: N <= 10^4 (number of elements of sequence A, B) and 2 permutations sequence A, B.
Output: K (Number of steps to convert A to B). The next K lines are the set of numbers S selected at each step.
Example:
Input:
5 // N
5 4 3 2 1 // A sequence
2 5 1 3 4 // B sequence
Output:
2
4 3 1
5 2
Step 0: S = {}, A = {5, 4, 3, 2, 1}
Step 1: S = {4, 3, 1}, A = {5, 2}. Then reverse S => S = {1, 3, 4}. Insert S to beginning of A => A = {1, 3, 4, 5, 2}
Step 2: S = {5, 2}, A = {1, 3, 4}. Then reverse S => S = {2, 5}. Insert S to beginning of A => A = {2, 5, 1, 3, 4}
My solution is to use backtracking to consider all possible choices of S in log2(n) steps. However, N is too large so is there a better approach? Thank you.
For each operation of combined selecting/removing/prepending, you're effectively sorting the elements relative to a "pivot", and preserving order. With this in mind, you can repeatedly "sort" the items in backwards order (by that I mean, you sort on the most significant bit last), to achieve a true sort.
For an explicit example, let's take the sequence 7 3 1 8. Rewrite the terms with their respective positions in the final sorted list (which would be 1 3 7 8), to get 2 1 0 3.
7 -> 2 // 7 is at index 2 in the sorted array
3 -> 1 // 3 is at index 1 in the sorted array
1 -> 0 // and so on
8 -> 3
This new array is equivalent to the original; we are just using indices to refer to the values indirectly (if you squint hard enough, we're rewriting the unsorted list as pointers into the sorted list, rather than values).
Now, let's write these new values in binary:
2 10
1 01
0 00
3 11
If we were to sort this list, we'd first sort by the MSB (most significant bit) and then tiebreak only where necessary on the subsequent bit(s) until we're at the LSB (least significant bit). Equivalently, we can sort by the LSB first, then sort all values on the next most significant bit, and continue in this fashion until we're at the MSB. This correctly sorts the list as long as each sort is stable, that is, it doesn't change the order of elements that are considered equal.
Let's work this out by example: if we sorted these by the LSB, we'd get
2 10
0 00
1 01
3 11
-and then following that up with a sort on the MSB (but no tie-breaking logic this time), we'd get:
0 00
1 01
2 10
3 11
-which is the correct, sorted result.
Remember the "pivot" sorting note at the beginning? This is where we use that insight. We're going to take this transformed list 2 1 0 3, and sort it bit by bit, from the LSB to the MSB, with no tie-breaking. And to do so, we're going to pivot on the criterion "current bit == 0".
This is effectively what we just did in our last example, so in the name of space I won't write it out again, but have a look again at what we did in each step. We took the elements with the bits we were checking that were equal to 0, and moved them to the beginning. First, we moved 2 (10) and 0 (00) to the beginning, and then the next iteration we moved 0 (00) and 1 (01) to the beginning. This is exactly what operation your challenge permits you to do.
Additionally, because our numbers are reduced to their indices, the max value is len(array)-1, and the number of bits is log2() of that, so overall we'll only need to do log2(n) steps, just as your problem statement asks.
Now, what does this look like in actual code?
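from itertools import product

nums = [5, 9, 1, 3, 2, 7]
# Number of bits needed to represent the largest rank, len(nums) - 1.
size = max(1, (len(nums) - 1).bit_length())
# bit_table[r] is the rank r written as a tuple of bits, MSB first.
bit_table = list(product([0, 1], repeat=size))
# Map each value to its rank (its index in sorted order).
idx_table = {x: i for i, x in enumerate(sorted(nums))}
# Walk from the LSB (last tuple position) to the MSB (position 0).
for bit_idx in range(size)[::-1]:
    # The set S for this step: every element whose current bit is 0.
    subset_vals = [x for x in nums if bit_table[idx_table[x]][bit_idx] == 0]
    # A stable sort on that bit moves S to the front and keeps relative order.
    nums.sort(key=lambda x: bit_table[idx_table[x]][bit_idx])
    print(" ".join(map(str, subset_vals)))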
You can of course use bitwise operators to accomplish the bit magic ((thing >> bit_idx) & 1) if you want, and you could del slices of the list and prepend them instead of calling .sort(); this is just a proof of concept to show that it actually works. The actual output is:
1 3 7
1 7 9 2
1 2 3 5
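The variant hinted at above (bitwise operators, and prepending the selected elements instead of calling .sort()) could look roughly like this. It is only a sketch: the function name selection_steps is made up here, the values are assumed to be distinct, and, like the proof of concept above, it keeps the selected elements in their original order rather than modelling the reversal from the problem statement.

```python
def selection_steps(nums):
    """Return (steps, final) where steps[i] is the set moved to the front in step i."""
    # rank[x] is the index of x in the sorted list.
    rank = {x: i for i, x in enumerate(sorted(nums))}
    bits = max(1, (len(nums) - 1).bit_length())
    current = list(nums)
    steps = []
    for b in range(bits):  # b = 0 is the least significant bit
        selected = [x for x in current if (rank[x] >> b) & 1 == 0]
        rest = [x for x in current if (rank[x] >> b) & 1 == 1]
        steps.append(selected)
        # Remove the selected elements and put them at the beginning.
        current = selected + rest
    return steps, current


steps, final = selection_steps([5, 9, 1, 3, 2, 7])
for step in steps:
    print(" ".join(map(str, step)))
print(final)  # [1, 2, 3, 5, 7, 9]
```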

Practical algorithms for permuting external memory

On a spinning disk, I have N records that I want to permute. In RAM, I have an array of N indices that contain the desired permutation. I also have enough RAM to hold n records at a time. What algorithm can I use to execute the permutation on disk as quickly as possible, taking into account the fact that sequential disk access is a lot faster?
I have plenty of excess disk to use for intermediate files, if desired.
This is a known problem. Find the cycles in your permutation order. For instance, given five records to permute [1, 0, 3, 4, 2], you have cycles (0, 1) and (2, 3, 4). You do this by picking an unused starting position; follow the index pointers until you return to your starting point. The sequence of pointers describes a cycle.
You then permute the records with an internal temporary variable, one record long.
temp = disk[0]
disk[0] = disk[1]
disk[1] = temp
temp = disk[2]
disk[2] = disk[3]
disk[3] = disk[4]
disk[4] = temp
Note that you can also perform the permutation as you traverse the pointers. You will also need some method to recall which positions have already been permuted, such as clearing the permutation index (set it to -1).
Can you see how to generalize that?
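As a rough sketch of the cycle approach above, with the disk simulated by a Python list: find_cycles and permute_by_cycles are illustrative names, and position i is assumed to receive the record currently at perm[i], matching the temp-variable listing. A separate seen list is used here for clarity; overwriting the permutation entries with -1, as suggested above, would avoid it.

```python
def find_cycles(perm):
    """Return the non-trivial cycles of perm as lists of positions."""
    seen = [False] * len(perm)
    cycles = []
    for start in range(len(perm)):
        cycle = []
        pos = start
        # Follow the index pointers until we return to a visited position.
        while not seen[pos]:
            seen[pos] = True
            cycle.append(pos)
            pos = perm[pos]
        if len(cycle) > 1:
            cycles.append(cycle)
    return cycles


def permute_by_cycles(disk, perm):
    """Rearrange disk in place so that disk[i] becomes the old disk[perm[i]]."""
    for cycle in find_cycles(perm):
        temp = disk[cycle[0]]  # the one-record temporary
        for here, source in zip(cycle, cycle[1:]):
            disk[here] = disk[source]
        disk[cycle[-1]] = temp


print(find_cycles([1, 0, 3, 4, 2]))  # [[0, 1], [2, 3, 4]]
disk = ["a", "b", "c", "d", "e"]
permute_by_cycles(disk, [1, 0, 3, 4, 2])
print(disk)  # ['b', 'a', 'd', 'e', 'c']
```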
This is a problem of interval coordination. I'll simplify the notation slightly by changing the memory available to M records -- having upper- and lower-case N is a little confusing.
First, we re-cast the permutation as a series of intervals: the rotational span during which a record needs to reside in RAM. If a record needs to be written to a lower-numbered position, we increase the endpoint by the list size to indicate the wraparound, since the write has to wait for the next disk rotation. For instance, using my earlier example, we expand the list:
[1, 0, 3, 4, 2]
0 -> 1
1 -> 0+5
2 -> 3
3 -> 4
4 -> 2+5
Now, we apply standard greedy scheduling resolution. First, sort by endpoint:
[0, 1]
[2, 3]
[3, 4]
[1, 5]
[4, 7]
Now, apply the algorithm for M-1 "lanes"; the extra one is needed for swap space. We fill each lane, appending the interval with the earliest endpoint, whose start-point doesn't overlap:
[0, 1] [2, 3] [3, 4] [4, 7]
[1, 5]
We can do this in a total of 7 "ticks" if M >= 3. If M=2, we defer the second lane by 2 rotations to [11, 15].
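A small sketch of the interval construction and lane filling described above, under a few assumptions: records that stay in place produce no interval, an interval may start at the same tick another one ends (as [3, 4] and [4, 7] do in the lane above), and deferral for small M is not modelled. The names build_intervals and assign_lanes are illustrative.

```python
def build_intervals(perm):
    """Interval (i, target) per record; targets that wrap around get + len(perm)."""
    n = len(perm)
    return [(i, t if t > i else t + n) for i, t in enumerate(perm) if t != i]


def assign_lanes(intervals):
    """Greedily fill lanes, always taking the earliest endpoint that fits."""
    remaining = sorted(intervals, key=lambda iv: iv[1])
    lanes = []
    while remaining:
        lane, leftover = [], []
        for start, end in remaining:
            # An interval fits if it starts no earlier than the lane's last end.
            if not lane or start >= lane[-1][1]:
                lane.append((start, end))
            else:
                leftover.append((start, end))
        lanes.append(lane)
        remaining = leftover
    return lanes


for lane in assign_lanes(build_intervals([1, 0, 3, 4, 2])):
    print(lane)
# [(0, 1), (2, 3), (3, 4), (4, 7)]
# [(1, 5)]
```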
Sneftal's nice example gives us more troubles, with deeper overlap:
[0, 4]
[1, 5]
[2, 6]
[3, 7]
[4, 0+8]
[5, 1+8]
[6, 2+8]
[7, 3+8]
This requires 4 "lanes" if available, deferring lanes as needed if M < 5.
The pathological case is where every record in the permutation needs to be copied back one position, such as [3, 0, 1, 2], with M=2.
[0, 3]
[1, 4]
[2, 5]
[3, 6]
In this case, we walk through the deferral cycle multiple times. At the end of every rotation, we have to defer all remaining intervals by one rotation, resulting in
[0, 3] [3, 6] [2+4, 5+4] [1+4+4, 4+4+4]
Does that get you moving, or do you need more detail?
I have an idea, which might need further improvement. But here it goes:
suppose the hdd has the following structure:
5 4 1 2 3
And we want to write out this permutation:
2 3 5 1 4
Since hdd is a circular buffer, and assuming it can only rotate in one direction, we can write the above permutation using shifts as such:
5 >> 2
4 >> 3
1 >> 1
2 >> 2
3 >> 2
So let's put that in an array, and since we know it is a circular array, lets put its mirrors side by side:
| 2 3 1 2 2 | 2 3 1 2 2| 2 3 1 2 2 | 2 3 1 2 2 |... Inf
Since we want to favor sequential reads, (or writes) we can put a cost function to the above series. Let the cost function be linear, i. e:
0 1 2 3 4 5 6 7 8 9 10 ... Inf
Now, let us add the cost function to the above series, but how to select the starting point?
The idea is to select the starting point such that you get the maximum congruent monotonically increasing sequence.
For example, if you select the 0 point to be on "3", you'll get
(1) | - 3 2 4 5 | 6 8 7 9 10 | ...
If you select the 0 point to be on "2", the one just right of "1", you'll get:
(2) | - - - 2 3 | 4 6 5 7 8 | ...
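The shift amounts and the two cost-added series (1) and (2) above can be reproduced with a short sketch. The helper names shift_amounts and cost_series are invented for illustration, and the records are assumed to be distinct.

```python
def shift_amounts(current, wanted):
    """How far each record must travel forward (with wraparound) to reach its target."""
    n = len(current)
    target = {value: pos for pos, value in enumerate(wanted)}
    return [(target[value] - pos) % n for pos, value in enumerate(current)]


def cost_series(shifts, start, length):
    """Add the linear cost 0, 1, 2, ... to the circular shift array from index start."""
    n = len(shifts)
    return [shifts[(start + k) % n] + k for k in range(length)]


shifts = shift_amounts([5, 4, 1, 2, 3], [2, 3, 5, 1, 4])
print(shifts)                     # [2, 3, 1, 2, 2]
print(cost_series(shifts, 1, 9))  # [3, 2, 4, 5, 6, 8, 7, 9, 10]
print(cost_series(shifts, 3, 7))  # [2, 3, 4, 6, 5, 7, 8]
```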
Since we are trying to favor consecutive reads, lets define our read-write function to work as such:
f():
At any currently pointed hdd location, function will read the currently pointed hdd file, into available RAM. (namely, total space - 1, because we want to save 1 for swap)
If no available space is left on RAM for read, the function will assert and program will halt.
At any current hdd location, if ram holds the value that we want to be written in that hdd location, function reads the current file into swap space, writes the wanted value from the ram to hdd, and destroys the value in ram.
If a value is placed into hdd, function will check if the sequence is completed. If it is, program will return with success.
Now, we should note that if the following holds:
shift amount <= n - 1 (n : available memory we can hold)
We can traverse the hard disk in one pass using the above function. For example:
current: 4 5 6 7 0 1 2 3
we want: 0 1 2 3 4 5 6 7
n : 5
We can start anywhere we want, say from the initial "4". We read 4 items sequentially, (n has 4 items now) and we start placing from 0 1 2 3, (we can because n = 5 total, and 4 is used. 1 is used for swap). So the total operations is 4 consecutive reads, and then r-w operations for 8 times.
Using that analogy, it becomes clear that if we subtract "n-1" from equations (1) and (2), the positions which have value "<= 0" will be a better suit for initial position because the ones higher than zero will definitely require another pass.
So we select eq. (2) and subtract, for let's say "n = 3", we subtract 2 from eq. (2):
(2) | - - - 0 1 | 2 4 3 5 6 | ...
Now it is clear that, using f(), and starting from 0, assuming n = 3, we will have a starting operation as such: r, r, r-w, r-w, ...
So, how do we do the rest and find minimum cost? We will place an array with initial minimum cost, just below equation (2). The positions in that array will signify where we want f() to be executed.
| - - - 0 1 | 2 4 3 5 6 | ...
| - - - 1 1 | 1 1 1 1 1 | ...
The second array, the ones with 1's and 0's tell the program where to execute f(). Note that, if we assumed those locations wrong, f() will assert.
Before we start actually placing files into hdd, we of course want to see if the f() positions are correct. We check if there are assertions, and we will try to minimize cost whilst removing all assertions. So, e.g:
(1) 1111000000000000001111
(2) 1111111000000000000000
(1) obviously has a higher cost than (2). So the question simplifies to finding the 1-0 array.
Some ideas on finding the best array:
Simplest solution is to write out all 1's and turn assertions into 0's. (essentially it's a skip). This method is guaranteed to work.
Brute force: write an array as shown in (2) and start shifting 1's to the right, in such an order that tries out every permutation available:
1111111100000000
1111111010000000
1111110110000000
...
Full random approach: Plug in mt19937 and start permuting. Whenever you see a sharp drop in cost, stop executing and implement hdd copy-paste. You won't find the global minimum, but you'll get a nice trade-off.
Genetic algorithms: For permutations where "shift count is much lower than n - 1", the methodology provided in this answer should (?) provide a global minimum and smooth gradients. This allows one to use genetic algorithms without relying on mutations too much.
One advantage I find in this approach is that, since OP mentioned that this is a real-life problem, the method provides an easy(ier?) way to change cost functions. It is easier to detect the effect of, say, having lots of contiguous small files to be copied vs. having a single huge file. Or perhaps rrwwrrww is better than rrrrwwww?
Does any of this even make sense? We will have to try out ...

How to perform range updates in sqrt{n} time?

I have an array and I have to perform query and updates on it.
For queries, I have to find the frequency of a particular number in a range from l to r, and for updates, I have to add x to every element in some range l to r.
How to perform this?
I thought of sqrt{n} optimization but I don't know how to perform range updates with this time complexity.
Edit - Since some people are asking for an example, here is one
Suppose the array is of size n = 8
and it is
1 3 3 4 5 1 2 3
And here are 3 queries to help explain what I am trying to say.
Here they are
q 1 5 3 - This means that you have to find the frequency of 3 in range 1 to 5 which is 2 as 3 appears on 2nd and 3rd position.
second is update query and it goes like this - u 2 4 6 -> This means that you have to add 6 in the array from range 2 to 4. So the new array will become
1 9 9 10 5 1 2 3
And the last query is again the same as first one which will now return 0 as there is no 3 in the array from position 1 to 5 now.
I believe things must be more clear now. :)
I developed this algorithm a long time ago (20+ years) for an arithmetic coder.
Both Update and Retrieve are performed in O(log(N)).
I named this algorithm the "Method of Intervals". Let me show you an example.
Imagine we have 8 intervals, numbered 0-7:
+--0--+--1--+--2-+--3--+--4--+--5--+--6--+--7--+
Let's create an additional set of intervals, each spanning a pair of the original ones:
+----01-----+----23----+----45-----+----67-----+
Then we create one more layer of intervals, each spanning a pair from the second layer:
+---------0123---------+---------4567----------+
Finally, we create a single interval that covers all 8:
+------------------01234567--------------------+
As you can see, in this structure, to retrieve the right border of interval [5] you just add together the lengths of intervals [0123] + [45]. To retrieve the left border of interval [5], you need the sum of the lengths of intervals [0123] + [4] (the left border of 5 is the right border of 4).
Of course, the left border of interval [0] is always 0.
If you look at this structure carefully, you will see that the odd elements in each layer aren't needed. That is, you do not need elements 1, 3, 5, 7, 23, 67, 4567, since they are never used during Retrieval or Update.
Let's remove the odd elements and apply the following renumbering:
+--1--+--x--+--3-+--x--+--5--+--x--+--7--+--x--+
+-----2-----+-----x----+-----6-----+-----x-----+
+-----------4----------+-----------x-----------+
+----------------------8-----------------------+
As you can see, this renumbering uses the numbers [1-8]. Let them be array indexes, so the memory used is O(N).
To retrieve the right border of interval [7], you add the lengths stored at indexes 4, 6, 7. To update the length of interval [7], you add the difference to every stored value that covers it, here indexes 7 and 8. As a result, both Retrieval and Update take O(log(N)) time.
Now we need an algorithm that, given the original interval number, computes the set of indexes in this data structure. For instance, how to convert:
1 -> 1
2 -> 2
3 -> 3,2
...
7 -> 7,6,4
This is easy if we look at the binary representation of these numbers:
1 -> 1
10 -> 10
11 -> 11,10
111 -> 111,110,100
As you can see, in each chain the next value is the previous value with its rightmost "1" changed to "0". Using the simple bit operation "x & (x - 1)", we can write a simple loop to iterate over the array indexes related to an interval number:
int interval = 7;
do {
    int index = interval;      /* visits 7, 6, 4 in turn */
    do_something(index);
} while (interval &= interval - 1);  /* clear the rightmost 1 bit; stop at 0 */
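The structure described above is commonly known as a Fenwick tree (binary indexed tree). A minimal sketch with 1-based indexes follows; the class and method names are illustrative. Retrieval clears the rightmost 1 bit as in the loop above, while an update walks the other way by adding the lowest set bit, so that every stored interval covering the position is adjusted.

```python
class FenwickTree:
    def __init__(self, n):
        self.n = n
        self.tree = [0] * (n + 1)  # index 0 is unused

    def update(self, i, delta):
        """Add delta to the length of interval i (1-based)."""
        while i <= self.n:
            self.tree[i] += delta
            i += i & -i  # move to the next stored interval that covers i

    def prefix_sum(self, i):
        """Right border of interval i: sum of lengths 1..i."""
        total = 0
        while i > 0:
            total += self.tree[i]
            i &= i - 1  # clear the rightmost 1 bit
        return total


ft = FenwickTree(8)
for position, length in enumerate([3, 1, 4, 1, 5, 9, 2, 6], start=1):
    ft.update(position, length)
print(ft.prefix_sum(7))  # 25
ft.update(7, 10)
print(ft.prefix_sum(7))  # 35
```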

Find the swapped nodes in binary search tree

Two of the nodes of a Binary Search Tree are swapped.
Input Tree:
        10
       /  \
      5    8
     / \
    2   20
In the above tree, nodes 20 and 8 must be swapped to fix the tree.
Output tree:
        10
       /  \
      5    20
     / \
    2   8
I followed the solution given in here. But I feel the solution is incorrect because:
As per the site:
The swapped nodes are not adjacent in the inorder traversal of the BST.
For example, Nodes 5 and 25 are swapped in {3 5 7 8 10 15 20 25}.
The inorder traversal of the given tree is 3 25 7 8 10 15 20 5 If we
observe carefully, during inorder traversal, we find node 7 is smaller
than the previous visited node 25. Here save the context of node 25
(previous node). Again, we find that node 5 is smaller than the
previous node 20. This time, we save the context of node 5 ( current
node ). Finally swap the two node’s values.
So my point is: if it considers 25 because it is greater than 7, then it should consider 20 as well because it is also greater than 5. So is this solution correct, or am I missing something?
Yes. It is considering 25 because it is greater than 7. But, it should not consider 20 as well because it is also greater than 5. Instead, it should consider 5 because it is less than 20.
This example is not very good, because 5 ends up in the last position of the traversal. Let's consider a sorted array {1, 2, 3, 4, 5}. Swap 2 and 4 and we get {1, 4, 3, 2, 5}. If two non-adjacent elements of a sorted array are swapped, then among all pairs (A[i], A[i+1]) there will be exactly two that are in the wrong (descending) order. In {1, 4, 3, 2, 5}, these are the pairs (4, 3) and (3, 2). Suppose the pairs are (A[p], A[p+1]) and (A[q], A[q+1]) with p < q, A[p] > A[p+1] and A[q] > A[q+1]; then we can claim that A[p] and A[q+1] are the swapped elements. In {1, 4, 3, 2, 5}, those are 4 and 2.
Now come back to the example 3 25 7 8 10 15 20 5, in which 25, 7 and 20 5 are the only two pairs in wrong order. Then 25 and 5 are the two elements being swapped.
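A short sketch of that rule applied to an inorder sequence, assuming exactly one swap has happened. The function name find_swapped is illustrative, and it works on a plain list of values rather than on tree nodes.

```python
def find_swapped(inorder):
    """Return the two swapped values in an otherwise sorted sequence."""
    first = second = None
    for prev, cur in zip(inorder, inorder[1:]):
        if prev > cur:
            if first is None:
                first = prev  # A[p]: left element of the first descending pair
            second = cur      # A[q+1]: right element of the last descending pair
    return first, second


print(find_swapped([3, 25, 7, 8, 10, 15, 20, 5]))  # (25, 5)
print(find_swapped([1, 4, 3, 2, 5]))               # (4, 2)
print(find_swapped([1, 3, 2, 4]))                  # (3, 2), adjacent swap
```

When the swapped elements are adjacent there is only one descending pair, and the same loop returns both of its members.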
Following @jeffreys' notation,
if we have pair (A[p], A[p+1]) and pair (A[q], A[q+1]), such that A[p] > A[p+1] and A[q] > A[q+1], we can claim that it is A[p] and A[q+1] being swapped
You know that there's only a single swap, which creates either two discrepancies in the sorted order, or only one if the swapped elements are adjacent. Let's say p < q, so (A[p], A[p+1]) is the first descending pair and (A[q], A[q+1]) is the second.
If there's no second pair, then swapping the first pair fixes the tree; that's the easy part. Otherwise we know there are two non-adjacent nodes.
Out of A[p] and A[p+1], suppose A[p+1] were the one out of place. Since this is the first pair, we would have to move A[p+1] forward towards the second pair, but it would still be smaller than the earlier A[p] that stayed in place, so we would not get a sorted array. We must therefore choose A[p].
The same goes for A[q] and A[q+1]: suppose A[q] were out of place. Then we would have to move it backwards, and it would still be larger than the A[q+1] appearing later, again breaking the sort. We must therefore choose A[q+1].

Permutation of a vector

suppose I have a vector:
0 1 2 3 4 5
[45,89,22,31,23,76]
And a permutation of its indices:
[5,3,2,1,0,4]
Is there an efficient way to resort it according to the permutation thus obtaining:
[76,31,22,89,45,23]
Using at most O(1) additional space?
Yes. Starting from the leftmost position, we put the element there in its correct position i by swapping it with the (other) misplaced element at that position i. This is where we need the O(1) additional space. We keep swapping pairs of elements around until the element in this position is correct. Only then do we proceed to the next position and do the same thing.
Example:
[5 3 2 1 0 4] initial state
[4 3 2 1 0 5] swapped (5,4), 5 is now in the correct position, but 4 is still wrong
[0 3 2 1 4 5] swapped (4,0), now both 4 and 0 are in the correct positions, move on to next position
[0 1 2 3 4 5] swapped (3,1), now 1 and 3 are both in the correct positions, move on to next position
[0 1 2 3 4 5] all elements are in the correct positions, end.
Note:
Since each swap operation puts at least one (of the two) elements in the correct position, we need no more than N such swaps altogether.
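For the exact output asked for in the question (result[i] = a[perm[i]]), one possible O(1)-extra-space sketch follows each cycle with a single temporary instead of using pairwise swaps. The function name is illustrative, and note the assumption that the permutation array may be overwritten: finished positions are marked by setting perm[j] = j.

```python
def apply_permutation_in_place(a, perm):
    """Rearrange a so that a[i] becomes the old a[perm[i]]. Destroys perm."""
    for i in range(len(a)):
        temp = a[i]  # the only extra storage besides loop indexes
        j = i
        # Pull each element of the cycle from its source position.
        while perm[j] != i:
            k = perm[j]
            a[j] = a[k]
            perm[j] = j  # mark position j as done
            j = k
        a[j] = temp
        perm[j] = j


a = [45, 89, 22, 31, 23, 76]
apply_permutation_in_place(a, [5, 3, 2, 1, 0, 4])
print(a)  # [76, 31, 22, 89, 45, 23]
```

Every position is written once along its cycle, so the total work is O(N) even though the loops are nested.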
Zach's solution is very good.
Still, I was wondering why there is any need to sort. If you have the permutation of the indices, use the values as a pointer to the old array.
This may eliminate the need to sort the array in the first place. This is not a solution that can be used in all cases, but it will work fine in most cases.
For example:
a = [45,89,22,31,23,76];
b = [5,3,2,1,0,4]
Now if you want to loop through the values in a, you can do something like (pseudo-code):
for i=0 to 5
{
    process(a[i]);
}
If you want to loop through the values in the new order, do:
for i=0 to 5
{
    process(a[b[i]]);
}
As mentioned earlier, this solution may be sufficient in many cases, but not in others. For those other cases you can use the solution by Zach. But where this solution can be used, it is better because no sorting is needed at all.
