I have a shuffling problem. There is lots of pages and discussions about shuffling a array of values completely, like a stack of cards.
What I need is a shuffle that will uniformly displace the array elements at most N places away from its starting position.
That is If N is 2 then element I will be shuffled at most to a position from I-2 to I+2 (within the bounds of the array).
This has proven to be tricky with some simple solutions resulting in a directional bias to the element movement, or by a non-uniform amount.
You're right, this is tricky! First, we need to establish some more rules, to ensure we don't create artificially non-random results:
Elements can be left in the position they started in. This is a necessary part of any fair shuffle, and also ensures our shuffle will work for N=0.
When N is larger than an element's distance from the start or end of the array, it's allowed to be moved to the other side. We could tweak the algorithm to forbid this, but it would violate the "uniformly" requirement - elements near either end would be more likely to stay put than elements near the middle.
Now we can actually solve the problem.
Generate an array of random value in the range i + [-N, N] where i is the current index in the array. Normalize values outside the array bounds (e.g. -1 should become length-1 and length should become 0).
Look for pairs of duplicate values (collisions) in the array, and recompute them. You have a few options:
Recompute both values until they don't collide with each other, they could both still collide with other values.
Recompute just one until it doesn't collide with the other, the first value could still collide, but the second should now be unique, which might mean fewer calls to the RNG.
Identify the set of available indices for each collision (e.g. in [3, 1, 1, 0] index 2 is available), pick a random value from that set, and set one of the array values to selected result. This avoids needing to loop until the collision is resolved, but is more complex to code and risks running into a case where the set is empty.
However you address individual collisions, repeat the process until every value in the array is unique.
Now move each element in the original array to the index specified in the array we generated.
I'm not sure how to best implement #2, I'd suggest you benchmark it. If you don't want to take the time to benchmark, I'd go with the first option. The others are optimizations that might be faster, but might actually end up being slower.
This solution has an unbounded runtime in theory, but should terminate reasonably quickly in practice. Again, benchmark and test it before using it anywhere critical.
One possible solution I have come up with though how 'naive' it is I am not certain. Especially at edges, the far edge especially.
create a array of flags (boolean) N long (representing elements that have been swapped)
For At each index check if it has already been swapped (according first element in flags array) if so, move on to next (see below)
rotate the flags array, deleting the first element (representing this
element), and add a new 'not swapped' element to end. ASIDE: This
maybe done using a modulus array lookup, to avoid having to actually
move array contents, especially for large N
Loop...
pick a number from 0 to N (or less than N, if N plus current
index is larger that array being shuffled.
If 0, element swaps with itself, move to next.
Otherwise if that element marked as swapped, Loop and try again.
Note there is always 2 elements in flags array that can be picks, itself
and the last element (unless close to end of array being shuffled)
Swap current element with selected unswapped element, mark the selected element as swapped in the flags array. Loop to next element
Related
I have a ragged array represented as a contiguous block of memory along with its "shape", corresponding to the length of each row, and its "offsets", corresponding to the index of the first element in each row. To illustrate, I have an array conceptually like this:
[[0, 0, 0],
[1, 1, 1, 1],
[2],
[3, 3],
[4, 4]]
Represented in-memory as:
values: [0, 0, 0, 1, 1, 1, 1, 2, 3, 3, 4, 4]
shape: [3, 4, 1, 2, 2]
offsets: [0, 3, 7, 8, 10]
I may have in the hundreds of millions of rows with typically, say, 3-20 four-byte floats per row, though with no hard upper bound on the row length.
I wish to shuffle the rows of this array randomly. Since the array is ragged, I can't see how the Fisher-Yates algorithm can be applied in a straightforward manner. I see how I can carry out a shuffle by randomly permuting the array shape, pre-allocating a new array, and then copying over rows according to the permutation generating the new shape with some book-keeping on the indexes. However, I do not necessarily have the RAM required to duplicate the array for the purposes of this operation.
Therefore, my question is whether there is a good way to perform this shuffle in-place, or using only limited extra memory? Run-time is also a concern, but shuffling is unlikely to be the main bottleneck.
For illustration purposes, I wrote a quick toy-version in Rust here, which attempts to implement the shuffle sketched above with allocation of a new array.
shape is redundant since shape[i] is offset[i+1]-offset[i] (if you extend offset by one element containing the length of the values array). But since your data structure has both these fields, you could shuffle your array by just in-place shuffling the two descriptor vectors (in parallel), using F-Y. This would be slightly easier if shape and offset were combined into an array of pairs (offset, length), which also might improve locality of reference, but it's certainly not critical if you have some need for the separate arrays.
That doesn't preserve the contiguity of the rows in the values list, but if all your array accesses are through offset, it will not require any other code modification.
It is possible to implement an in-place swap of two variable-length subsequences using a variant of the classic rotate-with-three-reversals algorithm. Given P V Q, a sequence conceptually divided into three variable length parts, we first reverse P, V, and Q in-place independently producing PR VR QR. Then we reverse the entire sequence in place, yielding Q V P. (Afterwards, you'd need to fixup the offsets array.)
That's linear time in the length of the span from P to Q, but as a shuffle algorithm it will add up to quadratic time, which is impractical for "hundreds of millions" of rows.
As often happens, I started with a complex idea and then simplified it. Here is the simple version, with the complex one below.
What we're going to do is quicksort it into a random arrangement. The key operation is partitioning. That is we want to take a section of m blocks and randomly partition them into m_l blocks on the left and m_r on the right.
The idea here is this. We have a queue of temporarily copied blocks on the left, and a queue of temporarily copied blocks on the right. It is a queue of blocks, but the queue size is the number of elements in it. The partitioning logic looks like this:
while m_l + m_r < m:
pick the larger queue, breaking ties randomly
if the queue is empty:
read a block into it
get block from queue
if random() < m_l / (m_l + m_r):
# put the block on the left
until we have enough room to write the block:
copy blocks into the left queue
write block to the left
m_l--
else:
# put the block on the right
until we have enough room to write the block:
copy blocks into the right queue
write block to the right
m_r--
And now we need to recursively repeat until we've quicksorted it into a random order.
Note that, unlike with a regular quicksort, we are constructing our partitions to be exactly the size we expect. So we don't have to recurse. Instead we can:
# pass 1
partition whole array
# pass 2
partition 0..(m//2-1)
partition (m//2)..(m-1)
# pass 3
partition 0..(m//4-1)
partition (m//4)..(m//2-1)
partition (m//2)..((3*m)//4-1)
partition ((3*m)//4)..(m-1)
etc.
The result is time O(n * log(m)). And the queues will never get past 5k data where k is the largest block size.
Here is an approach that we can calculate in time O(n log(n)). The maximum space needed is O(k) where k is the maximum block size.
First note, shape and offsets are largely redundant. Because shape[i] = offset[i+1] - offset[i] for all i but the last. So with O(1) extra data (which we already have in values.len()) we can make shape redundant, then (ab)use it, however we want, and then calculate it at the end.
So let's start by picking a random permutation of 0..(shape.len()-1) and placing it in shape. This will be where each element will go, and can be found in time O(n) using Fisher-Yates.
Our idea now is to use quicksort to actually get them to the right places.
First, our pivot. For O(n) work in a single pass we can add up the lengths of all blocks which will come before the median block, and also find the length of said median block.
Now quicksort is dependent upon being able to swap things. But we can't do that directly (your whole problem). However the idea is that we'll partition from the middle out. And so the values, shape and offsets arrays will have beginning and ending sections that we haven't gotten to, and a piece in the middle that we've rewritten. Where those sections meet we'll also need to have queues of blocks copied off of the left and right and not yet written back. And, of course, we'll need to have a record of where the boundaries are.
So the idea is this.
set up the data structures.
copy a few blocks in the middle to one of the queues - enough to have a place for the median block to go.
record where the median will go
while have not finished partitioning:
pick the larger queue (break ties by farthest from its end, then randomly)
if it is empty:
read a block into it
figure out where its next block needs to be written
copy blocks in its way to the appropriate queue
write the block out
Where writing the block out means writing its elements to the right place, setting its offset, and setting its shape to the still final location for that block.
This operation will partition around the median block. Recursively repeat to sort each side into blocks being in their final position.
And, finally, fix the shape array back to what it was supposed to be.
The time complexity of this is O(n log(n)) for the same reason that quicksort is. As for space complexity, if k is the largest block size, any time the queues get past size 4k then the next block you extract must be able to be written, so they cannot grow any farther. This makes the maximum space used O(k).
At first, I am given an array of fixed size, call it v. The typical size of v would be a few thousand entries. I start by computing the maximum of that array.
Following that, I am periodically given a new value for v[i] and need to recompute the value of the maximum.
What is a practically fast way (average time) of computing that maximum?
Edit: we can assume that the process is:
1) uniformly choosing a random entry;
2) changing its value to a uniform value between [0,1].
I believe this specifies the problem a bit better and allows an unequivocal "best answer" (which will depend on the array size).
You can maintain a max-heap of that array. The element can be index to the array. for every element of the array, you should also have some indexes to the element of max-heap. so every time v[i] is changed, you only need O(log(n)) to maintain the heap. (if v[i] is increased, it will go up in the heap, if v[i] is decreased, it will go down in the heap).
If the changes to the array are random, e.g. v[rand()%size] = rand(), then most of the time the max won't decrease.
There are two main ways I can think of to handle this: keep the full collection sorted on the fly, or track just the few (or one) highest elements. The choice depends on the relative importance of worst-case, average case, and fast-path. (Including code and data cache footprint of the common case where the change doesn't affect anything you're tracking.)
Really low complexity / overhead / code size: O(1) average case, O(N) worst case.
Just track the current max, (and optionally its position, if you can't get the old value to see if it == max before applying the change). On the rare occasion that the element holding the max decreased, rescan the whole array. Otherwise just see if the new element is greater than max.
The average complexity should be O(1) amortized: O(N) for N changes, since on average one of N changes affects the element holding the max. (And only half those changes decrease it).
A bit more overhead and code size, but less frequent scans of the full array: O(1) typical case, O(N) worst case.
Keep a priority queue of the 4 or 8 highest elements in the array (position and value). When an element in the PQueue is modified, remove it from the PQueue. Try to re-add the new value to the PQueue, but only if it won't be the smallest element. (It might be smaller than some other element we're not tracking). If the PQueue is empty, rescan the array to rebuild it to full size. The current max is the front of the PQueue. Rescanning the array should be quite rare, and in most cases we only have to touch about one cache line of data holding our PQueue.
Since the small PQueue needs to support fast access to the smallest and the largest element, and even finding elements that aren't the min or max, a sorted-array implementation probably makes the most sense, rather than a Heap. If it's only 8 elements, a linear search is probably best, too. (From the smallest element upwards, so the search ends right away if the old value of the element modified is less than the smallest value in the PQueue, the search stops right away.)
If you want to optimize the fast-path (position modified wasn't in the PQueue), you could store the PQueue as struct pqueue { unsigned pos[8]; int val[8]; }, and use vector instructions (e.g. x86 SSE/AVX2) to test i against all 8 positions in one or two tests. Hrm, actually just checking the old val to see if it's less than PQ.val[0] should be a good fast-path.
To track the current size of the PQueue, it's probably best to use a separate counter, rather than a sentinel value in pos[]. Checking for the sentinel every loop iteration is probably slower. (esp. since you'd prob. need to use pos to hold the sentinel values; maybe make it signed after all and use -1?) If there was a sentinel you could use in val[], that might be ok.
slower O(log N) average case, but no full-rescan worst case:
Xiaotian Pei's solution of making the whole array a heap. (This doesn't work if the ordering of v[] matters. You could keep all the elements in a Heap as well as in the ordered array, but that sounds cumbersome.) Re-heapifying after changing a random element will probably write several other cache lines every time, so the common case is much slower than for the methods that only track the top one or few elements.
something else clever I haven't thought of?
I am implementing an parallel sorting by regular sampling algorithm which is described here. I am stuck in a point at which I need to migrate sorted sublists to proper places of the sorted array. The problem can be stated in that way: There is one global array. The array has been divided into p subarrays.Each of those subarrays was sorted. p-1 global pivot elements were determined and each sub-array was divided into p sub-sub arrays (yellow, red, green). Now I need to move those sub-sub-arrays so that sub-sub-arrays with local index i are in the thread i (so they are ordered in such manner at which colors are neighbouring and the order from left to right remains).
Actually serial algorithm will do, but I just have no clever idea how to obtain a proper permutation. The following figure shows a case for p=3 threads. Yellow color denotes a sub-sub-array 0, red - 1, green - 2.
The sub-sub arrays may have different sizes.
Ok Seems like I don't have enough reputation to comment on your question, so I shall take the route of posting the answer.
So let me get this straigh. You are stuck on phase 3 of this algo. Right?
How about this:
Let's have p linkedLists of indexes. Let each process communicate the index ranges to process i; as the indexes are communicated, append the indexes to list of process i. When all the communications are over, you shall have the all the indexes for process i in the list of process i. Node of this list should be a data structre like
Node {
index
valueOfIndex
}
Now as you populate the list, copy its value also in the list.
Once you are through with the process. You can recrate your array for process i using its list i.
????
It is a coding interview question. We are given an array say random_arr and we need to sort it using only the swap function.
Also the number of swaps for each element in random_arr are limited. For this you are given an array parent_arr, containing number of swaps for each element of random_arr.
Constraints:
You should use swap function.
Every element may repeat minimum 5 times and maximum 26 times.
You cannot make elements of given array to 0.
You should not write helper functions.
Now I will explain how parent_arr is declared. If parent_arr is like:
parent_arr[] = {a,b,c,d,...,z} then
a can be swapped at most one time.
b can be swapped at most two times.
if parent_arr[] = {c,b,a,....,z} then
c can be swapped at most one time.
b can be swapped at most two times.
a can be swapped at most three times
My solution:
For each element in random_arr[] store that how many elements are below it, if it is sorted. Now select element having minimum swap count from parent_arr[] and check whether it exist in random_arr[]. If yes and it if has occurred more than one time then it will have more than one location where it can be placed. Now choose the position(rather element at that position, preciously) with maximum swap count and swap it. Now decrease the swap count for that element and sort the parent_arr[] and repeat the process.
But it is quite inefficient and its correctness can't be proved. Any ideas?
First, let's simplify your algorithm; then let's informally prove its correctness.
Modified algorithm
Observe that once you computed the number of elements below each number in the sorted sequence, you have enough information to determine for each group of equal elements x their places in the sorted array. For example, if c is repeated 7 times and has 21 elements ahead of it, then cs will occupy the range [21..27] (all indexes are zero-based; the range is inclusive of its ends).
Go through the parent_arr in the order of increasing number of swaps. For each element x, find the beginning of its target range rb; also note the end of its target range re. Now go through the elements of random_arr outside of the [rb..re] range. If you see x, swap it into the range. After swapping, increment rb. If you see that random_arr[rb] is equal to x, continue incrementing: these xs are already in the right spot, you wouldn't need to swap them.
Informal proof of correctness
Now lets prove the correctness of the above. Observe that once an element is swapped into its place, it is never moved again. When you reach an element x in the parent_arr, all elements with lower number of swaps are already processed. By construction of the algorithm this means that these elements are already in place. Suppose that x has k number of allowed swaps. When you swap it into its place, you move another element out.
This replaced element cannot be x, because the algorithm skips xs when looking for a destination in the target range [rb..re]. Moreover, the replaced element cannot be one of elements below x in the parent_arr, because all elements below x are in their places already, and therefore cannot move. This means that the swap count of the replaced element is necessarily k+1 or more. Since by the time that we finish processing x we have exhausted at most k swaps on any element (which is easy to prove by induction), any element that we swap out to make room for x will have at least one remaining swap that would allow us to swap it in place when we get to it in the order dictated by the parent_arr.
If I have N arrays, what is the best(Time complexity. Space is not important) way to find the common elements. You could just find 1 element and stop.
Edit: The elements are all Numbers.
Edit: These are unsorted. Please do not sort and scan.
This is not a homework problem. Somebody asked me this question a long time ago. He was using a hash to solve the problem and asked me if I had a better way.
Create a hash index, with elements as keys, counts as values. Loop through all values and update the count in the index. Afterwards, run through the index and check which elements have count = N. Looking up an element in the index should be O(1), combined with looping through all M elements should be O(M).
If you want to keep order specific to a certain input array, loop over that array and test the element counts in the index in that order.
Some special cases:
if you know that the elements are (positive) integers with a maximum number that is not too high, you could just use a normal array as "hash" index to keep counts, where the number are just the array index.
I've assumed that in each array each number occurs only once. Adapting it for more occurrences should be easy (set the i-th bit in the count for the i-th array, or only update if the current element count == i-1).
EDIT when I answered the question, the question did not have the part of "a better way" than hashing in it.
The most direct method is to intersect the first 2 arrays and then intersecting this intersection with the remaining N-2 arrays.
If 'intersection' is not defined in the language in which you're working or you require a more specific answer (ie you need the answer to 'how do you do the intersection') then modify your question as such.
Without sorting there isn't an optimized way to do this based on the information given. (ie sorting and positioning all elements relatively to each other then iterating over the length of the arrays checking for defined elements in all the arrays at once)
The question asks is there a better way than hashing. There is no better way (i.e. better time complexity) than doing a hash as time to hash each element is typically constant. Empirical performance is also favorable particularly if the range of values is can be mapped one to one to an array maintaining counts. The time is then proportional to the number of elements across all the arrays. Sorting will not give better complexity, since this will still need to visit each element at least once, and then there is the log N for sorting each array.
Back to hashing, from a performance standpoint, you will get the best empirical performance by not processing each array fully, but processing only a block of elements from each array before proceeding onto the next array. This will take advantage of the CPU cache. It also results in fewer elements being hashed in favorable cases when common elements appear in the same regions of the array (e.g. common elements at the start of all arrays.) Worst case behaviour is no worse than hashing each array in full - merely that all elements are hashed.
I dont think approach suggested by catchmeifyoutry will work.
Let us say you have two arrays
1: {1,1,2,3,4,5}
2: {1,3,6,7}
then answer should be 1 and 3. But if we use hashtable approach, 1 will have count 3 and we will never find 1, int his situation.
Also problems becomes more complex if we have input something like this:
1: {1,1,1,2,3,4}
2: {1,1,5,6}
Here i think we should give output as 1,1. Suggested approach fails in both cases.
Solution :
read first array and put into hashtable. If we find same key again, dont increment counter. Read second array in same manner. Now in the hashtable we have common elelements which has count as 2.
But again this approach will fail in second input set which i gave earlier.
I'd first start with the degenerate case, finding common elements between 2 arrays (more on this later). From there I'll have a collection of common values which I will use as an array itself and compare it against the next array. This check would be performed N-1 times or until the "carry" array of common elements drops to size 0.
One could speed this up, I'd imagine, by divide-and-conquer, splitting the N arrays into the end nodes of a tree. The next level up the tree is N/2 common element arrays, and so forth and so on until you have an array at the top that is either filled or not. In either case, you'd have your answer.
Without sorting and scanning the best operational speed you'll get for comparing 2 arrays for common elements is O(N2).