Algorithm for Shuffling a Linked List in n log n time - algorithm

I'm trying to shuffle a linked list using a divide-and-conquer algorithm that randomly shuffles a linked list in linearithmic (n log n) time and logarithmic (log n) extra space.
I'm aware that I can do a Knuth shuffle similar to the one that could be used on a simple array of values, but I'm not sure how I would do this with divide-and-conquer. What I mean is, what am I actually dividing? Do I just divide down to each individual node in the list and then randomly assemble the list back together using some random value?
Or do I give each node a random number and then do a mergesort on the nodes based on the random numbers?
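The random-key idea in the last question can be sketched in Python on an ordinary list. The name `shuffle_by_random_keys` is made up here for illustration, and the built-in sort stands in for the merge sort you would write for a real linked list. Assuming the random keys are distinct (ties between floats are very unlikely), sorting by them gives a uniform permutation.

```python
import random

def shuffle_by_random_keys(items, rng=random):
    """Attach an independent random key to each item, then sort by key."""
    # The original index is only a tie-breaker, so items themselves are never compared.
    keyed = [(rng.random(), index, item) for index, item in enumerate(items)]
    keyed.sort()  # stand-in for a linked-list merge sort on the keys
    return [item for _, _, item in keyed]

print(shuffle_by_random_keys(["a", "b", "c", "d"]))
```

The cost is one extra key per node, so this sketch uses O(n) extra space rather than the O(log n) asked for in the question.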

What about the following? Perform the same procedure as merge sort. When merging, instead of selecting an element (one-by-one) from the two lists in sorted order, flip a coin. Choose whether to pick an element from the first or from the second list based on the result of the coin flip.
Edit (2022-01-12): As GA1 points out in the answer below, this algorithm doesn't produce a permutation uniformly at random.
Algorithm.
shuffle(list):
    if list contains a single element
        return list
    list1,list2 = [],[]
    while list not empty:
        move front element from list to list1
        if list not empty: move front element from list to list2
    shuffle(list1)
    shuffle(list2)
    if length(list2) < length(list1):
        i = pick a number uniformly at random in [0..length(list2)]
        insert a dummy node into list2 at location i
    # merge
    while list1 and list2 are not empty:
        if coin flip is Heads:
            move front element from list1 to list
        else:
            move front element from list2 to list
    if list1 not empty: append list1 to list
    if list2 not empty: append list2 to list
    remove the dummy node from list
The key point for space is that splitting the list into two does not require any extra space. The only extra space we need is to maintain log n elements on the stack during recursion.
The point with the dummy node is to realize that inserting and removing a dummy element keeps the distribution of the elements uniform.
Edit (2022-01-12): As Riley points out in the comments, the analysis below is flawed.
Analysis.
Why is the distribution uniform? After the final merge, the probability P_i(n) of any given number ending up in the position i is as follows. Either it was:
in the i-th place in its own list, and the list won the coin toss the first i times, the probability of this is 1/2^i;
in the i-1-st place in its own list, and the list won the coin toss i-1 times including the last one and lost once, the probability of this is (i-1) choose 1 times 1/2^i;
in the i-2-nd place in its own list, and the list won the coin toss i-2 times including the last one and lost twice, the probability of this is (i-1) choose 2 times 1/2^i;
and so on.
So the probability
P_i(n) = \sum_{j=0}^{i-1} (i-1 choose j) * 1/2^i * P_j(n/2).
Inductively, you can show that P_i(n) = 1/n. I'll let you verify the base case and assume that P_j(n/2) = 2/n. The term \sum_{j=0}^{i-1} (i-1 choose j) is exactly the number of (i-1)-bit binary numbers, i.e. 2^{i-1}. So we get
P_i(n) = \sum_{j=0}^{i-1} (i-1 choose j) * 1/2^i * 2/n
= 2/n * 1/2^i * \sum_{j=0}^{i-1} (i-1 choose j)
= 1/n * 1/2^{i-1} * 2^{i-1}
= 1/n
I hope this makes sense. The only assumption we need is that n is even, and that the two lists are shuffled uniformly. This is achieved by adding (and then removing) the dummy node.
P.S. My original intuition was nowhere near rigorous, but I list it just in case. Imagine we assign numbers between 1 and n at random to the elements of the list. And now we run a merge sort with respect to these numbers. At any given step of the merge, it needs to decide which of the heads of the two lists is smaller. But the probability of one being greater than the other should be exactly 1/2, so we can simulate this by flipping a coin.
P.P.S. Is there a way to embed LaTeX here?

Code
Up shuffle approach
This (lua) version is improved from foxcub's answer to remove the need of dummy nodes.
In order to slightly simplify the code in this answer, this version supposes that your lists know their sizes. In the event they don't, you can always find the size in O(n) time, but even better: a few simple adaptations in the code remove the need to compute it beforehand (like subdividing by taking every other element instead of a first and second half).
function listUpShuffle (l)
    local lsz = #l
    if lsz <= 1 then return l end
    local lsz2 = math.floor(lsz/2)
    local l1, l2 = {}, {}
    for k = 1, lsz2 do l1[#l1+1] = l[k] end
    for k = lsz2+1, lsz do l2[#l2+1] = l[k] end
    l1 = listUpShuffle(l1)
    l2 = listUpShuffle(l2)
    local res = {}
    local i, j = 1, 1
    while i <= #l1 or j <= #l2 do
        local rem1, rem2 = #l1-i+1, #l2-j+1
        if math.random() < rem1/(rem1+rem2) then
            res[#res+1] = l1[i]
            i = i+1
        else
            res[#res+1] = l2[j]
            j = j+1
        end
    end
    return res
end
To avoid using dummy nodes, you have to compensate for the fact that the two intermediate lists can have different lengths by varying the probability of choosing from each list. This is done by testing a uniform random number in [0,1) against the ratio of the number of nodes remaining in the first list over the total number of nodes remaining (in the two lists).
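One way to convince yourself that weighting the merge by the remaining lengths gives a uniform shuffle is to compute the exact output distribution for a small list. The following Python sketch does that with fractions instead of random numbers; the function names are made up for this illustration, and it mirrors the split into first and second half used in the Lua code.

```python
from fractions import Fraction

def weighted_merge_distribution(l1, l2):
    """Exact distribution of a merge that picks from each list with
    probability proportional to the number of elements it has left."""
    if not l1 or not l2:
        return {tuple(l1) + tuple(l2): Fraction(1)}
    total = len(l1) + len(l2)
    dist = {}
    branches = (
        (l1[0], l1[1:], l2, Fraction(len(l1), total)),
        (l2[0], l1, l2[1:], Fraction(len(l2), total)),
    )
    for head, rest1, rest2, weight in branches:
        for tail, p in weighted_merge_distribution(rest1, rest2).items():
            key = (head,) + tail
            dist[key] = dist.get(key, 0) + weight * p
    return dist

def up_shuffle_distribution(items):
    """Exact distribution of the recursive split, shuffle, weighted merge."""
    items = tuple(items)
    if len(items) <= 1:
        return {items: Fraction(1)}
    mid = len(items) // 2
    dist = {}
    for s1, p1 in up_shuffle_distribution(items[:mid]).items():
        for s2, p2 in up_shuffle_distribution(items[mid:]).items():
            for out, p in weighted_merge_distribution(s1, s2).items():
                dist[out] = dist.get(out, 0) + p1 * p2 * p
    return dist

# Every one of the 3! = 6 orderings should come out with probability 1/6.
for perm, prob in sorted(up_shuffle_distribution("abc").items()):
    print("".join(perm), prob)
```

The enumeration grows factorially, so it is only meant as a sanity check on tiny inputs.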
Down shuffle approach
You can also shuffle while you subdivide recursively, which in my humble tests showed slightly (but consistently) better performance. It might come from the fewer instructions, or on the other hand it might have appeared due to cache warmup in luajit, so you will have to profile for your use cases.
function listDownShuffle (l)
    local lsz = #l
    if lsz <= 1 then return l end
    local lsz2 = math.floor(lsz/2)
    local l1, l2 = {}, {}
    for i = 1, lsz do
        local rem1, rem2 = lsz2-#l1, lsz-lsz2-#l2
        if math.random() < rem1/(rem1+rem2) then
            l1[#l1+1] = l[i]
        else
            l2[#l2+1] = l[i]
        end
    end
    l1 = listDownShuffle(l1)
    l2 = listDownShuffle(l2)
    local res = {}
    for i = 1, #l1 do res[#res+1] = l1[i] end
    for i = 1, #l2 do res[#res+1] = l2[i] end
    return res
end
Tests
The full source is in my listShuffle.lua Gist.
It contains code that, when executed, prints a matrix representing, for each element of the input list, the number of times it appears at each position of the output list, after a specified number of runs. A fairly uniform matrix 'shows' the uniformity of the distribution of characters, hence the uniformity of the shuffle.
Here is an example run with 1000000 iterations using a (non power of two) 3 element list:
>> luajit listShuffle.lua 1000000 3
Up shuffle bias matrix:
333331 332782 333887
333377 333655 332968
333292 333563 333145
Down shuffle bias matrix:
333120 333521 333359
333435 333088 333477
333445 333391 333164
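For readers without LuaJIT, the same kind of bias matrix can be produced with a short Python sketch. This is not the code from the Gist: `list_down_shuffle` is a hypothetical port of the down shuffle to Python lists, and `bias_matrix` is a made-up helper that counts how often each element lands at each position.

```python
import random

def list_down_shuffle(items, rng=random):
    """Randomly deal items into two halves of fixed sizes, then recurse."""
    n = len(items)
    if n <= 1:
        return list(items)
    half = n // 2
    left, right = [], []
    for x in items:
        rem1 = half - len(left)          # free slots left in the first half
        rem2 = (n - half) - len(right)   # free slots left in the second half
        if rng.random() < rem1 / (rem1 + rem2):
            left.append(x)
        else:
            right.append(x)
    return list_down_shuffle(left, rng) + list_down_shuffle(right, rng)

def bias_matrix(shuffle, n, runs):
    """counts[e][p] = number of runs in which element e ended at position p."""
    counts = [[0] * n for _ in range(n)]
    for _ in range(runs):
        for pos, elem in enumerate(shuffle(list(range(n)))):
            counts[elem][pos] += 1
    return counts

for row in bias_matrix(list_down_shuffle, 3, 30000):
    print(row)
```

Each row and each column sums to the number of runs, and for a uniform shuffle every entry should be close to runs / n.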

You can actually do better than that: the best list shuffle algorithm is O(n log n) time and just O(1) space. (You can also shuffle in O(n) time and O(n) space by constructing a pointer array for the list, shuffling it in place using Knuth and re-threading the list accordingly.)
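The O(n) time, O(n) space variant mentioned in parentheses can be sketched as follows. The `Node` class and the helper names are assumptions made for this example (they are independent of the `Cons` class used later in this answer); the shuffle itself is the standard Knuth (Fisher-Yates) shuffle applied to an array of node references.

```python
import random

class Node:
    """Minimal singly linked list node, used only for this sketch."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def from_values(values):
    head = None
    for v in reversed(list(values)):
        head = Node(v, head)
    return head

def to_values(head):
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out

def shuffle_via_array(head, rng=random):
    """Collect node pointers, Knuth-shuffle them, then re-thread the list."""
    nodes = []
    while head is not None:
        nodes.append(head)
        head = head.next
    for i in range(len(nodes) - 1, 0, -1):
        j = rng.randint(0, i)              # inclusive on both ends
        nodes[i], nodes[j] = nodes[j], nodes[i]
    for a, b in zip(nodes, nodes[1:]):
        a.next = b
    if nodes:
        nodes[-1].next = None
    return nodes[0] if nodes else None

print(to_values(shuffle_via_array(from_values("abcde"))))
```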
Complexity proof
To see why O(n log n) time is minimal for O(1) space, observe that:
With O(1) space, updating the successor of an arbitrary list element necessarily takes O(n) time.
Wlog, you can assume that whenever you update one element, you also update all the other elements (leaving them unchanged if you wish), as this also takes just O(n) time.
With O(1) space, there are at most O(1) elements to choose from for the successor of any element you're updating (which specific elements these are will obviously depend on the algorithm).
Therefore, a single O(n) time update of all the elements could result in at most c^n different list permutations.
Since there are n! = 2^Θ(n log n) possible list permutations, and T such updates can produce at most c^(nT) of them, you require at least Ω(log n) updates.
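The counting step can be written out explicitly. Here T is a symbol introduced only for this derivation, denoting the number of O(n)-time passes over the list, and c is the constant from the previous steps.

```latex
c^{\,nT} \ge n!
\;\Longrightarrow\;
T \ge \frac{\log_c (n!)}{n},
\qquad
\ln (n!) = n \ln n - n + O(\ln n)
\;\Longrightarrow\;
T = \Omega(\log n).
```

With each pass costing on the order of n steps, this gives the n log n total stated above.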
Linked-list data structure (because Python)
import collections.abc
class Cons(collections.abc.Sequence):
    def __init__(self, head, tail=None):
        self.head = head
        self.tail = tail
    def __getitem__(self, index):
        # Returns the node reached after following `index` tail links,
        # not the value stored in that node.
        current, n = self, index
        while n > 0:
            if isinstance(current, Cons):
                current, n = current.tail, n - 1
            else:
                raise ValueError("Out of bounds index [{0}]".format(index))
        return current
    def __len__(self):
        current, length = self, 0
        while isinstance(current, Cons):
            current, length = current.tail, length + 1
        return length
    def __repr__(self):
        current, rep = self, []
        while isinstance(current, Cons):
            rep.extend((str(current.head), "::"))
            current = current.tail
        rep.append(str(current))
        return "".join(rep)
Merge-style algorithm
Here is an O(n log n) time and O(1) space algorithm based on iterative merge sort. The basic idea is simple: shuffle the left half, then the right half, then merge them by randomly selecting from the two lists. Two things worth noting:
By making the algorithm iterative rather than recursive, and returning a pointer to the new last element at the end of every merge step, we reduce the space requirement to O(1) while keeping the time cost minimal.
To make sure that all possibilities are equally likely for all input sizes, we use probabilities from the Gilbert–Shannon–Reeds model riffle shuffle when merging (see http://en.wikipedia.org/wiki/Gilbert%E2%80%93Shannon%E2%80%93Reeds_model).
import random
def riffle_lists(head, list1, len1, list2, len2):
    """Riffle shuffle two sublists in place. Returns the new last element."""
    for _ in range(len1 + len2):
        # Pick from list1 with probability len1 / (len1 + len2). This needs
        # true division (Python 3, or `from __future__ import division`).
        if random.random() < (len1 / (len1 + len2)):
            next, list1, len1 = list1, list1.tail, len1 - 1
        else:
            next, list2, len2 = list2, list2.tail, len2 - 1
        head.tail, head = next, next
    # list2 now points at the first node after both sublists.
    head.tail = list2
    return head
def shuffle_list(list):
    """Shuffle a list in place using an iterative merge-style algorithm."""
    dummy = Cons(None, list)
    i, n = 1, len(list)
    while (i < n):
        head, nleft = dummy, n
        while (nleft > i):
            head = riffle_lists(head, head[1], i, head[i + 1], min(i, nleft - i))
            nleft -= 2 * i
        i *= 2
    return dummy[1]
Another algorithm
Another interesting O(n log n) algorithm that produces not-quite-uniform shuffles involves simply riffle shuffling the list 3/2 log_2(n) times. As described in http://en.wikipedia.org/wiki/Gilbert%E2%80%93Shannon%E2%80%93Reeds_model, this leaves only a constant number of bits of information.
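A minimal sketch of that repeated-riffle idea, written on a plain Python list for brevity rather than on the `Cons` list above. The function names and the rounding of 3/2 log_2(n) up to an integer are choices made for this example; the riffle itself follows the Gilbert–Shannon–Reeds description (binomial cut, then drop from each packet with probability proportional to its remaining size).

```python
import math
import random

def gsr_riffle(deck, rng=random):
    """One Gilbert-Shannon-Reeds riffle of a list; returns a new list."""
    n = len(deck)
    cut = sum(rng.random() < 0.5 for _ in range(n))   # binomial(n, 1/2) cut point
    left, right = deck[:cut], deck[cut:]
    out, i, j = [], 0, 0
    while i < len(left) or j < len(right):
        rem1, rem2 = len(left) - i, len(right) - j
        if rng.random() < rem1 / (rem1 + rem2):
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    return out

def repeated_riffle_shuffle(deck, rng=random):
    """Apply about 3/2 * log2(n) riffles; close to uniform, but not exactly."""
    deck = list(deck)
    if len(deck) < 2:
        return deck
    rounds = math.ceil(1.5 * math.log2(len(deck)))
    for _ in range(rounds):
        deck = gsr_riffle(deck, rng)
    return deck

print(repeated_riffle_shuffle(range(10)))
```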

I'd say, that foxcub's answer is wrong. To prove that I will introduce a helpful definition for a perfectly shuffled list (call it array or sequence or whatever you want).
Definition: Assume we have a list L containing the elements a1, a2 ... an and the indexes 1, 2, 3 ... n. If we expose L to a shuffle operation (to whose internals we have no access), L is perfectly shuffled if and only if, by knowing the indexes of some k (k < n) elements, we can't deduce the indexes of the remaining n-k elements. That is, the remaining n-k elements are equally likely to be revealed at any of the remaining n-k indexes.
Example: if we have a four-element list [a, b, c, d] and after shuffling it we know that its first element is a ([a, .., .., ..]), then the probability for any of the elements b, c, d to occur in, let's say, the third cell equals 1/3.
Now, the smallest list for which the algorithm does not fulfil the definition has three elements. But the algorithm converts it to a 4-element list anyway, so we will try to show its incorrectness for a 4-element list.
Consider an input L = [a, b, c, d]. Following the first run of the algorithm, L will be divided into l1 = [a, c] and l2 = [b, d]. After shuffling these two sublists (but before merging into the four-element result) we can get four equally probable pairs of 2-element lists:
l1shuffled = [a , c] l2shuffled = [b , d]
l1shuffled = [a , c] l2shuffled = [d , b]
l1shuffled = [c , a] l2shuffled = [b , d]
l1shuffled = [c , a] l2shuffled = [d , b]
Now try to answer two questions.
1. What is the probability that, after merging into the final result, a will be the first element of the list?
Simply enough, we can see that only two of the four pairs above (again, equally probable) can give such a result (p1 = 1/2). For each of these pairs, heads must be drawn during the first flip in the merge routine (p2 = 1/2). Thus the probability of having a as the first element of Lshuffled is p = p1*p2 = 1/4, which is correct.
2. Knowing that a is in the first position of Lshuffled, what is the probability of having c (we could as well choose b or d without loss of generality) in the second position of Lshuffled?
Now, according to the above definition of a perfectly shuffled list, the answer should be 1/3, since there are three elements to put in the three remaining cells of the list.
Let's see if the algorithm assures that.
After choosing a as the first element of Lshuffled we would now have either:
l1shuffled = [c] l2shuffled = [b, d]
or:
l1shuffled = [c] l2shuffled = [d, b]
The probability of choosing c in both cases is equal to the probability of flipping heads (p3 = 1/2), thus the probability of having c as the second element of Lshuffled, when knowing that the first element of Lshuffled is a, equals 1/2. 1/2 != 1/3, which ends the proof of the incorrectness of the algorithm.
The interesting part is that the algorithm fulfils the necessary (but not sufficient) condition for a perfect shuffle, namely:
Given a list of n elements, for every index k (<n), for every element ak: after shuffling the list m times, if we count the times ak occurred at index k, this count will tend to m/n in probability as m tends to infinity.
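The two probabilities in this argument can be checked by exact enumeration. The sketch below is written for this illustration (the function names are not from any answer): it models the fair-coin merge on the four-element example, where both halves have length 2 and no dummy node is needed, and computes the probabilities with fractions.

```python
from fractions import Fraction
from itertools import permutations

def coin_merge_distribution(l1, l2):
    """Exact distribution of merging two fixed lists with a fair coin:
    while both are non-empty, take the front of either with probability 1/2."""
    if not l1 or not l2:
        return {tuple(l1) + tuple(l2): Fraction(1)}
    dist = {}
    for head, rest1, rest2 in ((l1[0], l1[1:], l2), (l2[0], l1, l2[1:])):
        for tail, p in coin_merge_distribution(rest1, rest2).items():
            key = (head,) + tail
            dist[key] = dist.get(key, 0) + Fraction(1, 2) * p
    return dist

def four_element_distribution():
    """[a, b, c, d] splits into [a, c] and [b, d]; each half is shuffled
    uniformly (probability 1/2 per order), then coin-merged."""
    dist = {}
    for s1 in permutations("ac"):
        for s2 in permutations("bd"):
            for out, p in coin_merge_distribution(s1, s2).items():
                dist[out] = dist.get(out, 0) + Fraction(1, 4) * p
    return dist

dist = four_element_distribution()
p_a_first = sum(p for out, p in dist.items() if out[0] == "a")
p_a_then_c = sum(p for out, p in dist.items() if out[:2] == ("a", "c"))
print(p_a_first)               # 1/4
print(p_a_then_c / p_a_first)  # 1/2, not the 1/3 a uniform shuffle would give
```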

Here is one possible solution:
#include <stdlib.h>
typedef struct node_s {
    struct node_s * next;
    int data;
} node_s, *node_p;
void shuffle_helper( node_p first, node_p last ) {
    static const int half = RAND_MAX / 2;
    while( (first != last) && (first->next != last) ) {
        node_p firsts[2] = {0, 0};
        node_p *lasts[2] = {0, 0};
        int counts[2] = {0, 0}, lesser;
        while( first != last ) {
            int choice = (rand() <= half);
            node_p next = first->next;
            first->next = firsts[choice];
            if( !lasts[choice] ) lasts[choice] = &(first->next);
            ++counts[choice];
            first = next;
        }
        lesser = (counts[0] < counts[1]);
        if( !counts[lesser] ) {
            first = firsts[!lesser];
            *(lasts[!lesser]) = last;
            continue;
        }
        *(lasts[0]) = firsts[1];
        *(lasts[1]) = last;
        shuffle_helper( firsts[lesser], firsts[!lesser] );
        first = firsts[!lesser];
        last = *(lasts[!lesser]);
    }
}
void shuffle_list( node_p thelist ) { shuffle_helper( thelist, NULL ); }
This is basically quicksort, but with no pivot, and with random partitioning.
The outer while loop replaces a recursive call.
The inner while loop randomly moves each element into one of two sublists.
After the inner while loop, we connect the sublists to one another.
Then, we recurse on the smaller sublist, and loop on the larger.
Since the smaller sublist can never be more than half the size of the initial list, the worst case depth of recursion is the log base two of the number of elements. The amount of memory needed is O(1) times the depth of recursion.
The average runtime, and number of calls to rand() is O(N log N).
More precise runtime analysis requires an understanding of the phrase "almost surely."
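The partitioning idea is easier to see on a plain Python list. The following is only a sketch of the technique under simplifying assumptions, not a translation of the C code: it allocates new lists and recurses on both parts, so it does not have the memory bound discussed above. The name `partition_shuffle` is made up for this example.

```python
import random

def partition_shuffle(items, rng=random):
    """Quicksort-like shuffle: send each element to one of two sublists by a
    fair coin flip, shuffle each sublist recursively, then concatenate."""
    items = list(items)
    if len(items) <= 1:
        return items
    parts = ([], [])
    for x in items:
        parts[rng.getrandbits(1)].append(x)
    # If every element landed on the same side, the recursive call simply
    # repartitions the same elements; this terminates with probability 1.
    return partition_shuffle(parts[0], rng) + partition_shuffle(parts[1], rng)

print(partition_shuffle("abcdefgh"))
```

Conceptually each element receives an endless string of random bits and the elements end up ordered by those strings, which is why all orderings are equally likely.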

Bottom-up merge sort without comparisons: while merging, don't do any comparison, just swap the elements.

You could traverse the list, randomly generating 0 or 1 at each node.
If it is 1, remove the node and place it as the first node of the list.
If it is 0, do nothing.
Loop like this until you reach the end of the list.

Related

Algorithm to generate permutations by order of fewest positional changes

I'm looking for an algorithm to generate or iterate through all permutations of a list of objects such that:
They are generated in order from fewest to most positional changes from the original. So first all the permutations with a single pair of elements swapped, then all the permutations with only two pairs of elements swapped, etc.
The list generated is complete, so for n objects in a list there should be n! total, unique permutations.
Ideally (but not necessarily) there should be a way of specifying (and generating) a particular permutation without having to generate the full list first and then reference the index.
The speed of the algorithm is not particularly important.
I've looked through all the permutation algorithms that I can find, and none so far have met criteria 1 and 2, let alone 3.
I have an idea how I could write this algorithm myself using recursion, and filtering for duplicates to only get unique permutations. However, if there is any existing algorithm I'd much rather use something proven.
This code answers your requirement #3, which is to compute permutation at index N directly.
This code relies on the following principle:
The first permutation is the identity; then the next (n choose 2) permutations just swap two elements; then the next (n choose 3)(subfactorial(3)) permutations derange 3 elements; then the next (n choose 4)(subfactorial(4)) permutations derange 4 elements; etc. To find the Nth permutation, first figure out how many elements it deranges by finding the largest K such that sum[k = 0 ^ K] (n choose k) subfactorial(k) ⩽ N.
This number K is found by function number_of_derangements_for_permutation_at_index in the code.
Then, the relevant subset of indices which must be deranged is computed efficiently using more_itertools.nth_combination.
However, I didn't have a function nth_derangement to find the relevant derangement of the deranged subset of indices. Hence the last step of the algorithm, which computes this derangement, could be optimised if there exists an efficient function to find the nth derangement of a sequence efficiently.
As a result, this last step takes time proportional to the number of permutations it has to scan before reaching the derangement with index idx_r, which can be as many as factorial(k). Here idx_r is a number between 0 and subfactorial(k) - 1, and k is the number of elements which are deranged by the returned permutation.
from sympy import subfactorial
from math import comb
from itertools import count, accumulate, pairwise, permutations
from more_itertools import nth_combination, nth
def number_of_derangements_for_permutation_at_index(n, idx):
    #n = len(seq)
    for k, (low_acc, high_acc) in enumerate(pairwise(accumulate((comb(n,k) * subfactorial(k) for k in count(2)), initial=1)), start=2):
        if low_acc <= idx < high_acc:
            return k, low_acc
def is_derangement(seq, perm):
    return all(i != j for i,j in zip(seq, perm))
def lift_permutation(seq, deranged, permutation):
    result = list(seq)
    for i,j in zip(deranged, permutation):
        result[i] = seq[j]
    return result
# THIS FUNCTION NOT EFFICIENT
def nth_derangement(seq, idx):
    return nth((p for p in permutations(seq) if is_derangement(seq, p)),
               idx)
def nth_permutation(seq, idx):
    if idx == 0:
        return list(seq)
    n = len(seq)
    k, acc = number_of_derangements_for_permutation_at_index(n, idx)
    idx_q, idx_r = divmod(idx - acc, subfactorial(k))
    deranged = nth_combination(range(n), k, idx_q)
    derangement = nth_derangement(deranged, idx_r) # TODO: FIND EFFICIENT VERSION
    return lift_permutation(seq, deranged, derangement)
Testing for correctness on small data:
print( [''.join(nth_permutation('abcd', i)) for i in range(24)] )
# ['abcd',
# 'bacd', 'cbad', 'dbca', 'acbd', 'adcb', 'abdc',
# 'bcad', 'cabd', 'bdca', 'dacb', 'cbda', 'dbac', 'acdb', 'adbc',
# 'badc', 'bcda', 'bdac', 'cadb', 'cdab', 'cdba', 'dabc', 'dcab', 'dcba']
Testing for speed on medium data:
from math import factorial
seq = 'abcdefghij'
n = len(seq) # 10
N = factorial(n) // 2 # 1814400
perm = ''.join(nth_permutation(seq, N))
print(perm)
# fcjdibaehg
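The counting principle behind the index ranges (the number of permutations that displace exactly k elements is (n choose k) times subfactorial(k)) can be checked without sympy. In this sketch `subfactorial` is a small local helper based on the usual recurrence, and `block_sizes` is a name made up for the example.

```python
from math import comb, factorial

def subfactorial(k):
    """Number of derangements of k items: !0 = 1, !1 = 0,
    !k = (k - 1) * (!(k - 1) + !(k - 2))."""
    prev, cur = 1, 0          # !0 and !1
    if k == 0:
        return prev
    for i in range(2, k + 1):
        prev, cur = cur, (i - 1) * (prev + cur)
    return cur

def block_sizes(n):
    """block_sizes(n)[k] = number of permutations of n items that move exactly k of them."""
    return [comb(n, k) * subfactorial(k) for k in range(n + 1)]

print(block_sizes(4))   # [1, 0, 6, 8, 9], matching the groups in the 'abcd' test output
assert sum(block_sizes(4)) == factorial(4)
```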
Imagine a graph with n! nodes labeled with every permutation of n elements. If we add edges to this graph such that nodes which can be obtained by swapping one pair of elements are connected, an answer to your problem is obtained by doing a breadth-first search from whatever node you like.
You can actually generate the graph or just let it be implied and just deduce at each stage what nodes should be adjacent (and of course, keep track of ones you've already visited, to avoid revisiting them).
I concede this probably doesn't help with point 3, but maybe is a viable strategy for getting points 1 and 2 answered.
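A minimal sketch of that breadth-first search with an implied graph, assuming the elements of the sequence are distinct. The function name is made up for this example. Note that it orders permutations by the minimum number of pair swaps needed to reach them, and it keeps every visited permutation in memory.

```python
from collections import deque
from itertools import combinations

def permutations_by_swap_distance(seq):
    """Yield all permutations of seq in breadth-first order, where two
    permutations are adjacent if they differ by swapping one pair."""
    start = tuple(seq)
    seen = {start}
    queue = deque([start])
    while queue:
        perm = queue.popleft()
        yield perm
        for i, j in combinations(range(len(perm)), 2):
            neighbour = list(perm)
            neighbour[i], neighbour[j] = neighbour[j], neighbour[i]
            neighbour = tuple(neighbour)
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)

for p in permutations_by_swap_distance("abc"):
    print("".join(p))
```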
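To solve 1 & 2, you could first generate all possible permutations, keeping track of how many swaps occurred during generation for each list. Then sort them by number of swaps. Generating them takes at least O(n!) time; a comparison sort of the n! results adds O(n! log(n!)), although a counting sort on the swap count (an integer below n) avoids the extra logarithmic factor.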

Minimal number of swaps?

There are N characters in a string of types A and B in the array (same amount of each type). What is the minimal number of swaps to make sure that no two adjacent chars are same if we can only swap two adjacent characters ?
For example, input is:
AAAABBBB
The minimal number of swaps is 6 to make the array ABABABAB. But how would you solve it for any kind of input ? I can only think of O(N^2) solution. Maybe some kind of sort ?
If we need just to count swaps, then we can do it with O(N).
Let's assume for simplicity that array X of N elements should become ABAB... .
GetCount()
    swaps = 0, i = -1, j = -1
    for(k = 0; k < N; k++)
        if(k % 2 == 0)
            i = FindIndexOf(A, max(k, i))
            X[k] <-> X[i]
            swaps += i - k
        else
            j = FindIndexOf(B, max(k, j))
            X[k] <-> X[j]
            swaps += j - k
    return swaps
FindIndexOf(element, index)
    while(index < N)
        if(X[index] == element) return index
        index++
    return -1; // should never happen if count of As == count of Bs
Basically, we run from left to right, and if a misplaced element is found, it gets exchanged with the correct element (e.g. abBbbbA** --> abAbbbB**) in O(1). At the same time swaps are counted as if the sequence of adjacent elements would be swapped instead. Variables i and j are used to cache indices of next A and B respectively, to make sure that all calls together of FindIndexOf are done in O(N).
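The counting pass described above could look like this in Python. It is a sketch under the same simplifying assumption (the target pattern starts with a fixed letter and the counts of A and B are equal); the name `count_adjacent_swaps` and the `first` parameter are introduced here for illustration.

```python
def count_adjacent_swaps(s, first="A"):
    """Count adjacent swaps needed to reach the alternating pattern that
    starts with `first`, scanning left to right with cached search positions."""
    x = list(s)
    other = "B" if first == "A" else "A"
    wanted = (first, other)
    cached = {first: 0, other: 0}    # where the last search for each letter stopped
    swaps = 0
    for k in range(len(x)):
        ch = wanted[k % 2]
        idx = max(k, cached[ch])
        while x[idx] != ch:          # IndexError here means the counts are unbalanced
            idx += 1
        cached[ch] = idx
        x[k], x[idx] = x[idx], x[k]  # O(1) exchange, counted as idx - k adjacent swaps
        swaps += idx - k
    return swaps

print(count_adjacent_swaps("AAAABBBB"))   # 6
```

To get the overall minimum, one would take the smaller of the results for `first="A"` and `first="B"`.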
If we need to sort by swaps then we cannot do better than O(N^2).
The rough idea is the following. Let's consider your sample: AAAABBBB. One of Bs needs O(N) swaps to get to the A B ... position, another B needs O(N) to get to A B A B ... position, etc. So we get O(N^2) at the end.
Observe that if any solution would swap two instances of the same letter, then we can find a better solution by dropping that swap, which necessarily has no effect. An optimal solution therefore only swaps differing letters.
Let's view the string of letters as an array of indices of one kind of letter (arbitrarily chosen, say A) into the string. So AAAABBBB would be represented as [0, 1, 2, 3] while ABABABAB would be [0, 2, 4, 6].
We know two instances of the same letter will never swap in an optimal solution. This lets us always safely identify the first (left-most) instance of A with the first element of our index array, the second instance with the second element, etc. It also tells us our array is always in sorted order at each step of an optimal solution.
Since each step of an optimal solution swaps differing letters, we know our index array evolves at each step only by incrementing or decrementing a single element at a time.
An initial string of length n = 2k will have an array representation A of length k. An optimal solution will transform this array to either
ODDS = [1, 3, 5, ... 2k - 1]
or
EVENS = [0, 2, 4, ... 2k - 2]
Since we know in an optimal solution instances of a letter do not pass each other, we can conclude an optimal solution must spend min(abs(ODDS[0] - A[0]), abs(EVENS[0] - A[0])) swaps to put the first instance in correct position.
By realizing the EVENS or ODDS choice is made only once (not once per letter instance), and summing across the array, we can count the minimum number of needed swaps as
define count_swaps(length, initial, goal)
    total = 0
    for i from 0 to length - 1
        total += abs(goal[i] - initial[i])
    end
    return total
end
define count_minimum_needed_swaps(k, A)
    return min(count_swaps(k, A, EVENS), count_swaps(k, A, ODDS))
end
Notice the number of loop iterations implied by count_minimum_needed_swaps is 2 * k = n; it runs in O(n) time.
By noting which term is smaller in count_minimum_needed_swaps, we can also tell which of the two goal states is optimal.
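The index-array argument translates almost directly into Python. This is a sketch with a made-up function name; it assumes the string contains equally many of each letter, and it builds the index array for the letter passed in.

```python
def min_adjacent_swaps(s, letter="A"):
    """Minimum adjacent swaps to make s alternate, computed from the
    positions of one letter and the two possible goal index arrays."""
    positions = [i for i, ch in enumerate(s) if ch == letter]
    k = len(positions)
    evens = range(0, 2 * k, 2)   # 0, 2, ..., 2k - 2
    odds = range(1, 2 * k, 2)    # 1, 3, ..., 2k - 1
    return min(
        sum(abs(g - p) for g, p in zip(goal, positions))
        for goal in (evens, odds)
    )

print(min_adjacent_swaps("AAAABBBB"))   # 6
```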
Since you know N, you can simply write a loop that generates the values with no swaps needed.
#define N 4
char array[N + N];
for (size_t z = 0; z < N + N; z++)
{
array[z] = 'B' - ((z & 1) == 0);
}
return 0; // The number of swaps
@Nemo and @AlexD are right. The algorithm is order n^2. @Nemo misunderstood that we are looking for a reordering where no two adjacent characters are the same, so we cannot rely on the idea that an A appearing after a B is out of order.
Let's see the minimum number of swaps.
We don't care whether our first character is A or B, because we can apply the same algorithm with A and B exchanged everywhere. So let's assume that the word WORD_N has length 2N, with N As and N Bs, and starts with an A. (I am using length 2N to simplify the calculations.)
What we will do is move the next B right next to this A, without caring about the positions of the other characters, because then we will have reduced the problem to reordering a new word WORD_{N-1}. Let's also assume that the next B is not directly after the A if the word has more than 2 characters, because then the first step is already done and we reduce the problem to the next set of characters, WORD_{N-1}.
In the worst case the next B is as far away as possible, so it is after half of the word, and we need N-1 swaps to put this B after the A (maybe fewer). Then our word can be reduced to WORD_N = [A B WORD_{N-1}].
We see that we have to perform this algorithm at most N-1 times, because the last word (WORD_1) will already be ordered. Performing the algorithm N-1 times we have to make
N_swaps = (N-1)*N/2
swaps, where N is half of the length of the initial word.
Let's see why we can apply the same algorithm to WORD_{N-1}, also assuming that its first character is A. In this case it matters that the first character is the same as in the already ordered pair. We can be sure that the first character of WORD_{N-1} is A because it was the character right next to the first character in our initial word; and if it was B, the first step needs at most one swap between these two characters (or none), and we will already have WORD_{N-1} starting with the same character as WORD_{N}, while the first two characters of WORD_{N} are different, at the cost of at most 1 swap.
I think this answer is similar to the answer by phs, just in Haskell. The idea is that the resultant-indices for A's (or B's) are known so all we need to do is calculate how far each starting index has to move and sum the total.
Haskell code:
Prelude Data.List> let is = elemIndices 'B' "AAAABBBB"
in minimum
$ map (sum . zipWith ((abs .) . (-)) is) [[1,3..],[0,2..]]
6 --output

Efficient iteration over sorted partial sums

I have a list of N positive numbers sorted in ascending order, L[0] to L[N-1].
I want to iterate over subsets of M distinct list elements (without replacement, order not important), 1 <= M <= N, sorted according to their partial sum. M is not fixed, the final result should consider all possible subsets.
I only want the K smallest subsets efficiently (ideally polynomial in K). The obvious algorithm of enumerating all subsets with M <= K is O(K!).
I can reduce the problem to subsets of fixed size M, by placing K iterators (1 <= M <= K) in a min-heap and having the master iterator operate on the heap root.
Essentially I need the Python function call:
sorted(itertools.combinations(L, M), key=sum)[:K]
... but efficient (N ~ 200, K ~ 30), should run in less than 1sec.
Example:
L = [1, 2, 5, 10, 11]
K = 8
answer = [(1,), (2,), (1,2), (5,), (1,5), (2,5), (1,2,5), (10,)]
Answer:
As David's answer shows, the important trick is that for a subset S to be outputted, all subsets of S must have been previously outputted, in particular the subsets where only 1 element has been removed. Thus, every time you output a subset, you can add all 1-element extensions of this subset for consideration (a maximum of K), and still be sure that the next outputted subset will be in the list of all considered subsets up to this point.
Fully working, more efficient Python function:
def sorted_subsets(L, K):
    # Candidates are (sum, index tuple) pairs, kept sorted by sum.
    candidates = [(L[i], (i,)) for i in range(min(len(L), K))]
    for j in range(K):
        new = candidates.pop(0)
        yield tuple(L[i] for i in new[1])
        # Extend the emitted subset with every element below its smallest index.
        new_candidates = [(L[i] + new[0], (i,) + new[1]) for i in range(new[1][0])]
        candidates = sorted(candidates + new_candidates)[:K-j-1]
UPDATE, found an O(K log K) algorithm.
This is similar to the trick above, but instead of adding all 1-element extensions with the elements added greater than the max of the subset, you consider only 2 extensions: one that adds max(S)+1, and the other one that shifts max(S) to max(S) + 1 (that would eventually generate all 1-element extensions to the right).
import heapq
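def sorted_subsets_faster(L, K):
    # Heap entries are (sum, index tuple); indices in a tuple are increasing.
    candidates = [(L[0], (0,))]
    for j in range(K):
        new = heapq.heappop(candidates)
        yield tuple(L[i] for i in new[1])
        i = new[1][-1]
        if i+1 < len(L):
            # Extension 1: add the element right after the current maximum.
            heapq.heappush(candidates, (new[0] + L[i+1], new[1] + (i+1,)))
            # Extension 2: replace the current maximum by the element after it.
            heapq.heappush(candidates, (new[0] - L[i] + L[i+1], new[1][:-1] + (i+1,)))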
From my benchmarks, it is faster for ALL values of K.
Also, it is not necessary to supply in advance the value of K, we can just iterate and stop whenever, without changing the efficiency of the algorithm. Also note that the number of candidates is bounded by K+1.
It might be possible to improve even further by using a priority deque (min-max heap) instead of a priority queue, but frankly I'm satisfied with this solution. I'd be interested in a linear algorithm though, or a proof that it's impossible.
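Since K does not have to be known in advance, the same heap idea can also be written as an open-ended generator and cut off with `itertools.islice`. This is a sketch rather than the function above: `iter_sorted_subsets` is a name chosen for this example, and it assumes L is sorted ascending and contains only positive numbers, as in the question.

```python
import heapq
from itertools import islice

def iter_sorted_subsets(L):
    """Yield non-empty subsets of the sorted list L in order of increasing sum."""
    if not L:
        return
    heap = [(L[0], (0,))]                 # entries are (sum, increasing index tuple)
    while heap:
        total, idxs = heapq.heappop(heap)
        yield tuple(L[i] for i in idxs)
        i = idxs[-1]
        if i + 1 < len(L):
            # Either add the next element, or shift the largest element up by one.
            heapq.heappush(heap, (total + L[i + 1], idxs + (i + 1,)))
            heapq.heappush(heap, (total - L[i] + L[i + 1], idxs[:-1] + (i + 1,)))

print(list(islice(iter_sorted_subsets([1, 2, 5, 10, 11]), 8)))
# [(1,), (2,), (1, 2), (5,), (1, 5), (2, 5), (1, 2, 5), (10,)]
```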
Here's some rough Python-ish pseudo-code:
final = []
L = L[:K] # Anything after the first K is too big already
sorted_candidates = L[]
while len( final ) < K:
    final.append( sorted_candidates[0] ) # We keep it sorted so the first option
                                         # is always the smallest sum not
                                         # already included
    # If you just added a subset of size A, make a bunch of subsets of size A+1
    expansion = [sorted_candidates[0].add( x )
                 for x in L and x not already included in sorted_candidates[0]]
    # We're done with the first element, so remove it
    sorted_candidates = sorted_candidates[1:]
    # Now go through and build a new set of sorted candidates by getting the
    # smallest possible ones from sorted_candidates and expansion
    new_candidates = []
    for i in range(K - len( final )):
        if sum( expansion[0] ) < sum( sorted_candidates[0] ):
            new_candidates.append( expansion[0] )
            expansion = expansion[1:]
        else:
            new_candidates.append( sorted_candidates[0] )
            sorted_candidates = sorted_candidates[1:]
    sorted_candidates = new_candidates
We'll assume that you will do things like removing the first element of an array in an efficient way, so the only real work in the loop is in building expansion and in rebuilding sorted_candidates. Both of these have fewer than K steps, so as an upper bound, you're looking at a loop that is O(K) and that is run K times, so O(K^2) for the algorithm.

How to generate a permutation?

My question is: given a list L of length n, and an integer i such that 0 <= i < n!, how can you write a function perm(L, i) to produce the ith permutation of L in O(n) time? What I mean by ith permutation is just the ith permutation in some implementation-defined ordering that must have the properties:
For any i and any 2 lists A and B, perm(A, i) and perm(B, i) must both map the jth element of A and B to an element in the same position for both A and B.
For any inputs (A, i), (A, j) perm(A, i)==perm(A, j) if and only if i==j.
NOTE: this is not homework. In fact, I solved this 2 years ago, but I've completely forgotten how, and it's killing me. Also, here is a broken attempt I made at a solution:
def perm(s, i):
    n = len(s)
    perm = [0]*n
    itCount = 0
    for elem in s:
        perm[i%n + itCount] = elem
        i = i / n
        n -= 1
        itCount+=1
    return perm
ALSO NOTE: the O(n) requirement is very important. Otherwise you could just generate the n! sized list of all permutations and just return its ith element.
def perm(sequence, index):
    sequence = list(sequence)
    result = []
    for x in range(len(sequence)):
        idx = index % len(sequence)
        index //= len(sequence)  # integer division
        result.append( sequence[idx] )
        # constant time non-order preserving removal
        sequence[idx] = sequence[-1]
        del sequence[-1]
    return result
Based on the algorithm for shuffling, but we take the least significant part of the number each time to decide which element to take instead of a random number. Alternatively, consider it like the problem of converting to some arbitrary base, except that the base shrinks for each additional digit.
Could you use factoradics? You can find an illustration via this MSDN article.
Update: I wrote an extension of the MSDN algorithm that finds i'th permutation of n things taken r at a time, even if n != r.
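For reference, a small sketch of the factoradic idea in Python 3. This is not taken from the MSDN article; the function names are hypothetical. It produces permutations in lexicographic order, but because it pops from the middle of a list it runs in O(n^2), not the O(n) the question asks for.

```python
from math import factorial

def factoradic_digits(i, n):
    """Digits of i in the factorial number system, most significant first (n digits)."""
    digits = []
    for radix in range(1, n + 1):
        i, d = divmod(i, radix)
        digits.append(d)
    return digits[::-1]

def lexicographic_permutation(items, i):
    """Return the i-th permutation of `items` in lexicographic order of positions."""
    pool = list(items)
    # Each digit selects (and removes) one of the elements still in the pool.
    return [pool.pop(d) for d in factoradic_digits(i, len(pool))]

# Every index in [0, n!) is valid for a list of length n.
all_perms = [lexicographic_permutation("abc", i) for i in range(factorial(3))]
```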
A computational minimalistic approach (written in C-style pseudocode):
function perm(list,i){
    for(a=list.length;a;a--){
        list.swap(a-1,i mod a);  // exchange the elements at positions a-1 and (i mod a)
        i=floor(i/a);            // integer division
    }
    return list;
}
Note that implementations relying on removing elements from the original list tend to run in O(n^2) time, at best O(n*log(n)) given a special tree style list implementation designed for quickly inserting and removing list elements.
Rather than shrinking the original list and keeping it in order, the above code swaps the chosen element with the last element that has not been fixed yet. This still makes a perfect 1:1 mapping between index and permutation, just a slightly more scrambled one, but in pure O(n) time.
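A runnable Python 3 rendering of the swap-based pseudocode, as a sketch. The name perm_by_swaps is an assumption of mine, and it copies the input instead of permuting it in place.

```python
def perm_by_swaps(items, i):
    """Return the permutation of `items` selected by index i, using one swap per position."""
    result = list(items)
    for a in range(len(result), 0, -1):
        # i mod a chooses which of the first a elements ends up at position a-1.
        i, j = divmod(i, a)
        result[a - 1], result[j] = result[j], result[a - 1]
    return result
```

Each loop iteration does constant work, so the whole function is O(n) apart from the big-integer arithmetic on i.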
So, I think I finally solved it. Before I read any answers, I'll post my own here.
def perm(L, i):
    n = len(L)
    if n <= 1:
        return L
    else:
        split = i % n
        return [L[split]] + perm(L[:split] + L[split+1:], i // n)
There are n! permutations. The first character can be chosen from L in n ways. Each of those choices leaves (n-1)! permutations of the remaining elements. So this idea is enough for establishing an order. In general, you will figure out what part you are in, pick the appropriate element and then recurse / loop on the smaller L.
The argument that this works correctly is by induction on the length of the sequence. (sketch) For a length of 1, it is trivial. For a length of n, you use the above observation to split the problem into n parts, each with a question on an L' with length (n-1). By induction, all the L's are constructed correctly (and in linear time). Then it is clear we can use the IH to construct a solution for length n.

Algorithm to determine indices i..j of array A containing all the elements of another array B

I came across this question on an interview questions thread. Here is the question:
Given two integer arrays A[1..n] and B[1..m], find the smallest window in A that contains all elements of B. In other words, find a pair <i, j> such that A[i..j] contains B[1..m]. If A doesn't contain all the elements of B, then i, j can be returned as -1.
The integers in A need not be in the same order as they are in B. If there is more than one smallest window (different, but of the same size), then it's enough to return one of them.
Example: A[1,2,5,11,2,6,8,24,101,17,8] and B[5,2,11,8,17]. The algorithm should return i = 2 (index of 5 in A) and j = 9 (index of 17 in A), counting indices from 0.
Now I can think of two variations.
Let's suppose that B has duplicates.
1. This variation doesn't consider the number of times each element occurs in B. It just checks for all the unique elements that occur in B and finds the smallest corresponding window in A that satisfies the above problem. For example, if A[1,2,4,5,7] and B[2,2,5], this variation doesn't bother about there being two 2's in B and just checks A for the unique integers in B, namely 2 and 5, and hence returns i=1, j=3.
2. This variation accounts for duplicates in B. If there are two 2's in B, then it expects to see at least two 2's in A as well. If not, it returns -1,-1.
When you answer, please do let me know which variation you are answering. Pseudocode should do. Please mention space and time complexity if it is tricky to calculate it. Mention if your solution assumes array indices to start at 1 or 0 too.
Thanks in advance.
Complexity
Time: O((m+n)log m)
Space: O(m)
The following is provably optimal up to a logarithmic factor. (I believe the log factor cannot be got rid of, and so it's optimal.)
Variant 1 is just a special case of variant 2 with all the multiplicities being 1, after removing duplicates from B. So it's enough to handle the latter variant; if you want variant 1, just remove duplicates in O(m log m) time. In the following, let m denote the number of distinct elements in B. We assume m <= n, because otherwise we can just return -1, in constant time.
For each index i in A, we will find the smallest index s[i] such that A[i..s[i]] contains B[1..m], with the right multiplicities. The crucial observation is that s[i] is non-decreasing, and this is what allows us to do it in amortised linear time.
Start with i=j=1. We will keep a tuple (c[1], c[2], ... c[m]) of the number of times each element of B occurs in the current window A[i..j]. We will also keep a set S of indices (a subset of 1..m) for which the count is "right" (i.e., k for which c[k] >= 1 in variant 1, or c[k] is at least the required multiplicity in variant 2).
So, for i=1, starting with j=1: if A[j] is the k-th distinct element of B, increment c[k], check whether c[k] is now "right", and add k to S if so. Then increment j, and stop when S has size m. You've now found s[1], in at most O(n log m) time. (There are O(n) j's, and each lookup or set operation takes O(log m) time.)
Now for computing successive s[i]s, do the following. Decrement the count for A[i] (the element leaving the window), update S accordingly, increment i, and, if necessary, increment j until S has size m again. That gives you s[i] for each i. At the end, report the (i,s[i]) for which s[i]-i was smallest.
Note that although it seems that you might be performing up to O(n) steps (incrementing j) for each i, the second pointer j only moves to the right: so the total number of times you can increment j is at most n. (This is amortised analysis.) Each time you increment j, you might perform a set operation that takes O(log m) time, so the total time is O(n log m). The space required was for keeping the tuple of counts, the set of elements of B, the set S, and some constant number of other variables, so O(m) in all.
There is an obvious O(m+n) lower bound, because you need to examine all the elements. So the only question is whether we can prove the log factor is necessary; I believe it is.
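The two-pointer scheme above can be sketched in Python 3 as follows. Note the assumptions: this handles variant 2 (multiplicities), uses 0-based indices, and swaps the ordered set for hash-based counters (collections.Counter), so the expected running time is O(n + m) rather than the O((m+n) log m) stated above. The function name is mine.

```python
from collections import Counter

def smallest_window(A, B):
    """Return 0-based (i, j) of the shortest A[i..j] containing B with multiplicity, else (-1, -1)."""
    need = Counter(B)
    if not need:
        return (-1, -1)
    have = Counter()
    satisfied = 0  # number of distinct values of B whose required count is met
    best = (-1, -1)
    i = 0
    for j, x in enumerate(A):
        if x in need:
            have[x] += 1
            if have[x] == need[x]:
                satisfied += 1
        # While the window A[i..j] is valid, record it and shrink it from the left.
        while satisfied == len(need):
            if best == (-1, -1) or j - i < best[1] - best[0]:
                best = (i, j)
            left = A[i]
            if left in need:
                if have[left] == need[left]:
                    satisfied -= 1
                have[left] -= 1
            i += 1
    return best
```

For variant 1, pass set(B) instead of B so that every required count is 1.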
Here is the solution I thought of (but it's not very neat).
I am going to illustrate it using the example in the question.
Let A[1,2,5,11,2,6,8,24,101,17,8] and B[5,2,11,8,17]
1. Sort B. (So B = [2,5,8,11,17]). This step takes O(m log m).
2. Allocate an array C of the same size as A. Iterate through the elements of A and binary search for each one in the sorted B. If it is found, enter its "index in sorted B + 1" in C. If it's not found, enter -1. After this step,
A = [1 , 2, 5, 11, 2, 6, 8, 24, 101, 17, 8] (no changes, quoting for ease).
C = [-1, 1, 2, 4 , 1, -1, 3, -1, -1, 5, 3]
Time: O(n log m), Space: O(n).
3. Find the smallest window in C that has all the numbers from 1 to m. For finding the window, I can think of two general directions:
a. A bit-oriented approach in which I set the bit corresponding to each position and finally check by some kind of ANDing.
b. Create another array D of size m, go through C and, when I encounter p in C, increment D[p]. Use this for finding the window.
Please leave comments regarding the general approach as such, as well as for 3a and 3b.
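A minimal Python 3 sketch of steps 1 and 2 (sorting B and building C by binary search), using the standard bisect module. The function name is hypothetical, and removing duplicates from B first is my assumption, which corresponds to variation 1.

```python
from bisect import bisect_left

def rank_in_sorted_b(A, B):
    """For each element of A, return its 1-based position in sorted distinct B, or -1."""
    sorted_b = sorted(set(B))  # step 1: O(m log m)
    C = []
    for x in A:                # step 2: one binary search per element, O(n log m)
        pos = bisect_left(sorted_b, x)
        found = pos < len(sorted_b) and sorted_b[pos] == x
        C.append(pos + 1 if found else -1)
    return C

A = [1, 2, 5, 11, 2, 6, 8, 24, 101, 17, 8]
B = [5, 2, 11, 8, 17]
C = rank_in_sorted_b(A, B)
```

After this step the problem is reduced to finding the smallest window in C that contains every number from 1 to m.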
My solution:
a. Create a hash table H with m keys, one for each value in B. Each key in H maps to a dynamic array holding, in sorted order, the indices of A whose value equals that key. We go through each index j in A. If key A[j] exists in H (O(1) time), then append j to the list of indices that H[A[j]] maps to. This takes O(n) time.
At this point we have 'binned' n elements into m bins. However, total storage is just O(n).
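b. The 2nd part of the algorithm involves maintaining a ‘left’ index and a ‘right’ index for each list in H. Let's create two arrays of size m called L and R that contain these values. Initially, each entry of L points at the first position of its list and each entry of R at the last position (see the worked example at the end).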
We also keep track of the “best” minimum window.
We then iterate over the following actions on L and R which are inherently greedy:
i. In each iteration, we compute the minimum and maximum values in L and R.
For L, Lmax - Lmin is the window and for R, Rmax - Rmin is the window. We update the best window if one of these windows is better than the current best window. We use a min heap to keep track of the minimum element in L and a max heap to keep track of the largest element in R. These take O(m*log(m)) time to build.
ii. From a ‘greedy’ perspective, we want to take the action that will minimize the window size in each L and R. For L it intuitively makes sense to increment the minimum index, and for R, it makes sense to decrement the maximum index.
We want to increment the array position for the minimum value until it is larger than the 2nd smallest element in L, and similarly, we want to decrement the array position for the largest value in R until it is smaller than the 2nd largest element in R.
Next, we make a key observation:
If L[i] is the minimum value in L and R[i] is less than the 2nd smallest element in L, i.e., if R[i] would still be the minimum value in L if L[i] were replaced with R[i], then we are done. We now have the “best” index in list i that can contribute to the minimum window. Also, all the other elements in R cannot contribute to the best window since their L values are all larger than L[i]. Similarly, if R[j] is the maximum element in R and L[j] is greater than the 2nd largest value in R, we are also done by setting R[j] = L[j]. Any other index in list j to the left of L[j] has already been accounted for, as have all indices to the right of R[j], and all indices between L[j] and R[j] will perform worse than L[j].
Otherwise, we simply increment the array position L[i] until it is larger than the 2nd smallest element in L and decrement array position R[j] (where R[j] is the max in R) until it is smaller than the 2nd largest element in R. We compute the windows and update the best window if one of the L or R windows is smaller than the best window. We can do a Fibonacci search to do the increment / decrement efficiently: we keep incrementing L[i] using Fibonacci increments until we are larger than the 2nd smallest element in L, and then perform a binary search to get the smallest L[i] that is larger than the 2nd smallest element in L. The same applies to the set R. After the increment / decrement, we pop the largest element from the max heap for R and the minimum element from the min heap for L, and insert the new values of L[i] and R[j] into the heaps. This is an O(log(m)) operation.
Step ii. would terminate when Lmin can’t move any more to the right or Rmax can’t move any more to the left (as the R/L values are the same). Note that we can have scenarios in which L[i] = R[i] but if it is not the minimum element in L or the maximum element in R, the algorithm would still continue.
Runtime analysis:
a. Creation of the hash table takes O(n) time and O(n) space.
b. Creation of heaps: O(m*log(m)) time and O(m) space.
c. The greedy iterative algorithm is a little harder to analyze. Its runtime is really bounded by the distribution of elements. Worst case, we cover all the elements in each array in the hash table. For each element, we perform an O(log(m)) heap update.
Worst case runtime is hence O(n*log(m)) for the iterative greedy algorithm. In the best case, we discover very quickly that L[i] = R[i] for the minimum element in L or the maximum element in R, and the run time of the greedy algorithm is O(log(m)).
Average case seems really hard to analyze. What is the average “convergence” of this algorithm to the minimum window? If we were to assume that the Fibonacci increments / binary search help, we could say we only look at m*log(n/m) elements (every list has n/m elements) in the average case. In that case, the running time of the greedy algorithm would be m*log(n/m)*log(m).
Total running time
Best case: O(n + m*log(m) + log(m)) time = O(n) assuming m << n
Average case: O(n + m*log(m) + m*log(n/m)*log(m)) time = O(n) assuming m << n.
Worst case: O(n + n*log(m) + m*log(m)) = O(n*log(m)) assuming m << n.
Space: O(n + m) (hashtable and heaps) always.
Edit: Here is a worked out example:
A[5, 1, 1, 5, 6, 1, 1, 5]
B[5, 6]
H:
{
5 => {1, 4, 8}
6 => {5}
}
Greedy Algorithm:
L => {1, 1}
R => {3, 1}
Iteration 1:
a. Lmin = 1 (since H{5}[1] < H{6}[1]), Lmax = 5. Window: 5 - 1 + 1= 5
Increment Lmin pointer, it now becomes 2.
L => {2, 1}
Rmin = H{6}[1] = 5, Rmax = H{5}[3] = 8. Window = 8 - 5 + 1 = 4. Best window so far = 4 (less than 5 computed above).
We also note the indices in A (5, 8) for the best window.
Decrement Rmax, it now becomes 2 and the value is 4.
R => {2, 1}
b. Now, Lmin = 4 (H{5}[2]) and the index i in L is 1. Lmax = 5 (H{6}[1]) and the index in L is 2.
We can't increment Lmin since L[1] = R[1] = 2. Thus we just compute the window now.
The window = Lmax - Lmin + 1 = 2 which is the best window so far.
Thus, the best window in A = (4, 5).
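As a cross-check of the worked example, here is a Python 3 sketch that builds the same hash table H (1-based indices, as in the example) and then finds the window. Note that the second function is not the two-sided greedy described above: it swaps in the simpler, well-known single min-heap sweep over the m lists, which has the same O(n*log(m)) worst case. Both function names are mine.

```python
import heapq

def bin_indices(A, B):
    """Step a: map each distinct value of B to the sorted 1-based indices where it occurs in A."""
    H = {b: [] for b in set(B)}
    for idx, x in enumerate(A, start=1):
        if x in H:
            H[x].append(idx)
    return H

def smallest_window_from_bins(H):
    """Return 1-based (i, j) of the smallest window with one index from every list, else (-1, -1)."""
    if not H or any(not idxs for idxs in H.values()):
        return (-1, -1)
    # Heap entries are (index in A, key, position within H[key]).
    heap = [(idxs[0], key, 0) for key, idxs in H.items()]
    heapq.heapify(heap)
    cur_max = max(entry[0] for entry in heap)
    best = (-1, -1)
    while True:
        lo, key, pos = heapq.heappop(heap)
        if best == (-1, -1) or cur_max - lo < best[1] - best[0]:
            best = (lo, cur_max)
        if pos + 1 == len(H[key]):
            return best  # the list holding the minimum is exhausted
        nxt = H[key][pos + 1]
        cur_max = max(cur_max, nxt)
        heapq.heappush(heap, (nxt, key, pos + 1))

H = bin_indices([5, 1, 1, 5, 6, 1, 1, 5], [5, 6])
window = smallest_window_from_bins(H)
```

On the example arrays this reproduces H as shown above and the window (4, 5).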
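#include <algorithm>
#include <cstddef>
#include <map>
#include <set>

struct Pair {
    int i;
    int j;
};

// Smallest index stored in the map, i.e. the left edge of the candidate window.
// The map must not be empty.
static int findSmallestVal(const std::map<int, int>& last) {
    int best = last.begin()->second;
    for (const auto& kv : last) best = std::min(best, kv.second);
    return best;
}

// Variation 1 (duplicates in B are ignored), 0-based indices.
// Returns {-1, -1} if A does not contain every element of B.
// Runs in O(n*m) because findSmallestVal scans the whole map.
Pair
find_smallest_subarray_window(const int *A, size_t n, const int *B, size_t m)
{
    Pair p = {-1, -1};
    // distinct values of B, so repeated values are only required once
    std::set<int> wanted(B, B + m);
    // key is array value, value is the latest index of that value in A
    std::map<int, int> last;
    for (size_t i = 0; i < n; ++i) {
        if (wanted.count(A[i]) == 0) {
            continue;
        }
        // Always keep the most recent occurrence: for a window ending at i,
        // the latest occurrence of every value gives the shortest window.
        last[A[i]] = static_cast<int>(i);
        if (last.size() == wanted.size()) {
            int start = findSmallestVal(last);
            int end = static_cast<int>(i);
            if (p.i == -1 || end - start < p.j - p.i) {
                p.i = start;
                p.j = end;
            }
        }
    }
    return p;
}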

Resources