Efficient iteration over sorted partial sums - algorithm

I have a list of N positive numbers sorted in ascending order, L[0] to L[N-1].
I want to iterate over subsets of M distinct list elements (without replacement, order not important), 1 <= M <= N, sorted according to their partial sum. M is not fixed, the final result should consider all possible subsets.
I only want the K smallest subsets efficiently (ideally polynomial in K). The obvious algorithm of enumerating all subsets with M <= K is O(K!).
I can reduce the problem to subsets of fixed size M, by placing K iterators (1 <= M <= K) in a min-heap and having the master iterator operate on the heap root.
Essentially I need the Python function call:
sorted(itertools.combinations(L, M), key=sum)[:K]
... but efficient (N ~ 200, K ~ 30), should run in less than 1sec.
L = [1, 2, 5, 10, 11]
K = 8
answer = [(1,), (2,), (1,2), (5,), (1,5), (2,5), (1,2,5), (10,)]
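For reference, the expected answer above can be reproduced by brute force. This is only a minimal sketch for checking results on small inputs: it enumerates all 2^N - 1 non-empty subsets, so it is not the efficient algorithm being asked for, and the name `k_smallest_subsets_bruteforce` is made up for illustration.

```python
from itertools import chain, combinations

def k_smallest_subsets_bruteforce(L, K):
    """Return the K non-empty subsets of L with the smallest sums (exponential time)."""
    all_subsets = chain.from_iterable(
        combinations(L, m) for m in range(1, len(L) + 1)
    )
    # sorted() is stable, so subsets with equal sums keep their enumeration order.
    return sorted(all_subsets, key=sum)[:K]

print(k_smallest_subsets_bruteforce([1, 2, 5, 10, 11], 8))
# [(1,), (2,), (1, 2), (5,), (1, 5), (2, 5), (1, 2, 5), (10,)]
```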
As David's answer shows, the important trick is that for a subset S to be outputted, all subsets of S must have been previously outputted, in particular the subsets where only 1 element has been removed. Thus, every time you output a subset, you can add all 1-element extensions of this subset for consideration (a maximum of K), and still be sure that the next outputted subset will be in the list of all considered subsets up to this point.
Fully working, more efficient Python function:
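def sorted_subsets(L, K):
    # Each candidate is (sum, indices); the indices within a tuple are ascending.
    candidates = [(L[i], (i,)) for i in range(min(len(L), K))]
    for j in range(K):
        new = candidates.pop(0)
        yield tuple(L[i] for i in new[1])
        # Extend the subset with every element smaller than its current minimum.
        new_candidates = [(L[i] + new[0], (i,) + new[1]) for i in range(new[1][0])]
        # Keep only as many candidates as there are subsets left to output.
        candidates = sorted(candidates + new_candidates)[:K-j-1]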
UPDATE, found an O(K log K) algorithm.
This is similar to the trick above, but instead of adding all 1-element extensions with the elements added greater than the max of the subset, you consider only 2 extensions: one that adds max(S)+1, and the other one that shifts max(S) to max(S) + 1 (that would eventually generate all 1-element extensions to the right).
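import heapq
def sorted_subsets_faster(L, K):
    # Each heap entry is (sum, indices); indices are ascending, so new[1][-1] is the largest.
    candidates = [(L[0], (0,))]
    for j in range(K):
        new = heapq.heappop(candidates)
        yield tuple(L[i] for i in new[1])
        i = new[1][-1]
        if i+1 < len(L):
            # Extension 1: add the element right after the current maximum.
            heapq.heappush(candidates, (new[0] + L[i+1], new[1] + (i+1,)))
            # Extension 2: replace the current maximum with the element right after it.
            heapq.heappush(candidates, (new[0] - L[i] + L[i+1], new[1][:-1] + (i+1,)))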
From my benchmarks, it is faster for ALL values of K.
Also, it is not necessary to supply in advance the value of K, we can just iterate and stop whenever, without changing the efficiency of the algorithm. Also note that the number of candidates is bounded by K+1.
It might be possible to improve even further by using a priority deque (min-max heap) instead of a priority queue, but frankly I'm satisfied with this solution. I'd be interested in a linear algorithm though, or a proof that it's impossible.
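To illustrate the point that K does not have to be known in advance, here is a minimal sketch of the same two-extension idea written as an open-ended generator. The name `iter_sorted_subsets` is hypothetical, and the sketch assumes L is sorted ascending and contains only positive numbers, as in the question.

```python
import heapq
from itertools import islice

def iter_sorted_subsets(L):
    """Yield the non-empty subsets of L in nondecreasing order of sum."""
    if not L:
        return
    heap = [(L[0], (0,))]  # entries are (sum, ascending index tuple)
    while heap:
        total, idx = heapq.heappop(heap)
        yield tuple(L[i] for i in idx)
        last = idx[-1]
        if last + 1 < len(L):
            # Add the element right after the current maximum.
            heapq.heappush(heap, (total + L[last + 1], idx + (last + 1,)))
            # Shift the current maximum one position to the right.
            heapq.heappush(heap, (total - L[last] + L[last + 1], idx[:-1] + (last + 1,)))

# The caller decides when to stop, here after 8 subsets.
print(list(islice(iter_sorted_subsets([1, 2, 5, 10, 11]), 8)))
# [(1,), (2,), (1, 2), (5,), (1, 5), (2, 5), (1, 2, 5), (10,)]
```

Every subset has exactly one parent under these two extensions, so nothing is produced twice and no "seen" set is needed.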

Here's some rough Python-ish pseudo-code:
final = []
L = L[:K]  # Anything after the first K is too big already
sorted_candidates = [{x} for x in L]  # Start with the single-element subsets
while len( final ) < K:
    final.append( sorted_candidates[0] )  # We keep it sorted so the first option
                                          # is always the smallest sum not
                                          # already included
    # If you just added a subset of size A, make a bunch of subsets of size A+1
    expansion = [sorted_candidates[0].add( x )
                 for x in L and x not already included in sorted_candidates[0]]
    # We're done with the first element, so remove it
    sorted_candidates = sorted_candidates[1:]
    # Now go through and build a new set of sorted candidates by getting the
    # smallest possible ones from sorted_candidates and expansion
    new_candidates = []
    for i in range(K - len( final )):
        if sum( expansion[0] ) < sum( sorted_candidates[0] ):
            new_candidates.append( expansion[0] )
            expansion = expansion[1:]
        else:
            new_candidates.append( sorted_candidates[0] )
            sorted_candidates = sorted_candidates[1:]
    sorted_candidates = new_candidates
We'll assume that you will do things like removing the first element of an array in an efficient way, so the only real work in the loop is in building expansion and in rebuilding sorted_candidates. Both of these have fewer than K steps, so as an upper bound, you're looking at a loop that is O(K) and that is run K times, so O(K^2) for the algorithm.


Algorithm to generate permutations by order of fewest positional changes

I'm looking for an algorithm to generate or iterate through all permutations of a list of objects such that:
They are generated in order from fewest to most positional changes from the original. So first all the permutations with a single pair of elements swapped, then all the permutations with only two pairs of elements swapped, etc.
The list generated is complete, so for n objects in a list there should be n! total, unique permutations.
Ideally (but not necessarily) there should be a way of specifying (and generating) a particular permutation without having to generate the full list first and then reference the index.
The speed of the algorithm is not particularly important.
I've looked through all the permutation algorithms that I can find, and none so far have met criteria 1 and 2, let alone 3.
I have an idea how I could write this algorithm myself using recursion, and filtering for duplicates to only get unique permutations. However, if there is any existing algorithm I'd much rather use something proven.
This code answers your requirement #3, which is to compute permutation at index N directly.
This code relies on the following principle:
The first permutation is the identity; then the next (n choose 2) permutations just swap two elements; then the next (n choose 3)(subfactorial(3)) permutations derange 3 elements; then the next (n choose 4)(subfactorial(4)) permutations derange 4 elements; etc. To find the Nth permutation (counting from 0), first figure out how many elements it deranges by finding the smallest K such that N < sum_{k=0}^{K} (n choose k) subfactorial(k).
This number K is found by function number_of_derangements_for_permutation_at_index in the code.
Then, the relevant subset of indices which must be deranged is computed efficiently using more_itertools.nth_combination.
However, I didn't have a function nth_derangement to find the relevant derangement of the deranged subset of indices. Hence the last step of the algorithm, which computes this derangement, could be optimised if there exists an efficient function to find the nth derangement of a sequence efficiently.
As a result, this last step takes time that grows with idx_r and is bounded by factorial(k) in the worst case, where idx_r is the index of the derangement, a number between 0 and subfactorial(k) - 1, and k is the number of elements which are deranged by the returned permutation.
from sympy import subfactorial
from math import comb
from itertools import count, accumulate, pairwise, permutations
from more_itertools import nth_combination, nth
def number_of_derangements_for_permutation_at_index(n, idx):
    #n = len(seq)
    for k, (low_acc, high_acc) in enumerate(pairwise(accumulate((comb(n,k) * subfactorial(k) for k in count(2)), initial=1)), start=2):
        if low_acc <= idx < high_acc:
            return k, low_acc
def is_derangement(seq, perm):
    return all(i != j for i,j in zip(seq, perm))
def lift_permutation(seq, deranged, permutation):
    result = list(seq)
    for i,j in zip(deranged, permutation):
        result[i] = seq[j]
    return result
def nth_derangement(seq, idx):
    return nth((p for p in permutations(seq) if is_derangement(seq, p)),
               idx)
def nth_permutation(seq, idx):
    if idx == 0:
        return list(seq)
    n = len(seq)
    k, acc = number_of_derangements_for_permutation_at_index(n, idx)
    idx_q, idx_r = divmod(idx - acc, subfactorial(k))
    deranged = nth_combination(range(n), k, idx_q)
    derangement = nth_derangement(deranged, idx_r) # TODO: FIND EFFICIENT VERSION
    return lift_permutation(seq, deranged, derangement)
Testing for correctness on small data:
print( [''.join(nth_permutation('abcd', i)) for i in range(24)] )
# ['abcd',
# 'bacd', 'cbad', 'dbca', 'acbd', 'adcb', 'abdc',
# 'bcad', 'cabd', 'bdca', 'dacb', 'cbda', 'dbac', 'acdb', 'adbc',
# 'badc', 'bcda', 'bdac', 'cadb', 'cdab', 'cdba', 'dabc', 'dcab', 'dcba']
Testing for speed on medium data:
from math import factorial
seq = 'abcdefghij'
n = len(seq) # 10
N = factorial(n) // 2 # 1814400
perm = ''.join(nth_permutation(seq, N))
# fcjdibaehg
Imagine a graph with n! nodes labeled with every permutation of n elements. If we add edges to this graph such that nodes which can be obtained by swapping one pair of elements are connected, an answer to your problem is obtained by doing a breadth-first search from whatever node you like.
You can actually generate the graph or just let it be implied and just deduce at each stage what nodes should be adjacent (and of course, keep track of ones you've already visited, to avoid revisiting them).
I concede this probably doesn't help with point 3, but maybe is a viable strategy for getting points 1 and 2 answered.
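The breadth-first search described above could be sketched roughly as follows. The graph is left implicit: neighbours are produced by swapping each pair of positions. The function name `permutations_by_swap_distance` is made up for this sketch, and it assumes the elements of the input are distinct and hashable.

```python
from collections import deque
from itertools import combinations

def permutations_by_swap_distance(seq):
    """Yield (permutation, number_of_swaps) pairs in order of increasing swap count."""
    start = tuple(seq)
    seen = {start}                 # nodes already discovered
    queue = deque([(start, 0)])
    while queue:
        perm, dist = queue.popleft()
        yield perm, dist
        # Neighbours are all permutations reachable by swapping one pair of positions.
        for i, j in combinations(range(len(perm)), 2):
            nxt = list(perm)
            nxt[i], nxt[j] = nxt[j], nxt[i]
            nxt = tuple(nxt)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))

for perm, dist in permutations_by_swap_distance('abc'):
    print(dist, ''.join(perm))
```

The `seen` set holds all n! permutations by the end, so this only makes sense for small n.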
To solve 1 & 2, you could first generate all possible permutations, keeping track of how many swaps occurred during generation for each list. Then sort them by number of swaps. Which I think is O(n! + nlgn) = O(n!)

Generate one permutation from an index

Is there an efficient algorithm to generate a permutation from one index provided? The permutations do not need to have any specific ordering and it just needs to return every permutation once per every possible index. The set I wish to permute is all integers from 0~255.
If I understand the question correctly, the problem is as follows: You are given two integers n and k, and you want to find the kth permutation of n integers. You don't care about it being the kth lexicographical permutation, but it's just easier to be lexicographical so let's stick with that.
This is not too bad to compute. The base permutation is 1,2,3,4...n. This is the k=0 case. Consider what happens if you were to swap the 1 and 2: by moving the 1, you are passing up every single permutation where 1 goes first, and there are (n-1)! of those (since you could have permuted 2,3,4..n if you fixed the 1 in place). Thus, the algorithm is as follows:
for i from 1 to n:
    j = k / (n-i)!   // integer division, so rounded down
    k -= j * (n-i)!
    place down the jth unplaced number
This will iteratively produce the kth lexicographical permutation, since it repeatedly solves a sub-problem with a smaller set of numbers to place, and decrementing k along the way.
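In Python, the loop above might look like the following minimal sketch. The name `kth_lexicographic_permutation` is made up for illustration; k is 0-based, and the result is lexicographic only if the input is given in sorted order.

```python
from math import factorial

def kth_lexicographic_permutation(items, k):
    """Return the k-th (0-based) permutation of items in lexicographic order."""
    pool = list(items)
    n = len(pool)
    if not 0 <= k < factorial(n):
        raise IndexError("k must be in range(n!)")
    result = []
    for i in range(1, n + 1):
        # j counts how many blocks of (n-i)! permutations are skipped.
        j, k = divmod(k, factorial(n - i))
        result.append(pool.pop(j))  # place down the j-th unplaced item
    return result

print(kth_lexicographic_permutation(range(4), 0))
# [0, 1, 2, 3]
```

Using `list.pop(j)` makes each step O(n), so the whole call is O(n^2), which is fine for 256 elements.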
There is an implementation in python in module more-itertools: nth_permutation.
Here is an implementation, adapted from the code of more_itertools.nth_permutation:
from sympy import factorial
def nth_permutation(iterable, index):
    pool = list(iterable)
    n = len(pool)
    c = factorial(n)
    index = index % c
    result = [0] * n
    q = index
    # Decompose the index in the factorial number system, lowest digit first.
    for d in range(1, n + 1):
        q, i = divmod(q, d)
        if 0 <= n - d < n:
            result[n - d] = i
        if q == 0:
            break
    return tuple(map(pool.pop, result))
print( nth_permutation(range(6), 360) )
# (3, 0, 1, 2, 4, 5)

Find longest sequences with sufficient average score

I have a long list of scores between 0 and 1. How do I efficiently find all contiguous sublists longer than x elements such that the average score in each sublist is not less than y?
E.g., how do I find all contiguous sublists longer than 300 elements such that the average score of these sublists is not less than 0.8?
I'm mainly interested in the LONGEST sublists that fulfill these criteria, not actually all sublists. So I'm looking for all longest sublists.
If you want only the longest such substrings, this can be solved in O(n log n) time by transforming the problem slightly and then binary-searching over maximum solution lengths.
Let the input list of scores be x[1], ..., x[n]. Let's transform this list by subtracting y from each element, to form the list z[1], ..., z[n], whose elements may be positive or negative. Notice that any sublist x[i .. j] has average score at least y if and only if the sum of elements in the corresponding sublist in z (i.e., z[i] + z[i+1] + ... + z[j]) is at least 0. So, if we had a way to compute the maximum sum T of any sublist in z[] efficiently (spoiler: we do), this would, as a side effect, tell us if there is any sublist in x[] that has average score at least y: if T >= 0 then there is at least 1 such sublist, while if T < 0 then there is no sublist in x[] (not even a single-element sublist) that has average score at least y. But this doesn't yet give us all the information we need to answer your original question, since nothing forces the maximum-sum sublist in z to have maximum length: it could well be that a longer sublist exists that has lower overall average, while still having average at least y.
This can be addressed by generalising the problem of finding the sublist with maximum sum: instead of asking for a sublist with maximum sum overall, we will now ask for a sublist having maximum sum among all sublists having length at least some given k. I'll now describe an algorithm that, given a list of numbers z[1], ..., z[n], each of which can be positive or negative, and any positive integer k, will compute the maximum sum of any sublist of z[] having length at least k, as well as the location of a particular sublist that achieves this sum, and has longest possible length among all sublists having this sum. It's a slight generalisation of Kadane's algorithm.
FindMaxSumLongerThan(z[], k):
    v = 0           # Sum of the rightmost k numbers in the current sublist
    For i from 1 to k:
        v = v + z[i]
    best = v
    bestStart = 1
    bestEnd = k
    # Now for each i, with k+1 <= i <= n, find the biggest sum ending at position i.
    tail = -1       # Will contain the maximum sum among all lists ending at i-k
    tailLen = 0     # The length of the longest list having the above sum
    For i from k+1 to n:
        If tail >= 0:
            tail = tail + z[i-k]
            tailLen = tailLen + 1
        Else:
            tail = z[i-k]
            tailLen = 1
        If tail >= 0:
            nonnegTail = tail
            nonnegTailLen = tailLen
        Else:
            nonnegTail = 0
            nonnegTailLen = 0
        v = v + z[i] - z[i-k]   # Slide the window right 1 position
        If v + nonnegTail > best:
            best = v + nonnegTail
            bestStart = i - k - nonnegTailLen + 1
            bestEnd = i
The above algorithm takes O(n) time and O(1) space, returning the maximum sum in best and the beginning and ending positions of some sublist that achieves that sum in bestStart and bestEnd, respectively.
How is the above useful? For a given input list x[], suppose we first transform x[] into z[] by subtracting y from each element as described above; this will be the z[] passed into every call to FindMaxSumLongerThan(). We can view the value of best that results from calling the function with z[] and a given minimum sublist length k as a mathematical function of k: best(k). Since FindMaxSumLongerThan() finds the maximum sum of any sublist of z[] having length at least k, best(k) is a nonincreasing function of k. (Say we set k=5 and found that the maximum sum of any sublist is 42; then we are guaranteed to find a total of at least 42 if we try again with k=4 or k=3.) That means we can binary search on k to find the largest k such that best(k) >= 0: that k will then be the length of the longest sublist of x[] that has average value at least y. The resulting bestStart and bestEnd will identify a particular sublist having this property; it's easy to modify the algorithm to find all (at most n -- one per rightmost position) of these sublists without increasing the time complexity.
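Putting the two pieces together, a minimal Python sketch could look like this. It uses 0-based indices and half-open ranges (so the sublist is `x[start:end]`), and the names `max_sum_at_least` and `longest_with_average` are made up for illustration. Because it subtracts y in floating point, borderline cases can be affected by rounding.

```python
def max_sum_at_least(z, k):
    """Return (best, start, end): the maximum sum of any z[start:end] with end - start >= k."""
    v = sum(z[:k])                  # sum of the current window of k elements
    best, start, end = v, 0, k
    tail, tail_len = -1, 0          # best sum (and its length) of a run ending just left of the window
    for i in range(k, len(z)):      # i is the index of the new rightmost element
        left = z[i - k]             # this element leaves the window and joins the tail
        if tail >= 0:
            tail, tail_len = tail + left, tail_len + 1
        else:
            tail, tail_len = left, 1
        ext, ext_len = (tail, tail_len) if tail >= 0 else (0, 0)
        v += z[i] - left            # slide the window right by one position
        if v + ext > best:
            best, start, end = v + ext, i - k + 1 - ext_len, i + 1
    return best, start, end

def longest_with_average(x, y):
    """Return (start, end) of a longest x[start:end] with average >= y, or None."""
    z = [s - y for s in x]
    lo, hi, found = 0, len(x), None     # best(lo) >= 0 is known for lo > 0
    while lo < hi:
        mid = (lo + hi + 1) // 2
        best, start, end = max_sum_at_least(z, mid)
        if best >= 0:
            lo, found = mid, (start, end)
        else:
            hi = mid - 1
    return found

print(longest_with_average([0.5, 1.0, 1.0, 0.5, 0.0, 1.0], 0.75))
# (0, 4)
```

Each call to `max_sum_at_least` is O(n) and the binary search makes O(log n) calls, matching the O(n log n) bound stated above.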
I think that the general solution is always O(N^2). I will demonstrate some code in Python and some optimizations you can implement to increase the performance by several orders of magnitude.
Let's generate some data:
from random import random
scores_list = [random() for i in range(10000)]
scores_len = len(scores_list)
Let's say these are our target values:
# Your average
avg = 0.55
# Your min length
min_len = 10
Here is a naive brute force solution
res = []
for i in range(scores_len - min_len):
    for j in range(i+min_len, scores_len):
        l = scores_list[i:j]
        if sum(l) / (j - i) >= avg:
            res.append(l)
That will run very slowly because it has to perform 10000^2 (10^8) operations.
Here is how we can do it better. It is still quadratic, but there are some tricks which allow it to perform much, much faster:
res = []
i = 0
while i < scores_len - min_len:
    j = i + min_len
    di = scores_len
    dj = 0
    current_sum = sum(scores_list[i:j])
    while j < scores_len:
        # Add the dj elements that entered the window since the last step.
        current_sum += sum(scores_list[j-dj:j])
        current_avg = current_sum/(j - i)
        if current_avg >= avg:
            res.append((i, j))
            dj = 1
            di = 1
        else:
            # Skip ahead by the minimum number of steps that could reach the target average.
            dj = max(1, int((avg * (j - i) - current_sum)/(1 - avg)))
            di = min(di, max(1, int(((j-i) * avg - current_sum)/avg)))
        j += dj
    i += di
For a uniform distribution (which we have here) and the given target values, it performs fewer than 10^6 operations (~7 * 10^5), which is two orders of magnitude fewer than the brute-force solution.
So basically, if there are only a few qualifying sublists, it performs very well. If there are a lot of them, this algorithm will be about the same as the brute-force one.

How to find pair with kth largest sum?

Given two sorted arrays of numbers, we want to find the pair with the kth largest possible sum. (A pair is one element from the first array and one element from the second array). For example, with arrays
[2, 3, 5, 8, 13]
[4, 8, 12, 16]
The pairs with largest sums are
13 + 16 = 29
13 + 12 = 25
8 + 16 = 24
13 + 8 = 21
8 + 12 = 20
So the pair with the 4th largest sum is (13, 8). How to find the pair with the kth largest possible sum?
Also, what is the fastest algorithm? The arrays are already sorted and have sizes M and N.
I am already aware of the O(K log K) solution using a max-heap, given here.
It is also one of the favorite Google interview questions, and they demand an O(k) solution.
I've also read somewhere that there exists an O(k) solution, which I am unable to figure out.
Can someone explain the correct solution with pseudocode?
Please DON'T post this link as an answer/comment. It DOESN'T contain the answer.
I start with a simple but not quite linear-time algorithm. We choose some value between array1[0]+array2[0] and array1[N-1]+array2[N-1]. Then we determine how many pair sums are greater than this value and how many of them are less. This may be done by iterating the arrays with two pointers: pointer to the first array incremented when sum is too large and pointer to the second array decremented when sum is too small. Repeating this procedure for different values and using binary search (or one-sided binary search) we could find Kth largest sum in O(N log R) time, where N is size of the largest array and R is number of possible values between array1[N-1]+array2[N-1] and array1[0]+array2[0]. This algorithm has linear time complexity only when the array elements are integers bounded by small constant.
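The counting step with two pointers can be sketched as below. This is just one way to arrange the pointers (the paragraph above describes an equivalent walk), the name `count_pairs_greater` is made up, and both arrays are assumed to be sorted ascending.

```python
def count_pairs_greater(a, b, value):
    """Count pairs (x in a, y in b) with x + y > value in O(len(a) + len(b)) time."""
    count = 0
    j = len(b)  # b[j:] are exactly the elements that exceed value when added to the current x
    for x in a:
        # x only grows, so the boundary in b can only move to the left.
        while j > 0 and x + b[j - 1] > value:
            j -= 1
        count += len(b) - j
    return count

print(count_pairs_greater([2, 3, 5, 8, 13], [4, 8, 12, 16], 20))
# 5
```

Binary searching on `value` with this counter as the test is what gives the O(N log R) bound mentioned above.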
Previous algorithm may be improved if we stop binary search as soon as number of pair sums in binary search range decreases from O(N^2) to O(N). Then we fill auxiliary array with these pair sums (this may be done with slightly modified two-pointers algorithm). And then we use quickselect algorithm to find Kth largest sum in this auxiliary array. All this does not improve worst-case complexity because we still need O(log R) binary search steps. What if we keep the quickselect part of this algorithm but (to get proper value range) we use something better than binary search?
We could estimate value range with the following trick: get every second element from each array and try to find the pair sum with rank k/4 for these half-arrays (using the same algorithm recursively). Obviously this should give some approximation for needed value range. And in fact slightly improved variant of this trick gives range containing only O(N) elements. This is proven in following paper: "Selection in X + Y and matrices with sorted rows and columns" by A. Mirzaian and E. Arjomandi. This paper contains detailed explanation of the algorithm, proof, complexity analysis, and pseudo-code for all parts of the algorithm except Quickselect. If linear worst-case complexity is required, Quickselect may be augmented with Median of medians algorithm.
This algorithm has complexity O(N). If one of the arrays is shorter than other array (M < N) we could assume that this shorter array is extended to size N with some very small elements so that all calculations in the algorithm use size of the largest array. We don't actually need to extract pairs with these "added" elements and feed them to quickselect, which makes algorithm a little bit faster but does not improve asymptotic complexity.
If k < N we could ignore all the array elements with index greater than k. In this case complexity is equal to O(k). If N < k < N(N-1) we just have better complexity than requested in OP. If k > N(N-1), we'd better solve the opposite problem: k'th smallest sum.
I uploaded simple C++11 implementation to ideone. Code is not optimized and not thoroughly tested. I tried to make it as close as possible to pseudo-code in linked paper. This implementation uses std::nth_element, which allows linear complexity only on average (not worst-case).
A completely different approach to find K'th sum in linear time is based on priority queue (PQ). One variation is to insert largest pair to PQ, then repeatedly remove top of PQ and instead insert up to two pairs (one with decremented index in one array, other with decremented index in other array). And take some measures to prevent inserting duplicate pairs. Other variation is to insert all possible pairs containing largest element of first array, then repeatedly remove top of PQ and instead insert pair with decremented index in first array and same index in second array. In this case there is no need to bother about duplicates.
OP mentions O(K log K) solution where PQ is implemented as max-heap. But in some cases (when array elements are evenly distributed integers with limited range and linear complexity is needed only on average, not worst-case) we could use O(1) time priority queue, for example, as described in this paper: "A Complexity O(1) Priority Queue for Event Driven Molecular Dynamics Simulations" by Gerald Paul. This allows O(K) expected time complexity.
Advantage of this approach is a possibility to provide first K elements in sorted order. Disadvantages are limited choice of array element type, more complex and slower algorithm, worse asymptotic complexity: O(K) > O(N).
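For comparison, the first priority-queue variation (with an ordinary binary heap, so O(K log K) rather than the O(1)-queue version) might be sketched like this. The name `kth_largest_pair` is hypothetical, both arrays are assumed sorted ascending, and a set of visited index pairs is the "measure to prevent duplicates". When several pairs share the same sum, which of them is reported is arbitrary.

```python
import heapq

def kth_largest_pair(a, b, k):
    """Return (sum, x, y) for a pair with the k-th largest sum (k is 1-based)."""
    if not 1 <= k <= len(a) * len(b):
        raise ValueError("k must be between 1 and len(a) * len(b)")
    i, j = len(a) - 1, len(b) - 1
    heap = [(-(a[i] + b[j]), i, j)]   # negate sums: heapq is a min-heap
    seen = {(i, j)}
    for _ in range(k - 1):
        _, i, j = heapq.heappop(heap)
        # Replace the removed pair with its two "one step smaller" neighbours.
        for ni, nj in ((i - 1, j), (i, j - 1)):
            if ni >= 0 and nj >= 0 and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(heap, (-(a[ni] + b[nj]), ni, nj))
    neg_sum, i, j = heap[0]
    return -neg_sum, a[i], b[j]

print(kth_largest_pair([2, 3, 5, 8, 13], [4, 8, 12, 16], 1))
# (29, 13, 16)
```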
EDIT: This does not work. I leave the answer, since apparently I am not the only one who could have this kind of idea; see the discussion below.
A counter-example is x = (2, 3, 6), y = (1, 4, 5) and k=3, where the algorithm gives 7 (3+4) instead of 8 (3+5).
Let x and y be the two arrays, sorted in decreasing order; we want to construct the K-th largest sum.
The variables are: i the index in the first array (element x[i]), j the index in the second array (element y[j]), and k the "order" of the sum (k in 1..K), in the sense that S(k)=x[i]+y[j] will be the k-th greatest sum satisfying your conditions (this is the loop invariant).
Start from (i, j) equal to (0, 0): clearly, S(1) = x[0]+y[0].
for k from 1 to K-1, do:
    if x[i+1] + y[j] > x[i] + y[j+1], then i := i+1 (and j does not change); else j := j+1
To see that it works, consider you have S(k) = x[i] + y[j]. Then, S(k+1) is the greatest sum which is lower than (or equal to) S(k), and such that at least one index (i or j) changes. It is not difficult to see that exactly one of i or j should change.
If i changes, the greatest sum you can construct which is lower than S(k) is obtained by setting i=i+1, because x is decreasing and all the x[i'] + y[j] with i' < i are greater than S(k). The same holds for j, showing that S(k+1) is either x[i+1] + y[j] or x[i] + y[j+1].
Therefore, at the end of the loop you have found the K-th greatest sum.
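The counter-example mentioned in the edit note can be checked with a few lines of Python. This is a small sketch with made-up names (`greedy_kth`, `true_kth`), and it does no bounds checking, which is enough for this particular input.

```python
def greedy_kth(x, y, K):
    """The single-path walk described above; x and y are sorted in decreasing order."""
    i = j = 0
    for _ in range(K - 1):
        if x[i + 1] + y[j] > x[i] + y[j + 1]:
            i += 1
        else:
            j += 1
    return x[i] + y[j]

def true_kth(x, y, K):
    """Brute force: sort all pair sums in decreasing order and take the K-th."""
    return sorted((p + q for p in x for q in y), reverse=True)[K - 1]

x, y = [6, 3, 2], [5, 4, 1]
print(greedy_kth(x, y, 3), true_kth(x, y, 3))
# 7 8
```

The walk keeps a single (i, j) position, so once it has moved along y it can never come back to pick up 3+5.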
tl;dr: If you look ahead and look behind at each iteration, you can start with the end (which is highest) and work back in O(K) time.
Although the insight underlying this approach is, I believe, sound, the code below is not quite correct at present (see comments).
Let's see: first of all, the arrays are sorted. So, if the arrays are a and b with lengths M and N, and as you have arranged them, the largest items are in slots M and N respectively, the largest pair will always be a[M]+b[N].
Now, what's the second largest pair? It's going to have perhaps one of {a[M],b[N]} (it can't have both, because that's just the largest pair again), and at least one of {a[M-1],b[N-1]}. BUT, we also know that if we choose a[M-1]+b[N-1], we can make one of the operands larger by choosing the higher number from the same list, so it will have exactly one number from the last column, and one from the penultimate column.
Consider the following two arrays: a = [1, 2, 53]; b = [66, 67, 68]. Our highest pair is 53+68. If we lose the smaller of those two, our pair is 68+2; if we lose the larger, it's 53+67. So, we have to look ahead to decide what our next pair will be. The simplest lookahead strategy is simply to calculate the sum of both possible pairs. That will always cost two additions, and two comparisons for each transition (three because we need to deal with the case where the sums are equal); let's call that cost Q.
At first, I was tempted to repeat that K-1 times. BUT there's a hitch: the next largest pair might actually be the other pair we can validly make from {{a[M],b[N]}, {a[M-1],b[N-1]}}. So, we also need to look behind.
So, let's code (python, should be 2/3 compatible):
def kth(a,b,k):
    M = len(a)
    N = len(b)
    if k > M*N:
        raise ValueError("There are only %s possible pairs; you asked for the %sth largest, which is impossible" % (M*N,k))
    (ia,ib) = M-1,N-1 #0 based arrays
    # we need this for lookback
    nottakenindices = (0,0) # could be any value
    nottakensum = float('-inf')
    for i in range(k-1):
        optionone = a[ia]+b[ib-1]
        optiontwo = a[ia-1]+b[ib]
        biggest = max((optionone,optiontwo))
        #first deal with look behind
        if nottakensum > biggest:
            if optionone == biggest:
                newnottakenindices = (ia,ib-1)
            else: newnottakenindices = (ia-1,ib)
            ia,ib = nottakenindices
            nottakensum = biggest
            nottakenindices = newnottakenindices
        #deal with case where indices hit 0
        elif ia <= 0 and ib <= 0:
            ia = ib = 0
        elif ia <= 0:
            ib-=1
            ia = 0
            nottakensum = float('-inf')
        elif ib <= 0:
            ia-=1
            ib = 0
            nottakensum = float('-inf')
        #lookahead cases
        elif optionone > optiontwo:
            #then choose the first option as our next pair
            nottakensum,nottakenindices = optiontwo,(ia-1,ib)
            ib-=1
        elif optionone < optiontwo: # choose the second
            nottakensum,nottakenindices = optionone,(ia,ib-1)
            ia-=1
        #next two cases apply if options are equal
        elif a[ia] > b[ib]:# drop the smallest
            nottakensum,nottakenindices = optiontwo,(ia-1,ib)
            ib-=1
        else: # might be equal or not - we can choose arbitrarily if equal
            nottakensum,nottakenindices = optionone,(ia,ib-1)
            ia-=1
        #+2 - one for zero-based, one for skipping the 1st largest
        data = (i+2,a[ia],b[ib],a[ia]+b[ib],ia,ib)
        narrative = "%sth largest pair is %s+%s=%s, with indices (%s,%s)" % data
        print (narrative) #this will work in both versions of python
        if ia <= 0 and ib <= 0:
            raise ValueError("Both arrays exhausted before Kth (%sth) pair reached"%data[0])
    return data, narrative
For those without python, here's an ideone: http://ideone.com/tfm2MA
At worst, we have 5 comparisons in each iteration, and K-1 iterations, which means that this is an O(K) algorithm.
Now, it might be possible to exploit information about differences between values to optimise this a little bit, but this accomplishes the goal.
Here's a reference implementation (not O(K), but will always work, unless there's a corner case with cases where pairs have equal sums):
import itertools
def refkth(a,b,k):
    # Sort all index-decorated pairs by sum, largest first, and take the kth.
    (rightia,righta),(rightib,rightb) = sorted(itertools.product(enumerate(a),enumerate(b)), key=lambda pair: pair[0][1] + pair[1][1], reverse=True)[k-1]
    data = k,righta,rightb,righta+rightb,rightia,rightib
    narrative = "%sth largest pair is %s+%s=%s, with indices (%s,%s)" % data
    print (narrative) #this will work in both versions of python
    return data, narrative
This calculates the cartesian product of the two arrays (i.e. all possible pairs), sorts them by sum, and takes the kth element. The enumerate function decorates each item with its index.
The max-heap algorithm in the other question is simple, fast and correct. Don't knock it. It's really well explained too. https://stackoverflow.com/a/5212618/284795
Might be there isn't any O(k) algorithm. That's okay, O(k log k) is almost as fast.
If the last two solutions were at (a1, b1), (a2, b2), then it seems to me there are only four candidate solutions (a1-1, b1) (a1, b1-1) (a2-1, b2) (a2, b2-1). This intuition could be wrong. Surely there are at most four candidates for each coordinate, and the next highest is among the 16 pairs (a in {a1,a2,a1-1,a2-1}, b in {b1,b2,b1-1,b2-1}). That's O(k).
(No it's not, still not sure whether that's possible.)
[2, 3, 5, 8, 13]
[4, 8, 12, 16]
Merge the 2 arrays and note down the indexes in the sorted array. Here is what the index array looks like (starting from 1, not 0):
[1, 2, 4, 6, 8]
[3, 5, 7, 9]
Now start from end and make tuples. sum the elements in the tuple and pick the kth largest sum.
public static List<List<Integer>> optimization(int[] nums1, int[] nums2, int k) {
    // 2 * O(n log(n))
    Arrays.sort(nums1);
    Arrays.sort(nums2);
    List<List<Integer>> results = new ArrayList<>(k);
    int endIndex = 0;
    // Find the number whose square is the first one bigger than k
    for (int i = 1; i <= k; i++) {
        if (i * i >= k) {
            endIndex = i;
            break;
        }
    }
    // The following Iteration provides at most endIndex^2 elements, and both arrays are in ascending order,
    // so k smallest pairs must can be found in this iteration. To flatten the nested loop, refer
    // 'https://stackoverflow.com/questions/7457879/algorithm-to-optimize-nested-loops'
    for (int i = 0; i < endIndex * endIndex; i++) {
        int m = i / endIndex;
        int n = i % endIndex;
        List<Integer> item = new ArrayList<>(2);
        item.add(nums1[m]);
        item.add(nums2[n]);
        results.add(item);
    }
    results.sort(Comparator.comparing(pair->pair.get(0) + pair.get(1)));
    return results.stream().limit(k).collect(Collectors.toList());
}
Key to eliminate O(n^2):
Avoid cartesian product(or 'cross join' like operation) of both arrays, which means flattening the nested loop.
Downsize iteration over the 2 arrays.
Sort both arrays (Arrays.sort offers O(n log(n)) performance according to Java doc)
Limit the iteration range to the size which is just big enough to support k smallest pairs searching.

Algorithm for Shuffling a Linked List in n log n time

I'm trying to shuffle a linked list using a divide-and-conquer algorithm that randomly shuffles a linked list in linearithmic (n log n) time and logarithmic (log n) extra space.
I'm aware that I can do a Knuth shuffle similar to the one that could be used on a simple array of values, but I'm not sure how I would do this with divide-and-conquer. What I mean is, what am I actually dividing? Do I just divide to each individual node in the list and then randomly assemble the list back together using some random value?
Or do I give each node a random number and then do a mergesort on the nodes based on the random numbers?
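The second idea (give every node a random number, then merge sort on those numbers) could be sketched as follows. This is only an illustration with made-up names (`Node`, `shuffle`, `merge_sort_by_key`); note that storing one key per node costs O(n) extra space, so it does not meet the O(log n) space goal, and it treats ties between random keys as negligible.

```python
import random

class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next
        self.key = 0.0  # random sort key, filled in by shuffle()

def merge_sort_by_key(head):
    """Sort a singly linked list by node.key and return the new head."""
    if head is None or head.next is None:
        return head
    slow, fast = head, head.next          # split with slow/fast pointers
    while fast and fast.next:
        slow, fast = slow.next, fast.next.next
    second, slow.next = slow.next, None
    left, right = merge_sort_by_key(head), merge_sort_by_key(second)
    dummy = tail = Node(None)             # temporary head for the merged list
    while left and right:
        if left.key <= right.key:
            tail.next, left = left, left.next
        else:
            tail.next, right = right, right.next
        tail = tail.next
    tail.next = left or right
    return dummy.next

def shuffle(head):
    """Assign a fresh random key to every node, then sort by those keys."""
    node = head
    while node is not None:
        node.key = random.random()
        node = node.next
    return merge_sort_by_key(head)

def from_iterable(values):
    head = None
    for v in reversed(list(values)):
        head = Node(v, head)
    return head

def to_list(head):
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out

print(to_list(shuffle(from_iterable(range(10)))))
```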
What about the following? Perform the same procedure as merge sort. When merging, instead of selecting an element (one-by-one) from the two lists in sorted order, flip a coin. Choose whether to pick an element from the first or from the second list based on the result of the coin flip.
Edit (2022-01-12): As GA1 points out in the answer below, this algorithm doesn't produce a permutation uniformly at random.
shuffle(list):
    if list contains a single element
        return list
    list1,list2 = [],[]
    while list not empty:
        move front element from list to list1
        if list not empty: move front element from list to list2
    shuffle(list1)
    shuffle(list2)
    if length(list2) < length(list1):
        i = pick a number uniformly at random in [0..length(list2)]
        insert a dummy node into list2 at location i
    # merge
    while list1 and list2 are not empty:
        if coin flip is Heads:
            move front element from list1 to list
        else:
            move front element from list2 to list
    if list1 not empty: append list1 to list
    if list2 not empty: append list2 to list
    remove the dummy node from list
The key point for space is that splitting the list into two does not require any extra space. The only extra space we need is to maintain log n elements on the stack during recursion.
The point with the dummy node is to realize that inserting and removing a dummy element keeps the distribution of the elements uniform.
Edit (2022-01-12): As Riley points out in the comments, the analysis below is flawed.
Why is the distribution uniform? After the final merge, the probability P_i(n) of any given number ending up in the position i is as follows. Either it was:
in the i-th place in its own list, and the list won the coin toss the first i times, the probability of this is 1/2^i;
in the i-1-st place in its own list, and the list won the coin toss i-1 times including the last one and lost once, the probability of this is (i-1) choose 1 times 1/2^i;
in the i-2-nd place in its own list, and the list won the coin toss i-2 times including the last one and lost twice, the probability of this is (i-1) choose 2 times 1/2^i;
and so on.
So the probability
P_i(n) = \sum_{j=0}^{i-1} (i-1 choose j) * 1/2^i * P_j(n/2).
Inductively, you can show that P_i(n) = 1/n. I'll let you verify the base case and assume that P_j(n/2) = 2/n. The term \sum_{j=0}^{i-1} (i-1 choose j) is exactly the number of (i-1)-bit binary numbers, i.e. 2^{i-1}. So we get
P_i(n) = \sum_{j=0}^{i-1} (i-1 choose j) * 1/2^i * 2/n
= 2/n * 1/2^i * \sum_{j=0}^{i-1} (i-1 choose j)
= 1/n * 1/2^{i-1} * 2^{i-1}
= 1/n
I hope this makes sense. The only assumption we need is that n is even, and that the two lists are shuffled uniformly. This is achieved by adding (and then removing) the dummy node.
P.S. My original intuition was nowhere near rigorous, but I list it just in case. Imagine we assign numbers between 1 and n at random to the elements of the list. And now we run a merge sort with respect to these numbers. At any given step of the merge, it needs to decide which of the heads of the two lists is smaller. But the probability of one being greater than the other should be exactly 1/2, so we can simulate this by flipping a coin.
P.P.S. Is there a way to embed LaTeX here?
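The random-number intuition from the P.S. above can be sketched directly. This is only an illustration on a plain Python list, not the linked-list merge itself: the function name shuffle_by_random_keys is made up here, and it assumes random floats as keys (so ties are practically impossible) instead of integers between 1 and n.

```python
import random

def shuffle_by_random_keys(items):
    """Attach a random key to every element, then sort by that key."""
    # The index i is only a tie-breaker so that the elements themselves
    # are never compared; with float keys, ties are vanishingly unlikely.
    keyed = [(random.random(), i, x) for i, x in enumerate(items)]
    keyed.sort()
    return [x for _, _, x in keyed]

print(shuffle_by_random_keys(["a", "b", "c", "d"]))
```

Sorting by independent random keys is the model the coin-flip merge tries to imitate; the edits above explain why replacing the comparisons with fair coin flips does not reproduce it exactly.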
Up shuffle approach
This (Lua) version improves on foxcub's answer by removing the need for dummy nodes.
To slightly simplify the code in this answer, this version supposes that your lists know their sizes. If they don't, you can always find the size in O(n) time, but even better: a few simple adaptations to the code remove the need to compute it beforehand (such as splitting every other element instead of into first and second halves).
function listUpShuffle (l)
    local lsz = #l
    if lsz <= 1 then return l end

    -- split into first and second half
    local lsz2 = math.floor(lsz/2)
    local l1, l2 = {}, {}
    for k = 1, lsz2 do l1[#l1+1] = l[k] end
    for k = lsz2+1, lsz do l2[#l2+1] = l[k] end

    l1 = listUpShuffle(l1)
    l2 = listUpShuffle(l2)

    -- merge, picking from each half with probability
    -- proportional to the number of elements it still holds
    local res = {}
    local i, j = 1, 1
    while i <= #l1 or j <= #l2 do
        local rem1, rem2 = #l1-i+1, #l2-j+1
        if math.random() < rem1/(rem1+rem2) then
            res[#res+1] = l1[i]
            i = i+1
        else
            res[#res+1] = l2[j]
            j = j+1
        end
    end
    return res
end
To avoid using dummy nodes, you have to compensate for the fact that the two intermediate lists can have different lengths by varying the probability of choosing from each list. This is done by testing a uniform random number in [0,1) against the ratio of the number of nodes remaining in the first list to the total number of nodes remaining (in the two lists).
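One way to convince yourself that this ratio is the right one is to compute the merge probabilities exactly. The sketch below is in Python rather than Lua, and merge_distribution is a hypothetical helper written for this illustration, not part of the Gist. It enumerates every sequence the merge step can produce from two already-ordered halves, using exact fractions.

```python
from fractions import Fraction

def merge_distribution(l1, l2):
    """Exact probability of each merged tuple under the remaining-ratio rule."""
    if not l1 and not l2:
        return {(): Fraction(1)}
    rem1, rem2 = len(l1), len(l2)
    dist = {}
    if rem1:  # take the head of l1 with probability rem1 / (rem1 + rem2)
        p = Fraction(rem1, rem1 + rem2)
        for tail, q in merge_distribution(l1[1:], l2).items():
            key = (l1[0],) + tail
            dist[key] = dist.get(key, 0) + p * q
    if rem2:  # otherwise take the head of l2
        p = Fraction(rem2, rem1 + rem2)
        for tail, q in merge_distribution(l1, l2[1:]).items():
            key = (l2[0],) + tail
            dist[key] = dist.get(key, 0) + p * q
    return dist

dist = merge_distribution(("a", "b"), ("x", "y", "z"))
print(len(dist), set(dist.values()))
```

For halves of sizes 2 and 3 there are 10 possible interleavings, and each one comes out with the same probability. Combined with uniformly shuffled halves, that is what makes the whole result uniform even when the halves differ in length.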
Down shuffle approach
You can also shuffle while you subdivide recursively, which in my humble tests showed slightly (but consistently) better performance. It might come from the fewer instructions, or on the other hand it might have appeared due to cache warmup in luajit, so you will have to profile for your use cases.
function listDownShuffle (l)
    local lsz = #l
    if lsz <= 1 then return l end

    -- distribute the elements randomly over two halves of fixed sizes
    local lsz2 = math.floor(lsz/2)
    local l1, l2 = {}, {}
    for i = 1, lsz do
        local rem1, rem2 = lsz2-#l1, lsz-lsz2-#l2
        if math.random() < rem1/(rem1+rem2) then
            l1[#l1+1] = l[i]
        else
            l2[#l2+1] = l[i]
        end
    end

    l1 = listDownShuffle(l1)
    l2 = listDownShuffle(l2)

    -- concatenate the two shuffled halves
    local res = {}
    for i = 1, #l1 do res[#res+1] = l1[i] end
    for i = 1, #l2 do res[#res+1] = l2[i] end
    return res
end
The full source is in my listShuffle.lua Gist.
It contains code that, when executed, prints a matrix showing, for each element of the input list, the number of times it appears at each position of the output list after a specified number of runs. A fairly uniform matrix indicates that the elements are evenly distributed over the positions, which is what a uniform shuffle should produce.
Here is an example run with 1000000 iterations using a (non-power-of-two) 3-element list:
>> luajit listShuffle.lua 1000000 3
Up shuffle bias matrix:
333331 332782 333887
333377 333655 332968
333292 333563 333145
Down shuffle bias matrix:
333120 333521 333359
333435 333088 333477
333445 333391 333164
You can actually do better than that: the best list shuffle algorithm is O(n log n) time and just O(1) space. (You can also shuffle in O(n) time and O(n) space by constructing a pointer array for the list, shuffling it in place using Knuth and re-threading the list accordingly.)
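The O(n) time, O(n) space alternative mentioned in the parenthesis above could look roughly like this. It is a minimal sketch with its own throwaway Node class and helper names (build, to_values, shuffle_via_array are all made up for this example); random.shuffle is the standard library's in-place Fisher-Yates (Knuth) shuffle.

```python
import random

class Node:
    """Minimal singly linked node, used only for this sketch."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def build(values):
    """Build a linked list from an iterable and return its head."""
    head = None
    for v in reversed(list(values)):
        head = Node(v, head)
    return head

def to_values(head):
    """Collect the values of a linked list into a Python list."""
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out

def shuffle_via_array(head):
    """Shuffle by collecting node pointers, shuffling them, and re-threading."""
    nodes = []
    while head is not None:          # O(n): build the pointer array
        nodes.append(head)
        head = head.next
    random.shuffle(nodes)            # O(n): Knuth shuffle of the pointers
    for a, b in zip(nodes, nodes[1:]):
        a.next = b                   # O(n): re-thread in the new order
    if nodes:
        nodes[-1].next = None
    return nodes[0] if nodes else None

print(to_values(shuffle_via_array(build(range(5)))))
```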
Complexity proof
To see why O(n log n) time is minimal for O(1) space, observe that:
With O(1) space, updating the successor of an arbitrary list element necessarily takes O(n) time.
Wlog, you can assume that whenever you update one element, you also update all the other elements (leaving them unchanged if you wish), as this also takes just O(n) time.
With O(1) space, there are at most O(1) elements to choose from for the successor of any element you're updating (which specific elements these are will obviously depend on the algorithm).
Therefore, a single O(n) time update of all the elements could result in at most c^n different list permutations.
Since there are n! = 2^Θ(n log n) possible list permutations, you require at least Ω(log n) such updates, i.e. on the order of n log n time in total.
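Spelled out, the counting step in the last two points goes roughly as follows. Here c is the constant from the previous point and u is a symbol introduced only for this note, standing for the number of whole-list updates; the final line assumes, as the first point argues, that each update costs on the order of n steps.

```latex
\[
  c^{\,n u} \;\ge\; n!
  \quad\Longrightarrow\quad
  u \;\ge\; \frac{\log_2 n!}{n \log_2 c}
  \;=\; \frac{\Theta(n \log n)}{n \log_2 c}
  \;=\; \Omega(\log n),
\]
\[
  \text{total time} \;=\; u \cdot \Theta(n) \;=\; \Omega(n \log n).
\]
```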
Linked-list data structure (because Python)
import collections.abc

class Cons(collections.abc.Sequence):
    """A cons cell; a list is a chain of cells ending in a non-Cons tail."""
    def __init__(self, head, tail=None):
        self.head = head
        self.tail = tail

    def __getitem__(self, index):
        # Returns the cell (not the value) that lies `index` steps ahead.
        current, n = self, index
        while n > 0:
            if isinstance(current, Cons):
                current, n = current.tail, n - 1
            else:
                raise ValueError("Out of bounds index [{0}]".format(index))
        return current

    def __len__(self):
        current, length = self, 0
        while isinstance(current, Cons):
            current, length = current.tail, length + 1
        return length

    def __repr__(self):
        current, rep = self, []
        while isinstance(current, Cons):
            rep.extend((str(current.head), "::"))
            current = current.tail
        return "".join(rep)
Merge-style algorithm
Here is an O(n log n) time and O(1) space algorithm based on iterative merge sort. The basic idea is simple: shuffle the left half, then the right half, then merge them by randomly selecting from the two lists. Two things worth noting:
By making the algorithm iterative rather than recursive, and returning a pointer to the new last element at the end of every merge step, we reduce the space requirement to O(1) while keeping the time cost minimal.
To make sure that all possibilities are equally likely for all input sizes, we use probabilities from the Gilbert–Shannon–Reeds model riffle shuffle when merging (see http://en.wikipedia.org/wiki/Gilbert%E2%80%93Shannon%E2%80%93Reeds_model).
import random

def riffle_lists(head, list1, len1, list2, len2):
    """Riffle shuffle two sublists in place. Returns the new last element."""
    for _ in range(len1 + len2):
        # Pick from list1 with probability proportional to what it has left.
        if random.random() < (len1 / (len1 + len2)):
            next, list1, len1 = list1, list1.tail, len1 - 1
        else:
            next, list2, len2 = list2, list2.tail, len2 - 1
        head.tail, head = next, next
    # list2 now points at whatever follows the second sublist.
    head.tail = list2
    return head

def shuffle_list(list):
    """Shuffle a list in place using an iterative merge-style algorithm."""
    dummy = Cons(None, list)
    i, n = 1, len(list)
    while (i < n):
        # One bottom-up pass: riffle adjacent runs of length i.
        head, nleft = dummy, n
        while (nleft > i):
            head = riffle_lists(head, head[1], i, head[i + 1], min(i, nleft - i))
            nleft -= 2 * i
        i *= 2
    return dummy[1]
Another algorithm
Another interesting O(n log n) algorithm that produces not-quite-uniform shuffles involves simply riffle shuffling the list 3/2 log_2(n) times. As described in http://en.wikipedia.org/wiki/Gilbert%E2%80%93Shannon%E2%80%93Reeds_model, this leaves only a constant number of bits of information.
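As a rough sketch of that idea, here is a version that works on a plain Python list rather than on the Cons list above, so it illustrates the shuffle itself and not the O(1)-space linked-list mechanics. The names gsr_riffle and repeated_riffle are invented for this example, and rounding 3/2 log_2(n) up to a whole number of rounds is my assumption.

```python
import math
import random

def gsr_riffle(deck):
    """One riffle under the Gilbert-Shannon-Reeds model."""
    # Cut point is binomially distributed: one fair coin per card.
    cut = sum(random.random() < 0.5 for _ in deck)
    left, right = deck[:cut], deck[cut:]
    out, i, j = [], 0, 0
    while i < len(left) or j < len(right):
        rem1, rem2 = len(left) - i, len(right) - j
        # Drop from a packet with probability proportional to its size.
        if random.random() < rem1 / (rem1 + rem2):
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    return out

def repeated_riffle(items):
    """Riffle ceil(3/2 * log2(n)) times; close to uniform, not exactly uniform."""
    deck = list(items)
    rounds = math.ceil(1.5 * math.log2(len(deck))) if len(deck) > 1 else 0
    for _ in range(rounds):
        deck = gsr_riffle(deck)
    return deck

print(repeated_riffle(range(10)))
```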
I'd say that foxcub's answer is wrong. To prove that, I will introduce a helpful definition of a perfectly shuffled list (call it an array or a sequence or whatever you want).
Definition: Assume we have a list L containing the elements a1, a2, ..., an at the indexes 1, 2, 3, ..., n. If we expose L to a shuffle operation (whose internals we have no access to), L is perfectly shuffled if and only if, knowing the indexes of some k (k < n) elements, we can't deduce the indexes of the remaining n-k elements. That is, the remaining n-k elements are equally likely to be found at any of the remaining n-k indexes.
Example: if we have a four-element list [a, b, c, d] and, after shuffling it, we know that its first element is a ([a, .., .., ..]), then the probability for any of the elements b, c, d to occur in, let's say, the third cell equals 1/3.
Now, the smallest list for which the algorithm does not fulfil the definition has three elements. But the algorithm converts it to a 4-element list anyway, so we will try to show its incorrectness for a 4-element list.
Consider an input L = [a, b, c, d]. Following the first run of the algorithm, L will be divided into l1 = [a, c] and l2 = [b, d]. After shuffling these two sublists (but before merging into the four-element result) we can get four equally probable pairs of 2-element lists:
l1shuffled = [a , c] l2shuffled = [b , d]
l1shuffled = [a , c] l2shuffled = [d , b]
l1shuffled = [c , a] l2shuffled = [b , d]
l1shuffled = [c , a] l2shuffled = [d , b]
Now try to answer two questions.
1. What is the probability that, after merging into the final result, a will be the first element of the list?
Simply enough, we can see that only two of the four pairs above (again, equally probable) can give such a result (p1 = 1/2). For each of these pairs, heads must be drawn on the first flip in the merge routine (p2 = 1/2). Thus the probability of having a as the first element of Lshuffled is p = p1*p2 = 1/4, which is correct.
2. Knowing that a is in the first position of Lshuffled, what is the probability of having c (we could just as well choose b or d without loss of generality) in the second position of Lshuffled?
Now, according to the above definition of a perfectly shuffled list, the answer should be 1/3, since there are three elements to put in the three remaining cells of the list.
Let's see if the algorithm assures that.
After choosing a as the first element of Lshuffled we would now have either:
l1shuffled = [c] l2shuffled = [b, d]
l1shuffled = [c] l2shuffled = [d, b]
The probability of choosing c in both cases is equal to the probability of flipping heads (p3 = 1/2), thus the probability of having c as the second element of Lshuffled, when knowing that the first element of Lshuffled is a, equals 1/2. 1/2 != 1/3, which ends the proof of the incorrectness of the algorithm.
The interesting part is that the algorithm fulfils the necessary (but not sufficient) condition for a perfect shuffle, namely:
Given a list of n elements, for every index k (<n), for every element ak: after shuffling the list m times, if we count the times ak occurred at index k, this count will tend to m/n in probability, with m tending to infinity.
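The two probabilities in the argument above can be double-checked by brute force. The following is only a small sketch: coin_merge is a helper name made up here, and it models just the fair-coin merge step on the four-element example, where no dummy node is needed. It enumerates every outcome with exact fractions.

```python
from fractions import Fraction
from itertools import permutations

def coin_merge(l1, l2):
    """Exact distribution of a fair-coin merge of two tuples."""
    if not l1 or not l2:
        # One side is empty: the remainder of the other is appended as is.
        return {l1 + l2: Fraction(1)}
    dist = {}
    branches = ((l1[0], coin_merge(l1[1:], l2)),   # heads: take from l1
                (l2[0], coin_merge(l1, l2[1:])))   # tails: take from l2
    for head, rest in branches:
        for tail, p in rest.items():
            key = (head,) + tail
            dist[key] = dist.get(key, 0) + p / 2
    return dist

# The four equally probable pairs of shuffled halves from the text.
total = {}
for s1 in permutations("ac"):
    for s2 in permutations("bd"):
        for out, p in coin_merge(s1, s2).items():
            total[out] = total.get(out, 0) + p / 4

p_a_first = sum(p for out, p in total.items() if out[0] == "a")
p_a_then_c = sum(p for out, p in total.items() if out[:2] == ("a", "c"))
print(p_a_first, p_a_then_c / p_a_first)   # prints: 1/4 1/2
```

The first number matches a uniform shuffle, while the conditional probability is 1/2 instead of 1/3, in line with the proof.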
Here is one possible solution:
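#include <stdlib.h>

typedef struct node_s {
    struct node_s * next;
    int data;
} node_s, *node_p;

/* Shuffle the nodes in [*link, last); *link is updated to the new first node. */
void shuffle_helper( node_p *link, node_p last ) {
    static const int half = RAND_MAX / 2;
    while( (*link != last) && ((*link)->next != last) ) {
        node_p firsts[2] = {0, 0};
        node_p *lasts[2] = {0, 0};
        int counts[2] = {0, 0}, lesser;
        node_p first = *link;
        /* Push each node onto one of two sublists, chosen by a coin flip. */
        while( first != last ) {
            int choice = (rand() <= half);
            node_p next = first->next;
            first->next = firsts[choice];
            if( !lasts[choice] ) lasts[choice] = &(first->next);
            firsts[choice] = first;
            ++counts[choice];
            first = next;
        }
        lesser = (counts[1] < counts[0]);
        if( !counts[lesser] ) {
            /* Everything landed in one sublist: relink it and try again. */
            *link = firsts[!lesser];
            *(lasts[!lesser]) = last;
            continue;
        }
        /* Connect: larger sublist first, then the smaller one, then last. */
        *link = firsts[!lesser];
        *(lasts[!lesser]) = firsts[lesser];
        *(lasts[lesser]) = last;
        /* Recurse on the smaller sublist, then loop on the larger one. */
        shuffle_helper( lasts[!lesser], last );
        last = *(lasts[!lesser]);
    }
}

/* Returns the new head, since shuffling can move another node to the front. */
node_p shuffle_list( node_p thelist ) {
    shuffle_helper( &thelist, NULL );
    return thelist;
}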
This is basically quicksort, but with no pivot, and with random partitioning.
The outer while loop replaces a recursive call.
The inner while loop randomly moves each element into one of two sublists.
After the inner while loop, we connect the sublists to one another.
Then, we recurse on the smaller sublist, and loop on the larger.
Since the smaller sublist can never be more than half the size of the initial list, the worst case depth of recursion is the log base two of the number of elements. The amount of memory needed is O(1) times the depth of recursion.
The average runtime, and number of calls to rand() is O(N log N).
More precise runtime analysis requires an understanding of the phrase "almost surely."
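For readers who prefer to see the idea without the pointer manipulation, the random-partition scheme described above can be sketched on a plain Python list. This is an illustration only: partition_shuffle is a name chosen for this example, and because it recurses on both parts and copies lists, it does not reproduce the O(log n) stack bound or the in-place behaviour of the C code.

```python
import random

def partition_shuffle(items):
    """Shuffle by random two-way partitioning, like quicksort without a pivot."""
    items = list(items)
    if len(items) < 2:
        return items
    while True:
        # Send every element to one of two sublists by a fair coin flip.
        sides = ([], [])
        for x in items:
            sides[random.getrandbits(1)].append(x)
        # If one side is empty nothing was separated, so flip again.
        if sides[0] and sides[1]:
            break
    # Shuffle each side the same way and concatenate.
    return partition_shuffle(sides[0]) + partition_shuffle(sides[1])

print(partition_shuffle("abcdef"))
```

The retry loop is where "almost surely" comes in: each attempt fails only if all coin flips agree, so the loop ends with probability 1 but has no fixed worst-case bound.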
Bottom-up merge sort without compares: while merging, don't do any comparison, just swap the elements.
You could traverse over the list, randomly generating 0 or 1 at each node.
If it is 1, remove the node and place it as the first node of the list.
If it is 0, do nothing.
Repeat this until you reach the end of the list.
