Parallel Computing - Shuffle - parallel-processing

I am looking to shuffle an array in parallel. I have found that an algorithm similar to bitonic sort, but with a random (50/50) reorder, results in an equal distribution, but only if the array length is a power of 2. I've considered the Fisher-Yates shuffle, but I can't see how I could parallelize it in order to avoid O(N) computations.
Any advice?

There's a good, clear, recent paper on this here, and the references, especially Shun et al. 2015, are worth a read.
But basically you can do this using the same sort of approach that's used in sort -R: shuffle by giving each row a random key value and sorting on that key. And there are lots of ways to do good parallel distributed sort.
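As a minimal serial sketch of the random-key idea (the function name shuffle_by_random_keys and the rng parameter are made up for illustration; they are not from the paper or from sort -R), each item is tagged with a random key and the tagged list is sorted. The original index is included only as a tie-breaker so the items themselves never need to be comparable.

```python
import random


def shuffle_by_random_keys(items, rng=random):
    """Shuffle by tagging each item with a random key and sorting on that key.

    rng is anything with a .random() method (the random module by default).
    """
    # (key, original index, item): the index breaks ties between equal keys.
    keyed = [(rng.random(), i, item) for i, item in enumerate(items)]
    # Any sort works here, including a parallel or distributed one.
    keyed.sort()
    return [item for _, _, item in keyed]


if __name__ == "__main__":
    print(shuffle_by_random_keys(list(range(10))))
```

The only step that needs parallelizing is the sort, which is why the choice of sort determines the communication cost.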
Here's a basic version in python + MPI using an odd-even sort; it goes through P communication steps if P is the number of processors. You can do better than that, but this is pretty simple to understand; it's discussed in this question.
from __future__ import print_function
import sys
import random
from mpi4py import MPI

comm = MPI.COMM_WORLD

def exchange(localdata, sendrank, recvrank):
    """
    Perform a merge-exchange with a neighbour;
    sendrank sends local data to recvrank,
    which merge-sorts it, and then sends lower
    data back to the lower-ranked process and
    keeps upper data
    """
    rank = comm.Get_rank()
    assert rank == sendrank or rank == recvrank
    assert sendrank < recvrank

    if rank == sendrank:
        comm.send(localdata, dest=recvrank)
        newdata = comm.recv(source=recvrank)
    else:
        bothdata = list(localdata)
        otherdata = comm.recv(source=sendrank)
        bothdata = bothdata + otherdata
        bothdata.sort()
        comm.send(bothdata[:len(otherdata)], dest=sendrank)
        newdata = bothdata[len(otherdata):]
    return newdata

def print_by_rank(data, rank, nprocs):
    """ crudely attempt to print data coherently """
    for proc in range(nprocs):
        if proc == rank:
            print(str(rank)+": "+str(data))
        # every rank waits here so the output comes out roughly in rank order
        comm.barrier()

def odd_even_sort(data):
    rank = comm.Get_rank()
    nprocs = comm.Get_size()
    # local data must be sorted before the merge-exchange steps
    data.sort()
    for step in range(1, nprocs+1):
        if ((rank + step) % 2) == 0:
            if rank < nprocs - 1:
                data = exchange(data, rank, rank+1)
        elif rank > 0:
            data = exchange(data, rank-1, rank)
    return data

def main():
    # everyone get their data
    rank = comm.Get_rank()
    nprocs = comm.Get_size()
    n_per_proc = 5
    data = list(range(n_per_proc*rank, n_per_proc*(rank+1)))
    print_by_rank(data, rank, nprocs)

    # tag your data with random values
    data = [(random.random(), item) for item in data]

    # now sort it by these random tags
    data = odd_even_sort(data)
    print_by_rank([x for _, x in data], rank, nprocs)
    return 0

if __name__ == "__main__":
    sys.exit(main())
Running gives:
$ mpirun -np 5 python
0: [0, 1, 2, 3, 4]
1: [5, 6, 7, 8, 9]
2: [10, 11, 12, 13, 14]
3: [15, 16, 17, 18, 19]
4: [20, 21, 22, 23, 24]
0: [19, 17, 4, 20, 9]
1: [23, 12, 3, 2, 8]
2: [14, 6, 13, 15, 1]
3: [11, 0, 22, 16, 18]
4: [5, 10, 21, 7, 24]


Detect outlier in repeating sequence

I have a repeating sequence of say 0~9 (but may start and stop at any of these numbers). e.g.:
And it has outliers at random location, including 1st and last one, e.g.:
I need to find and correct the outliers. In the above example, I need to correct the first "9" into "3", the "8" into "5", etc.
What I came up with is to construct an outlier-free sequence of the desired length. But since I don't know which number the sequence starts with, I'd have to construct 10 sequences, starting from "0", "1", "2", ..., "9". I can then compare these 10 sequences with the given sequence and pick the one that matches it best. However, this is very inefficient when the repeating pattern gets large (if the repeating pattern is 0~99, I'd need to create 100 sequences to compare).
Assuming there won't be consecutive outliers, is there a way to find & correct these outliers efficiently?
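For reference, the brute-force comparison described above could be sketched roughly as follows. The function name correct_brute_force is hypothetical, and the sketch assumes the pattern length mod is known; it tries every possible starting value and keeps the one that agrees with the most elements, which costs O(mod * n) time.

```python
def correct_brute_force(numbers, mod):
    """Try every starting value and keep the candidate that matches best."""
    best_start, best_matches = 0, -1
    for start in range(mod):
        # Count positions where the clean sequence agrees with the input.
        matches = sum(
            1 for i, x in enumerate(numbers) if x == (start + i) % mod
        )
        if matches > best_matches:  # on ties, the first candidate is kept
            best_start, best_matches = start, matches
    return [(best_start + i) % mod for i in range(len(numbers))]


if __name__ == "__main__":
    noisy = [9, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 8, 6, 7, 0]
    print(correct_brute_force(noisy, 10))
```

This never materializes the candidate sequences, but it still does one full pass per candidate, which is the cost the question wants to avoid.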
I'm going to propose a variation of @trincot's fine answer. Like that one, it doesn't care how many outliers there may be in a row, but unlike that one, it also doesn't care how many non-outliers appear in a row.
The base idea is just to let each sequence element "vote" on what the first sequence element "should be". Whichever gets the most votes wins. By construction, this maximizes the number of elements left unchanged: after the 1-liner loop ends, votes[i] is the number of elements left unchanged if i is picked as the starting point.
def correct(numbers, mod=None):
    # this part copied from @trincot's program
    if mod is None: # if argument is not provided:
        # Make a guess what the range is of the values
        mod = max(numbers) + 1
    votes = [0] * mod
    for i, x in enumerate(numbers):
        # which initial number would make x correct?
        votes[(x - i) % mod] += 1
    winning_count = max(votes)
    winning_numbers = [i for i, v in enumerate(votes)
                       if v == winning_count]
    if len(winning_numbers) > 1:
        raise ValueError("ambiguous!", winning_numbers)
    winning_number = winning_numbers[0]
    for i in range(len(numbers)):
        numbers[i] = (winning_number + i) % mod
    return numbers
Then, e.g.,
>>> correct([9,4,5,6,7,8,9,0,1,2,3,4,8,6,7,0,9,0,1,2,3,4,1,6,7,8,9,0,1,6])
[3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2]
>>> correct([1, 5, 3, 7, 5, 9])
ValueError: ('ambiguous!', [1, 4])
That is, it's impossible to guess whether you want [1, 2, 3, 4, 5, 6] or [4, 5, 6, 7, 8, 9]. They both have 3 numbers "right", and despite that there are never two adjacent outliers in either case.
I would do a first scan of the list to find the longest sublist in the input that maintains the right order. We will then assume that those values are all correct, and calculate backwards what the first value would have to be to produce those values in that sublist.
Here is how that would look in Python:
def correct(numbers, mod=None):
    if mod is None: # if argument is not provided:
        # Make a guess what the range is of the values
        mod = max(numbers) + 1
    # Find the longest slice in the list that maintains order
    start = 0
    longeststart = 0
    longest = 1
    expected = -1
    for last in range(len(numbers)):
        if numbers[last] != expected:
            start = last
        elif last - start >= longest:
            longest = last - start + 1
            longeststart = start
        expected = (numbers[last] + 1) % mod

    # Get from that longest slice what the starting value should be
    val = (numbers[longeststart] - longeststart) % mod
    # Repopulate the list starting from that value
    for i in range(len(numbers)):
        numbers[i] = val
        val = (val + 1) % mod

# demo use
numbers = [9,4,5,6,7,8,9,0,1,2,3,4,8,6,7,0,9,0,1,2,3,4,1,6,7,8,9,0,1,6]
correct(numbers, 10) # for 0..9 provide 10 as argument, ...etc
The advantage of this method is that it would even give a good result if there were errors with two consecutive values, provided that there are enough correct values in the list of course.
Still this runs in linear time.
Here is another way using groupby and count from Python's itertools module:
from itertools import count, groupby

def correct(lst):
    # consecutive values share the same (value - index) key, so each group is a run
    groupped = [list(v) for _, v in groupby(lst, lambda a, b=count(): a - next(b))]
    # Check if all groups are singletons
    if all(len(k) == 1 for k in groupped):
        raise ValueError('All groups are singletons!')

    for k, v in zip(groupped, groupped[1:]):
        if len(k) < 2:
            out = v[0] - 1
            if out >= 0:
                yield out
            else:
                yield from k
        else:
            yield from k

    # check last element of the groupped list
    if len(v) < 2:
        yield k[-1] + 1
    else:
        yield from v

lst = "9,4,5,6,7,8,9,0,1,2,3,4,8,6,7,0,9,0,1,2,3,4,1,6,7,8,9,0,1,6"
lst = [int(k) for k in lst.split(',')]
out = list(correct(lst))
print(out)
[3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2]
For the case of [1, 5, 3, 7, 5, 9], this solution cannot return an accurate result, because I can't tell which values you want to modify. This is why the best option is to check for that case and raise a ValueError if all groups are singletons.
Like this?
numbers = [9,4,5,6,7,8,9,0,1,2,3,4,8,6,7,0,9,0,1,2,3,4,1,6,7,8,9,0,1,6]
i = 0
for n in numbers[:-1]:
    i += 1
    if n > numbers[i] and n > 0:
        numbers[i-1] = numbers[i]-1
    elif n > numbers[i] and n == 0:
        numbers[i - 1] = 9
n = numbers[-1]
if n > numbers[0] and n > 0:
    numbers[-1] = numbers[0] - 1
elif n > numbers[0] and n == 0:
    numbers[-1] = 9

Algorithm to efficiently select rows from a matrix such that column totals are equal

The practical application of this problem is group assignment in a psychology study, but the theoretical formulation is this:
I have a matrix (the actual matrix has 72 rows and 27 columns, but I'll use one with 8 rows and 4 columns as an example):
1 0 1 0
0 1 0 1
1 1 0 0
0 1 1 0
0 0 1 1
1 0 1 0
1 1 0 0
0 1 0 1
I want to pick half of the rows out of this matrix such that the column totals are equal (thus effectively creating two matrices with equivalent column totals). I cannot rearrange values within the rows.
I have tried some brute force solutions, but my matrix is too large for that to be effective, even having chosen some random restrictions first. It seems to me that the search space could be constrained with a better algorithm, but I haven't been able to think of one thus far. Any ideas? It is also possible that there is no solution, so an algorithm would have to be able to deal with that. I have been working in R, but I could switch to python easily.
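As a baseline, a brute-force search over all half-sized row selections might look like the sketch below. The function name find_balanced_half is hypothetical, and the approach is only practical for small matrices, since the number of selections grows combinatorially. It does show one cheap way to detect some unsolvable inputs early: if any column total is odd, no equal split exists.

```python
from itertools import combinations


def find_balanced_half(matrix):
    """Return indices of half the rows whose column sums equal those of the
    remaining rows, or None if no such selection exists."""
    n_rows = len(matrix)
    if n_rows % 2:
        return None  # cannot pick exactly half of an odd number of rows
    totals = [sum(col) for col in zip(*matrix)]
    if any(t % 2 for t in totals):
        return None  # an odd column total can never be split evenly
    target = [t // 2 for t in totals]
    for picked in combinations(range(n_rows), n_rows // 2):
        sums = [sum(matrix[i][c] for i in picked) for c in range(len(target))]
        if sums == target:
            return picked
    return None


if __name__ == "__main__":
    print(find_balanced_half([[1, 0], [0, 1], [1, 0], [0, 1]]))
```

For the 8-row example above, the column totals are 4, 5, 4 and 3, so the parity check alone rules out an exact split.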
Found a solution thanks to ljeabmreosn. Karmarkar-Karp worked great as an algorithm, and converting the rows to base 73 was inspired. I had a surprisingly hard time finding code that would actually give me the sub-sequences rather than just the final difference (maybe most people are only interested in this problem in the abstract?). Anyway, this was the code:
First I converted my rows into base 73 as the poster suggested. To do this I used the basein package in Python, defining an alphabet with 73 characters and then using the basein.decode function to convert to decimal.
For the algorithm, I just added code to print the sub-sequence indices from this mailing list message from Tim Peters:
# Runs under both Python 2 and Python 3
from __future__ import print_function
import sys
import bisect

class _Num:
    def __init__(self, value, index):
        self.value = value
        self.i = index

    def __lt__(self, other):
        return self.value < other.value

# This implements the Karmarkar-Karp heuristic for partitioning a set
# in two, i.e. into two disjoint subsets s.t. their sums are
# approximately equal. It produces only one result, in O(N*log N)
# time. A remarkable property is that it loves large sets: in
# general, the more numbers you feed it, the better it does.
class Partition:
    def __init__(self, nums):
        self.nums = nums
        sorted = [_Num(nums[i], i) for i in range(len(nums))]
        sorted.sort()
        self.sorted = sorted

    def run(self):
        sorted = self.sorted[:]
        N = len(sorted)
        connections = [[] for i in range(N)]

        while len(sorted) > 1:
            bigger = sorted.pop()
            smaller = sorted.pop()

            # Force these into different sets, by "drawing a
            # line" connecting them.
            i, j = bigger.i, smaller.i
            connections[i].append(j)
            connections[j].append(i)

            diff = bigger.value - smaller.value
            assert diff >= 0
            bisect.insort(sorted, _Num(diff, i))

        # Now sorted contains only 1 element x, and x.value is
        # the difference between the subsets' sums.

        # Theorem: The connections matrix represents a spanning tree
        # on the set of index nodes, and any tree can be 2-colored.
        # 2-color this one (with "colors" 0 and 1).
        index2color = [None] * N

        def color(i, c):
            if index2color[i] is not None:
                assert index2color[i] == c
                return
            index2color[i] = c
            for j in connections[i]:
                color(j, 1-c)

        color(0, 0)

        # Partition the indices by their colors.
        subsets = [[], []]
        for i in range(N):
            subsets[index2color[i]].append(i)
        return subsets

if len(sys.argv) < 2:
    print("error no arguments provided")
    sys.exit(1)

# one integer per line in the input file
with open(sys.argv[1], "r") as f:
    x = [int(line.strip()) for line in f]

p = Partition(x)
s, t = p.run()
sum1 = 0
sum2 = 0
for i in s:
    sum1 += x[i]
for i in t:
    sum2 += x[i]
print("Set 1:")
print(s)
print("Set 2:")
print(t)
print("Set 1 sum", repr(sum1))
print("Set 2 sum", repr(sum2))
print("difference", repr(abs(sum1 - sum2)))
This gives the following output:
Set 1:
[0, 3, 5, 6, 9, 10, 12, 15, 17, 19, 21, 22, 24, 26, 28, 31, 32, 34, 36, 38, 41, 43, 45, 47, 48, 51, 53, 54, 56, 59, 61, 62, 65, 66, 68, 71]
Set 2:
[1, 2, 4, 7, 8, 11, 13, 14, 16, 18, 20, 23, 25, 27, 29, 30, 33, 35, 37, 39, 40, 42, 44, 46, 49, 50, 52, 55, 57, 58, 60, 63, 64, 67, 69, 70]
Set 1 sum 30309344369339288555041174435706422018348623853211009172L
Set 2 sum 30309344369339288555041174435706422018348623853211009172L
difference 0L
Which provides the indices of the proper subsets in a few seconds. Thanks everybody!
Assuming each entry in the matrix can either be 0 or 1, this problem seems to be in the same family as the Partition Problem which only has a pseudo-polynomial time algorithm. Let r be the number of rows in the matrix and c be the number of columns in the matrix. Then, encode each row to a c-digit number of base r+1. This is to ensure when adding each encoding, there is no need to carry, thus equivalent numbers in this base will equate to two sets of rows whose column sums are equivalent. So in your example, you would convert each row into a 4-digit number of base 9. This would yield the numbers (converted into base 10):
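1010 (base 9) => 738 (base 10)
0101 (base 9) => 82 (base 10)
1100 (base 9) => 810 (base 10)
0110 (base 9) => 90 (base 10)
0011 (base 9) => 10 (base 10)
1010 (base 9) => 738 (base 10)
1100 (base 9) => 810 (base 10)
0101 (base 9) => 82 (base 10)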
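The encoding step can be sketched as below. The helper names encode_row and encode_rows are made up for illustration. Each row is read as a number in base r+1, where r is the number of rows, so column sums can never carry into the next digit. Horner's rule is used instead of int(string, base) because Python's int() only accepts bases up to 36, which would not cover base 73.

```python
def encode_row(row, base):
    """Interpret a row of digits as a single integer in the given base."""
    value = 0
    for digit in row:
        value = value * base + digit  # Horner's rule, most significant first
    return value


def encode_rows(matrix):
    """Encode every row in base (number of rows + 1) so sums never carry."""
    base = len(matrix) + 1
    return [encode_row(row, base) for row in matrix]


if __name__ == "__main__":
    example = [
        [1, 0, 1, 0], [0, 1, 0, 1], [1, 1, 0, 0], [0, 1, 1, 0],
        [0, 0, 1, 1], [1, 0, 1, 0], [1, 1, 0, 0], [0, 1, 0, 1],
    ]
    print(encode_rows(example))  # [738, 82, 810, 90, 10, 738, 810, 82]
```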
Although you probably couldn't use the pseudo-polynomial time algorithm with this method, you could use a simple heuristic with some decision trees to try to speed up the brute-force search. Using the numbers above, you could try the Karmarkar-Karp heuristic. Implemented below is the first step of the algorithm in Python 3:
# Sorted (descending) => 810, 810, 738, 738, 90, 82, 82, 10
from queue import PriorityQueue

def karmarkar_karp_partition(arr):
    pqueue = PriorityQueue()
    for e in arr:
        # negate the priority so the largest value is popped first
        pqueue.put_nowait((-e, e))
    for _ in range(len(arr)-1):
        _, first = pqueue.get_nowait()
        _, second = pqueue.get_nowait()
        diff = first - second
        pqueue.put_nowait((-diff, diff))
    return pqueue.get_nowait()[1]
Here is the algorithm fully implemented. Note that this method is simply a heuristic and may fail to find the best partition.

How to slice a rank 4 tensor in TensorFlow?

I am trying to slice a four-dimensional tensor using the tf.slice() operator, as follows:
x_image = tf.reshape(x, [-1,28,28,1], name='Images_2D')
slice_im = tf.slice(x_image,[0,2,2],[1, 24, 24])
However, when I try to run this code, I get the following exception:
raise ValueError("Shape %s must have rank %d" % (self, rank))
ValueError: Shape TensorShape([Dimension(None), Dimension(28), Dimension(28), Dimension(1)]) must have rank 3
How can I slice this tensor?
The tf.slice(input, begin, size) operator requires that the begin and size vectors—which define the subtensor to be sliced—have the same length as the number of dimensions in input. Therefore, to slice a 4-D tensor, you must pass a vector (or list) of four numbers as the second and third arguments of tf.slice().
For example:
x_image = tf.reshape(x, [-1, 28, 28, 1], name='Images_2D')
slice_im = tf.slice(x_image, [0, 2, 2, 0], [1, 24, 24, 1])
# Or, using the indexing operator:
slice_im = x_image[0:1, 2:26, 2:26, :]
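The correspondence between the begin/size form and the indexing form above can be sketched in plain Python. The helper begin_size_to_slices is hypothetical and does not call TensorFlow; it only shows how each (begin, size) pair maps to a begin:begin+size range, with a size of -1 meaning "all remaining elements", as tf.slice documents.

```python
def begin_size_to_slices(begin, size):
    """Translate tf.slice-style begin/size vectors into Python slice objects."""
    if len(begin) != len(size):
        raise ValueError("begin and size must have the same length")
    return tuple(
        # size == -1 keeps everything from b to the end of that dimension
        slice(b, None if s == -1 else b + s)
        for b, s in zip(begin, size)
    )


if __name__ == "__main__":
    # One slice per dimension of the rank-4 tensor: 0:1, 2:26, 2:26, 0:1
    print(begin_size_to_slices([0, 2, 2, 0], [1, 24, 24, 1]))
```

Because there is exactly one slice per dimension, the length check mirrors the rank error in the question: three entries cannot describe a rank-4 tensor.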
The indexing operator is slightly more powerful, as it can also reduce the rank of the output, if for a dimension you specify a single integer, rather than a range:
slice_im = x_image[0:1, 2:26, 2:26, :]
print(slice_im.get_shape()) # ==> [1, 24, 24, 1]
slice_im_2d = x_image[0, 2:26, 2:26, 0]
print(slice_im_2d.get_shape()) # ==> [24, 24]

Recursive merge sort in Ruby

I am trying to write a Ruby method which performs a merge sort recursively. I have the method working, but it's one of those times where I accidentally got it working, so I have no idea WHY it works, and I would love to understand how the code I have written works. In pseudocode, the steps I followed look like this:
1. Split the original array of length n until I have n arrays of length 1.
2. Merge and sort 2 arrays of length m at a time to return an array of length m*2.
3. Repeat the step above until I have a single, now sorted, array of length n.
Basically what this looks like to me is a large tree branching out into n branches, with each branch containing an array of length 1. Then I need to take these n branches and somehow merge them back into a single branch within the method.
def merge_sort(arr)
  return arr if arr.length == 1
  merge(merge_sort(arr.slice(0, arr.length/2)),
        merge_sort(arr.slice(arr.length/2, arr[-1])))
end

def merge(arr1, arr2)
  sorted = []
  begin
    less_than = arr1[0] <=> arr2[0]
    less_than = (arr1[0] == nil ? 1 : -1) if less_than == nil
    case less_than
    when -1
      sorted << arr1[0]
      arr1 = arr1.drop(1)
    when 0
      sorted << arr1[0]
      sorted << arr2[0]
      arr1 = arr1.drop(1)
      arr2 = arr2.drop(1)
    when 1
      sorted << arr2[0]
      arr2 = arr2.drop(1)
    end
  end until (arr1.length == 0 && arr2.length == 0)
  sorted
end
#Returns => [1, 3, 3, 6, 8, 11, 22, 24, 46, 53, 54, 65, 68, 76, 79, 80, 98]
The method I have actually correctly sorts the list, but I am not totally sure how the method combines each branch and then returns the sorted merged list, rather than just the first two length one arrays it combines.
Also, if anyone has ideas for how I can make the merge method prettier, so it looks more like the Ruby code I have grown to love, please let me know.
Here is my implementation of mergesort in Ruby
def mergesort(array)
  return array if array.length == 1
  middle = array.length / 2
  merge mergesort(array[0...middle]), mergesort(array[middle..-1])
end

def merge(left, right)
  result = []
  until left.length == 0 || right.length == 0 do
    result << (left.first <= right.first ? left.shift : right.shift)
  end
  result + left + right
end
As you can see, the mergesort method is basically the same as yours, and this is where the recursion occurs so that is what I will focus on.
First, you have your base case: return array if array.length == 1. This is what allows the recursion to terminate rather than go on indefinitely.
Next, in my implementation I have defined a variable middle to represent the middle of the array: middle = array.length / 2
Finally, the third line is where all the work occurs: merge mergesort(array[0...middle]), mergesort(array[middle..-1])
What you are doing here is telling the merge method to merge the mergesorted left half with the mergesorted right half.
If you assume your input array is [9, 1, 5, 4] what you are saying is merge mergesort([9, 1]), mergesort([5, 4]).
In order to perform the merge, you first have to mergesort [9, 1] and mergesort [5, 4]. The recursion then becomes
merge((merge mergesort([9]), mergesort([1])), (merge mergesort([5]), mergesort([4])))
When we recurse again, the mergesort([9]) has reached the base case and returns [9]. Similarly, mergesort([1]) has also reached the base case and returns [1]. Now you can merge [9] and [1]. The result of the merge is [1, 9].
Now for the other side of the merge. We have to figure out the result of merge mergesort([5]), mergesort([4]) before we can merge it with [1, 9]. Following the same procedure as the left side, we get to the base case of [5] and [4] and merge those to get [4, 5].
Now we need to merge [1, 9] with [4, 5].
On the first pass, result receives 1 because 1 <= 4.
On the next pass, we are working with result = [1], left = [9], and right = [4, 5]. When we check left.first <= right.first, we see that it is false, so we append right.shift, which is 4. Now result = [1, 4].
On the third pass, we are working with result = [1, 4], left = [9], and right = [5]. When we check left.first <= right.first, we see that it is false, so we append right.shift, which is 5. Now result = [1, 4, 5].
Here the loop ends because right.length == 0.
We simply concatenate result + left + right or [1, 4, 5] + [9] + [], which results in a sorted array.
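To see the order in which the merges happen, here is a rough Python transcription of the same algorithm (Python rather than Ruby, purely as an illustration; the trace parameter is an addition for this sketch and is not part of the Ruby code above). Every merged result is recorded as the recursion unwinds.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    # One side is exhausted; whatever remains on the other is already sorted.
    return result + left[i:] + right[j:]


def mergesort(array, trace=None):
    """Recursive merge sort; appends each merged result to trace if given."""
    if len(array) <= 1:
        return array
    middle = len(array) // 2
    merged = merge(mergesort(array[:middle], trace),
                   mergesort(array[middle:], trace))
    if trace is not None:
        trace.append(merged)
    return merged


if __name__ == "__main__":
    steps = []
    print(mergesort([9, 1, 5, 4], steps))  # [1, 4, 5, 9]
    print(steps)  # [[1, 9], [4, 5], [1, 4, 5, 9]]
```

The trace shows that both halves are fully sorted before the outer merge runs, which is why the method returns the whole sorted list and not just the first pair it combined.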
Here is my version of a recursive merge_sort method for Ruby. It does the same thing as the one above, just written slightly differently.
def merge_sort(array)
  array.length <= 1 ? array : merge_helper(merge_sort(array[0...array.length / 2]), merge_sort(array[array.length / 2..-1]))
end

def merge_helper(left, right, merged = [])
  left.first <= right.first ? merged << left.shift : merged << right.shift until left.length < 1 || right.length < 1
  merged + left + right
end

p merge_sort([]) # => []
p merge_sort([20, 8]) # => [8, 20]
p merge_sort([16, 14, 11]) # => [11, 14, 16]
p merge_sort([18, 4, 7, 19, 17]) # => [4, 7, 17, 18, 19]
p merge_sort([10, 12, 15, 13, 16, 7, 19, 2]) # => [2, 7, 10, 12, 13, 15, 16, 19]
p merge_sort([3, 14, 10, 8, 11, 7, 18, 17, 2, 5, 9, 20, 19]) # => [2, 3, 5, 7, 8, 9, 10, 11, 14, 17, 18, 19, 20]

Random's randint won't work in a for-loop

I'm trying to create a list with random length filled with lists of random lengths by using this code:
import random
solitaire = [None]*(random.randint(1,5))
for pile in solitaire:
    number = random.randint(0, 10)
Easy enough, I thought, but when I ran this code my PowerShell window froze as if it were expecting input. I had to cancel the script with Ctrl+C, and then got this message:
Traceback (most recent call last):
File "", line 254, in <module>
number = random.randint(0, 10)
File "C:\Python34\lib\", line 218, in randint
return self.randrange(a, b+1)
File "C:\Python34\lib\", line 170, in randrange
def randrange(self, start, stop=None, step=1, _int=int):
What does this mean? Why won't the code run?
number = random.randint(0, 10)
Seems to work just fine so why won't it inside the for-loop?
You don't say anything about the content of the lists. Supposing that they also contain random integers, a possible solution could be the following.
It creates a list of random length, filled with lists of random lengths containing random integers:
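import random

# The bounds below are illustrative assumptions; adjust them as needed.
MIN_LIST_OF_LISTS_LENGTH, MAX_LIST_OF_LISTS_LENGTH = 1, 6
MIN_LIST_LENGTH, MAX_LIST_LENGTH = 1, 5
MIN_LIST_ELEMENT, MAX_LIST_ELEMENT = 1, 10

# This is the list which will contain all the lists
solitaire = list(range(random.randint(MIN_LIST_OF_LISTS_LENGTH, MAX_LIST_OF_LISTS_LENGTH)))
for i, pile in enumerate(solitaire):
    solitaire[i] = [
        random.randint(MIN_LIST_ELEMENT, MAX_LIST_ELEMENT) for x in
        range(0, random.randint(MIN_LIST_LENGTH, MAX_LIST_LENGTH))
    ]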
It will generate outputs like these:
[[10, 3], [5, 2, 7, 7, 6], [5], [9, 3, 2, 6], [2, 4, 4], [4, 5, 10, 9, 10]]
[[5, 1], [5, 1, 1], [1, 1, 7, 3, 1]]
[[9, 1, 6, 7], [10, 7, 1, 7, 4]]
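The same idea can also be written as a single nested comprehension. This is only a sketch: the function name random_piles and its default bounds are assumptions chosen for illustration, and passing in a random.Random instance makes the result reproducible when a seed is given.

```python
import random


def random_piles(rng, max_piles=5, max_pile_len=5, max_value=10):
    """Build a random-length list of random-length lists of random integers."""
    return [
        # inner list: between 1 and max_pile_len values in 0..max_value
        [rng.randint(0, max_value) for _ in range(rng.randint(1, max_pile_len))]
        # outer list: between 1 and max_piles piles
        for _ in range(rng.randint(1, max_piles))
    ]


if __name__ == "__main__":
    print(random_piles(random.Random()))
```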
