Sum array values with sum equal to X - algorithm

I have a collection of integers. I need to find every combination of values that sums to X.
It can be written in Delphi, C#, PHP, Ruby on Rails, Python, COBOL, VB, or VB.NET.

That's the subset sum problem, and it is NP-complete.
The only way to implement this exactly is to generate all possible combinations and compare their sums. Optimization techniques exist, though.
Here's one in C#:
using System;
using System.Collections.Generic;
using System.Linq;

static class Program
{
    static int TargetSum = 10;
    static int[] InputData = new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

    static void Main()
    {
        // enumerate all permutation prefixes (every ordered selection of elements)
        var permutations = Permute(InputData);
        // check each one for the target sum
        foreach (var item in permutations) {
            if (item.Sum() == TargetSum) {
                Console.Write(string.Join(" + ", item.Select(n => n.ToString()).ToArray()));
                Console.Write(" = " + TargetSum.ToString());
                Console.WriteLine();
            }
        }
        Console.ReadKey();
    }

    static IEnumerable<int[]> Permute(int[] data) { return Permute(data, 0); }

    static IEnumerable<int[]> Permute(int[] data, int level)
    {
        // reached the edge? backtrack one step if so.
        if (level >= data.Length) yield break;
        // try each remaining element (including the current one) in this position
        for (int i = level; i < data.Length; i++) {
            var temp = data[level];
            data[level] = data[i];
            data[i] = temp;
            // yield the prefix of length level+1, then permute the rest
            yield return data.Take(level + 1).ToArray();
            foreach (var item in Permute(data, level + 1))
                yield return item;
            temp = data[i];
            data[i] = data[level];
            data[level] = temp;
        }
    }
}
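If each subset only needs to appear once (rather than once per ordering), a short Python sketch with itertools.combinations does the same job; this is my own illustration, not part of the answer above:

from itertools import combinations

def subsets_with_sum(data, target):
    # Enumerate every non-empty combination and keep the ones that hit the target.
    for size in range(1, len(data) + 1):
        for combo in combinations(data, size):
            if sum(combo) == target:
                yield combo

for combo in subsets_with_sum([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 10):
    print(" + ".join(map(str, combo)), "=", 10)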

Dynamic programming would yield the best runtime for an exact solution. The Subset Sum Problem page on Wikipedia has some pseudo-code for the algorithm. Essentially you order all the numbers and add up all the possible sequences in order, such that you minimize the number of additions. The runtime is pseudo-polynomial.
For a polynomial algorithm you could use an approximation algorithm. Pseudo-code is also available on the Subset Sum Problem page.
Of the two algorithms I would choose the dynamic programming one, since it is straightforward and has a good runtime with most data sets.
However, if the integers are all non-negative and fit the description on the Wikipedia page, then you could actually do this in polynomial time with the approximation algorithm.
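To make the dynamic-programming idea concrete, here is a minimal Python sketch (my own illustration, assuming non-negative integers; it reports which sums are reachable rather than listing the subsets):

def reachable_sums(data, target):
    # Classic pseudo-polynomial DP table: reachable[s] is True
    # if some subset of data sums to exactly s.
    reachable = [True] + [False] * target
    for value in data:
        # iterate downward so each value is used at most once
        for s in range(target, value - 1, -1):
            if reachable[s - value]:
                reachable[s] = True
    return reachable

print(reachable_sums([1, 2, 3, 4, 5], 10)[10])  # True, e.g. 1 + 4 + 5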

Related

Gin Rummy - Algorithm for determining optimal melding

A similar question to this has been asked here, but in my question, rather than being restricted to melds of size 3, melds can be any size.
In Gin Rummy, for any particular set of cards, cards can be grouped into either sets or runs. A set is a group of 3 or more cards that are all of the same rank (2-D, 2-C, 2-H or 7-D, 7-C, 7-H, 7-S). A run is a group of 3 or more cards with consecutive ranks and identical suits (A-D, 2-D, 3-D or 7-C, 8-C, 9-C, 10-C, J-C). Cards not belonging to a group are called "deadwood".
The goal of my algorithm is to find the optimal melding for a particular set of cards, which is one that minimizes the sum of the values of all the deadwood (The values of number cards are their associated numbers, the value of the ace is 1, and the value of face cards is 10.).
My original attempt at an algorithm worked on the assumption that for any run and group of sets that conflicted, either the run would exist or the group of sets would exist. Under this assumption, the algorithm could just calculate the sum of the values of the run and the sum of the values of all the sets, and keep whichever was greater. For example, if we had the groups
[2-D, 3-D, 4-D], [2-D, 2-C, 2-H], and [4-D, 4-H, 4-S], the sum of the run's values would be 2 + 3 + 4 = 9, and the sum of all the sets' values would be 2 + 2 + 2 + 4 + 4 + 4 = 18. In this case, the two sets would be kept for the optimal melding and the run would not be used (3-D would be deadwood).
This assumption worked for groups of size 3, but fails with larger groups. For example, consider the following two groups:
[4-D, 5-D, 6-D, 7-D], [7-D, 7-H, 7-S]
The optimal grouping for this ends up being [4-D, 5-D, 6-D] and [7-D, 7-H, 7-S]: the conflicting set and part of the run are kept. I'm not sure how to create an algorithm that isn't just brute force.
Any help or ideas would be appreciated.
EDIT
I'm realizing that my original algorithm doesn't even work for size 3 melds. In the case of the following groups:
[4-D, 5-D, 6-D], [4-C, 5-C, 6-C], [6-D, 6-C, 6-S]
The algorithm would look at the two runs individually and conclude that they should be removed in favor of the set, but the optimal solution is to keep both runs and remove the set.
Still looking for help in creating an algorithm that works in all edge cases.
My previous answer got deleted because I didn't really provide an explanation, and simply provided a link to a script with an algorithm for this problem. I realize now why that isn't fit for an answer on Stack Overflow. Here's my attempt at a complete answer.
The algorithm I show here is based on the approach found here: https://gist.github.com/yalue/2622575. The solution is a backtracking search as Paul Hankin suggested above.
The algorithm creates an object called MeldNode:
class MeldNode {
    Cards cards;
    MeldNode* parent;
    double value;

    MeldNode(Cards cards, MeldNode* parent) : cards(cards), parent(parent) {
        value = sum of values of cards;
        if (parent is not null) {
            value += parent->value;
        }
    }
}
Here value is the sum of the values of the given cards plus the value of the parent.
Next, a function cleanMeldGroup is created which, given an array of melds and a meld, returns an array containing only the melds that don't conflict with the given meld.
Melds cleanMeldGroup(Melds melds, Cards meldAvoid) {
    Melds cleanMelds;
    for (Cards meld : melds) {
        bool clean = true;
        for (Card cardA : meld) {
            for (Card cardB : meldAvoid) {
                if (cardA == cardB) {
                    clean = false;
                }
            }
        }
        if (clean) {
            cleanMelds.push(meld);
        }
    }
    return cleanMelds;
}
Next, a function getBestNode is created, which uses backtracking to find a meld node containing the data for the optimal melding combination.
MeldNode* getBestNode(Melds melds, MeldNode* rootNode) {
    MeldNode* best = rootNode;
    for (Meld meld : melds) {
        MeldNode* node = new MeldNode(meld, rootNode);
        MeldNode* newTree = getBestNode(cleanMeldGroup(melds, meld), node);
        if (best is null || newTree->value > best->value) {
            best = newTree;
        }
    }
    return best;
}
Note that, as written, this would result in memory leaks in C++. If necessary, take measures to free memory once the data has been used (you could free the nodes that aren't part of the best tree inside this function, and then free the nodes that are part of the best tree after you use them).
Finally, the optimal melding can be determined as follows, using the getOptimalMelding function.
Melds getOptimalMelding(Cards hand) {
    Sort hand by rank and then by suit;
    Melds possibleMelds;

    // Find all possible runs
    int runLength = 1;
    int i;
    for (i = 1; i < hand.size(); i++) {
        if (hand[i].suit == hand[i - 1].suit && hand[i].rank == hand[i - 1].rank + 1) {
            runLength++;
        } else {
            if (runLength >= 3) {
                for (int size = 3; size <= runLength; size++) {
                    for (int start = 0; start <= runLength - size; start++) {
                        Cards run;
                        for (int j = i - runLength + start; j < i - runLength + start + size; j++) {
                            run.push(hand[j]);
                        }
                        possibleMelds.push(run);
                    }
                }
            }
            runLength = 1;
        }
    }
    // Handle a run that reaches the last card of the hand (here i == hand.size())
    if (runLength >= 3) {
        for (int size = 3; size <= runLength; size++) {
            for (int start = 0; start <= runLength - size; start++) {
                Cards run;
                for (int j = i - runLength + start; j < i - runLength + start + size; j++) {
                    run.push(hand[j]);
                }
                possibleMelds.push(run);
            }
        }
    }
    // Find all possible sets
    for (int rank = 1; rank <= 13; rank++) {
        Cards set;
        for (Card card : hand) {
            if (card.rank == rank) {
                set.push(card);
            }
        }
        if (set.size() >= 3) {
            possibleMelds.push(set);
        }
        // for a 4-card set, also add each of its 3-card subsets
        if (set.size() == 4) {
            for (Card card : set) {
                Cards subset;
                for (Card add : set) {
                    if (add != card) {
                        subset.push(add);
                    }
                }
                possibleMelds.push(subset);
            }
        }
    }

    // Find the optimal melding combination
    MeldNode* bestNode = getBestNode(possibleMelds, null);
    Melds optimalMelds;
    while (bestNode is not null) {
        optimalMelds.push(bestNode->cards);
        bestNode = bestNode->parent;
    }
    return optimalMelds;
}
Note that possibleMelds contains all possible melds of all sizes. For example, for the hand [2-D, 3-D, 4-D, 5-D, 5-H, 5-S, 5-C, 10-S, 9-C, 8-H], possibleMelds would contain the following groups:
[2-D, 3-D, 4-D],
[3-D, 4-D, 5-D],
[2-D, 3-D, 4-D, 5-D],
[5-D, 5-H, 5-S, 5-C],
[5-H, 5-S, 5-C],
[5-D, 5-S, 5-C],
[5-D, 5-H, 5-C],
[5-D, 5-H, 5-S]
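For concreteness, here is a compact Python sketch of the same backtracking idea over a precomputed list of melds (my own illustration, not part of the original answer; cards are assumed to be (rank, suit) tuples with ranks 1-13):

def best_melds(melds):
    # Backtracking search over candidate melds: choose the pairwise-disjoint
    # melds with the greatest total value; everything else is deadwood.
    def value(meld):
        return sum(min(rank, 10) for rank, _suit in meld)
    def search(remaining):
        best_value, best_choice = 0, []
        for i, meld in enumerate(remaining):
            used = set(meld)
            compatible = [m for m in remaining[i + 1:] if not used & set(m)]
            sub_value, sub_choice = search(compatible)
            if value(meld) + sub_value > best_value:
                best_value, best_choice = value(meld) + sub_value, [meld] + sub_choice
        return best_value, best_choice
    return search(melds)[1]

# The conflicting groups from the question: the search keeps
# [4-D, 5-D, 6-D] plus the set of 7s (total 36) over the 4-card run (22).
print(best_melds([[(4,'D'), (5,'D'), (6,'D'), (7,'D')],
                  [(4,'D'), (5,'D'), (6,'D')],
                  [(7,'D'), (7,'H'), (7,'S')]]))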

Recursive Brute Force 0-1 Knapsack - add items selected output

I am practicing recursive algorithms because, although I love recursion, I still have trouble when there is "double" recursion going on. So I created this brute-force 0-1 knapsack algorithm which outputs the final weight and best value, and it's pretty good, but I decided that information is only relevant if you know which items are behind those numbers. I am stuck here, though. I want to do this elegantly, without creating a mess of code, and perhaps I am over-limiting my thinking in trying to meet that goal. I thought I would post the code here and see if anyone had some nifty ideas about adding code to output the chosen items. This is Java:
public class Knapsack {
    static int num_items = 4;
    static int weights[] = { 3, 5, 1, 4 };
    static int benefit[] = { 2, 4, 3, 6 };
    static int capacity = 10;
    static int new_sack[] = new int[num_items];
    static int max_value = 0;
    static int weight = 0;

    // O(n * 2^n) brute force algorithm (i.e., check all combinations):
    public static void findMaxValue(int n, int currentWeight, int currentValue) {
        if ((n == 0) && (currentWeight <= capacity) && (currentValue > max_value)) {
            max_value = currentValue;
            weight = currentWeight;
        }
        if (n == 0) {
            return;
        }
        findMaxValue(n - 1, currentWeight, currentValue);
        findMaxValue(n - 1, currentWeight + weights[n - 1], currentValue + benefit[n - 1]);
    }

    public static void main(String[] args) {
        findMaxValue(num_items, 0, 0);
        System.out.println("The max value you can get is: " + max_value + " with weight: " + weight);
        // System.out.println(Arrays.toString(new_sack));
    }
}
The point of the 0-1 knapsack algorithm is to decide whether excluding or including an item in the knapsack results in a higher value. Your code doesn't compare these two possibilities. The code to do this would look like:
public int knapsack(int[] weights, int[] values, int n, int capacity) {
    if (n == 0 || capacity == 0)
        return 0;
    if (weights[n - 1] > capacity) // if item won't fit in knapsack
        return knapsack(weights, values, n - 1, capacity); // look at next item
    // Compare whether excluding or including the item results in greater value
    return Math.max(
        knapsack(weights, values, n - 1, capacity), // exclude item
        values[n - 1] + knapsack(weights, values, n - 1, capacity - weights[n - 1])); // include item
}
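To address the asker's actual goal (knowing which items produce the best value), one option is to return the chosen items alongside the value. A minimal Python sketch of that idea (my own illustration, not part of the answer above):

def knapsack(weights, values, n, capacity):
    # Return (best_value, chosen_item_indices) for the first n items.
    if n == 0 or capacity == 0:
        return 0, []
    if weights[n - 1] > capacity:  # item won't fit
        return knapsack(weights, values, n - 1, capacity)
    # best result if we exclude item n-1
    exclude_value, exclude_items = knapsack(weights, values, n - 1, capacity)
    # best result if we include item n-1
    include_value, include_items = knapsack(weights, values, n - 1, capacity - weights[n - 1])
    include_value += values[n - 1]
    if include_value > exclude_value:
        return include_value, include_items + [n - 1]
    return exclude_value, exclude_items

print(knapsack([3, 5, 1, 4], [2, 4, 3, 6], 4, 10))  # (13, [1, 2, 3])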

Best approach to fit numbers

I have the following set of integers: {2,9,4,1,8}. I need to divide this set into two subsets so that the sums of the sets are 14 and 10, respectively. In my example the answer is {2,4,8} and {9,1}. I am not looking for any code. I am pretty sure there must be a standard algorithm to solve this problem; since I was not successful in googling and finding it myself, I posted my query here. So what would be the best way to approach this problem?
My try was like this...
import java.util.Stack;

public class Test {
    public static void main(String[] args) {
        int[] input = {2, 9, 4, 1, 8};
        int target = 14;
        Stack<Integer> stack = new Stack<>();
        for (int i = 0; i < input.length; i++) {
            stack.add(input[i]);
            for (int j = i + 1; j < input.length; j++) {
                int sum = sumInStack(stack);
                if (sum < target) {
                    stack.add(input[j]);
                    continue;
                }
                if (target == sum) {
                    System.out.println("Eureka");
                }
                stack.remove(input[i]);
            }
        }
    }

    private static int sumInStack(Stack<Integer> stack) {
        int sum = 0;
        for (Integer integer : stack) {
            sum += integer;
        }
        return sum;
    }
}
I know this approach is not even close to solving the problem.
I need to divide this set into two subsets so that the sum of the sets results in 14 and 10 respectively.
If the subsets have to sum to certain values, then it had better be true that the sum of the entire set is the sum of those values, i.e. 14+10=24 in your example. If you only have to find the two subsets, then the problem isn't very difficult — find any subset that sums to one of those values, and the remaining elements of the set must sum to the other value.
For the example set you gave, {2,9,4,1,8}, you said that the answer is {9,1}, {2,4,8}, but notice that that's not the only answer; there's also {2,8}, {9,4,1}.
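As a sketch of that reduction (my own illustration in Python): search for a subset hitting one target, and the leftover elements are the other subset.

from itertools import combinations

def split_into_sums(values, target_a, target_b):
    # Find a subset summing to target_a; the rest must then sum to target_b.
    if sum(values) != target_a + target_b:
        return None  # no split can possibly work
    for size in range(len(values) + 1):
        for combo in combinations(values, size):
            if sum(combo) == target_a:
                rest = list(values)
                for v in combo:
                    rest.remove(v)
                return list(combo), rest
    return None

print(split_into_sums([2, 9, 4, 1, 8], 14, 10))  # ([2, 4, 8], [9, 1]) or another valid split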

Find the largest subset of it which forms a sequence

I came across this problem on an interview forum.
Given an int array which might contain duplicates, find the largest subset of it which forms a sequence.
E.g. {1,6,10,4,7,9,5}
then the answer is {4,5,6,7}.
Sorting is an obvious solution. Can this be done in O(n) time?
My take on the problem is that it cannot be done in O(n) time, and the reason is that if we could do this in O(n) time, we could also sort in O(n) time (without knowing the upper bound), since a random array can contain all the elements in sequence but in random order.
Does this sound like a plausible explanation? Your thoughts?
I believe it can be solved in O(n) if you assume you have enough memory to allocate an uninitialized array of a size equal to the largest value, and that allocation can be done in constant time. The trick is to use a lazy array, which gives you the ability to create a set of items in linear time with a membership test in constant time.
Phase 1: Go through each item and add it to the lazy array.
Phase 2: Go through each undeleted item, and delete all contiguous items.
In phase 2, you determine the range and remember it if it is the largest so far. Items can be deleted in constant time using a doubly-linked list.
Here is some incredibly kludgy code that demonstrates the idea:
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    static const int n = 8;
    int values[n] = {1, 6, 10, 4, 7, 9, 5, 5};
    int index[n];
    int lists[n];
    int prev[n];
    int next_existing[n];
    int prev_existing[n];
    int index_size = 0;
    int n_lists = 0;

    // Find largest value
    int max_value = 0;
    for (int i = 0; i != n; ++i) {
        int v = values[i];
        if (v > max_value) max_value = v;
    }

    // Allocate a lazy array
    int *lazy = (int *)malloc((max_value + 1) * sizeof(int));

    // Set items in the lazy array and build the lists of indices for
    // items with a particular value.
    for (int i = 0; i != n; ++i) {
        next_existing[i] = i + 1;
        prev_existing[i] = i - 1;
        int v = values[i];
        int l = lazy[v];
        if (l >= 0 && l < index_size && index[l] == v) {
            // already there, add it to the list
            prev[n_lists] = lists[l];
            lists[l] = n_lists++;
        }
        else {
            // not there -- create a new list
            l = index_size;
            lazy[v] = l;
            index[l] = v;
            ++index_size;
            prev[n_lists] = -1;
            lists[l] = n_lists++;
        }
    }

    // Go through each contiguous range of values and delete them, determining
    // what the range is.
    int max_count = 0;
    int max_begin = -1;
    int max_end = -1;
    int i = 0;
    while (i < n) {
        // Start by searching backwards for a value that isn't in the lazy array
        int dir = -1;
        int v_mid = values[i];
        int v = v_mid;
        int begin = -1;
        for (;;) {
            int l = lazy[v];
            if (l < 0 || l >= index_size || index[l] != v) {
                // Value not in the lazy array
                if (dir == 1) {
                    // Hit the end
                    if (v - begin > max_count) {
                        max_count = v - begin;
                        max_begin = begin;
                        max_end = v;
                    }
                    break;
                }
                // Hit the beginning
                begin = v + 1;
                dir = 1;
                v = v_mid + 1;
            }
            else {
                // Remove all the items with value v
                int k = lists[l];
                while (k >= 0) {
                    if (k != i) {
                        next_existing[prev_existing[k]] = next_existing[k];
                        prev_existing[next_existing[k]] = prev_existing[k];
                    }
                    k = prev[k];
                }
                v += dir;
            }
        }
        // Go to the next existing item
        i = next_existing[i];
    }

    // Print the largest range
    for (int i = max_begin; i != max_end; ++i) {
        if (i != max_begin) fprintf(stderr, ",");
        fprintf(stderr, "%d", i);
    }
    fprintf(stderr, "\n");

    free(lazy);
}
I would say there are ways to do it. The algorithm is the one you already describe, but using an O(n) sorting algorithm. Since such algorithms exist for certain inputs (bucket sort, radix sort), this works (and it goes hand in hand with your argument for why it should not work in general).
Vaughn Cato's suggested implementation works like this (it works like a bucket sort, with the lazy array acting as buckets on demand).
As shown by M. Ben-Or in "Lower bounds for algebraic computation trees" (Proc. 15th ACM Sympos. Theory Comput., pp. 80-86, 1983), cited by J. Erickson in the PDF "Finding Longest Arithmetic Progressions", this problem cannot be solved in less than O(n log n) time (even if the input is already sorted into order) in the algebraic decision tree model of computation.
Earlier, I posted the following example in a comment to illustrate that sorting the numbers does not provide an easy answer to the question: Suppose the array is given already sorted into ascending order. For example, let it be (20 30 35 40 47 60 70 80 85 95 100). The longest sequence found in any subsequence of the input is 20,40,60,80,100 rather than 30,35,40 or 60,70,80.
Regarding whether an O(n) algebraic decision tree solution to this problem would provide an O(n) algebraic decision tree sorting method: As others have pointed out, a solution to this subsequence problem for a given multiset does not provide a solution to a sorting problem for that multiset. As an example, consider set {2,4,6,x,y,z}. The subsequence solver will give you the result (2,4,6) whenever x,y,z are large numbers not in arithmetic sequence, and it will tell you nothing about the order of x,y,z.
What about this? Populate a hash table so each value stores the start of the range seen so far for that number, except for the head element, which stores the end of the range. O(n) time, O(n) space. A tentative Python implementation (you could do it with one traversal keeping some state variables, but this way seems clearer):

def longest_subset(xs):
    table = {}
    for x in xs:
        if x in table:
            continue  # skip duplicates so they don't corrupt the boundaries
        start = table.get(x - 1, x)
        end = table.get(x + 1, x)
        if x + 1 in table:
            table[end] = start
        if x - 1 in table:
            table[start] = end
        table[x] = (start if x - 1 in table else end)
    start, end = max(table.items(), key=lambda pair: pair[1] - pair[0])
    return list(range(start, end + 1))

print(longest_subset([1, 6, 10, 4, 7, 9, 5]))
# [4, 5, 6, 7]
Here is an unoptimized O(n) implementation; maybe you will find it useful:

hash_tb = {}
A = [1, 6, 10, 4, 7, 9, 5]
for x in A:
    hash_tb[x] = x
max_sq = []
cur_seq = []
for i in range(0, max(A) + 2):  # go one past max(A) so the final run is flushed
    if i in hash_tb:
        cur_seq.append(i)
    else:
        if len(cur_seq) > len(max_sq):
            max_sq = cur_seq
        cur_seq = []
print(max_sq)

Shuffle list, ensuring that no item remains in same position

I want to shuffle a list of unique items, but not do an entirely random shuffle. I need to be sure that no element in the shuffled list is at the same position as in the original list. Thus, if the original list is (A, B, C, D, E), this result would be OK: (C, D, B, E, A), but this one would not: (C, E, A, D, B) because "D" is still the fourth item. The list will have at most seven items. Extreme efficiency is not a consideration. I think this modification to Fisher/Yates does the trick, but I can't prove it mathematically:
function shuffle(data) {
    for (var i = 0; i < data.length - 1; i++) {
        var j = i + 1 + Math.floor(Math.random() * (data.length - i - 1));
        var temp = data[j];
        data[j] = data[i];
        data[i] = temp;
    }
}
You are looking for a derangement of your entries.
First of all, your algorithm works in the sense that it outputs a random derangement, i.e. a permutation with no fixed point. However, it has an enormous flaw (which you might not mind, but is worth keeping in mind): some derangements cannot be obtained with your algorithm. In other words, it gives probability zero to some possible derangements, so the resulting distribution is definitely not uniformly random.
One possible solution, as suggested in the comments, would be to use a rejection algorithm:
pick a permutation uniformly at random
if it has no fixed points, return it
otherwise retry
Asymptotically, the probability of obtaining a derangement is close to 1/e ≈ 0.3679 (see the Wikipedia article). That means that to obtain one derangement you will need to generate an average of e ≈ 2.718 permutations, which is quite costly.
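A minimal Python sketch of this rejection method (my own illustration of the steps above):

import random

def random_derangement(items):
    # Shuffle until no element sits at its original index (rejection method).
    while True:
        shuffled = items[:]
        random.shuffle(shuffled)
        if all(a != b for a, b in zip(shuffled, items)):
            return shuffled

print(random_derangement(['A', 'B', 'C', 'D', 'E']))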
A better way to do that would be to reject at each step of the algorithm. In pseudocode, something like this (assuming the original array contains i at position i, ie a[i]==i):
for (i = 1 to n-1) {
    do {
        j = rand(i, n) // random integer from i to n inclusive
    } while (a[j] == i) // rejection part: swapping would leave value i at position i
    swap a[i] a[j]
}
The main difference from your algorithm is that we allow j to be equal to i, but only if it does not produce a fixed point. It is slightly longer to execute (due to the rejection part), and demands that you be able to check whether an entry is at its original place or not, but it has the advantage that it can produce every possible derangement (uniformly, for that matter).
I am guessing non-rejection algorithms exist, but I would expect them to be less straightforward.
Edit:
My algorithm is actually bad: you still have a chance of ending with the last point unshuffled, and the distribution is not random at all, as the marginal distributions of a simulation show (figure omitted).
An algorithm that produces uniformly distributed derangements can be found here, with some context on the problem, thorough explanations and analysis.
Second Edit:
Actually your algorithm is known as Sattolo's algorithm, and it is known to produce all cycles with equal probability. So any derangement which is not a cycle but a product of several disjoint cycles cannot be obtained with the algorithm. For example, with four elements, the permutation that exchanges 1 with 2 and 3 with 4 is a derangement but not a cycle.
If you don't mind obtaining only cycles, then Sattolo's algorithm is the way to go, it's actually much faster than any uniform derangement algorithm, since no rejection is needed.
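For reference, here is a short Python sketch of Sattolo's algorithm (my own illustration); the swap partner is always drawn strictly after i, which is what forces a single cycle:

import random

def sattolo(data):
    # In-place Sattolo shuffle: the result is always a single n-cycle,
    # hence a derangement.
    for i in range(len(data) - 1):
        j = random.randrange(i + 1, len(data))  # j > i, so data[i] never stays put
        data[i], data[j] = data[j], data[i]

items = ['A', 'B', 'C', 'D', 'E']
sattolo(items)
print(items)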
As @FelixCQ has mentioned, the shuffles you are looking for are called derangements. Constructing uniformly randomly distributed derangements is not a trivial problem, but some results are known in the literature. The most obvious way to construct derangements is by the rejection method: you generate uniformly randomly distributed permutations using an algorithm like Fisher-Yates and then reject permutations with fixed points. The average running time of that procedure is e*n + o(n), where e is Euler's number, 2.71828... That would probably work in your case.
The other major approach for generating derangements is to use a recursive algorithm. However, unlike Fisher-Yates, we have two branches to the algorithm: the last item in the list can be swapped with another item (i.e., part of a two-cycle), or can be part of a larger cycle. So at each step, the recursive algorithm has to branch in order to generate all possible derangements. Furthermore, the decision of whether to take one branch or the other has to be made with the correct probabilities.
Let D(n) be the number of derangements of n items. At each stage, the number of branches taking the last item to two-cycles is (n-1)D(n-2), and the number of branches taking the last item to larger cycles is (n-1)D(n-1). This gives us a recursive way of calculating the number of derangements, namely D(n)=(n-1)(D(n-2)+D(n-1)), and gives us the probability of branching to a two-cycle at any stage, namely (n-1)D(n-2)/D(n-1).
Now we can construct derangements by deciding to which type of cycle the last element belongs, swapping the last element to one of the n-1 other positions, and repeating. It can be complicated to keep track of all the branching, however, so in 2008 some researchers developed a streamlined algorithm using those ideas. You can see a walkthrough at http://www.cs.upc.edu/~conrado/research/talks/analco08.pdf . The running time of the algorithm is proportional to 2n + O(log^2 n), a 36% improvement in speed over the rejection method.
I have implemented their algorithm in Java. Using longs works for n up to 22 or so. Using BigIntegers extends the algorithm to n=170 or so. Using BigIntegers and BigDecimals extends the algorithm to n=40000 or so (the limit depends on memory usage in the rest of the program).
package io.github.edoolittle.combinatorics;

import java.math.BigInteger;
import java.math.BigDecimal;
import java.math.MathContext;
import java.util.Random;
import java.util.HashMap;
import java.util.TreeMap;

public final class Derangements {

    // cache calculated values to speed up recursive algorithm
    private static HashMap<Integer,BigInteger> numberOfDerangementsMap
        = new HashMap<Integer,BigInteger>();
    private static int greatestNCached = -1;

    // load numberOfDerangementsMap with initial values D(0)=1 and D(1)=0
    static {
        numberOfDerangementsMap.put(0, BigInteger.valueOf(1));
        numberOfDerangementsMap.put(1, BigInteger.valueOf(0));
        greatestNCached = 1;
    }

    private static Random rand = new Random();

    // private default constructor so class isn't accidentally instantiated
    private Derangements() { }

    public static BigInteger numberOfDerangements(int n)
            throws IllegalArgumentException {
        if (numberOfDerangementsMap.containsKey(n)) {
            return numberOfDerangementsMap.get(n);
        } else if (n >= 2) {
            // pre-load the cache to avoid stack overflow (occurs near n=5000)
            for (int i = greatestNCached + 1; i < n; i++) numberOfDerangements(i);
            greatestNCached = n - 1;
            // recursion for derangements: D(n) = (n-1)*(D(n-1) + D(n-2))
            BigInteger Dn_1 = numberOfDerangements(n - 1);
            BigInteger Dn_2 = numberOfDerangements(n - 2);
            BigInteger Dn = (Dn_1.add(Dn_2)).multiply(BigInteger.valueOf(n - 1));
            numberOfDerangementsMap.put(n, Dn);
            greatestNCached = n;
            return Dn;
        } else {
            throw new IllegalArgumentException("argument must be >= 0 but was " + n);
        }
    }

    public static int[] randomDerangement(int n)
            throws IllegalArgumentException {
        if (n < 2)
            throw new IllegalArgumentException("argument must be >= 2 but was " + n);

        int[] result = new int[n];
        boolean[] mark = new boolean[n];

        for (int i = 0; i < n; i++) {
            result[i] = i;
            mark[i] = false;
        }
        int unmarked = n;

        for (int i = n - 1; i >= 0; i--) {
            if (unmarked < 2) break; // can't move anything else
            if (mark[i]) continue;   // can't move item at i if marked

            // use the rejection method to generate random unmarked index j < i;
            // this could be replaced by a more straightforward technique
            int j;
            while (mark[j = rand.nextInt(i)]);

            // swap two elements of the array
            int temp = result[i];
            result[i] = result[j];
            result[j] = temp;

            // mark position j as end of cycle with probability (u-1)D(u-2)/D(u)
            double probability
                = (new BigDecimal(numberOfDerangements(unmarked - 2))).
                multiply(new BigDecimal(unmarked - 1)).
                divide(new BigDecimal(numberOfDerangements(unmarked)),
                       MathContext.DECIMAL64).doubleValue();
            if (rand.nextDouble() < probability) {
                mark[j] = true;
                unmarked--;
            }

            // position i now becomes out of play so we could mark it
            //mark[i] = true;
            // but we don't need to because the loop won't touch it from now on;
            // however, we do have to decrement unmarked
            unmarked--;
        }

        return result;
    }

    // unit tests
    public static void main(String[] args) {
        // test derangement numbers D(i)
        for (int i = 0; i < 100; i++) {
            System.out.println("D(" + i + ") = " + numberOfDerangements(i));
        }
        System.out.println();

        // test quantity (u-1)D_(u-2)/D_u for overflow, inaccuracy
        for (int u = 2; u < 100; u++) {
            double d = numberOfDerangements(u - 2).doubleValue() * (u - 1) /
                numberOfDerangements(u).doubleValue();
            System.out.println((u - 1) + " * D(" + (u - 2) + ") / D(" + u + ") = " + d);
        }
        System.out.println();

        // test derangements for correctness, uniform distribution
        int size = 5;
        long reps = 10000000;
        TreeMap<String,Integer> countMap = new TreeMap<String,Integer>();
        System.out.println("Derangement\tCount");
        System.out.println("-----------\t-----");
        for (long rep = 0; rep < reps; rep++) {
            int[] d = randomDerangement(size);
            String s = "";
            String sep = "";
            if (size > 10) sep = " ";
            for (int i = 0; i < d.length; i++) {
                s += d[i] + sep;
            }
            if (countMap.containsKey(s)) {
                countMap.put(s, countMap.get(s) + 1);
            } else {
                countMap.put(s, 1);
            }
        }
        for (String key : countMap.keySet()) {
            System.out.println(key + "\t\t" + countMap.get(key));
        }
        System.out.println();

        // large random derangement
        int size1 = 1000;
        System.out.println("Random derangement of " + size1 + " elements:");
        int[] d1 = randomDerangement(size1);
        for (int i = 0; i < d1.length; i++) {
            System.out.print(d1[i] + " ");
        }
        System.out.println();
        System.out.println();

        System.out.println("We start to run into memory issues around u=40000:");
        {
            // increase this number from 40000 to around 50000 to trigger
            // out-of-memory-type exceptions
            int u = 40003;
            BigDecimal d = (new BigDecimal(numberOfDerangements(u - 2))).
                multiply(new BigDecimal(u - 1)).
                divide(new BigDecimal(numberOfDerangements(u)), MathContext.DECIMAL64);
            System.out.println((u - 1) + " * D(" + (u - 2) + ") / D(" + u + ") = " + d);
        }
    }
}
In C++:
#include <cstdlib>
#include <utility>
#include <vector>

template <class T> void shuffle(std::vector<T>& arr)
{
    int size = arr.size();
    for (auto i = 1; i < size; i++)
    {
        int n = rand() % (size - i) + i; // n is in [i, size-1], never i-1 itself
        std::swap(arr[i - 1], arr[n]);
    }
}
