How to find trend (growth/decrease/stationarity) of a data series - algorithm

I am trying to extract the OEE trend of a manufacturing machine. I already have a dataset of OEE calculated more or less every 30 seconds for each manufacturing machine and stored in a database.
What I want to do is to extract a subset of the dataset (say, the last 30 minutes) and state whether the OEE has grown, decreased or been stable (within a certain threshold). My task is NOT to forecast the next value of OEE, but just to know whether it has decreased (desired return value: -1), grown (desired return value: +1) or been stable (desired return value: 0) based on the dataset. I am using Java 8 in my project.
Here is an example of dataset:
71.37
71.37
70.91
70.30
70.30
70.42
70.42
69.77
69.77
69.29
68.92
68.92
68.61
68.61
68.91
68.91
68.50
68.71
69.27
69.26
69.89
69.85
69.98
69.93
69.39
68.97
69.03
From this dataset it is possible to state that the OEE has been decreasing (of course, based on a threshold), thus the algorithm would return -1.
I have been searching on the web unsuccessfully. I have found this, or this github project, or this stackoverflow question. However, all of those are (more or less) complex forecasting algorithms. I am searching for a much simpler solution. Any help is appreciated.

You could go for a sliding average of the last n values, or a sliding median of the last n values.
It highly depends on your application what is appropriate. But both these are very simple to implement and in a lot of cases more than good enough.
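For instance, a minimal sketch of the sliding-average variant (my code and names, with an arbitrary threshold; it assumes the window holds at least 2*n samples) could compare the average of the newest n samples with the average of the n samples before them:

import java.util.Arrays;

class SlidingAverageTrend {
    // Returns +1 (grown), -1 (decreased) or 0 (stable within the threshold).
    static int trend(double[] oee, int n, double threshold) {
        int len = oee.length;                          // assumes len >= 2 * n
        double recent = Arrays.stream(oee, len - n, len).average().orElse(0);
        double previous = Arrays.stream(oee, len - 2 * n, len - n).average().orElse(0);
        double diff = recent - previous;
        if (diff > threshold) return 1;
        if (diff < -threshold) return -1;
        return 0;
    }
}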

As you know from math, one would use d/dt, which here more or less means using the step differences.
A trend should also carry some weight (a confidence):
import java.util.Arrays;
import java.util.stream.DoubleStream;
import java.util.stream.IntStream;

class Trend {
    int direction;
    double probability;
}

Trend trend(double[] lastData) {
    // Forward step differences: deltas[i] = lastData[i + 1] - lastData[i],
    // so growth gives positive deltas and decline gives negative ones.
    double[] deltas = Arrays.copyOf(lastData, lastData.length - 1);
    for (int i = 0; i < deltas.length; ++i) {
        deltas[i] = lastData[i + 1] - deltas[i];
    }
    // Trend based on two parts (assumes at least `parts` samples):
    int parts = 2;
    int splitN = (deltas.length + 1) / parts;
    int i = 0;
    int[] trends = new int[parts];
    for (int j = 0; j < parts; ++j) {
        int n = Math.min(splitN, deltas.length - i);
        double partAvg = DoubleStream.of(deltas).skip(i).limit(n).sum() / n;
        trends[j] = tendency(partAvg);
        i += n;
    }
    Trend result = new Trend();
    result.direction = trends[parts - 1];
    double avg = IntStream.of(trends).average().orElse((double) result.direction);
    result.probability = ((result.direction - avg) + 1) / 2;
    return result;
}

int tendency(double sum) {
    final double EPS = 0.0001;
    return sum < -EPS ? -1 : sum > EPS ? 1 : 0;
}
This is not very sophisticated. For more elaborate treatment a math forum might be useful.

Binary Lifting | Planet Queries 1 | TLE

I am solving this problem on CSES.
Given n planets, with exactly 1 teleporter on each planet which teleports us to some other planet (possibly the same), we have to solve q queries. Each query is associated with a start planet, x and a number of teleporters to traverse, k. For each query, we need to tell where we would reach after going through k teleporters.
I have attempted this problem using the binary lifting concept.
For each planet, I first saved the planets we would reach by going through 2^0, 2^1, 2^2, ... teleporters.
Now, as per the constraints (esp. for k) provided in the question, we only need to store these values up to 2^31.
Then, for each query, starting from the start planet, I traverse through the teleporters using the data in the array created above (in step 1) to mimic the binary expansion of k, the number of teleporters to traverse.
For example, if k = 5, i.e. (101)_2, and the initial planet is x, I first go (001)_2 = 1 planet ahead, using the array, let's say to planet y, and then (100)_2 = 4 planets ahead. The planet now reached is the required result of the query.
Unfortunately, I am receiving TLE (time limit exceeded) error in the last test case (test 12).
Here's my code for reference:
#include <cstdio>
#include <iostream>
#include <vector>
using namespace std;
typedef long long ll;

#define inp(x) ll x; scanf("%lld", &x)
void solve()
{
// Inputting the values of n, number of planets and q, number of queries.
inp(n);
inp(q);
// Inputting the location of next planet the teleporter on each planet points to, with correction for 0 - based indexing
vector<int> adj(n);
for(int i = 0; i < n; i++)
{
scanf("%d", &(adj[i]));
adj[i]--;
}
// maxN stores the maximum value till which we need to locate the next reachable planet, based on the constraints.
// A value of 32 means that we'll only ever need to go at max 2^31 places away from the planet in query.
int maxN = 32;
// This array consists of the next planet we can reach from any planet.
// Specifically, par[i][j] is the planet we get to, on passing through 2^j teleporters starting from planet i.
vector<vector<int>> par(n, vector<int>(maxN, -1));
for(int i = 0; i < n; i++)
{
par[i][0] = adj[i];
}
for(int i = 1; i < maxN; i++)
{
for(int j = 0; j < n; j++)
{
ll p1 = par[j][i-1];
par[j][i] = par[p1][i-1];
}
}
// This task is done for each query.
for(int i = 0; i < q; i++)
{
// x is the initial planet, corrected for 0 - based indexing.
inp(x);
x--;
// k is the number of teleporters to traverse.
inp(k);
// cur is the planet we currently are at.
int cur = x;
// For every i'th bit in k that is 1, the current planet is moved to the planet we reach to by moving through 2^i teleporters from cur.
for(int i = 0; (1 << i) <= k ; i++)
{
if(k & (1 << i))
{
cur = par[cur][i];
}
}
// Once the full binary expansion of k is used up, we are at cur, so (cur + 1) is the result because of the judge's 1 - based indexing.
cout<<(cur + 1)<<endl;
}
}
The code gives the correct output in every test case, but undergoes TLE in the final one (the result in the final one is correct too, just a TLE occurs). According to my observation the complexity of the code is O(32 * q + n), which doesn't seem to exceed the 10^6 bound for linear-time code in 1 second.
Are there any hidden costs in the algorithm I may have missed, or some possible optimization?
Any help appreciated!
It looks to me like your code works (after fixing the scanf), but your par map could have 6.4M entries in it, and precalculating all of those might just get you over the 1s time limit.
Here are a few things to try, in order of complexity:
replace par with a single vector<int> and index it like par[i*32+j]. This will remove a lot of double indirections.
Buffer the output in a std::string and write it in one step at the end, in case there's some buffer flushing going on that you don't know about. I don't think so, but it's easy to try.
Starting at each planet, you enter a cycle in <= n steps. In O(n) time, you can precalculate the distance to the terminal cycle and the size of the terminal cycle for all planets. Using this information you can reduce each k to at most 20000, and that means you only need j <= 16.

Can this function be refactored to be O(1)

I have this function which is used to calculate a value with diminishing returns. It counts how often an ever increasing value can be subtracted from the input value and returns the number of subtractions. It is currently implemented iteratively with an infinite loop:
// inputValue is our parameter. It is manipulated in the method body.
// step counts how many subtractions have been performed so far. It is also our returned value.
// loss is the value that is being subtracted from the inputValue at each step. It grows polynomially with each step.
public int calculate(int inputValue) {
for (int step = 1; true; step++) { // loop until we return from inside
int loss = (int) (1 + 0.0006 * step*step + 0.2 * step);
if (inputValue > loss) {
inputValue -= loss;
} else {
return step;
}
}
}
This function is used in various places within the larger application and sometimes in performance critical code. I would prefer it to be refactored in a way which does not require the loop anymore.
I am fairly sure that it is possible to somehow calculate the result more directly. But my mathematical skills seem to be insufficient to do this.
Can anybody show me a function which produces identical results without the need for a loop or recursion? It is OK if the refactored code may produce different results for extreme values and corner cases. Negative inputs need not be considered.
Thank you all in advance.
I don't think you can make the code faster while preserving the exact logic. In particular, you have some hard-to-emulate rounding at
int loss = (int) (1 + 0.0006 * step*step + 0.2 * step);
If this is a requirement of your business logic rather than a bug, I don't think you can do significantly better. On the other hand, if what you really want is something like this (from the syntax I assume you use Java):
public static int calculate_double(int inputValue) {
double value = inputValue;
for (int step = 1; true; step++) { // loop until we return from inside
double loss = (1 + 0.0006 * step * step + 0.2 * step); // no rounding!
if (value > loss) {
value -= loss;
} else {
return step;
}
}
}
I.e. the same logic but without rounding at every step; then there is some hope.
Note: unfortunately this rounding does make a difference. For example, according to my test the outputs of calculate and calculate_double are slightly different for every inputValue in the range [4, 46465] (sometimes even by more than +1; for example, for inputValue = 1000 it is calculate = 90 vs calculate_double = 88). For bigger inputValue the results are more consistent. For example, for the result 519/520 the range of inputs where they differ is only [55294, 55547]. Still, for every result there is some range of inputs on which they differ.
First of all, the sum of loss in the case of no rounding for a given max step (let's call it n) has a closed formula:
sum(n) = n + 0.0006*n*(n+1)*(2n+1)/6 + 0.2*n*(n+1)/2
So theoretically, finding an n such that sum(n) < inputValue < sum(n+1) can be done by solving the cubic equation sum(x) = inputValue, which has a closed-form solution, and then checking values like floor(x) and ceil(x). However the math behind this is a bit complicated, so I didn't go that route.
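For completeness, here is a rough sketch of that cubic route (my code and names, not part of the original answer; it targets calculate_double, i.e. the un-rounded variant, and still walks a few integers at the end instead of trusting floor/ceil blindly):

static double sumNoRounding(double n) {
    // n + 0.0006*n*(n+1)*(2n+1)/6 + 0.2*n*(n+1)/2, the un-rounded loss sum
    return n + 0.0001 * n * (n + 1) * (2 * n + 1) + 0.1 * n * (n + 1);
}

static int calcByCubic(int inputValue) {
    double x = Math.cbrt(inputValue / 0.0002);   // initial guess from the leading 0.0002*n^3 term
    for (int it = 0; it < 30; it++) {            // Newton's method on sumNoRounding(x) - inputValue
        double f = sumNoRounding(x) - inputValue;
        double fp = 1 + 0.0001 * (6 * x * x + 6 * x + 1) + 0.1 * (2 * x + 1);
        x -= f / fp;
    }
    // calculate_double returns the smallest step s with sumNoRounding(s) >= inputValue
    int step = Math.max(1, (int) Math.floor(x) - 2);
    while (sumNoRounding(step) < inputValue) step++;
    return step;
}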
Please also note that since int has a limited range, theoretically even your implementation of the algorithm is O(1) (because it will never take more steps than to compute calculate(Integer.MAX_VALUE) which is a constant). So probably what you really want is just a significant speed up.
Unfortunately the coefficients 0.0006 and 0.2 are small enough to make different summands the dominant part of the sum for different n. Still you can use binary search for a much better performance:
static int sum(int step) {
// n + 0.2 * n*(n+1)/2 + 0.0006 * n*(n+1)*(2n+1)/6
// ((0.0001*(2n+1) + 0.1) * (n+1) + 1) * n
double s = ((0.0001 * (2 * step + 1) + 0.1) * (step + 1) + 1) * step;
return (int) s;
}
static int calc_bin_search2(int inputValue) {
int left = 0;
// inputValue / 2 is a safe estimate, the answer for 100 is 27 or 28
int right = inputValue < 100 ? inputValue : inputValue / 2;
// for big inputValue reduce right more aggressively before starting the binary search
if (inputValue > 1000) {
while (true) {
int test = right / 8;
int tv = sum(test);
if (tv > inputValue)
right = test;
else {
left = test;
break;
}
}
}
// just usual binary search
while (true) {
int mid = (left + right) / 2;
int mv = sum(mid);
if (mv == inputValue)
return mid;
else if (mid == left) {
return mid + 1;
} else if (mv < inputValue)
left = mid;
else
right = mid;
}
}
Note: the return mid + 1 is the copy of your original logic that returns one step after the last loss was subtracted.
In my tests this implementation matches the output of calculate_double and has roughly the same performance for inputValue under 1000, is about 50x faster for values around 1_000_000, and about 200x faster for values around 1_000_000_000.

Better results in set partition than by differencing

Partition problem is known to be NP-hard. Depending on the particular instance of the problem we can try dynamic programming or some heuristics like differencing (also known as Karmarkar-Karp algorithm).
The latter seems to be very useful for instances with big numbers (which make dynamic programming intractable), however it is not always perfect. What is an efficient way to find a better solution (random, tabu search, other approximations)?
PS: The question has some story behind it. There is a challenge Johnny Goes Shopping available at SPOJ since July 2004. So far, the challenge has been solved by 1087 users, but only 11 of them scored better than a correct Karmarkar-Karp algorithm implementation (with current scoring, Karmarkar-Karp gives 11.796614 points). How to do better? (Answers supported by an accepted submission are most wanted, but please do not reveal your code.)
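For reference, plain Karmarkar-Karp differencing is only a few lines. A minimal Java sketch (my code; it only reports the achieved difference between the two subset sums, not the partition itself):

import java.util.Collections;
import java.util.PriorityQueue;

static long karmarkarKarp(long[] values) {
    // Max-heap: repeatedly replace the two largest numbers by their difference,
    // i.e. commit them to opposite subsets; the last value left is the result.
    PriorityQueue<Long> heap = new PriorityQueue<>(Collections.reverseOrder());
    for (long v : values) heap.add(v);
    while (heap.size() > 1) {
        long a = heap.poll();   // largest
        long b = heap.poll();   // second largest, so a >= b
        heap.add(a - b);
    }
    return heap.isEmpty() ? 0 : heap.peek();
}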
There are many papers describing various advanced algorithms for set partitioning. Here are only two of them:
"A complete anytime algorithm for number partitioning" by Richard E. Korf.
"An efficient fully polynomial approximation scheme for the Subset-Sum Problem" by Hans Kellerer et al.
Honestly, I don't know which of them gives the more efficient solution. Probably neither of these advanced algorithms is needed to solve that SPOJ problem. Korf's paper is still very useful. The algorithms described there are very simple (to understand and implement). Also, he overviews several even simpler algorithms (in section 2). So if you want to know the details of the Horowitz-Sahni or Schroeppel-Shamir methods (mentioned below), you can find them in Korf's paper. Also (in section 8) he writes that stochastic approaches do not guarantee good enough solutions. So it is unlikely you will get significant improvements with something like hill climbing, simulated annealing, or tabu search.
I tried several simple algorithms and their combinations to solve partitioning problems with size up to 10000, maximum value up to 10^14, and a time limit of 4 sec. They were tested on random uniformly distributed numbers. And an optimal solution was found for every problem instance I tried. For some problem instances optimality is guaranteed by the algorithm, for others optimality is not 100% guaranteed, but the probability of getting a sub-optimal solution is very small.
For sizes up to 4 (green area to the left) Karmarkar-Karp algorithm always gives optimal result.
For sizes up to 54 a brute force algorithm is fast enough (red area). There is a choice between Horowitz-Sahni or Schroeppel-Shamir algorithms. I used Horowitz-Sahni because it seems more efficient for given limits. Schroeppel-Shamir uses much less memory (everything fits in L2 cache), so it may be preferable when other CPU cores perform some memory-intensive tasks or to do set partitioning using multiple threads. Or to solve bigger problems with not as strict time limit (where Horowitz-Sahni just runs out of memory).
When size multiplied by the sum of all values is less than 5*10^9 (blue area), the dynamic programming approach is applicable. The border between the brute force and dynamic programming areas on the diagram shows where each algorithm performs better.
The green area to the right is the place where the Karmarkar-Karp algorithm gives an optimal result with almost 100% probability. Here there are so many perfect partitioning options (with delta 0 or 1) that the Karmarkar-Karp algorithm almost certainly finds one of them. It is possible to invent a data set where Karmarkar-Karp always gives a sub-optimal result, for example {17 13 10 10 10 ...}. If you multiply this by some large number, neither KK nor DP would be able to find the optimal solution. Fortunately such data sets are very unlikely in practice. But the problem setter could add such a data set to make the contest more difficult. In this case you can choose some advanced algorithm for better results (but only for the grey and right green areas on the diagram).
I tried 2 ways to implement Karmarkar-Karp algorithm's priority queue: with max heap and with sorted array. Sorted array option appears to be slightly faster with linear search and significantly faster with binary search.
Yellow area is the place where you can choose between guaranteed optimal result (with DP) or just optimal result with high probability (with Karmarkar-Karp).
Finally, the grey area, where neither of the simple algorithms by itself gives an optimal result. Here we could use Karmarkar-Karp to pre-process data until it is applicable to either Horowitz-Sahni or dynamic programming. In this place there are also many perfect partitioning options, but fewer than in the green area, so Karmarkar-Karp by itself could sometimes miss the proper partitioning. Update: As noted by @mhum, it is not necessary to implement the dynamic programming algorithm to make things work. Horowitz-Sahni with Karmarkar-Karp pre-processing is enough. But it is essential for the Horowitz-Sahni algorithm to work on sizes up to 54 within the said time limit to (almost) guarantee optimal partitioning. So C++ or another language with a good optimizing compiler and a fast computer are preferable.
Here is how I combined Karmarkar-Karp with other algorithms:
template<bool Preprocess = false>
i64 kk(const vector<i64>& values, i64 sum, Log& log)
{
log.name("Karmarkar-Karp");
vector<i64> pq(values.size() * 2);
copy(begin(values), end(values), begin(pq) + values.size());
sort(begin(pq) + values.size(), end(pq));
auto first = end(pq);
auto last = begin(pq) + values.size();
while (first - last > 1)
{
if (Preprocess && first - last <= kHSLimit)
{
hs(last, first, sum, log);
return 0;
}
if (Preprocess && static_cast<double>(first - last) * sum <= kDPLimit)
{
dp(last, first, sum, log);
return 0;
}
const auto diff = *(first - 1) - *(first - 2);
sum -= *(first - 2) * 2;
first -= 2;
const auto place = lower_bound(last, first, diff);
--last;
copy(last + 1, place, last);
*(place - 1) = diff;
}
const auto result = (first - last)? *last: 0;
log(result);
return result;
}
Link to full C++11 implementation. This program only determines difference between partition sums, it does not report the partitions themselves. Warning: if you want to run it on a computer with less than 1 Gb free memory, decrease kHSLimit constant.
For whatever it's worth, a straightforward, unoptimized Python implementation of the "complete Karmarkar Karp" (CKK) search procedure in [Korf88] -- modified only slightly to bail out of the search after a given time limit (say, 4.95 seconds) and return the best solution found so far -- is sufficient to score 14.204234 on the SPOJ problem, beating the score for Karmarkar-Karp. As of this writing, this is #3 on the rankings (see Edit #2 below)
A slightly more readable presentation of Korf's CKK algorithm can be found in [Mert99].
EDIT #2 - I've implemented Evgeny Kluev's hybrid heuristic of applying Karmarkar-Karp until the list of numbers is below some threshold and then switching over to the exact Horowitz-Sahni subset enumeration method [HS74] (a concise description may be found in [Korf88]). As suspected, my Python implementation required lowering the switchover threshold versus his C++ implementation. With some trial and error, I found that a threshold of 37 was the maximum that allowed my program to finish within the time limit. Yet, even at that lower threshold, I was able to achieve a score of 15.265633, good enough for second place.
I further attempted to incorporate this hybrid KK/HS method into the CKK tree search, basically by using HS as a very aggressive and expensive pruning strategy. In plain CKK, I was unable to find a switchover threshold that even matched the KK/HS method. However, using the ILDS (see below) search strategy for CKK and HS (with a threshold of 25) to prune, I was able to yield a very small gain over the previous score, up to 15.272802. It probably should not be surprising that CKK+ILDS would outperform plain CKK in this context since it would, by design, provide a greater diversity of inputs to the HS phase.
EDIT #1 -
I've tried two further refinements to the base CKK algorithm:
"Improved Limited Discrepancy Search" (ILDS) [Korf96] This is an alternative to the natural DFS ordering of paths within the search tree. It has a tendency to explore more diverse solutions earlier on than regular Depth-First Search.
"Speeding up 2-Way Number Partitioning" [Cerq12] This generalizes one of the pruning criteria in CKK from nodes within 4 levels of the leaf nodes to nodes within 5, 6, and 7 levels above leaf nodes.
In my test cases, both of these refinements generally provided noticeable benefits over the original CKK in reducing the number of nodes explored (in the case of the latter) and in arriving at better solutions sooner (in the case of the former). However, within the confines of the SPOJ problem structure, neither of these were sufficient to improve my score.
Given the idiosyncratic nature of this SPOJ problem (i.e.: 5-second time limit and only one specific and undisclosed problem instance), it is hard to give advice on what may actually improve the score*. For example, should we continue to pursue alternate search ordering strategies (e.g.: many of the papers by Wheeler Ruml listed here)? Or should we try incorporating some form of local improvement heuristic to solutions found by CKK in order to help pruning? Or maybe we should abandon CKK-based approaches altogether and try for a dynamic programming approach? How about a PTAS? Without knowing more about the specific shape of the instance used in the SPOJ problem, it's very difficult to guess at what kind of approach would yield the most benefit. Each one has its strengths and weaknesses, depending on the specific properties of a given instance.
* Aside from simply running the same thing faster, say, by implementing in C++ instead of Python.
References
[Cerq12] Cerquides, Jesús, and Pedro Meseguer. "Speeding Up 2-way Number Partitioning." ECAI. 2012, doi:10.3233/978-1-61499-098-7-223
[HS74] Horowitz, Ellis, and Sartaj Sahni. "Computing partitions with applications to the knapsack problem." Journal of the ACM (JACM) 21.2 (1974): 277-292.
[Korf88] Korf, Richard E. (1998), "A complete anytime algorithm for number partitioning", Artificial Intelligence 106 (2): 181–203, doi:10.1016/S0004-3702(98)00086-1,
[Korf96] Korf, Richard E. "Improved limited discrepancy search." AAAI/IAAI, Vol. 1. 1996.
[Mert99] Mertens, Stephan (1999), A complete anytime algorithm for balanced number partitioning, arXiv:cs/9903011
EDIT Here's an implementation that starts with Karmarkar-Karp differencing and then tries to optimize the resulting partitions.
The only optimizations that time allows are giving 1 from one partition to the other and swapping 1 for 1 between both partitions.
My implementation of Karmarkar-Karp at the beginning must be inaccurate, since the resulting score with just Karmarkar-Karp is 2.711483, not the 11.796614 points cited by the OP. The score goes to 7.718049 when the optimizations are used.
SPOILER WARNING C# submission code follows
using System;
using System.Collections.Generic;
using System.Linq;
public class Test
{
// some comparers to lazily avoid using a proper max-heap implementation
public class Index0 : IComparer<long[]>
{
public int Compare(long[] x, long[] y)
{
if(x[0] == y[0]) return 0;
return x[0] < y[0] ? -1 : 1;
}
public static Index0 Inst = new Index0();
}
public class Index1 : IComparer<long[]>
{
public int Compare(long[] x, long[] y)
{
if(x[1] == y[1]) return 0;
return x[1] < y[1] ? -1 : 1;
}
}
public static void Main()
{
// load the data
var start = DateTime.Now;
var list = new List<long[]>();
int size = int.Parse(Console.ReadLine());
for(int i=1; i<=size; i++) {
var tuple = new long[]{ long.Parse(Console.ReadLine()), i };
list.Add(tuple);
}
list.Sort((x, y) => { if(x[0] == y[0]) return 0; return x[0] < y[0] ? -1 : 1; });
// Karmarkar-Karp differences
List<long[]> diffs = new List<long[]>();
while(list.Count > 1) {
// get max
var b = list[list.Count - 1];
list.RemoveAt(list.Count - 1);
// get max
var a = list[list.Count - 1];
list.RemoveAt(list.Count - 1);
// (b - a)
var diff = b[0] - a[0];
var tuple = new long[]{ diff, -1 };
diffs.Add(new long[] { a[0], b[0], diff, a[1], b[1] });
// insert (b - a) back in
var fnd = list.BinarySearch(tuple, new Index0());
list.Insert(fnd < 0 ? ~fnd : fnd, tuple);
}
var approx = list[0];
list.Clear();
// setup partitions
var listA = new List<long[]>();
var listB = new List<long[]>();
long sumA = 0;
long sumB = 0;
// Karmarkar-Karp rebuild partitions from differences
bool toggle = false;
for(int i=diffs.Count-1; i>=0; i--) {
var inB = listB.BinarySearch(new long[]{diffs[i][2]}, Index0.Inst);
var inA = listA.BinarySearch(new long[]{diffs[i][2]}, Index0.Inst);
if(inB >= 0 && inA >= 0) {
toggle = !toggle;
}
if(toggle == false) {
if(inB >= 0) {
listB.RemoveAt(inB);
}else if(inA >= 0) {
listA.RemoveAt(inA);
}
var tb = new long[]{diffs[i][1], diffs[i][4]};
var ta = new long[]{diffs[i][0], diffs[i][3]};
var fb = listB.BinarySearch(tb, Index0.Inst);
var fa = listA.BinarySearch(ta, Index0.Inst);
listB.Insert(fb < 0 ? ~fb : fb, tb);
listA.Insert(fa < 0 ? ~fa : fa, ta);
} else {
if(inA >= 0) {
listA.RemoveAt(inA);
}else if(inB >= 0) {
listB.RemoveAt(inB);
}
var tb = new long[]{diffs[i][1], diffs[i][4]};
var ta = new long[]{diffs[i][0], diffs[i][3]};
var fb = listA.BinarySearch(tb, Index0.Inst);
var fa = listB.BinarySearch(ta, Index0.Inst);
listA.Insert(fb < 0 ? ~fb : fb, tb);
listB.Insert(fa < 0 ? ~fa : fa, ta);
}
}
listA.ForEach(a => sumA += a[0]);
listB.ForEach(b => sumB += b[0]);
// optimize our partitions with give/take 1 or swap 1 for 1
bool change = false;
while(DateTime.Now.Subtract(start).TotalSeconds < 4.8) {
change = false;
// give one from A to B
for(int i=0; i<listA.Count; i++) {
var a = listA[i];
if(Math.Abs(sumA - sumB) > Math.Abs((sumA - a[0]) - (sumB + a[0]))) {
var fb = listB.BinarySearch(a, Index0.Inst);
listB.Insert(fb < 0 ? ~fb : fb, a);
listA.RemoveAt(i);
i--;
sumA -= a[0];
sumB += a[0];
change = true;
} else {break;}
}
// give one from B to A
for(int i=0; i<listB.Count; i++) {
var b = listB[i];
if(Math.Abs(sumA - sumB) > Math.Abs((sumA + b[0]) - (sumB - b[0]))) {
var fa = listA.BinarySearch(b, Index0.Inst);
listA.Insert(fa < 0 ? ~fa : fa, b);
listB.RemoveAt(i);
i--;
sumA += b[0];
sumB -= b[0];
change = true;
} else {break;}
}
// swap 1 for 1
for(int i=0; i<listA.Count; i++) {
var a = listA[i];
for(int j=0; j<listB.Count; j++) {
var b = listB[j];
if(Math.Abs(sumA - sumB) > Math.Abs((sumA - a[0] + b[0]) - (sumB -b[0] + a[0]))) {
listA.RemoveAt(i);
listB.RemoveAt(j);
var fa = listA.BinarySearch(b, Index0.Inst);
var fb = listB.BinarySearch(a, Index0.Inst);
listA.Insert(fa < 0 ? ~fa : fa, b);
listB.Insert(fb < 0 ? ~fb : fb, a);
sumA = sumA - a[0] + b[0];
sumB = sumB - b[0] + a[0];
change = true;
break;
}
}
}
//
if(change == false) { break; }
}
/*
// further optimization with 2 for 1 swaps
while(DateTime.Now.Subtract(start).TotalSeconds < 4.8) {
change = false;
// trade 2 for 1
for(int i=0; i<listA.Count >> 1; i++) {
var a1 = listA[i];
var a2 = listA[listA.Count - 1 - i];
for(int j=0; j<listB.Count; j++) {
var b = listB[j];
if(Math.Abs(sumA - sumB) > Math.Abs((sumA - a1[0] - a2[0] + b[0]) - (sumB - b[0] + a1[0] + a2[0]))) {
listA.RemoveAt(listA.Count - 1 - i);
listA.RemoveAt(i);
listB.RemoveAt(j);
var fa = listA.BinarySearch(b, Index0.Inst);
var fb1 = listB.BinarySearch(a1, Index0.Inst);
var fb2 = listB.BinarySearch(a2, Index0.Inst);
listA.Insert(fa < 0 ? ~fa : fa, b);
listB.Insert(fb1 < 0 ? ~fb1 : fb1, a1);
listB.Insert(fb2 < 0 ? ~fb2 : fb2, a2);
sumA = sumA - a1[0] - a2[0] + b[0];
sumB = sumB - b[0] + a1[0] + a2[0];
change = true;
break;
}
}
}
//
if(DateTime.Now.Subtract(start).TotalSeconds > 4.8) { break; }
// trade 2 for 1
for(int i=0; i<listB.Count >> 1; i++) {
var b1 = listB[i];
var b2 = listB[listB.Count - 1 - i];
for(int j=0; j<listA.Count; j++) {
var a = listA[j];
if(Math.Abs(sumA - sumB) > Math.Abs((sumA - a[0] + b1[0] + b2[0]) - (sumB - b1[0] - b2[0] + a[0]))) {
listB.RemoveAt(listB.Count - 1 - i);
listB.RemoveAt(i);
listA.RemoveAt(j);
var fa1 = listA.BinarySearch(b1, Index0.Inst);
var fa2 = listA.BinarySearch(b2, Index0.Inst);
var fb = listB.BinarySearch(a, Index0.Inst);
listA.Insert(fa1 < 0 ? ~fa1 : fa1, b1);
listA.Insert(fa2 < 0 ? ~fa2 : fa2, b2);
listB.Insert(fb < 0 ? ~fb : fb, a);
sumA = sumA - a[0] + b1[0] + b2[0];
sumB = sumB - b1[0] - b2[0] + a[0];
change = true;
break;
}
}
}
//
if(change == false) { break; }
}
*/
// output the correct ordered values
listA.Sort(new Index1());
foreach(var t in listA) {
Console.WriteLine(t[1]);
}
// DEBUG/TESTING
//Console.WriteLine(approx[0]);
//foreach(var t in listA) Console.Write(": " + t[0] + "," + t[1]);
//Console.WriteLine();
//foreach(var t in listB) Console.Write(": " + t[0] + "," + t[1]);
}
}

Printing numbers of the form 2^i * 5^j in increasing order

How do you print numbers of the form 2^i * 5^j in increasing order?
For example:
1, 2, 4, 5, 8, 10, 16, 20
This is actually a very interesting question, especially if you don't want this to be N^2 or NlogN complexity.
What I would do is the following:
Define a data structure containing 2 values (i and j) and the result of the formula.
Define a collection (e.g. std::vector) containing these data structures
Initialize the collection with the value (0,0) (the result is 1 in this case)
Now in a loop do the following:
Look in the collection and take the instance with the smallest value
Remove it from the collection
Print this out
Create 2 new instances based on the instance you just processed
In the first instance increment i
In the second instance increment j
Add both instances to the collection (if they aren't in the collection yet)
Loop until you had enough of it
The performance can be easily tweaked by choosing the right data structure and collection.
E.g. in C++, you could use an std::map, where the key is the result of the formula, and the value is the pair (i,j). Taking the smallest value is then just taking the first instance in the map (*map.begin()).
I quickly wrote the following application to illustrate it (it works, but contains no further comments, sorry):
#include <math.h>
#include <map>
#include <iostream>
typedef __int64 Integer;
typedef std::pair<Integer,Integer> MyPair;
typedef std::map<Integer,MyPair> MyMap;
Integer result(const MyPair &myPair)
{
return pow((double)2,(double)myPair.first) * pow((double)5,(double)myPair.second);
}
int main()
{
MyMap myMap;
MyPair firstValue(0,0);
myMap[result(firstValue)] = firstValue;
while (true)
{
auto it=myMap.begin();
if (it->first < 0) break; // overflow
MyPair myPair = it->second;
std::cout << it->first << "= 2^" << myPair.first << "*5^" << myPair.second << std::endl;
myMap.erase(it);
MyPair pair1 = myPair;
++pair1.first;
myMap[result(pair1)] = pair1;
MyPair pair2 = myPair;
++pair2.second;
myMap[result(pair2)] = pair2;
}
}
This is well suited to a functional programming style. In F#:
let min (a,b)= if(a<b)then a else b;;
type stream (current, next)=
member this.current = current
member this.next():stream = next();;
let rec merge(a:stream,b:stream)=
if(a.current<b.current) then new stream(a.current, fun()->merge(a.next(),b))
else new stream(b.current, fun()->merge(a,b.next()));;
let rec Squares(start) = new stream(start,fun()->Squares(start*2));;
let rec AllPowers(start) = new stream(start,fun()->merge(Squares(start*2),AllPowers(start*5)));;
let Results = AllPowers(1);;
Works well with Results then being a stream type with current value and a next method.
Walking through it:
I define min for completeness.
I define a stream type to have a current value and a method to return a new stream, essentially the head and tail of a stream of numbers.
I define the function merge, which takes the smaller of the current values of two streams and then increments that stream. It then recurses to provide the rest of the stream. Essentially, given two streams which are in order, it will produce a new stream which is in order.
I define squares to be a stream increasing in powers of 2.
AllPowers takes the start value and merges the stream of successive doublings of that value (Squares(start*2)) with the stream AllPowers(start*5), since multiplying by 2 or by 5 are your only two options. You effectively are left with a tree of results.
The result is merging more and more streams, so you merge the following streams
1, 2, 4, 8, 16, 32...
5, 10, 20, 40, 80, 160...
25, 50, 100, 200, 400...
...
Merging all of these turns out to be fairly efficient with tail recursion and compiler optimisations etc.
These could be printed to the console like this:
let rec PrintAll(s:stream)=
if (s.current > 0) then
do System.Console.WriteLine(s.current)
PrintAll(s.next());;
PrintAll(Results);
let v = System.Console.ReadLine();
Similar things could be done in any language which allows for recursion and passing functions as values (it's only a little more complex if you can't pass functions as variables).
For an O(N) solution, you can use a list of numbers found so far and two indexes: one representing the next number to be multiplied by 2, and the other the next number to be multiplied by 5. Then in each iteration you have two candidate values to choose the smaller one from.
In Python:
numbers = [1]
next_2 = 0
next_5 = 0
for i in xrange(100):
mult_2 = numbers[next_2]*2
mult_5 = numbers[next_5]*5
if mult_2 < mult_5:
next = mult_2
next_2 += 1
else:
next = mult_5
next_5 += 1
# The comparison here is to avoid appending duplicates
if next > numbers[-1]:
numbers.append(next)
print numbers
So we have two loops, one incrementing i and a second one incrementing j, both starting from zero, right? (The multiplication symbol in the title of the question is a bit confusing.)
You can do something very straightforward:
Add all items in an array
Sort the array
Or you need an other solution with more math analysys?
EDIT: A smarter solution, leveraging the similarity with the Merge Sort problem
If we imagine the infinite sets of numbers 2^i and 5^j as two independent streams/lists, this problem looks very much like the well-known Merge Sort problem.
So solution steps are:
Get two numbers, one from each of the streams (of 2 and of 5)
Compare
Return smallest
Get the next number from the stream that produced the previously returned smallest
and that's it! ;)
PS: The complexity of Merge Sort is always O(n*log(n))
I visualize this problem as a matrix M where M(i,j) = 2^i * 5^j. This means that both the rows and columns are increasing.
Think about drawing a line through the entries in increasing order, clearly beginning at entry (1,1). As you visit entries, the row and column increasing conditions ensure that the shape formed by those cells will always be an integer partition (in English notation). Keep track of this partition (mu = (m1, m2, m3, ...) where mi is the number of smaller entries in row i -- hence m1 >= m2 >= ...). Then the only entries that you need to compare are those entries which can be added to the partition.
Here's a crude example. Suppose you've visited all the xs (mu = (5,3,3,1)), then you need only check the #s:
x x x x x #
x x x #
x x x
x #
#
Therefore the number of checks is the number of addable cells (equivalently the number of ways to go up in Bruhat order if you're of a mind to think in terms of posets).
Given a partition mu, it's easy to determine what the addable states are. Imagine an infinite string of 0s following the last positive entry. Then you can increase mi by 1 if and only if m(i-1) > mi.
Back to the example, for mu = (5,3,3,1) we can increase m1 (6,3,3,1) or m2 (5,4,3,1) or m4 (5,3,3,2) or m5 (5,3,3,1,1).
The solution to the problem then finds the correct sequence of partitions (saturated chain). In pseudocode:
mu = [1,0,0,...,0];
while (/* some terminate condition or go on forever */) {
minNext = 0;
nextCell = [];
// look through all addable cells
for (int i=0; i<mu.length; ++i) {
if (i==0 or mu[i-1]>mu[i]) {
// check for new minimum value
if (minNext == 0 or 2^i * 5^(mu[i]+1) < minNext) {
nextCell = i;
minNext = 2^i * 5^(mu[i]+1)
}
}
}
// print next largest entry and update mu
print(minNext);
mu[nextCell]++;
}
I wrote this in Maple stopping after 12 iterations:
1, 2, 4, 5, 8, 10, 16, 20, 25, 32, 40, 50
and output the sequence of cells added, which gave this:
1 2 3 5 7 10
4 6 8 11
9 12
corresponding to this matrix representation:
1, 2, 4, 8, 16, 32...
5, 10, 20, 40, 80, 160...
25, 50, 100, 200, 400...
First of all, (as others mentioned already) this question is very vague!
Nevertheless, I am going to give it a shot based on your vague equation and the pattern in your expected result. So I am not sure the following will be true for what you are trying to do, however it may give you some idea about Java collections!
import java.util.List;
import java.util.ArrayList;
import java.util.SortedSet;
import java.util.TreeSet;
public class IncreasingNumbers {
private static List<Integer> findIncreasingNumbers(int maxIteration) {
SortedSet<Integer> numbers = new TreeSet<Integer>();
SortedSet<Integer> numbers2 = new TreeSet<Integer>();
for (int i=0;i < maxIteration;i++) {
int n1 = (int)Math.pow(2, i);
numbers.add(n1);
for (int j=0;j < maxIteration;j++) {
int n2 = (int)Math.pow(5, i);
numbers.add(n2);
for (Integer n: numbers) {
int n3 = n*n1;
numbers2.add(n3);
}
}
}
numbers.addAll(numbers2);
return new ArrayList<Integer>(numbers);
}
/**
* Based on the following fuzzy question # StackOverflow
* http://stackoverflow.com/questions/7571934/printing-numbers-of-the-form-2i-5j-in-increasing-order
*
*
* Result:
* 1 2 4 5 8 10 16 20 25 32 40 64 80 100 125 128 200 256 400 625 1000 2000 10000
*/
public static void main(String[] args) {
List<Integer> numbers = findIncreasingNumbers(5);
for (Integer i: numbers) {
System.out.print(i + " ");
}
}
}
If O(n log n) is acceptable, here's a simple solution:
Get an empty min-heap
Put 1 in the heap
while (you want to continue)
Get num from heap
print num
put num*2 and num*5 in the heap
There you have it. By min-heap, I mean a binary heap that always keeps its smallest element on top (i.e. a priority queue ordered by minimum).
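A short Java sketch of that pseudocode (my code and names; a HashSet is added to skip duplicates such as 10 = 2*5 = 5*2, which the pseudocode leaves implicit):

import java.util.HashSet;
import java.util.PriorityQueue;
import java.util.Set;

public class PowerStream {
    public static void main(String[] args) {
        PriorityQueue<Long> heap = new PriorityQueue<>();   // min-heap
        Set<Long> seen = new HashSet<>();
        heap.add(1L);
        seen.add(1L);
        for (int printed = 0; printed < 20; printed++) {
            long num = heap.poll();                 // smallest value not yet printed
            System.out.println(num);
            for (long factor : new long[] {2L, 5L}) {
                long next = num * factor;
                if (seen.add(next)) {               // add() is false for duplicates
                    heap.add(next);
                }
            }
        }
    }
}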
As a mathematician the first thing I always think about when looking at something like this is "will logarithms help?".
In this case it might.
If our series A is increasing then the series log(A) is also increasing. Since all terms of A are of the form 2^i * 5^j, all members of the series log(A) are of the form i*log(2) + j*log(5)
We can then look at the series log(A)/log(2), which is also increasing, and its elements are of the form i + j*(log(5)/log(2))
If we work out the i and j that generates the full ordered list for this last series (call it B) then that i and j will also generate the series A correctly.
This is just changing the nature of the problem but hopefully to one where it becomes easier to solve. At each step you can either increase i and decrease j or vice versa.
Looking at a few of the early changes you can make (which I will refer to as transforms of i,j, or just transforms) gives us some clues about where we are going.
Clearly increasing i by 1 will increase B by 1. However, given that log(5)/log(2) is approx 2.3, increasing j by 1 while decreasing i by 2 will give an increase of just 0.3. The problem then is, at each stage, finding the minimum possible increase in B for changes of i and j.
To do this I just kept a record, as I went, of the most efficient transforms of i and j (ie what to add to and subtract from each) to get the smallest possible increase in the series, then applied whichever one was valid (ie making sure i and j don't go negative).
Since at each stage you can either decrease i or decrease j there are effectively two classes of transforms that can be checked individually. A new transform doesn't have to have the best overall score to be included in our future checks, just better than any other in its class.
To test my thoughts I wrote a sort of program in LinqPad. Key things to note are that the Dump() method just outputs the object to the screen, and that the syntax/structure isn't valid for a real C# file. Converting it if you want to run it should be easy though.
Hopefully anything not explicitly explained will be understandable from the code.
void Main()
{
double C = Math.Log(5)/Math.Log(2);
int i = 0;
int j = 0;
int maxi = i;
int maxj = j;
List<int> outputList = new List<int>();
List<Transform> transforms = new List<Transform>();
outputList.Add(1);
while (outputList.Count<500)
{
Transform tr;
if (i==maxi)
{
//We haven't considered i this big before. Lets see if we can find an efficient transform by getting this many i and taking away some j.
maxi++;
tr = new Transform(maxi, (int)(-(maxi-maxi%C)/C), maxi%C);
AddIfWorthwhile(transforms, tr);
}
if (j==maxj)
{
//We haven't considered j this big before. Lets see if we can find an efficient transform by getting this many j and taking away some i.
maxj++;
tr = new Transform((int)(-(maxj*C)), maxj, (maxj*C)%1);
AddIfWorthwhile(transforms, tr);
}
//We have a set of transforms. We first find ones that are valid then order them by score and take the first (smallest) one.
Transform bestTransform = transforms.Where(x=>x.I>=-i && x.J >=-j).OrderBy(x=>x.Score).First();
//Apply transform
i+=bestTransform.I;
j+=bestTransform.J;
//output the next number in out list.
int value = GetValue(i,j);
//This line just gets it to stop when it overflows. I would have expected an exception but maybe LinqPad does magic with them?
if (value<0) break;
outputList.Add(value);
}
outputList.Dump();
}
public int GetValue(int i, int j)
{
return (int)(Math.Pow(2,i)*Math.Pow(5,j));
}
public void AddIfWorthwhile(List<Transform> list, Transform tr)
{
if (list.Where(x=>(x.Score<tr.Score && x.IncreaseI == tr.IncreaseI)).Count()==0)
{
list.Add(tr);
}
}
// Define other methods and classes here
public class Transform
{
public int I;
public int J;
public double Score;
public bool IncreaseI
{
get {return I>0;}
}
public Transform(int i, int j, double score)
{
I=i;
J=j;
Score=score;
}
}
I've not bothered looking at the efficiency of this but I strongly suspect its better than some other solutions because at each stage all I need to do is check my set of transforms - working out how many of these there are compared to "n" is non-trivial. It is clearly related since the further you go the more transforms there are but the number of new transforms becomes vanishingly small at higher numbers so maybe its just O(1). This O stuff always confused me though. ;-)
One advantage over other solutions is that it allows you to calculate i,j without needing to calculate the product allowing me to work out what the sequence would be without needing to calculate the actual number itself.
For what it's worth, after the first 230 numbers (when int runs out of space) I had 9 transforms to check each time. And given it's only my total that overflowed, I ran it for the first million results and got to i=5191 and j=354. The number of transforms was 23. The size of this number in the list is approximately 10^1810. Runtime to get to this level was approx 5 seconds.
P.S. If you like this answer please feel free to tell your friends since I spent ages on this and a few +1s would be nice compensation. Or in fact just comment to tell me what you think. :)
I'm sure everyone might have got the answer by now, but I just wanted to give a direction to this solution.
It's a Ctrl C + Ctrl V from
http://www.careercup.com/question?id=16378662
void print(int N)
{
int arr[N];
arr[0] = 1;
int i = 0, j = 0, k = 1;
int numJ, numI;
int num;
for(int count = 1; count < N; )
{
numI = arr[i] * 2;
numJ = arr[j] * 5;
if(numI < numJ)
{
num = numI;
i++;
}
else
{
num = numJ;
j++;
}
if(num > arr[k-1])
{
arr[k] = num;
k++;
count++;
}
}
for(int counter = 0; counter < N; counter++)
{
printf("%d ", arr[counter]);
}
}
The question as put to me was to return an infinite set of solutions. I pondered the use of trees, but felt there was a problem with figuring out when to harvest and prune the tree, given an infinite number of values for i & j. I realized that a sieve algorithm could be used. Starting from zero, determine whether each positive integer had values for i and j. This was facilitated by turning answer = (2^i)*(5^j) around and solving for i instead. That gave me i = log2 (answer / (5^j)). Here is the code:
class Program
{
static void Main(string[] args)
{
var startTime = DateTime.Now;
int potential = 0;
do
{
if (ExistsIandJ(potential))
Console.WriteLine("{0}", potential);
potential++;
} while (potential < 100000);
Console.WriteLine("Took {0} seconds", DateTime.Now.Subtract(startTime).TotalSeconds);
}
private static bool ExistsIandJ(int potential)
{
// potential = (2^i)*(5^j)
// 1 = (2^i)*(5^j)/potential
// 1/(2^1) = (5^j)/potential or (2^i) = potential / (5^j)
// i = log2 (potential / (5^j))
for (var j = 0; Math.Pow(5,j) <= potential; j++)
{
var i = Math.Log(potential / Math.Pow(5, j), 2);
if (i == Math.Truncate(i))
return true;
}
return false;
}
}

Shuffle list, ensuring that no item remains in same position

I want to shuffle a list of unique items, but not do an entirely random shuffle. I need to be sure that no element in the shuffled list is at the same position as in the original list. Thus, if the original list is (A, B, C, D, E), this result would be OK: (C, D, B, E, A), but this one would not: (C, E, A, D, B) because "D" is still the fourth item. The list will have at most seven items. Extreme efficiency is not a consideration. I think this modification to Fisher/Yates does the trick, but I can't prove it mathematically:
function shuffle(data) {
for (var i = 0; i < data.length - 1; i++) {
var j = i + 1 + Math.floor(Math.random() * (data.length - i - 1));
var temp = data[j];
data[j] = data[i];
data[i] = temp;
}
}
You are looking for a derangement of your entries.
First of all, your algorithm works in the sense that it outputs a random derangement, ie a permutation with no fixed point. However it has an enormous flaw (which you might not mind, but is worth keeping in mind): some derangements cannot be obtained with your algorithm. In other words, it gives probability zero to some possible derangements, so the resulting distribution is definitely not uniformly random.
One possible solution, as suggested in the comments, would be to use a rejection algorithm:
pick a permutation uniformly at random
if it has no fixed points, return it
otherwise retry
Asymptotically, the probability of obtaining a derangement is close to 1/e = 0.3679 (as seen in the wikipedia article). Which means that to obtain a derangement you will need to generate an average of e = 2.718 permutations, which is quite costly.
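In Java, a minimal sketch of that rejection method (my code and names; it relies on the items being unique, as stated in the question) could look like this:

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

static <T> List<T> rejectionDerangement(List<T> original) {
    List<T> shuffled = new ArrayList<>(original);
    do {
        Collections.shuffle(shuffled);              // uniform Fisher-Yates shuffle
    } while (hasFixedPoint(original, shuffled));    // retry until no item stayed put
    return shuffled;
}

static <T> boolean hasFixedPoint(List<T> original, List<T> shuffled) {
    for (int i = 0; i < original.size(); i++) {
        if (original.get(i).equals(shuffled.get(i))) return true;
    }
    return false;
}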
A better way to do that would be to reject at each step of the algorithm. In pseudocode, something like this (assuming the original array contains i at position i, ie a[i]==i):
for (i = 1 to n-1) {
do {
j = rand(i, n) // random integer from i to n inclusive
} while a[j] == i // rejection part: placing a[j] == i at position i would create a fixed point
swap a[i] a[j]
}
The main difference from your algorithm is that we allow j to be equal to i, but only if it does not produce a fixed point. It is slightly longer to execute (due to the rejection part), and demands that you be able to check if an entry is at its original place or not, but it has the advantage that it can produce every possible derangement (uniformly, for that matter).
I am guessing non-rejection algorithms should exist, but I would believe them to be less straight-forward.
Edit:
My algorithm is actually bad: you still have a chance of ending with the last point unshuffled, and the distribution is not random at all; see the marginal distributions of a simulation (plot not reproduced here).
An algorithm that produces uniformly distributed derangements can be found here, with some context on the problem, thorough explanations and analysis.
Second Edit:
Actually your algorithm is known as Sattolo's algorithm, and is known to produce all cycles with equal probability. So any derangement which is not a cycle but a product of several disjoint cycles cannot be obtained with the algorithm. For example, with four elements, the permutation that exchanges 1 and 2, and 3 and 4 is a derangement but not a cycle.
If you don't mind obtaining only cycles, then Sattolo's algorithm is the way to go, it's actually much faster than any uniform derangement algorithm, since no rejection is needed.
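For reference, a compact Java sketch of Sattolo's algorithm (my code; it walks from the end of the list, an equivalent backward formulation of the asker's loop):

import java.util.Collections;
import java.util.List;
import java.util.Random;

static <T> void sattoloShuffle(List<T> data, Random rnd) {
    // Produces a uniformly random single cycle, hence never a fixed point.
    for (int i = data.size() - 1; i > 0; i--) {
        int j = rnd.nextInt(i);          // j in [0, i-1], never i itself
        Collections.swap(data, i, j);
    }
}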
As @FelixCQ has mentioned, the shuffles you are looking for are called derangements. Constructing uniformly randomly distributed derangements is not a trivial problem, but some results are known in the literature. The most obvious way to construct derangements is by the rejection method: you generate uniformly randomly distributed permutations using an algorithm like Fisher-Yates and then reject permutations with fixed points. The average running time of that procedure is e*n + o(n) where e is Euler's number, 2.71828... That would probably work in your case.
The other major approach for generating derangements is to use a recursive algorithm. However, unlike Fisher-Yates, we have two branches to the algorithm: the last item in the list can be swapped with another item (i.e., part of a two-cycle), or can be part of a larger cycle. So at each step, the recursive algorithm has to branch in order to generate all possible derangements. Furthermore, the decision of whether to take one branch or the other has to be made with the correct probabilities.
Let D(n) be the number of derangements of n items. At each stage, the number of branches taking the last item to two-cycles is (n-1)D(n-2), and the number of branches taking the last item to larger cycles is (n-1)D(n-1). This gives us a recursive way of calculating the number of derangements, namely D(n)=(n-1)(D(n-2)+D(n-1)), and gives us the probability of branching to a two-cycle at any stage, namely (n-1)D(n-2)/D(n-1).
Now we can construct derangements by deciding to which type of cycle the last element belongs, swapping the last element to one of the n-1 other positions, and repeating. It can be complicated to keep track of all the branching, however, so in 2008 some researchers developed a streamlined algorithm using those ideas. You can see a walkthrough at http://www.cs.upc.edu/~conrado/research/talks/analco08.pdf . The running time of the algorithm is proportional to 2n + O(log^2 n), a 36% improvement in speed over the rejection method.
I have implemented their algorithm in Java. Using longs works for n up to 22 or so. Using BigIntegers extends the algorithm to n=170 or so. Using BigIntegers and BigDecimals extends the algorithm to n=40000 or so (the limit depends on memory usage in the rest of the program).
package io.github.edoolittle.combinatorics;
import java.math.BigInteger;
import java.math.BigDecimal;
import java.math.MathContext;
import java.util.Random;
import java.util.HashMap;
import java.util.TreeMap;
public final class Derangements {
// cache calculated values to speed up recursive algorithm
private static HashMap<Integer,BigInteger> numberOfDerangementsMap
= new HashMap<Integer,BigInteger>();
private static int greatestNCached = -1;
// load numberOfDerangementsMap with initial values D(0)=1 and D(1)=0
static {
numberOfDerangementsMap.put(0,BigInteger.valueOf(1));
numberOfDerangementsMap.put(1,BigInteger.valueOf(0));
greatestNCached = 1;
}
private static Random rand = new Random();
// private default constructor so class isn't accidentally instantiated
private Derangements() { }
public static BigInteger numberOfDerangements(int n)
throws IllegalArgumentException {
if (numberOfDerangementsMap.containsKey(n)) {
return numberOfDerangementsMap.get(n);
} else if (n>=2) {
// pre-load the cache to avoid stack overflow (occurs near n=5000)
for (int i=greatestNCached+1; i<n; i++) numberOfDerangements(i);
greatestNCached = n-1;
// recursion for derangements: D(n) = (n-1)*(D(n-1) + D(n-2))
BigInteger Dn_1 = numberOfDerangements(n-1);
BigInteger Dn_2 = numberOfDerangements(n-2);
BigInteger Dn = (Dn_1.add(Dn_2)).multiply(BigInteger.valueOf(n-1));
numberOfDerangementsMap.put(n,Dn);
greatestNCached = n;
return Dn;
} else {
throw new IllegalArgumentException("argument must be >= 0 but was " + n);
}
}
public static int[] randomDerangement(int n)
throws IllegalArgumentException {
if (n<2)
throw new IllegalArgumentException("argument must be >= 2 but was " + n);
int[] result = new int[n];
boolean[] mark = new boolean[n];
for (int i=0; i<n; i++) {
result[i] = i;
mark[i] = false;
}
int unmarked = n;
for (int i=n-1; i>=0; i--) {
if (unmarked<2) break; // can't move anything else
if (mark[i]) continue; // can't move item at i if marked
// use the rejection method to generate random unmarked index j < i;
// this could be replaced by more straightforward technique
int j;
while (mark[j=rand.nextInt(i)]);
// swap two elements of the array
int temp = result[i];
result[i] = result[j];
result[j] = temp;
// mark position j as end of cycle with probability (u-1)D(u-2)/D(u)
double probability
= (new BigDecimal(numberOfDerangements(unmarked-2))).
multiply(new BigDecimal(unmarked-1)).
divide(new BigDecimal(numberOfDerangements(unmarked)),
MathContext.DECIMAL64).doubleValue();
if (rand.nextDouble() < probability) {
mark[j] = true;
unmarked--;
}
// position i now becomes out of play so we could mark it
//mark[i] = true;
// but we don't need to because loop won't touch it from now on
// however we do have to decrement unmarked
unmarked--;
}
return result;
}
// unit tests
public static void main(String[] args) {
// test derangement numbers D(i)
for (int i=0; i<100; i++) {
System.out.println("D(" + i + ") = " + numberOfDerangements(i));
}
System.out.println();
// test quantity (u-1)D_(u-2)/D_u for overflow, inaccuracy
for (int u=2; u<100; u++) {
double d = numberOfDerangements(u-2).doubleValue() * (u-1) /
numberOfDerangements(u).doubleValue();
System.out.println((u-1) + " * D(" + (u-2) + ") / D(" + u + ") = " + d);
}
System.out.println();
// test derangements for correctness, uniform distribution
int size = 5;
long reps = 10000000;
TreeMap<String,Integer> countMap = new TreeMap<String,Integer>();
System.out.println("Derangement\tCount");
System.out.println("-----------\t-----");
for (long rep = 0; rep < reps; rep++) {
int[] d = randomDerangement(size);
String s = "";
String sep = "";
if (size > 10) sep = " ";
for (int i=0; i<d.length; i++) {
s += d[i] + sep;
}
if (countMap.containsKey(s)) {
countMap.put(s,countMap.get(s)+1);
} else {
countMap.put(s,1);
}
}
for (String key : countMap.keySet()) {
System.out.println(key + "\t\t" + countMap.get(key));
}
System.out.println();
// large random derangement
int size1 = 1000;
System.out.println("Random derangement of " + size1 + " elements:");
int[] d1 = randomDerangement(size1);
for (int i=0; i<d1.length; i++) {
System.out.print(d1[i] + " ");
}
System.out.println();
System.out.println();
System.out.println("We start to run into memory issues around u=40000:");
{
// increase this number from 40000 to around 50000 to trigger
// out of memory-type exceptions
int u = 40003;
BigDecimal d = (new BigDecimal(numberOfDerangements(u-2))).
multiply(new BigDecimal(u-1)).
divide(new BigDecimal(numberOfDerangements(u)),MathContext.DECIMAL64);
System.out.println((u-1) + " * D(" + (u-2) + ") / D(" + u + ") = " + d);
}
}
}
In C++:
#include <cstdlib>   // rand
#include <utility>   // std::swap
#include <vector>

template <class T> void shuffle(std::vector<T>& arr)
{
int size = arr.size();
for (auto i = 1; i < size; i++)
{
int n = rand() % (size - i) + i;
std::swap(arr[i-1], arr[n]);
}
}
