Find all number pairs in a given range - algorithm

I have N numbers let say 20 30 15 30 30 40 15 20. Now I want to find how many numbers pairs are in a given range.(L and R given).
number pair= both numbers are same.
My approach:
Create a Map of Array, such that key of map= number, and value=ArrayList of indexes at which that number appears. Then I traverse from L to R and for each value in that range I traverse in the corresponding arraylist to find if there is a pair that fits in range, and then increment count.
But I think this approach is too slow. Is there some faster method to do the same?
Example: for above given sequence and L=0 and R=6
Answer=5. Possible pairs are 1 for 20, 1 for 15 and 3 for 30.
I am developing a solution, assuming numbers can be upto 10^8( and non negative).

If you are looking for speed and don't care about memory there's maybe a better way.
You can use a set as an auxiliary data structure to see if a number was found, and then simply walk the array. Pseudo code:
int numPairs = 0;
set setVisited;
for (int i = L; i < R; i++) {
if (setVisited.contains(a[i])) {
// found the second of a pair. count it up and reset.
numPairs++;
setVisited.remove(a[i]);
} else {
// remember that we saw this number, so we can spot the next pair.
setVisited.add(a[i]);
}

New solution... hopefully better this time. Psuedo C-ish code:
// Sort the sub-array a[L..R]. This can be done O(nlogn) using qsort.
// ... code omitted ...
// Walk through the sorted array counting how many times number occurs.
// When the number changes, count how many possibles ways to make pairs
// from the given count.
int totalPairs = 0;
int count = 1;
int current = a[L];
for (i = L+1; i < R; i++) {
if (a[i] == current) { // found another, keep counting
count++;
} else { // found a different one
if (count > 1) { // need at least 2 to make a pair!
totalPairs += factorial(count) / 2;
}
}
// start counting the new one
current = a[i];
count = 1;
}
// count the final one
if (count > 1) {
totalPairs += factorial(count) / 2;
}
The sort runs O(nlgn), and the loop body runs O(n). Interestingly the performance barrier is now factorial. For really long arrays with really high numbers of occurrences, factorial is expensive unless you optimize further.
One way would be to have loop count repetitions but not compute factorial yet -- leave yet another array of counts of numbers. Then sort this array (again Nlg(N)), then walk through this array and re-use previously computed factorial to compute the next one.
Also if this array gets big, you'll need a large integer to represent the total. I don't know the O() performance of large integers off the top of my head.
Cool problem!

Related

How to turn integers into Fibonacci coding efficiently?

Fibonacci sequence is obtained by starting with 0 and 1 and then adding the two last numbers to get the next one.
All positive integers can be represented as a sum of a set of Fibonacci numbers without repetition. For example: 13 can be the sum of the sets {13}, {5,8} or {2,3,8}. But, as we have seen, some numbers have more than one set whose sum is the number. If we add the constraint that the sets cannot have two consecutive Fibonacci numbers, than we have a unique representation for each number.
We will use a binary sequence (just zeros and ones) to do that. For example, 17 = 1 + 3 + 13. Then, 17 = 100101. See figure 2 for a detailed explanation.
I want to turn some integers into this representation, but the integers may be very big. How to I do this efficiently.
The problem itself is simple. You always pick the largest fibonacci number less than the remainder. You can ignore the the constraint with the consecutive numbers (since if you need both, the next one is the sum of both so you should have picked that one instead of the initial two).
So the problem remains how to quickly find the largest fibonacci number less than some number X.
There's a known trick that starting with the matrix (call it M)
1 1
1 0
You can compute fibbonacci number by matrix multiplications(the xth number is M^x). More details here: https://www.nayuki.io/page/fast-fibonacci-algorithms . The end result is that you can compute the number you're look in O(logN) matrix multiplications.
You'll need large number computations (multiplications and additions) if they don't fit into existing types.
Also store the matrices corresponding to powers of two you compute the first time, since you'll need them again for the results.
Overall this should be O((logN)^2 * large_number_multiplications/additions)).
First I want to tell you that I really liked this question, I didn't know that All positive integers can be represented as a sum of a set of Fibonacci numbers without repetition, I saw the prove by induction and it was awesome.
To respond to your question I think that we have to figure how the presentation is created. I think that the easy way to find this is that from the number we found the closest minor fibonacci item.
For example if we want to present 40:
We have Fib(9)=34 and Fib(10)=55 so the first element in the presentation is Fib(9)
since 40 - Fib(9) = 6 and (Fib(5) =5 and Fib(6) =8) the next element is Fib(5). So we have 40 = Fib(9) + Fib(5)+ Fib(2)
Allow me to write this in C#
class Program
{
static void Main(string[] args)
{
List<int> fibPresentation = new List<int>();
int numberToPresent = Convert.ToInt32(Console.ReadLine());
while (numberToPresent > 0)
{
int k =1;
while (CalculateFib(k) <= numberToPresent)
{
k++;
}
numberToPresent = numberToPresent - CalculateFib(k-1);
fibPresentation.Add(k-1);
}
}
static int CalculateFib(int n)
{
if (n == 1)
return 1;
int a = 0;
int b = 1;
// In N steps compute Fibonacci sequence iteratively.
for (int i = 0; i < n; i++)
{
int temp = a;
a = b;
b = temp + b;
}
return a;
}
}
Your result will be in fibPresentation
This encoding is more accurately called the "Zeckendorf representation": see https://en.wikipedia.org/wiki/Fibonacci_coding
A greedy approach works (see https://en.wikipedia.org/wiki/Zeckendorf%27s_theorem) and here's some Python code that converts a number to this representation. It uses the first 100 Fibonacci numbers and works correctly for all inputs up to 927372692193078999175 (and incorrectly for any larger inputs).
fibs = [0, 1]
for _ in xrange(100):
fibs.append(fibs[-2] + fibs[-1])
def zeck(n):
i = len(fibs) - 1
r = 0
while n:
if fibs[i] <= n:
r |= 1 << (i - 2)
n -= fibs[i]
i -= 1
return r
print bin(zeck(17))
The output is:
0b100101
As the greedy approach seems to work, it suffices to be able to invert the relation N=Fn.
By the Binet formula, Fn=[φ^n/√5], where the brackets denote the nearest integer. Then with n=floor(lnφ(√5N)) you are very close to the solution.
17 => n = floor(7.5599...) => F7 = 13
4 => n = floor(4.5531) => F4 = 3
1 => n = floor(1.6722) => F1 = 1
(I do not exclude that some n values can be off by one.)
I'm not sure if this is an efficient enough for you, but you could simply use Backtracking to find a(the) valid representation.
I would try to start the backtracking steps by taking the biggest possible fib number and only switch to smaller ones if the consecutive or the only once constraint is violated.

Finding if a random number has occured before or not

Let me be clear at start that this is a contrived example and not a real world problem.
If I have a problem of creating a random number between 0 to 10. I do this 11 times making sure that a previously occurred number is not drawn again, if I get a repeated number,
I create another random number again to make sure it has not be seen earlier. So essentially I get a a sequence of unique numbers from 0 - 10 in a random order
e.g. 3 1 2 0 5 9 4 8 10 6 7 and so on
Now to come up with logic to make sure that the random numbers are unique and not one which we have drawn before, we could use many approaches
Use C++ std::bitset and set the bit corresponding to the index equal to value of each random no. and check it next time when a new random number is drawn.
Or
Use a std::map<int,int> to count the number of times or even simple C array with some sentinel values stored in that array to indicate if that number has occurred or not.
If I have to avoid these methods above and use some mathematical/logical/bitwise operation to find whether a random number has been draw before or not, is there a way?
You don't want to do it the way you suggest. Consider what happens when you have already selected 10 of the 11 items; your random number generator will cycle until it finds the missing number, which might be never, depending on your random number generator.
A better solution is to create a list of numbers 0 to 10 in order, then shuffle the list into a random order. The normal algorithm for doing this is due to Knuth, Fisher and Yates: starting at the first element, swap each element with an element at a location greater than the current element in the array.
function shuffle(a, n)
for i from n-1 to 1 step -1
j = randint(i)
swap(a[i], a[j])
We assume an array with indices 0 to n-1, and a randint function that sets j to the range 0 <= j <= i.
Use an array and add all possible values to it. Then pick one out of the array and remove it. Next time, pick again until the array is empty.
Yes, there is a mathematical way to do it, but it is a bit expansive.
have an array: primes[] where primes[i] = the i'th prime number. So its beginning will be [2,3,5,7,11,...].
Also store a number mult Now, once you draw a number (let it be i) you check if mult % primes[i] == 0, if it is - the number was drawn before, if it wasn't - then the number was not. chose it and do mult = mult * primes[i].
However, it is expansive because it might require a lot of space for large ranges (the possible values of mult increases exponentially
(This is a nice mathematical approach, because we actually look at a set of primes p_i, the array of primes is only the implementation to the abstract set of primes).
A bit manipulation alternative for small values is using an int or long as a bitset.
With this approach, to check a candidate i is not in the set you only need to check:
if (pow(2,i) & set == 0) // not in the set
else //already in the set
To enter an element i to the set:
set = set | pow(2,i)
A better approach will be to populate a list with all the numbers, shuffle it with fisher-yates shuffle, and iterate it for generating new random numbers.
If I have to avoid these methods above and use some
mathematical/logical/bitwise operation to find whether a random number
has been draw before or not, is there a way?
Subject to your contrived constraints yes, you can imitate a small bitset using bitwise operations:
You can choose different integer types on the right according to what size you need.
bitset code bitwise code
std::bitset<32> x; unsigned long x = 0;
if (x[i]) { ... } if (x & (1UL << i)) { ... }
// assuming v is 0 or 1
x[i] = v; x = (x & ~(1UL << i)) | ((unsigned long)v << i);
x[i] = true; x |= (1UL << i);
x[i] = false; x &= ~(1UL << i);
For a larger set (beyond the size in bits of unsigned long long), you will need an array of your chosen integer type. Divide the index by the width of each value to know what index to look up in the array, and use the modulus for the bit shifts. This is basically what bitset does.
I'm assuming that the various answers that tell you how best to shuffle 10 numbers are missing the point entirely: that your contrived constraints are there because you do not in fact want or need to know how best to shuffle 10 numbers :-)
Keep a variable too map the drawn numbers. The i'th bit of that variable will be 1 if the number was drawn before:
int mapNumbers = 0;
int generateRand() {
if (mapNumbers & ((1 << 11) - 1) == ((1 << 11) - 1)) return; // return if all numbers have been generated
int x;
do {
x = newVal();
} while (!x & mapNumbers);
mapNumbers |= (1 << x);
return x;
}

Find unique common element from 3 arrays

Original Problem:
I have 3 boxes each containing 200 coins, given that there is only one person who has made calls from all of the three boxes and thus there is one coin in each box which has same fingerprints and rest of all coins have different fingerprints. You have to find the coin which contains same fingerprint from all of the 3 boxes. So that we can find the fingerprint of the person who has made call from all of the 3 boxes.
Converted problem:
You have 3 arrays containing 200 integers each. Given that there is one and only one common element in these 3 arrays. Find the common element.
Please consider solving this for other than trivial O(1) space and O(n^3) time.
Some improvement in Pelkonen's answer:
From converted problem in OP:
"Given that there is one and only one common element in these 3 arrays."
We need to sort only 2 arrays and find common element.
If you sort all the arrays first O(n log n) then it will be pretty easy to find the common element in less than O(n^3) time. You can for example use binary search after sorting them.
Let N = 200, k = 3,
Create a hash table H with capacity ≥ Nk.
For each element X in array 1, set H[X] to 1.
For each element Y in array 2, if Y is in H and H[Y] == 1, set H[Y] = 2.
For each element Z in array 3, if Z is in H and H[Z] == 2, return Z.
throw new InvalidDataGivenByInterviewerException();
O(Nk) time, O(Nk) space complexity.
Use a hash table for each integer and encode the entries such that you know which array it's coming from - then check for the slot which has entries from all 3 arrays. O(n)
Use a hashtable mapping objects to frequency counts. Iterate through all three lists, incrementing occurrence counts in the hashtable, until you encounter one with an occurrence count of 3. This is O(n), since no sorting is required. Example in Python:
def find_duplicates(*lists):
num_lists = len(lists)
counts = {}
for l in lists:
for i in l:
counts[i] = counts.get(i, 0) + 1
if counts[i] == num_lists:
return i
Or an equivalent, using sets:
def find_duplicates(*lists):
intersection = set(lists[0])
for l in lists[1:]:
intersection = intersection.intersect(set(l))
return intersection.pop()
O(N) solution: use a hash table. H[i] = list of all integers in the three arrays that map to i.
For all H[i] > 1 check if three of its values are the same. If yes, you have your solution. You can do this check with the naive solution even, it should still be very fast, or you can sort those H[i] and then it becomes trivial.
If your numbers are relatively small, you can use H[i] = k if i appears k times in the three arrays, then the solution is the i for which H[i] = 3. If your numbers are huge, use a hash table though.
You can extend this to work even if you can have elements that can be common to only two arrays and also if you can have elements repeating elements in one of the arrays. It just becomes a bit more complicated, but you should be able to figure it out on your own.
If you want the fastest* answer:
Sort one array--time is N log N.
For each element in the second array, search the first. If you find it, add 1 to a companion array; otherwise add 0--time is N log N, using N space.
For each non-zero count, copy the corresponding entry into the temporary array, compacting it so it's still sorted--time is N.
For each element in the third array, search the temporary array; when you find a hit, stop. Time is less than N log N.
Here's code in Scala that illustrates this:
import java.util.Arrays
val a = Array(1,5,2,3,14,1,7)
val b = Array(3,9,14,4,2,2,4)
val c = Array(1,9,11,6,8,3,1)
Arrays.sort(a)
val count = new Array[Int](a.length)
for (i <- 0 until b.length) {
val j =Arrays.binarySearch(a,b(i))
if (j >= 0) count(j) += 1
}
var n = 0
for (i <- 0 until count.length) if (count(i)>0) { count(n) = a(i); n+= 1 }
for (i <- 0 until c.length) {
if (Arrays.binarySearch(count,0,n,c(i))>=0) println(c(i))
}
With slightly more complexity, you can either use no extra space at the cost of being even more destructive of your original arrays, or you can avoid touching your original arrays at all at the cost of another N space.
Edit: * as the comments have pointed out, hash tables are faster for non-perverse inputs. This is "fastest worst case". The worst case may not be so unlikely unless you use a really good hashing algorithm, which may well eat up more time than your sort. For example, if you multiply all your values by 2^16, the trivial hashing (i.e. just use the bitmasked integer as an index) will collide every time on lists shorter than 64k....
//Begineers Code using Binary Search that's pretty Easy
// bool BS(int arr[],int low,int high,int target)
// {
// if(low>high)
// return false;
// int mid=low+(high-low)/2;
// if(target==arr[mid])
// return 1;
// else if(target<arr[mid])
// BS(arr,low,mid-1,target);
// else
// BS(arr,mid+1,high,target);
// }
// vector <int> commonElements (int A[], int B[], int C[], int n1, int n2, int n3)
// {
// vector<int> ans;
// for(int i=0;i<n2;i++)
// {
// if(i>0)
// {
// if(B[i-1]==B[i])
// continue;
// }
// //The above if block is to remove duplicates
// //In the below code we are searching an element form array B in both the arrays A and B;
// if(BS(A,0,n1-1,B[i]) && BS(C,0,n3-1,B[i]))
// {
// ans.push_back(B[i]);
// }
// }
// return ans;
// }

Algorithm to select a single, random combination of values?

Say I have y distinct values and I want to select x of them at random. What's an efficient algorithm for doing this? I could just call rand() x times, but the performance would be poor if x, y were large.
Note that combinations are needed here: each value should have the same probability to be selected but their order in the result is not important. Sure, any algorithm generating permutations would qualify, but I wonder if it's possible to do this more efficiently without the random order requirement.
How do you efficiently generate a list of K non-repeating integers between 0 and an upper bound N covers this case for permutations.
Robert Floyd invented a sampling algorithm for just such situations. It's generally superior to shuffling then grabbing the first x elements since it doesn't require O(y) storage. As originally written it assumes values from 1..N, but it's trivial to produce 0..N and/or use non-contiguous values by simply treating the values it produces as subscripts into a vector/array/whatever.
In pseuocode, the algorithm runs like this (stealing from Jon Bentley's Programming Pearls column "A sample of Brilliance").
initialize set S to empty
for J := N-M + 1 to N do
T := RandInt(1, J)
if T is not in S then
insert T in S
else
insert J in S
That last bit (inserting J if T is already in S) is the tricky part. The bottom line is that it assures the correct mathematical probability of inserting J so that it produces unbiased results.
It's O(x)1 and O(1) with regard to y, O(x) storage.
Note that, in accordance with the combinations tag in the question, the algorithm only guarantees equal probability of each element occuring in the result, not of their relative order in it.
1O(x2) in the worst case for the hash map involved which can be neglected since it's a virtually nonexistent pathological case where all the values have the same hash
Assuming that you want the order to be random too (or don't mind it being random), I would just use a truncated Fisher-Yates shuffle. Start the shuffle algorithm, but stop once you have selected the first x values, instead of "randomly selecting" all y of them.
Fisher-Yates works as follows:
select an element at random, and swap it with the element at the end of the array.
Recurse (or more likely iterate) on the remainder of the array, excluding the last element.
Steps after the first do not modify the last element of the array. Steps after the first two don't affect the last two elements. Steps after the first x don't affect the last x elements. So at that point you can stop - the top of the array contains uniformly randomly selected data. The bottom of the array contains somewhat randomized elements, but the permutation you get of them is not uniformly distributed.
Of course this means you've trashed the input array - if this means you'd need to take a copy of it before starting, and x is small compared with y, then copying the whole array is not very efficient. Do note though that if all you're going to use it for in future is further selections, then the fact that it's in somewhat-random order doesn't matter, you can just use it again. If you're doing the selection multiple times, therefore, you may be able to do only one copy at the start, and amortise the cost.
If you really only need to generate combinations - where the order of elements does not matter - you may use combinadics as they are implemented e.g. here by James McCaffrey.
Contrast this with k-permutations, where the order of elements does matter.
In the first case (1,2,3), (1,3,2), (2,1,3), (2,3,1), (3,1,2), (3,2,1) are considered the same - in the latter, they are considered distinct, though they contain the same elements.
In case you need combinations, you may really only need to generate one random number (albeit it can be a bit large) - that can be used directly to find the m th combination.
Since this random number represents the index of a particular combination, it follows that your random number should be between 0 and C(n,k).
Calculating combinadics might take some time as well.
It might just not worth the trouble - besides Jerry's and Federico's answer is certainly simpler than implementing combinadics.
However if you really only need a combination and you are bugged about generating the exact number of random bits that are needed and none more... ;-)
While it is not clear whether you want combinations or k-permutations, here is a C# code for the latter (yes, we could generate only a complement if x > y/2, but then we would have been left with a combination that must be shuffled to get a real k-permutation):
static class TakeHelper
{
public static IEnumerable<T> TakeRandom<T>(
this IEnumerable<T> source, Random rng, int count)
{
T[] items = source.ToArray();
count = count < items.Length ? count : items.Length;
for (int i = items.Length - 1 ; count-- > 0; i--)
{
int p = rng.Next(i + 1);
yield return items[p];
items[p] = items[i];
}
}
}
class Program
{
static void Main(string[] args)
{
Random rnd = new Random(Environment.TickCount);
int[] numbers = new int[] { 1, 2, 3, 4, 5, 6, 7 };
foreach (int number in numbers.TakeRandom(rnd, 3))
{
Console.WriteLine(number);
}
}
}
Another, more elaborate implementation that generates k-permutations, that I had lying around and I believe is in a way an improvement over existing algorithms if you only need to iterate over the results. While it also needs to generate x random numbers, it only uses O(min(y/2, x)) memory in the process:
/// <summary>
/// Generates unique random numbers
/// <remarks>
/// Worst case memory usage is O(min((emax-imin)/2, num))
/// </remarks>
/// </summary>
/// <param name="random">Random source</param>
/// <param name="imin">Inclusive lower bound</param>
/// <param name="emax">Exclusive upper bound</param>
/// <param name="num">Number of integers to generate</param>
/// <returns>Sequence of unique random numbers</returns>
public static IEnumerable<int> UniqueRandoms(
Random random, int imin, int emax, int num)
{
int dictsize = num;
long half = (emax - (long)imin + 1) / 2;
if (half < dictsize)
dictsize = (int)half;
Dictionary<int, int> trans = new Dictionary<int, int>(dictsize);
for (int i = 0; i < num; i++)
{
int current = imin + i;
int r = random.Next(current, emax);
int right;
if (!trans.TryGetValue(r, out right))
{
right = r;
}
int left;
if (trans.TryGetValue(current, out left))
{
trans.Remove(current);
}
else
{
left = current;
}
if (r > current)
{
trans[r] = left;
}
yield return right;
}
}
The general idea is to do a Fisher-Yates shuffle and memorize the transpositions in the permutation.
It was not published anywhere nor has it received any peer-review whatsoever. I believe it is a curiosity rather than having some practical value. Nonetheless I am very open to criticism and would generally like to know if you find anything wrong with it - please consider this (and adding a comment before downvoting).
A little suggestion: if x >> y/2, it's probably better to select at random y - x elements, then choose the complementary set.
The trick is to use a variation of shuffle or in other words a partial shuffle.
function random_pick( a, n )
{
N = len(a);
n = min(n, N);
picked = array_fill(0, n, 0); backup = array_fill(0, n, 0);
// partially shuffle the array, and generate unbiased selection simultaneously
// this is a variation on fisher-yates-knuth shuffle
for (i=0; i<n; i++) // O(n) times
{
selected = rand( 0, --N ); // unbiased sampling N * N-1 * N-2 * .. * N-n+1
value = a[ selected ];
a[ selected ] = a[ N ];
a[ N ] = value;
backup[ i ] = selected;
picked[ i ] = value;
}
// restore partially shuffled input array from backup
// optional step, if needed it can be ignored
for (i=n-1; i>=0; i--) // O(n) times
{
selected = backup[ i ];
value = a[ N ];
a[ N ] = a[ selected ];
a[ selected ] = value;
N++;
}
return picked;
}
NOTE the algorithm is strictly O(n) in both time and space, produces unbiased selections (it is a partial unbiased shuffling) and non-destructive on the input array (as a partial shuffle would be) but this is optional
adapted from here
update
another approach using only a single call to PRNG (pseudo-random number generator) in [0,1] by IVAN STOJMENOVIC, "ON RANDOM AND ADAPTIVE PARALLEL GENERATION OF COMBINATORIAL OBJECTS" (section 3), of O(N) (worst-case) complexity
Here is a simple way to do it which is only inefficient if Y is much larger than X.
void randomly_select_subset(
int X, int Y,
const int * inputs, int X, int * outputs
) {
int i, r;
for( i = 0; i < X; ++i ) outputs[i] = inputs[i];
for( i = X; i < Y; ++i ) {
r = rand_inclusive( 0, i+1 );
if( r < i ) outputs[r] = inputs[i];
}
}
Basically, copy the first X of your distinct values to your output array, and then for each remaining value, randomly decide whether or not to include that value.
The random number is further used to choose an element of our (mutable) output array to replace.
If, for example, you have 2^64 distinct values, you can use a symmetric key algorithm (with a 64 bits block) to quickly reshuffle all combinations. (for example Blowfish).
for(i=0; i<x; i++)
e[i] = encrypt(key, i)
This is not random in the pure sense but can be useful for your purpose.
If you want to work with arbitrary # of distinct values following cryptographic techniques you can but it's more complex.

Is it possible to rearrange an array in place in O(N)?

If I have a size N array of objects, and I have an array of unique numbers in the range 1...N, is there any algorithm to rearrange the object array in-place in the order specified by the list of numbers, and yet do this in O(N) time?
Context: I am doing a quick-sort-ish algorithm on objects that are fairly large in size, so it would be faster to do the swaps on indices than on the objects themselves, and only move the objects in one final pass. I'd just like to know if I could do this last pass without allocating memory for a separate array.
Edit: I am not asking how to do a sort in O(N) time, but rather how to do the post-sort rearranging in O(N) time with O(1) space. Sorry for not making this clear.
I think this should do:
static <T> void arrange(T[] data, int[] p) {
boolean[] done = new boolean[p.length];
for (int i = 0; i < p.length; i++) {
if (!done[i]) {
T t = data[i];
for (int j = i;;) {
done[j] = true;
if (p[j] != i) {
data[j] = data[p[j]];
j = p[j];
} else {
data[j] = t;
break;
}
}
}
}
}
Note: This is Java. If you do this in a language without garbage collection, be sure to delete done.
If you care about space, you can use a BitSet for done. I assume you can afford an additional bit per element because you seem willing to work with a permutation array, which is several times that size.
This algorithm copies instances of T n + k times, where k is the number of cycles in the permutation. You can reduce this to the optimal number of copies by skipping those i where p[i] = i.
The approach is to follow the "permutation cycles" of the permutation, rather than indexing the array left-to-right. But since you do have to begin somewhere, everytime a new permutation cycle is needed, the search for unpermuted elements is left-to-right:
// Pseudo-code
N : integer, N > 0 // N is the number of elements
swaps : integer [0..N]
data[N] : array of object
permute[N] : array of integer [-1..N] denoting permutation (used element is -1)
next_scan_start : integer;
next_scan_start = 0;
while (swaps < N )
{
// Search for the next index that is not-yet-permtued.
for (idx_cycle_search = next_scan_start;
idx_cycle_search < N;
++ idx_cycle_search)
if (permute[idx_cycle_search] >= 0)
break;
next_scan_start = idx_cycle_search + 1;
// This is a provable invariant. In short, number of non-negative
// elements in permute[] equals (N - swaps)
assert( idx_cycle_search < N );
// Completely permute one permutation cycle, 'following the
// permutation cycle's trail' This is O(N)
while (permute[idx_cycle_search] >= 0)
{
swap( data[idx_cycle_search], data[permute[idx_cycle_search] )
swaps ++;
old_idx = idx_cycle_search;
idx_cycle_search = permute[idx_cycle_search];
permute[old_idx] = -1;
// Also '= -idx_cycle_search -1' could be used rather than '-1'
// and would allow reversal of these changes to permute[] array
}
}
Do you mean that you have an array of objects O[1..N] and then you have an array P[1..N] that contains a permutation of numbers 1..N and in the end you want to get an array O1 of objects such that O1[k] = O[P[k]] for all k=1..N ?
As an example, if your objects are letters A,B,C...,Y,Z and your array P is [26,25,24,..,2,1] is your desired output Z,Y,...C,B,A ?
If yes, I believe you can do it in linear time using only O(1) additional memory. Reversing elements of an array is a special case of this scenario. In general, I think you would need to consider decomposition of your permutation P into cycles and then use it to move around the elements of your original array O[].
If that's what you are looking for, I can elaborate more.
EDIT: Others already presented excellent solutions while I was sleeping, so no need to repeat it here. ^_^
EDIT: My O(1) additional space is indeed not entirely correct. I was thinking only about "data" elements, but in fact you also need to store one bit per permutation element, so if we are precise, we need O(log n) extra bits for that. But most of the time using a sign bit (as suggested by J.F. Sebastian) is fine, so in practice we may not need anything more than we already have.
If you didn't mind allocating memory for an extra hash of indexes, you could keep a mapping of original location to current location to get a time complexity of near O(n). Here's an example in Ruby, since it's readable and pseudocode-ish. (This could be shorter or more idiomatically Ruby-ish, but I've written it out for clarity.)
#!/usr/bin/ruby
objects = ['d', 'e', 'a', 'c', 'b']
order = [2, 4, 3, 0, 1]
cur_locations = {}
order.each_with_index do |orig_location, ordinality|
# Find the current location of the item.
cur_location = orig_location
while not cur_locations[cur_location].nil? do
cur_location = cur_locations[cur_location]
end
# Swap the items and keep track of whatever we swapped forward.
objects[ordinality], objects[cur_location] = objects[cur_location], objects[ordinality]
cur_locations[ordinality] = orig_location
end
puts objects.join(' ')
That obviously does involve some extra memory for the hash, but since it's just for indexes and not your "fairly large" objects, hopefully that's acceptable. Since hash lookups are O(1), even though there is a slight bump to the complexity due to the case where an item has been swapped forward more than once and you have to rewrite cur_location multiple times, the algorithm as a whole should be reasonably close to O(n).
If you wanted you could build a full hash of original to current positions ahead of time, or keep a reverse hash of current to original, and modify the algorithm a bit to get it down to strictly O(n). It'd be a little more complicated and take a little more space, so this is the version I wrote out, but the modifications shouldn't be difficult.
EDIT: Actually, I'm fairly certain the time complexity is just O(n), since each ordinality can have at most one hop associated, and thus the maximum number of lookups is limited to n.
#!/usr/bin/env python
def rearrange(objects, permutation):
"""Rearrange `objects` inplace according to `permutation`.
``result = [objects[p] for p in permutation]``
"""
seen = [False] * len(permutation)
for i, already_seen in enumerate(seen):
if not already_seen: # start permutation cycle
first_obj, j = objects[i], i
while True:
seen[j] = True
p = permutation[j]
if p == i: # end permutation cycle
objects[j] = first_obj # [old] p -> j
break
objects[j], j = objects[p], p # p -> j
The algorithm (as I've noticed after I wrote it) is the same as the one from #meriton's answer in Java.
Here's a test function for the code:
def test():
import itertools
N = 9
for perm in itertools.permutations(range(N)):
L = range(N)
LL = L[:]
rearrange(L, perm)
assert L == [LL[i] for i in perm] == list(perm), (L, list(perm), LL)
# test whether assertions are enabled
try:
assert 0
except AssertionError:
pass
else:
raise RuntimeError("assertions must be enabled for the test")
if __name__ == "__main__":
test()
There's a histogram sort, though the running time is given as a bit higher than O(N) (N log log n).
I can do it given O(N) scratch space -- copy to new array and copy back.
EDIT: I am aware of the existance of an algorithm that will proceed through. The idea is to perform the swaps on the array of integers 1..N while at the same time mirroring the swaps on your array of large objects. I just cannot find the algorithm right now.
The problem is one of applying a permutation in place with minimal O(1) extra storage: "in-situ permutation".
It is solvable, but an algorithm is not obvious beforehand.
It is described briefly as an exercise in Knuth, and for work I had to decipher it and figure out how it worked. Look at 5.2 #13.
For some more modern work on this problem, with pseudocode:
http://www.fernuni-hagen.de/imperia/md/content/fakultaetfuermathematikundinformatik/forschung/berichte/bericht_273.pdf
I ended up writing a different algorithm for this, which first generates a list of swaps to apply an order and then runs through the swaps to apply it. The advantage is that if you're applying the ordering to multiple lists, you can reuse the swap list, since the swap algorithm is extremely simple.
void make_swaps(vector<int> order, vector<pair<int,int>> &swaps)
{
// order[0] is the index in the old list of the new list's first value.
// Invert the mapping: inverse[0] is the index in the new list of the
// old list's first value.
vector<int> inverse(order.size());
for(int i = 0; i < order.size(); ++i)
inverse[order[i]] = i;
swaps.resize(0);
for(int idx1 = 0; idx1 < order.size(); ++idx1)
{
// Swap list[idx] with list[order[idx]], and record this swap.
int idx2 = order[idx1];
if(idx1 == idx2)
continue;
swaps.push_back(make_pair(idx1, idx2));
// list[idx1] is now in the correct place, but whoever wanted the value we moved out
// of idx2 now needs to look in its new position.
int idx1_dep = inverse[idx1];
order[idx1_dep] = idx2;
inverse[idx2] = idx1_dep;
}
}
template<typename T>
void run_swaps(T data, const vector<pair<int,int>> &swaps)
{
for(const auto &s: swaps)
{
int src = s.first;
int dst = s.second;
swap(data[src], data[dst]);
}
}
void test()
{
vector<int> order = { 2, 3, 1, 4, 0 };
vector<pair<int,int>> swaps;
make_swaps(order, swaps);
vector<string> data = { "a", "b", "c", "d", "e" };
run_swaps(data, swaps);
}

Resources