Permutation of a vector - algorithm

Suppose I have a vector:
0 1 2 3 4 5
And a permutation of its indices:
Is there an efficient way to re-sort it according to the permutation, thus obtaining:
Using at most O(1) additional space?

Yes. Starting from the leftmost position, we put the element found there into its correct position i by swapping it with the (other) misplaced element currently at position i. The swap needs one temporary variable, which is the only O(1) additional space required. We keep swapping pairs of elements around until the element in the current position is correct. Only then do we proceed to the next position and do the same thing.
[5 3 2 1 0 4] initial state
[4 3 2 1 0 5] swapped (5,4), 5 is now in the correct position, but 4 is still wrong
[0 3 2 1 4 5] swapped (4,0), now both 4 and 0 are in the correct positions, move on to next position
[0 1 2 3 4 5] swapped (3,1), now 1 and 3 are both in the correct positions, move on to next position
[0 1 2 3 4 5] all elements are in the correct positions, end.
Since each swap operation puts at least one (of the two) elements in the correct position, we need no more than N such swaps altogether.
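A minimal Python sketch of the swapping procedure above, assuming the permutation is a list containing each of the indices 0..N-1 exactly once. The function name is my own, and the swap count is returned only to check the "no more than N swaps" claim:

```python
def sort_permutation_in_place(perm):
    """Sort a permutation of 0..N-1 in place by repeated swaps; return the swap count."""
    swaps = 0
    for i in range(len(perm)):
        # Keep swapping until the element at position i is the correct one.
        while perm[i] != i:
            j = perm[i]                      # correct position of the element at i
            perm[i], perm[j] = perm[j], perm[i]
            swaps += 1
    return swaps


p = [5, 3, 2, 1, 0, 4]
print(sort_permutation_in_place(p), p)
```

To reorder a separate data vector, the same two positions can be swapped in the data vector each time they are swapped in the permutation.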

Zach's solution is very good.
Still, I was wondering why there is any need to sort. If you have the permutation of the indices, use the values as a pointer to the old array.
This may eliminate the need to sort the array in the first place. This is not a solution that can be used in all cases, but it will work fine in most cases.
For example:
a = [45,89,22,31,23,76];
b = [5,3,2,1,0,4]
Now if you want to loop through the values in a, you can do something like (pseudo-code):
for i=0 to 5
    print a[i]
If you want to loop through the values in the new order, do:
for i=0 to 5
    print a[b[i]]
As mentioned earlier, this solution may be sufficient in many cases, but not in all of them. For the other cases you can use the solution by Zach. But where this solution can be used, it is better because no sorting is needed at all.
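The indirection idea can be sketched in Python as follows. The arrays are the ones from the example; the helper name in_permuted_order is just an illustration:

```python
a = [45, 89, 22, 31, 23, 76]
b = [5, 3, 2, 1, 0, 4]


def in_permuted_order(values, order):
    """Yield values in the order given by the index list, without moving anything."""
    for index in order:
        yield values[index]


reordered = list(in_permuted_order(a, b))
print(reordered)
```

Nothing in a is moved; the cost is one extra index lookup per access.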


Shell sort algorithm: shift or swap?

I read that Shell sort is an improved version of insertion sort, but online I sometimes read that it is about shifting and sometimes that it is about swapping. Which one is correct?
For example: [5 3 2 10 0]
If we take the gap to be 2, then as a first step we compare 5 and 2, and the result will be:
[2 3 5 10 0] by swapping and [2 5 3 10 0] by shifting. Which one is the Shell algorithm?
The main principle in Shell sort is that with the chosen gap we look at the data as a collection of interleaved, shorter arrays. Each of those shorter arrays has its first entry at an index less than gap. These shorter arrays are sorted independently. Once that is done, the gap is reduced.
In the example, there are two interleaved arrays, which we can picture like this:
interleaved array: 5 2 0
interleaved array: 3 10
The first algorithm would fit under Shell sort. But the second one does not sort the interleaved arrays independently, as such rotations (shifts) move values from one interleaved array to another:
interleaved array: 5 🠔 2 0
⬊ ⬈
interleaved array: 3 10
...resulting in:
interleaved array: 2 3 0
interleaved array: 5 10
Unless other precautions are made, the second algorithm will not ensure a rotation improves the situation. For instance, if the input is [3 1 2 4] and gap is 2, then the comparison of 3 and 2 will lead to a rotation, and we get [2 3 1 4]. But now we still have two values in the first interleaved array that are not in order (2 is greater than 1).
Shifting does not occur like you depicted it (crossing multiple interleaved arrays), but within one interleaved array it is generally done, just like it is done in insertion sort. So to apply that to your example:
interleaved array: 5 2 0
interleaved array: 3 10
The value 2 is picked up and preceding values are shifted forward within the same interleaved array until the right slot is found for the picked up value. In this case only one value is shifted (5), which makes it a swap:
interleaved array: 2 5 0
interleaved array: 3 10
Now 0 is picked up, and two values are shifted (2 and 5):
interleaved array: 0 2 5
interleaved array: 3 10
Now the first interleaved array is sorted. The second interleaved array happens to be sorted already. Then the gap is reduced to 1:
array: 0 3 2 10 5
Here 2 is picked up and one value (3) is shifted:
array: 0 2 3 10 5
Finally 5 is picked up and one value (10) is shifted:
array: 0 2 3 5 10
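The walk-through above corresponds to a gapped insertion sort. Here is a minimal Python sketch, assuming the gap sequence is passed in explicitly (the function name and the gaps parameter are my own choices):

```python
def shell_sort(values, gaps):
    """Sort values in place; each gap runs an insertion sort on the interleaved arrays."""
    for gap in gaps:
        for i in range(gap, len(values)):
            picked = values[i]               # the value that is picked up
            j = i
            # Shift larger values forward, staying within the same interleaved array.
            while j >= gap and values[j - gap] > picked:
                values[j] = values[j - gap]
                j -= gap
            values[j] = picked               # drop it into the slot that opened up
    return values


print(shell_sort([5, 3, 2, 10, 0], gaps=[2, 1]))
```

Because j only ever moves in steps of gap, a value never crosses from one interleaved array into another.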

Practical algorithms for permuting external memory

On a spinning disk, I have N records that I want to permute. In RAM, I have an array of N indices that contain the desired permutation. I also have enough RAM to hold n records at a time. What algorithm can I use to execute the permutation on disk as quickly as possible, taking into account the fact that sequential disk access is a lot faster?
I have plenty of excess disk to use for intermediate files, if desired.
This is a known problem. Find the cycles in your permutation order. For instance, given five records to permute [1, 0, 3, 4, 2], you have cycles (0, 1) and (2, 3, 4). You do this by picking an unused starting position; follow the index pointers until you return to your starting point. The sequence of pointers describes a cycle.
You then permute the records with an internal temporary variable, one record long.
temp = disk[0]
disk[0] = disk[1]
disk[1] = temp
temp = disk[2]
disk[2] = disk[3]
disk[3] = disk[4]
disk[4] = temp
Note that you can also perform the permutation as you traverse the pointers. You will also need some method to recall which positions have already been permuted, such as clearing the permutation index (set it to -1).
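As a rough sketch, here is the cycle-following idea in Python, with a list standing in for the disk. It assumes perm[i] names the position whose record should end up at position i, which is how the assignments above read, and it marks finished positions with -1 in a copy of the index array. The function name is hypothetical:

```python
def permute_by_cycles(records, perm):
    """Rearrange records in place so that new records[i] == old records[perm[i]]."""
    perm = list(perm)                        # copy, so visited slots can be marked -1
    for start in range(len(perm)):
        if perm[start] < 0:
            continue                         # already handled as part of an earlier cycle
        temp = records[start]                # the single record held in memory
        i = start
        while perm[i] != start:
            source = perm[i]
            records[i] = records[source]
            perm[i] = -1
            i = source
        records[i] = temp                    # close the cycle
        perm[i] = -1
    return records


print(permute_by_cycles(list("abcde"), [1, 0, 3, 4, 2]))
```

Each record is read once and written once, but the accesses jump around the disk, which is what the rest of the discussion tries to improve on.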
Can you see how to generalize that?
This is a problem of interval coordination. I'll simplify the notation slightly by changing the memory available to M records, since having upper- and lower-case N is a little confusing.
First, we re-cast the permutation as a series of intervals, each being the rotational span during which a record needs to reside in RAM. If a record needs to be written to a lower-numbered position, we increase the endpoint by the list size to indicate the wraparound: the write has to wait for the next disk rotation. For instance, using my earlier example, we expand the list:
[1, 0, 3, 4, 2]
0 -> 1
1 -> 0+5
2 -> 3
3 -> 4
4 -> 2+5
Now, we apply standard greedy scheduling resolution. First, sort by endpoint:
[0, 1]
[2, 3]
[3, 4]
[1, 5]
[4, 7]
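A small Python sketch of this expansion and sort, assuming (as in the listing above) that entry i of the list is read as the position record i must be written to. The function name is my own:

```python
def intervals_by_endpoint(perm):
    """Build (start, end) spans, adding the list size on wraparound, sorted by endpoint."""
    n = len(perm)
    intervals = []
    for start, target in enumerate(perm):
        # A lower-numbered target has to wait for the next rotation.
        end = target if target >= start else target + n
        intervals.append((start, end))
    return sorted(intervals, key=lambda interval: interval[1])


print(intervals_by_endpoint([1, 0, 3, 4, 2]))
```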
Now, apply the algorithm for M-1 "lanes"; the extra one is needed for swap space. We fill each lane, appending the interval with the earliest endpoint, whose start-point doesn't overlap:
[0, 1] [2, 3] [3, 4] [4, 7]
[1, 5]
We can do this in a total of 7 "ticks" if M >= 3. If M=2, we defer the second lane by 2 rotations to [11, 15].
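The lane filling can be sketched like this in Python. This is only an illustration under simplifying assumptions: it opens as many lanes as it needs and does not model the limit of M-1 lanes or the deferral by whole rotations, and it treats an interval that starts exactly where another ends as non-overlapping, as the example does with [3, 4] and [4, 7]:

```python
def assign_lanes(intervals):
    """Greedily place intervals (taken in endpoint order) into the first lane that is free."""
    lanes = []
    for start, end in sorted(intervals, key=lambda interval: interval[1]):
        for lane in lanes:
            if lane[-1][1] <= start:         # lane is free again by the time we start
                lane.append((start, end))
                break
        else:
            lanes.append([(start, end)])     # no lane was free: open a new one
    return lanes


for lane in assign_lanes([(0, 1), (1, 5), (2, 3), (3, 4), (4, 7)]):
    print(lane)
```

The number of lanes returned tells you how much RAM the permutation would need to finish without any deferral.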
Sneftal's nice example gives us more troubles, with deeper overlap:
[0, 4]
[1, 5]
[2, 6]
[3, 7]
[4, 0+8]
[5, 1+8]
[6, 2+8]
[7, 3+8]
This requires 4 "lanes" if available, deferring lanes as needed if M < 5.
The pathological case is where every record in the permutation needs to be copied back one position, such as [3, 0, 1, 2], with M=2.
[0, 3]
[1, 4]
[2, 5]
[3, 6]
In this case, we walk through the deferral cycle multiple times. At the end of every rotation, we have to defer all remaining intervals by one rotation, resulting in
[0, 3] [3, 6] [2+4, 5+4] [1+4+4, 4+4+4]
Does that get you moving, or do you need more detail?
I have an idea, which might need further improvement. But here it goes:
Suppose the hdd has the following structure:
5 4 1 2 3
And we want to write out this permutation:
2 3 5 1 4
Since hdd is a circular buffer, and assuming it can only rotate in one direction, we can write the above permutation using shifts as such:
5 >> 2
4 >> 3
1 >> 1
2 >> 2
3 >> 2
So let's put that in an array, and since we know it is a circular array, let's put its mirrors side by side:
| 2 3 1 2 2 | 2 3 1 2 2| 2 3 1 2 2 | 2 3 1 2 2 |... Inf
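For reference, the shift amounts above can be computed with a few lines of Python. This is a sketch assuming every record value is unique; the function name is hypothetical:

```python
def shift_amounts(current, wanted):
    """For each record in current, how far it must rotate forward to reach its slot in wanted."""
    n = len(current)
    target = {record: position for position, record in enumerate(wanted)}
    return [(target[record] - position) % n for position, record in enumerate(current)]


print(shift_amounts([5, 4, 1, 2, 3], [2, 3, 5, 1, 4]))
```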
Since we want to favor sequential reads (or writes), we can apply a cost function to the above series. Let the cost function be linear, i.e.:
0 1 2 3 4 5 6 7 8 9 10 ... Inf
Now, let us add the cost function to the above series. But how do we select the starting point?
The idea is to select the starting point such that you get the longest contiguous monotonically increasing sequence.
For example, if you select the 0 point to be on "3", you'll get
(1) | - 3 2 4 5 | 6 8 7 9 10 | ...
If you select the 0 point to be on "2", the one just right of "1", you'll get:
(2) | - - - 2 3 | 4 6 5 7 8 | ...
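Series (1) and (2) can be reproduced with a short Python sketch. The function name and the start and length parameters are my own; start is the index in the shift array where the cost function is 0:

```python
def cost_series(shifts, start, length):
    """Add the linear cost 0, 1, 2, ... to the circular shift array, beginning at start."""
    n = len(shifts)
    return [shifts[(start + k) % n] + k for k in range(length)]


shifts = [2, 3, 1, 2, 2]
print(cost_series(shifts, start=1, length=9))   # series (1), starting on "3"
print(cost_series(shifts, start=3, length=7))   # series (2), starting on the "2" right of "1"
```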
Since we are trying to favor consecutive reads, let's define our read-write function f() to work as follows:
At any currently pointed hdd location, the function reads the file at that location into available RAM (namely, total space - 1, because we want to save 1 slot for swap).
If no space is left in RAM for the read, the function asserts and the program halts.
At any current hdd location, if RAM holds the value that we want written at that location, the function reads the current file into swap space, writes the wanted value from RAM to the hdd, and destroys the value in RAM.
If a value is placed onto the hdd, the function checks whether the sequence is complete. If it is, the program returns with success.
Now, we should note that if the following holds:
shift amount <= n - 1 (n : available memory we can hold)
then we can traverse the hard disk in one pass using the above function. For example:
current: 4 5 6 7 0 1 2 3
we want: 0 1 2 3 4 5 6 7
n : 5
We can start anywhere we want, say from the initial "4". We read 4 items sequentially (RAM now holds 4 items), and then we start placing 0 1 2 3 (we can, because n = 5 in total and 4 slots are used; 1 is reserved for swap). So the total is 4 consecutive reads followed by 8 r-w operations.
Following that analogy, it becomes clear that if we subtract "n-1" from series (1) and (2), the positions with a value "<= 0" are better suited as the initial position, because the ones higher than zero will definitely require another pass.
So we select eq. (2) and subtract, for let's say "n = 3", we subtract 2 from eq. (2):
(2) | - - - 0 1 | 2 4 3 5 6 | ...
Now it is clear that, using f(), and starting from 0, assuming n = 3, we will have a starting operation as such: r, r, r-w, r-w, ...
So, how do we do the rest and find minimum cost? We will place an array with initial minimum cost, just below equation (2). The positions in that array will signify where we want f() to be executed.
| - - - 0 1 | 2 4 3 5 6 | ...
| - - - 1 1 | 1 1 1 1 1 | ...
The second array, the one with 1's and 0's, tells the program where to execute f(). Note that if we guessed those locations wrong, f() will assert.
Before we actually start placing files onto the hdd, we of course want to see whether the f() positions are correct. We check whether there are assertions, and we try to minimize cost while removing all assertions. So, e.g.:
(1) 1111000000000000001111
(2) 1111111000000000000000
(1) obviously has a higher cost than (2). So the question reduces to finding the 1-0 array.
Some ideas on finding the best array:
Simplest solution: write out all 1's and turn assertions into 0's (essentially a skip). This method is guaranteed to work.
Brute force: write an array as shown in (2) and start shifting 1's to the right, in such an order that tries out every permutation available:
Full random approach: plug in mt19937 and start permuting. Whenever you see a sharp drop in cost, stop executing and implement the hdd copy-paste. You won't find the global minimum, but you'll get a nice trade-off.
Genetic algorithms: for permutations where "shift count is much lower than n - 1", the methodology provided in this answer should (?) provide a global minimum and smooth gradients. This allows one to use genetic algorithms without relying on mutations too much.
One advantage I find in this approach is that, since OP mentioned that this is a real-life problem, the method provides an easy(ier?) way to change cost functions. It is easier to detect the effect of, say, having lots of contiguous small files to be copied vs. having a single huge file. Or perhaps rrwwrrww is better than rrrrwwww?
Does any of this even make sense? We will have to try out ...

Data structure to handle numerous queries on large size array

You are given q queries of the following forms, operating on a list:
1 x y: Add number x to the list y times.
2 n: find the nth number of the sorted list
1 <= q <= 5 * 100000
1 <= x, y <= 1000000000
1 <= n < length of list
1 3 6
1 5 2
2 7
2 4
This is a competitive programming problem that it's too early in the morning for me to solve right now, but I can try and give some pointers.
If you were to store the entire array explicitly, it would obviously blow out your memory. But you can exploit the structure of the array to instead store the number of times each entry appears in the array. So if you got the query
1 3 5
then instead of storing [3, 3, 3, 3, 3], you'd store the pair (3, 5), indicating that the number 3 is in the list 5 times.
You can pretty easily build this, perhaps as a vector of pairs of ints that you update.
The remaining task is to implement the 2 query, where you find an element by its index in sorted order. A side effect of the structure we've chosen is that you can't directly index into that vector of pairs of ints, since the indices in that list don't match up with the indices into the hypothetical array. We could keep the pairs ordered by value and just add up the size of each entry from the start until we hit the index we want, but that's O(n) per lookup and so O(n^2) overall in the number of queries we've processed so far... likely too slow. Instead, we probably want some updatable data structure for prefix sums, perhaps as described in this answer.
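One way to put these pointers together is sketched below in Python. It is not necessarily the intended solution: it assumes all queries can be read up front (so the distinct x values can be sorted and numbered), and it uses a Fenwick tree of counts as the prefix-sum structure. The function name and the tuple format for queries are my own:

```python
def answer_queries(queries):
    """queries: tuples (1, x, y) or (2, n). Returns the answers to the type-2 queries."""
    values = sorted({q[1] for q in queries if q[0] == 1})
    rank = {value: i + 1 for i, value in enumerate(values)}   # 1-based tree positions
    size = len(values)
    tree = [0] * (size + 1)

    def add(pos, count):
        while pos <= size:
            tree[pos] += count
            pos += pos & -pos

    def kth(k):
        # Descend the tree to find the first position whose prefix count reaches k.
        pos = 0
        step = 1 << size.bit_length()
        while step:
            nxt = pos + step
            if nxt <= size and tree[nxt] < k:
                pos = nxt
                k -= tree[nxt]
            step >>= 1
        return values[pos]

    answers = []
    for q in queries:
        if q[0] == 1:
            add(rank[q[1]], q[2])
        else:
            answers.append(kth(q[1]))
    return answers


print(answer_queries([(1, 3, 6), (1, 5, 2), (2, 7), (2, 4)]))
```

Both query types take O(log q) time after the initial sort, and memory stays proportional to the number of distinct x values rather than to the length of the list.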

How to perform range updates in sqrt{n} time?

I have an array and I have to perform queries and updates on it.
For a query, I have to find the frequency of a particular number in a range from l to r, and for an update, I have to add x to every element in some range l to r.
How can I do this?
I thought of sqrt{n} decomposition, but I don't know how to perform range updates within this time complexity.
Edit - Since some people are asking for an example, here is one
Suppose the array is of size n = 8
and it is
1 3 3 4 5 1 2 3
And here are 3 queries to help explain what I am trying to say:
q 1 5 3 - This means that you have to find the frequency of 3 in range 1 to 5 which is 2 as 3 appears on 2nd and 3rd position.
The second is an update query and goes like this: u 2 4 6. This means that you have to add 6 to every element of the array in the range 2 to 4. So the new array will become
1 9 9 10 5 1 2 3
And the last query is again the same as first one which will now return 0 as there is no 3 in the array from position 1 to 5 now.
I believe things must be more clear now. :)
I developed this algorithm a long time ago (20+ years) for an arithmetic coder.
Both Update and Retrieve run in O(log(N)).
I named this algorithm the "Method of Intervals". Let me show you an example.
Imagine we have 8 intervals, numbered 0-7:
Let's create an additional set of intervals, each spanning a pair of the original ones:
Next, we create one more layer of intervals, each spanning a pair from the 2nd layer:
And finally, we create a single interval that covers all 8:
As you can see, in this structure, to retrieve the right border of interval [5] you just add together the lengths of intervals [0123] + [45]. To retrieve the left border of interval [5], you need the sum of the lengths of intervals [0123] + [4] (the left border of 5 is the right border of 4).
Of course, the left border of interval [0] is always 0.
If you look at this structure carefully, you will see that the odd elements in each layer aren't needed. That is, you do not need the elements 1, 3, 5, 7, 23, 67, 4567, since they are never used during Retrieval or Update.
Let's remove the odd elements and apply the following renumbering:
As you can see, this renumbering uses the numbers [1-8]. Let them be array indexes. So the memory used is O(N).
To retrieve the right border of interval [7], you add up the values at indexes 4, 6, 7. To update the length of an interval, you add the difference to every stored value whose span includes it; for interval [7] those are the values at indexes 7 and 8. As a result, both Retrieval and Update take O(log(N)) time.
Now we need an algorithm that computes, from the original interval number, the set of indexes to visit in this data structure. For instance, how to convert:
1 -> 1
2 -> 2
3 -> 3,2
7 -> 7,6,4
This is easy if we look at the binary representation of these numbers:
1 -> 1
10 -> 10
11 -> 11,10
111 -> 111,110,100
As you can see, in each chain the next value is the previous value with its rightmost "1" changed to "0". Using the simple bit operation "x & (x - 1)", we can write a simple loop to iterate over the array indexes related to an interval number:
int interval = 7;
do {
    int index = interval;  /* visit the array element at this index: 7, 6, 4 */
} while (interval &= interval - 1);
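Here is how the whole structure might look in Python, as a sketch rather than the original arithmetic-coder code. The class and method names are my own, and indexes are 1-based as in the renumbering above. Retrieval walks the chain obtained by clearing the rightmost 1 bit; Update walks the complementary chain obtained by adding the rightmost 1 bit:

```python
class IntervalSums:
    """Prefix sums of interval lengths with O(log N) retrieve and update."""

    def __init__(self, size):
        self.size = size
        self.tree = [0] * (size + 1)         # index 0 is unused

    def update(self, index, difference):
        """Add difference to the length of the interval stored at index."""
        while index <= self.size:
            self.tree[index] += difference
            index += index & -index          # next stored value that spans this one

    def right_border(self, index):
        """Sum of the lengths of the intervals at indexes 1..index."""
        total = 0
        while index > 0:
            total += self.tree[index]
            index &= index - 1               # clear the rightmost 1 bit
        return total


sums = IntervalSums(8)
for i, length in enumerate([3, 1, 4, 1, 5, 9, 2, 6], start=1):
    sums.update(i, length)
print(sums.right_border(7))
```

The left border of the interval at index i is simply right_border(i - 1).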

Cycle sort Algorithm

I was browsing the internet when I found out that there is an algorithm called cycle sort which makes the fewest memory writes. But I am not able to find the algorithm anywhere. How do you detect whether there is a cycle in an array?
Can anybody give a complete explanation for this algorithm?
The cycle sort algorithm is motivated by something called a cycle decomposition. Cycle decompositions are best explained by example. Let's suppose that you have this array:
4 3 0 1 2
Let's imagine that we have this sequence in sorted order, as shown here:
0 1 2 3 4
How would we have to shuffle this sorted array to get to the shuffled version? Well, let's place them side-by-side:
0 1 2 3 4
4 3 0 1 2
Let's start from the beginning. Notice that the number 0 got swapped to the position initially held by 2. The number 2, in turn, got swapped to the position initially held by 4. Finally, 4 got swapped to the position initially held by 0. In other words, the elements 0, 2, and 4 all were cycled forward one position. That leaves behind the numbers 1 and 3. Notice that 1 swaps to where 3 is and 3 swaps to where 1 is. In other words, the elements 1 and 3 were cycled forward one position.
As a result of the above observations, we'd say that the sequence 4 3 0 1 2 has cycle decomposition (0 2 4)(1 3). Here, each group of terms in parentheses means "circularly cycle these elements forward." This means to cycle 0 to the spot where 2 is, 2 to the spot where 4 is, and 4 to the spot where 0 was, then to cycle 1 to the spot where 3 was and 3 to the spot where 1 is.
If you have the cycle decomposition for a particular array, you can get it back in sorted order making the fewest number of writes by just cycling everything backward one spot. The idea behind cycle sort is to try to determine what the cycle decomposition of the input array is, then to reverse it to put everything back in its place.
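To make the decomposition concrete, here is a small Python sketch that recovers the cycles for an array holding each of 0..n-1 exactly once. The function name is my own, and elements that are already in place are left out of the result:

```python
def cycle_decomposition(arr):
    """Return the cycles of a shuffled 0..n-1 array, e.g. [(0, 2, 4), (1, 3)]."""
    position = {value: i for i, value in enumerate(arr)}   # where each value ended up
    seen = set()
    cycles = []
    for start in range(len(arr)):
        cycle = []
        value = start
        while value not in seen:
            seen.add(value)
            cycle.append(value)
            value = position[value]          # the value whose original spot we now occupy
        if len(cycle) > 1:
            cycles.append(tuple(cycle))
    return cycles


print(cycle_decomposition([4, 3, 0, 1, 2]))
```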
Part of the challenge of this is figuring out where everything initially belongs since a cycle decomposition assumes you know this. Typically, cycle sort works by going to each element and counting up how many elements are smaller than it. This is expensive - it contributes to the Θ(n^2) runtime of the sorting algorithm - but doesn't require any writes.
Here's a Python implementation if anyone needs it:
def cycleSort(vector):
    """Sort a vector in place and return the number of writes."""
    writes = 0
    # Loop through the vector to find cycles to rotate.
    for cycleStart, item in enumerate(vector):
        # Find where to put the item.
        pos = cycleStart
        for item2 in vector[cycleStart + 1:]:
            if item2 < item:
                pos += 1
        # If the item is already there, this is not a cycle.
        if pos == cycleStart:
            continue
        # Otherwise, put the item there or right after any duplicates.
        while item == vector[pos]:
            pos += 1
        vector[pos], item = item, vector[pos]
        writes += 1
        # Rotate the rest of the cycle.
        while pos != cycleStart:
            # Find where to put the item.
            pos = cycleStart
            for item2 in vector[cycleStart + 1:]:
                if item2 < item:
                    pos += 1
            # Put the item there or right after any duplicates.
            while item == vector[pos]:
                pos += 1
            vector[pos], item = item, vector[pos]
            writes += 1
    return writes

x = [0, 1, 2, 2, 2, 2, 1, 9, 3.5, 5, 8, 4, 7, 0, 6]
w = cycleSort(x)
print(w, x)
