Judgecode -- Sort with swap (2) - algorithm

The problem I've seen is as below. Does anyone have an idea how to solve it?
http://judgecode.com/problems/1011
Given a permutation of integers from 0 to n - 1, sorting them is easy. But what if you can only swap a pair of integers every time?
Please calculate the minimal number of swaps.

One classic approach uses permutation cycles (https://en.wikipedia.org/wiki/Cycle_notation#Cycle_notation). The number of swaps needed equals the total number of elements minus the number of cycles.
For example:
1 2 3 4 5
2 5 4 3 1
Start with 1 and follow the cycle:
1 down to 2, 2 down to 5, 5 down to 1.
1 -> 2 -> 5 -> 1
3 -> 4 -> 3
We would need to swap index 1 with 5, then index 5 with 2, as well as index 3 with index 4. Altogether that is 3 swaps, or n - 2. We subtract the number of cycles from n because the cycle lengths sum to n and each cycle needs one swap fewer than the number of elements in it.
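As a rough illustration of the cycle-counting rule, here is a minimal Python sketch. The function name `min_swaps` and the 0-based input convention (a permutation of 0..n-1, as in the problem statement) are assumptions made for this example.

```python
def min_swaps(perm):
    """Minimum number of swaps to sort a permutation of 0..n-1.

    Counts the cycles of the permutation and returns n - cycles.
    """
    n = len(perm)
    seen = [False] * n
    cycles = 0
    for start in range(n):
        if seen[start]:
            continue
        # Walk one whole cycle, marking every index on it.
        cycles += 1
        j = start
        while not seen[j]:
            seen[j] = True
            j = perm[j]
    return n - cycles


# The example above (2 5 4 3 1), shifted to 0-based values.
print(min_swaps([1, 4, 3, 2, 0]))  # 3
```

Each index is visited once, so the sketch runs in linear time and does not modify the input.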

Here is a simple implementation in C for the above problem. The algorithm is similar to the one by user גלעד ברקן:
Store the position of every element of a[] in b[], so that b[a[i]] = i.
Iterate over the initial array a[] from left to right.
At position i, check whether a[i] is equal to i. If yes, keep iterating.
If no, it is time to swap. Read the code closely to see how the swap takes place. This is the most important step, as both arrays a[] and b[] need to be modified. Increase the count of swaps.
Here is the implementation:
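#include <stdlib.h>

long long sortWithSwap(int n, int *a) {
    /* b[v] is the current index of value v in a[] */
    int *b = (int*)malloc(sizeof(int) * n);
    int i, valai, posi;
    for (i = 0; i < n; i++) {
        b[a[i]] = i;
    }
    long long ans = 0;
    for (i = 0; i < n; i++) {
        if (a[i] != i) {
            valai = a[i];      /* value currently at position i */
            posi = b[i];       /* position where value i currently sits */
            a[posi] = valai;   /* move the displaced value there */
            a[i] = i;          /* put i in its final place */
            b[i] = i;
            b[valai] = posi;
            ans++;
        }
    }
    free(b);                   /* release the temporary position array */
    return ans;
}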

The essence of solving this problem lies in the following observations:
1. The elements in the array do not repeat.
2. The range of elements is from 0 to n-1, where n is the size of the array.
The way to approach it
Once you have understood the approach, you can solve the problem in linear time.
Imagine how the array would look after sorting all the entries.
It will satisfy arr[i] == i for all entries. Is that convincing?
First create a bool array named FIX, where FIX[i] == true if the ith location is fixed. Initialize every entry to false.
Start checking the original array for the match arr[i] == i. As long as this condition holds, everything is okay. While traversing the array, also set FIX[i] = true. The moment you find that arr[i] != i you need to do something. arr[i] must hold some value x such that x > i. How do we guarantee that? The guarantee comes from the fact that the elements in the array do not repeat: if the array is sorted up to index i, then the element at position i cannot come from the left, only from the right.
Now the value x is essentially naming an index. Why? Because the array only has elements from 0 to n-1, and in the sorted array every element i must be at location i.
What arr[i] == x means is that not only is element i not at its correct position, but element x is also missing from its place.
Now to fix the ith location you need to look at the xth location, because maybe the xth location holds i, and then you swap the elements at indices i and x and get the job done. But wait, it's not necessary that index x holds i (so that you finish fixing these locations in just 1 swap). It may be that index x holds a value y, which again will be greater than i, because the array is only sorted up to location i.
Now before you can fix position i, you need to fix x. Why? We will see later.
So now you try to fix position x in the same way, and you keep going until you see element i at some location.
That is, follow the links starting from arr[i] until you hit element i at some index.
It is guaranteed that you will hit i at some location while following the links this way. Why? Try proving it, or work through some examples, and you will get a feel for it.
Now you start fixing all the indices you saw on the path from index i to this index (say j). The path you have followed is a circular one, and for every index on it, the value that belongs there is stored at the previous index (the index from where you reached it). Once you see that, you can fix the indices and mark all of them true in the FIX array. Then go ahead with the next index of the array and do the same thing until the whole array is fixed.
This was the complete idea. To only count the number of swaps, observe that once you have found a cycle of k elements you need k - 1 swaps to fix it; after that you continue with the rest of the array. That's how you count the number of swaps.
Please let me know if you have any doubts about the approach.
You may also ask for C/C++ code help. Happy to help :-)

Related

Algorithm to sort an Array, in which every element is 10 positions away from where it should be

What is the most efficient sorting algorithm to sort an Array, that has n elements and EVERY element originally is 10 position away from its position after sorting?
I am thinking about insertion sort, but I have no clue how to prove that:
(1) It is the most efficient.
(2) The algorithm needs in worst case O(n) steps to sort the Array.
A self-conceived example: [10,11,12,13,14,15,16,17,18,19,0,1,2,3,4,5,6,7,8,9]
With these constraints there are not that many possibilities:
The value at index 0 must go to index 10 as that is the only index that is 10 positions away from index 0. And which value can move to index 0? It can only be the value that is currently at index 10. So it's a swap between indexes 0 and 10.
With the same reasoning the value at index 1 will swap with the value at index 11, and 2 with 12, 3 with 13, ... 9 with 19.
So now we have covered all indices in the range 0..19. No values outside this range will get into this range, nor will any value in this range move out of it. All movements involving these indices are already defined above.
We can repeat the same reasoning for indices in the range 20..39, and again from positions 40..59, ...etc
So we can conclude:
The array's size is necessarily a multiple of 20
Only one permutation is possible that abides by the given rules
The solution is therefore simple.
Solution in pseudo code:
sort(A):
    for i = 0 to size(A) - 1 step 20:
        swap A[i+0..i+9] with A[i+10..i+19]
In some languages the swap of such array slices can be done very efficiently.
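In Python, for instance, the block exchange can be written with slice assignment. This is only a sketch under the constraints derived above; the function name `sort_ten_away` and the length check are assumptions added for illustration.

```python
def sort_ten_away(a):
    """Sort a list in place, assuming every element sits exactly
    10 positions away from its sorted position."""
    if len(a) % 20 != 0:
        raise ValueError("length must be a multiple of 20 under these constraints")
    for i in range(0, len(a), 20):
        # Exchange the two 10-element halves of this block of 20.
        a[i:i + 10], a[i + 10:i + 20] = a[i + 10:i + 20], a[i:i + 10]
    return a


example = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(sort_ten_away(example))
```

Every element is moved exactly once, so the work is linear in the size of the array.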
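When you say 10 positions away, the actual position could be i - 10 or i + 10.
So, just make a temporary copy of the array and, for each index, take the minimum of the values 10 positions away on either side.
The reasoning is that the only clash at an index j is between the value arriving from j - 10 and the value arriving from j + 10, so taking the minimum installs the correct value at index j. Note that this is safe only for a single block of 20 elements, where each index has just one neighbour 10 positions away. In a longer array the minimum can pick the wrong neighbour: with 40 elements, index 20 would receive the value from index 10 rather than the one from index 30.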
import java.util.Arrays;

private static void solve(int[] arr){
    int[] temp = new int[arr.length];
    Arrays.fill(temp,Integer.MAX_VALUE);
    for(int i=0;i<arr.length;++i){
        if(i - 10 >= 0) temp[i - 10] = Math.min(temp[i - 10],arr[i]);
        if(i + 10 < temp.length) temp[i + 10] = Math.min(temp[i + 10],arr[i]);
    }
    for(int i=0;i<arr.length;++i) arr[i] = temp[i];
}

Algorithm - Find the first missing integer in the sequence of integers

Find the first missing integer in the sequence of integers
[4,5,1,2,6,7] missing is 3
Then when there are repeated integers
[1,2,2,2,5,8,9] still missing 3
When you also have negative
[-2,0,1,2] missing -1
[1,2,3,4,5] missing 6 or 0
Can anyone help me find a good algorithm to cover all these cases? I have an algorithm which covers the first 2 cases, but I am not sure how to cover all the cases in an effective manner.
What I consider the classic O(n) solution for this problem is to rely on the fact that the array can contain at most N unique numbers, where N is the input's length. Therefore the range for our record is restricted to N.
Since you seem to allow the expected sequence to start anywhere, including negative numbers, we can start by iterating once over the array and recording L, the lowest number seen. Now use L as an offset so that 0 + L equals the first number we expect to be present.
Initialise an array record of length (N + 1) and set each entry to false. Iterate over the input and for each entry, A[i], if (A[i] - L) is not greater than N, set record[ A[i] - L ] to true. For example:
[-2, 0, 1, 2] ->
    N = 4
    L = -2
    -2 -> -2 - (-2) = 0
       -> record[0] = true
     0 -> 0 - (-2) = 2
       -> record[2] = true
     1 -> 1 - (-2) = 3
       -> record[3] = true
     2 -> 2 - (-2) = 4
       -> record[4] = true
    record -> [true, false, true, true, true]
Now iterate over the record. Output the first entry at index i that is set to false as i + L. In our example above, this would be:
    record[1] is false
    output: 1 + (-2) -> -1
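The offset-and-record idea can be sketched in Python as follows. The function name `first_missing` is an assumption for this example, and the sketch assumes a non-empty input list.

```python
def first_missing(values):
    """First integer at or above min(values) that does not occur in values.

    Assumes values is a non-empty list of integers.
    """
    n = len(values)
    low = min(values)              # the offset L
    record = [False] * (n + 1)     # N + 1 slots always leave one slot unset
    for v in values:
        if v - low <= n:           # larger values cannot affect the answer
            record[v - low] = True
    for i, present in enumerate(record):
        if not present:
            return i + low


print(first_missing([-2, 0, 1, 2]))  # -1
```

Because at most N distinct values can be marked in N + 1 slots, the second loop always finds an unset entry, so the function returns the number just past the maximum when the input has no internal gap.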
#include <stdio.h>
#include <limits.h>

int main()
{
    int n, i;
    if (scanf("%d", &n) != 1 || n <= 0)
        return 1;
    int a[n];
    //Reading elements
    for (i = 0; i < n; i++) {
        scanf("%d", &a[i]);
    }
    int min = INT_MAX, max = INT_MIN;
    //Finding the minimum and maximum from given elements
    for (i = 0; i < n; i++) {
        if (a[i] > max)
            max = a[i];
        if (a[i] < min)
            min = a[i];
    }
    //The range min..max is inclusive, so it needs max-min+1 slots
    int len = max - min + 1, diff = 0 - min;
    //Default answer when nothing inside the range is missing
    int miss = max + 1;
    int b[len];
    //Creating a new array and assigning 0
    for (i = 0; i < len; i++)
        b[i] = 0;
    //The corresponding index value is incremented based on the given numbers
    for (i = 0; i < n; i++) {
        b[a[i] + diff]++;
    }
    //Finding the missed value
    for (i = 0; i < len; i++) {
        if (b[i] == 0) {
            miss = i - diff;
            break;
        }
    }
    printf("%d", miss);
    return 0;
}
Code Explanation:
1. Find the minimum and maximum of the given numbers.
2. Create a count array of size (maximum-minimum+1), initialized to 0, which maintains the count of the given numbers.
3. Iterate over the given elements and increment the corresponding index by 1.
4. Finally, iterate through the count array and find the first missing number.
This might help you in solving your problem. Correct me if I'm wrong.
I think it will be easy to solve this sort of problem using a data structure like TreeMap in Java, e.g.:
treeMap.put(array[i], treeMap.get(array[i]) == null ? 1 : treeMap.get(array[i]) + 1);
So you are putting keys and values into the TreeMap: the key represents the number itself (e.g. 1, 2, 3...) and the value represents its number of occurrences.
Then, taking advantage of the fact that this data structure keeps the keys sorted for us, you can loop through it and check which key is missing in the sequence, e.g.:
for key in treeMap
    if(key > currentIndex) // this is the missing digit
if(loop-completed-without-missing-key) // it's not in the array.
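The same walk over sorted distinct keys can be sketched in Python. Python has no built-in TreeMap, so this example swaps in `sorted(set(...))` as a stand-in for the ordered key set; the function name `first_gap` and the choice to return the number after the maximum when no key is missing are assumptions.

```python
def first_gap(values):
    """Walk the distinct values in sorted order and return the first
    integer that is skipped. Assumes values is non-empty."""
    keys = sorted(set(values))     # stand-in for the TreeMap's ordered keys
    expected = keys[0]
    for key in keys:
        if key > expected:         # a key was skipped: expected is missing
            return expected
        expected = key + 1
    # Loop completed without a missing key: report the next integer.
    return expected


print(first_gap([1, 2, 2, 2, 5, 8, 9]))  # 3
```

Sorting makes this O(n log n), which matches the cost of filling a tree-based map.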
Add the numbers to a running array and keep them sorted.
You may also have optional minimum and maximum bounds for the array (to handle your third case, "6 is missing even if not in array").
On examination of a new number:
- try inserting it in the sorting array.
- already present: discard
- below minimum or above maximum: nullify minimum or maximum accordingly
- otherwise add in proper position.
To handle an array: sort it, compare first and last elements to expected minimum / maximum. Nullify minimum if greater than first element, nullify maximum if smaller than last element.
There might be a special case if minimum and maximum are both below first or both above last:
min=5 max=8 array = [ 10, 11, 13 ]
Here 5, 6, 7, 8 and 12 are missing, but what about 9? Should it be considered missing?
When checking for missing numbers include:
- if minimum is not null, all numbers from minimum to first element.
- if maximum is not null, all numbers from last element to maximum.
- if (last - first + 1) = number of elements, no numbers are missing
(total numbers examined minus array size is duplicate count)
- otherwise walk the array and report all missing numbers: when
checking array[i], if array[i]-array[i-1] != 1 you have a gap.
Only the "first" missing number
You still have to manage the whole array even if you're only interested in one missing number. If you discarded part of the array and the missing number then arrived, the new missing number might well have been in the discarded part of the array.
However, you might keep track of what the smallest missing number is, and recalculate at a cost of O(log n) only when/if it arrives; then you'd be able to tell which one it is in O(1) time. To quickly zero in on that missing number, consider that there is a gap between arr[i] and arr[j] iff arr[j]-arr[i] > j-i.
So you can use the bisection method: start with i = first, j = last; if gap(i,j) then c = ceil((i+j)/2). If gap(i, c) then j = c, else i = c, and repeat until j-i = 1. At that point arr[i]+1 is your smallest missing number.
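A minimal Python sketch of this bisection, assuming `arr` is already sorted and contains no duplicates. The function name `smallest_missing` and the `None` return value for a gap-free array are assumptions for illustration.

```python
def smallest_missing(arr):
    """Smallest integer missing between arr[0] and arr[-1].

    Assumes arr is sorted and holds distinct integers.
    Returns None when there is no gap inside the array.
    """
    i, j = 0, len(arr) - 1
    if not arr or arr[j] - arr[i] == j - i:
        return None                    # no gap between first and last
    while j - i > 1:
        c = (i + j + 1) // 2           # ceil((i + j) / 2)
        if arr[c] - arr[i] > c - i:    # gap lies in the left part
            j = c
        else:                          # left part is dense, look right
            i = c
    return arr[i] + 1


print(smallest_missing([1, 2, 4, 5, 6, 7]))  # 3
```

The loop keeps the invariant that a gap exists between positions i and j and that everything up to i is dense, which is why arr[i]+1 is the answer once the two positions are adjacent.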

Divide an odd size array into two equal sets of same size and same sum after deleting any one element from the array

Given an array of odd size. You have to delete any one element from the array and then find whether it is possible to divide the remaining even size array into two sets of equal size and having same sum of their elements. It is mandatory to remove any one element from the array.
Here I am assuming that it is necessary to remove 1 element from the array.
Please look at the code snippet below.
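int solve(int idx, int s, int cntr, int val) {
    if(idx == n) {
        // exactly one element must have been deleted
        if(cntr != 1)
            return INT_MAX;
        return abs((sum-val)-2*s);
    }
    // option 1: put arr[idx] into bucket s; option 2: leave it for the other half
    int ans = min(solve(idx+1, s+arr[idx], cntr, val), solve(idx+1, s, cntr, val));
    // option 3: delete arr[idx], allowed only if nothing has been deleted yet
    if(cntr == 0)
        ans = min(ans, solve(idx+1, s, cntr+1, arr[idx]));
    return ans;
}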
Here sum is the total sum of the original array, val is the value of the element you want to delete, and cntr keeps track of whether any value has been removed from the array yet.
So the algorithm goes like this.
Forget for a moment that you need to delete a value. Then the problem becomes whether it is possible to divide the array into 2 equi-sum halves. We can think of this as dividing the array into 2 parts such that abs(sum-2*sum_of_any_half_part) is minimized. With this idea, let's say I initially have a bucket s, which is the part of the array we are concerned with. At each step we can either put the element into this part or leave it for the other part.
Now if we introduce the deletion into this problem, only one small change is required. At each step, instead of 2 options you have 3:
1. Delete this particular element, then increase cntr to 1 and set val to the value of the element at that index in the array.
2. Don't do anything with this element. This is equal to putting it into the other bucket/half.
3. Put this element into bucket s, i.e. increase the value of s by arr[idx].
Now recursively check which option gives the best result.
P.S. Look at the base case in the code snippet to get a better idea.
In the end, if the above solve function gives ans = 0, then yes, we can divide the array into 2 equi-sum parts after deleting one element.
Hope this helps.
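The recursion described in this answer tracks only the sums, while the question also asks for two sets of equal size. As a hedged illustration of the full requirement, here is a brute-force Python sketch that tries every deletion and every half-sized subset. The function name `can_split_after_deletion` is an assumption, and the approach is exponential, so it is only meant for small inputs.

```python
from itertools import combinations


def can_split_after_deletion(arr):
    """True if deleting exactly one element lets the rest be divided into
    two sets of equal size and equal sum. Assumes len(arr) is odd."""
    n = len(arr)
    total = sum(arr)
    half = (n - 1) // 2
    for skip in range(n):
        rest = arr[:skip] + arr[skip + 1:]
        target = total - arr[skip]
        if target % 2 != 0:
            continue                   # an odd remaining sum cannot be halved
        # Pick one half; the other half is whatever is left over.
        for combo in combinations(rest, half):
            if 2 * sum(combo) == target:
                return True
    return False


print(can_split_after_deletion([1, 2, 3, 4, 5]))  # True: delete 3, then {1,5} and {2,4}
```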

Disperse Duplicates in an Array

Source : Google Interview Question
Write a routine to ensure that identical elements in the input are maximally spread in the output.
Basically, we need to place the same elements in such a way that the TOTAL spreading is as large as possible.
Example:
Input: {1,1,2,3,2,3}
Possible Output: {1,2,3,1,2,3}
Total dispersion = Difference between position of 1's + 2's + 3's = 4-1 + 5-2 + 6-3 = 9 .
I am NOT AT ALL sure if there's an optimal polynomial time algorithm available for this. Also, no other detail is provided for the question other than this.
What I thought is: calculate the frequency of each element in the input, then arrange them in the output, one distinct element at a time, until all the frequencies are exhausted.
I am not sure of my approach.
Any approaches or ideas, people?
I believe this simple algorithm would work:
count the number of occurrences of each distinct element.
make a new list
add one instance of all elements that occur more than once to the list (order within each group does not matter)
add one instance of all unique elements to the list
add one instance of all elements that occur more than once to the list
add one instance of all elements that occur more than twice to the list
add one instance of all elements that occur more than thrice to the list
...
Now, this will intuitively not give a good spread:
for {1, 1, 1, 1, 2, 3, 4} ==> {1, 2, 3, 4, 1, 1, 1}
for {1, 1, 1, 2, 2, 2, 3, 4} ==> {1, 2, 3, 4, 1, 2, 1, 2}
However, I think this is the best spread you can get given the scoring function provided.
Since the dispersion score counts the sum of the distances instead of the squared sum of the distances, you can have several duplicates close together, as long as you have a large gap somewhere else to compensate.
For a sum-of-squared-distances score, the problem becomes harder.
Perhaps the interview question hinged on the candidate recognizing this weakness in the scoring function?
In Perl
@a=(9,9,9,2,2,2,1,1,1);
then make a hash table of the counts of different numbers in the list, like a frequency table
map { $x{$_}++ } @a;
then repeatedly walk through all the keys found, with the keys in a known order, and add the appropriate number of individual numbers to an output list until all the keys are exhausted
@r=();
$g=1;
while( $g == 1 ) {
    $g=0;
    for my $n (sort keys %x)
    {
        # emit one copy of every key that still has copies left
        if ($x{$n}>0) {
            push @r, $n;
            $x{$n}--;
            $g=1
        }
    }
}
I'm sure that this could be adapted to any programming language that supports hash tables.
Python code for the algorithm suggested by Vorsprung and HugoRune:
from collections import Counter, defaultdict

def max_spread(data):
    cnt = Counter(data)
    res, num = [], list(cnt)
    # emit one copy of every remaining value per round
    while len(cnt) > 0:
        for i in num:
            if cnt[i] > 0:
                res.append(i)
                cnt[i] -= 1
                if cnt[i] == 0: del cnt[i]
    return res

def calc_spread(data):
    # positions of every value; the score is last minus first position
    d = defaultdict(list)
    for i, v in enumerate(data):
        d[v].append(i)
    return sum([max(x) - min(x) for _, x in d.items()])
HugoRune's answer takes some advantage of the unusual scoring function but we can actually do even better: suppose there are d distinct non-unique values, then the only thing that is required for a solution to be optimal is that the first d values in the output must consist of these in any order, and likewise the last d values in the output must consist of these values in any (i.e. possibly a different) order. (This implies that all unique numbers appear between the first and last instance of every non-unique number.)
The relative order of the first copies of non-unique numbers doesn't matter, and likewise nor does the relative order of their last copies. Suppose the values 1 and 2 both appear multiple times in the input, and that we have built a candidate solution obeying the condition I gave in the first paragraph that has the first copy of 1 at position i and the first copy of 2 at position j > i. Now suppose we swap these two elements. Element 1 has been pushed j - i positions to the right, so its score contribution will drop by j - i. But element 2 has been pushed j - i positions to the left, so its score contribution will increase by j - i. These cancel out, leaving the total score unchanged.
Now, any permutation of elements can be achieved by swapping elements in the following way: swap the element in position 1 with the element that should be at position 1, then do the same for position 2, and so on. After the ith step, the first i elements of the permutation are correct. We know that every swap leaves the scoring function unchanged, and a permutation is just a sequence of swaps, so every permutation also leaves the scoring function unchanged! This is true for the d elements at both ends of the output array.
When 3 or more copies of a number exist, only the position of the first and last copy contribute to the distance for that number. It doesn't matter where the middle ones go. I'll call the elements between the 2 blocks of d elements at either end the "central" elements. They consist of the unique elements, as well as some number of copies of all those non-unique elements that appear at least 3 times. As before, it's easy to see that any permutation of these "central" elements corresponds to a sequence of swaps, and that any such swap will leave the overall score unchanged (in fact it's even simpler than before, since swapping two central elements does not even change the score contribution of either of these elements).
This leads to a simple O(n log n) algorithm (or O(n) if you use bucket sort for the first step) to generate a solution array Y from a length-n input array X:
Sort the input array X.
Use a single pass through X to count the number of distinct non-unique elements. Call this d.
Set i, j and k to 0.
While i < n:
    If i+1 < n and X[i+1] == X[i], we have a non-unique element:
        Set Y[j] = Y[n-j-1] = X[i].
        Increment i twice, and increment j once.
        While i < n and X[i] == X[i-1]:
            Set Y[d+k] = X[i].
            Increment i and k.
    Otherwise we have a unique element:
        Set Y[d+k] = X[i].
        Increment i and k.
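The steps of this answer translate fairly directly into Python. The sketch below is one possible rendering; the names `disperse` and `spread` are assumptions, and `Counter` is used to obtain d instead of a hand-written pass.

```python
from collections import Counter


def disperse(values):
    """Arrange values so the summed distance between the first and last
    copy of each value is maximal (sum-of-distances score)."""
    x = sorted(values)
    n = len(x)
    d = sum(1 for c in Counter(x).values() if c > 1)  # distinct non-unique values
    y = [None] * n
    i = j = k = 0
    while i < n:
        if i + 1 < n and x[i + 1] == x[i]:
            # Non-unique value: one copy at each end of the output.
            y[j] = y[n - j - 1] = x[i]
            i += 2
            j += 1
            # Any further copies go into the central region.
            while i < n and x[i] == x[i - 1]:
                y[d + k] = x[i]
                i += 1
                k += 1
        else:
            # Unique value: central region.
            y[d + k] = x[i]
            i += 1
            k += 1
    return y


def spread(seq):
    """Sum over distinct values of (last index - first index)."""
    first, last = {}, {}
    for idx, v in enumerate(seq):
        first.setdefault(v, idx)
        last[v] = idx
    return sum(last[v] - first[v] for v in first)


result = disperse([1, 1, 2, 3, 2, 3])
print(result, spread(result))  # [1, 2, 3, 3, 2, 1] 9
```

The score of 9 matches the total dispersion given in the question for the same input.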

Interview puzzle: Jump Game

Jump Game:
Given an array, start from the first element and reach the last by jumping. The jump length can be at most the value at the current position in the array. The optimum result is when you reach the goal in minimum number of jumps.
What is an algorithm for finding the optimum result?
An example: given array A = {2,3,1,1,4} the possible ways to reach the end (index list) are
0,2,3,4 (jump 2 to index 2, then jump 1 to index 3 then 1 to index 4)
0,1,4 (jump 1 to index 1, then jump 3 to index 4)
Since second solution has only 2 jumps it is the optimum result.
Overview
Given your array a and the index of your current position i, repeat the following until you reach the last element.
Consider all candidate "jump-to elements" in a[i+1] to a[a[i] + i]. For each such element at index e, calculate v = a[e] + e. If one of the elements is the last element, jump to the last element. Otherwise, jump to the element with the maximal v.
More simply put, of the elements within reach, look for the one that will get you furthest on the next jump. We know this selection, x, is the right one because compared to every other element y you can jump to, the elements reachable from y are a subset of the elements reachable from x (except for elements from a backward jump, which are obviously bad choices).
This algorithm runs in O(n) because each element need be considered only once (elements that would be considered a second time can be skipped).
Example
Consider the array of values a, indices i, and sums of index and value v.
i ->   0   1   2   3   4   5   6   7   8   9  10  11  12
a -> [ 4, 11,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1]
v ->   4  12   3   4   5   6   7   8   9  10  11  12  13
Start at index 0 and consider the next 4 elements. Find the one with maximal v. That element is at index 1, so jump to 1. Now consider the next 11 elements. The goal is within reach, so jump to the goal.
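A simple Python sketch of this greedy selection follows. The function name `greedy_jump_path` and the `ValueError` for unreachable goals are assumptions. For clarity it rescans each window of candidates rather than skipping elements already considered, so it does not by itself demonstrate the O(n) bound.

```python
def greedy_jump_path(a):
    """Indices visited when always jumping to the reachable element e
    that maximises e + a[e]. Assumes a is non-empty."""
    n = len(a)
    path = [0]
    i = 0
    while i < n - 1:
        reach = i + a[i]
        if reach >= n - 1:
            path.append(n - 1)         # the goal is within reach
            break
        if reach == i:
            raise ValueError("the last element cannot be reached")
        # Among the candidates i+1..reach, pick the one that gets furthest.
        i = max(range(i + 1, reach + 1), key=lambda e: e + a[e])
        path.append(i)
    return path


print(greedy_jump_path([4, 11, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]))  # [0, 1, 12]
```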
Dynamic programming.
Imagine you have an array B where B[i] is the minimum number of steps needed to reach index i in your array A. Your answer is then in B[n], given that A has n elements and indices start from 1. Let C[i]=j mean that you jumped from index j to index i (this is used to recover the path later).
So, the algorithm is the following:
set B[i] to infinity for all i
B[1] = 0;                    <-- zero steps to reach B[1]
for i = 1 to n-1             <-- Each step updates possible jumps from A[i]
    for j = 1 to A[i]        <-- Possible jump sizes are 1, 2, ..., A[i]
        if i+j > n           <-- Array boundary check
            break
        if B[i+j] > B[i]+1   <-- If this path to B[i+j] was shorter than previous
            B[i+j] = B[i]+1  <-- Keep the shortest path value
            C[i+j] = i       <-- Keep the path itself
The number of jumps needed is B[n]. The path is recovered backwards from the goal:
n -> C[n] -> C[C[n]] -> C[C[C[n]]] -> ... -> 1
which can be restored by a simple loop and then reversed.
The algorithm is of O(min(k,n)*n) time complexity and O(n) space complexity. n is the number of elements in A and k is the maximum value inside the array.
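For reference, the dynamic programme can be sketched in Python with 0-based indices. The function name `min_jumps_dp` and the `None` result for an unreachable goal are assumptions; `b` and `c` play the roles of B and C.

```python
import math


def min_jumps_dp(a):
    """Return (number of jumps, path) to the last index, or None if the
    last index is unreachable. Assumes a is non-empty."""
    n = len(a)
    b = [math.inf] * n     # b[i]: fewest jumps needed to reach index i
    c = [None] * n         # c[i]: index we jumped from to reach i
    b[0] = 0
    for i in range(n - 1):
        for j in range(1, a[i] + 1):
            if i + j >= n:             # array boundary check
                break
            if b[i + j] > b[i] + 1:
                b[i + j] = b[i] + 1
                c[i + j] = i
    if b[n - 1] == math.inf:
        return None
    # Recover the path backwards from the goal, then reverse it.
    path = [n - 1]
    while path[-1] != 0:
        path.append(c[path[-1]])
    path.reverse()
    return b[n - 1], path


print(min_jumps_dp([2, 3, 1, 1, 4]))  # (2, [0, 1, 4])
```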
Note
I am keeping this answer, but cheeken's greedy algorithm is correct and more efficient.
Construct a directed graph from the array. eg: i->j if |i-j|<=x[i] (Basically, if you can move from i to j in one hop have i->j as an edge in the graph). Now, find the shortest path from first node to last.
FWIW, you can use Dijkstra's algorithm to find the shortest route. Complexity is O( | E | + | V | log | V | ). Since | E | < n^2, this becomes O(n^2).
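We can keep track of the farthest index reachable so far; while scanning, whenever i + nums[i] is larger than far, we update far. If we arrive at an index beyond far, the end is unreachable. Note that this checks whether the last index can be reached; it does not count jumps.
Simple O(n) time complexity solution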
public boolean canJump(int[] nums) {
    int far = 0;
    for(int i = 0; i<nums.length; i++){
        if(i <= far){
            far = Math.max(far, i+nums[i]);
        }
        else{
            return false;
        }
    }
    return true;
}
Start from the left end and traverse until a number is the same as its index; use the maximum of such numbers. For example, if the list is
list: 2738|4|6927
index: 0123|4|5678
once you've got this, repeat the above step from this number until you reach the extreme right.
273846927
000001234
In case you don't find anything matching the index, use the digit with the farthest index and a value greater than the index, in this case 7. (Because pretty soon the index will be greater than the number, you can probably just count for 9 indices.)
basic idea:
start building the path from the end to the start by finding all array elements from which it is possible to make the last jump to the target element (all i such that A[i] >= target - i).
treat each such i as the new target and find a path to it (recursively).
choose the minimal length path found, append the target, return.
simple example in python:
ls1 = [2,3,1,1,4]
ls2 = [4,11,1,1,1,1,1,1,1,1,1,1,1]

# finds the shortest path in ls to the target index tgti
def find_path(ls,tgti):
    # if the target is the first element in the array, return its index.
    if tgti<= 0:
        return [0]
    # for each 0 <= i < tgti, if it is possible to reach
    # tgti from i (ls[i] >= tgti-i) then find the path to i
    sub_paths = [find_path(ls,i) for i in range(tgti-1,-1,-1) if ls[i] >= tgti-i]
    # find the minimum length path in sub_paths
    min_res = sub_paths[0]
    for p in sub_paths:
        if len(p) < len(min_res):
            min_res = p
    # add current target to the chosen path
    min_res.append(tgti)
    return min_res

print(find_path(ls1,len(ls1)-1))
print(find_path(ls2,len(ls2)-1))
>>>[0, 1, 4]
>>>[0, 1, 12]

Resources