lexicographical Smallest permutation in matrix with restricted exchange - algorithm

Following question was asked in Facebook Job interview aptitude test:
A permutation is a list of K numbers, each between 1 and K (both inclusive),
that has no duplicate elements.
Permutation X is lexicographically smaller than Permutation Y iff for some
i <= K:
All of the first i-1 elements of X are equal to first i-1 elements of Y.
ith element of X is smaller than ith element of Y.
You are given a permutation P, you can exchange some of its elements as many
times as you want in any order. You have to find the lexicographically smallest
Permutation that you can obtain from P.
K is less than 101.
Input Format:
First line of input contains K being the size of permutation.
Next line contains K single spaced numbers indicating the permutation.
Each of next K lines contains K characters, character j of line i is equal to
'Y' if you can exchange ith and jth element of a permutation, and
'N' otherwise.
Output Format:
Print K numbers with a space separating each of them indicating the
permutation.
Sample Input
3
3 1 2
NNY
NNN
YNN
Sample Output
2 1 3
Sample Input
3
3 2 1
NYN
YNY
NYN
Sample Output
1 2 3
In the first example you can exchange first element with last element to
obtain 2 1 3 from 3 1 2.
What I did?
I generated all the permutations First.
Then, I discarded those permutations that are not feasible.
In Example 1: 1 3 3 is not feasible as position 1 and 2 is nonexchangeable.
From all the permissible permutation lists, i picked the lexicographical smallest one as the solution.
Problem with above solution:
My solution works perfectly for K<=25. When the size of K becomes greater than 25, the solution is really slow. For K=100, I did n't get the output even in 60 minutes.
My Questions here are:
How Should I optimize my solution?
Can it be done without generating all permutations?
Better Solution with Explanations and Pseudo-Code(or Code) will be highly helpful.
Thank you!

My solution works perfectly for K<=25. When the size of K becomes greater than 25, the solution is really slow.
Your solution will work very slow. As you are generating all permutations, so overall complexity for generating permutations is:
O(2^K).
Hence, O(2^K) will take a year as K can be as large as 100.
Can it be done without generating all permutations?
Yes, it can be done without generating all permutations.
How Should I optimize my solution?
You can solve this problem in linear time using(DFS and Connected Component) concept in graph theory.
Please note we will take second example for explaining the steps (involved in the algorithm) which I am going to descibe.
Step 1:
Construct a graph G with K-1 Vertices.
Thus V={0,1,2}
Step 2:
Let an edge e connect two vertices whenever swapping the elements at those two positions is permissible.
Therefore the edges are: E={(0,1) , (1,0) , (1,2) , (2,1)}
Step 3:
Find all the Connected Components(CC) of this graph G(V,E).
In example 2:
All the CC are:
CC1: {0, 1, 2}
Step 4:
For each of the connected components, sort the elements available within that connected component in such a way that smallest index within the connected component gets smallest element, second smallest index gets second smallest element, etc.
In example 2:
Smallest index in CC1 = 0
Smallest index in CC1 = 1
Smallest index in CC1 = 2
Smallest index 0 in CC1 gets the smallest element. Smallest
element=1.
Second smaller index in CC1 gets the second smallest element. Second
smallest index =2.
Third smaller index in CC1 gets the Third smallest element. Third
smallest index =3.
Thus, The result after sorting CC1 as per above rule is (1,2,3).
When Step 4 is done for all connected components, we have the lowest possible permutation.
Therefore, 1 2 3 is the lexicographically smallest permutation in example 2.
Pseudo-Code(or Code) will be highly helpful.
As I had already described the logic, here is the Code in C++:
vector<int>TMP_IP;
char Adjacency[LIMIT][LIMIT];
vector<vector<int> >ADJ_vector(LIMIT);
int MARKED[LIMIT];
vector<int>connected_COMPONENTS;
vector<int>Positions_vector;
void DepthFirstTraversal(int u)
{
MARKED[u]=1;
connected_COMPONENTS.push_back(u);
for(int j=0;j<ADJ_vector[u].size();++j)
if(!MARKED[ADJ_vector[u][j]] )
DepthFirstTraversal(ADJ_vector[u][j]);
}
//Print result
void lexo_smallest(int K)
{
for(int i=0;i<K;++i)
cout<<TMP_IP[i]<<" ";
cout<<endl;
}
int main()
{
int K,ip;
string rows[109];
scanf("%d",&K);
for(int i=0;i<K;++i)
{
scanf("%d",&ip);
TMP_IP.push_back(ip);
}
for(int i=0;i<K;++i)
cin>>rows[i];
for(int i=0;i<K;++i)
for(int j=0;j<rows[i].size();++j)
Adjacency[i][j]=rows[i][j];
for(int i=0;i<K;++i)
for(int j=0;j<K;++j)
if(Adjacency[i][j]=='Y')
ADJ_vector[i].push_back(j);
for( int i = 0 ; i <K ; ++i )
{
if( !MARKED[ i ] )
{
DepthFirstTraversal( i );
for(int x=0;x<connected_COMPONENTS.size();++x)
{
Positions_vector.push_back(TMP_IP[connected_COMPONENTS[x]]);
}
sort(connected_COMPONENTS.begin(),connected_COMPONENTS.end());
sort(Positions_vector.begin(),Positions_vector.end());
for(int x=0;x<connected_COMPONENTS.size();++x)
{
TMP_IP[connected_COMPONENTS[x]]=Positions_vector[x];
}
connected_COMPONENTS.clear();
Positions_vector.clear();
}
}
lexo_smallest(K);
return 0;
}
DEMO # IDEONE
Complexity of the above solution:
The total time for taking input is O(K^2).
Complexity of above algorithm is same as DFS. O(V+E).
Overall time: O(K^2)+O(V+E)
Even for K=5000, the above solution is amazingly fast.

In this you first need to form a new matrix in which if a(i,j) is Y then this means that element at jth position can come at ith place.
This task can be done easily by applying Floyd Warshall's Algorithm, By replacing all Y with 1 and all N with infinity(or a very large number). So after applying Floyd Warshall if any element a(i,j) is less than infinity then element at jth position can be placed at position i.
Now the task is easy. Choose the elements greedily i.e. for each i you are given a list of element that you can exchange places with. So now go sequentially from position 1 to end and for each i, find the element with minimum value (index j) for which a(i,j) is Y (i.e. less than infinity) and swap i and j.

You can do it location by location basis.
For the left-most location find the smallest number you can find and swap it to the location 1. This can be done by checking if a path from location of 1 to first location exists and so on.

Related

Maximize number of zigzag sequence in an array

I want to maximize number of zigzag sequence in an array(without reordering).
I've a main array of random sequence of integers.I want a sub-array of index of main array that has zigzag pattern.
A sequence of integers is called zigzag sequence if each of its elements is either strictly less or strictly greater than its neighbors(and two adjacent of neighbors).
Example : The sequence 4 2 3 1 5 2 forms a zigzag, but 7 3 5 5 2 and 3 8 6 4 5
and 4 2 3 1 5 3 don't.
For a given array of integers we need to find (contiguous) sub-array of indexes that forms a zigzag sequence.
Can this be done in O(N) ?
Yes, this would seem to be solvable in O(n) time. I'll describe the algorithm as a dynamic program.
Setup
Let the array containing potential zig-zags be called Z.
Let U be an array such that len(U) == len(Z), and U[i] is an integer representing the largest contiguous left-to-right subsequence starting at i that is a zig-zag such that Z[i] < Z[i+1] (it zigs up).
Let D be similar to U, except that D[i] is an integer representing the largest contiguous left-to-right subsequence starting at i that is a zig-zag such that Z[i] > Z[i+1] (it zags down).
Subproblem
The subproblem is to find both U[i] and D[i] at each i. This can be done as follows:
U[i] = {
1 + D[i+1] if i < i+1
0 otherwise
}
L[i] = {
1 + U[i+1] if i > i+1
0 otherwise
}
The top version says that if we're looking for the largest sequence beginning with an up-zig, we see if the next element is larger (goes up), and then add a single zig to the size of the next down-zag sequence. The next one is the reverse.
Base Cases
If i == len(Z) (it is the last element), U[i] = L[i] = 0. The last element cannot have a left-to-right sequence after it because there is nothing after it.
Solution
To get the solution, first we find max(U[i]) and max(L[i]) for every i. Then get the maximum of those two values, store i, and store the length of this largest zig-zag (in a variable called length). The sequence begins at index i and ends at index i + length.
Runtime
There are n indexes, so there are 2n subproblems between U and L. Each subproblem takes O(1) time to solve, given that solutions to previously solved subproblems are memoized. Finally, iterating through U and L to get the final answer takes O(2n) time.
We thus have O(2n) + O(2n) time, or O(n).
This may be an overly complex solution, but it demonstrates that it can be done in O(n).

Number of different marks

I came across an interesting problem and I can't solve it in a good complexity (better than O(qn)):
There are n persons in a row. Initially every person in this row has some value - lets say that i-th person has value a_i. These values are pairwise distinct.
Every person gets a mark. There are two conditions:
If a_i < a_j then j-th person cant get worse mark than i-th person.
If i < j then j-th person can't get worse mark than i-th person (this condition tells us that sequence of marks is non-decreasing sequence).
There are q operations. In every operation two person are swapped (they swap their values).
After each operation you have tell what is maximal number of diffrent marks that these n persons can get.
Do you have any idea?
Consider any two groups, J and I (j < i and a_j < a_i for all j and i). In any swap scenario, a_i is the new max for J and a_j is the new min for I, and J gets extended to the right at least up to and including i.
Now if there was any group of is to the right of i whos values were all greater than the values in the left segment of I up to i, this group would not have been part of I, but rather its own group or part of another group denoting a higher mark.
So this kind of swap would reduce the mark count by the count of groups between J and I and merge groups J up to I.
Now consider an in-group swap. The only time a mark would be added is if a_i and a_j (j < i), are the minimum and maximum respectively of two adjacent segments, leading to the group splitting into those two segments. Banana123 showed in a comment below that this condition is not sufficient (e.g., 3,6,4,5,1,2 => 3,1,4,5,6,2). We can address this by also checking before the switch that the second smallest i is greater than the second largest j.
Banana123 also showed in a comment below that more than one mark could be added in this instance, for example 6,2,3,4,5,1. We can handle this by keeping in a segment tree a record of min,max and number of groups, which correspond with a count of sequential maxes.
Example 1:
(1,6,1) // (min, max, group_count)
(3,6,1) (1,4,1)
(6,6,1) (3,5,1) (4,4,1) (1,2,1)
6 5 3 4 2 1
Swap 2 and 5. Updates happen in log(n) along the intervals containing 2 and 5.
To add group counts in a larger interval the left group's max must be lower than the right group's min. But if it's not, as in the second example, we must check one level down in the tree.
(1,6,1)
(2,6,1) (1,5,1)
(6,6,1) (2,3,2) (4,4,1) (1,5,1)
6 2 3 4 5 1
Swap 1 and 6:
(1,6,6)
(1,3,3) (4,6,3)
(1,1,1) (2,3,2) (4,4,1) (5,6,2)
1 2 3 4 5 6
Example 2:
(1,6,1)
(3,6,1) (1,4,1)
(6,6,1) (3,5,1) (4,4,1) (1,2,1)
6 5 3 4 2 1
Swap 1 and 6. On the right side, we have two groups where the left group's max is greater than the right group's min, (4,4,1) (2,6,2). To get an accurate mark count, we go down a level and move 2 into 4's group to arrive at a count of two marks. A similar examination is then done in the level before the top.
(1,6,3)
(1,5,2) (2,6,2)
(1,1,1) (3,5,1) (4,4,1) (2,6,2)
1 5 3 4 2 6
Here's an O(n log n) solution:
If n = 0 or n = 1, then there are n distinct marks.
Otherwise, consider the two "halves" of the list, LEFT = [1, n/2] and RIGHT = [n/2 + 1, n]. (If the list has an odd number of elements, the middle element can go in either half, it doesn't matter.)
Find the greatest value in LEFT — call it aLEFT_MAX — and the least value in the second half — call it aRIGHT_MIN.
If aLEFT_MAX < aRIGHT_MIN, then there's no need for any marks to overlap between the two, so you can just recurse into each half and return the sum of the two results.
Otherwise, we know that there's some segment, extending at least from LEFT_MAX to RIGHT_MIN, where all elements have to have the same mark.
To find the leftmost extent of this segment, we can scan leftward from RIGHT_MIN down to 1, keeping track of the minimum value we've seen so far and the position of the leftmost element we've found to be greater than some further-rightward value. (This can actually be optimized a bit more, but I don't think we can improve the algorithmic complexity by doing so, so I won't worry about that.) And, conversely to find the rightmost extent of the segment.
Suppose the segment in question extends from LEFTMOST to RIGHTMOST. Then we just need to recursively compute the number of distinct marks in [1, LEFTMOST) and in (RIGHTMOST, n], and return the sum of the two results plus 1.
I wasn't able to get a complete solution, but here are a few ideas about what can and can't be done.
First: it's impossible to find the number of marks in O(log n) from the array alone - otherwise you could use your algorithm to check if the array is sorted faster than O(n), and that's clearly impossible.
General idea: spend O(n log n) to create any additional data which would let you to compute number of marks in O(log n) time and said data can be updated after a swap in O(log n) time. One possibly useful piece to include is the current number of marks (i.e. finding how number of marks changed may be easier than to compute what it is).
Since update time is O(log n), you can't afford to store anything mark-related (such as "the last person with the same mark") for each person - otherwise taking an array 1 2 3 ... n and repeatedly swapping first and last element would require you to update this additional data for every element in the array.
Geometric interpretation: taking your sequence 4 1 3 2 5 7 6 8 as an example, we can draw points (i, a_i):
|8
+---+-
|7 |
| 6|
+-+---+
|5|
-------+-+
4 |
3 |
2|
1 |
In other words, you need to cover all points by a maximal number of squares. Corollary: exchanging points from different squares a and b reduces total number of squares by |a-b|.
Index squares approach: let n = 2^k (otherwise you can add less than n fictional persons who will never participate in exchanges), let 0 <= a_i < n. We can create O(n log n) objects - "index squares" - which are "responsible" for points (i, a_i) : a*2^b <= i < (a+1)*2^b or a*2^b <= a_i < (a+1)*2^b (on our plane, this would look like a cross with center on the diagonal line a_i=i). Every swap affects only O(log n) index squares.
The problem is, I can't find what information to store for each index square so that it would allow to find number of marks fast enough? all I have is a feeling that such approach may be effective.
Hope this helps.
Let's normalize the problem first, so that a_i is in the range of 0 to n-1 (can be achieved in O(n*logn) by sorting a, but just hast to be done once so we are fine).
function normalize(a) {
let b = [];
for (let i = 0; i < a.length; i++)
b[i] = [i, a[i]];
b.sort(function(x, y) {
return x[1] < y[1] ? -1 : 1;
});
for (let i = 0; i < a.length; i++)
a[b[i][0]] = i;
return a;
}
To get the maximal number of marks we can count how many times
i + 1 == mex(a[0..i]) , i integer element [0, n-1]
a[0..1] denotes the sub-array of all the values from index 0 to i.
mex() is the minimal exclusive, which is the smallest value missing in the sequence 0, 1, 2, 3, ...
This allows us to solve a single instance of the problem (ignoring the swaps for the moment) in O(n), e.g. by using the following algorithm:
// assuming values are normalized to be element [0,n-1]
function maxMarks(a) {
let visited = new Array(a.length + 1);
let smallestMissing = 0, marks = 0;
for (let i = 0; i < a.length; i++) {
visited[a[i]] = true;
if (a[i] == smallestMissing) {
smallestMissing++;
while (visited[smallestMissing])
smallestMissing++;
if (i + 1 == smallestMissing)
marks++;
}
}
return marks;
}
If we swap the values at indices x and y (x < y) then the mex for all values i < x and i > y doesn't change, although it is an optimization, unfortunately that doesn't improve complexity and it is still O(qn).
We can observe that the hits (where mark is increased) are always at the beginning of an increasing sequence and all matches within the same sequence have to be a[i] == i, except for the first one, but couldn't derive an algorithm from it yet:
0 6 2 3 4 5 1 7
*--|-------|*-*
3 0 2 1 4 6 5 7
-|---|*-*--|*-*

Generate a random integer from 0 to N-1 which is not in the list

You are given N and an int K[].
The task at hand is to generate a equal probabilistic random number between 0 to N-1 which doesn't exist in K.
N is strictly a integer >= 0.
And K.length is < N-1. And 0 <= K[i] <= N-1. Also assume K is sorted and each element of K is unique.
You are given a function uniformRand(int M) which generates uniform random number in the range 0 to M-1 And assume this functions's complexity is O(1).
Example:
N = 7
K = {0, 1, 5}
the function should return any random number { 2, 3, 4, 6 } with equal
probability.
I could get a O(N) solution for this : First generate a random number between 0 to N - K.length. And map the thus generated random number to a number not in K. The second step will take the complexity to O(N). Can it be done better in may be O(log N) ?
You can use the fact that all the numbers in K[] are between 0 and N-1 and they are distinct.
For your example case, you generate a random number from 0 to 3. Say you get a random number r. Now you conduct binary search on the array K[].
Initialize i = K.length/2.
Find K[i] - i. This will give you the number of numbers missing from the array in the range 0 to i.
For example K[2] = 5. So 3 elements are missing from K[0] to K[2] (2,3,4)
Hence you can decide whether you have to conduct the remaining search in the first part of array K or the next part. This is because you know r.
This search will give you a complexity of log(K.length)
EDIT: For example,
N = 7
K = {0, 1, 4} // modified the array to clarify the algorithm steps.
the function should return any random number { 2, 3, 5, 6 } with equal probability.
Random number generated between 0 and N-K.length = random{0-3}. Say we get 3. Hence we require the 4th missing number in array K.
Conduct binary search on array K[].
Initial i = K.length/2 = 1.
Now we see K[1] - 1 = 0. Hence no number is missing upto i = 1. Hence we search on the latter part of the array.
Now i = 2. K[2] - 2 = 4 - 2 = 2. Hence there are 2 missing numbers up to index i = 2. But we need the 4th missing element. So we again have to search in the latter part of the array.
Now we reach an empty array. What should we do now? If we reach an empty array between say K[j] & K[j+1] then it simply means that all elements between K[j] and K[j+1] are missing from the array K.
Hence all elements above K[2] are missing from the array, namely 5 and 6. We need the 4th element out of which we have already discarded 2 elements. Hence we will choose the second element which is 6.
Binary search.
The basic algorithm:
(not quite the same as the other answer - the number is only generated at the end)
Start in the middle of K.
By looking at the current value and it's index, we can determine the number of pickable numbers (numbers not in K) to the left.
Similarly, by including N, we can determine the number of pickable numbers to the right.
Now randomly go either left or right, weighted based on the count of pickable numbers on each side.
Repeat in the chosen subarray until the subarray is empty.
Then generate a random number in the range consisting of the numbers before and after the subarray in the array.
The running time would be O(log |K|), and, since |K| < N-1, O(log N).
The exact mathematics for number counts and weights can be derived from the example below.
Extension with K containing a bigger range:
Now let's say (for enrichment purposes) K can also contain values N or larger.
Then, instead of starting with the entire K, we start with a subarray up to position min(N, |K|), and start in the middle of that.
It's easy to see that the N-th position in K (if one exists) will be >= N, so this chosen range includes any possible number we can generate.
From here, we need to do a binary search for N (which would give us a point where all values to the left are < N, even if N could not be found) (the above algorithm doesn't deal with K containing values greater than N).
Then we just run the algorithm as above with the subarray ending at the last value < N.
The running time would be O(log N), or, more specifically, O(log min(N, |K|)).
Example:
N = 10
K = {0, 1, 4, 5, 8}
So we start in the middle - 4.
Given that we're at index 2, we know there are 2 elements to the left, and the value is 4, so there are 4 - 2 = 2 pickable values to the left.
Similarly, there are 10 - (4+1) - 2 = 3 pickable values to the right.
So now we go left with probability 2/(2+3) and right with probability 3/(2+3).
Let's say we went right, and our next middle value is 5.
We are at the first position in this subarray, and the previous value is 4, so we have 5 - (4+1) = 0 pickable values to the left.
And there are 10 - (5+1) - 1 = 3 pickable values to the right.
We can't go left (0 probability). If we go right, our next middle value would be 8.
There would be 2 pickable values to the left, and 1 to the right.
If we go left, we'd have an empty subarray.
So then we'd generate a number between 5 and 8, which would be 6 or 7 with equal probability.
This can be solved by basically solving this:
Find the rth smallest number not in the given array, K, subject to
conditions in the question.
For that consider the implicit array D, defined by
D[i] = K[i] - i for 0 <= i < L, where L is length of K
We also set D[-1] = 0 and D[L] = N
We also define K[-1] = 0.
Note, we don't actually need to construct D. Also note that D is sorted (and all elements non-negative), as the numbers in K[] are unique and increasing.
Now we make the following claim:
CLAIM: To find the rth smallest number not in K[], we need to find right most occurrence of r' in D (which occurs at position defined by j), where r' is the largest number in D, which is < r. Such an r' exists, because D[-1] = 0. Once we find such an r' (and j), the number we are looking for is r-r' + K[j].
Proof: Basically the definition of r' and j tells us that there are exactlyr' numbers missing from 0 to K[j], and more than r numbers missing from 0 to K[j+1]. Thus all the numbers from K[j]+1 to K[j+1]-1 are missing (and these missing are at least r-r' in number), and the number we seek is among them, given by K[j] + r-r'.
Algorithm:
In order to find (r',j) all we need to do is a (modified) binary search for r in D, where we keep moving to the left even if we find r in the array.
This is an O(log K) algorithm.
If you are running this many times, it probably pays to speed up your generation operation: O(log N) time just isn't acceptable.
Make an empty array G. Starting at zero, count upwards while progressing through the values of K. If a value isn't in K add it to G. If it is in K don't add it and progress your K pointer. (This relies on K being sorted.)
Now you have an array G which has only acceptable numbers.
Use your random number generator to choose a value from G.
This requires O(N) preparatory work and each generation happens in O(1) time. After N look-ups the amortized time of all operations is O(1).
A Python mock-up:
import random
class PRNG:
def __init__(self, K,N):
self.G = []
kptr = 0
for i in range(N):
if kptr<len(K) and K[kptr]==i:
kptr+=1
else:
self.G.append(i)
def getRand(self):
rn = random.randint(0,len(self.G)-1)
return self.G[rn]
prng=PRNG( [0,1,5], 7)
for i in range(20):
print prng.getRand()

Minimal number of swaps?

There are N characters in a string of types A and B in the array (same amount of each type). What is the minimal number of swaps to make sure that no two adjacent chars are same if we can only swap two adjacent characters ?
For example, input is:
AAAABBBB
The minimal number of swaps is 6 to make the array ABABABAB. But how would you solve it for any kind of input ? I can only think of O(N^2) solution. Maybe some kind of sort ?
If we need just to count swaps, then we can do it with O(N).
Let's assume for simplicity that array X of N elements should become ABAB... .
GetCount()
swaps = 0, i = -1, j = -1
for(k = 0; k < N; k++)
if(k % 2 == 0)
i = FindIndexOf(A, max(k, i))
X[k] <-> X[i]
swaps += i - k
else
j = FindIndexOf(B, max(k, j))
X[k] <-> X[j]
swaps += j - k
return swaps
FindIndexOf(element, index)
while(index < N)
if(X[index] == element) return index
index++
return -1; // should never happen if count of As == count of Bs
Basically, we run from left to right, and if a misplaced element is found, it gets exchanged with the correct element (e.g. abBbbbA** --> abAbbbB**) in O(1). At the same time swaps are counted as if the sequence of adjacent elements would be swapped instead. Variables i and j are used to cache indices of next A and B respectively, to make sure that all calls together of FindIndexOf are done in O(N).
If we need to sort by swaps then we cannot do better than O(N^2).
The rough idea is the following. Let's consider your sample: AAAABBBB. One of Bs needs O(N) swaps to get to the A B ... position, another B needs O(N) to get to A B A B ... position, etc. So we get O(N^2) at the end.
Observe that if any solution would swap two instances of the same letter, then we can find a better solution by dropping that swap, which necessarily has no effect. An optimal solution therefore only swaps differing letters.
Let's view the string of letters as an array of indices of one kind of letter (arbitrarily chosen, say A) into the string. So AAAABBBB would be represented as [0, 1, 2, 3] while ABABABAB would be [0, 2, 4, 6].
We know two instances of the same letter will never swap in an optimal solution. This lets us always safely identify the first (left-most) instance of A with the first element of our index array, the second instance with the second element, etc. It also tells us our array is always in sorted order at each step of an optimal solution.
Since each step of an optimal solution swaps differing letters, we know our index array evolves at each step only by incrementing or decrementing a single element at a time.
An initial string of length n = 2k will have an array representation A of length k. An optimal solution will transform this array to either
ODDS = [1, 3, 5, ... 2k]
or
EVENS = [0, 2, 4, ... 2k - 1]
Since we know in an optimal solution instances of a letter do not pass each other, we can conclude an optimal solution must spend min(abs(ODDS[0] - A[0]), abs(EVENS[0] - A[0])) swaps to put the first instance in correct position.
By realizing the EVENS or ODDS choice is made only once (not once per letter instance), and summing across the array, we can count the minimum number of needed swaps as
define count_swaps(length, initial, goal)
total = 0
for i from 0 to length - 1
total += abs(goal[i] - initial[i])
end
return total
end
define count_minimum_needed_swaps(k, A)
return min(count_swaps(k, A, EVENS), count_swaps(k, A, ODDS))
end
Notice the number of loop iterations implied by count_minimum_needed_swaps is 2 * k = n; it runs in O(n) time.
By noting which term is smaller in count_minimum_needed_swaps, we can also tell which of the two goal states is optimal.
Since you know N, you can simply write a loop that generates the values with no swaps needed.
#define N 4
char array[N + N];
for (size_t z = 0; z < N + N; z++)
{
array[z] = 'B' - ((z & 1) == 0);
}
return 0; // The number of swaps
#Nemo and #AlexD are right. The algorithm is order n^2. #Nemo misunderstood that we are looking for a reordering where two adjacent characters are not the same, so we can not use that if A is after B they are out of order.
Lets see the minimum number of swaps.
We dont care if our first character is A or B, because we can apply the same algorithm but using A instead of B and viceversa everywhere. So lets assume that the length of the word WORD_N is 2N, with N As and N Bs, starting with an A. (I am using length 2N to simplify the calculations).
What we will do is try to move the next B right to this A, without taking care of the positions of the other characters, because then we will have reduce the problem to reorder a new word WORD_{N-1}. Lets also assume that the next B is not just after A if the word has more that 2 characters, because then the first step is done and we reduce the problem to the next set of characters, WORD_{N-1}.
The next B should be as far as possible to be in the worst case, so it is after half of the word, so we need $N-1$ swaps to put this B after the A (maybe less than that). Then our word can be reduced to WORD_N = [A B WORD_{N-1}].
We se that we have to perform this algorithm as most N-1 times, because the last word (WORD_1) will be already ordered. Performing the algorithm N-1 times we have to make
N_swaps = (N-1)*N/2.
where N is half of the lenght of the initial word.
Lets see why we can apply the same algorithm for WORD_{N-1} also assuming that the first word is A. In this case it matters than the first word should be the same as in the already ordered pair. We can be sure that the first character in WORD_{N-1} is A because it was the character just next to the first character in our initial word, ant if it was B the first work can perform only a swap between these two words and or none and we will already have WORD_{N-1} starting with the same character than WORD_{N}, while the first two characters of WORD_{N} are different at the cost of almost 1 swap.
I think this answer is similar to the answer by phs, just in Haskell. The idea is that the resultant-indices for A's (or B's) are known so all we need to do is calculate how far each starting index has to move and sum the total.
Haskell code:
Prelude Data.List> let is = elemIndices 'B' "AAAABBBB"
in minimum
$ map (sum . zipWith ((abs .) . (-)) is) [[1,3..],[0,2..]]
6 --output

Reduce a sequence in most optimal way

We are given a sequence a of n numbers. The reduction of sequence a is defined as replacing the elements a[i] and a[i+1] with max(a[i],a[i+1]).
Each reduction operation has a cost defined as max(a[i],a[i+1]). After n-1 reductions a sequence of length 1 is obtained.
Now our goal is to print the cost of the optimal reduction of the given sequence a such that the resulting sequence of length 1 has the minimum cost.
e.g.:
1
2
3
Output :
5
An O(N^2) solution is trivial. Any ideas?
EDIT1:
People are asking about my idea, so my idea was to traverse through the sequence pairwise and for each pair check cost and in the end reduce the pair with least cost.
1 2 3
2 3 <=== Cost is 2
So reduce above sequence to
2 3
now again traverse through sequence, we get cost as 3
2 3
3 <=== Cost is 3
So total cost is 2+3=5
Above algorithm is of O(N^2). That is why I was asking for some more optimized idea.
O(n) solution:
High-level:
The basic idea is to repeatedly merge any element e smaller than both its neighbours ns and nl with its smallest neighbour ns. This produces the minimal cost because both the cost and result of merging is max(a[i],a[i+1]), which means no merge can make an element smaller than it currently is, thus the cheapest possible merge for e is with ns, and that merge can't increase the cost of any other possible merges.
This can be done with a one pass algorithm by keeping a stack of elements from our array in decreasing order. We compare the current element to both its neighbours (one being the top of the stack) and perform appropriate merges until we're done.
Pseudo-code:
stack = empty
for pos = 0 to length
// stack.top > arr[pos] is implicitly true because of the previous iteration of the loop
if stack.top > arr[pos] > arr[pos+1]
stack.push(arr[pos])
else if stack.top > arr[pos+1] > arr[pos]
merge(arr[pos], arr[pos+1])
else while arr[pos+1] > stack.top > arr[pos]
merge(arr[pos], stack.pop)
Java code:
Stack<Integer> stack = new Stack<Integer>();
int cost = 0;
int arr[] = {10,1,2,3,4,5};
for (int pos = 0; pos < arr.length; pos++)
if (pos < arr.length-1 && (stack.empty() || stack.peek() >= arr[pos+1]))
if (arr[pos] > arr[pos+1])
stack.push(arr[pos]);
else
cost += arr[pos+1]; // merge pos and pos+1
else
{
int last = Integer.MAX_VALUE; // required otherwise a merge may be missed
while (!stack.empty() && (pos == arr.length-1 || stack.peek() < arr[pos+1]))
{
last = stack.peek();
cost += stack.pop(); // merge stack.pop() and pos or the last popped item
}
if (last != Integer.MAX_VALUE)
{
int costTemp = Integer.MAX_VALUE;
if (!stack.empty())
costTemp = stack.peek();
if (pos != arr.length-1)
costTemp = Math.min(arr[pos+1], costTemp);
cost += costTemp;
}
}
System.out.println(cost);
I am confused if you mean by "cost" of reduction "computational cost" i.e. an operation taking time max(a[i],a[i+1]) or simply something you want to calculate. If it is the latter, then the following algorithm is better than O(n^2):
sort the list, or more precise, define b[i] s.t. a[b[i]] is the sorted list: O(n) if you can use RADIX sort, O(n log n) otherwise.
starting from the second-lowest item i in the sorted list: if left/right is lower than i, then perform reduction: O(1) for each item, update list from 2, O(n) in total.
I have no idea if that is the optimal solution, but it's O(n) for integers and O(n log n), otherwise.
edit: Realized that removing a precomputing step made it much simpler
If you don't consider it cheating to sort the list, then do it in n log n time and then merge the first two entries recursively. The total cost in this case will be the sum of the entries minus the smallest entry. This is optimal since
the cost will be the sum of n-1 entries (with repeats allowed)
the ith smallest entry can appear at most i-1 times in the cost function
The same fundamental idea works even if the list isn't sorted. An optimal solution is to merge the smallest element with its smallest neighbor. To see that this is optimal, note that
the cost will be the sum of n-1 entries (with repeats allowed)
entry a_i can appear at most j-1 times in the cost function, where j is the length of the longest consecutive subsequence containing a_i such that a_i is the maximum element of the subsequence
In the worst case, the sequence is decreasing and the time is O(n^2).
Greedy approach indeed works.
You can always reduce the smallest number with its smaller neighbor.
Proof: we have to reduce smallest number at some point. Any reduction of a neighbor will make the value of neighbor at least the same(possibly) bigger, so operation that reduces minimal element a[i] will always have cost c>=min(a[i-1], a[i+1])
Now we need to
quickly find/remove smallest number
find its neigbors
I'd go with 2 RMQs on that. Doing operation 2 as a binary search. Which gives us O(N * log^2(N))
EDIT: first RMQ - values. When you remove an element put some big value there
second RMQ - "presence". 0 or 1 (value is there/isn't there). To find a [for example] left neighbor of a[i], you need to find the greatest l, that sum[l,i-1] = 1.

Resources