Sum of continuous sequences - algorithm

Given an array A with N elements, I want to find the sum of minimum elements in all the possible contiguous sub-sequences of A. I know if N is small we can look for all possible sub sequences but as N is upto 10^5 what can be best way to find this sum?
Example: Let N=3 and A[1,2,3] then ans is 10 as Possible contiguous sub sequences {(1),(2),(3),(1,2),(1,2,3),(2,3)} so Sum of minimum elements = 1 + 2 + 3 + 1 + 1 + 2 = 10

Let's fix one element(a[i]). We want to know the position of the rightmost element smaller than this one located to the left from i(L). We also need to know the position of the leftmost element smaller than this one located to the right from i(R).
If we know L and R, we should add (i - L) * (R - i) * a[i] to the answer.
It is possible to precompute L and R for all i in linear time using a stack. Pseudo code:
s = new Stack
L = new int[n]
fill(L, -1)
for i <- 0 ... n - 1:
while !s.isEmpty() && s.top().first > a[i]:
s.pop()
if !s.isEmpty():
L[i] = s.top().second
s.push(pair(a[i], i))
We can reverse the array and run the same algorithm to find R.
How to deal with equal elements? Let's assume that a[i] is a pair <a[i], i>. All elements are distinct now.
The time complexity is O(n).
Here is a full pseudo code(I assume that int can hold any integer value here, you should
choose a feasible type to avoid an overflow in a real code. I also assume that all elements are distinct):
int[] getLeftSmallerElementPositions(int[] a):
s = new Stack
L = new int[n]
fill(L, -1)
for i <- 0 ... n - 1:
while !s.isEmpty() && s.top().first > a[i]:
s.pop()
if !s.isEmpty():
L[i] = s.top().second
s.push(pair(a[i], i))
return L
int[] getRightSmallerElementPositions(int[] a):
R = getLeftSmallerElementPositions(reversed(a))
for i <- 0 ... n - 1:
R[i] = n - 1 - R[i]
return reversed(R)
int findSum(int[] a):
L = getLeftSmallerElementPositions(a)
R = getRightSmallerElementPositions(a)
int res = 0
for i <- 0 ... n - 1:
res += (i - L[i]) * (R[i] - i) * a[i]
return res

If the list is sorted, you can consider all subsets for size 1, then 2, then 3, to N. The algorithm is initially somewhat inefficient, but an optimized version is below. Here's some pseudocode.
let A = {1, 2, 3}
let total_sum = 0
for set_size <- 1 to N
total_sum += sum(A[1:N-(set_size-1)])
First, sets with one element:{{1}, {2}, {3}}: sum each of the elements.
Then, sets of two element {{1, 2}, {2, 3}}: sum each element but the last.
Then, sets of three elements {{1, 2, 3}}: sum each element but the last two.
But this algorithm is inefficient. To optimize to O(n), multiply each ith element by N-i and sum (indexing from zero here). The intuition is that the first element is the minimum of N sets, the second element is the minimum of N-1 sets, etc.
I know it's not a python question, but sometimes code helps:
A = [1, 2, 3]
# This is [3, 2, 1]
scale = range(len(A), 0, -1)
# Take the element-wise product of the vectors, and sum
sum(a*b for (a,b) in zip(A, scale))
# Or just use the dot product
np.dot(A, scale)

Related

Generate one permutation from an index

Is there an efficient algorithm to generate a permutation from one index provided? The permutations do not need to have any specific ordering and it just needs to return every permutation once per every possible index. The set I wish to permute is all integers from 0~255.
If I understand the question correctly, the problem is as follows: You are given two integers n and k, and you want to find the kth permutation of n integers. You don't care about it being the kth lexicographical permutation, but it's just easier to be lexicographical so let's stick with that.
This is not too bad to compute. The base permutation is 1,2,3,4...n. This is the k=0 case. Consider what happens if you were to swap the 1 and 2: by moving the 1, you are passing up every single permutation where 1 goes first, and there are (n-1)! of those (since you could have permuted 2,3,4..n if you fixed the 1 in place). Thus, the algorithm is as follows:
for i from 1 to n:
j = k / (n-i)! // integer division, so rounded down
k -= j * (n-i)!
place down the jth unplaced number
This will iteratively produce the kth lexicographical permutation, since it repeatedly solves a sub-problem with a smaller set of numbers to place, and decrementing k along the way.
There is an implementation in python in module more-itertools: nth_permutation.
Here is an implementation, adapted from the code of more_itertools.nth_permutation:
from sympy import factorial
def nth_permutation(iterable, index):
pool = list(iterable)
n = len(pool)
c = factorial(n)
index = index % c
result = [0] * n
q = index
for d in range(1, n + 1):
q, i = divmod(q, d)
if 0 <= n - d < n:
result[n - d] = i
if q == 0:
break
return tuple(map(pool.pop, result))
print( nth_permutation(range(6), 360) )
# (3, 0, 1, 2, 4, 5)

Counting valid sequences with dynamic programming

I am pretty new to Dynamic Programming, but I am trying to get better. I have an exercise from a book, which asks me the following question (slightly abridged):
You want to construct sequence of length N from numbers from the set {1, 2, 3, 4, 5, 6}. However, you cannot place the number i (i = 1, 2, 3, 4, 5, 6) more than A[i] times consecutively, where A is a given array. Given the sequence length N (1 <= N <= 10^5) and the constraint array A (1 <= A[i] <= 50), how many sequences are possible?
For instance if A = {1, 2, 1, 2, 1, 2} and N = 2, this would mean you can only have one consecutive 1, two consecutive 2's, one consecutive 3, etc. Here, something like "11" is invalid since it has two consecutive 1's, whereas something like "12" or "22" are both valid. It turns out that the actual answer for this case is 33 (there are 36 total two-digit sequences, but "11", "33", and "55" are all invalid, which gives 33).
Somebody told me that one way to solve this problem is to use dynamic programming with three states. More specifically, they say to keep a 3d array dp(i, j, k) with i representing the current position we are at in the sequence, j representing the element put in position i - 1, and k representing the number of times that this element has been repeated in the block. They also told me that for the transitions, we can put in position i every element different from j, and we can only put j in if A[j] > k.
It all makes sense to me in theory, but I've been struggling with implementing this. I have no clue how to begin with the actual implementation other than initializing the matrix dp. Typically, most of the other exercises had some sort of "base case" that were manually set in the matrix, and then a loop was used to fill in the other entries.
I guess I am particularly confused because this is a 3D array.
For a moment let's just not care about the array. Let's implement this recursively. Let dp(i, j, k) be the number of sequences with length i, last element j, and k consecutive occurrences of j at the end of the array.
The question now becomes how do we write the solution of dp(i, j, k) recursively.
Well we know that we are adding a j the kth time, so we have to take each sequence of length i - 1, and has j occurring k - 1 times, and add another j to that sequence. Notice that this is simply dp(i - 1, j, k - 1).
But what if k == 1? If that's the case we can add one occurence of j to every sequence of length i - 1 that doesn't end with j. Essentially we need the sum of all dp(i, x, k), such that A[x] >= k and x != j.
This gives our recurrence relation:
def dp(i, j, k):
# this is the base case, the number of sequences of length 1
# one if k is valid, otherwise zero
if i == 1: return int(k == 1)
if k > 1:
# get all the valid sequences [0...i-1] and add j to them
return dp(i - 1, j, k - 1)
if k == 1:
# get all valid sequences that don't end with j
res = 0
for last in range(len(A)):
if last == j: continue
for n_consec in range(1, A[last] + 1):
res += dp(i - 1, last, n_consec)
return res
We know that our answer will be all valid subsequences of length N, so our final answer is sum(dp(N, j, k) for j in range(len(A)) for k in range(1, A[j] + 1))
Believe it or not this is the basis of dynamic programming. We just broke our main problem down into a set of subproblems. Of course, right now our time is exponential because of the recursion. We have two ways to lower this:
Caching, we can simply keep track of the result of each (i, j, k) and then spit out what we originally computed when it's called again.
Use an array. We can reimplement this idea with bottom-up dp, and have an array dp[i][j][k]. All of our function calls just become array accesses in a for loop. Note that using this method forces us iterate over the array in topological order which may be tricky.
There are 2 kinds of dp approaches: top-down and bottom-up
In bottom up, you fill the terminal cases in dp table and then use for loops to build up from that. Lets consider bottom-up algo to generate Fibonacci sequence. We set dp[0] = 1 and dp[1] = 1 and run a for loop from i = 2 to n.
In top down approach, we start from the "top" view of the problem and go down from there. Consider the recursive function to get n-th Fibonacci number:
def fib(n):
if n <= 1:
return 1
if dp[n] != -1:
return dp[n]
dp[n] = fib(n - 1) + fib(n - 2)
return dp[n]
Here we don't fill the complete table, but only the cases we encounter.
Why I am talking about these 2 types is because when you start learning dp, it is often difficult to come up with bottom-up approaches (like you are trying to). When this happens, first you want to come up with a top-down approach, and then try to get a bottom up solution from that.
So let's create a recursive dp function first:
# let m be size of A
# initialize dp table with all values -1
def solve(i, j, k, n, m):
# first write terminal cases
if k > A[j]:
# this means sequence is invalid. so return 0
return 0
if i >= n:
# this means a valid sequence.
return 1
if dp[i][j][k] != -1:
return dp[i][j][k]
result = 0
for num = 1 to m:
if num == j:
result += solve(i + 1, num, k + 1, n)
else:
result += solve(i + 1, num, 1, n)
dp[i][j][k] = result
return dp[i][j][k]
So we know what terminal cases are. We create a dp table of size dp[n + 1][m][50]. Initialize it with all values 0, not -1.
So we can do bottom-up as:
# initially all values in table are zero. With loop below, we set the valid endings as 1.
# So any state trying to reach valid terminal states will get 1, but invalid states will
# return the values 0
for num = 1 to m:
for occour = 1 to A[num]:
dp[n][num][occour] = 1
# now to build up from bottom, we start by filling n-1 th position
for i = n-1 to 1:
for num = 1 to m:
for occour = 1 to A[num]:
for next_num = 1 to m:
if next_num != num:
dp[i][num][occour] += dp[i + 1][next_num][1]
else:
dp[i][num][occour] += dp[i + 1][num][occour + 1]
The answer will be:
sum = 0
for num = 1 to m:
sum += dp[1][num][1]
I am sure there must be some more elegant dp solution, but I believe this answers your question. Note that I considered that k is the number of times j-th number has been repeated consecutively, correct me if I am wrong with this.
Edit:
With the given constraints the size of the table will be, in the worst case, 10^5 * 6 * 50 = 3e7. This would be > 100MB. It is workable, but can be considered too much space use (I think some kernels doesn't allow that much stack space to a process). One way to reduce it would be to use a hash-map instead of an array with top down approach since top-down doesn't visit all the states. That would be mostly true in this case, for example if A[1] is 2, then all the other states where 1 has occoured more that twice need not be stored. Ofcourse this would not save much space if A[i] has large values, say [50, 50, 50, 50, 50, 50]. Another approach would be to modify our approach a bit. We dont actually need to store the dimension k, i.e. the times j has appeared consecutively:
dp[i][j] = no of ways from i-th position if (i - 1)th position didn't have j and i-th position is j.
Then, we would need to modify our algo to be like:
def solve(i, j):
if i == n:
return 1
if i > n:
return 0
if dp[i][j] != -1
return dp[i][j]
result = 0
# we will first try 1 consecutive j, then 2 consecutive j's then 3 and so on
for count = 1 to A[j]:
for num = 1 to m:
if num != j:
result += solve(i + count, num)
dp[i][j] = result
return dp[i][j]
This approach will reduce our space complexity to O(10^6) ~= 2mb, while time complexity is still the same : O(N * 6 * 50)

Finding median in merged array of two sorted arrays

Assume we have 2 sorted arrays of integers with sizes of n and m. What is the best way to find median of all m + n numbers?
It's easy to do this with log(n) * log(m) complexity. But i want to solve this problem in log(n) + log(m) time. So is there any suggestion to solve this problem?
Explanation
The key point of this problem is to ignore half part of A and B each step recursively by comparing the median of remaining A and B:
if (aMid < bMid) Keep [aMid +1 ... n] and [bLeft ... m]
else Keep [bMid + 1 ... m] and [aLeft ... n]
// where n and m are the length of array A and B
As the following: time complexity is O(log(m + n))
public double findMedianSortedArrays(int[] A, int[] B) {
int m = A.length, n = B.length;
int l = (m + n + 1) / 2;
int r = (m + n + 2) / 2;
return (getkth(A, 0, B, 0, l) + getkth(A, 0, B, 0, r)) / 2.0;
}
public double getkth(int[] A, int aStart, int[] B, int bStart, int k) {
if (aStart > A.length - 1) return B[bStart + k - 1];
if (bStart > B.length - 1) return A[aStart + k - 1];
if (k == 1) return Math.min(A[aStart], B[bStart]);
int aMid = Integer.MAX_VALUE, bMid = Integer.MAX_VALUE;
if (aStart + k/2 - 1 < A.length) aMid = A[aStart + k/2 - 1];
if (bStart + k/2 - 1 < B.length) bMid = B[bStart + k/2 - 1];
if (aMid < bMid)
return getkth(A, aStart + k / 2, B, bStart, k - k / 2); // Check: aRight + bLeft
else
return getkth(A, aStart, B, bStart + k / 2, k - k / 2); // Check: bRight + aLeft
}
Hope it helps! Let me know if you need more explanation on any part.
Here's a very good solution I found in Java on Stack Overflow. It's a method of finding the K and K+1 smallest items in the two arrays where K is the center of the merged array.
If you have a function for finding the Kth item of two arrays then finding the median of the two is easy;
Calculate the weighted average of the Kth and Kth+1 items of X and Y
But then you'll need a way to find the Kth item of two lists; (remember we're one indexing now)
If X contains zero items then the Kth smallest item of X and Y is the Kth smallest item of Y
Otherwise if K == 2 then the second smallest item of X and Y is the smallest of the smallest items of X and Y (min(X[0], Y[0]))
Otherwise;
i. Let A be min(length(X), K / 2)
ii. Let B be min(length(Y), K / 2)
iii. If the X[A] > Y[B] then recurse from step 1. with X, Y' with all elements of Y from B to the end of Y and K' = K - B, otherwise recurse with X' with all elements of X from A to the end of X, Y and K' = K - A
If I find the time tomorrow I will verify that this algorithm works in Python as stated and provide the example source code, it may have some off-by-one errors as-is.
Take the median element in list A and call it a. Compare a to the center elements in list B. Lets call them b1 and b2 (if B has odd length then exactly where you split b depends on your definition of the median of an even length list, but the procedure is almost identical regardless). if b1&leq;a&leq;b2 then a is the median of the merged array. This can be done in constant time since it requires exactly two comparisons.
If a is greater than b2 then we add the top half of A to the top of B and repeat. B will no longer be sorted, but it doesn't matter. If a is less than b1 then we add the bottom half of A to the bottom of B and repeat. These will iterate log(n) times at most (if the median is found sooner then stop, of course).
It is possible that this will not find the median. If this is the case then the median is in B. If so, perform the same algorithm with A and B reversed. This will require log(m) iterations. In total you will have performed at most 2*(log(n)+log(m)) iterations of a constant time operation, so you have solved the problem in order log(n)+log(m) time.
This is essentially the same answer as was given by iehrlich, but written out more explicitly.
Yes, this can be done. Given two arrays, A and B, in the worst-case scenario you have to first perform a binary search in A, and then, if it fails, binary search in B looking for the median. On each step of a binary search, you check if the current element is actually a median of a merged A+B array. Such check takes constant time.
Let's see why such check is constant. For simplicity, let's assume that |A| + |B| is an odd number, and that all numbers in both arrays are different. You can remove these restrictions later by applying the usual median definition approach (i.e., how to calculate the median of an array containing duplicates, or of an array with even length). Anyway, given that, we know for sure, that in the merged array there will be (|A| + |B| - 1) / 2 elements to the right and to the left of an actual median. In the process of a binary search in A, we know the index of current element x in array A (let it be i). Now, if x satisfies the condition B[j] < x < B[j+1], where i + j == (|A| + |B| - 1) / 2, then x is your median.
The overall complexity is O(log(max(|A|, |B|)) time and O(1) memory.

Finding largest from each subarray of length k

Interview Question :- Given an array and an integer k , find the maximum for each and every contiguous sub array of size k.
Sample Input :
1 2 3 1 4 5 2 3 6
3 [ value of k ]
Sample Output :
3
3
4
5
5
5
6
I cant think of anything better than brute force. Worst case is O(nk) when array is sorted in decreasing order.
Just iterate over the array and keep k last elements in a self-balancing binary tree.
Adding element to such tree, removing element and finding current maximum costs O(logk).
Most languages provide standard implementations for such trees. In STL, IIRC, it's MultiSet. In Java you'd use TreeMap (map, because you need to keep count, how many times each element occurs, and Java doesn't provide Multi- collections).
Pseudocode
for (int i = 0; i < n; ++i) {
tree.add(a[i]);
if (tree.size() > k) {
tree.remove(a[i - k]);
}
if (tree.size() == k) {
print(tree.max());
}
}
You can actually do this in O(n) time with O(n) space.
Split the array into blocks of each.
[a1 a2 ... ak] [a(k+1) ... a2k] ...
For each block, maintain two more blocks, the left block and the right block.
The ith element of the left block will be the max of the i elements from the left.
The ith element of the right block will be the max of the i elements from the right.
You will have two such blocks for each block of k.
Now if you want to find the max in range a[i... i+k], say the elements span two of the above blocks of k.
[j-k+1 ... i i+1 ... j] [j+1 ... i+k ... j+k]
All you need to do is find the max of RightMax of i to j of the first block and the left max of j+1 to i+k of the second block.
Hope this is the solution which you are looking for:
def MaxContigousSum(lst, n):
m = [0]
if lst[0] > 0:
m[0] = lst[0]
maxsum = m[0]
for i in range(1, n):
if m[i - 1] + lst[i] > 0:
m.append(m[i - 1] + lst[i])
else:
m.append(0)
if m[i] > maxsum:
maxsum = m[i]
return maxsum
lst = [-2, 11, -4, 13, -5, 2, 1, -3, 4, -2, -1, -6, -9]
print MaxContigousSum(lst, len(lst))
**Output**
20 for [11, -4, 13]

Minimum number of swaps needed to change Array 1 to Array 2?

For example, input is
Array 1 = [2, 3, 4, 5]
Array 2 = [3, 2, 5, 4]
Minimum number of swaps needed are 2.
The swaps need not be with adjacent cells, any two elements can be swapped.
https://www.spoj.com/problems/YODANESS/
As #IVlad noted in the comment to your question Yodaness problem asks you to count number of inversions and not minimal number of swaps.
For example:
L1 = [2,3,4,5]
L2 = [2,5,4,3]
The minimal number of swaps is one (swap 5 and 3 in L2 to get L1), but number of inversions is three: (5 4), (5 3), and (4 3) pairs are in the wrong order.
The simplest way to count number of inversions follows from the definition:
A pair of elements (pi,pj) is called an inversion in a permutation p if i < j and pi > pj.
In Python:
def count_inversions_brute_force(permutation):
"""Count number of inversions in the permutation in O(N**2)."""
return sum(pi > permutation[j]
for i, pi in enumerate(permutation)
for j in xrange(i+1, len(permutation)))
You could count inversion in O(N*log(N)) using divide & conquer strategy (similar to how a merge sort algorithm works). Here's pseudo-code from Counting Inversions translated to Python code:
def merge_and_count(a, b):
assert a == sorted(a) and b == sorted(b)
c = []
count = 0
i, j = 0, 0
while i < len(a) and j < len(b):
c.append(min(b[j], a[i]))
if b[j] < a[i]:
count += len(a) - i # number of elements remaining in `a`
j+=1
else:
i+=1
# now we reached the end of one the lists
c += a[i:] + b[j:] # append the remainder of the list to C
return count, c
def sort_and_count(L):
if len(L) == 1: return 0, L
n = len(L) // 2
a, b = L[:n], L[n:]
ra, a = sort_and_count(a)
rb, b = sort_and_count(b)
r, L = merge_and_count(a, b)
return ra+rb+r, L
Example:
>>> sort_and_count([5, 4, 2, 3])
(5, [2, 3, 4, 5])
Here's solution in Python for the example from the problem:
yoda_words = "in the force strong you are".split()
normal_words = "you are strong in the force".split()
perm = get_permutation(normal_words, yoda_words)
print "number of inversions:", sort_and_count(perm)[0]
print "number of swaps:", number_of_swaps(perm)
Output:
number of inversions: 11
number of swaps: 5
Definitions of get_permutation() and number_of_swaps() are:
def get_permutation(L1, L2):
"""Find permutation that converts L1 into L2.
See http://en.wikipedia.org/wiki/Cycle_representation#Notation
"""
if sorted(L1) != sorted(L2):
raise ValueError("L2 must be permutation of L1 (%s, %s)" % (L1,L2))
permutation = map(dict((v, i) for i, v in enumerate(L1)).get, L2)
assert [L1[p] for p in permutation] == L2
return permutation
def number_of_swaps(permutation):
"""Find number of swaps required to convert the permutation into
identity one.
"""
# decompose the permutation into disjoint cycles
nswaps = 0
seen = set()
for i in xrange(len(permutation)):
if i not in seen:
j = i # begin new cycle that starts with `i`
while permutation[j] != i: # (i σ(i) σ(σ(i)) ...)
j = permutation[j]
seen.add(j)
nswaps += 1
return nswaps
As implied by Sebastian's solution, the algorithm you are looking for can be based on inspecting the permutation's cycles.
We should consider array #2 to be a permutation transformation on array #1. In your example, the permutation can be represented as P = [2,1,4,3].
Every permutation can be expressed as a set of disjoint cycles, representing cyclic position changes of the items. The permutation P for example has 2 cycles: (2,1) and (4,3). Therefore two swaps are enough. In the general case, you should simply subtract the number of cycles from the permutation length, and you get the minimum number of required swaps. This follows from the observation that in order to "fix" a cycle of N elements, N-1 swaps are enough.
This problem has a clean, greedy, trivial solution:
Find any swap operation which gets both swapped elements in Array1 closer to their destination in Array2. Perform the swap operation on Array1 if one exists.
Repeat step1 until no more such swap operations exist.
Find any swap operation which gets one swapped element in Array1 closer to its destination in Array2. If such an operation exist, perform it on Array1.
Go back to step1 until Array1 == Array2.
The correctness of the algorithm can be proved by defining a potential for the problem as the sum of distances of all elements in array1 from their destination in array2.
This can be easily converted to another type of problem, which can be solved more efficiently. All that is needed is to convert the arrays into permutations, i.e. change the values to their ids. So your arrays:
L1 = [2,3,4,5]
L2 = [2,5,4,3]
would become
P1 = [0,1,2,3]
P2 = [0,3,2,1]
with the assignment 2->0, 3->1, 4->2, 5->3. This can only be done if there are no repeated items though. If there are, then this becomes harder to solve.
Converting permutation from one to another can be converted to a similar problem (Number of swaps in a permutation) by inverting the target permutation in O(n), composing the permutations in O(n) and then finding the number of swaps from there to an identity permutation in O(m).
Given:
int P1[] = {0, 1, 2, 3}; // 2345
int P2[] = {0, 3, 2, 1}; // 2543
// we can follow a simple algebraic modification
// (see http://en.wikipedia.org/wiki/Permutation#Product_and_inverse):
// P1 * P = P2 | premultiply P1^-1 *
// P1^-1 * P1 * P = P1^-1 * P2
// I * P = P1^-1 * P2
// P = P1^-1 * P2
// where P is a permutation that makes P1 into P2.
// also, the number of steps from P to identity equals
// the number of steps from P1 to P2.
int P1_inv[4];
for(int i = 0; i < 4; ++ i)
P1_inv[P1[i]] = i;
// invert the first permutation in O(n)
int P[4];
for(int i = 0; i < 4; ++ i)
P[i] = P2[P1_inv[i]];
// chain the permutations in O(n)
int num_steps = NumSteps(P, 4); // will return 2
// now we just need to count the steps in O(num_steps)
To count the steps, a simple algorithm can be devised, such as:
int NumSteps(int *P, int n)
{
int count = 0;
for(int i = 0; i < n; ++ i) {
for(; P[i] != i; ++ count) // could be permuted multiple times
swap(P[P[i]], P[i]); // look where the number at hand should be
}
// count number of permutations
return count;
}
This always swaps an item for a place where it should be in the identity permutation, therefore at every step it undoes and counts one swap. Now, provided that the number of swaps it returns is indeed minimum, the runtime of the algorithm is bounded by it and is guaranteed to finish (instead of getting stuck in an infinite loop). It will run in O(m) swaps or O(m + n) loop iterations where m is number of swaps (the count returned) and n is number of items in the sequence (4). Note that m < n is always true. Therefore, this should be superior to O(n log n) solutions, as the upper bound is O(n - 1) of swaps or O(n + n - 1) of loop iterations here, which is both practically O(n) (constant factor of 2 omitted in the latter case).
The algorithm will only work for valid permutations, it will loop infinitely for sequences with duplicate values and will do out-of-bounds array access (and crash) for sequences with values other than [0, n). A complete test case can be found here (builds with Visual Studio 2008, the algorithm itself should be fairly portable). It generates all possible permutations of lengths 1 to 32 and checks against solutions, generated with breadth first search (BFS), seems to work for all of permutations of lengths 1 to 12, then it becomes fairly slow but I assume it will just continue working.
Algorithm:
Check if the elements of list in the same position are equal. If yes, no swap is required. If no, swap the position of list-element wherever the element is matching
Iterate the process for the entire list elements.
Code:
def nswaps(l1, l2):
cnt = 0
for i in range(len(l1)):
if l1[i] != l2[i]:
ind = l2.index(l1[i])
l2[i], l2[ind] = l2[ind], l2[i]
cnt += 1
pass
return cnt
Since we already know that arr2 has the correct indexes of each element present in arr1. Therefore, we can simply compare the arr1 elements with arr2, and swap them with the correct indexes in case they are at wrong index.
def minimum_swaps(arr1, arr2):
swaps = 0
for i in range(len(arr1)):
if arr1[i] != arr2[i]:
swaps += 1
element = arr1[i]
index = arr1.index(arr2[i]) # find index of correct element
arr1[index] = element # swap
arr1[i] = arr2[i]
return swaps
#J.F. Sebastian and #Eyal Schneider's answer are pretty cool.
I got inspired on solving a similar problem: Calculate the minimum swaps needed to sort an array, e.g.: to sort {2,1,3,0}, you need minimum 2 swaps.
Here is the Java Code:
// 0 1 2 3
// 3 2 1 0 (0,3) (1,2)
public static int sortWithSwap(int [] a) {
Integer[] A = new Integer[a.length];
for(int i=0; i<a.length; i++) A[i] = a[i];
Integer[] B = Arrays.copyOf(mapping(A), A.length, Integer[].class);
int cycles = 0;
HashSet<Integer> set = new HashSet<>();
boolean newCycle = true;
for(int i=0; i<B.length; ) {
if(!set.contains(B[i])) {
if(newCycle) {
newCycle = false;
cycles++;
}
set.add(B[i]);
i = B[i];
}
else if(set.contains(B[i])) { // duplicate in existing cycles
newCycle = true;
i++;
}
}
// suppose sequence has n cycles, each cycle needs swap len(cycle)-1 times
// and sum of length of all cycles is length of sequence, so
// swap = sequence length - cycles
return a.length - cycles;
}
// a b b c
// c a b b
// 3 0 1 1
private static Object[] mapping(Object[] A) {
Object[] B = new Object[A.length];
Object[] ret = new Object[A.length];
System.arraycopy(A, 0, B, 0, A.length);
Arrays.sort(A);
HashMap<Object, Integer> map = new HashMap<>();
for(int i=0; i<A.length; i++) {
map.put(A[i], i);
}
for(int i=0; i<B.length; i++) {
ret[i] = map.get(B[i]);
}
return ret;
}
This seems like an edit distance problem, except that only transpositions are allowed.
Check out Damerau–Levenshtein distance pseudo code. I believe you can adjust it to count only the transpositions.

Resources