Finding the kth smallest element in the union of 2 sorted arrays - algorithm

I think this question has been asked many times, but there still isn't a clear solution!
Anyway, this is what I found as a good answer in O(k) (possibly O(log m + log n) too). But I don't understand the part where, if M_B > M_A (or the other way round), we should be throwing away the elements after M_B - yet here it's the reverse, throwing away the elements which are before M_B. Can anyone please explain why?
http://www.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15451-s01/recitations/rec03/rec03.ps
My other question is about taking K/2 each time ... it seems we should be doing that, but it isn't obvious to me why.
[EDIT 1]
Example
A = [2, 9, 15, 22, 24, 25, 26, 30]
B = [1, 4, 5, 7, 18, 22, 27, 33]
k= 6
Answer is 9 (A[1])
Here is what I think: if I want to solve this in O(log k), I need to throw away k/2 elements each time.
Base case: if k < 2: return the 2nd smallest element from A[0], A[1], B[0], B[1]
else:
compare A[k/2] and B[k/2]: if A[k/2] < B[k/2], then the kth smallest element will be in A[1 ... n] and B[1 ... k/2] ... okay, here I threw away k/2 elements (similarly for A[k/2] > B[k/2]). So now the question is: next time, is the index still k, or k/2?
Is what I'm doing right?

That algorithm isn't bad -- it's better than the one which is usually referenced here on SO, in my opinion, because it's a lot simpler -- but it has one huge flaw: it requires that both vectors have at least k elements. (The problem says that they both have the same number of elements, n, but never specifies that n ≥ k; the function doesn't even let you tell it how big the vectors are. However, that's easily solved. I'll leave it as an exercise for now. In general, we'd need an algorithm like this to work on differently-sized arrays, and it does; we just need to be clear on the preconditions.)
The use of floor and ceil is nice and specific, but maybe confusing. Let's just look at this in the most general way. Also, the solution quoted seems to assume that arrays are 1-indexed (i.e. A[1] is the first element, not A[0]). The description I'm about to write, however, uses a more C-like pseudocode, so it assumes that A[0] is the first element. Consequently, I'm going to write it to find element k in the combined set, which is the (k+1)th element. And finally, the solution I'm about to describe differs subtly from the solution presented, which will be apparent in the end condition. IMHO, it's slightly better.
OK, if x is element k in a sequence, there are exactly k elements in the sequence smaller than x. (We won't deal with the case where there are repeated elements, but it's not much different. See note 3.)
Suppose that we know that A and B each have an element k. (Remember, this means they each have at least k + 1 elements.) Select any non-negative integer less than k; we'll call it i. And let j be k - i - 1 (so that i + j == k - 1). [See note 1, below.] Now, look at elements A[i] and B[j]. Let's say A[i] is smaller, since we just have to change all the names in the other case. Remember that we're assuming all the elements are different. So here's what we know at this point:
1) There are i elements in A which are < A[i]
2) There are j elements in B which are < B[j]
3) A[i] < B[j]
4) From (2) and (3), we know that:
5) There are at most j elements in B which are < A[i]
6) From (1) and (5), we know that:
7) There are at most i + j elements in A and B together which are < A[i]
8) But i + j is k - 1, so actually we know:
9) Element k of the merged array must be greater than A[i] (because A[i] is at most element i + j).
Since we know that the answer must be greater than A[i], we can discard A[0] through A[i] (actually, we just increment an array pointer, but effectively we'll discard them). However, we've now discarded i + 1 elements from the original problem. So out of the new set of elements (in the shortened A and the original B), we need element k - (i + 1), instead of the element k.
Now, let's check the precondition. We said that both A and B had an element k to start with, so they both have at least k + 1 elements. In the new problem we want to know whether the shortened A and the original B each have at least k - i elements. Clearly B does, because k - i is no greater than k. Also, we removed i + 1 elements from A. Originally it had at least k + 1 elements, so now it has at least k - i elements. So we're OK there.
Finally, let's check the termination condition. At the beginning I said that we choose non-negative integers i and j so that i + j == k - 1. That's not possible if k == 0, but it can be done for k == 1. So we only need to do something special once k reaches 0, in which case what we need to do is return min(A[0], B[0]). [This is a much simpler termination condition than in the algorithm you looked at, see Note 2.]
So what's a good strategy for picking i? We'll end up removing either i + 1 or k - i elements from the problem, and we'd like that to be as close to half of the elements as possible. So we should choose i = floor((k - 1) / 2). Although it might not be immediately obvious, that will make j = floor(k / 2).
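Here is a short Python sketch of the recursion just described (my code, not from the linked notes; per the precondition above, it assumes both arrays have at least k + 1 elements at every step):

def kth(A, B, k):
    # Element k (0-indexed) of the merged sequence of sorted A and B.
    # Precondition (discussed above): A and B each have at least k + 1 elements.
    if k == 0:
        return min(A[0], B[0])   # termination condition
    i = (k - 1) // 2             # floor((k - 1) / 2)
    j = k - i - 1                # so that i + j == k - 1
    if A[i] < B[j]:
        # Element k must be greater than A[i]: discard A[0..i],
        # i.e. i + 1 elements, and look for element k - (i + 1).
        return kth(A[i + 1:], B, k - (i + 1))
    else:
        # Symmetric case: discard B[0..j], i.e. j + 1 elements.
        return kth(A, B[j + 1:], k - (j + 1))

A = [2, 9, 15, 22, 24, 25, 26, 30]
B = [1, 4, 5, 7, 18, 22, 27, 33]
print(kth(A, B, 5))   # 9, the 6th smallest, matching the example above

(Real code would advance array offsets instead of slicing; the slices just keep the sketch short.)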
I'm leaving out the bit where I solve the case where A and B have fewer elements. It's not complicated; I'd encourage you to think about it yourself.
[1] The algorithm you were looking at selects i + j == k (if k is even), and drops either i or j elements. Mine selects i + j == k - 1 (always) which might make one of them smaller, but then it drops i + 1 or j + 1 elements. So it should converge slightly more rapidly.
[2] The difference between selecting i + j == k (theirs) and i + j == k - 1 (mine) is apparent in the end condition. In their formulation, both i and j must be positive, because if one of them were 0, there is a risk of dropping 0 elements, which would be an infinite recursive loop. So in their formulation, the minimum possible value of k is 2, not 1, and their termination case has to handle k == 1, which involves comparing between four elements, rather than two. For what it's worth, I believe the best solution of "find the second smallest element out of two sorted vectors" is: min(max(A[0], B[0]), min(A[1], B[1])), which requires three comparisons. This doesn't make their algorithm slower; just more complicated.
[3] Suppose elements could repeat. Actually this doesn't change anything. The algorithm still works. Why? Well, we could pretend that every element in A was actually a pair with its value and its index, and similarly for every element in B, and that we use the index as a tie-breaker when comparing values within a vector. Between vectors, we give preference to all the elements in A if A[i] ≤ B[j]; otherwise to all the elements in B. This doesn't change the code at all, because we never have to do any comparison differently, but it makes all the inequalities in the proof valid.

Related

Adding two arrays into one

I have two arrays (a and b) of size n, containing positive whole numbers:
a = [a1, …, an], b = [b1, …, bn]
I want to store them in an array c, also of size n:
c = [c1, …, cn]
where each element of c is one element from a plus one element from b (each used once); let's say the first element in c combines a1 + b3.
Quick example:
n=4 a=[a1,a2,a3,a4] b=[b1,b2,b3,b4]
one way could be:
c=[a1+b2,b3+a4,a2+b1,a3+b4]
The problem is that I want to add them in such a way that the elements in c become as evenly distributed as possible.
One ideal case would be that c came out as:
c=[5,5,5,5]
but the numbers in a and b might not match up so evenly, so I want it to come as close to even as possible.
I am trying to find a way so that the difference between the biggest number in c and the smallest number in c (after combining as evenly as I can) is as small as possible. In my optimal example above that would be 5 - 5 = 0, which is optimal since 0 is the smallest difference I could aim for. Some other case with other numbers might come out as 6 - 5 = 1, which might be the smallest achievable in that situation.
My approach would be to sort array a in ascending order and array b in descending order, and then combine them element by element. I'm not sure whether this is the best or the fastest way to do it; I want my code (written in Python) to be fast. I can't come up with a way to distribute them more evenly - any clue if there are better ways to solve this problem? I really appreciate all the advice I can get! Thank you!
Even with one array ascending and the other descending, there might already exist an algorithm that solves it better which I have not thought of. Thank you for reading!
Your algorithm is both correct and fast. It is just proving that it is optimal which is tricky.
We can do this by proving the following two results.
1. Any other matching of a and b will lead to a maximum at least as big as yours.
2. Any other matching of a and b will lead to a minimum at least as small as yours.
And the conclusion is that any other matching must have a maximum-minimum at least as big as yours. From which yours must be optimal.
Now let's look at part 1. Sort a ascending, and b descending. Find the i such that c[i] = a[i] + b[i] is a maximum. Suppose that m is any other matching where we're matching up a[j] + b[m[j]]. Note that m[1], ..., m[n] is a permutation of 1, ..., n.
If a[i] + b[m[i]] >= a[i] + b[i] then part 1 is true.
If a[i] + b[m[i]] < a[i] + b[i] then b[m[i]] < b[i] and so we must have i < m[i]. Now there are n-i numbers in the range i+1, ..., n. m maps something out of that range into that range. Because m is a permutation, by the pigeonhole principle, m must map something in that range, out of that range.
In other words there must be a j > i such that m[j] <= i. But now a[i] <= a[j] and b[i] <= b[m[j]] and therefore a[i] + b[i] <= a[j] + b[m[j]]. And so part 1 is true again.
That concludes the proof of part 1.
The proof of part 2 is similar. Except now a[i] + b[i] is at a minimum, m[i] < i, there is a j < i with i <= m[j], a[j] <= a[i], b[m[j]] <= b[i], and a[j] + b[m[j]] <= a[i] + b[i].
And as noted, part 1 and part 2 together implies that you've minimized the difference between the minimum and maximum.
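For completeness, the matching itself is a one-liner in Python (the question mentions Python); the two parts above are what prove it optimal. The sample numbers here are mine:

def combine(a, b):
    # Pair the smallest element of a with the largest of b, and so on.
    # By parts 1 and 2 above this minimizes max(c) - min(c).
    return [x + y for x, y in zip(sorted(a), sorted(b, reverse=True))]

a, b = [1, 4, 2, 3], [4, 3, 1, 2]
c = combine(a, b)
print(c, max(c) - min(c))   # [5, 5, 5, 5] 0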

Efficient algorithm to calculate the mode of a hidden array

I'm trying to solve the extension to a problem I described in my question: Efficient divide-and-conquer algorithm
For this extension, there is known to be representatives for 3 parties at the event, and there are more members for 1 party attending than for any other. A formal description of the problem can be found below.
You are given an integer n. There is a hidden array A of size n, which contains elements that can take 1 of 3 values. There is a value, let this be m, that appears more often in the array than the other 2 values.
You are allowed queries of the form introduce(i, j), where i≠j, and 1 <= i, j <= n, and you will get a boolean value in return: You will get back 1, if A[i] = A[j], and 0 otherwise.
Output: B ⊆ {1, 2, ..., n} where the A-value of every element in B is m.
A brute-force solution could compute B in O(n^2) by calling introduce(i, j) on all n(n-1) pairs of elements, creating 3 lists containing the A-indices of the elements for which a 1 was returned, and returning the list of largest size.
I understand the Boyer–Moore majority vote algorithm but can't find a way to modify it for this problem or find an efficient algorithm to solve it.
Scan for all A[i] = A[0], and make list I[] of all i for which A[i] != A[0]. Then scan for all A[I[j]] = A[I[0]], and so on. Which requires one O(n) scan for each possible value in A[].
[I assume if introduce(i, j) = 1 and introduce(j, k) = 1, then introduce(i, k) = 1 -- so you don't need to check all combinations of elements.]
Of course, this doesn't tell you what m is; it just makes one list per distinct value (3 here), where each list holds all the i for which A[i] is the same.
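A sketch of that scan in Python (function and variable names are mine, and it uses 0-based indices, unlike the 1-based statement above). Each pass is O(n), there is one pass per distinct value - three here - and returning the largest group gives B, the indices of m:

def indices_of_mode(n, introduce):
    # Group indices 0..n-1 by hidden value using introduce(i, j),
    # then return the largest group: the indices whose A-value is m.
    groups = []
    remaining = list(range(n))
    while remaining:
        pivot, group, rest = remaining[0], [remaining[0]], []
        for i in remaining[1:]:
            if introduce(pivot, i):   # same hidden value as the pivot
                group.append(i)
            else:
                rest.append(i)
        groups.append(group)
        remaining = rest
    return max(groups, key=len)

# Test with a concrete hidden array (known only to the checker):
hidden = [2, 0, 2, 1, 2, 0, 2]
print(indices_of_mode(len(hidden), lambda i, j: hidden[i] == hidden[j]))
# [0, 2, 4, 6]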

Heap's algorithm for permutations

I'm preparing for interviews and I'm trying to memorize Heap's algorithm:
procedure generate(n : integer, A : array of any):
    if n = 1 then
        output(A)
    else
        for i := 0; i < n; i += 1 do
            generate(n - 1, A)
            if n is even then
                swap(A[i], A[n-1])
            else
                swap(A[0], A[n-1])
            end if
        end for
    end if
This algorithm is a pretty famous one to generate permutations. It is concise and fast and goes hand-in-hand with the code to generate combinations.
The problem is: I don't like to memorize things by heart and I always try to keep the concepts to "deduce" the algorithm later.
This algorithm is really not intuitive and I can't find a way to explain how it works to myself.
Can someone please tell me why and how this algorithm works as expected when generating permutations?
Heap's algorithm is probably not the answer to any reasonable interview question. There is a much more intuitive algorithm which will produce permutations in lexicographical order; although it is amortized O(1) (per permutation) instead of worst-case O(1), it is not noticeably slower in practice, and it is much easier to derive on the fly.
The lexicographic-order algorithm is extremely simple to describe. Given some permutation, find the next one as follows:
(1) Find the rightmost element which is smaller than the element to its right.
(2) Swap that element with the smallest element to its right which is larger than it.
(3) Reverse the part of the permutation to the right of where that element was.
Both steps (1) and (3) are worst-case O(n), but it is easy to prove that the average time for those steps is O(1).
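Here is that step as a Python sketch (my code; just the three steps above made concrete):

def next_permutation(a):
    # Advance list a to its next permutation in lexicographic order,
    # in place; return False if a was already the last permutation.
    i = len(a) - 2
    while i >= 0 and a[i] >= a[i + 1]:   # step (1): rightmost ascent
        i -= 1
    if i < 0:
        return False
    j = len(a) - 1                       # step (2): the suffix is
    while a[j] <= a[i]:                  # non-increasing, so the rightmost
        j -= 1                           # element > a[i] is the smallest one
    a[i], a[j] = a[j], a[i]
    a[i + 1:] = reversed(a[i + 1:])      # step (3): reverse the suffix
    return True

p = [1, 2, 3]
while True:
    print(p)                 # the 6 permutations, in lexicographic order
    if not next_permutation(p):
        break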
An indication of how tricky Heap's algorithm is (in the details) is that your expression of it is slightly wrong because it does one extra swap; the extra swap is a no-op if n is even, but significantly changes the order of permutations generated when n is odd. In either case, it does unnecessary work. See https://en.wikipedia.org/wiki/Heap%27s_algorithm for the correct algorithm (at least, it's correct today) or see the discussion at Heap's algorithm permutation generator
To see how Heap's algorithm works, you need to look at what a full iteration of the loop does to the vector, in both even and odd cases. Given a vector of even length, a full iteration of Heap's algorithm will rearrange the elements according to the rule
[1,...n] → [(n-2),(n-1),2,3,...,(n-3),n,1]
whereas if the vector is of odd length, it will simply swap the first and last elements:
[1,...n] → [n,2,3,4,...,(n-2),(n-1),1]
You can prove that both of these facts are true using induction, although that doesn't provide any intuition as to why it's true. Looking at the diagram on the Wikipedia page might help.
I found an article that tries to explain it here: Why does Heap's algorithm work?
However, I think it is hard to understand, so I came up with an explanation that is hopefully easier to follow:
Please just assume that these statements are true for a moment (I'll show why later):
Each invocation of the "generate" function
(I) where n is odd, leaves the elements in the exact same ordering when it is finished.
(II) where n is even, rotates the elements to the right, for example ABCD becomes DABC.
So in the "for i"-loop:
When n is even:
The recursive call "generate(n - 1, A)" does not change the order.
So the for-loop can iteratively swap the element at i = 0..(n-1) with the element at (n - 1), and will have called "generate(n - 1, A)" each time with another element missing.
When n is odd:
The recursive call "generate(n - 1, A)" has rotated the elements right.
So the element at index 0 will always be a different element automatically.
Just swap the elements at 0 and (n - 1) in each iteration to produce a unique set of elements.
Finally, let's see why the initial statements are true:
Rotate-right
(III) This series of swaps results in a rotation to the right by one position:
A[0] <-> A[n - 1]
A[1] <-> A[n - 1]
A[2] <-> A[n - 1]
...
A[n - 2] <-> A[n - 1]
For example try it with sequence ABCD:
A[0] <-> A[3]: DBCA
A[1] <-> A[3]: DACB
A[2] <-> A[3]: DABC
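You can check (III) mechanically with a few lines of Python (my code):

def swap_series(seq):
    # Apply A[0] <-> A[n-1], A[1] <-> A[n-1], ..., A[n-2] <-> A[n-1].
    a, n = list(seq), len(seq)
    for i in range(n - 1):
        a[i], a[n - 1] = a[n - 1], a[i]
    return ''.join(a)

print(swap_series("ABCD"))   # DABC: rotated right by one position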
No-op
(IV) This series of steps leaves the sequence in the exact same ordering as before:
Repeat n times:
Rotate the sub-sequence a[0...(n-2)] to the right
Swap: a[0] <-> a[n - 1]
Intuitively, this is true:
If you have a sequence of length 5 and rotate it 5 times, it ends up unchanged.
Taking the element at position 0 out before the rotation, and then after the rotation swapping it with the new element at position 0, does not change the outcome (when rotating n times).
Induction
Now we can see why (I) and (II) are true:
If n is 1:
Trivially, the ordering is unchanged after invoking the function.
If n is 2:
The recursive calls "generate(n - 1, A)" leave the ordering unchanged (because it invokes generate with first argument being 1).
So we can just ignore those calls.
The swaps that get executed in this invocation result in a right-rotation, see (III).
If n is 3:
The recursive calls "generate(n - 1, A)" result in a right-rotation.
So the total steps in this invocation equal (IV) => The sequence is unchanged.
Repeat for n = 4, 5, 6, ...
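You can also verify (I) and (II) empirically. Here is a Python rendering of the question's generate (keeping its extra swap), plus a check of both claims (my code):

def generate(n, a, out):
    # The question's version of Heap's algorithm.
    if n == 1:
        out.append(a.copy())
    else:
        for i in range(n):
            generate(n - 1, a, out)
            if n % 2 == 0:
                a[i], a[n - 1] = a[n - 1], a[i]
            else:
                a[0], a[n - 1] = a[n - 1], a[0]

a, out = list("ABC"), []
generate(3, a, out)      # n odd
print(''.join(a))        # ABC: ordering unchanged, claim (I)

a, out = list("ABCD"), []
generate(4, a, out)      # n even
print(''.join(a))        # DABC: rotated right by one, claim (II)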
The reason Heap’s algorithm constructs all permutations is that it adjoins each element to each permutation of the rest of the elements. When you execute Heap's algorithm, recursive calls on even length inputs place elements n, (n-1), 2, 3, 4, ..., (n-2), 1 in the last position and recursive calls on odd length inputs place elements n, (n-3), (n-4), (n-5), ..., 2, (n-2), (n-1), 1 in the last position. Thus, in either case, all elements are adjoined with all permutations of n - 1 elements.
If you would like a more detailed and graphical explanation, have a look at this article.
function* permute<T>(array: T[], n = array.length): Generator<T[]> {
  if (n > 1) {
    for (let ix = 1; ix < n; ix += 1) {
      for (let _arr of permute(array, n - 1)) yield _arr
      let j = n % 2 ? 0 : ix - 1
      ;[array[j], array[n - 1]] = [array[n - 1], array[j]]
    }
    for (let _arr of permute(array, n - 1)) yield _arr
  } else yield array
}
Example use:
for (let arr of permute([1, 2, 3])) console.log(arr)
The trickiest part for me to understand (I am still studying this as well) was the recursive expression:
for i := 0; i < n; i += 1 do
    generate(n - 1, A)
I read it as: evaluate at every i up to n; the termination condition is at n = 1; and an odd or even n determines which swap runs on each pass. Since it calls and returns once for every i as n - 1 is passed back down recursively, minimal change between permutations is achieved by the single swap performed after each call returns.
Just a side tip: Heap's algorithm will generate n! permutations.
I.e., if you pass n = [1, 2, 3] as input, the result will be n! = 3! = 6 permutations.

Correctness of greedy algorithm

In a non-decreasing sequence of (positive) integers, two elements x and y (with x before y) can be removed when 2 * x <= y. How many pairs can be removed at most from this sequence?
So I have thought of the following solution:
I take the given sequence and divide it into two halves (first and second).
Assign an iterator to each of them - it_first := 0 and it_second := 0, respectively - and set count := 0.
while it_second != second.length
    if 2 * first[it_first] <= second[it_second]
        count++, it_first++, it_second++
    else
        it_second++
count is the answer
Example:
count := 0
[1,5,8,10,12,13,15,24] --> first := [1,5,8,10], second := [12,13,15,24]
2 * 1 ?< 12 --> true, count++, it_first++ and it_second++
2 * 5 ?< 13 --> true, count++, it_first++ and it_second++
2 * 8 ?< 15 --> false, it_second++
2 * 8 ?< 24 --> true, count++; it_second reaches the last element - END.
count == 3
Linear complexity (the worst case is when there are no removable elements: n/2 elements compared against n/2 elements).
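A direct Python rendering of the pseudocode above (my code; the extra it_first bound just guards the odd-length case):

def max_pairs(seq):
    # seq is non-decreasing; greedily match the first half
    # against the second half, counting pairs with 2*x <= y.
    n = len(seq)
    first, second = seq[:n // 2], seq[n // 2:]
    count = it_first = it_second = 0
    while it_second < len(second) and it_first < len(first):
        if 2 * first[it_first] <= second[it_second]:
            count += 1
            it_first += 1
        it_second += 1
    return count

print(max_pairs([1, 5, 8, 10, 12, 13, 15, 24]))   # 3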
So my missing part is the 'correctness' of the algorithm - I've read about proofs for greedy algorithms, but mostly involving trees, and I cannot find an analogy. Any help would be appreciated. Thanks!
EDIT:
By correctness I mean:
* It works
* It cannot be done faster (in log n or constant time)
I would like to put some graphics but due to reputation points < 10 - I can't.
(I meant one LaTeX formula at the beginning ;))
Correctness:
Let's assume that the maximum number of pairs that can be removed is k. Claim: there is an optimal solution where the first elements of all pairs are the k smallest elements of the array.
Proof: I will show that it is possible to transform any solution into the one that contains the first k elements as the first elements of all pairs.
Let's assume that we have two pairs (a, b), (c, d) such that a <= b <= c <= d, 2 * a <= b and 2 * c <= d. In this case, pairs (a, c) and (b, d) are valid, too. And now every first element (a and b) is no greater than every second element (c and d). Thus, we can always transform our pairs in such a way that the first element of any pair is not greater than the second element of any pair.
When we have this property, we can simply substitute the smallest element among all first elements of all pairs with the smallest element in the array, the second smallest among all first elements with the second smallest element in the array, and so on, without invalidating any pair.
Now we know that there is an optimal solution that contains the k smallest elements. It is clear that we cannot make the answer worse by pairing each of them with the smallest unused element that fits (making it bigger can only reduce the options for later elements). Thus, this solution is correct.
A note about the case when the length of the array is odd: it doesn't matter where the middle element goes, to the first or to the second half. In the first half it is useless (there are not enough elements in the second half). If we put it in the second half, it is useless too (assume that we took it; that means there is "free space" somewhere in the second half, so we can shift some elements by one and get rid of it).
Optimality in terms of time complexity: the time complexity of this solution is O(n). We cannot find the answer without reading the entire input in the worst case, and reading it already takes O(n) time. Thus, this algorithm is optimal.
Presuming your method. Indices are 0-based.
Denote in general:
end_1 = floor(N/2), the (inclusive) boundary of the first part.
Denote while iterating:
i - index in the first part, j - index in the second part,
sol(i,j) - the optimal solution up to the point (i,j) (running the algorithm from the front),
rem(i,j) - the pairs that remain to be paired up optimally beyond the point (i,j), i.e. from (i+1,j+1) onward (can be calculated by running the algorithm from the back),
so the final optimal solution can be expressed at any point as sol(i,j) + rem(i,j).
Observation #1: when running the algorithm from the front, all points in the [0, i] range are used, and some points from the [end_1+1, j] range are not used (we skip any a(j) that is not large enough). When running the algorithm from the back, some points in [i+1, end_1] are not used, and all points in [j+1, N] are used (we skip any a(i) that is not small enough).
Observation #2: rem(i,j) >= rem(i,j+1), because rem(i,j) = rem(i,j+1) + M, where M can be 0 or 1 depending on whether we can pair up a(j) with some unused element from [i+1, end_1] range.
Argument (by contradiction): let's assume 2*a(i) <= a(j) and that not pairing up a(i) and a(j) gives a final solution at least as good. By the algorithm we would next try to pair up a(i) and a(j+1). Since:
rem(i,j) >= rem(i,j+1) (see above),
sol(i,j+1) = sol(i,j) (since we didn't pair up a(i) and a(j))
we get that sol(i,j) + rem(i,j) >= sol(i,j+1) + rem(i,j+1) which contradicts the assumption.

Minimal number of swaps?

There are N characters of types A and B in a string (the same number of each type). What is the minimal number of swaps needed so that no two adjacent chars are the same, if we can only swap two adjacent characters?
For example, input is:
AAAABBBB
The minimal number of swaps is 6, making the string ABABABAB. But how would you solve it for any input? I can only think of an O(N^2) solution. Maybe some kind of sort?
If we just need to count the swaps, then we can do it in O(N).
Let's assume for simplicity that array X of N elements should become ABAB... .
GetCount()
    swaps = 0, i = -1, j = -1
    for(k = 0; k < N; k++)
        if(k % 2 == 0)
            i = FindIndexOf(A, max(k, i))
            X[k] <-> X[i]
            swaps += i - k
        else
            j = FindIndexOf(B, max(k, j))
            X[k] <-> X[j]
            swaps += j - k
    return swaps

FindIndexOf(element, index)
    while(index < N)
        if(X[index] == element) return index
        index++
    return -1; // should never happen if count of As == count of Bs
Basically, we run from left to right, and if a misplaced element is found, it gets exchanged with the correct element (e.g. abBbbbA** --> abAbbbB**) in O(1). At the same time, swaps are counted as if the intervening sequence of adjacent elements had been swapped instead. Variables i and j are used to cache the indices of the next A and B respectively, to make sure that all calls to FindIndexOf together take O(N).
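In Python, the same counting simulation might look like this (my rendering of the pseudocode above):

def count_swaps(s):
    # Count the adjacent swaps needed to turn s into ABAB..., doing each
    # correction as one direct exchange but charging the cost of the
    # adjacent swaps it stands in for.
    x, n = list(s), len(s)
    swaps = 0
    nxt = {'A': 0, 'B': 0}        # cached search positions (the i and j above)
    for k in range(n):
        want = 'A' if k % 2 == 0 else 'B'
        i = max(k, nxt[want])     # FindIndexOf(want, max(k, cache))
        while x[i] != want:
            i += 1
        nxt[want] = i
        x[k], x[i] = x[i], x[k]
        swaps += i - k            # cost in adjacent swaps
    return swaps

print(count_swaps("AAAABBBB"))    # 6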
If we need to sort by swaps then we cannot do better than O(N^2).
The rough idea is the following. Consider your sample: AAAABBBB. One of the Bs needs O(N) swaps to get to the A B ... position, another B needs O(N) swaps to get to the A B A B ... position, etc. So we get O(N^2) in the end.
Observe that if any solution would swap two instances of the same letter, then we can find a better solution by dropping that swap, which necessarily has no effect. An optimal solution therefore only swaps differing letters.
Let's view the string of letters as an array of indices of one kind of letter (arbitrarily chosen, say A) into the string. So AAAABBBB would be represented as [0, 1, 2, 3] while ABABABAB would be [0, 2, 4, 6].
We know two instances of the same letter will never swap in an optimal solution. This lets us always safely identify the first (left-most) instance of A with the first element of our index array, the second instance with the second element, etc. It also tells us our array is always in sorted order at each step of an optimal solution.
Since each step of an optimal solution swaps differing letters, we know our index array evolves at each step only by incrementing or decrementing a single element at a time.
An initial string of length n = 2k will have an array representation A of length k. An optimal solution will transform this array to either
ODDS = [1, 3, 5, ..., 2k - 1]
or
EVENS = [0, 2, 4, ..., 2k - 2]
Since we know in an optimal solution instances of a letter do not pass each other, we can conclude an optimal solution must spend min(abs(ODDS[0] - A[0]), abs(EVENS[0] - A[0])) swaps to put the first instance in correct position.
By realizing the EVENS or ODDS choice is made only once (not once per letter instance), and summing across the array, we can count the minimum number of needed swaps as
define count_swaps(length, initial, goal)
    total = 0
    for i from 0 to length - 1
        total += abs(goal[i] - initial[i])
    end
    return total
end

define count_minimum_needed_swaps(k, A)
    return min(count_swaps(k, A, EVENS), count_swaps(k, A, ODDS))
end
Notice the number of loop iterations implied by count_minimum_needed_swaps is 2 * k = n; it runs in O(n) time.
By noting which term is smaller in count_minimum_needed_swaps, we can also tell which of the two goal states is optimal.
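A compact Python version of this counting method (my code; like the answer, it assumes equal numbers of As and Bs):

def min_swaps(s):
    # Minimum adjacent swaps so that no two equal letters are adjacent.
    positions = [i for i, ch in enumerate(s) if ch == 'A']
    k = len(positions)
    odds = range(1, 2 * k, 2)     # ODDS  = 1, 3, ..., 2k - 1
    evens = range(0, 2 * k, 2)    # EVENS = 0, 2, ..., 2k - 2
    def cost(goal):
        return sum(abs(g - p) for g, p in zip(goal, positions))
    return min(cost(odds), cost(evens))

print(min_swaps("AAAABBBB"))      # 6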
Since you know N, you can simply write a loop that generates the values with no swaps needed.
#define N 4
char array[N + N];
for (size_t z = 0; z < N + N; z++)
{
    array[z] = 'B' - ((z & 1) == 0);
}
return 0; // The number of swaps
#Nemo and #AlexD are right that the algorithm is order N^2. #Nemo misunderstood the question: we are looking for a reordering where no two adjacent characters are the same, so we cannot count an A appearing after a B as an inversion.
Let's look at the minimum number of swaps.
We don't care whether our first character is A or B, because we can apply the same algorithm with A and B exchanged everywhere. So let's assume that the word WORD_N has length 2N, with N As and N Bs, and starts with an A. (I am using length 2N to simplify the calculations.)
What we will do is try to move the next B right next to this A, without worrying about the positions of the other characters, because then we will have reduced the problem to reordering a new word WORD_{N-1}. Let's also assume that the next B is not immediately after the A if the word has more than 2 characters, because in that case the first step is already done and we reduce the problem to the next set of characters, WORD_{N-1}.
In the worst case the next B is as far away as possible, after half of the word, so we need N - 1 swaps to put this B after the A (possibly fewer). Then our word has been reduced to WORD_N = [A B WORD_{N-1}].
We see that we have to perform this step at most N - 1 times, because the last word (WORD_1) will already be ordered. Performing it N - 1 times we have to make
N_swaps = (N - 1) * N / 2
swaps, where N is half of the length of the initial word.
Let's see why we can apply the same algorithm to WORD_{N-1}, again assuming that its first character is A. Here it matters that the first character be the same as in the already ordered pair. We can be sure that the first character of WORD_{N-1} is A, because it was the character just next to the first character of our initial word; and if it were B, we can perform at most one swap between these two characters (or none), after which WORD_{N-1} starts with the same character as WORD_N, while the first two characters of WORD_N are different, at a cost of at most 1 swap.
I think this answer is similar to the answer by phs, just in Haskell. The idea is that the resulting indices for the As (or Bs) are known, so all we need to do is calculate how far each starting index has to move and sum the totals.
Haskell code:
Prelude Data.List> let is = elemIndices 'B' "AAAABBBB" in minimum $ map (sum . zipWith ((abs .) . (-)) is) [[1,3..],[0,2..]]
6 -- output
