How to divide N numbers into N/2 groups (2 numbers in each group) such that the sum of the differences between the 2 numbers in each group is minimal? - algorithm

General Problem Description
Hi, this is actually a special case of the assignment problem (check the wiki if interested). Suppose I have 10 agents denoted A1, A2, ..., A10 and I need them to work in pairs. From previous experience, I know the working efficiency of every two-agent pair, so I have the efficiency matrix shown below, whose ( i, j ) element represents the working efficiency of the agent pair ( Ai, Aj ). The matrix is therefore symmetric, i.e. E( i, j )=E( j, i ), and E( i, i ) is 0. Now I need to divide these 10 agents into 5 groups such that the overall (summed) efficiency is maximal.
E =
0 25 28 23 39 77 56 58 85 41
25 0 18 77 32 52 69 59 47 18
28 18 0 20 55 75 63 38 5 56
23 77 20 0 59 76 24 82 68 64
39 32 55 59 0 49 70 28 42 31
77 52 75 76 49 0 33 84 50 29
56 69 63 24 70 33 0 15 49 83
58 59 38 82 28 84 15 0 68 40
85 47 5 68 42 50 49 68 0 56
41 18 56 64 31 29 83 40 56 0
N.B.
From the matrix view of this problem, I need to pick 5 elements from the above matrix such that no two of them share an index. (If you pick E( 2, 3 ), then you cannot pick any other element whose index contains 2 or 3, since A2 and A3 are already assigned to work together. In other words, you cannot pick any other element from the 2nd or 3rd row or from the 2nd or 3rd column.)
The problem in the title is equivalent to the special assignment problem mentioned above.
You may find the Hungarian (Munkres) algorithm helpful! Here is the MATLAB code.
Another view of this problem is to solve a normal assignment problem, but with the extra requirement that the solution's elements are symmetrically distributed about the diagonal.
Directly applying the Hungarian (Munkres) algorithm to the symmetric efficiency matrix does not always work. Sometimes it gives an asymmetric permutation, e.g.
E =
0 30 63 32 20 40
30 0 67 84 75 63
63 67 0 37 79 88
32 84 37 0 43 59
20 75 79 43 0 56
40 63 88 59 56 0
The optimal solution is:
assignment =
0 0 0 0 0 1
0 0 0 1 0 0
1 0 0 0 0 0
0 1 0 0 0 0
0 0 1 0 0 0
0 0 0 0 1 0
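A small sketch of why this result cannot be read as a pairing: in a valid pairing, "i works with j" must imply "j works with i", so the assignment matrix has to equal its own transpose. The check below copies E and the assignment from the example above; the variable names are illustrative only.

```python
# 6x6 efficiency matrix and assignment, copied from the example above.
E = [
    [0, 30, 63, 32, 20, 40],
    [30, 0, 67, 84, 75, 63],
    [63, 67, 0, 37, 79, 88],
    [32, 84, 37, 0, 43, 59],
    [20, 75, 79, 43, 0, 56],
    [40, 63, 88, 59, 56, 0],
]
assignment = [
    [0, 0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 0],
]
n = len(E)

# Total efficiency collected by the selected entries.
total = sum(E[i][j] for i in range(n) for j in range(n) if assignment[i][j])

# A pairing would need assignment[i][j] == assignment[j][i] everywhere.
is_symmetric = all(
    assignment[i][j] == assignment[j][i] for i in range(n) for j in range(n)
)

print(total, is_symmetric)
```

Here row 1 selects column 6 but row 6 selects column 5, so the symmetry check fails.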

This can be solved as a maximum-weight perfect matching problem, where:
G = (V,E,w)
V = { all numbers }
E = { (v,u) | v,u in V, v!=u }
w(u,v) = -|u-v|
A maximum-weight perfect matching pairs all your vertices (numbers) such that sum { -|u-v| : u,v paired } is maximum, which means sum { |u-v| : u,v paired } is minimum.
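As a minimal sketch of this formulation, the following uses an exact bitmask dynamic program instead of a general matching algorithm, which is an assumption made here for brevity and is only practical for roughly 20 numbers or fewer. The function name and the sample numbers are hypothetical.

```python
from functools import lru_cache


def max_weight_perfect_matching(w):
    """Exact maximum-weight perfect matching on a complete graph.

    w is a symmetric n x n weight matrix with n even.
    Returns (total_weight, pairs) where pairs is a tuple of index pairs.
    """
    n = len(w)
    assert n % 2 == 0, "a perfect matching needs an even number of vertices"
    full = (1 << n) - 1

    @lru_cache(maxsize=None)
    def best(mask):
        if mask == full:
            return 0, ()
        # Lowest-numbered vertex that is still unmatched.
        i = next(k for k in range(n) if not mask >> k & 1)
        result = None
        for j in range(i + 1, n):
            if not mask >> j & 1:
                sub_total, sub_pairs = best(mask | 1 << i | 1 << j)
                cand = (sub_total + w[i][j], ((i, j),) + sub_pairs)
                if result is None or cand[0] > result[0]:
                    result = cand
        return result

    return best(0)


numbers = [7, 1, 12, 3, 10, 4]
# w(u, v) = -|u - v|, as in the formulation above.
w = [[-abs(a - b) for b in numbers] for a in numbers]
total, pairs = max_weight_perfect_matching(w)
min_diff_sum = -total
paired_values = sorted(tuple(sorted((numbers[i], numbers[j]))) for i, j in pairs)
print(min_diff_sum, paired_values)
```

The same function accepts an efficiency matrix such as E directly, since the weights only need to be symmetric.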

Related

Is there a way to sum pairwise in Octave, vectorized (ie. mapping and reducing matrices)?

Is there a way to sum pairwise in Octave?
If, for example, I have a 10-row by 4-column matrix, I want a new 10-row by 2-column matrix, where each column is the sum of a pair of adjacent columns.
ex.
[ 1 2 3 4
2 3 4 5
...
]
=> [ 3 7
5 9
...
]
I know how to accomplish this using for loops and accumarray etc, but I'm just not sure if there's a way to do it that is completely vectorized.
Here are a few more options.
Given:
a = reshape(1:40, 10, 4)
a =
1 11 21 31
2 12 22 32
3 13 23 33
4 14 24 34
5 15 25 35
6 16 26 36
7 17 27 37
8 18 28 38
9 19 29 39
10 20 30 40
Keep it simple
b = [sum(a(:,1:2),2) sum(a(:,3:4),2)]
b =
12 52
14 54
16 56
18 58
20 60
22 62
24 64
26 66
28 68
30 70
Squeeze a little
b = squeeze(sum(reshape(a, [], 2, 2), 2))
b =
12 52
14 54
16 56
18 58
20 60
22 62
24 64
26 66
28 68
30 70
Or, my personal favorite...
Mathemagic
b = a * [1 1 0 0; 0 0 1 1].'
b =
12 52
14 54
16 56
18 58
20 60
22 62
24 64
26 66
28 68
30 70
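The matrix-product trick works because each row of [1 1 0 0; 0 0 1 1] selects one pair of columns. A rough sketch of the same idea for any even number of columns, written in plain Python rather than Octave; the helper names are made up for illustration.

```python
def pair_selector(n_cols):
    """Rows of 0/1 where row k has ones in columns 2k and 2k+1."""
    return [[1 if c // 2 == k else 0 for c in range(n_cols)]
            for k in range(n_cols // 2)]


def times_transpose(a, s):
    """Compute a * s.' for list-of-rows matrices."""
    return [[sum(x * y for x, y in zip(row, sel)) for sel in s] for row in a]


# Same values as reshape(1:40, 10, 4): column c holds 10*c + 1 .. 10*c + 10.
a = [[r + 10 * c for c in range(4)] for r in range(1, 11)]
b = times_transpose(a, pair_selector(4))
print(b[0], b[-1])
```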
Perhaps someone will come up with a better idea:
a = [1 2 3 4; 2 3 4 5]
b = reshape (sum (reshape (a.', 2, [])), [], rows(a)).'
gives
b =
3 7
5 9

Find location of element in a vector

I'm new to APL and I would like to find the position of an element(s) within a vector. For example, if I create a vector of 50 random numbers:
lst ← 50 ? 100
How can I find the positions of 91 assuming it occurs 3 times in the vector?
Thanks.
I'm not an expert, but a simple way is to just select the elements from ⍳⍴lst (the indices of lst) where the corresponding element in lst is 91
(lst=91)/⍳⍴lst
With Dyalog 16.0, you can use the new monadic function ⍸ "Where".
⍸lst=91
lst=91 gives a vector of 0s and 1s. Applying ⍸ on this gives the locations of all the 1s. This also works if lst is a matrix.
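For readers more used to scalar languages, the same "boolean mask, then keep the matching indices" idea can be sketched in Python. This is only an analogy, not APL; the function name and sample list are hypothetical, and positions are 1-based to match APL's default index origin.

```python
def positions(values, target):
    """1-based positions at which target occurs in values."""
    mask = [v == target for v in values]                          # like lst=91
    return [i for i, hit in enumerate(mask, start=1) if hit]      # like ⍸ on the mask


lst = [5, 91, 7, 91, 91, 12]
print(positions(lst, 91))
```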
Thanks to ngn, Cows_quack and Probie. I should have read Mastering Dyalog APL more carefully as it also mentions this on page 126. So taking all the answers together:
⍝ Generate a list of 100 non-unique random numbers
lst ← ?100⍴100
⍝ How many times does 1, for example, appear in the vector? (plus-reduce over the boolean vector)
+/ (lst = 1) ⍝ Appears twice
2
⍝ Find the locations of 1 in the vector
(lst = 1) / ⍳ ⍴ lst
2 37 ⍝ Positions 2 and 37
So to break down the solution: (i) (lst = 1) generates a boolean vector that is true wherever the value 1 occurs in lst; (ii) compressing the index vector ⍳ ⍴ lst by that boolean vector creates a new vector with the positions of 'true' in lst.
Correct me if my description is off?
SIMPLIFICATION:
Using the 'Where' function makes it more readable (though the previous method shows how the APL mindset of array programming is used to solve it):
⍸lst=1
2 37 ⍝ Positions 2 and 37
Thanks for your time on this!
Regards
While your question has already been amply answered, you may be interested in the Key operator, ⌸. It takes a single operand, and when its derived function is applied monadically, the operand is applied once for each unique element in the argument. The operand is called with the unique element as left argument and the list of its indices as right argument:
lst ← ?100⍴10
{⍺ ⍵}⌸lst
┌──┬──────────────────────────────────────────┐
│3 │1 3 9 28 37 38 55 70 88 │
├──┼──────────────────────────────────────────┤
│10│2 6 13 17 30 59 64 66 71 82 83 96 │
├──┼──────────────────────────────────────────┤
│7 │4 5 12 15 20 52 54 68 74 85 89 91 92 │
├──┼──────────────────────────────────────────┤
│9 │7 11 24 47 53 58 69 86 90 │
├──┼──────────────────────────────────────────┤
│8 │8 14 16 21 43 51 63 67 73 80 │
├──┼──────────────────────────────────────────┤
│2 │10 18 26 27 34 36 48 78 79 87 │
├──┼──────────────────────────────────────────┤
│1 │19 25 31 32 33 42 57 65 75 84 97 98 99 100│
├──┼──────────────────────────────────────────┤
│6 │22 23 45 46 50 60 76 94 │
├──┼──────────────────────────────────────────┤
│5 │29 49 56 61 72 77 93 95 │
├──┼──────────────────────────────────────────┤
│4 │35 39 40 41 44 62 81 │
└──┴──────────────────────────────────────────┘
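As a loose analogy in Python (not APL), grouping indices by value in order of first appearance looks like the sketch below. The function name is made up, and indices are 1-based to match the table above.

```python
def key_indices(values):
    """Map each unique value to the 1-based positions where it occurs.

    Keys appear in order of first occurrence, like the rows of the table.
    """
    groups = {}
    for i, v in enumerate(values, start=1):
        groups.setdefault(v, []).append(i)
    return groups


groups = key_indices([3, 10, 3, 7, 7, 10])
print(groups)
```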

Selecting the "P" in Prune and Search Algorithm

Note: the diagram above shows a partition into groups of 5 (the columns). The horizontal box denotes the median values of each partition. The 'P' item indicates the median of medians.
Most of the material I have seen uses this picture when selecting "P", and it always has an odd number of elements. But what if the number of elements you have is even?
ex.
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50
51 52 53 54 55 56 57 58 59 60
How do you get your "P" in an even set of elements?
This explanation gives the detail I think you're looking for:
https://www.cs.duke.edu/courses/summer10/cps130/files/Edelsbrunner_Median.pdf
The median of the set plays a special role in this algorithm, and it
is defined as the i-smallest item where i = (n+1)/2 if n is odd and i =
n/2 or (n+2)/2 if n is even.
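A minimal sketch of how that definition can be applied when picking "P". It assumes the lower median (i = n/2) for even n, which is one of the two choices the quote allows, and it sorts the group medians for brevity instead of selecting recursively as the full algorithm does. The function names are hypothetical.

```python
def median_rank(n):
    """1-based rank of the median: (n+1)/2 for odd n, n/2 (lower median) for even n."""
    return (n + 1) // 2 if n % 2 else n // 2


def pick_pivot(items, group_size=5):
    """Median of the group medians, using median_rank for both steps."""
    groups = [sorted(items[k:k + group_size])
              for k in range(0, len(items), group_size)]
    medians = sorted(g[median_rank(len(g)) - 1] for g in groups)
    return medians[median_rank(len(medians)) - 1]


# The 60 numbers from the question: 12 groups of 5, so an even number of medians.
pivot = pick_pivot(list(range(1, 61)))
print(pivot)
```

With 12 group medians (3, 8, ..., 58) the lower median is the 6th smallest; choosing the upper median instead would also be valid under the quoted definition.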

vectorized indexing of matrices with other matrices (in octave)

Suppose we have a 2D (5x5) matrix:
test =
39 13 90 5 71
60 78 38 4 11
87 92 46 45 35
40 96 61 17 1
90 50 46 89 63
And a second 2D (5x2) matrix:
tidx =
1 3
2 4
2 3
2 4
4 5
And now we want to use tidx as an index into test, so that we get the following output:
out =
39 90
78 4
92 46
96 17
89 63
One way to do this is with a for loop...
for i=1:size(test,1)
out(i,:) = test(i,tidx(i,:));
end
Question:
Is there a way to vectorize this so the same output is generated without a for loop?
Here is one way:
test(repmat([1:rows(test)]',1,columns(tidx)) + (tidx-1)*rows(test))
What you describe is an index problem. When you place a matrix all in one dimension, you get
test(:) =
39
60
87
40
90
13
78
92
96
50
90
38
46
61
46
5
4
45
17
89
71
11
35
1
63
This can be indexed using a single number. Here is how you figure out how to transform tidx into the correct format.
First, I use the above reference to figure out the index numbers which are:
outinx =
1 11
7 17
8 13
9 19
20 25
Then I start trying to figure out the pattern. This calculation gives a clue:
(tidx-1)*rows(test) =
0 10
5 15
5 10
5 15
15 20
This will move the index count to the correct column of test. Now I just need the correct row.
outinx-(tidx-1)*rows(test) =
1 1
2 2
3 3
4 4
5 5
This pattern is created by the for loop. I created that matrix with:
[1:rows(test)]' * ones(1,columns(tidx))
*EDIT: This does the same thing with a built in function.
repmat([1:rows(test)]',1,columns(tidx))
I then add the 2 together and use them as the index for test.
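The index arithmetic above can be double-checked outside Octave. The sketch below rebuilds the column-major flattening test(:) and the linear indices in plain Python, using the matrices from the question; the variable names mirror the ones above but the code itself is only an illustration.

```python
test = [
    [39, 13, 90, 5, 71],
    [60, 78, 38, 4, 11],
    [87, 92, 46, 45, 35],
    [40, 96, 61, 17, 1],
    [90, 50, 46, 89, 63],
]
tidx = [[1, 3], [2, 4], [2, 3], [2, 4], [4, 5]]

n_rows = len(test)
n_cols = len(test[0])

# Column-major flattening, like test(:) in Octave.
flat = [test[r][c] for c in range(n_cols) for r in range(n_rows)]

# Linear index = row number + (column number - 1) * number of rows (all 1-based).
outinx = [[(r + 1) + (col - 1) * n_rows for col in row]
          for r, row in enumerate(tidx)]

# Look the values up in the flattened matrix (convert back to 0-based).
out = [[flat[k - 1] for k in row] for row in outinx]
print(outinx)
print(out)
```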

Finding a set of permutations, with a constraint

I have a set of N^2 numbers and N bins. Each bin is supposed to have N numbers from the set assigned to it. The problem I am facing is finding a set of distributions that map the numbers to the bins, satisfying the constraint, that each pair of numbers can share the same bin only once.
A distribution can nicely be represented by an NxN matrix, in which each row represents a bin. Then the problem is finding a set of permutations of the matrix' elements, in which each pair of numbers shares the same row only once. It's irrelevant which row it is, only that two numbers were both assigned to the same one.
Example set of 3 permutations satisfying the constraint for N=8:
0 1 2 3 4 5 6 7
8 9 10 11 12 13 14 15
16 17 18 19 20 21 22 23
24 25 26 27 28 29 30 31
32 33 34 35 36 37 38 39
40 41 42 43 44 45 46 47
48 49 50 51 52 53 54 55
56 57 58 59 60 61 62 63
0 8 16 24 32 40 48 56
1 9 17 25 33 41 49 57
2 10 18 26 34 42 50 58
3 11 19 27 35 43 51 59
4 12 20 28 36 44 52 60
5 13 21 29 37 45 53 61
6 14 22 30 38 46 54 62
7 15 23 31 39 47 55 63
0 9 18 27 36 45 54 63
1 10 19 28 37 46 55 56
2 11 20 29 38 47 48 57
3 12 21 30 39 40 49 58
4 13 22 31 32 41 50 59
5 14 23 24 33 42 51 60
6 15 16 25 34 43 52 61
7 8 17 26 35 44 53 62
A permutation that doesn't belong in the above set:
0 10 20 30 32 42 52 62
1 11 21 31 33 43 53 63
2 12 22 24 34 44 54 56
3 13 23 25 35 45 55 57
4 14 16 26 36 46 48 58
5 15 17 27 37 47 49 59
6 8 18 28 38 40 50 60
7 9 19 29 39 41 51 61
It doesn't belong because of multiple collisions with the second permutation: for example, both of them put the numbers 0 and 32 in the same row.
Enumerating three is easy: the set consists of one arbitrary permutation, its transpose, and a matrix whose rows are made of the previous matrix's diagonals.
I can't find a way to produce a set consisting of more though. It seems to be either a very complex problem, or a simple problem with an unobvious solution. Either way I'd be thankful if somebody had any ideas how to solve it in reasonable time for the N=8 case, or identified the proper, academic name of the problem, so I could google for it.
In case you were wondering what is it useful for, I'm looking for a scheduling algorithm for a crossbar switch with 8 buffers, which serves traffic to 64 destinations. This part of the scheduling algorithm is input traffic agnostic, and switches cyclically between a number of hardwired destination-buffer mappings. The goal is to have each pair of destination addresses compete for the same buffer only once in the cycling period, and to maximize that period's length. In other words, so that each pair of addresses was competing for the same buffer as seldom as possible.
EDIT:
Here's some code I have.
CODE
It's greedy, it usually terminates after finding the third permutation. But there should exist a set of at least N permutations satisfying the problem.
The alternative would require that choosing permutation I involved looking for permutations (I+1..N), to check if permutation I is part of the solution consisting of the maximal number of permutations. That'd require enumerating all permutations to check at each step, which is prohibitively expensive.
What you want is a combinatorial block design. Using the nomenclature on the linked page, you want a design with parameters (v, k, lambda) = (n^2, n, 1). Such a design has b = n(n+1) blocks (rows, in your nomenclature), which group into n+1 permutations. This is the maximum theoretically possible by a counting argument (see the explanation in the article for the derivation of b from v, k, and lambda). Such designs exist for n = p^m for some prime p and positive integer m, using an affine plane. It is conjectured that the only affine planes that exist are of this size. Therefore, if you can select n, maybe this answer will suffice.
However, if instead of the maximum theoretically possible number of permutations, you just want to find a large number (the most you can for a given n^2), I am not sure what the study of these objects is called.
Make a 64 x 64 x 8 array: bool forbidden[i][j][k] which indicates whether the pair (i,j) has appeared in row k. Each time you use the pair (i, j) in the row k, you will set the associated value in this array to one. Note that you will only use the half of this array for which i < j.
To construct a new permutation, start by trying the member 0, and verify that at least seven of the entries forbidden[0][j][0] are unset. If there are not seven left, increment and try again. Repeat to fill out the rest of the row. Repeat this whole process to fill the entire NxN permutation.
There are probably optimizations you should be able to come up with as you implement this, but this should do pretty well.
Possibly you could reformulate your problem into graph theory. For example, you start with the complete graph with N×N vertices. At each step, you partition the graph into N N-cliques, and then remove all edges used.
For this N=8 case, K64 has 64×63/2 = 2016 edges, and sixty-four lots of K8 (eight partitions of eight cliques each) have 64×28 = 1792 edges, so your problem may not be impossible :-)
Right, the greedy style doesn't work because you run out of numbers.
It's easy to see that there can't be more than 63 permutations before you violate the constraint. On the 64th, you'll have to pair at least one of the numbers with another it's already been paired with, by the pigeonhole principle.
In fact, if you use the table of forbidden pairs I suggested earlier, you find that there are a maximum of only N+1 = 9 permutations possible before you run out. The table has N^2 x (N^2-1)/2 = 2016 non-redundant constraints, and each new permutation creates N x (N choose 2) = 8 x 28 = 224 new pairings. So all the pairings will be used up after 2016/224 = 9 permutations. It seems like realizing that there are so few permutations is the key to solving the problem.
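The counting argument can be checked for general N with a few lines of Python: there are C(N^2, 2) distinct pairs in total, and every permutation has N rows that each use up C(N, 2) of them. The function name is illustrative only.

```python
from math import comb


def max_permutations(N):
    """Upper bound on the number of permutations from pair counting."""
    total_pairs = comb(N * N, 2)            # all unordered pairs of the N^2 numbers
    pairs_per_permutation = N * comb(N, 2)  # N rows, C(N, 2) pairs in each row
    return total_pairs // pairs_per_permutation


print(comb(64, 2), 8 * comb(8, 2), max_permutations(8))
```

Algebraically the ratio simplifies to N+1, so the division is exact for every N.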
You can generate a list of N permutations numbered n = 0 ... N-1 as
A_ij = (i * N + j + j * n * N) mod N^2
which generates a new permutation by shifting the columns in each permutation. The top row of the nth permutation are the diagonals of the n-1th permutation. EDIT: Oops... this only appears to work when N is prime.
This misses one last permutation, which you can get by transposing the matrix:
A_ij = j * N + i
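A short sketch to test this construction: it builds the N shifted permutations plus the transposed one from the two formulas above, then checks whether any pair of numbers shares a row more than once. The function names are hypothetical; the check agrees with the note that the formula only works for prime N.

```python
from itertools import combinations


def build_permutations(N):
    """N shifted permutations from the first formula, plus the transposed one."""
    perms = [
        [[(i * N + j + j * n * N) % (N * N) for j in range(N)] for i in range(N)]
        for n in range(N)
    ]
    perms.append([[j * N + i for j in range(N)] for i in range(N)])
    return perms


def pairs_share_row_at_most_once(perms):
    """True if no unordered pair of numbers appears together in two rows."""
    seen = set()
    for perm in perms:
        for row in perm:
            for pair in combinations(sorted(row), 2):
                if pair in seen:
                    return False
                seen.add(pair)
    return True


for N in (3, 5, 7, 8):
    print(N, pairs_share_row_at_most_once(build_permutations(N)))
```

For N = 8 the check fails, for instance because 0 and 4 share a row in both the n = 0 and n = 2 permutations, so the N = 8 case needs a different construction (such as one based on an affine plane over a finite field of order 8).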
