Finding a set of permutations, with a constraint - algorithm

I have a set of N^2 numbers and N bins. Each bin is supposed to have N numbers from the set assigned to it. The problem I am facing is finding a set of distributions that map the numbers to the bins, satisfying the constraint, that each pair of numbers can share the same bin only once.
A distribution can nicely be represented by an NxN matrix, in which each row represents a bin. Then the problem is finding a set of permutations of the matrix' elements, in which each pair of numbers shares the same row only once. It's irrelevant which row it is, only that two numbers were both assigned to the same one.
Example set of 3 permutations satisfying the constraint for N=8:
0 1 2 3 4 5 6 7
8 9 10 11 12 13 14 15
16 17 18 19 20 21 22 23
24 25 26 27 28 29 30 31
32 33 34 35 36 37 38 39
40 41 42 43 44 45 46 47
48 49 50 51 52 53 54 55
56 57 58 59 60 61 62 63
0 8 16 24 32 40 48 56
1 9 17 25 33 41 49 57
2 10 18 26 34 42 50 58
3 11 19 27 35 43 51 59
4 12 20 28 36 44 52 60
5 13 21 29 37 45 53 61
6 14 22 30 38 46 54 62
7 15 23 31 39 47 55 63
0 9 18 27 36 45 54 63
1 10 19 28 37 46 55 56
2 11 20 29 38 47 48 57
3 12 21 30 39 40 49 58
4 13 22 31 32 41 50 59
5 14 23 24 33 42 51 60
6 15 16 25 34 43 52 61
7 8 17 26 35 44 53 62
A permutation that doesn't belong in the above set:
0 10 20 30 32 42 52 62
1 11 21 31 33 43 53 63
2 12 22 24 34 44 54 56
3 13 23 25 35 45 55 57
4 14 16 26 36 46 48 58
5 15 17 27 37 47 49 59
6 8 18 28 38 40 50 60
7 9 19 29 39 41 51 61
Because of multiple collisions with the second permutation, since, for example they're both pairing the numbers 0 and 32 in one row.
Enumerating three is easy, it consists of 1 arbitrary permutation, its transposition and a matrix where the rows are made of the previous matrix' diagonals.
I can't find a way to produce a set consisting of more though. It seems to be either a very complex problem, or a simple problem with an unobvious solution. Either way I'd be thankful if somebody had any ideas how to solve it in reasonable time for the N=8 case, or identified the proper, academic name of the problem, so I could google for it.
In case you were wondering what is it useful for, I'm looking for a scheduling algorithm for a crossbar switch with 8 buffers, which serves traffic to 64 destinations. This part of the scheduling algorithm is input traffic agnostic, and switches cyclically between a number of hardwired destination-buffer mappings. The goal is to have each pair of destination addresses compete for the same buffer only once in the cycling period, and to maximize that period's length. In other words, so that each pair of addresses was competing for the same buffer as seldom as possible.
EDIT:
Here's some code I have.
CODE
It's greedy, it usually terminates after finding the third permutation. But there should exist a set of at least N permutations satisfying the problem.
The alternative would require that choosing permutation I involved looking for permutations (I+1..N), to check if permutation I is part of the solution consisting of the maximal number of permutations. That'd require enumerating all permutations to check at each step, which is prohibitively expensive.

What you want is a combinatorial block design. Using the nomenclature on the linked page, you want designs of size (n^2, n, 1) for maximum k. This will give you n(n+1) permutations, using your nomenclature. This is the maximum theoretically possible by a counting argument (see the explanation in the article for the derivation of b from v, k, and lambda). Such designs exist for n = p^k for some prime p and integer k, using an affine plane. It is conjectured that the only affine planes that exist are of this size. Therefore, if you can select n, maybe this answer will suffice.
However, if instead of the maximum theoretically possible number of permutations, you just want to find a large number (the most you can for a given n^2), I am not sure what the study of these objects is called.

Make a 64 x 64 x 8 array: bool forbidden[i][j][k] which indicates whether the pair (i,j) has appeared in row k. Each time you use the pair (i, j) in the row k, you will set the associated value in this array to one. Note that you will only use the half of this array for which i < j.
To construct a new permutation, start by trying the member 0, and verify that at least seven of forbidden[0][j][0] that are unset. If there are not seven left, increment and try again. Repeat to fill out the rest of the row. Repeat this whole process to fill the entire NxN permutation.
There are probably optimizations you should be able to come up with as you implement this, but this should do pretty well.

Possibly you could reformulate your problem into graph theory. For example, you start with the complete graph with N×N vertices. At each step, you partition the graph into N N-cliques, and then remove all edges used.
For this N=8 case, K64 has 64×63/2 = 2016 edges, and sixty-four lots of K8 have 1792 edges, so your problem may not be impossible :-)

Right, the greedy style doesn't work because you run out of numbers.
It's easy to see that there can't be more than 63 permutations before you violate the constraint. On the 64th, you'll have to pair at least one of the numbers with another its already been paired with. The pigeonhole principle.
In fact, if you use the table of forbidden pairs I suggested earlier, you find that there are a maximum of only N+1 = 9 permutations possible before you run out. The table has N^2 x (N^2-1)/2 = 2016 non-redundant constraints, and each new permutation will create N x (N choose 2) = 28 new pairings. So all the pairings will be used up after 2016/28 = 9 permutations. It seems like realizing that there are so few permutations is the key to solving the problem.
You can generate a list of N permutations numbered n = 0 ... N-1 as
A_ij = (i * N + j + j * n * N) mod N^2
which generates a new permutation by shifting the columns in each permutation. The top row of the nth permutation are the diagonals of the n-1th permutation. EDIT: Oops... this only appears to work when N is prime.
This misses one last permutation, which you can get by transposing the matrix:
A_ij = j * N + i

Related

How do I use the median of 3 rules to find the pivot point?

Suppose I have a list of numbers like this:
2 23 19 3 5 28 31 67 44 35 6 33 19 45
How do I find the pivot point using the median of 3 method?
I thought it involves adding the first and last numbers,and then dividing it by 2, but apparently that's wrong.

Pyramidal algorithm

I'm trying to find an algorithm in which i can go through a numerical pyramid, starting for the top of the pyramid and go forward through adjacent numbers in the next row and each number has to be added to a final sum. The thing is, i have to find the route that returns the highest result.
I already tried to go throught the higher adjacent number in next row, but that is not the answer, because it not always get the best route.
I.E.
34
43 42
67 89 68
05 51 32 78
72 25 32 49 40
If i go through highest adjacent number, it is:
34 + 43 + 89 + 51 + 32 = 249
But if i go:
34 + 42 + 68 + 78 + 49 = 269
In the second case the result is higher, but i made that route by hand and i can't think in an algorithm that get the highest result in all cases.
Can anyone give me a hand?
(Please tell me if I did not express myself well)
Start with the bottom row. As you go from left to right, consider the two adjacent numbers. Now go up one row and compare the sum of the number that is above the two numbers, in the row above, with each of the numbers below. Select the larger sum.
Basically you are looking at the triangles formed by the bottom row and the row above. So for your original triangle,
34
43 42
67 89 68
05 51 32 78
72 25 32 49 40
the bottom left triangle looks like,
05
72 25
So you would add 72 + 05 = 77, as that is the largest sum between 72 + 05 and 25 + 05.
Similarly,
51
25 32
will give you 51 + 32 = 83.
If you continue this approach for each two adjacent numbers and the number above, you can discard the bottom row and replace the row above with the computed sums.
So in this case, the second to last row becomes
77 83 81 127
and your new pyramid is
34
43 42
67 89 68
77 83 81 127
Keep doing this and your pyramid starts shrinking until you have one number which is the number you are after.
34
43 42
150 172 195
34
215 237
Finally, you are left with one number, 271.
Starting at the bottom (row by row), add the highest value of both the values under each element to that element.
So, for your tree, 05 for example, will get replaced by max(72, 25) + 05 = 77. Later you'll add the maximum of that value and the new value for the 51 element to 67.
The top-most node will be the maximum sum.
Not to spoil all your fun, I'll leave the implementation to you, or the details of getting the actual path, if required.

How does the "successive passes in opposite direction" improvement work for bubble sort?

According to Data Structures Using C by Tenenbaum, one of the improvements of bubble sort is to have successive passes go in opposite direction so that the small elements move quickly to the front which will reduce the required number of passes [pg 336].
I worked out two examples, one which supports this statement and other which is against this one.
Supports: 25 48 37 12 57 86 33 92
iterations using usual Bubble sort :
25 48 37 12 57 86 33 92
25 37 12 48 57 33 86 92
25 12 37 48 33 57 86 92
12 25 37 33 48 57 86 92
12 25 33 37 48 57 86 92
iterations using improvement:
25 48 37 12 57 86 33 92
25 37 12 48 57 33 86 92
12 25 37 33 48 57 86 92
12 25 33 37 48 57 86 92
against: 3 4 1 2 5
iterations using usual Bubble sort:
3 4 1 2 5
3 1 2 4 5
1 2 3 4 5
iterations using improvement:
3 4 1 2 5
3 1 2 4 5
1 3 2 4 5
1 2 3 4 5
So is the statement incorrect that this improvement will always help? Or I am doing something wrong here ?
The example you gave above shows that this algorithm isn't a strict improvement over a standard bubble sort.
The advantage of this approach (sometimes called "cocktail sort," by the way) is that in cases where there are a lot of small elements at the end of the array, it rapidly pulls them to the front compared against normal bubble sort. For example, consider this array:
2 3 4 5 6 7 8 9 10 11 12 ... 10,000,000 1
With a normal bubble sort, it would take 9,999,999 passes over this array to sort it because the element 1, which is way out of place, only gets swapped one step forward on each iteration. On the other hand, with a cocktail sort, this would take just two passes - one initial pass and then a reverse pass.
While the above example is definitely contrived, in a randomly-shuffled array, there are likely going to be some smaller elements toward the end of the array and the number of passes of bubblesort is going to have to be large to move them back. Going in both directions helps speed this up.
That said, bubblesort is a pretty poor choice of a sorting algorithm, so hopefully this is just a theoretical discussion. :-)

Selecting the "P" in Prune and Search Algorithm

Note: the diagram above shows a partition into groups of 5 (the columns). The horizontal box denotes the median values of each partition. The 'P' item indicates the median of medians.
Most of the researches that I saw have this picture in Selecting their "P" and it always have an odd numbers of elements. But What if the numbers elements you have are even?
ex.
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50
51 52 53 54 55 56 57 58 59 60
how do you get your "P" in an even set of elements?
This explanation gives the detail I think you're looking for:
https://www.cs.duke.edu/courses/summer10/cps130/files/Edelsbrunner_Median.pdf
The median of the set plays a special role in this algorithm, and it
is defined as the i-smallest item where i = (n+1)/2 if n is odd and i =
n/2 or (n+2)/2 if n is even.

Understanding "median of medians" algorithm

I want to understand "median of medians" algorithm on the following example:
We have 45 distinct numbers divided into 9 group with 5 elements each.
48 43 38 33 28 23 18 13 8
49 44 39 34 29 24 19 14 9
50 45 40 35 30 25 20 15 10
51 46 41 36 31 26 21 16 53
52 47 42 37 32 27 22 17 54
The first step is sorting every group (in this case they are already sorted)
Second step recursively, find the "true" median of the medians (50 45 40 35 30 25 20 15 10) i.e. the set will be divided into 2 groups:
50 25
45 20
40 15
35 10
30
sorting these 2 groups
30 10
35 15
40 20
45 25
50
the medians is 40 and 15 (in case the numbers are even we took left median)
so the returned value is 15 however "true" median of medians (50 45 40 35 30 25 20 15 10) is 30, moreover there are 5 elements less then 15 which are much less than 30% of 45 which are mentioned in wikipedia
and so T(n) <= T(n/5) + T(7n/10) + O(n) fails.
By the way in the Wikipedia example, I get result of recursion as 36. However, the true median is 47.
So, I think in some cases this recursion may not return true median of medians. I want to understand where is my mistake.
The problem is in the step where you say to find the true median of the medians. In your example, you had these medians:
50 45 40 35 30 25 20 15 10
The true median of this data set is 30, not 15. You don't find this median by splitting the groups into blocks of five and taking the median of those medians, but instead by recursively calling the selection algorithm on this smaller group. The error in your logic is assuming that median of this group is found by splitting the above sequence into two blocks
50 45 40 35 30
and
25 20 15 10
then finding the median of each block. Instead, the median-of-medians algorithm will recursively call itself on the complete data set 50 45 40 35 30 25 20 15 10. Internally, this will split the group into blocks of five and sort them, etc., but it does so to determine the partition point for the partitioning step, and it's in this partitioning step that the recursive call will find the true median of the medians, which in this case will be 30. If you use 30 as the median as the partitioning step in the original algorithm, you do indeed get a very good split as required.
Hope this helps!
Here is the pseudocode for median of medians algorithm (slightly modified to suit your example). The pseudocode in wikipedia fails to portray the inner workings of the selectIdx function call.
I've added comments to the code for explanation.
// L is the array on which median of medians needs to be found.
// k is the expected median position. E.g. first select call might look like:
// select (array, N/2), where 'array' is an array of numbers of length N
select(L,k)
{
if (L has 5 or fewer elements) {
sort L
return the element in the kth position
}
partition L into subsets S[i] of five elements each
(there will be n/5 subsets total).
for (i = 1 to n/5) do
x[i] = select(S[i],3)
M = select({x[i]}, n/10)
// The code to follow ensures that even if M turns out to be the
// smallest/largest value in the array, we'll get the kth smallest
// element in the array
// Partition array into three groups based on their value as
// compared to median M
partition L into L1<M, L2=M, L3>M
// Compare the expected median position k with length of first array L1
// Run recursive select over the array L1 if k is less than length
// of array L1
if (k <= length(L1))
return select(L1,k)
// Check if k falls in L3 array. Recurse accordingly
else if (k > length(L1)+length(L2))
return select(L3,k-length(L1)-length(L2))
// Simply return M since k falls in L2
else return M
}
Taking your example:
The median of medians function will be called over the entire array of 45 elements like (with k = 45/2 = 22):
median = select({48 49 50 51 52 43 44 45 46 47 38 39 40 41 42 33 34 35 36 37 28 29 30 31 32 23 24 25 26 27 18 19 20 21 22 13 14 15 16 17 8 9 10 53 54}, 45/2)
The first time M = select({x[i]}, n/10) is called, array {x[i]} will contain the following numbers: 50 45 40 35 30 20 15 10.
In this call, n = 45, and hence the select function call will be M = select({50 45 40 35 30 20 15 10}, 4)
The second time M = select({x[i]}, n/10) is called, array {x[i]} will contain the following numbers: 40 20.
In this call, n = 9 and hence the call will be M = select({40 20}, 0).
This select call will return and assign the value M = 20.
Now, coming to the point where you had a doubt, we now partition the array L around M = 20 with k = 4.
Remember array L here is: 50 45 40 35 30 20 15 10.
The array will be partitioned into L1, L2 and L3 according to the rules L1 < M, L2 = M and L3 > M. Hence:
L1: 10 15
L2: 20
L3: 30 35 40 45 50
Since k = 4, it's greater than length(L1) + length(L2) = 3. Hence, the search will be continued with the following recursive call now:
return select(L3,k-length(L1)-length(L2))
which translates to:
return select({30 35 40 45 50}, 1)
which will return 30 as a result. (since L has 5 or fewer elements, hence it'll return the element in kth i.e. 1st position in the sorted array, which is 30).
Now, M = 30 will be received in the first select function call over the entire array of 45 elements, and the same partitioning logic which separates the array L around M = 30 will apply to finally get the median of medians.
Phew! I hope I was verbose and clear enough to explain median of medians algorithm.

Resources