Weekly group assignment algorithm

A friend of mine who is a teacher has 23 students in a class. They want an algorithm that assigns the students to groups of 2, plus one group of 3 (to handle the odd number of students), across 14 weeks, such that no pair of students repeats across the 14 weeks (a set of pairings is assigned to one week).
A brute-force approach would be too inefficient, so I was thinking of other approaches; matrix representation sounds appealing, and so does graph theory. Does anyone have any ideas? The problems I could find deal only with a single week, and this answer I couldn't quite figure out.

A round-robin algorithm will do the trick, I think.
Add the remaining student to the second group and you are done.
First run
1 2 3 4 5 6 7 8 9 10 11 12
23 22 21 20 19 18 17 16 15 14 13
Second run
1 23 2 3 4 5 6 7 8 9 10 11
22 21 20 19 18 17 16 15 14 13 12
...
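The rotation above is the classic circle method. Here is a sketch in Python (round_robin_weeks is a made-up name, not from the answer): a dummy 24th seat makes the count even, one seat stays fixed while the rest rotate, and whoever draws the dummy in a given week joins the first pair as the third member. The 11 scheduled pairs never repeat across rounds; note, though, that the student absorbed into the trio may meet those two classmates again in a later scheduled pair.

```python
def round_robin_weeks(n_students=23, weeks=14):
    """Circle-method round robin: one fixed seat, the rest rotate.
    A dummy seat makes the player count even; whoever draws the dummy
    is the leftover student and joins the first pair as its third member."""
    players = list(range(1, n_students + 1)) + [None]   # None = dummy seat
    half = len(players) // 2
    schedule = []
    for _ in range(weeks):
        top = players[:half]
        bottom = players[half:][::-1]
        groups, leftover = [], None
        for a, b in zip(top, bottom):
            if a is None or b is None:
                leftover = b if a is None else a        # paired with the dummy
            else:
                groups.append([a, b])
        groups[0].append(leftover)                      # form the single group of 3
        schedule.append(groups)
        # rotate every seat except the first
        players = [players[0], players[-1]] + players[1:-1]
    return schedule
```

Taking 14 of the 23 possible rounds gives 14 weeks of 10 pairs plus one trio, with no scheduled pair repeated.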

Another possibility is graph matching; you would need 14 edge-disjoint matchings, one per week.

Try to describe the problem in terms of constraints.
Then pass the constraints to a tool like ECLiPSe (not Eclipse), see http://eclipseclp.org/.
In fact, your problem seems similar to that of the Golf example on that site (http://eclipseclp.org/examples/golf.ecl.txt).

Here's an example in Haskell that produces groups of 14 non-repeating 11-pair combinations. The value 'pairs' is all pairs drawn from 1 to 23 (e.g., [1,2], [1,3], etc.). The program then builds lists where each list is 14 lists of 11 pairs (chosen from 'pairs') such that no pair is repeated and no single number is repeated within one list of 11 pairs. It's up to you to place the missing last student for each week as you see fit. (It took about three minutes of computation before it started to output results.)
import Data.List
import Control.Monad

pairs = nubBy (\x y -> reverse x == y)
      $ filter (\x -> length (nub x) == length x)
      $ replicateM 2 [1..23]

solve = solve' []
  where
    solve' results =
      if length results == 14
        then return results
        else solveOne []
      where
        solveOne result =
          if length result == 11
            then solve' (result : results)
            else do
              next <- pairs
              guard (notElem (head next) result'
                     && notElem (last next) result'
                     && notElem next results')
              solveOne (next : result)
          where
            result' = concat result
            results' = concat results
One sample from the output:
[[[12,17],[10,19],[9,18],[8,22],[7,21],[6,23],[5,11],[4,14],[3,13],[2,16],[1,15]],
[[12,18],[11,19],[9,17],[8,21],[7,23],[6,22],[5,10],[4,15],[3,16],[2,13],[1,14]],
[[12,19],[11,18],[10,17],[8,23],[7,22],[6,21],[5,9],[4,16],[3,15],[2,14],[1,13]],
[[15,23],[14,22],[13,17],[8,18],[7,19],[6,20],[5,16],[4,9],[3,10],[2,11],[1,12]],
[[16,23],[14,21],[13,18],[8,17],[7,20],[6,19],[5,15],[4,10],[3,9],[2,12],[1,11]],
[[16,21],[15,22],[13,19],[8,20],[7,17],[6,18],[5,14],[4,11],[3,12],[2,9],[1,10]],
[[16,22],[15,21],[14,20],[8,19],[7,18],[6,17],[5,13],[4,12],[3,11],[2,10],[1,9]],
[[20,21],[19,22],[18,23],[12,13],[11,14],[10,15],[9,16],[4,5],[3,6],[2,7],[1,8]],
[[20,22],[19,21],[17,23],[12,14],[11,13],[10,16],[9,15],[4,6],[3,5],[2,8],[1,7]],
[[20,23],[18,21],[17,22],[12,15],[11,16],[10,13],[9,14],[4,7],[3,8],[2,5],[1,6]],
[[19,23],[18,22],[17,21],[12,16],[11,15],[10,14],[9,13],[4,8],[3,7],[2,6],[1,5]],
[[22,23],[18,19],[17,20],[14,15],[13,16],[10,11],[9,12],[6,7],[5,8],[2,3],[1,4]],
[[21,23],[18,20],[17,19],[14,16],[13,15],[10,12],[9,11],[6,8],[5,7],[2,4],[1,3]],
[[21,22],[19,20],[17,18],[15,16],[13,14],[11,12],[9,10],[7,8],[5,6],[3,4],[1,2]]]

Start off with a set for each student (maybe a bitset over the students, for less memory consumption) that contains all the other students. Iterate 14 times, each time picking 11 students (for the 11 groups you will form) for whom you will pick partners. For each of these students, pick a partner they haven't been in a group with yet. For one random student of those 11, pick a second partner, but make sure no student ends up with fewer remaining partners than there are iterations left. After every pick, update the sets.
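A sketch of this set-based idea in Python (the names greedy_schedule and try_week are made up). Instead of the look-ahead rule ("no student may end up with fewer remaining partners than weeks left"), this simplified version just retries a week with a fresh shuffle, and restarts from scratch if it gets stuck; in practice that succeeds quickly for 23 students and 14 weeks.

```python
import random

def greedy_schedule(n=23, weeks=14, seed=1):
    """allowed[s] holds the classmates s has not been grouped with yet.
    Retry a week on failure; restart the whole schedule if a week
    cannot be completed after many shuffles."""
    rng = random.Random(seed)
    for _ in range(2000):                       # full restarts
        allowed = {s: set(range(n)) - {s} for s in range(n)}
        schedule = []
        for _ in range(weeks):
            week = None
            for _ in range(200):                # retries for this week
                week = try_week(n, allowed, rng)
                if week is not None:
                    break
            if week is None:
                break                           # stuck: restart from scratch
            for group in week:                  # update the partner sets
                for a in group:
                    for b in group:
                        if a != b:
                            allowed[a].discard(b)
            schedule.append(week)
        if len(schedule) == weeks:
            return schedule
    return None

def try_week(n, allowed, rng):
    """Pair everyone but one random leftover via backtracking, then
    attach the leftover to a pair whose members they haven't met."""
    order = list(range(n))
    rng.shuffle(order)
    leftover, rest = order[0], order[1:]
    pairs = []

    def backtrack(remaining):
        if not remaining:
            return True
        a = remaining[0]
        partners = [b for b in remaining[1:] if b in allowed[a]]
        rng.shuffle(partners)
        for b in partners:
            pairs.append([a, b])
            if backtrack([x for x in remaining[1:] if x != b]):
                return True
            pairs.pop()
        return False

    if not backtrack(rest):
        return None
    for p in pairs:
        if all(m in allowed[leftover] for m in p):
            p.append(leftover)                  # the week's group of 3
            return pairs
    return None
```

Here even trio co-membership counts as "having been grouped", which is slightly stricter than the question requires.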

Related

Finding the best combination of elements with the max total parameter value

I have 100 elements. Each element has 4 features A,B,C,D. Each feature is an integer.
I want to select 2 elements for each feature, so that I have selected a total of 8 distinct elements. I want to maximize the sum of the 8 selected features A,A,B,B,C,C,D,D.
A greedy algorithm would be to select the 2 elements with highest A, then the two elements with highest B among the remaining elements, etc. However, this might not be optimal, because the elements that have highest A could also have a much higher B.
Do we have an algorithm to solve such a problem optimally?
This can be solved as a minimum-cost flow problem; in particular, it is an assignment problem.
First of all, note that we only need the 8 best elements of each feature, so 32 elements at most. It should even be possible to cut the search space further: if the 2 best elements of A are not among the 6 best elements of any other feature, we can already assign those 2 elements to A, and each other feature then only needs to look at its 6 best elements. (If it's not clear why, I'll try to explain further.)
Then we create the vertices S, T, Fa, Fb, Fc, Fd and E1, E2, ..., E32, with the following edges:
for each vertex Fx, an edge from S to Fx with maximum flow 2 and weight 0 (as we want 2 elements for each feature);
for each vertex Ei, an edge from Fx to Ei if Ei is one of the top elements of feature x, with maximum flow 1 and weight equal to the negative of Ei's value for feature x (negative because the algorithm finds the minimum cost);
for each vertex Ei, an edge from Ei to T, with maximum flow 1 and weight 0 (as each element can only be selected once).
I'm not sure if this is the best way, but it should work.
As suggested by @AloisChristen, this can be written as an assignment problem:
On the one side, we select the 8 best elements for each feature; that's 32 elements or fewer, since one element might be in the best 8 for more than one feature.
On the other side, we put 8 seats A, A, B, B, C, C, D, D.
Then we solve the resulting assignment problem.
Here the problem is solved using scipy's linear_sum_assignment optimization function:
from numpy.random import randint
from numpy import argpartition, unique, concatenate
from scipy.optimize import linear_sum_assignment

# PARAMETERS
n_elements = 100
n_features = 4
n_per_feature = 2

# RANDOM DATA
data = randint(0, 21, (n_elements, n_features))  # random integer features between 0 and 20 included

# SELECT BEST 8 CANDIDATES FOR EACH FEATURE
n_selected = n_features * n_per_feature
n_candidates = n_selected * n_features
idx = argpartition(data, range(-n_candidates, 0), axis=0)
idx = unique(idx[-n_selected:].ravel())
candidates = data[idx]
n_candidates = candidates.shape[0]

# SOLVE ASSIGNMENT PROBLEM
cost_matrix = -concatenate((candidates, candidates), axis=1)  # 8 columns in order ABCDABCD
element_idx, seat_idx = linear_sum_assignment(cost_matrix)
score = -cost_matrix[element_idx, seat_idx].sum()

# DISPLAY RESULTS
print('SUM OF SELECTED FEATURES: {}'.format(score))
for e, s in zip(element_idx, seat_idx):
    print('{:2d}'.format(idx[e]),
          'ABCDABCD'[s],
          -cost_matrix[e, s],
          data[idx[e]])
Output:
SUM OF SELECTED FEATURES: 160
3 B 20 [ 5 20 14 11]
4 A 20 [20 9 3 12]
6 C 20 [ 3 3 20 8]
10 A 20 [20 10 9 9]
13 C 20 [16 12 20 18]
23 D 20 [ 6 10 4 20]
24 B 20 [ 5 20 6 8]
27 D 20 [20 13 19 20]

Generate a unique identifier, such as a hash, every N minutes, but it has to be the same within each N-minute timeframe, without storing data

I want to create a unique identifier, such as a small hash, every N minutes, but the result should be the same within each N-minute timeframe, without storing data.
Examples when N is 10 minutes:
0 > 10 = 25ba38ac9
10 > 20 = 900605583
20 > 30 = 6156625fb
30 > 40 = e130997e3
40 > 50 = 2225ca027
50 > 60 = 3b446db34
Between minute 1 and 10 I get "25ba38ac9", but for anything between 10 and 20 I get "900605583", etc.
I have no starting/example code because I have no idea which algorithm could produce the desired result.
I did not tag this question with a specific language because I am interested in the logic, not the final code, but I do appreciate documented code as an example.
Pick your favourite hash function h. Pick your favourite string sugar. To get a hash at time t, append the Euclidean quotient of t divided by N to sugar, and apply h to the result.
Example in Python:
h = lambda x: hex(abs(hash(x)))  # note: Python's str hash is randomized per process (PYTHONHASHSEED)
sugar = 'Samir'

def hash_for_N_minutes(t, N=10):
    return h(sugar + str(t // N))

for t in range(0, 30, 2):
    print(t, hash_for_N_minutes(t, 10))
Output:
0 0xeb3d3abb787c890
2 0xeb3d3abb787c890
4 0xeb3d3abb787c890
6 0xeb3d3abb787c890
8 0xeb3d3abb787c890
10 0x45e2d2a970323e9f
12 0x45e2d2a970323e9f
14 0x45e2d2a970323e9f
16 0x45e2d2a970323e9f
18 0x45e2d2a970323e9f
20 0x334dce1d931e5da8
22 0x334dce1d931e5da8
24 0x334dce1d931e5da8
26 0x334dce1d931e5da8
28 0x334dce1d931e5da8
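One caveat with the example above: because Python's built-in hash() of a string is randomized per process, the values shown are only stable within a single run. If the identifier must survive restarts and match across machines, a cryptographic hash from hashlib can be used instead. A sketch with a made-up name, stable_bucket_id, keeping the same t // N bucketing idea:

```python
import hashlib

def stable_bucket_id(t, n=10, salt='Samir'):
    """Same identifier for every t in one n-minute bucket, and stable
    across runs (unlike built-in hash(), which is seeded per process)."""
    bucket = t // n                       # Euclidean quotient: the bucket index
    digest = hashlib.sha256(f'{salt}:{bucket}'.encode()).hexdigest()
    return digest[:9]                     # short id, like the examples above
```

stable_bucket_id(0) and stable_bucket_id(9) return the same 9-character id; stable_bucket_id(10) starts a new one.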
Weakness of this hash method, and suggested improvement
Of course, nothing stops you from inputting a time in the future, so anyone can easily answer the question "What will the hash be in exactly one hour?"
If you want future hashes to be unpredictable, you can combine the value t // N with a real-world value that depends on the time, is not known in advance, but is kept on record.
Two well-known time series fit this criterion: weather data and stock market data.
See also this 2008 xkcd comic: https://xkcd.com/426/

Getting minimum possible number after performing operations on array elements

Question: Given an integer n denoting the number of particles initially,
and an array of the sizes of these particles.
These particles can go into any number of simulations (possibly none).
In one simulation, two particles combine into another particle whose size is the difference between their sizes (possibly 0).
Find the smallest particle that can be formed.
Constraints:
n <= 1000
size <= 1e9
Example 1
3
30 10 8
Output
2
Explanation: 10 - 8 = 2 is the smallest we can achieve.
Example 2
4
1 2 4 8
Output
1
Explanation: We cannot make another 1 (to reach 0), so the smallest without any simulation is 1.
Example 3
5
30 27 26 10 6
Output
0
Explanation: 30 - 26 = 4, 10 - 6 = 4, 4 - 4 = 0.
My thinking: I can only think of the brute-force solution, which will obviously time out. Can anyone help me out with just the approach? I think it's related to dynamic programming.
I think this can be solved in O(n^2 log n).
Consider your third example: 30 27 26 10 6.
Sort the input to make it: 6 10 26 27 30.
Build a list of differences for each (i, j) combination:
i = 1 -> 4, 20, 21, 24
i = 2 -> 16, 17, 20
i = 3 -> 1, 4
i = 4 -> 3
There is no list for i = 5. Why? Because its combinations with the other particles were already considered in the earlier lists.
Now consider the cases below:
Case 1
Particle i is not combined with any other particle yet. This means some other pair of particles, not involving i, has been combined.
This suggests that we need to search for A[i] in the lists j = 1 to N, except for j = i.
Get the nearest value; this can be done using binary search, because the difference lists are sorted. Your candidate result is then |A[i] - NearestValueFound|.
Case 2
Particle i is combined with some other particle.
Take i = 1 above, and suppose it is combined with particle 2. The result is 4.
So search for 4 in all the lists except list 2 - because we consider particle 2 already combined with particle 1, we shouldn't search list 2.
Do we have a best match? Yes: 4 is found in list 3. The match needn't be 0 in general - here it happens to be 0, so just return 0.
Repeat cases 1 and 2 for all particles. The time complexity is O(n^2 log n), because for each i you do a binary search on every list except list i.
import itertools as it

N = int(input())
nums = list()
for i in range(N):
    nums.append(int(input()))
_min = min(nums)

def go(li):
    global _min
    if len(li) > 1:
        for i in it.combinations(li, 2):
            temp = abs(i[0] - i[1])
            if _min > temp:
                _min = temp
            k = li.copy()
            k.remove(i[0])
            k.remove(i[1])
            k.append(temp)
            go(k)

go(nums)
print(_min)
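A different way to look at the brute force above: replacing two particles by their difference amounts to attaching + and - signs to the sizes, so (by the standard "last stone weight" argument) the reachable results are exactly the absolute values of signed sums of non-empty subsets. The sketch below (min_particle is a made-up name) enumerates reachable signed sums directly; the set of sums can grow exponentially, so with n up to 1000 and sizes up to 1e9 this is only practical for small instances.

```python
def min_particle(sizes):
    """Minimum |s| over signed sums s of non-empty subsets of sizes."""
    all_sums = {0}        # signed sums of any subset, including the empty one
    nonempty = set()      # signed sums of non-empty subsets only
    for a in sizes:
        # extend every known subset sum by +a or -a (these use a, so non-empty)
        with_a = {s + a for s in all_sums} | {s - a for s in all_sums}
        nonempty |= with_a
        all_sums |= with_a
    return min(abs(s) for s in nonempty)
```

On the three examples above this returns 2, 1 and 0 respectively.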

How do I find the lowest numbers in a square table, only one per column and one per row [duplicate]

Suppose we have a table of numbers like this (we can assume it is a square table):
20 2 1 3 4
5 1 14 8 9
15 12 17 17 11
16 1 1 15 18
20 13 15 5 11
Your job is to calculate the maximum sum of n numbers where n is the number of rows or columns in the table. The catch is each number must come from a unique row and column.
For example, selecting the numbers at (0,0), (1,1), (2,2), (3,3), and (4,4) is acceptable, but (0,0), (0,1), (2,2), (3,3), and (4,4) is not because the first two numbers were pulled from the same row.
My (laughable) solution to this problem is iterating through all the possible permutations of the rows and columns. This works for small grids but, of course, it is incredibly slow as n gets big. It has O(n!) time complexity, if I'm not mistaken (sample Python code below).
I really think this can be solved in better time, but I'm not coming up with anything sufficiently clever.
So my question is, what algorithm should be used to solve this?
If it helps, this problem seems similar to the
knapsack problem.
import itertools
import re

grid = """20 2 1 3 4
5 1 14 8 9
15 12 17 17 11
16 1 1 15 18
20 13 15 5 11"""
grid = [[int(x) for x in re.split(r"\s+", line)] for line in grid.split("\n")]

possible_column_indexes = itertools.permutations(range(len(grid)))

max_sum = 0
max_positions = []
for column_indexes in possible_column_indexes:
    current_sum = 0
    current_positions = []
    for row, col in enumerate(column_indexes):
        current_sum += grid[row][col]
        current_positions.append("(%d, %d)" % (row, col))
    if current_sum > max_sum:
        max_sum = current_sum
        max_positions = current_positions

print("Max sum is", max_sum)
for position in max_positions:
    print(position)
This is the maximum cost bipartite matching problem. The classical way to solve it is by using the Hungarian algorithm.
Basically you have a bipartite graph: the left set is the rows and the right set is the columns. Each edge from row i to column j has cost matrix[i, j]. Find the matching that maximizes the costs.
For starters, you can use dynamic programming.
In your straightforward approach, you are doing exactly the same computation many, many times.
For example, at some point you answer the question: "For the last three columns with rows 1 and 2 already taken, how do I maximize the sum?" You compute the answer to this question twice, once when you pick row 1 from column 1 and row 2 from column 2, and once when you pick them vice-versa.
So don't do that. Cache the answer -- and also cache all similar answers to all similar questions -- and re-use them.
I do not have time right now to analyze the running time of this approach. I think it is O(2^n) or thereabouts. More later maybe...
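To make the memoization idea concrete: process the rows in order and cache, for each set of already-used columns, the best achievable total. That is the bitmask dynamic program sketched below (max_assignment_sum is a made-up name). It runs in O(2^n * n) time, a big improvement over O(n!):

```python
from functools import lru_cache

def max_assignment_sum(grid):
    """Best sum picking one cell per row with all-distinct columns."""
    n = len(grid)

    @lru_cache(maxsize=None)
    def best(row, used):  # used = bitmask of columns already taken
        if row == n:
            return 0
        return max(grid[row][col] + best(row + 1, used | (1 << col))
                   for col in range(n)
                   if not used & (1 << col))

    return best(0, 0)
```

On the 5x5 grid from the question this returns 82, e.g. via cells (0,0), (1,2), (2,3), (3,4) and (4,1). (The Hungarian algorithm mentioned above solves the same problem in polynomial time; this DP is just the cheapest correct upgrade of the permutation loop.)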

