Background: I'm working on a "matchmaking system" for a small multiplayer video game side project. Every player has a rank from 0-10, every team has 4 players. I'm trying to find a good way to balance out the teams so that the average rank of both of them is as close as possible and the match is as fair as possible.
My current, flawed approach looks like this:
def create_teams(players)
  teams = Hash.new { |hash, team| hash[team] = [] }
  players.sort_by(&:rank).each_slice(2) do |slice|
    teams[:team1] << slice[0]
    teams[:team2] << slice[1]
  end
  teams
end
This works decently well if the ranks are already pretty similar but it's not a proper solution to this problem.
For example, it fails in a situation like this:
require "ostruct"
class Array
  def avg
    sum.fdiv(size)
  end
end
dummy_players = [9, 5, 5, 3, 3, 3, 2, 0].map { |rank| OpenStruct.new(rank: rank) }
teams = create_teams(dummy_players)
teams.each do |team, players|
  ranks = players.map(&:rank)
  puts "#{team} - ranks: #{ranks.inspect}, avg: #{ranks.avg}"
end
This results in pretty unfair teams:
team1 - ranks: [0, 3, 3, 5], avg: 2.75
team2 - ranks: [2, 3, 5, 9], avg: 4.75
Instead, I'd like the teams in this situation to be like this:
team1 - ranks: [0, 3, 3, 9], avg: 3.75
team2 - ranks: [2, 3, 5, 5], avg: 3.75
If there are n players, where n is an even number, there are
C(n) = n!/((n/2)!(n/2)!)
ways to partition the n players into two teams of n/2 players, where n! denotes n factorial. This is often expressed as the number of ways of choosing n/2 items from a collection of n items.
To obtain the partition that has a minimum absolute difference in total ranks (and hence, in mean ranks), one would have to enumerate all C(n) partitions. If n = 8, as in this example, C(8) = 70 (see, for example, this online calculator). If, however, n = 16, then C(16) = 12,870 and C(32) = 601,080,390. This gives you an idea of how small n must be in order to perform a complete enumeration.
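As a quick check of those counts, here is a minimal Python sketch (the function name team_splits is only illustrative, and math.comb assumes Python 3.8 or later):

```python
from math import comb

def team_splits(n):
    """C(n): number of ways to choose n/2 of the n players for one team."""
    return comb(n, n // 2)

for n in (8, 16, 32):
    print(n, team_splits(n))
```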
If n is too large to enumerate all combinations you must resort to using a heuristic, or a subjective rule for partitioning the array of ranks. Here are two possibilities:
assign the highest rank element ("rank 1") to team A, assign elements with ranks 2 and 3 to team B, assign elements with ranks 4 and 5 to team A, and so on.
assign elements with ranks 1 and n to team A, elements with ranks 2 and n-1 to team B, and so on.
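The first heuristic can be sketched in Python as follows. This is only an illustration: the name snake_split is hypothetical, and the input is assumed to be a plain list of numeric ranks rather than player objects.

```python
def snake_split(ranks):
    """Best player to A, next two to B, next two to A, and so on."""
    ordered = sorted(ranks, reverse=True)
    team_a, team_b = [], []
    for position, rank in enumerate(ordered):
        # Positions 0, 3, 4, 7, 8, ... go to team A; 1, 2, 5, 6, ... go to team B.
        if ((position + 1) // 2) % 2 == 0:
            team_a.append(rank)
        else:
            team_b.append(rank)
    return team_a, team_b

team_a, team_b = snake_split([9, 5, 5, 3, 3, 3, 2, 0])
print(team_a, sum(team_a))
print(team_b, sum(team_b))
```

For the ranks in the question this happens to give two teams with equal totals, but that is not guaranteed for other inputs.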
The trouble with heuristics is evaluating their effectiveness. For this problem, for every heuristic you devise there is an array of ranks for which the heuristic's performance is abysmal. If you know the universe of possible arrays of ranks and have a way of drawing unbiased samples you can evaluate the heuristic statistically. That generally is not possible, however.
Here is how you could examine all partitions. Suppose:
ranks = [3, 3, 0, 2, 5, 9, 3, 5]
Then we may perform the following calculations.
indices = ranks.size.times.to_a
#=> [0, 1, 2, 3, 4, 5, 6, 7]
team_a = indices.combination(ranks.size/2).min_by do |combo|
  team_b = indices - combo
  (combo.sum { |i| ranks[i] } - team_b.sum { |i| ranks[i] }).abs
end
#=> [0, 1, 2, 5]
team_b = indices - team_a
#=> [3, 4, 6, 7]
See Array#combination and Enumerable#min_by.
We see that team A players have ranks:
arr = ranks.values_at(*team_a)
#=> [3, 3, 0, 9]
and the sum of those ranks is:
arr.sum
#=> 15
Similarly, for team B:
arr = ranks.values_at(*team_b)
#=> [2, 5, 3, 5]
arr.sum
#=> 15
See Array#values_at.
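For readers who prefer Python, the same exhaustive search might look like the sketch below. The function name balanced_split is an assumption for illustration; it relies only on itertools.combinations and the built-in min.

```python
from itertools import combinations

def balanced_split(ranks):
    """Try every way of choosing half the indices and keep the most balanced one."""
    indices = range(len(ranks))
    total = sum(ranks)
    # |sum(A) - sum(B)| equals |total - 2 * sum(A)|, so team B need not be built here.
    team_a = min(
        combinations(indices, len(ranks) // 2),
        key=lambda combo: abs(total - 2 * sum(ranks[i] for i in combo)),
    )
    team_b = [i for i in indices if i not in team_a]
    return list(team_a), team_b

ranks = [3, 3, 0, 2, 5, 9, 3, 5]
team_a, team_b = balanced_split(ranks)
print([ranks[i] for i in team_a], [ranks[i] for i in team_b])
```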
The input is an array of cards. In one move, you can remove any group of consecutive identical cards. For removing k cards, you get k * k points. Find the maximum number of points you can get per game.
Time limit: O(n^4)
Example:
Input: [1, 8, 7, 7, 7, 8, 4, 8, 1]
Output: 23
Does anyone have an idea how to solve this?
To clarify, in the given example, one path to the best solution is
Remove   Points   Total   New hand
3 7s     9        9       [1, 8, 8, 4, 8, 1]
1 4      1        10      [1, 8, 8, 8, 1]
3 8s     9        19      [1, 1]
2 1s     4        23      []
Approach
Recursion would fit well here.
First, identify the contiguous sequences in the array -- one lemma of this problem is that if you decide to remove at least one 7, you want to remove the entire sequence of three. From here on, you'll work with both cards and quantities. For instance,
card = [1, 8, 7, 8, 4, 8, 1]
quant = [1, 1, 3, 1, 1, 1, 1]
Now you're ready for the actual solving. Iterate through the array. For each element, remove that element, and add the score for that move.
Check to see whether the elements on either side match; if so, merge those entries. Recur on the remaining array.
For instance, here's the first turn of what will prove to be the optimal solution for the given input:
Choose and remove the three 7's
card = [1, 8, 8, 4, 8, 1]
quant = [1, 1, 1, 1, 1, 1]
score = score + 3*3
Merge the adjacent 8 entries:
card = [1, 8, 4, 8, 1]
quant = [1, 2, 1, 1, 1]
Recur on this game.
Improvement
Use dynamic programming: memoize the solution for every sub game.
Any card that appears only once in the card array can be removed first, without loss of generality. In the given example, you can remove the 7's and the single 4 to improve the remaining search tree.
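A minimal Python sketch of the recursion with memoization described above. It keys the cache on the whole remaining game (a tuple of (card, quantity) runs), so it is a direct illustration of the idea and is not claimed to meet the O(n^4) bound; the name max_points is hypothetical.

```python
from functools import lru_cache
from itertools import groupby

def max_points(cards):
    # Compress the hand into runs: [7, 7, 7] becomes (7, 3).
    runs = tuple((value, len(list(group))) for value, group in groupby(cards))

    @lru_cache(maxsize=None)
    def best(runs):
        if not runs:
            return 0
        top = 0
        for i, (_, count) in enumerate(runs):
            left, right = runs[:i], runs[i + 1:]
            if left and right and left[-1][0] == right[0][0]:
                # Neighbours now touch and match, so merge them into one run.
                merged = (left[-1][0], left[-1][1] + right[0][1])
                rest = left[:-1] + (merged,) + right[1:]
            else:
                rest = left + right
            top = max(top, count * count + best(rest))
        return top

    return best(runs)

print(max_points([1, 8, 7, 7, 7, 8, 4, 8, 1]))
```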
Here is a problem I ran into a few days ago.
Given a list of integer items, we want to partition the items into at most N nonoverlapping, consecutive bins, in a way that minimizes the maximum number of items in any bin.
For example, suppose we are given the items (5, 2, 3, 6, 1, 6), and we want 3 bins. We can optimally partition these as follows:
n < 3: 1, 2 (2 items)
3 <= n < 6: 3, 5 (2 items)
6 <= n: 6, 6 (2 items)
Every bin has 2 items, so we can’t do any better than that.
Can anyone share ideas about this question?
Given n bins and an array with p items, here is one greedy algorithm you could use.
To minimize the max number of items in a bin:
p <= n: try to use p bins.
    Simply try to put each item in its own bin. If you have duplicate numbers then your average will be unavoidably worse.
p > n: greedily use all bins but try to keep each one's member count near floor(p / n).
    1. Group duplicate numbers.
    2. Pad the largest duplicate bins that fall short of floor(p / n) with unique numbers to the left and right (if they exist).
    3. Count the number of bins you have and determine the number of mergers you need to make; let's call it r.
    4. Repeat the following r times:
       Check each possible neighbouring bin pairing; find and perform the minimum merger.
Example
{1,5,6,9,8,8,6,2,5,4,7,5,2,4,5,3,2,8,7,5}                        20 items to 4 bins
{1}{2, 2, 2}{3}{4, 4}{5, 5, 5, 5, 5}{6, 6}{7, 7}{8, 8, 8}{9}     1. sorted and grouped
{1, 2, 2, 2, 3}{4, 4}{5, 5, 5, 5, 5}{6, 6}{7, 7}{8, 8, 8, 9}     2. greedy capture by largest groups
{1, 2, 2, 2, 3}{4, 4}{5, 5, 5, 5, 5}{6, 6}{7, 7}{8, 8, 8, 9}     3. 6 bins but we want 4, so 2 mergers need to be made
{1, 2, 2, 2, 3}{4, 4}{5, 5, 5, 5, 5}{6, 6, 7, 7}{8, 8, 8, 9}     4. first merger
{1, 2, 2, 2, 3, 4, 4}{5, 5, 5, 5, 5}{6, 6, 7, 7}{8, 8, 8, 9}     4. second merger
So the minimum achievable max was 7.
Here is some pseudocode that will give you just one solution with the minimum bin quantity possible:
Sort the list of "Elements" with Element as a pair {Value, Quantity}.
So for example {5,2,3,6,1,6} becomes an ordered set:
Let S = {{1,1},{2,1},{3,1},{5,1},{6,2}}
Let A = the largest quantity of any particular value in the set
Let X = number of items in the list
Let N = number of bins
Let MinNum = ceiling ( X / N )
if A > MinNum then Let MinNum = A
Create an array BIN(1 to N+1) of pointers to linked lists of elements.
For I from 1 to N
    Remove as many elements from the front of S as fit while Bin(I) holds at most MinNum items
    and add them to Bin(I)
Next I
Let Bin(N+1) = any elements remaining in S
LOOP while Bin(N+1) not empty
    Let MinNum = MinNum + 1
    For I from 1 to N
        Remove as many elements from the front of Bin(N+1) as fit while Bin(I) holds at most MinNum items
        and add them to Bin(I)
    Next I
END LOOP
Your minimum bin size possible will be MinNum and BIN(1) to Bin(N) will contain the distribution of values.
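A runnable Python sketch in the spirit of that pseudocode: start from the lower bound max(ceiling(X / N), A) and raise the capacity until a left-to-right fill uses at most N bins. Unlike the pseudocode, it refills every bin from scratch for each candidate capacity so that the bins stay in value order; the name pack_bins and the return shape are assumptions made for illustration.

```python
from collections import Counter
from math import ceil

def pack_bins(items, n_bins):
    """Return (max bin size, bins); equal values always share a bin."""
    groups = sorted(Counter(items).items())  # (value, quantity) pairs in value order
    if not groups:
        return 0, []
    largest = max(quantity for _, quantity in groups)
    cap = max(ceil(len(items) / n_bins), largest)
    while True:
        bins, fill = [[]], 0
        for value, quantity in groups:
            if fill + quantity > cap:
                # The current bin is full enough, so open the next one.
                bins.append([])
                fill = 0
            bins[-1].extend([value] * quantity)
            fill += quantity
        if len(bins) <= n_bins:
            return cap, bins
        cap += 1

print(pack_bins([5, 2, 3, 6, 1, 6], 3))
```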
For a given N, I need to create an N*N matrix that has no repeated values in any row, any column, or either of the minor and major diagonals, where the values are 1, 2, 3, ..., N.
For N = 4 one of matrices is the following:
1 2 3 4
3 4 1 2
4 3 2 1
2 1 4 3
Problem overview
The mathematical structure you described is a Diagonal Latin Square. Constructing one is more of a mathematical problem than an algorithmic or programming one.
To understand what it is and how to create one, you should read the following articles:
Latin squares definition
Magic squares definition
Diagonal Latin square construction <-- p. 2 answers your question, with a proof and other interesting properties
Short answer
One possible way to construct a Diagonal Latin Square:
Let N be the order of the required matrix L.
If there exist numbers A and B in the range [0; N-1] that satisfy these properties:
A is relatively prime to N
B is relatively prime to N
(A + B) is relatively prime to N
(A - B) is relatively prime to N
then you can create the required matrix with the following rule:
L[i][j] = (A * i + B * j) mod N
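A minimal Python sketch of this rule. The function name is hypothetical, and 1 is added to each entry so the values run from 1 to N as in the question. It returns None when no suitable A and B exist, which is the case for N = 4, for example, because A and B would both have to be odd and then A + B is even.

```python
from math import gcd

def linear_diagonal_latin_square(n):
    """Build L[i][j] = (A*i + B*j) mod N, shifted to the values 1..N."""
    for a in range(n):
        for b in range(n):
            # A, B, A + B and A - B must all be relatively prime to N.
            if all(gcd(v, n) == 1 for v in (a, b, a + b, a - b)):
                return [[(a * i + b * j) % n + 1 for j in range(n)]
                        for i in range(n)]
    return None  # no suitable pair (A, B) for this N

for row in linear_diagonal_latin_square(5):
    print(row)
```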
It would be nice to do this mathematically, but I'll propose the simplest algorithm that I can think of - brute force.
At a high level
we can represent a matrix as an array of arrays
for a given N, construct S, a set of arrays that contains every permutation of [1..N]. There will be N! of these.
using a recursive and iterative selection process (e.g. a search tree), search through all orders of these arrays, backing out of a branch as soon as one of the 'uniqueness' rules is broken
For example, in your N = 4 problem, I'd construct
S = [
  [1,2,3,4], [1,2,4,3],
  [1,3,2,4], [1,3,4,2],
  [1,4,2,3], [1,4,3,2],
  [2,1,3,4], [2,1,4,3],
  [2,3,1,4], [2,3,4,1],
  [2,4,1,3], [2,4,3,1],
  [3,1,2,4], [3,1,4,2],
  // etc
]
R = new int[4][4]
Then the algorithm is something like
1. If R is 'full', you're done.
2. Evaluate whether the next row from S fits into R:
   if yes, insert it into R, reset the iterator on S, and go to 1.
   if no, increment the iterator on S.
3. If there are more rows to check in S, go to 2.
4. Else you've iterated across S and none of the rows fit, so remove the most recent row added to R, resume the iterator on S just past that row, and go to 1. In other words, explore another branch.
To improve the efficiency of this algorithm, implement a better data structure. Rather than a flat array of all combinations, use a prefix tree / Trie of some sort to both reduce the storage size of the 'options' and reduce the search area within each iteration.
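The row-by-row search above can be sketched in Python as follows. This version uses a flat list of permutations rather than a trie, and lets recursion handle the "remove the most recent row" step; the names fits and search are hypothetical.

```python
from itertools import permutations

def fits(rows, candidate):
    """Check a candidate row against the rows already placed in R."""
    r, n = len(rows), len(candidate)
    for k, prev in enumerate(rows):
        if any(x == y for x, y in zip(prev, candidate)):
            return False  # repeats a value in some column
        if prev[k] == candidate[r]:
            return False  # repeats a value on the major diagonal
        if prev[n - 1 - k] == candidate[n - 1 - r]:
            return False  # repeats a value on the minor diagonal
    return True

def search(n):
    options = list(permutations(range(1, n + 1)))  # the set S, N! rows
    rows = []

    def extend():
        if len(rows) == n:
            return True
        for candidate in options:
            if fits(rows, candidate):
                rows.append(candidate)
                if extend():
                    return True
                rows.pop()  # dead end: drop the row and try the next one
        return False

    return [list(row) for row in rows] if extend() else None

for row in search(4):
    print(row)
```

Because S has N! rows, this is only practical for small N.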
Here's a method (in Python) which is fast for N <= 9:
import random
def generate(n):
    a = [[0] * n for _ in range(n)]
    def rec(i, j):
        if i == n - 1 and j == n:
            return True
        if j == n:
            return rec(i + 1, 0)
        candidate = set(range(1, n + 1))
        for k in range(i):
            candidate.discard(a[k][j])
        for k in range(j):
            candidate.discard(a[i][k])
        if i == j:
            for k in range(i):
                candidate.discard(a[k][k])
        if i + j == n - 1:
            for k in range(i):
                candidate.discard(a[k][n - 1 - k])
        candidate_list = list(candidate)
        random.shuffle(candidate_list)
        for e in candidate_list:
            a[i][j] = e
            if rec(i, j + 1):
                return True
            a[i][j] = 0
        return False
    rec(0, 0)
    return a
for row in generate(9):
    print(row)
Output:
[8, 5, 4, 7, 1, 6, 2, 9, 3]
[2, 7, 5, 8, 4, 1, 3, 6, 9]
[9, 1, 2, 3, 6, 4, 8, 7, 5]
[3, 9, 7, 6, 2, 5, 1, 4, 8]
[5, 8, 3, 1, 9, 7, 6, 2, 4]
[4, 6, 9, 2, 8, 3, 5, 1, 7]
[6, 3, 1, 5, 7, 9, 4, 8, 2]
[1, 4, 8, 9, 3, 2, 7, 5, 6]
[7, 2, 6, 4, 5, 8, 9, 3, 1]
Introduction
While trying to do some categorization of nodes in a graph (which will be rendered differently), I find myself confronted with the following problem:
The Problem
Given a set of elements S = {0, 1, ..., M} and n subsets T_i of S (not necessarily disjoint), with 0 <= i < n, what is the best algorithm to find the partition P of S described below?
P is the collection of disjoint parts P_j, with 0 <= j < M, whose union is the original set S, such that all elements x in a given P_j have the same list of "parents" among the "original" sets T_i.
Example
S = [1, 2, 3, 4, 5, 6, 8, 9]
T_1 = [1, 4]
T_2 = [2, 3]
T_3 = [1, 3, 4]
So all P_js would be:
P_1 = [1, 4] # all elements x have the same list of "parents": T_1, T_3
P_2 = [2] # all elements x have the same list of "parents": T_2
P_3 = [3] # all elements x have the same list of "parents": T_2, T_3
P_4 = [5, 6, 8, 9] # all elements x have the same list of "parents": S (so they're not in any of the T_i)
Questions
What are good functions/classes in the Python packages to compute all P_js and the list of their "parents", ideally restricted to numpy and scipy? Perhaps there's already a function which does just that.
What is the best algorithm to find those partitions P_js and for each one, the list of "parents"? Let's note T_0 = S
I think the brute force approach would be to generate all 2-combinations of T sets and split them in at most 3 disjoint sets, which would be added back to the pool of T sets and then repeat the process until all resulting Ts are disjoint, and thus we've arrived at our answer - the set of P sets. A little problematic could be caching all the "parents" on the way there.
I suspect a dynamic programming approach could be used to optimize the algorithm.
Note: I would have loved to write the math parts in latex (via MathJax), but unfortunately this is not activated :-(
The following should be linear time (in the number of the elements in the Ts).
from collections import defaultdict
S = [1, 2, 3, 4, 5, 6, 8, 9]
T_1 = [1, 4]
T_2 = [2, 3]
T_3 = [1, 3, 4]
Ts = [S, T_1, T_2, T_3]
parents = defaultdict(int)
for i, T in enumerate(Ts):
    for elem in T:
        parents[elem] += 2 ** i
children = defaultdict(list)
for elem, p in parents.items():
    children[p].append(elem)
print(list(children.values()))
Result (the order of the groups depends on dict ordering and may differ between Python versions):
[[5, 6, 8, 9], [1, 4], [2], [3]]
The way I'd do this is to construct an M × n boolean array In where In(i, j) = (S_i ∈ T_j). You can construct that in O(Σ_j |T_j|), provided you can map an element of S onto its integer index in O(1), by scanning all of the sets T and marking the corresponding bit in In.
You can then read the "signature" of each element i directly from In by concatenating row i into a binary number of n bits. The signature is precisely the equivalence relationship of the partition you are seeking.
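A small Python sketch of that idea, using a dict of boolean lists in place of the array In so that elements of S need not be mapped to integer indices first. The function name and the return format are assumptions for illustration; each key is an element's signature, and its True positions name the "parent" sets.

```python
from collections import defaultdict

def partition_by_parents(S, Ts):
    # One row of In per element: signature[x][j] is True when x is in Ts[j].
    signature = {x: [False] * len(Ts) for x in S}
    for j, T in enumerate(Ts):
        for x in T:
            signature[x][j] = True
    # Elements with identical rows belong to the same part P_j.
    groups = defaultdict(list)
    for x in S:
        groups[tuple(signature[x])].append(x)
    return dict(groups)

S = [1, 2, 3, 4, 5, 6, 8, 9]
Ts = [[1, 4], [2, 3], [1, 3, 4]]
for sig, part in partition_by_parents(S, Ts).items():
    parents = [f"T_{j + 1}" for j, member in enumerate(sig) if member]
    print(part, parents)
```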
By the way, I'm in total agreement with you about Math markup. Perhaps it's time to mount a new campaign.