Best matchmaking algorithm for 5v5s within player pools of millions of players [closed] - algorithm

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 4 years ago.
Suppose that I am trying to create some sort of match making algorithm for my game. The game is similar to League or DOTA, whereby 5 players are pitted against 5 players. Suppose further that the player pool is gigantic (millions of players searching for a game at a time), and the job of the match maker is to put as many players as possible into many instances of 5v5 games. At this point, we do not worry at all about MMR, ELO or any player/party rating coming into play. We just want to place players into 5v5s.
My current brute-force algorithm is absolutely atrocious in scaling. It first tries to find all possible combinations of 5 player parties within the millions of players, then, it tries to find pairs of parties, while removing players from possible party matches if the players have already been used:
So, suppose I have 10 players and I want to find all possible 5v5s, I first transform them into bits and do bit shifting to find all possible combinations.
Players: ABCDEFGHIJ
1111100000 => ABCDE
1111010000 => ABCDF
1111001000 => ABCDG
1111000100 => ABCDH
1111000010 => ABCDI
1111000001 => ABCDJ
1110110000 => ABCEF
and so on...
Then out of all possible parties, I use 2 for loops to start trying to find pairs of parties:
ABCDE vs FGHIJ
ABCDF vs EGHIJ
ABCDG vs EFHIJ
and so on...
This algorithm has run time of O((nCr)^2). Because it tries to find all possible party combinations, just matchmaking 50 players would require 4.4891439e+12 operations, which is insane.
What is a better algorithm that doesn't go through all possible parties and brute-force this problem?
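To make the scaling problem concrete, the operation count quoted above can be reproduced in a couple of lines. This is only an arithmetic check; the function name `brute_force_operations` is made up for illustration.

```python
import math

def brute_force_operations(players: int, team_size: int = 5) -> int:
    """Count the party pairs examined when every party is compared with every other."""
    parties = math.comb(players, team_size)  # all possible parties of team_size players
    return parties ** 2                      # two nested loops over the party list

print(math.comb(50, 5))            # 2118760 possible 5-player parties
print(brute_force_operations(50))  # 4489143937600, i.e. about 4.49e12
```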

From your example, I gather that you don't care about gathering players by rating classes, but that you do care about balancing the resulting teams. Here's an algorithm that should get you a workable solution. Start by grabbing the first 9 players in the queue; call this the pool.
1. Compute the average rating of the players: avg = mean(pool)
2. Compute the target score for a team of 5: team_target = 5*avg
3. Find the combination of 5 players whose ratings have a sum closest to team_target (solved in several other postings). Make that team1.
4. Compute the total rating of the team: team1_rating = sum(team1)
5. Remove those five players from the pool. Put the remaining pool players onto team2.
6. Compute the rating of this remaining team of 4: team2_rating = sum(team2)
7. Subtract the ratings to get the rating of the needed 10th player: player_target = team1_rating - team2_rating
8. Grab the next 10 players in the queue; this is the new pool.
9. Find the pool player with the rating closest to player_target.
10. Put that player onto team2 and post the match team1 vs team2.
11. There are 9 players left in the pool; go back to step 1. Iterate as necessary.
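The steps above can be sketched roughly as follows. This is a minimal illustration, not a production matchmaker: the names `balanced_matches` and `take` are made up here, players are represented only by their ratings, and the sketch simply stops when fewer than 10 new players are available (a real service would keep waiting instead).

```python
from itertools import combinations, islice
from typing import Iterable, Iterator, List, Tuple

def take(stream: Iterator[float], n: int) -> List[float]:
    """Pull up to n ratings from the queue."""
    return list(islice(stream, n))

def balanced_matches(queue: Iterable[float]) -> Iterator[Tuple[List[float], List[float]]]:
    """Yield (team1, team2) rating lists using the 9-player pool, then 10-player pool scheme."""
    stream = iter(queue)
    pool = take(stream, 9)
    while len(pool) == 9:
        team_target = 5 * sum(pool) / len(pool)
        # Check all 9C5 = 126 index combinations for the sum closest to the target.
        best = min(combinations(range(9), 5),
                   key=lambda idx: abs(sum(pool[i] for i in idx) - team_target))
        team1 = [pool[i] for i in best]
        team2 = [pool[i] for i in range(9) if i not in best]
        player_target = sum(team1) - sum(team2)
        new_pool = take(stream, 10)
        if len(new_pool) < 10:
            return  # assumption: stop here; the leftover players are not matched
        pick = min(range(10), key=lambda i: abs(new_pool[i] - player_target))
        team2.append(new_pool.pop(pick))
        yield team1, team2
        pool = new_pool  # the 9 players left over seed the next match

for team1, team2 in balanced_matches(range(1, 30)):
    print(sum(team1), sum(team2))
```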
ADVANTAGES
This is a simple, linear algorithm that can handle a stream of input requests. Since the team size is fixed, it's O(N) on the length of the queue. The only part that's at all time-consuming is finding the team closest to the average rating, checking 9C5 = 126 possibilities, a pretty cheap overhead per match.
The space overhead is trivial: the high-water mark is handling 19 players at once.
PROBLEMS
You can have an unbalanced match if the distribution isn't smooth. For instance, a queue with one star player, such as (100, 5, 5, 5, 5, 6, 6, 6, 6 | 5, 5, 5, 5, 5, 6, 6, 6, 6, 6) will give you team ratings of 120 and 30 for the "best" pairing. If this is a functional problem for you, feel free to adjust, perhaps keeping a pool of outliers to handle until you get 10 high and/or 10 low ones.

Related

Deconflict constrained N team schedule

I am writing a perl program to schedule N number of teams to play each other team once as home team and once as visitor. We use two fields and two time periods. So up to eight teams play in a day. No team can play at the same time on both fields or play twice in the same day. Any team not playing for the day is put on the BYE list.
I have written the code to define all the required games. But when I try to schedule each game and remove it from the array of games to be played, I arrive at conditions where there are no games left that can satisfy the rules for field or time periods in a day. This is most pronounced if I do not shuffle the array of games to be played. Even with 8 teams, I get these conflicts near the end.
What is the logic to deconflict the schedule sequence?
Do not simply create a list of all games and select from that at random: you need an algorithm to create the rounds - the circular method being the "standard" one (see for instance the link in the comment by @David Eisenstat).
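As a rough sketch of the circular method for building rounds: one slot stays fixed while the others rotate, and each round pairs the slots from the outside in. The function name `round_robin` is made up for illustration, and an odd team count is handled with a dummy slot whose opponent sits the round out.

```python
from typing import List, Optional, Tuple

def round_robin(teams: List[str]) -> List[List[Tuple[str, str]]]:
    """Return rounds of pairings in which every team meets every other team once."""
    slots: List[Optional[str]] = list(teams)
    if len(slots) % 2:
        slots.append(None)  # dummy team: whoever draws it does not play this round
    n = len(slots)
    rounds = []
    for _ in range(n - 1):
        # Pair first with last, second with second-to-last, and so on.
        pairs = [(slots[i], slots[n - 1 - i]) for i in range(n // 2)]
        rounds.append([(a, b) for a, b in pairs if a is not None and b is not None])
        # Keep the first slot fixed and rotate all the others by one position.
        slots = [slots[0], slots[-1]] + slots[1:-1]
    return rounds

for number, games in enumerate(round_robin(["T1", "T2", "T3", "T4"]), start=1):
    print(number, games)
```

Each returned round can then be cut into days of at most 4 games as described in the rest of this answer.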
Once you have created the rounds you still have to define a calendar that will respect the limitation of 4 games per day (and no team playing more than once per day). This is straightforward: if one round fills exactly one or more days, i.e. if you have 8, 16, 24, ... teams, then you simply split each round into the requested number of days. But even if N is not a multiple of 8, there are no problems.
Let's keep things simple, and consider the case of N = 12, so each round requires one day and a half: on day 1 you select (randomly) 4 of the 6 games of round 1; on day 2 you select the 2 missing games of round 1, and 2 games of round 2, taking care to avoid that the same team plays twice in a day; finally on day 3 you complete round 2, and so on. Can we be sure that we will always be able to assign day 2 avoiding the duplication of a team? Yes, we can: when you assign the last two games from round 1, you have 4 teams affected; even if in round 2 there are no games between those 4, you only exclude 4 games from day 2, so you still have 2 games available for placement on that day.
Final notes: as you can see there's no need for a bye list. The only situation to deal with is when N is odd, and it is usually handled by adding a dummy team.
Regarding home vs visitors, you will need to repeat the full calendar a second time. Just note that it is not possible to have every team alternating between home and visitors at each round. For instance with 4 teams you may have TeamA (h) vs TeamB and TeamC (h) vs TeamD; at the second round you may still do TeamD (h) vs TeamA and TeamB (h) vs TeamC; but at the third round TeamA and TeamC must play each other, and both come from a visitor round. And the same holds for TeamB and TeamD, who both come from a home round.

Algorithm to Make a Set of Random Outcomes Approach a Specific Percentage

Currently, I have a pool of basketball players where I have a projected total of points for each player. Additionally, I have a normal distribution function that gives me a random drawing from a normal distribution for each player. Currently, I have an algorithm that calculates n unique random lineups of 8 players based on some constraints. Between each lineup, the normal distribution function runs again to produce new predictions for each player. Then the best lineup is produced for that specific set of predictions.
I would like to tweak this algorithm in the following way. I would like to have 4 tiers of maximum and minimum percentages where each player is assigned a tier. Within the number of lineups generated, I would like each specific player to occur with that frequency. So for example if I wanted to generate 10 lineups and player 1 is in tier 1 which requires the player to be between 50-60%, then the player would occur in 5-6 lineups ideally.
I'm struggling with how to modify my current algorithm to include this stipulation. Any thoughts would be greatly appreciated! I just don't know how to force each player within a specific range of percentages.
There are a lot of ways to do it.
Here is an easy approach. Keep current relative odds of being picked for each player. The actual probability is the relative odds divided by the sum of the odds. Each person starts with odds equal to the expected number of times they should be selected. Whenever someone is selected, their relative odds are reduced by 1. Once the odds reach 0 or go below it, that person is out of the pool.
This approach guarantees that each player will not be in more than a maximum number of teams. It makes it unlikely, but not impossible, that any given player will be in fewer teams than you want.
An easy way to solve that is to randomly round people's desired frequencies up and down to get the right integer count. Now everything has to come out even.
There is yet another problem, though: it is possible that you will not manage to fill all the teams. But if you go from the most popular player to the least, the odds of such failures should be acceptably low. Doubly so if you widen the ranges slightly by populating a few extra teams, then throwing away the ones that didn't work out.
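A minimal sketch of the relative-odds idea, under simplifying assumptions: quotas are integers, lineups are plain lists of names, and the original constraints and "best lineup" optimisation are ignored entirely. The names `draw_lineups` and `quotas` are made up for illustration.

```python
import random
from typing import Dict, List

def draw_lineups(quotas: Dict[str, int], lineup_size: int,
                 n_lineups: int, rng: random.Random) -> List[List[str]]:
    """Draw lineups where each player's chance is proportional to their remaining quota."""
    remaining = dict(quotas)
    lineups = []
    for _ in range(n_lineups):
        # Players whose odds have reached 0 are out of the pool.
        candidates = {p: w for p, w in remaining.items() if w > 0}
        if len(candidates) < lineup_size:
            break  # could not fill this lineup; the caller may retry or discard
        lineup = []
        for _ in range(lineup_size):
            names = list(candidates)
            pick = rng.choices(names, weights=[candidates[p] for p in names])[0]
            lineup.append(pick)
            del candidates[pick]   # a player appears at most once per lineup
            remaining[pick] -= 1   # selected: reduce the relative odds by 1
        lineups.append(lineup)
    return lineups

rng = random.Random(0)
print(draw_lineups({"p1": 6, "p2": 6, "p3": 4, "p4": 4, "p5": 2}, 2, 10, rng))
```

With integer quotas no player can exceed their maximum, which is the guarantee described above; the minimum is not enforced by this sketch.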
First draft
So if I understand correctly, you have N players that might appear in the first
position of the string. But you want them to be selected not at random, but according
to some percentage.
Now the first step is to normalize those percentages:
Alice 20%
Bob 40%
Charlie 10%
Doug 60%
Eric 30%
The sum is 160%, so you generate a random number from 1 to 160; say it's 97.
97 is more than 20, so subtract 20 and ignore Alice.
77 is more than 40, so subtract 40 and ignore Bob.
37 is more than 10, so subtract 10 and ignore Charlie.
27 is less than 60: Doug it is.
You can also pre-populate a 160-element array with 20 "Alice" indexes, 60 "Doug" indexes etc., and your player is players[array[random(160)]].
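The subtract-and-skip walk above is equivalent to looking the roll up in a list of running totals. A small sketch, where the helper name `pick_by_weight` is made up for illustration:

```python
import random
from bisect import bisect_left
from itertools import accumulate
from typing import Dict

def pick_by_weight(weights: Dict[str, int], roll: int) -> str:
    """Return the name whose cumulative bucket contains roll (1 <= roll <= total)."""
    names = list(weights)
    cumulative = list(accumulate(weights[name] for name in names))
    return names[bisect_left(cumulative, roll)]

weights = {"Alice": 20, "Bob": 40, "Charlie": 10, "Doug": 60, "Eric": 30}
print(pick_by_weight(weights, 97))  # Doug, matching the walk-through above

# A real draw would use a random roll between 1 and the total (160 here).
roll = random.randint(1, sum(weights.values()))
print(pick_by_weight(weights, roll))
```

Python's standard library also offers `random.choices(names, weights=...)`, which performs a weighted draw directly.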

Complex pairing algorithm - team tournament

I would like to ask for help with an algorithm I’ve been working on for quite some time. I actually programmed it a few years ago using greedy pairing mostly but I’m not satisfied. Any help would be greatly appreciated!
So getting down to business. I have an application for tournament play (beach volleyball to be precise, but it should work for any pair sport played in tournament format). The players show up on tournament day and get randomly-ish put together with other participants and against other random teams. Top focus is to play as much as possible, however the number of players isn't always divisible by the number of simultaneous playing spots. Therefore there will always be a number of players resting, standing the round out that is, and I'm trying to make sure this is as fair as possible by using 2 variables:
Rests (total number of rests during the day)
Rests in a row (Resting several games in a row, obviously)
The original concept of the tournament was mixing the teams with 1 male (m) and 1 female (f) in each team, playing against another team of m/f. However, the resting part is more important, and there are often many more players of one sex than the other (e.g. 20 f and 7 m). Instead of letting the males play every single round, the program should make teams of f/f playing against f/f. Same-sex vs f/m should be avoided though.
Players should get new partners every round and play against new teams every round. Preferably you should play with all players of the opposite sex before playing with someone again. Players are allowed to come and leave as they like, and also take a break at any time (voluntary rest).
I’ve looked into the stable marriage problem and the roommate problem, but my problem seems to be a mix of the two. Normally there will be two lists of players (m/f) and pairing, but under certain premises there should be teams made from just one list as described. Let me give you an example:
EXAMPLE:
43 players show up for a tournament with 6 courts.
17 Females (f) and 26 Males (m).
The 6 courts fit 12 teams with a total of 24 players per round.
Round 1
*12 m - 12 f*
*19 resting (5f, 14m)*
Round 2
5f and 14m have 1 rest and should play.
The best solution would be:
*4 f - 4 m*
*1f - 1m*
*4m - 4m*
*1f - 1m* (these players played last round as well).
In this example there will normally not be more than 1 rest in a row. If there would’ve been 49 players from the start, on the other hand..
In future updates, I’m also planning on letting the user choose number of players per team, and also to skip the m/f requisite.
Any thoughts?

How to calculate correlation amongst preferences?

I have to split a group of x people into 3 or 4 groups, most likely 3.
I want people to be happy, so I'm having each person rate the other members of the big group from 1 to (x-1).
How do I optimize preferences to create 3 groups?
Here is a method that is likely to get a good arrangement, even if it is not an optimal arrangement:
First create a ranking function that can take any pair of groupings and determine whether one is better than the other. Then apply the following algorithm:
1. Randomly assign people into groups.
2. Randomly pick one person from each group.
3. Create new groupings in which each combination of reassignments is performed on the people chosen in step 2. (For 3 groups there will be 6 such reassignments. For 4, 24.)
4. Of all possible reassignments, pick the best one.
5. Repeat steps 2–4 one million times.
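A rough sketch of this loop follows. The ranking function is left to the caller: `score` is a hypothetical stand-in that returns a number for a whole grouping (higher is better), and `improve_groups` is a made-up name.

```python
import random
from itertools import permutations
from typing import Callable, List

def improve_groups(groups: List[List[int]], score: Callable[[List[List[int]]], float],
                   rounds: int, rng: random.Random) -> List[List[int]]:
    """Repeatedly pick one person per group and keep the best way of reshuffling them."""
    groups = [list(g) for g in groups]  # work on a copy
    for _ in range(rounds):
        picks = [rng.randrange(len(g)) for g in groups]  # one seat per group
        chosen = [g[i] for g, i in zip(groups, picks)]
        best_perm, best_score = None, None
        # Try every reassignment of the chosen people to the chosen seats
        # (3! = 6 for three groups); the unchanged arrangement is among them.
        for perm in permutations(chosen):
            for g, i, person in zip(groups, picks, perm):
                g[i] = person
            current = score(groups)
            if best_score is None or current > best_score:
                best_perm, best_score = perm, current
        for g, i, person in zip(groups, picks, best_perm):
            g[i] = person
    return groups

# Toy ranking: people are numbers and prefer group-mates with nearby numbers.
def closeness(groups: List[List[int]]) -> float:
    return -sum(abs(a - b) for g in groups for a in g for b in g)

start = [[0, 3, 6], [1, 4, 7], [2, 5, 8]]
print(improve_groups(start, closeness, 200, random.Random(1)))
```

Because the unchanged arrangement is always one of the candidates, the score can never get worse from one round to the next.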
UPDATE
If there are only 18 people that need to be assigned, then that's just (18 choose 6) * (12 choose 6) / 6 = 2,858,856 possible groupings. (Or, in the case of four groups it's (18 choose 4) * (14 choose 4) * (10 choose 5) / 4 = 192,972,780 groupings.)
You can just try each one and pick the best.
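The two counts can be checked directly. The divisors account for groups of equal size being interchangeable: 3! orderings for three groups of 6, and 2! * 2! for two groups of 4 plus two groups of 5.

```python
from math import comb, factorial

# Three groups of 6 out of 18 people; the three groups are interchangeable.
three_groups = comb(18, 6) * comb(12, 6) * comb(6, 6) // factorial(3)

# Groups of 4, 4, 5, 5; the two 4-groups and the two 5-groups are interchangeable.
four_groups = comb(18, 4) * comb(14, 4) * comb(10, 5) * comb(5, 5) // (factorial(2) * factorial(2))

print(three_groups)  # 2858856
print(four_groups)   # 192972780
```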
I guess the ranking algorithm itself is really the hard part of this assignment.
You could just give each person a score based on summing the scores of the people selected to be in their group, then sum the scores of each person together.
The problem is that you're going to end up with all the popular people in one group, and all the unpopular people in another group, and all the telephone handset cleaners in another group.
You should just assign people randomly, and then tell them that you used some really scientific system. That way everybody gets a good mix.
Measure the total satisfaction of a given configuration by calculating the distance between the actual positions and the stated preferences. Start with a randomized set of groups. Then use something like hill climbing or simulated annealing to optimise.
http://en.wikipedia.org/wiki/Hill_climbing
http://en.wikipedia.org/wiki/Simulated_annealing
Simulated annealing sounds complicated, but it's really just a cleverer version of hill-climbing.

Fair matchmaking for online games [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed last year.
Most online games arbitrarily form teams. Oftentimes it's up to the user, and they'll choose a fast server with a free slot. This behavior produces unfair teams and people rage quit. By tracking a player's statistics (or any statistics that can be gathered), how can you choose teams that are as fair as possible?
One of the more well-known systems now is Microsoft's TrueSkill algorithm.
People have also attempted to adapt the Elo system for team matchmaking, though it's more designed for 1-v-1 pairings.
After my previous answer, I realized that if you wanted to get really fancy you could use a really simple but powerful idea: Markov Chains.
The intuitive idea behind using a Markov Chain goes something like this:
1. Create a graph G=(V,E).
2. Let each vertex in V represent an entity.
3. Let each edge in E represent a transition probability between entities. This means that the probabilities on the outgoing edges of each vertex must sum to 1.
4. At the start (time t=0), assign each entity a unit value of 1.
5. At each time step, transition from entity i to entity j according to the transition probability defined in step 3.
6. Let t->infinity; the value of each entity at t=infinity is then the equilibrium (that is, the chance of a transition into an entity is the same as the total chance of a transition out of it).
This idea has for example been used successfully to implement Google's PageRank algorithm. To describe how you can use it consider the following:
V = players, E = probability of transitioning from player to player based on relative win/loss ratios
1. Each player is a vertex.
2. An edge from player A to B (B is not equal to A) has probability X/N where N is the total number of games played by A and X is the total games lost to B. Add an edge from A to A with probability M/N where M is the total number of games won by A.
3. Assign a skill level of 1 to each player at the start.
4. Use the Power Method to find the dominant eigenvector of the link matrix constructed from the edge probabilities defined in step 2.
5. The dominant eigenvector is the amount of skill each player has at t=infinity, that is, the amount of skill each player has once the Markov chain has come to equilibrium. This is a very robust measure of each player's skill using the topology of the win/loss space.
Some caveats: there are several problems when applying this directly. The biggest problem will be separated webs (that is, your Markov chain will not be irreducible and so the power method will not be guaranteed to converge). Lucky for you, Google has dealt with all these problems and more when implementing their PageRank algorithm, and all that remains for you is to look up how they circumvent these problems if you are so inclined.
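A small pure-Python sketch of the power iteration on such a win/loss chain. Several things here are assumptions rather than part of the answer: the function name `markov_skill`, the `(winner, loser)` input format, the PageRank-style damping factor of 0.85 (one common way around the reducibility caveat), a self-loop for players with no games, and starting skills normalised to sum to 1 instead of 1 each.

```python
from typing import Dict, List, Tuple

def markov_skill(players: List[str], results: List[Tuple[str, str]],
                 damping: float = 0.85, iterations: int = 200) -> Dict[str, float]:
    """Estimate skill as the stationary distribution of the win/loss Markov chain."""
    index = {p: i for i, p in enumerate(players)}
    n = len(players)
    counts = [[0.0] * n for _ in range(n)]
    for winner, loser in results:
        w, l = index[winner], index[loser]
        counts[l][w] += 1  # edge loser -> winner: X games lost to that opponent
        counts[w][w] += 1  # self edge: M games won
    rows = []
    for i, row in enumerate(counts):
        total = sum(row)   # N, the games played by player i
        if total == 0:
            rows.append([1.0 if j == i else 0.0 for j in range(n)])
        else:
            rows.append([c / total for c in row])
    skill = [1.0 / n] * n
    for _ in range(iterations):
        # Damped update: a small uniform share plus the flow along the edges.
        nxt = [(1.0 - damping) / n] * n
        for a in range(n):
            for b in range(n):
                nxt[b] += damping * skill[a] * rows[a][b]
        skill = nxt
    return dict(zip(players, skill))

games = [("A", "B")] * 3 + [("B", "C")] * 3 + [("A", "C")] * 2 + [("B", "A")]
print(markov_skill(["A", "B", "C"], games))
```

Skill flows from losers to winners, so a player who mostly wins accumulates value while a player who never wins keeps only the small uniform share.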
One way would be to simply create a list of players looking for matches at any given time, sorted by player rank. Once you've reached enough people to start a new match (or perhaps, two less than the required), group them as such:
Remove best and worst player and put them on team 1
Remove now-best and now-worst player (really second-best and second worst) and put them on team 2
If there are only two players left, place each one on different teams, depending on who has the lowest combined score. Otherwise, repeat:
Remove now-best and now-worst and put them on team 1
Remove now-best and now-worst and put them on team 2
etc. etc. etc. until your teams are filled.
If you decided to start a new match with fewer players than required, this is the point where the players wait for new people to join. As soon as a new person joins, put them on the open team with the lowest combined score.
Alternatively, if you wanted to avoid games that combined good and bad players on the same team, you could split up everyone into tiers, (groups based on their ranking) and only match people within the same tier. This would require a new open/sorted list for each extra tier.
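The best/worst grouping and the late-join rule can be sketched as follows. The function names `split_teams` and `add_late_player` are made up for illustration, and the sketch assumes an even number of players is available when the teams are first formed.

```python
from typing import Dict, List, Tuple

def split_teams(ratings: Dict[str, int]) -> Tuple[List[str], List[str]]:
    """Alternate best+worst pairs between two teams, then split the last two players."""
    ordered = sorted(ratings, key=ratings.get, reverse=True)  # best player first
    team1: List[str] = []
    team2: List[str] = []
    while len(ordered) >= 4:
        team1 += [ordered.pop(0), ordered.pop()]  # current best and worst
        team2 += [ordered.pop(0), ordered.pop()]  # next best and next worst
    if len(ordered) == 2:
        strong, weak = ordered
        # The stronger leftover joins the team with the lower combined score.
        if sum(ratings[p] for p in team2) <= sum(ratings[p] for p in team1):
            team2.append(strong)
            team1.append(weak)
        else:
            team1.append(strong)
            team2.append(weak)
    return team1, team2

def add_late_player(player: str, ratings: Dict[str, int], team1: List[str],
                    team2: List[str], team_size: int) -> List[str]:
    """Put a late joiner on the open team with the lowest combined score."""
    open_teams = [t for t in (team1, team2) if len(t) < team_size]
    if not open_teams:
        raise ValueError("both teams are full")
    target = min(open_teams, key=lambda t: sum(ratings[p] for p in t))
    target.append(player)
    return target

ratings = {"A": 1000, "B": 800, "C": 600, "D": 400, "E": 200, "F": 100}
print(split_teams(ratings))
```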
Example
Game is 4v4
A - 1000 pnts
B - 800 pnts
C - 600 pnts
D - 400 pnts
E - 200 pnts
F - 100 pnts
As soon as you get these six, group them into teams as such:
Team 1: A, F, D (combined score 1500)
Team 2: B, E, C (combined score 1600)
Now, we wait for two more players to join.
First, player G comes along with 500 pnts. He goes to Team 1, because they have a lower combined score.
Then, player H comes with 800 pnts. He goes to Team 2, because it is the only open team left.
Total teams:
Team 1: A, F, D, G (combined score 2000)
Team 2: B, E, C, H (combined score 2400)
Note that the teams were actually pretty fair until the last two came in. To be honest, the best way would be to only create the match when you have enough players to start it. But then the wait times might be too long for the player.
Adjust how many players you require before forming the match. Lower = shorter wait time, potentially less fair. Higher = longer wait time, potentially more fair.
If you have a pre-game screen, lower would also offer more time for people to chat and talk with their to-be teammates while waiting.
It is difficult to estimate the skill of any one player by a single metric and such a method is prone to abuse. However, if you only care about implementing something simple that will work well try the following:
keep track of wins and losses
use the percentage of wins vs losses as the statistic to match players ( in some sense of the word match, i.e. group players with similar percentages)
This has the obvious downside that one player may have a win-loss record of 5-0 and another of 50-20: the first has a 100% win rate (an infinite win/loss ratio), while the other has a more reasonable one of about 71%. It makes sense for the matching system to acknowledge this and be far more confident that the latter player actually has more skill, because of the consistency required. However, pitting the two players against each other would probably be a good thing, because the 5-0 player is probably trying to work the system by playing versus weaker players, so pitting him against a consistently good player would do everyone good.
Note, I speak from experience from playing only strategy games such as Warcraft 3 where this is the typical match making behaviour. It seems to me like the percentage of wins over losses is a great metric by which to match players.
Match based on multiple attributes. I've implemented a simple matchmaking system using Amazon CloudSearch (based on Apache Solr). For example, matching based on a combination of the following fields is possible:
{
  "fields": {
    "elo_rating": 3121.44,
    "points": 404,
    "randomizer": 35,
    "last_login": "2014-10-09T22:57:57Z",
    "weapons": [
      "CANNON",
      "GUN"
    ]
  }
}
It is now possible to run queries inclusive of multiple fields like the following.
(and (or weapons:'GUN' weapons:'CANNON' weapons:'DRONE')(and last_login:['2013-05-25T00:00:00Z','2014-10-25T00:00:00Z'])(and points:[100, 200])(and elo_rating:[1000, 2000]))
