The problem statement is as follows:
A science tournament is taking place in which each team designs solar vehicles, with the following scoring system:
you can submit 0 < D < 20 different designs to compete
each design is given a different score penalty (based on weight, dimensions, production cost, materials used, etc.)
for every design submitted, its penalty is applied immediately, and the total penalty serves as the team's starting score; the more designs you submit, the higher the total penalty
each design will run all available terrains, once each
each design is given a score of 0 - 100 based on how many km it travels in an hour on each terrain; there will be 0 < T < 30 terrains
we get only one score from each terrain: if multiple designs run the same terrain, the highest score is awarded
D #    Penalty score    T1    T2    T3
D1     1                10    10    7
D2     2                8     8     12
D3     3                15    16    8
If we submit all 3 designs, the total penalty score for our team is 6, making our initial score -6,
and our terrain scores:
T1 -> D3 15 points
T2 -> D3 16 points
T3 -> D2 12 points
penalty -> -6
-------------------------+
total -> 37 points
D1, even though its penalty is the lowest, is actually useless, and we don't need to submit it in the first place; we could score 38 points by submitting only D2 and D3. We need to find the highest score we can get given D designs and T terrains, where we can pick and choose which design(s) to submit to the tournament.
Brute force (trying every possible subset of designs) takes O(2^D) time.
is there any better way to solve this?
Thanks
This problem is NP-hard.
To show that, let's reduce the set cover problem to it.
Assign one terrain per element and one design per set. Every design has a penalty of 1. A design scores 1 on a terrain that is not in its set, and 2 * number_of_designs on one that is. It is straightforward to prove that the optimal tournament submission is the smallest set of designs corresponding to a set cover in the original instance. So if we could solve your problem efficiently, we could find a minimum set cover.
I would suggest attempting some kind of branch-and-bound algorithm to solve this, either exactly or heuristically.
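Since D < 20, exhaustive search over all 2^D non-empty subsets of designs is already feasible, and it makes a natural baseline (or bounding oracle) for a branch-and-bound. A minimal sketch; function and variable names are my own:

```python
from itertools import combinations

def best_submission(penalties, scores):
    """Try every non-empty subset of designs and keep the best total.

    penalties: per-design penalty scores.
    scores: scores[d][t] = score of design d on terrain t.
    Returns (best_total, best_subset). Runs in O(2^D * D * T).
    """
    n_designs = len(penalties)
    n_terrains = len(scores[0])
    best = (float("-inf"), ())
    for r in range(1, n_designs + 1):
        for subset in combinations(range(n_designs), r):
            total = -sum(penalties[d] for d in subset)
            for t in range(n_terrains):
                # only the highest score among submitted designs counts
                total += max(scores[d][t] for d in subset)
            best = max(best, (total, subset))
    return best

# The table from the question: submitting only D2 and D3 scores 38.
print(best_submission([1, 2, 3], [[10, 10, 7], [8, 8, 12], [15, 16, 8]]))
# → (38, (1, 2))
```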
I have been working on an algorithm for this problem but can't figure it out. The problem is below:
In a tournament with X players, each player bets on the outcomes of basketball matches in the NBA.
Guessing the correct match outcome earns a player 3 points, guessing the MVP of the match earns 1 point, and guessing both wrong earns 0 points.
The algorithm needs to determine whether a certain player can no longer reach the #1 spot in this betting game.
For example, let's say there are 30 games in total in the league, so the maximum a player can earn by guessing everything right is (3+1)*30 = 120 points.
In the table below you can see players X, Y and Z.
Player X has guessed 20 matches completely right so far, so he has 80 points.
Players Y and Z have 26 and 15 points, and since there are only 10 matches left, even if they guessed all 10 remaining matches correctly, it would not be enough to reach the #1 spot.
Therefore, the algorithm determined that they are eliminated from the game.
Team  Points  Points per match  Total games  Max points possible  Games left  Points available  Eliminated?
X     80      0-L 1-MVP 3-W     30           120                  10          0-40              N
Y     26      0-L 1-MVP 3-W     30           120                  10          0-40              Y
Z     15      0-L 1-MVP 3-W     30           120                  10          0-40              Y
The baseball elimination problem seems to be the most similar to this problem, but it's not an exact match.
How should I build the reduction to the maximum-flow problem to suit this problem?
Thank you.
I don't get why you are looking at very complex max-flow algorithms. Those may be needed for harder settings (especially when pairings lead to zero-sum results and the order of the remaining pairings starts to matter, which makes worst-case analysis much harder).
Maybe the baseball elimination problem you mention is one of those (I did not check). But your use case sounds trivial:
1. Get the current leader score LS
2. Get the number of remaining matches N
3. For each player P:
4.     Get the current player score PS
5.     Eliminate P iff PS + 4 * N < LS   (4 = 3 for the outcome + 1 for the MVP)
(assumes parallel progress: standings are only compared once all players P have played M games
-> easy to generalize, though)
This is simple. Given your description, nothing prevents us from assuming worst-case performance from every other player: it is a valid scenario that every other player guesses wrong on all upcoming bets, so the score S of a player P can stay at S for all remaining games.
Things can quickly turn into NP-hard decision problems when there are more complex side constraints (e.g. statistical distributions / expectations).
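The elimination check above in runnable form (the dict interface is my own; note a player can earn at most 4 points per match, 3 for the outcome plus 1 for the MVP, matching the (3+1)*30 = 120 figure in the question):

```python
MAX_PER_MATCH = 4  # 3 for the correct outcome + 1 for the correct MVP

def eliminated(scores, games_left):
    """Players who cannot reach the #1 spot even if they sweep the rest
    while the current leader scores nothing (the worst case for them)."""
    leader = max(scores.values())
    return {p for p, s in scores.items()
            if s + MAX_PER_MATCH * games_left < leader}

# The X/Y/Z example: with 10 games left, neither Y nor Z can catch X's 80.
print(sorted(eliminated({"X": 80, "Y": 26, "Z": 15}, 10)))  # → ['Y', 'Z']
```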
I have the following problem in one of my coding projects, which I will simplify here:
I am ordering groceries online and want very specific things in very specific quantities. I would like to order the following:
8 Apples
1 Yam
2 Soups
3 Steaks
20 Orange Juices
There are many stores equidistant from me which I will have food delivered from. Not all stores have what I need. I want to obtain what I need with the fewest number of orders. For example, ordering from Store #2 below is a wasted order, since I can complete my list in fewer orders by ordering from different stores. What is the name of the optimization problem that this corresponds to?
Store #1 Supply
50 Apples
Store #2 Supply
1 Orange Juice
2 Steaks
1 Soup
Store #3 Supply
25 Soup
50 Orange Juices
Store #4 Supply
25 Steaks
10 Yams
The lowest possible number of orders is 3 in this case: 8 Apples from Store #1; 2 Soups and 20 Orange Juices from Store #3; 1 Yam and 3 Steaks from Store #4.
To me, this most likely sounds like a restricted case of the Integer Linear programming problem (ILP), namely, its 0-or-1 variant, where the integer variables are restricted to the set {0, 1}. This is known to be NP-hard (and the corresponding decision problem is NP-complete).
The problem is formulated as follows (following the conventions in the op. cit.):
Given the matrix A, the constraint vector b, and the weight vector c, find the vector x ∈ {0, 1}^N such that all the constraints A⋅x ≥ b are satisfied and the cost c⋅x is minimal.
I flipped the constraint inequality, but this is equivalent to changing the sign of both A and b.
The inequalities express satisfaction of your order: the visited stores together supply at least the required amount of every item. Note that the length of b equals the number of rows of A, and the lengths of c and x equal the number of columns of A. The dot product c⋅x is, naturally, a scalar.
Since you are minimizing the number of trips and each trip costs the same, c = 1 (a vector of all ones), and c⋅x is the total number of trips. The store inventory matrix A has a row per item and a column per store, and b is your shopping list.
Naturally, the exact best solution is found by trying all possible 2^N values of x.
Since there is no single approach to NP-hard problems, consider the problem size and how close to the optimum you want to get. A greedy approach (your next store to visit is the one supplying the most items not yet satisfied) works well when the "inventories" are large. If you have an idea in advance of the expected minimum number of trips, you can trim the search beam at some value exceeding that number of trips by some multiplicative coefficient; this is the best approach when your search is time-constrained. (I routinely run beam searches, closely related to the branch-and-cut approach mentioned in the article, in graphs that take a few GB of memory, slightly faster than a limit of 30 ms per exploration step with a beam as wide as 10,000.) Simulated annealing also works if the search landscape is not excessively rough.
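For a handful of stores, the 0/1 search can simply enumerate subsets in increasing size and stop at the first one that covers the shopping list. A brute-force sketch (store and item names are invented):

```python
from itertools import combinations

def fewest_orders(need, stores):
    """Smallest set of stores whose combined stock covers `need`.
    Enumerates subsets by increasing size: O(2^N) worst case, which is
    fine for a handful of stores."""
    names = list(stores)
    for r in range(1, len(names) + 1):
        for combo in combinations(names, r):
            # pool the stock of the chosen stores
            stock = {}
            for s in combo:
                for item, qty in stores[s].items():
                    stock[item] = stock.get(item, 0) + qty
            if all(stock.get(item, 0) >= qty for item, qty in need.items()):
                return combo
    return None

need = {"apple": 8, "yam": 1, "soup": 2, "steak": 3, "oj": 20}
stores = {
    "S1": {"apple": 50},
    "S2": {"oj": 1, "steak": 2, "soup": 1},
    "S3": {"soup": 25, "oj": 50},
    "S4": {"steak": 25, "yam": 10},
}
print(fewest_orders(need, stores))  # → ('S1', 'S3', 'S4')
```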
Also try searching on cs.SE; it may be an even better place for questions of this type.
I recently created a tournament system that will soon lead into player rankings. Basically, after players are done with the tournament, they are given a rank based on how they did in the tournament. So the person who won the tournament will have the most points and be ranked #1, while the second will have the second most points and be ranked #2, and so on...
However, after they are ranked in the new rankings, they can challenge other members and have a way to play other members and change their ranks. So basically (using a ranking system), if Player A who is ranked #2 beats Player B who is ranked #1, Player A will now become #1.
I've also decided that if a player wants to compete in the rankings but was not present during the tournament, they can sign up after the tournament, and will be given the lowest possible rank with the lowest points (but they have a chance to move up).
So now I want to know how I should go about planning this. When I convert the players from the tournament to match rankings, I have to assign them points in order to rank them. This seems like the best way to do it:
Rank   Points
1      1000
2      900
3      800
4      700
5      600
6      500
7      400
8      300
9      200
10     100
After looking around the internet, I've decided it would be wise to use Elo to give players their new ratings after they have played against each other. I went about it using this page: http://www.lifewithalacrity.com/2006/01/ranking_systems.html
So if I go about it this way, let's say I have rank #10 facing rank #1. According to the website above, my formula is:
R' = R + K * (S - E)
and rank #10 has only 100 points, where #1 has 1,000.
So after doing the math, rank #10's expected score against #1 is:
1 / (1 + 10 ^ ((1000 - 100) / 400))
≈ 0.0056 (about a 0.56% chance of winning)
So, if #10 wins:
100 + 32 * (1 - 0.0056)
≈ 131.8
The problem I have with Elo is that it makes no sense here. Even after a rank as low as #10 beats #1, he gains barely 32 points and is still nowhere near #1's 1,000. I'm not sure if I'm doing the math wrong or splitting up the points wrong. Or maybe I shouldn't use Elo at all? Any suggestions would be very helpful.
Don't be offended, but it is your table that doesn't make sense.
The Elo system is based on the premise that a rating is an accurate estimate of strength, and that the difference between ratings accurately predicts the outcome of a match (a player who is 200 points stronger is expected to score 75%). If an actual outcome does not agree with the prediction, the ratings do not reflect strength, and hence must be adjusted according to how much the actual outcome differs from the predicted one.
An official (as in FIDE) Elo system has a few arbitrary constants (e.g. the 200/75 gauge, erf as the predictor, etc.); choosing them (reasonably) differently may lead to different rating values, yet would result, in the long run, in the same ranking. There is some interesting math behind this assertion; this is not the right place to get into the details.
Now back to your table. It assigns ratings based on placement, not on points scored. The champion gets 1000 no matter whether she swept the tournament with an absolute 100% result or barely edged out her equals. These points do not estimate the strength of the participants.
So my advice is to abandon the table altogether, assign each new player an entry rating (say 1000; the value really doesn't matter as long as you are consistent), and stick to Elo from the very beginning.
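For reference, the whole Elo update fits in a few lines; a sketch using the standard logistic expected-score curve and K = 32 (function and parameter names are mine):

```python
def expected(r_a, r_b):
    """Probability that A beats B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def update(r_a, r_b, score_a, k=32):
    """New ratings after a game; score_a is 1 for an A win, 0.5 draw, 0 loss."""
    e_a = expected(r_a, r_b)
    return r_a + k * (score_a - e_a), r_b + k * ((1 - score_a) - (1 - e_a))

# Two players entering at the same rating: the winner gains what the loser loses.
print(update(1000, 1000, 1))  # → (1016.0, 984.0)
```

Note that a single win can move a rating by at most K points, which is why a 100-rated player beating a 1000-rated one gains roughly 32, not 900.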
I want to calculate a percentage probability based on a list of past occurrences.
The data looks similar to this simplified table; for instance, when the first value has been 8 in the past, there has been a 72% chance of the event occurring.
1 76%
2 64%
4 80%
6 85%
7 83%
8 72%
11 70%
The full table ranges from 0 to 1030 and has 377 rows, but it changes daily. I want to pass the function a value such as 3 and be returned a percentage probability of the event occurring. I don't need exact code, but I would appreciate being pointed in the right direction.
Thanks
Based on your answers in the comments of the question, I would suggest interpolation; linear interpolation is the simplest option. It doesn't look like a probabilistic model would be appropriate, based on the series in the spreadsheet (there doesn't appear to be a clear relationship between column 1 and column 3).
To give an example of how this would work: imagine you want the probability at some point p which is unobserved in the data. The biggest value you observe which is less than p is p_low (with corresponding probability f(p_low)), and the smallest value greater than p is p_high (with probability f(p_high)). Your estimate for f(p) is:
interval = p_high - p_low
f_p_hat = ((p_high - p)/interval)*f_p_low + ((p - p_low)/interval)*f_p_high
This makes your estimate for f(p) a weighted average of the values at p_low and p_high, where each endpoint's weight grows as p gets closer to it (note the weight on f_p_low uses the distance from p to p_high, and vice versa). E.g. if p is equidistant from p_low and p_high, f_p_hat (your estimate for f(p)) is just the mean of f(p_low) and f(p_high).
Now, linear interpolation may not work well if you have reason to suspect that the estimates at the endpoints are inaccurate (possibly due to small sample sizes). If so, you could do a (possibly weighted) least-squares fit to a neighbourhood of points around p and use that as the prediction. If this is the case, I can go into a bit more detail.
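A sketch of this linear interpolation over the question's sample table (assumes the x values are kept sorted; clamping outside the observed range is my own choice):

```python
import bisect

def interpolate(xs, ys, p):
    """Linearly interpolate the probability at p from sorted sample
    points xs with observed probabilities ys. Clamps outside the range."""
    if p <= xs[0]:
        return ys[0]
    if p >= xs[-1]:
        return ys[-1]
    i = bisect.bisect_left(xs, p)
    if xs[i] == p:          # exact hit, no interpolation needed
        return ys[i]
    lo, hi = xs[i - 1], xs[i]
    w = (p - lo) / (hi - lo)  # weight toward the upper neighbour
    return (1 - w) * ys[i - 1] + w * ys[i]

xs = [1, 2, 4, 6, 7, 8, 11]
ys = [76, 64, 80, 85, 83, 72, 70]
print(interpolate(xs, ys, 3))  # → 72.0 (halfway between 64% and 80%)
```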
I have a data set of players' skill rankings, ages and sexes, and would like to create evenly matched teams.
Teams will have the same number of players (currently 8 teams of 12 players).
Teams should have the same or similar male to female ratio.
Teams should have similar age curve/distribution.
I would like to try this in Haskell but the choice of coding language is the least important aspect of this problem.
This is a bin packing problem, or a multi-dimensional knapsack problem. Björn B. Brandenburg has made a bin packing heuristics library in Haskell that you may find useful.
You need something like...
import Control.Arrow ((&&&))

data Player = P { skill :: Int, gender :: Bool, age :: Int }
Decide on a number of teams n (I'm guessing this is a function of the total number of players).
Find the desired total skill per team:
teamSkill n ps = fromIntegral (sum (map skill ps)) / fromIntegral n
Find the ideal gender ratio:
genderRatio ps = sum (map (\x -> if gender x then 1 else 0) ps) / fromIntegral (length ps)
Find the ideal age variance (you'll want the Math.Statistics package):
ageDist ps = pvar (map (fromIntegral . age) ps)
And you must assign the three constraints some weights to come up with a scoring for a given team:
score skillW genderW ageW team = skillW * sk + genderW * g + ageW * a
where (sk, (g, a)) = (teamSkill 1 &&& genderRatio &&& ageDist) team
The problem reduces to the minimization of the difference in scores between teams. A brute-force approach will take time proportional to Θ(n^(k−1)). Given the size of your problem (8 teams of 12 players each), this translates to roughly 6 to 24 hours on a typical modern PC.
EDIT
An approach that may work well for you (since you don't need an exact solution in practice) is continual improvement by random permutation, i.e. hill climbing with random swaps (full simulated annealing would additionally accept some worsening swaps early on to escape local optima):
Pick teams at random.
Get a score for this configuration (see above).
Randomly swap players between two or more teams.
Get a score for the new configuration. If it's better than the previous one, keep it and recurse to step 3. Otherwise discard the new configuration and try step 3 again.
When the score has not improved for some fixed number of iterations (experiment to find the knee of this curve), stop. It's likely that the configuration you have at this point will be close enough to the ideal. Run this algorithm a few times to gain confidence that you have not hit on some local optimum that is considerably worse than ideal.
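The steps above can be sketched as follows. Strictly, this version is hill climbing: it never keeps a worsening swap. The skill-only badness function and all the names are stand-ins of my own:

```python
import random

def balance(players, n_teams, badness, iters=10000, seed=0):
    """Start from a random even split, then repeatedly propose a random
    swap between two teams and keep it whenever it does not increase
    `badness` (a score to minimise)."""
    assert len(players) % n_teams == 0
    rng = random.Random(seed)
    players = players[:]
    rng.shuffle(players)
    size = len(players) // n_teams
    teams = [players[i * size:(i + 1) * size] for i in range(n_teams)]
    best = badness(teams)
    for _ in range(iters):
        a, b = rng.sample(range(n_teams), 2)
        i, j = rng.randrange(size), rng.randrange(size)
        teams[a][i], teams[b][j] = teams[b][j], teams[a][i]
        new = badness(teams)
        if new <= best:
            best = new
        else:  # worse: revert the swap
            teams[a][i], teams[b][j] = teams[b][j], teams[a][i]
    return teams, best

# Toy badness: spread of team skill totals (players are just skill numbers here).
def spread(teams):
    totals = [sum(t) for t in teams]
    return max(totals) - min(totals)

teams, bad = balance(list(range(1, 25)), 2, spread)
```

In practice `badness` would combine skill, gender ratio and age distribution with the weights discussed above, and you would restart from several random seeds to avoid a bad local optimum.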
Given the number of players per team and the gender ratio (which you can easily compute), the remaining problem is called the n-partition problem, which is unfortunately NP-complete and thus very hard to solve exactly. You will have to use approximation or heuristic algorithms (e.g. evolutionary algorithms) if your problem size is too big for a brute-force solution. A very simple approximation would be to sort by age and assign players in an alternating fashion.
1. Assign point values to the skill levels, genders, and ages.
2. Assign each player the sum of the points for his criteria.
3. Sort players by their calculated point value.
4. Assign the next player to the first team.
5. Assign players to the second team until it has >= total points than the first team, or the team reaches the maximum number of players.
6. Perform step 5 for each team, looping back to the first team, until all players are assigned.
You can tweak the skill-level, gender, and age point values to change the distribution of each.
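A close variant of these steps in code: instead of topping up one team at a time, hand the next-highest-point player to whichever team still has room and currently has the fewest total points (names invented):

```python
def greedy_teams(points, n_teams, team_size):
    """Give the next-highest-scoring player to the team that currently
    has the lowest total and still has room."""
    teams = [[] for _ in range(n_teams)]
    totals = [0] * n_teams
    for p in sorted(points, reverse=True):
        open_teams = [i for i in range(n_teams) if len(teams[i]) < team_size]
        i = min(open_teams, key=lambda t: totals[t])
        teams[i].append(p)
        totals[i] += p
    return teams, totals

# Eight players with point values 2..9 dealt into two teams of four.
teams, totals = greedy_teams([9, 8, 7, 6, 5, 4, 3, 2], 2, 4)
print(totals)  # → [22, 22]
```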
Let's say you have six players (for a simple example). We can use the same algorithm that pairs opponents in single-elimination tournaments and adapt it to generate "even" teams based on any criteria you choose.
First, rank your players best to worst. Don't take this too literally; you just want a list of players sorted by the criteria you wish to separate them on.
Why?
Let's look at single-elimination tournaments for a second. The idea of using an algorithm to generate optimal single-elimination match-ups is to avoid the problem of the "top players" meeting too soon in the tournament. If top players meet too soon, one of them is eliminated early on, making the tournament less interesting. We can use this "optimal" pairing to generate teams in which the "top" players are spread out evenly across the teams, then the second-tier players, etc.
So list your players by the criteria you want them separated on: men first, then women, sorted by age second. We get (for example):
Player 1: Male - 18
Player 2: Male - 26
Player 3: Male - 45
Player 4: Female - 18
Player 5: Female - 26
Player 6: Female - 45
Then we'll apply the single-elimination algorithm which uses their "rank" (which is just their player number) to create "good match ups".
The single-elimination tournament generator basically works like this: take each player's rank (player number) and reverse its bits (in binary). The new number you come up with becomes their "slot" in the tournament.
Player 1 in binary (001), reversed becomes 100 (4 decimal) = slot 4
Player 2 in binary (010), reversed becomes 010 (2 decimal) = slot 2
Player 3 in binary (011), reversed becomes 110 (6 decimal) = slot 6
Player 4 in binary (100), reversed becomes 001 (1 decimal) = slot 1
Player 5 in binary (101), reversed becomes 101 (5 decimal) = slot 5
Player 6 in binary (110), reversed becomes 011 (3 decimal) = slot 3
In a single-elimination tournament, slot 1 plays slot 2, slot 3 plays slot 4, and slot 5 plays slot 6. We're going to use these pair-ups to generate optimal teams.
Looking at the player number above, ordered by their "slot number", here is the list we came up with:
Slot 1: Female - 18
Slot 2: Male - 26
Slot 3: Female - 45
Slot 4: Male - 18
Slot 5: Female - 26
Slot 6: Male - 45
When you split the slots up into teams (two or more), you get the players in slots 1-3 versus the players in slots 4-6. That is the best/optimal grouping you can get.
This technique scales very well to many more players, multiple criteria (just group them together correctly), and multiple teams.
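The slot computation itself is a straightforward bit reversal; a sketch for ranks that fit in `bits` bits, reproducing the 3-bit example above:

```python
def slot(rank, bits):
    """Reverse the low `bits` bits of rank to get its bracket slot."""
    s = 0
    for _ in range(bits):
        s = (s << 1) | (rank & 1)  # shift the lowest bit of rank into s
        rank >>= 1
    return s

# Ranks 1..6 with 3 bits, as in the example above.
print([(r, slot(r, 3)) for r in range(1, 7)])
# → [(1, 4), (2, 2), (3, 6), (4, 1), (5, 5), (6, 3)]
```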
Idea:
1. Sort players by skill.
2. Assign the best remaining players in order (i.e. team A: 1st player, team B: 2nd player, ...).
3. Assign the worst remaining players in order.
4. Loop on step 2.
5. Evaluate possible corrections and perform them (e.g. if team A has a total skill of 19 and a player of skill 4, while team B has a total skill of 21 and a player of skill 5, interchange those two players to make it 20 and 20).
6. Evaluate possible corrections to the gender distribution and perform them.
7. Evaluate possible corrections to the age distribution and perform them.
Almost trivial approach for two teams:
1. Sort all players by your skill/rank assessment.
2. Assign team A the best player.
3. Assign team B the next two best players.
4. Assign team A the next two best players.
5. Go to step 3.
6. End when you're out of players.
It's not very flexible, and it only works on a single ranking column, so it won't try to get similar gender or age profiles. But it does make fairly well-matched teams if the input distribution is reasonably smooth. Plus, it doesn't always end with team A having the spare player when there is an odd number of players.
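This 1-2-2-2... deal can be sketched as follows (assumes the input list is already sorted best first):

```python
def draft(ranked):
    """Deal a ranked player list into two teams in a 1-2-2-2-... pattern:
    A gets the best player, then B and A alternate taking two at a time."""
    a, b = [ranked[0]], []
    take_b = True
    i = 1
    while i < len(ranked):
        chunk = ranked[i:i + 2]          # next (up to) two best players
        (b if take_b else a).extend(chunk)
        take_b = not take_b
        i += 2
    return a, b

a, b = draft([1, 2, 3, 4, 5, 6, 7, 8])  # 1 = best
print(a, b)  # → [1, 4, 5, 8] [2, 3, 6, 7]
```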
Well,
my answer is not about team/player scoring strategies, because all the ones posted are good, but I would try a brute-force or random-search approach.
I don't think it's worth creating a genetic algorithm for this.
Regards.