Algorithm to minimize the distance between boats and fishing spots

Each boat can only go to one fishing location and each fishing location can only have one boat.
Every fishing spot contains an estimated amount of fish, and every fishing boat has a performance rating.
No crew will accept being assigned a spot with fewer fish than a spot assigned to a crew with a lower rating.
Boats may only travel parallel to the axes of the (2-dimensional) coordinate system, i.e. the distance between two points is the Manhattan distance.
The computation should, first, maximise the amount of fish captured; second, minimise the total distance travelled; and, third, minimise the sum of the ratings of the crews.
Examples of distances covered:
(200,100) to (250,100) == 50
(100,200) to (150,150) == 100
(50,50) to (0,100) == 100
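A minimal sketch of that distance computation in Python (axis-parallel travel means the distance is |dx| + |dy|; the helper name is illustrative):

def manhattan(a, b):
    # axis-parallel travel: distance is |dx| + |dy|
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

# the examples above
assert manhattan((200, 100), (250, 100)) == 50
assert manhattan((100, 200), (150, 150)) == 100
assert manhattan((50, 50), (0, 100)) == 100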
In case there are more boats than fishing locations, I tried to compute all possible combinations and pick the one that minimized the total distance, but the process takes a long time, since the number of boats and fishing locations can each be 1 <= n <= 4000.
I don't know how to come up with a more efficient solution that considerably reduces program execution time.
If you need more clarification please ask.

Related

Solving a travelling salesman problem to maximize gain in minimum time

Team, I need suggestions on how to solve the problem below.
There are n places (for example, say 10 places). The time taken from any one place to another is known. On reaching a particular place, a known reward is given in rupees (e.g. if I travel from place 1 to place 2, I get 100 rupees; travelling from place 2 to place 3 will fetch me 50 rupees, etc.). Also, sometimes a particular place is unavailable to travel to, and this changes with time. At all time instances, which places can be travelled to is known, the reward fetched from each place is known, and the time taken to travel from one place to another is known. This is an ongoing process, meaning after you reach place A and earn 100 rupees, you can travel to place B and fetch another 100 rupees; then it is possible that place A can again fetch you, say, 50 rupees if you travel from B to A again.
Problem statement is:
A path should be followed over time (A to B, B to C, C to B, B to A, etc.) so that I always have the maximum rupees in the given time. Thus, at the end of 1 month, I should have followed a path that fetches me the maximum amount among all possibilities available.
We already know that in the travelling salesman problem it takes O(N!) to calculate the best route for the month if there are no changes. Because of the unknown changes that can happen, the best way is to use a greedy algorithm such that every time you come to a new place, you calculate where you get the most rupees in the least amount of time. It takes O(N*k), where k is the number of moves you make between places in a month.
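A minimal sketch of one interpretation of that greedy step (picking the currently reachable place with the best reward-to-travel-time ratio); the helpers available_places, get_reward and get_travel_time are assumed, the latter two mirroring the helpers used in the memoized solution further down:

def greedy_next(place, time, available_places, get_reward, get_travel_time):
    # pick the currently reachable place with the best reward per unit of travel time
    best, best_score = None, float('-inf')
    for p in available_places(place, time):
        t = get_travel_time(place, p)
        if t <= 0:
            continue
        score = get_reward(p, time + t) / t
        if score > best_score:
            best, best_score = p, score
    return best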
I'm not sure how this problem is related to the travelling salesman problem -- I understood the latter as having the restriction of visiting all the places at least once.
Assuming we have all of the time instances and their related information ahead of our calculation: if we work backwards from each place we imagine ending at, the choices we have for the previously visited location dictate the possible times it took to get to the last place and the possible earnings we could have gotten. Clearly, from those choices we would pick the best reward, because it's our last choice. Apply this idea recursively from there, until we reach the start of the month. If we run this recursion from each possible ending place, we can reuse states we've seen before; for example, if we reached place A at time T as one of the options when calculating backwards from B, and we then reach A again at time T when calculating a path that started at C, we can reuse the record for the first state. The search space would be O(N*T) but in practice would vary with the input.
Something like this? (Assumes we cannot wait in any one place. Otherwise, the solution could be better coded bottom-up where we can try all place + time states.) Return the best of running f with the same memo map on all possible ending states.
def get_travel_time(place_a, place_b):
    # returns the travel time from place_a to place_b
    ...

def get_neighbours(place):
    # returns the places from which we can travel to place
    ...

def get_reward(place, time):
    # returns the reward awarded at place at the given time
    ...

def f(place, time, memo=None):
    # best total reward of a path that ends at place at the given time
    if memo is None:
        memo = {}
    if time == 0:
        return 0
    key = (place, time)
    if key in memo:
        return memo[key]
    current_reward = get_reward(place, time)
    best = float('-inf')
    for neighbour in get_neighbours(place):
        previous_time = time - get_travel_time(neighbour, place)
        if previous_time >= 0:
            best = max(best, current_reward + f(neighbour, previous_time, memo))
    memo[key] = best
    return best
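And a possible driver for the above, sharing one memo map across all ending states as described; all_places and end_of_month are assumed inputs:

def best_over_month(all_places, end_of_month):
    # try every possible ending place, reusing the same memo map
    memo = {}
    return max(f(place, end_of_month, memo) for place in all_places)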

Algorithm: Eliminating players that no longer have a chance to win the tournament

I have been working on the algorithm for this problem, but can't figure it out. The problem is below:
In a tournament with X players, each player is betting on the outcomes of basketball matches in the NBA.
Guessing the correct match outcome earns a player 3 points, guessing the MVP of the match earns 1 point and guessing both wrong - 0 points.
The algorithm needs to be able to determine if a certain player can't reach the number 1 spot in this betting game.
For example, let's say there are a total of 30 games in the league, so the max points a player can get for guessing right is (3+1)*30=120.
In the table below you can see players X,Y and Z.
Player X has so far guessed 20 matches correctly (outcome and MVP), so he has 80 points.
Players Y and Z have 26 and 15 points, and since there are only 10 matches left, even if they guess all of the remaining 10 correctly (at most 40 more points), it would not be enough to reach the number 1 spot.
Therefore, the algorithm determined that they are eliminated from the game.
Team | Points | Points per match | Total Games | Max Points possible | Games left | Points Available | Eliminated?
X    | 80     | 0-L 1-MVP 3-W    | 30          | 120                 | 10         | 0-40             | N
Y    | 26     | 0-L 1-MVP 3-W    | 30          | 120                 | 10         | 0-40             | Y
Z    | 15     | 0-L 1-MVP 3-W    | 30          | 120                 | 10         | 0-40             | Y
The baseball elimination problem seems to be the most similar to this problem, but it's not exactly it.
How should I build the reduction of the maximum-flow problem to suit this problem?
Thank you.
I don't get why you are looking at very complex max-flow algorithms. Those might be needed for very complex things (especially when pairings lead to zero-sum results and the order/remaining pairings start to matter -> !much! harder to do worst-case analysis).
Maybe the baseball problem you mention is one of those (did not check it). But your use-case sounds trivial.
1. Get the current leader score LS
2. Get the number of remaining matches N
3. For each player P
4. Get the current player score PS
5. Eliminate iff PS + 4 * N < LS (4 points is the maximum per match: 3 for the outcome + 1 for the MVP)
(assumes parallel progress: standings are always synced so that all players P have played M games
-> easy to generalize though)
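A minimal sketch of that check (names are illustrative; 4 is the maximum points per remaining match):

def eliminated_players(scores, games_left, max_points_per_game=4):
    # scores: dict mapping player -> current score
    leader_score = max(scores.values())
    return [p for p, s in scores.items()
            if s + max_points_per_game * games_left < leader_score]

# the example from the question: X=80, Y=26, Z=15, 10 games left
print(eliminated_players({'X': 80, 'Y': 26, 'Z': 15}, 10))  # ['Y', 'Z']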
This is simple. Given your description, there is nothing preventing us from assuming worst-case performance from every other player, i.e. it's a valid scenario that every other player guesses wrong for all upcoming guesses -> the score S of player P can stay at S for all remaining games.
Things might quickly change to NP-hard decision problems when there are more complex side constraints (e.g. statistical distributions / expectations).

Probability - expectation Puzzle: 1000 persons and a door

You stand in an office by a door, with a measuring tape. Every time a person walks in, you measure him or her and only keep a tally of the "record" tallest. If the new person is taller than everyone measured so far, you count a record. If later another person is taller still, you have another record, etc.
1000 persons pass through the door. How many records do you expect to have?
(Assume independence of height/arrival. Also note that the answer does not depend on any assumption about the probability distribution other than independence.)
PS - I'm able to come up with the answer (~7.5) with a brute-force approach (running this scenario over 1,000,000 times and taking the average), but here I'm looking for a theoretical approach.
Consider x_1 to x_1000 as the heights in order of arrival, and max(i) as the maximum of the sequence up to person i. The question reduces to finding the expected number of times max(i) changes.
for i = 0 to 999:
    if x_(i+1) > max(i), then max(i) changes
Since each of the first i+1 people is equally likely to be the tallest of that group, P(x_(i+1) > max(i)) = 1/(i+1).
Answer => the sum of 1/(i+1) for i from 0 to 999 (the 1000th harmonic number), which is approximately 7.49.
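A quick check of that sum in Python:

# expected number of records = H_1000 = 1/1 + 1/2 + ... + 1/1000
print(sum(1 / i for i in range(1, 1001)))  # ~7.485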

Developing player rankings with ELO

I recently created a tournament system that will soon lead into player rankings. Basically, after players are done with the tournament, they are given a rank based on how they did in the tournament. So the person who won the tournament will have the most points and be ranked #1, while the second will have the second most points and be ranked #2, and so on...
However, after they are ranked in the new rankings, they can challenge other members and have a way to play other members and change their ranks. So basically (using a ranking system), if Player A who is ranked #2 beats Player B who is ranked #1, Player A will now become #1.
I've also decided that if a player wants to compete in the rankings but was not present during the tournament, they can sign up after the tournament, and will be given the lowest possible rank with the lowest points (but they have a chance to move up).
So now I want to know how I should go about planning this. When I convert the players from the tournament to match rankings, I have to identify them with points in order to rank them. I decided this seems like the best way to do it:
Rank | Points
1    | 1000
2    | 900
3    | 800
4    | 700
5    | 600
6    | 500
7    | 400
8    | 300
9    | 200
10   | 100
After looking on the internet, I've decided it would be wise to use ELO to give players their new rank after the players have matched against each other. I went about it using this page: http://www.lifewithalacrity.com/2006/01/ranking_systems.html
So if I go about it this way, let's say I have rank #10 facing rank #1. According to the website above, my formula is:
R' = R + K * (S - E)
and rank #10 only has 100 points where #1 has 1,000.
So after doing the math, rank #10's expected score against #1 is:
E = 1 / (1 + 10 ^ ((1000 - 100) / 400))
  ≈ 0.0056 (about 0.56%)
So, with K = 32, rank #10's new rating after beating #1 would be:
R' = 100 + 32 * (1 - 0.0056)
   ≈ 131.8
The problem I have with ELO is that it makes no sense. After a rank such as #10 beats #1, he should not gain as little as ~32 points across a 900-point gap. I'm not sure if I'm doing the math wrong, or if I'm splitting up the points wrong. Or maybe I shouldn't use ELO at all? Any suggestions would be very helpful.
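For reference, a small sketch of the Elo update from the formula above (K = 32, as in the calculation):

def elo_expected(r_a, r_b):
    # expected score of player A against player B
    return 1 / (1 + 10 ** ((r_b - r_a) / 400))

def elo_update(r_a, r_b, score_a, k=32):
    # score_a is 1 for a win, 0.5 for a draw, 0 for a loss
    return r_a + k * (score_a - elo_expected(r_a, r_b))

print(elo_expected(100, 1000))   # ~0.0056
print(elo_update(100, 1000, 1))  # ~131.8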
Don't get offended, it is your table that doesn't make sense.
The Elo system is based on the premise that a rating is an accurate estimate of strength, and that the difference of ratings accurately predicts the outcome of a match (a player better by 200 points is expected to score 75%). If an actual outcome does not agree with a prediction, it means that the ratings do not reflect strength and hence must be adjusted according to how much the actual outcome differs from the predicted one.
An official (as in FIDE) Elo system has a few arbitrary constants (e.g. the 200/75 gauge, Erf as the predictor, etc.); choosing them (reasonably) differently may lead to different rating values, yet would result (in the long run) in the same ranking. There is some interesting math behind this assertion; this is not the right place to get into details.
Now back to your table. It assigns the rating based on the place, not on the points scored. The champion gets 1000 no matter whether she swept the tournament with an absolute 100% result, or barely made it among equals. These points do not estimate the strength of the participants.
So my advice is to abandon the table altogether, assign each new player an entry rating (say, 1000; it really doesn't matter as long as you are consistent), and stick to Elo from the very beginning.

Open-ended tournament pairing algorithm

I'm developing a tournament model for a virtual city commerce game (Urbien.com) and would love to get some algorithm suggestions. Here's the scenario and current "basic" implementation:
Scenario
Entries are paired up duel-style, like on the original Facemash or Pixoto.com.
The "player" is a judge, who gets a stream of dueling pairs and must choose a winner for each pair.
Tournaments never end; people can submit new entries at any time, and winners of the day/week/month/millennium are chosen based on the data at that date.
Problems to be solved
Rating algorithm - how to rate tournament entries and how to adjust their ratings after each match?
Pairing algorithm - how to choose the next pair to feed the player?
Current solution
Rating algorithm - the Elo rating system currently used in chess and other tournaments.
Pairing algorithm - our current algorithm recognizes two imperatives:
Give more duels to entries that have had fewer duels so far
Match people with similar ratings with higher probability
Given:
N = total number of entries in the tournament
D = total number of duels played in the tournament so far by all players
Dx = how many duels player x has had so far
To choose players x and y to duel, we first choose player x with probability:
p(x) = (1 - (Dx / D)) / N
Then choose player y the following way:
Sort the players by rating
Let the probability of choosing player j at index jIdx in the sorted list be:
p(j) = 0                            if j == x
       n * r^abs(jIdx - xIdx)       otherwise
where 0 < r < 1 is a coefficient to be chosen, and n is a normalization factor.
Basically, the probabilities in either direction from x form a geometric series, normalized so they sum to 1.
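A minimal sketch of that selection under the description above (entries is a hypothetical list of dicts with 'rating' and 'duels' fields; random.choices normalizes the weights, which plays the role of the 1/N and n factors):

import random

def pick_pair(entries, duels_total, r=0.5):
    # choose x with weight proportional to (1 - Dx / D)
    weights_x = [1 - e['duels'] / max(duels_total, 1) for e in entries]
    x = random.choices(entries, weights=weights_x)[0]

    # sort by rating and weight the others geometrically by distance from x
    by_rating = sorted(entries, key=lambda e: e['rating'])
    x_idx = by_rating.index(x)
    candidates = [e for e in by_rating if e is not x]
    weights_y = [r ** abs(by_rating.index(e) - x_idx) for e in candidates]
    y = random.choices(candidates, weights=weights_y)[0]
    return x, y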
Concerns
Maximize informational value of a duel - pairing the lowest rated entry against the highest rated entry is very unlikely to give you any useful information.
Speed - we don't want to do massive amounts of calculations just to choose one pair. One alternative is to use something like the Swiss pairing system and pair up all entries at once, instead of choosing new duels one at a time. This has the drawback (?) that all entries submitted in a given timeframe will experience roughly the same amount of duels, which may or may not be desirable.
Equilibrium - Pixoto's ImageDuel algorithm detects when entries are unlikely to further improve their rating and gives them less duels from then on. The benefits of such detection are debatable. On the one hand, you can save on computation if you "pause" half the entries. On the other hand, entries with established ratings may be the perfect matches for new entries, to establish the newbies' ratings.
Number of entries - if there are just a few entries, say 10, perhaps a simpler algorithm should be used.
Wins/Losses - how does the player's win/loss ratio affect the next pairing, if at all?
Storage - what to store about each entry and about the tournament itself? Currently stored:
Tournament Entry: # duels so far, # wins, # losses, rating
Tournament: # duels so far, # entries
Instead of throwing in ELO and ad-hoc probability formulae, you could use a standard approach based on the maximum likelihood method.
The maximum likelihood method is a method for parameter estimation, and it works like this (as an example). Every contestant (player) is assigned a parameter s[i] (1 <= i <= N, where N is the total number of contestants) that measures the strength or skill of that player. You pick a formula that maps the strengths of two players into the probability that the first player wins. For example,
P(i, j) = 1/(1 + exp(s[j] - s[i]))
which is the logistic curve (see http://en.wikipedia.org/wiki/Sigmoid_function). When you then have a table that shows the actual results between the users, you use optimization (e.g. gradient descent) to find the strength parameters s[1] .. s[N] that maximize the probability of the actually observed match results. E.g. if you have three contestants and have observed two results:
Player 1 won over Player 2
Player 2 won over Player 3
then you find parameters s[1], s[2], s[3] that maximize the value of the product
P(1, 2) * P(2, 3)
Incidentally, it can be easier to maximize
log P(1, 2) + log P(2, 3)
Note that if you use something like the logistic curve, it is only the difference of the strength parameters that matters, so you need to anchor the values somewhere, e.g. choose arbitrarily
s[1] = 0
In order to have more recent matches "weigh" more, you can adjust the importance of the match results based on their age. If t measures the time since a match took place (in some time units), you can maximize the value of the sum (using the example)
e^-t log P(1, 2) + e^-t' log P(2, 3)
where t and t' are the ages of the matches 1-2 and 2-3, so that those games that occurred more recently weigh more.
The interesting thing in this approach is that when the strength parameters have values, the P(...) formula can be used immediately to calculate the win/lose probability for any future match. To pair contestants, you can pair those where the P(...) value is close to 0.5, and then prefer those contestants whose time-adjusted number of matches (sum of e^-t1 + e^-t2 + ...) for match ages t1, t2, ... is low. The best thing would be to calculate the total impact of a win or loss between two players globally and then prefer those matches that have the largest expected impact on the ratings, but that could require lots of calculations.
You don't need to run the maximum likelihood estimation / global optimization algorithm all the time; you can run it e.g. once a day as a batch run and use the results for the next day for matching people together. The time-adjusted match masses can be updated real time anyway.
On the algorithm side, you can sort the players after the maximum likelihood run based on their s parameter, so it's very easy to find equal-strength players quickly.
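A minimal sketch of that batch estimation under the assumptions above: plain gradient ascent on the time-weighted log-likelihood with the logistic win model, s[0] anchored at 0, and matches given as hypothetical (winner, loser, age) index tuples:

import math

def estimate_strengths(n_players, matches, steps=2000, lr=0.01, decay=1.0):
    # matches: list of (winner, loser, age) tuples, ages in the same unit as decay
    s = [0.0] * n_players
    for _ in range(steps):
        grad = [0.0] * n_players
        for winner, loser, age in matches:
            p_win = 1 / (1 + math.exp(s[loser] - s[winner]))  # P(winner beats loser)
            w = math.exp(-decay * age)                        # older matches weigh less
            grad[winner] += w * (1 - p_win)                   # gradient of w * log P
            grad[loser] -= w * (1 - p_win)
        for i in range(1, n_players):                         # s[0] stays anchored at 0
            s[i] += lr * grad[i]
    return s

# the example above: player 0 beat player 1, player 1 beat player 2
print(estimate_strengths(3, [(0, 1, 0.0), (1, 2, 0.0)]))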

Resources