Recursive structure in dynamic programming - algorithm

The Trust Game on n rounds is a two-player dynamic game. Here, Player I starts with
$100. The game proceeds as follows.
• Round 1: Player I takes a fraction of the $100 (which could be nothing) to give to Player II. The money Player I gives to Player II is multiplied by 1.5 before Player II receives it. (So for example, if Player I gives $20 to Player II, then Player II receives $30 and Player I is left with $80).
• Round 2: Player II can choose a fraction of the money they received to offer to Player I. The money offered to Player I increases by a multiple of 1.5 before Player I receives it.
More generally, at round i, the Player at the current round (Player I if i is odd, and Player II if i is even) takes a fraction of the money in the pile to send to the other Player and keeps the rest. That money increases by a factor of 1.5 before the other player receives it. The game terminates if the current player does not send any money to the other player, or if round n is reached. At round n, the money in the pile is split evenly between the two players.
Each individual player wishes to maximize the total amount of money they receive.
Can anyone help me in finding the recursive structure/component in this problem?

Let us assume player-1 has $M and player-2 has $N and now it is the turn of player-1. M and N are non-negative numbers assuming maximum amount any player can give at his turn is the present amount of money with them.
If M = (x + y) with x ≥ 0 and y ≥ 0 and player-1 decides to give x amount to player-2, then combined amount with players is M + N + 0.5x and this amount is maximized if x = M.
Assuming perfect trust b/w persons, each player maximizes money they receive at step-N if the combined amount b/w them is maximized. So the best strategy would for each player is to give all their money to the other in order to maximize total amount at each step.
At round-(k+1) (k≥0), total amount b/w the players is ((1.5)^k) * 100
Some practical application may include having 100l of water with you. In a nearby shop-A, 1l of water is exchanged with 1.5l of petrol. In a nearby shop-B, 1l of petrol is exchanged with 1.5l of water. In a nearby shop-C, each litre of water/petrol is bought for $1. What is maximum money you can get if you can make a total of (N+1) trips ignore any transportation costs.

Related

Algorithm to minimize the distance between boats and fishing spots

Each boat can only go to one fishing location and each fishing location can only have one boat.
Every fishing spot contains an estimated amount of fish, and every fishing boat has a performance rating.
No crew will accept to be assigned a spot with fewer fish than a spot assigned to a crew with a lesser rating.
Boat may onlytravel parallel to the axes of the (2-dimensional) coordinate system,
The computation should be done in a way that,first, maximises the amount of fish captured, second, minimises the total distance travelled,and, third, minimises the sum of the ratings of the crews.
An example of distance would be covered would be
(200,100) to (250,100) == 50
(100,200) to (150,150) == 100
(50,50) to (0,100) == 100
In case there are more boats than fishing locations I tried to compute all possible combinations and see which minimized the total distance but the process takes a long time since the number of boats and fishing locations can be 1 <= n <= 4000
I don't know how to come up with a more efficient solution that considerably reduces program execution time.
If you need more clarification please ask.

Algorithm: Eliminating players that no longer have a chance to win the tournament

I have been working on the algorithm for this problem, but can't figure it out. The problem is below:
In a tournament with X player, each player is betting on the outcomes of basketball matches in the NBA.
Guessing the correct match outcome earns a player 3 points, guessing the MVP of the match earns 1 point and guessing both wrong - 0 points.
The algorithm needs to be able to determine if a certain player can't reach the number 1 spot in this betting game.
For example, let's say there are a total of 30 games in the league, so the max points a player can get for guessing right is (3+1)*30=120.
In the table below you can see players X,Y and Z.
Player X guessed correctly so far 20 matches so he have 80 points.
Players Y and Z have 26 and 15 points, and since there are only 10 matches left, even if they guess correctly all the remaining 10 it would not be enough to reach the number 1 spot.
Therefore, the algorithm determined that they are eliminated from the game.
Team
Points
Points per match
Total Games
Max Points possible
Games left
Points Available
Eliminated?
X
80
0-L 1-MVP 3-W
30
120
10
0-40
N
Y
26
0-L 1-MVP 3-W
30
120
10
0-40
Y
Z
15
0-L 1-MVP 3-W
30
120
10
0-40
Y
The baseball elimination problem seems to be the most similar to this problem, but it's not exactly it.
How should I build the reduction of the maximum-flow problem to suit this problem?
Thank you.
I don't get why you are looking at very complex max-flow algorithms. Those might be needed for very complex things (especially when pairings lead to zero-sum results and order/remaining parings start to matter -> !much! harder to do worst-case analysis).
Maybe the baseball problem you mention is one of those (did not check it). But your use-case sounds trivial.
1. Get current leader score LS
2. Get remaining matches N
3. For each player P
4. Get current player score PS
5. Eliminate iff PS + 3 * N < LS
(assumes parallel progress: standings always synced to all players P have played M games
-> easy to generalize though)
This is simple. Given your description there is nothing preventing us from asumming worst-case performance from every other player aka it's a valid scenario that every other player guesses wrong for all upcoming guesses -> score S of player P can stay at S for all remaining games.
Things might quickly change to NP-hard decision-problems when there are more complex side-constraints (e.g. statistical distributions / expectations)

Probabilities and exercises

I've been practicing some probabilities for last few weeks around 10/week usually did the job, however as the topics got harder I've started struggling and now I'm completely stuck. I've been looking online and i did find similar examples but nothing to touch my case in particular. I will continue looking for an answer so even if u can't answer my specific cases links to online literature will be appreciated.
answers are welcome, however i would rather be more interested in explanation of how a problem works.
urn contains m white and k black balls. two players are drawing the balls one after another without putting them back into the urn. the winner is the first person to draw white ball. what is the probability that second player will be the winner? (k=4, m=4)
women tend to vote with probability of a, men tend to do the same with probability of b. the probability c tells us that if we take a couple one of them will not go voting. what are the chances that at least one of them will vote? (a=0.49, b=0.61, c=0.75)
we are sending a message of n bytes. to obtain higher chance of sending entire message without ruining it we use k different wires. What is the probability of sending an entire message through one of the wires without ruining it, if the probability of ruining any of the bytes at any of the wires is p. (p=0.06, n=7, k=6)
the basketball game finals play to N wins. after m + n games the result is m : n. what is the probability of winning the finals for the lagging(the team that's behind) team if it is known that the leading team wins each match with a probability of p? (m=3 n=2 N=5 p=0.36)
sorry for my English and any help will be appreciated
Just A
So Given m = k and both = 4. And each player takes a ball out per turn. Given the worst case scenario 4 good balls will be picked (K) there will be 2 turns of both players picking the black ball then the first player will be guaranteed the white ball next turn. Therefore you need to work out the probability for the first two turns.
Turn 1
- P1 turn 1 Chance for black= 4/8
- P2 turn 2 Change for white = 4/7
Therefore turn 1 chance = (4/8)*(4/7). = 28.57%
Turn 2
- P1 turn 1 Chance for black= 2/6
- P2 turn 2 Change for white = 4/5
Therefore turn 2 chance = (4/6)*(4/5). = 26.664% but to get to turn 2 is a 28.57% chance
Therefore its the first probability add the second probability given the first case occurs. So 28.57% + (26.44%*28.57%)= 36.12%
The third case is a always lose. Top tip then be the first player!
Turn 3 Instance win for player 2 therefore a loss. = 0%

Open-ended tournament pairing algorithm

I'm developing a tournament model for a virtual city commerce game (Urbien.com) and would love to get some algorithm suggestions. Here's the scenario and current "basic" implementation:
Scenario
Entries are paired up duel-style, like on the original Facemash or Pixoto.com.
The "player" is a judge, who gets a stream of dueling pairs and must choose a winner for each pair.
Tournaments never end, people can submit new entries at any time and winners of the day/week/month/millenium are chosen based on the data at that date.
Problems to be solved
Rating algorithm - how to rate tournament entries and how to adjust their ratings after each match?
Pairing algorithm - how to choose the next pair to feed the player?
Current solution
Rating algorithm - the Elo rating system currently used in chess and other tournaments.
Pairing algorithm - our current algorithm recognizes two imperatives:
Give more duels to entries that have had less duels so far
Match people with similar ratings with higher probability
Given:
N = total number of entries in the tournament
D = total number of duels played in the tournament so far by all players
Dx = how many duels player x has had so far
To choose players x and y to duel, we first choose player x with probability:
p(x) = (1 - (Dx / D)) / N
Then choose player y the following way:
Sort the players by rating
Let the probability of choosing player j at index jIdx in the sorted list be:
p(j) = ...
0, if (j == x)
n*r^abs(jIdx - xIdx) otherwise
where 0 < r < 1 is a coefficient to be chosen, and n is a normalization factor.
Basically the probabilities in either direction from x form a geometic series, normalized so they sum to 1.
Concerns
Maximize informational value of a duel - pairing the lowest rated entry against the highest rated entry is very unlikely to give you any useful information.
Speed - we don't want to do massive amounts of calculations just to choose one pair. One alternative is to use something like the Swiss pairing system and pair up all entries at once, instead of choosing new duels one at a time. This has the drawback (?) that all entries submitted in a given timeframe will experience roughly the same amount of duels, which may or may not be desirable.
Equilibrium - Pixoto's ImageDuel algorithm detects when entries are unlikely to further improve their rating and gives them less duels from then on. The benefits of such detection are debatable. On the one hand, you can save on computation if you "pause" half the entries. On the other hand, entries with established ratings may be the perfect matches for new entries, to establish the newbies' ratings.
Number of entries - if there are just a few entries, say 10, perhaps a simpler algorithm should be used.
Wins/Losses - how does the player's win/loss ratio affect the next pairing, if at all?
Storage - what to store about each entry and about the tournament itself? Currently stored:
Tournament Entry: # duels so far, # wins, # losses, rating
Tournament: # duels so far, # entries
instead of throwing in ELO and ad-hoc probability formulae, you could use a standard approach based on the maximum likelihood method.
The maximum likelihood method is a method for parameter estimation and it works like this (example). Every contestant (player) is assigned a parameter s[i] (1 <= i <= N where N is total number of contestants) that measures the strength or skill of that player. You pick a formula that maps the strengths of two players into a probability that the first player wins. For example,
P(i, j) = 1/(1 + exp(s[j] - s[i]))
which is the logistic curve (see http://en.wikipedia.org/wiki/Sigmoid_function). When you have then a table that shows the actual results between the users, you use global optimization (e.g. gradient descent) to find those strength parameters s[1] .. s[N] that maximize the probability of the actually observed match result. E.g. if you have three contestants and have observed two results:
Player 1 won over Player 2
Player 2 won over Player 3
then you find parameters s[1], s[2], s[3] that maximize the value of the product
P(1, 2) * P(2, 3)
Incidentally, it can be easier to maximize
log P(1, 2) + log P(2, 3)
Note that if you use something like the logistics curve, it is only the difference of the strength parameters that matters so you need to anchor the values somewhere, e.g. choose arbitrarily
s[1] = 0
In order to have more recent matches "weigh" more, you can adjust the importance of the match results based on their age. If t measures the time since a match took place (in some time units), you can maximize the value of the sum (using the example)
e^-t log P(1, 2) + e^-t' log P(2, 3)
where t and t' are the ages of the matches 1-2 and 2-3, so that those games that occurred more recently weigh more.
The interesting thing in this approach is that when the strength parameters have values, the P(...) formula can be used immediately to calculate the win/lose probability for any future match. To pair contestants, you can pair those where the P(...) value is close to 0.5, and then prefer those contestants whose time-adjusted number of matches (sum of e^-t1 + e^-t2 + ...) for match ages t1, t2, ... is low. The best thing would be to calculate the total impact of a win or loss between two players globally and then prefer those matches that have the largest expected impact on the ratings, but that could require lots of calculations.
You don't need to run the maximum likelihood estimation / global optimization algorithm all the time; you can run it e.g. once a day as a batch run and use the results for the next day for matching people together. The time-adjusted match masses can be updated real time anyway.
On algorithm side, you can sort the players after the maximum likelihood run base on their s parameter, so it's very easy to find equal-strength players quickly.

Programming algorithm: finding the winner of a competition

I will compete in the OBI (Brazilian Olympiad of Informatics, in English) and I am trying a few exercises from the last years. But I can't find a solution for this exercise (I translated it, so there may be a few errors):
Chocolate Competition
Carlos and Paula just got a bag of chocolate balls. As they would eat
everything too quickly, they made a competition:
They will eat alternately, one after the other (Paula always starts).
Each time, they can only eat from 1 to M balls, M decided by Paula's mother, so they don't choke.
If one ate K balls in his/her turn, the next can't eat K balls.
Whoever can't play according the rules above loses.
In the example below with M = 5 and 20 balls, Carlos has won:
Who plays How many ate Balls left
20
Paula 5 15
Carlos 4 11
Paula 3 8
Carlos 4 4
Paula 2 2
Carlos 1 1
Note that in the end, Carlos couldn't eat 2 balls to win, because Paula ate 2 in her last turn. But Paula couldn't eat the last ball, because Carlos ate 1 in his last turn, so Paula can't play and loses.
Both are very smart and play optimally. If there is a sequence of turns that ensures him/her the victory independent of the other's turns, he/she will play these sequences.
Task:
Your task is to find out who will win the competition if both play optimally.
Input:
The input contains only a single test group, which should be read from the standard input (usually the keyboard).
The input has 2 integers N (2 ≤ N ≤ 1000000) and M (2 ≤ M ≤ 1000), being N the number of balls and M the number allowed per turn.
Output:
Your program should print, in the standard output, one line containing the name of the winner.
Examples:
Input: Output:
5 3 Paula
20 5 Carlos
5 6 Paula
I've been trying to solve the problem, but I have no idea how.
A solution in C can be found here: http://olimpiada.ic.unicamp.br/passadas/OBI2009/res_fase2_prog/programacao_n2/solucoes/chocolate.c.txt But I can't understand the algorithm. Someone posted a question about this problem in another site, but nobody replied.
Could you explain me the algorithm?
Here are the expected outputs of the program: http://olimpiada.ic.unicamp.br/passadas/OBI2009/res_fase2_prog/programacao_n2/gabaritos/chocolate.zip
Let's say we have a boolean function FirstPlayerWin (FPW) that takes two arguments: number of chocolates left (c) and the last move (l) i.e. the number of chocolates taken on the previous round, which is 0 at the first move. The routine returns true if and only if the first player to play at this situation is guaranteed a win.
The base case is that FPW(0, l) = false for any l != 1
Otherwise, to calculate FPW(c, l), FPW(c, l) is true if for any x <= M, x <= c, x != l, FPW(c - x, x) is false. Otherwise it is false. This is where dynamic programming kicks, because now the calculation of FPW is reduced to calculating FPW for smaller values of c.
However, storing the entries for this formulation would require N * M table entries, where as the solution you pointed to uses only 2N table entries.
The reason for this is that if FPW(c, 0) is true (first player wins if any move is available at chocolate count c) but FPW(c, x) is false for x > 0, FPW(c, y) for and y != x must be true. This is because if denying the move x makes the player lose, i.e. the player would win only by playing x, then the move x is available when y is banned instead. Therefore it is enough to store for any count 'c' at most one forbidden move that causes the player to lose there. So you can refactor the dynamic programming problem so that instead of storing the full 2-dimensional array FPW(c, x) you have two arrays, one stores the values FPW(c, 0) and the other stores the single forbidden moves that cause the first player to lose instead of winning, if any.
How you get to the exact text of the quoted C program is left as an exercise to the reader.
I think this is yet another thinly disguised exercise in dynamic programming. The state of the game is described by two quantities: the number of balls remaining, and the number of balls eaten in the previous move. When the number of balls remaining is <= M, the game is either won (if the number remaining is not equal to the number eaten in the previous move) or lost (if it is - you can't eat all the balls, and your opponent can eat the balls that you leave).
If you have worked out the win/lose situation for all numbers of balls up to H, and all possible numbers of balls eaten by the previous player and written this down in a table, then you can work out the answers for all numbers of balls up to H+1. A player with H+1 balls and k balls eaten in the previous move will consider all possibilities - eat i balls for i = 1 to M except for the illegal value of k, leaving a position with H+1-i balls and a previous move of i. They can use the table giving the win-lose situation for up to H balls left to try and find a legal k that gives them a win. If they can find such a value, the H+1/k position is a win. If not, it is a loss, so they can extend the table to cover up to H+1 balls, and so on.
I haven't worked through all the uncommented example code, but it looks like it could be doing something like this - using a dynamic programming like recursion to build a table.

Resources