Finding better approach of Modified Nim - algorithm

It is ultimately a game of nim with certain modification.
Rules are as follows :
There are two players A and B.
The game is played with two piles of matches. Initially, the first pile contains N matches and the second one contains M matches.
The players alternate turns; A plays first.
On each turn, the current player must choose one pile and remove a positive number of matches (not exceeding the current number of matches on that pile) from it.
It is only allowed to remove X matches from a pile if the number of matches in the other pile divides X.
The player that takes the last match from any pile wins.
Both players play optimally.
My take :
let us say we have two piles 2 7. we have 3 cases to reduce the second pile to wiz : 2 1 , 2 3 , 2 5 If A is playing optimally he/she will go for 2 3 so that the only chance left for B is to do 2 1 and then A can go for 0 1 and win the game. The crust of the solution being that if A or B ever encounters any situation where it can directly lose in the next step then it will try their best to avoid it and use the situation to their advantage by just leaving at a state 1 step before that losing stage .
But this approach fails for some unknown test cases , is there any better approach to find the winner , or any other test case which defies this logic.

This is a classic dynamic programming problem. First, find a recurrence relationship that describes an outcome of a game in terms of smaller games. Your parameters here are X and Y, where X is the number of matches in one stack and Y in the other. How do I reduce this problem?
Well, suppose it is my turn with X matches, and suppose that Y is divisible by the numbers a1, a2, and a3, while x is divisible by b1, b2, b3. Then, I have six possible turns. The problem reduces to solving for (X-a1, Y) (X-a2, Y) (X-a3,Y), (X,Y-b1), (X,Y-b2), (X, Y-b3). Once these six smaller games are solved, if one of them is a winning game for me, then I make the corresponding move and win the game.
There is one more parameter, which is whose turn it is. This doubles the size of solvable problems.
The key is to find all possible moves, and recur for each of them, keeping a storage of already solved games for efficiency.
The base case needs to be figured out naturally.

Related

probability of winning a special sing-elimination tournament using dynamic programming and bitmask

Let's say we have N teams in a tournament and based on historical data we know what is the probability of each team winning any other team .Lets put all the probabilities in a matrix called P . P[a][b] is the probability team a winning team b. It is obvious that P[a][a] = 0 and P[a][b] = 1-P[b][a].
In this tournament at every round, two of teams compete against each other and the loser is eliminated. This two team are chosen randomly (with equal possibility of each team being picked). So at the first round we have n teams, next n-1 teams and so on until only one team remains and becomes the champion. What is the probability of each team becoming the champion? ( 1 <= N <= 18).
At first when I didn't know how to approach the problem but after some reading and search and keeping in mind that max n is 18 I figured at that using Dynamic programming and Bitmask is the way to go. How ever I couldn't figure at a solution. Here are my problems:
I have really hard time to figure at what are the sub problems and what sub problems should not be recomputed, basically I can't find a well defined recursive ( or not recursive) equation for the problem
In bitmask+dp problems we usually define something like dp[mask][n] or dp[n][mask]. I tried different approaches to define the mask but since the general solution is not clear to me there was no success
Some guidance on this two problems would be very helpful.
This is not really a dynamic programming problem.
If you have a vector V that gives the probability of each player being in the game after n rounds, then you can calculate the player probabilities for n+1 rounds by:
V'i = 2/((18-n)(17-n)) * sum over all j!=i of [ViVjPi,j]
That first factor is the probability that any given available match will be chosen, which depends on the number of previous rounds, because each successive round has fewer players to match up.
The second part is the probability of the players being available for each match, times the probability that the current player will win.
Just do this calculation 17 times to get the player probabilities after 17 rounds, which is the answer you're looking for. You can even drop that first factor, and fix it at the end by normalizing the vector so that the probabilities sum to 1.

Probabilities and exercises

I've been practicing some probabilities for last few weeks around 10/week usually did the job, however as the topics got harder I've started struggling and now I'm completely stuck. I've been looking online and i did find similar examples but nothing to touch my case in particular. I will continue looking for an answer so even if u can't answer my specific cases links to online literature will be appreciated.
answers are welcome, however i would rather be more interested in explanation of how a problem works.
urn contains m white and k black balls. two players are drawing the balls one after another without putting them back into the urn. the winner is the first person to draw white ball. what is the probability that second player will be the winner? (k=4, m=4)
women tend to vote with probability of a, men tend to do the same with probability of b. the probability c tells us that if we take a couple one of them will not go voting. what are the chances that at least one of them will vote? (a=0.49, b=0.61, c=0.75)
we are sending a message of n bytes. to obtain higher chance of sending entire message without ruining it we use k different wires. What is the probability of sending an entire message through one of the wires without ruining it, if the probability of ruining any of the bytes at any of the wires is p. (p=0.06, n=7, k=6)
the basketball game finals play to N wins. after m + n games the result is m : n. what is the probability of winning the finals for the lagging(the team that's behind) team if it is known that the leading team wins each match with a probability of p? (m=3 n=2 N=5 p=0.36)
sorry for my English and any help will be appreciated
Just A
So Given m = k and both = 4. And each player takes a ball out per turn. Given the worst case scenario 4 good balls will be picked (K) there will be 2 turns of both players picking the black ball then the first player will be guaranteed the white ball next turn. Therefore you need to work out the probability for the first two turns.
Turn 1
- P1 turn 1 Chance for black= 4/8
- P2 turn 2 Change for white = 4/7
Therefore turn 1 chance = (4/8)*(4/7). = 28.57%
Turn 2
- P1 turn 1 Chance for black= 2/6
- P2 turn 2 Change for white = 4/5
Therefore turn 2 chance = (4/6)*(4/5). = 26.664% but to get to turn 2 is a 28.57% chance
Therefore its the first probability add the second probability given the first case occurs. So 28.57% + (26.44%*28.57%)= 36.12%
The third case is a always lose. Top tip then be the first player!
Turn 3 Instance win for player 2 therefore a loss. = 0%

Branching Factor and Depth

Question
A simple two-player game involves a pile of N matchsticks and two
players who have alternating turns. In each turn, a player removes 1,
2 or 3 matchsticks from the pile. The player who removes the last
matchstick loses the game.
A) What are the branching factor and depth of the game tree (give a general solution expressed in terms of N)? How large is the search
space?
B) How many unique states are there in the game? For large N what could be done to make the search more efficient?
Answer
A) I said the branching factor would be 3 but I justified this because the player could only ever remove up to 3 matches, meaning our tree would usually have three children. The second part with regards to the depth, I'm not sure.
B) N x 2 where N is the number of matches remaining. I am not sure how we could make the search more efficient though? Maybe introducing Alpha-beta pruning?
A :
For the depth, just imagine what the longest possible game would look like. It is the game that consists of both players only removing 1 match in each turn. Since there are n matches, such a game would take n turns : the tree has depth n.
B :
There are only 2*N states, each of them accessible from 3 states of higher matchstick count. Since the number of matches necessarily goes down as the game goes on, the graph of possible states is a DAG (Directed Acyclic Graph). A dynamic programming method is therefore possible to analyze this game. In the end, you will see that the optimal move only depends on N mod 4, with N the number of remaining matches.
EDIT : Proof idea for the N mod 4 :
Every position is either a losing or a winning position. A losing position is a situation where no matter what you play, if your adversary plays optimally, you will lose. Similarly, a winning position is a situation where if you play the right moves, the adversary cannot win. N=1 is a losing position (by definition of the game). Therefore, N=2,3,4 are winning positions because by removing the right amount of matches you put the adversary in a losing position. N=5 is a losing position because no matter what admissible number of matches you remove, you put the adversary in a winning position. N=6,7,8 are winning positions ... you get the idea.
Now it is just about making this proof formal : take as hypothesis that a position N is a losing position if and only if N mod 4 = 1. If that is true up to some integer k, you can prove that it is true for k+1. It's true up to k = 4 as we showed earlier. By recurrence, it is true for any N.
The state of the game at any time can be described by whose turn it is and the number of matches held by each player. After n moves there are 3^n possible histories, but for large n, many fewer than 3^n possible states, so you can save time by, for example, recognising that you are about to encounter a state that you have already encountered and worked out a value for before.
See also https://en.wikipedia.org/wiki/Nim - if this is Nim, or a variety of Nim, there are efficient strategies already worked out for it.

picking element game

This is a simple game:
There is a set, A={a1,...,an}, the opponents can choose one of the first or last elements of set, and at the end the one who collect bigger numbers wins. Now say each participants dose his best, what I need to do is write a Dynamic algorithm to estimate their score.
any idea or clue is truly appreciated.
Here's a hint: to write a dynamic programming algorithm, you typically need a recurrence. Given
A={a1,...,an}
The recurrence would look something like this
f(A)= max( f({a1,...,a_n-1}) , f({a2,...,a_n}) )
Actually the recurrence relation given by dfb may not lead to right answer
as it is not leading to the right sub-optimal structure !
Assume the Player A begins the game :
the structure of problem for him is [a1,a2,...an]
After choosing an element , either a1 or an , its player B's turn to play , and then after that move it is player A's move.
So after two moves , Player A's turn will come again and this will be the right sub-problem for him .The right recurrence relation will be
Suppose from i to j elements are left :
A(i,j)= max(min( A(i+1,j-1),A(i+2,j)+a[i] ), min(A(i,j-2),A(i+1,j-1))+a[j])
Refer to the following link :
http://people.csail.mit.edu/bdean/6.046/dp/
EXAMPLE CODE
Here is Python code to compute the optimal score for first and second players.
A=[3,1,1,3,1,1,3]
cache={}
def go(a,b):
"""Find greatest difference between player 1 coins and player 2 coins when choosing from A[a:b]"""
if a==b: return 0 # no stacks left
if a==b-1: return A[a] # only one stack left
key=a,b
if key in cache:
return cache[key]
v=A[a]-go(a+1,b) # taking first stack
v=max(v,A[b-1]-go(a,b-1)) # taking last stack
cache[key]=v
return v
v = go(0,len(A))
n=sum(A)
print (n+v)/2,(n-v)/2
COUNTEREXAMPLE
Note that the code includes a counter example to one of the other answers to this question.
Consider the case [3,1,1,3,1,1,3].
By symmetry, the first players move always leaves the pattern [1,1,3,1,1,3].
For this the sum of even elements is 1+3+1=5, while the sum of odd is 1+1+3=5, so the argument is that from this position the second player will always win 5, and the first player will always win 5, so the first player will win (as he gets 5 in addition to the 3 from the first move).
However, this logic is flawed because the second player can actually get more.
First player takes 3, leaves [1,1,3,1,1,3] (only choice by symmetry)
Second player takes 3, leaves [1,1,3,1,1]
First player takes 1, leaves [1,3,1,1] (only choice by symmetry)
Second player takes 1, leaves [1,3,1]
First player takes 1, leaves [3,1] (only choice by symmetry)
Second player takes 3, leaves [1]
First player takes 1
So overall first player gets 3+1+1+1=6, while second gets 3+1+3=7 and second player wins.
The flaw is that although it is true that the second player can play such that they will win all even or all odd positions, this is not optimal play and they can actually do better than this in some cases.
Actually you do not need dynamic programming, because it is easy to find an explicit solution for the game above.
Case n is even or n = 1.
The second player to move will always lose.
Case n odd and n > 1.
The second player has a winning strategy iff one of the following 2 scenarios happen:
The elements with even index have bigger sum than all the elements with odd index
All odd elements except the last have bigger sum than all the remainings AND
All odd elements except the first have bigger sum than all the remainings.
Proof sketch:
Case n is even or n = 1: Let Sodd and Seven be the sum of all elements with even/odd indexes. Assume that Sodd > Seven, same argument hold otherwise. The first player has a winning strategy, since he can play in such a way that he will get all odd indexed items.
The case n is odd and n > 1 can also be resolved directly. In fact the first player has two options, he can get the first or last element of the set. Of the remaining elements, partition them the two subsets with odd and even indexes; by the argument above, the second player is going to take the subset with largest sum. If you expand the tree game you will end up with the statement above.

How do I calculate the shanten number in mahjong?

This is a followup to my earlier question about deciding if a hand is ready.
Knowledge of mahjong rules would be excellent, but a poker- or romme-based background is also sufficient to understand this question.
In Mahjong 14 tiles (tiles are like
cards in Poker) are arranged to 4 sets
and a pair. A straight ("123") always
uses exactly 3 tiles, not more and not
less. A set of the same kind ("111")
consists of exactly 3 tiles, too. This
leads to a sum of 3 * 4 + 2 = 14
tiles.
There are various exceptions like Kan
or Thirteen Orphans that are not
relevant here. Colors and value ranges
(1-9) are also not important for the
algorithm.
A hand consists of 13 tiles, every time it's our turn we get to pick a new tile and have to discard any tile so we stay on 13 tiles - except if we can win using the newly picked tile.
A hand that can be arranged to form 4 sets and a pair is "ready". A hand that requires only 1 tile to be exchanged is said to be "tenpai", or "1 from ready". Any other hand has a shanten-number which expresses how many tiles need to be exchanged to be in tenpai. So a hand with a shanten number of 1 needs 1 tile to be tenpai (and 2 tiles to be ready, accordingly). A hand with a shanten number of 5 needs 5 tiles to be tenpai and so on.
I'm trying to calculate the shanten number of a hand. After googling around for hours and reading multiple articles and papers on this topic, this seems to be an unsolved problem (except for the brute force approach). The closest algorithm I could find relied on chance, i.e. it was not able to detect the correct shanten number 100% of the time.
Rules
I'll explain a bit on the actual rules (simplified) and then my idea how to tackle this task. In mahjong, there are 4 colors, 3 normal ones like in card games (ace, heart, ...) that are called "man", "pin" and "sou". These colors run from 1 to 9 each and can be used to form straights as well as groups of the same kind. The forth color is called "honors" and can be used for groups of the same kind only, but not for straights. The seven honors will be called "E, S, W, N, R, G, B".
Let's look at an example of a tenpai hand: 2p, 3p, 3p, 3p, 3p, 4p, 5m, 5m, 5m, W, W, W, E. Next we pick an E. This is a complete mahjong hand (ready) and consists of a 2-4 pin street (remember, pins can be used for straights), a 3 pin triple, a 5 man triple, a W triple and an E pair.
Changing our original hand slightly to 2p, 2p, 3p, 3p, 3p, 4p, 5m, 5m, 5m, W, W, W, E, we got a hand in 1-shanten, i.e. it requires an additional tile to be tenpai. In this case, exchanging a 2p for an 3p brings us back to tenpai so by drawing a 3p and an E we win.
1p, 1p, 5p, 5p, 9p, 9p, E, E, E, S, S, W, W is a hand in 2-shanten. There is 1 completed triplet and 5 pairs. We need one pair in the end, so once we pick one of 1p, 5p, 9p, S or W we need to discard one of the other pairs. Example: We pick a 1 pin and discard an W. The hand is in 1-shanten now and looks like this: 1p, 1p, 1p, 5p, 5p, 9p, 9p, E, E, E, S, S, W. Next, we wait for either an 5p, 9p or S. Assuming we pick a 5p and discard the leftover W, we get this: 1p, 1p, 1p, 5p, 5p, 5p, 9p, 9p, E, E, E, S, S. This hand is in tenpai in can complete on either a 9 pin or an S.
To avoid drawing this text in length even more, you can read up on more example at wikipedia or using one of the various search results at google. All of them are a bit more technical though, so I hope the above description suffices.
Algorithm
As stated, I'd like to calculate the shanten number of a hand. My idea was to split the tiles into 4 groups according to their color. Next, all tiles are sorted into sets within their respective groups to we end up with either triplets, pairs or single tiles in the honor group or, additionally, streights in the 3 normal groups. Completed sets are ignored. Pairs are counted, the final number is decremented (we need 1 pair in the end). Single tiles are added to this number. Finally, we divide the number by 2 (since every time we pick a good tile that brings us closer to tenpai, we can get rid of another unwanted tile).
However, I can not prove that this algorithm is correct, and I also have trouble incorporating straights for difficult groups that contain many tiles in a close range. Every kind of idea is appreciated. I'm developing in .NET, but pseudo code or any readable language is welcome, too.
I've thought about this problem a bit more. To see the final results, skip over to the last section.
First idea: Brute Force Approach
First of all, I wrote a brute force approach. It was able to identify 3-shanten within a minute, but it was not very reliable (sometimes too a lot longer, and enumerating the whole space is impossible even for just 3-shanten).
Improvement of Brute Force Approach
One thing that came to mind was to add some intelligence to the brute force approach. The naive way is to add any of the remaining tiles, see if it produced Mahjong, and if not try the next recursively until it was found. Assuming there are about 30 different tiles left and the maximum depth is 6 (I'm not sure if a 7+-shanten hand is even possible [Edit: according to the formula developed later, the maximum possible shanten number is (13-1)*2/3 = 8]), we get (13*30)^6 possibilities, which is large (10^15 range).
However, there is no need to put every leftover tile in every position in your hand. Since every color has to be complete in itself, we can add tiles to the respective color groups and note down if the group is complete in itself. Details like having exactly 1 pair overall are not difficult to add. This way, there are max around (13*9)^6 possibilities, that is around 10^12 and more feasible.
A better solution: Modification of the existing Mahjong Checker
My next idea was to use the code I wrote early to test for Mahjong and modify it in two ways:
don't stop when an invalid hand is found but note down a missing tile
if there are multiple possible ways to use a tile, try out all of them
This should be the optimal idea, and with some heuristic added it should be the optimal algorithm. However, I found it quite difficult to implement - it is definitely possible though. I'd prefer an easier to write and maintain solution first.
An advanced approach using domain knowledge
Talking to a more experienced player, it appears there are some laws that can be used. For instance, a set of 3 tiles does never need to be broken up, as that would never decrease the shanten number. It may, however, be used in different ways (say, either for a 111 or a 123 combination).
Enumerate all possible 3-set and create a new simulation for each of them. Remove the 3-set. Now create all 2-set in the resulting hand and simulate for every tile that improves them to a 3-set. At the same time, simulate for any of the 1-sets being removed. Keep doing this until all 3- and 2-sets are gone. There should be a 1-set (that is, a single tile) be left in the end.
Learnings from implementation and final algorithm
I implemented the above algorithm. For easier understanding I wrote it down in pseudocode:
Remove completed 3-sets
If removed, return (i.e. do not simulate NOT taking the 3-set later)
Remove 2-set by looping through discarding any other tile (this creates a number of branches in the simulation)
If removed, return (same as earlier)
Use the number of left-over single tiles to calculate the shanten number
By the way, this is actually very similar to the approach I take when calculating the number myself, and obviously never to yields too high a number.
This works very well for almost all cases. However, I found that sometimes the earlier assumption ("removing already completed 3-sets is NEVER a bad idea") is wrong. Counter-example: 23566M 25667P 159S. The important part is the 25667. By removing a 567 3-set we end up with a left-over 6 tile, leading to 5-shanten. It would be better to use two of the single tiles to form 56x and 67x, leading to 4-shanten overall.
To fix, we simple have to remove the wrong optimization, leading to this code:
Remove completed 3-sets
Remove 2-set by looping through discarding any other tile
Use the number of left-over single tiles to calculate the shanten number
I believe this always accurately finds the smallest shanten number, but I don't know how to prove that. The time taken is in a "reasonable" range (on my machine 10 seconds max, usually 0 seconds).
The final point is calculating the shanten out of the number of left-over single tiles. First of all, it is obvious that the number is in the form 3*n+1 (because we started out with 14 tiles and always subtracted 3 tiles).
If there is 1 tile left, we're shanten already (we're just waiting for the final pair). With 4 tiles left, we have to discard 2 of them to form a 3-set, leaving us with a single tile again. This leads to 2 additional discards. With 7 tiles, we have 2 times 2 discards, adding 4. And so on.
This leads to the simple formula shanten_added = (number_of_singles - 1) * (2/3).
The described algorithm works well and passed all my tests, so I'm assuming it is correct. As stated, I can't prove it though.
Since the algorithm removes the most likely tiles combinations first, it kind of has a built-in optimization. Adding a simple check if (current_depth > best_shanten) then return; it does very well even for high shanten numbers.
My best guess would be an A* inspired approach. You need to find some heuristic which never overestimates the shanten number and use it to search the brute-force tree only in the regions where it is possible to get into a ready state quickly enough.
Correct algorithm sample: syanten.cpp
Recursive cut forms from hand in order: sets, pairs, incomplete forms, - and count it. In all variations. And result is minimal Shanten value of all variants:
Shanten = Min(Shanten, 8 - * 2 - - )
C# sample (rewrited from c++) can be found here (in Russian).
I've done a little bit of thinking and came up with a slightly different formula than mafu's. First of all, consider a hand (a very terrible hand):
1s 4s 6s 1m 5m 8m 9m 9m 7p 8p West East North
By using mafu's algorithm all we can do is cast out a pair (9m,9m). Then we are left with 11 singles. Now if we apply mafu's formula we get (11-1)*2/3 which is not an integer and therefore cannot be a shanten number. This is where I came up with this:
N = ( (S + 1) / 3 ) - 1
N stands for shanten number and S for score sum.
What is score? It's a number of tiles you need to make an incomplete set complete. For example, if you have (4,5) in your hand you need either 3 or 6 to make it a complete 3-set, that is, only one tile. So this incomplete pair gets score 1. Accordingly, (1,1) needs only 1 to become a 3-set. Any single tile obviously needs 2 tiles to become a 3-set and gets score 2. Any complete set of course get score 0. Note that we ignore the possibility of singles becoming pairs. Now if we try to find all of the incomplete sets in the above hand we get:
(4s,6s) (8m,9m) (7p,8p) 1s 1m 5m 9m West East North
Then we count the sum of its scores = 1*3+2*7 = 17.
Now if we apply this number to the formula above we get (17+1)/3 - 1 = 5 which means this hand is 5-shanten. It's somewhat more complicated than Alexey's and I don't have a proof but so far it seems to work for me. Note that such a hand could be parsed in the other way. For example:
(4s,6s) (9m,9m) (7p,8p) 1s 1m 5m 8m West East North
However, it still gets score sum 17 and 5-shanten according to formula. I also can't proof this and this is a little bit more complicated than Alexey's formula but also introduces scores that could be applied(?) to something else.
Take a look here: ShantenNumberCalculator. Calculate shanten really fast. And some related stuff (in japanese, but with code examples) http://cmj3.web.fc2.com
The essence of the algorithm: cut out all pairs, sets and unfinished forms in ALL possible ways, and thereby find the minimum value of the number of shanten.
The maximum value of the shanten for an ordinary hand: 8.
That is, as it were, we have the beginnings for 4 sets and one pair, but only one tile from each (total 13 - 5 = 8).
Accordingly, a pair will reduce the number of shantens by one, two (isolated from the rest) neighboring tiles (preset) will decrease the number of shantens by one,
a complete set (3 identical or 3 consecutive tiles) will reduce the number of shantens by 2, since two suitable tiles came to an isolated tile.
Shanten = 8 - Sets * 2 - Pairs - Presets
Determining whether your hand is already in tenpai sounds like a multi-knapsack problem. Greedy algorithms won't work - as Dialecticus pointed out, you'll need to consider the entire problem space.

Resources