Question
A simple two-player game involves a pile of N matchsticks and two
players who have alternating turns. In each turn, a player removes 1,
2 or 3 matchsticks from the pile. The player who removes the last
matchstick loses the game.
A) What are the branching factor and depth of the game tree (give a general solution expressed in terms of N)? How large is the search
space?
B) How many unique states are there in the game? For large N what could be done to make the search more efficient?
Answer
A) I said the branching factor would be 3 but I justified this because the player could only ever remove up to 3 matches, meaning our tree would usually have three children. The second part with regards to the depth, I'm not sure.
B) N x 2 where N is the number of matches remaining. I am not sure how we could make the search more efficient though? Maybe introducing Alpha-beta pruning?
A :
For the depth, just imagine what the longest possible game would look like. It is the game that consists of both players only removing 1 match in each turn. Since there are n matches, such a game would take n turns : the tree has depth n.
B :
There are only 2*N states, each of them accessible from 3 states of higher matchstick count. Since the number of matches necessarily goes down as the game goes on, the graph of possible states is a DAG (Directed Acyclic Graph). A dynamic programming method is therefore possible to analyze this game. In the end, you will see that the optimal move only depends on N mod 4, with N the number of remaining matches.
EDIT : Proof idea for the N mod 4 :
Every position is either a losing or a winning position. A losing position is a situation where no matter what you play, if your adversary plays optimally, you will lose. Similarly, a winning position is a situation where if you play the right moves, the adversary cannot win. N=1 is a losing position (by definition of the game). Therefore, N=2,3,4 are winning positions because by removing the right amount of matches you put the adversary in a losing position. N=5 is a losing position because no matter what admissible number of matches you remove, you put the adversary in a winning position. N=6,7,8 are winning positions ... you get the idea.
Now it is just about making this proof formal: take as hypothesis that a position N is a losing position if and only if N mod 4 = 1. If that is true for every N up to some integer k, you can prove that it is true for k+1, because the three positions reachable from k+1 are k, k-1 and k-2. It is true up to k = 4, as shown above. By induction, it is true for any N.
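To make the dynamic programming idea concrete, here is a minimal sketch that labels every pile size as winning or losing and lets you check the N mod 4 = 1 pattern. The function name losing_positions and the choice to treat an empty pile as a win for the player to move are my own conventions, not part of the question.

```python
def losing_positions(max_n):
    """Return the pile sizes 1..max_n from which the player to move loses."""
    # wins[n] is True when the player to move with n matches can force a win.
    wins = [False] * (max_n + 1)
    # With 0 matches left the opponent just took the last one and lost.
    wins[0] = True
    for n in range(1, max_n + 1):
        # A move is good when it leaves the opponent in a losing position.
        wins[n] = any(not wins[n - k] for k in (1, 2, 3) if k <= n)
    return [n for n in range(1, max_n + 1) if not wins[n]]


print(losing_positions(20))
```

Each pile size is evaluated once and looks at no more than three smaller sizes, so the table costs O(N) time instead of exploring the whole game tree.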
The state of the game at any time can be described by whose turn it is and the number of matches held by each player. After n moves there are 3^n possible histories, but for large n, many fewer than 3^n possible states, so you can save time by, for example, recognising that you are about to encounter a state that you have already encountered and worked out a value for before.
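To get a feeling for the gap between histories and states, here is a small sketch that counts complete move sequences for a pile of n matches and compares that with the 2*N figure from the other answer. The helper names count_histories and count_states are made up for this illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_histories(n):
    """Number of distinct complete move sequences starting from n matches."""
    if n == 0:
        return 1  # game over: only the empty continuation remains
    # The next move removes 1, 2 or 3 matches (never more than are left).
    return sum(count_histories(n - k) for k in (1, 2, 3) if k <= n)


def count_states(n):
    """Distinct (matches remaining, player to move) pairs with 1..n matches."""
    return 2 * n


for n in (5, 10, 20):
    print(n, count_histories(n), count_states(n))
```

The number of histories grows exponentially while the number of states grows linearly, which is why caching the value of each state pays off.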
See also https://en.wikipedia.org/wiki/Nim - if this is Nim, or a variety of Nim, there are efficient strategies already worked out for it.
This is a simple game:
There is a list, A={a1,...,an}. On each turn a player takes either the first or the last element of the list, and at the end the one who has collected the larger total wins. Assuming each participant does his best, I need to write a dynamic programming algorithm to compute their scores.
Any idea or clue is truly appreciated.
Here's a hint: to write a dynamic programming algorithm, you typically need a recurrence. Given
A={a1,...,an}
The recurrence would look something like this
f(A)= max( f({a1,...,a_n-1}) , f({a2,...,a_n}) )
Actually, the recurrence relation given by dfb may not lead to the right answer,
as it does not use the right optimal substructure.
Assume the Player A begins the game :
the structure of problem for him is [a1,a2,...an]
After he chooses an element, either a1 or an, it is player B's turn to play, and after that move it is player A's move again.
So player A's turn comes around again after two moves, and that is the right sub-problem for him. Player B will leave him the worse of the two possible remainders, hence the min. The right recurrence relation is as follows.
Suppose the elements from i to j are left:
A(i,j) = max( a[i] + min(A(i+2,j), A(i+1,j-1)), a[j] + min(A(i+1,j-1), A(i,j-2)) )
Refer to the following link :
http://people.csail.mit.edu/bdean/6.046/dp/
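For what it's worth, here is a minimal memoized sketch of that two-move recurrence. The name first_player_score and the 0-based inclusive indices are my own choices; the base cases (empty range scores 0, a single element scores itself) are assumptions the recurrence leaves implicit.

```python
from functools import lru_cache


def first_player_score(a):
    """Best total the player to move can guarantee on the row a."""
    a = tuple(a)

    @lru_cache(maxsize=None)
    def best(i, j):
        if i > j:
            return 0      # nothing left
        if i == j:
            return a[i]   # one element left, take it
        # The opponent replies so as to leave us the worse sub-problem.
        take_left = a[i] + min(best(i + 2, j), best(i + 1, j - 1))
        take_right = a[j] + min(best(i + 1, j - 1), best(i, j - 2))
        return max(take_left, take_right)

    return best(0, len(a) - 1)


print(first_player_score([8, 15, 3, 7]))
```

There are O(n^2) index pairs and each is computed once, so this runs in quadratic time.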
EXAMPLE CODE
Here is Python code to compute the optimal score for first and second players.
    A = [3, 1, 1, 3, 1, 1, 3]
    cache = {}

    def go(a, b):
        """Find greatest difference between player 1 coins and player 2 coins when choosing from A[a:b]"""
        if a == b:
            return 0  # no stacks left
        if a == b - 1:
            return A[a]  # only one stack left
        key = a, b
        if key in cache:
            return cache[key]
        v = A[a] - go(a + 1, b)  # taking first stack
        v = max(v, A[b - 1] - go(a, b - 1))  # taking last stack
        cache[key] = v
        return v

    v = go(0, len(A))
    n = sum(A)
    # n + v is twice the first player's total, so integer division is exact
    print((n + v) // 2, (n - v) // 2)
COUNTEREXAMPLE
Note that the code includes a counter example to one of the other answers to this question.
Consider the case [3,1,1,3,1,1,3].
By symmetry, the first player's move always leaves the pattern [1,1,3,1,1,3].
For this the sum of even elements is 1+3+1=5, while the sum of odd is 1+1+3=5, so the argument is that from this position the second player will always win 5, and the first player will always win 5, so the first player will win (as he gets 5 in addition to the 3 from the first move).
However, this logic is flawed because the second player can actually get more.
First player takes 3, leaves [1,1,3,1,1,3] (only choice by symmetry)
Second player takes 3, leaves [1,1,3,1,1]
First player takes 1, leaves [1,3,1,1] (only choice by symmetry)
Second player takes 1, leaves [1,3,1]
First player takes 1, leaves [3,1] (only choice by symmetry)
Second player takes 3, leaves [1]
First player takes 1
So overall first player gets 3+1+1+1=6, while second gets 3+1+3=7 and second player wins.
The flaw is that although it is true that the second player can play such that they will win all even or all odd positions, this is not optimal play and they can actually do better than this in some cases.
Actually you do not need dynamic programming, because it is easy to find an explicit solution for the game above.
Case n is even or n = 1.
The second player to move will always lose.
Case n odd and n > 1.
The second player has a winning strategy iff one of the following 2 scenarios happen:
The elements with even index have bigger sum than all the elements with odd index
All odd elements except the last have bigger sum than all the remainings AND
All odd elements except the first have bigger sum than all the remainings.
Proof sketch:
Case n is even or n = 1: Let Sodd and Seven be the sum of all elements with even/odd indexes. Assume that Sodd > Seven, same argument hold otherwise. The first player has a winning strategy, since he can play in such a way that he will get all odd indexed items.
The case n is odd and n > 1 can also be resolved directly. The first player has two options: he can take the first or the last element of the set. Partition the remaining elements into the two subsets with odd and even indexes; by the argument above, the second player is going to take the subset with the larger sum. If you expand the game tree you will end up with the statement above.
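If you want to test this kind of claim rather than trust the sketch, a brute-force comparison is cheap. The code below is only an illustration under my own assumptions (random rows of even length with values 1..9, helper names invented here): it checks that for even n the first player's optimal total is never below what the "take all odd or all even positions" strategy guarantees, and it lets you evaluate any odd-length row directly.

```python
import random


def optimal_first_score(a):
    """Exact first-player total under optimal play (difference DP)."""
    n = len(a)
    # diff[i][j]: best (mover minus opponent) margin on the slice a[i:j]
    diff = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(1, n + 1):
        for i in range(n - length + 1):
            j = i + length
            diff[i][j] = max(a[i] - diff[i + 1][j], a[j - 1] - diff[i][j - 1])
    return (sum(a) + diff[0][n]) // 2


def parity_bound(a):
    """Total obtained by taking only the odd or only the even positions."""
    return max(sum(a[0::2]), sum(a[1::2]))


def check_even_case(trials=1000, seed=0):
    """True if the parity strategy is never better than optimal play."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = [rng.randint(1, 9) for _ in range(rng.choice([2, 4, 6, 8]))]
        if optimal_first_score(a) < parity_bound(a):
            return False
    return True


print(check_even_case())
print(optimal_first_score([3, 1, 1, 3, 1, 1, 3]))
```

The parity strategy only gives a lower bound on what a player can get, so a statement for odd n that is built on it should be checked against the exact values.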
This is a followup to my earlier question about deciding if a hand is ready.
Knowledge of mahjong rules would be excellent, but a poker- or romme-based background is also sufficient to understand this question.
In Mahjong 14 tiles (tiles are like cards in Poker) are arranged to 4 sets and a pair. A straight ("123") always uses exactly 3 tiles, not more and not less. A set of the same kind ("111") consists of exactly 3 tiles, too. This leads to a sum of 3 * 4 + 2 = 14 tiles.
There are various exceptions like Kan or Thirteen Orphans that are not relevant here. Colors and value ranges (1-9) are also not important for the algorithm.
A hand consists of 13 tiles, every time it's our turn we get to pick a new tile and have to discard any tile so we stay on 13 tiles - except if we can win using the newly picked tile.
A hand that can be arranged to form 4 sets and a pair is "ready". A hand that requires only 1 tile to be exchanged is said to be "tenpai", or "1 from ready". Any other hand has a shanten-number which expresses how many tiles need to be exchanged to be in tenpai. So a hand with a shanten number of 1 needs 1 tile to be tenpai (and 2 tiles to be ready, accordingly). A hand with a shanten number of 5 needs 5 tiles to be tenpai and so on.
I'm trying to calculate the shanten number of a hand. After googling around for hours and reading multiple articles and papers on this topic, this seems to be an unsolved problem (except for the brute force approach). The closest algorithm I could find relied on chance, i.e. it was not able to detect the correct shanten number 100% of the time.
Rules
I'll explain a bit about the actual rules (simplified) and then my idea of how to tackle this task. In mahjong, there are 4 colors: 3 normal ones like the suits in card games (spades, hearts, ...), which are called "man", "pin" and "sou". These colors run from 1 to 9 each and can be used to form straights as well as groups of the same kind. The fourth color is called "honors" and can be used for groups of the same kind only, but not for straights. The seven honors will be called "E, S, W, N, R, G, B".
Let's look at an example of a tenpai hand: 2p, 3p, 3p, 3p, 3p, 4p, 5m, 5m, 5m, W, W, W, E. Next we pick an E. This is a complete mahjong hand (ready) and consists of a 2-3-4 pin straight (remember, pins can be used for straights), a 3 pin triple, a 5 man triple, a W triple and an E pair.
Changing our original hand slightly to 2p, 2p, 3p, 3p, 3p, 4p, 5m, 5m, 5m, W, W, W, E, we get a hand in 1-shanten, i.e. it requires an additional tile to be tenpai. In this case, exchanging a 2p for a 3p brings us back to tenpai, so by drawing a 3p and then an E we win.
1p, 1p, 5p, 5p, 9p, 9p, E, E, E, S, S, W, W is a hand in 2-shanten. There is 1 completed triplet and 5 pairs. We need one pair in the end, so once we pick one of 1p, 5p, 9p, S or W we need to discard a tile from one of the other pairs. Example: We pick a 1 pin and discard a W. The hand is in 1-shanten now and looks like this: 1p, 1p, 1p, 5p, 5p, 9p, 9p, E, E, E, S, S, W. Next, we wait for either a 5p, 9p or S. Assuming we pick a 5p and discard the leftover W, we get this: 1p, 1p, 1p, 5p, 5p, 5p, 9p, 9p, E, E, E, S, S. This hand is in tenpai and can complete on either a 9 pin or an S.
To avoid making this text even longer, you can read up on more examples at Wikipedia or in one of the various search results on Google. All of them are a bit more technical though, so I hope the above description suffices.
Algorithm
As stated, I'd like to calculate the shanten number of a hand. My idea was to split the tiles into 4 groups according to their color. Next, all tiles are sorted into sets within their respective groups, so we end up with either triplets, pairs or single tiles in the honor group or, additionally, straights in the 3 normal groups. Completed sets are ignored. Pairs are counted, and the final number is decremented (we need 1 pair in the end). Single tiles are added to this number. Finally, we divide the number by 2 (since every time we pick a good tile that brings us closer to tenpai, we can get rid of another unwanted tile).
However, I can not prove that this algorithm is correct, and I also have trouble incorporating straights for difficult groups that contain many tiles in a close range. Every kind of idea is appreciated. I'm developing in .NET, but pseudo code or any readable language is welcome, too.
I've thought about this problem a bit more. To see the final results, skip over to the last section.
First idea: Brute Force Approach
First of all, I wrote a brute force approach. It was able to identify 3-shanten within a minute, but it was not very reliable (sometimes it took a lot longer, and enumerating the whole space is impossible even for just 3-shanten).
Improvement of Brute Force Approach
One thing that came to mind was to add some intelligence to the brute force approach. The naive way is to add any of the remaining tiles, see if it produced Mahjong, and if not try the next recursively until it was found. Assuming there are about 30 different tiles left and the maximum depth is 6 (I'm not sure if a 7+-shanten hand is even possible [Edit: according to the formula developed later, the maximum possible shanten number is (13-1)*2/3 = 8]), we get (13*30)^6 possibilities, which is large (10^15 range).
However, there is no need to put every leftover tile in every position in your hand. Since every color has to be complete in itself, we can add tiles to the respective color groups and note down if the group is complete in itself. Details like having exactly 1 pair overall are not difficult to add. This way, there are max around (13*9)^6 possibilities, that is around 10^12 and more feasible.
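As a quick sanity check on those two estimates (nothing more than the arithmetic from the paragraphs above, with variable names invented here):

```python
# Naive search: any of ~30 tile kinds into any of 13 positions, 6 levels deep.
naive = (13 * 30) ** 6
# Per-color search: only the 9 values of the matching color are tried.
per_color = (13 * 9) ** 6

print(f"naive:     {naive:.2e}")
print(f"per color: {per_color:.2e}")
print(f"reduction: about {naive // per_color}x")
```

So restricting candidate tiles to the matching color saves roughly three orders of magnitude, but the search is still far too large to enumerate blindly.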
A better solution: Modification of the existing Mahjong Checker
My next idea was to use the code I wrote earlier to test for Mahjong and modify it in two ways:
don't stop when an invalid hand is found but note down a missing tile
if there are multiple possible ways to use a tile, try out all of them
This should be the optimal idea, and with some heuristic added it should be the optimal algorithm. However, I found it quite difficult to implement - it is definitely possible though. I'd prefer an easier to write and maintain solution first.
An advanced approach using domain knowledge
After talking to a more experienced player, it appears there are some laws that can be used. For instance, a set of 3 tiles never needs to be broken up, as that would never decrease the shanten number. It may, however, be used in different ways (say, either for a 111 or a 123 combination).
Enumerate all possible 3-sets and create a new simulation for each of them. Remove the 3-set. Now create all 2-sets in the resulting hand and simulate for every tile that improves them to a 3-set. At the same time, simulate for any of the 1-sets being removed. Keep doing this until all 3- and 2-sets are gone. There should be a 1-set (that is, a single tile) left in the end.
Learnings from implementation and final algorithm
I implemented the above algorithm. For easier understanding I wrote it down in pseudocode:
Remove completed 3-sets
If removed, return (i.e. do not simulate NOT taking the 3-set later)
Remove 2-set by looping through discarding any other tile (this creates a number of branches in the simulation)
If removed, return (same as earlier)
Use the number of left-over single tiles to calculate the shanten number
By the way, this is actually very similar to the approach I take when calculating the number myself, and obviously it never yields too high a number.
This works very well for almost all cases. However, I found that sometimes the earlier assumption ("removing already completed 3-sets is NEVER a bad idea") is wrong. Counter-example: 23566M 25667P 159S. The important part is the 25667. By removing a 567 3-set we end up with a left-over 6 tile, leading to 5-shanten. It would be better to use two of the single tiles to form 56x and 67x, leading to 4-shanten overall.
To fix this, we simply have to remove the wrong optimization, leading to this code:
Remove completed 3-sets
Remove 2-set by looping through discarding any other tile
Use the number of left-over single tiles to calculate the shanten number
I believe this always accurately finds the smallest shanten number, but I don't know how to prove that. The time taken is in a "reasonable" range (on my machine 10 seconds max, usually 0 seconds).
The final point is calculating the shanten number from the number of left-over single tiles. First of all, it is obvious that the number is of the form 3*n+1 (because we started out with 13 tiles and always subtracted 3 tiles).
If there is 1 tile left, we're in tenpai already (we're just waiting for the final pair). With 4 tiles left, we have to discard 2 of them to form a 3-set, leaving us with a single tile again. This leads to 2 additional discards. With 7 tiles, we have 2 times 2 discards, adding 4. And so on.
This leads to the simple formula shanten_added = (number_of_singles - 1) * (2/3).
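Written as code, the formula is a one-liner. This is just a sketch: the function name is made up, and the check that the input has the form 3*n+1 is my own addition so that the division is always exact.

```python
def shanten_from_singles(number_of_singles):
    """Apply shanten_added = (number_of_singles - 1) * (2/3)."""
    if number_of_singles < 1 or (number_of_singles - 1) % 3 != 0:
        raise ValueError("expected 3*n + 1 left-over tiles")
    # (number_of_singles - 1) is a multiple of 3, so this stays an integer.
    return (number_of_singles - 1) * 2 // 3


for singles in (1, 4, 7, 10, 13):
    print(singles, shanten_from_singles(singles))
```

With all 13 tiles left over as singles this gives 8, matching the maximum mentioned earlier.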
The described algorithm works well and passed all my tests, so I'm assuming it is correct. As stated, I can't prove it though.
Since the algorithm removes the most likely tile combinations first, it kind of has a built-in optimization. After adding a simple check, if (current_depth > best_shanten) then return;, it does very well even for high shanten numbers.
My best guess would be an A* inspired approach. You need to find some heuristic which never overestimates the shanten number and use it to search the brute-force tree only in the regions where it is possible to get into a ready state quickly enough.
Correct algorithm sample: syanten.cpp
Recursively cut forms out of the hand in this order: sets, pairs, incomplete forms, and count them. Do this in all variations. The result is the minimal shanten value over all variants:
Shanten = Min(Shanten, 8 - Sets * 2 - Pairs - IncompleteForms)
C# sample (rewritten from C++) can be found here (in Russian).
I've done a little bit of thinking and came up with a slightly different formula than mafu's. First of all, consider a hand (a very terrible hand):
1s 4s 6s 1m 5m 8m 9m 9m 7p 8p West East North
By using mafu's algorithm all we can do is cast out a pair (9m,9m). Then we are left with 11 singles. Now if we apply mafu's formula we get (11-1)*2/3 which is not an integer and therefore cannot be a shanten number. This is where I came up with this:
N = ( (S + 1) / 3 ) - 1
N stands for shanten number and S for score sum.
What is score? It's a number of tiles you need to make an incomplete set complete. For example, if you have (4,5) in your hand you need either 3 or 6 to make it a complete 3-set, that is, only one tile. So this incomplete pair gets score 1. Accordingly, (1,1) needs only 1 to become a 3-set. Any single tile obviously needs 2 tiles to become a 3-set and gets score 2. Any complete set of course get score 0. Note that we ignore the possibility of singles becoming pairs. Now if we try to find all of the incomplete sets in the above hand we get:
(4s,6s) (8m,9m) (7p,8p) 1s 1m 5m 9m West East North
Then we count the sum of its scores = 1*3+2*7 = 17.
Now if we apply this number to the formula above we get (17+1)/3 - 1 = 5, which means this hand is 5-shanten. It's somewhat more complicated than Alexey's formula and I don't have a proof, but so far it seems to work for me. Note that such a hand could be parsed in another way. For example:
(4s,6s) (9m,9m) (7p,8p) 1s 1m 5m 8m West East North
However, it still gets a score sum of 17 and is 5-shanten according to the formula. The scores it introduces could perhaps be applied(?) to something else as well.
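Here is a minimal sketch of the score formula, assuming the hand has already been split into blocks by hand. The labels "set", "partial" and "single" and the function name are invented for this example; "partial" covers both a pair and two tiles of an unfinished straight, since both get score 1.

```python
# Tiles still needed to turn each kind of block into a complete 3-set.
SCORE = {"set": 0, "partial": 1, "single": 2}


def shanten_from_scores(blocks):
    """Apply N = ((S + 1) / 3) - 1 to a list of block labels."""
    total = sum(SCORE[block] for block in blocks)
    if (total + 1) % 3 != 0:
        raise ValueError("score sum does not fit a 13-tile hand")
    return (total + 1) // 3 - 1


# (4s,6s) (8m,9m) (7p,8p) 1s 1m 5m 9m West East North
terrible_hand = ["partial"] * 3 + ["single"] * 7
print(shanten_from_scores(terrible_hand))
```

For any split of exactly 13 tiles into blocks of 3, 2 and 1 tiles, S + 1 comes out as a multiple of 3, so the division is always exact. The formula still depends on choosing a good split, and it ignores the cap of four sets plus one pair, so treat it as a heuristic.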
Take a look here: ShantenNumberCalculator. It calculates shanten really fast. There is also some related material (in Japanese, but with code examples) at http://cmj3.web.fc2.com
The essence of the algorithm: cut out all pairs, sets and unfinished forms in ALL possible ways, and thereby find the minimum value of the number of shanten.
The maximum value of the shanten for an ordinary hand: 8.
That is the case where we have the beginnings of 4 sets and one pair, but only one tile of each (13 - 5 = 8 tiles missing in total).
Accordingly, a pair reduces the shanten number by one, and two neighboring tiles isolated from the rest (a preset) also reduce it by one.
A complete set (3 identical or 3 consecutive tiles) reduces the shanten number by 2, since two suitable tiles have joined an isolated tile.
Shanten = 8 - Sets * 2 - Pairs - Presets
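A rough Python sketch of this "cut in all possible ways" idea follows. It is not the linked implementation. The hand notation (m/p/s for the three colors, z for honors with E, S, W, N as 1 to 4) and the function names are my own assumptions. I also added two rules that the formula does not spell out: one-gap shapes like 4-6 count as presets, and at most four sets-or-unfinished blocks plus one pair are counted. Special hands such as Thirteen Orphans are ignored.

```python
def parse(hand):
    """Turn e.g. '23566m 25667p 159s' into 34 tile counts (z = honors)."""
    offset = {"m": 0, "p": 9, "s": 18, "z": 27}
    counts = [0] * 34
    for group in hand.split():
        for digit in group[:-1]:
            counts[offset[group[-1]] + int(digit) - 1] += 1
    return counts


def shanten(counts):
    """Minimum of 8 - 2*sets - pairs - presets over all ways to cut the hand."""
    counts = list(counts)
    best = 8

    def take(tiles, delta):
        for t in tiles:
            counts[t] += delta

    def search(i, sets, pairs, presets):
        nonlocal best
        while i < 34 and counts[i] == 0:
            i += 1
        if i == 34:
            head = 1 if pairs else 0  # one pair serves as the final pair
            partial = min(pairs - head + presets, 4 - sets)  # cap at 4 blocks
            best = min(best, 8 - 2 * sets - partial - head)
            return
        suited, pos = i < 27, i % 9
        options = []  # (tiles to remove, sets, pairs, presets gained)
        if counts[i] >= 3:
            options.append(((i, i, i), 1, 0, 0))          # triplet
        if suited and pos <= 6 and counts[i + 1] and counts[i + 2]:
            options.append(((i, i + 1, i + 2), 1, 0, 0))  # straight
        if counts[i] >= 2:
            options.append(((i, i), 0, 1, 0))             # pair
        for gap in (1, 2):                                # unfinished straight
            if suited and pos + gap <= 8 and counts[i + gap]:
                options.append(((i, i + gap), 0, 0, 1))
        options.append(((i,), 0, 0, 0))                   # isolated tile
        for tiles, s, p, q in options:
            take(tiles, -1)
            search(i, sets + s, pairs + p, presets + q)
            take(tiles, +1)

    search(0, 0, 0, 0)
    return best


print(shanten(parse("23566m 25667p 159s")))
```

The search always decides what to do with the lowest remaining tile, which is enough to reach every possible cut. It has no pruning, so it is meant for checking individual hands, not for speed.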
Determining whether your hand is already in tenpai sounds like a multi-knapsack problem. Greedy algorithms won't work - as Dialecticus pointed out, you'll need to consider the entire problem space.