The Coin Collector's Problem - algorithm

I can not understand package-merge algorithm.
Can anyone explain package-merge algorithm step by step please?
How we package and how we merge?
Is there any other optimal algorithm for solving coin collector's problem?

This is a code-free answer: I'm going to try and explain the method using some example situations, and hope it can be intuited from them as I generalise it. Sorry it's quite long, but if you read it through methodically it should all make sense.
Let's say a coin collector has two sets of coins: a set of dollar coins and a set of half dollar coins. However, the coin collector values each coin differently based on, say, the date it was minted. Some are very rare and so are more valuable.
Now, the situation is: he has no ordinary money except for his valuable coins, and he needs to use them to buy something at a store, where the clerk at the counter will only take them at their standard value. But as he values his collection, he wants to use the least valuable set of coins he can to make the payment.
So let's say he has 7 dollar coins and 5 half dollar coins, and he wants to buy something worth 4 dollars. A simplistic solution would be to take just his dollar coins and rank them by value, select the 4 least valuable, and hand them over, pocketing the remaining 3 dollar coins, which are the most valuable ones.
But then he realises that if he takes two half dollars and "packages" them together, he can treat the package as a new dollar-sized coin. To the clerk at the counter it's worth a dollar, but to the collector it's worth the combined value of the two individual half dollars he used.
A better strategy then is to look at his half-dollars, take the cheapest two, package them together as a new dollar "coin", and add ("merge") it into his set of dollars.
So now he has 8 (7+1) dollar-sized coins/packages and 3 (5-2) half dollars.
This step can be repeated once more, and now he has 9 dollar-sized coins/packages, and 1 remaining half dollar that cannot be packaged with anything else. Since this is his most valuable half dollar, he realises that he should not use it and pockets it again, effectively removing it from the problem.
Now he just has a set of 9 dollar-sized coins/packages. He sorts them by value, selects the 4 that are least valuable, and hands them over to the clerk, pocketing the remaining (most valuable) coins.
To generalise slightly: imagine he also has some quarters with him. Before packaging the half dollar coins, the he must first package the quarters into half-dollar packages (selecting the cheapest pairs in turn, discarding the most valuable remainder, if there is one), and then merge them into the set of half dollar coins.
Hypothetically there could be lesser denominations: 1/8th dollars, 1/16th dollars, etc. As long as they are negative powers of two the strategy can be generalised: sort the coins of least denomination, package them, merge the packages into the set of the next smallest denomination, and continue until you have only dollar-sized coins/packages.
(There's one other case that's worth considering: if the asking price is not a round dollar figure, e.g. 7 1/4 dollars. In this case, before packaging the quarter-sized coins/packages, you first select the cheapest, hand it over and subtract it from the asking price, then package the remaining ones.
This also can be generalised: whenever the price requires a particular denomination (i.e. it is an odd multiple of that denomination); while packaging that denomination you remove the cheapest one first, subtracting it from the price. By the time you get to the dollar stage, the price will then be a whole number of dollars.)

I find the coin analogy unhelpful. You might too. More intuitive for me is to consider a set of rods of various lengths in powers of two: 1/4, 1/2, 1, 2, 4, 8, .... Each rod has a different price. You need to spend the least money on rods and yet span some exact distance X.
With any combination of these rods, you can only /exactly/ span a distance which is diadic, that is "which has factors only of two in its denominator", ie has a terminating binary representation after the point. You can't span 1/5 or 4 3/7 because no finite number of rods matches such distances exactly, there will always be a little left over.
So assuming you have do cover a diadic X, what rods do you use? Think about the smallest binary fraction used in your distance X. If X is 13 3/128 then that's 1/128. That's the level of accuracy you need.
a. If all your rods are bigger than 1/128 then you're stuffed. None of your rods are fine enough to make that last 1/128th. Bad luck.
b. If 1/128 is the smallest kind of rod you have, you're going to have to use at least one of them. There's no choice to go that extra tiny distance: you're going to use one. So may as well buy the cheapest 1/128 rod now. Then you can solve the remaining slightly smaller problem (13 1/64) with the remaining rods.
c. If your smallest rod is tinier than 1/128 (say 1/512) then it's possible that you will use them, but only an even number of them. If you use an odd number, you'll have some tiny amount (here 1/512) left over and whatever you do with the bigger rods can't get rid of that extra little bit and you are stuffed. So you're either gonna use none of them or an even number. If you end up using two, you'll end up using the cheapest two. If you end up using four, you'll end up using the cheapest four, and so on.
So you'll use them in pairs. If you take the cheapest two -- say they cost 5p and 7p -- then if you use them, then you'll use them, effectively, as if they're a single 1/256 piece costing 12p. So take them out of the 1/512 pile and consider them as if a 12p 1/256 piece along with all the other 1/256 pieces. The leftover third 1/512 piece, the most expensive one, is useless, it won't be used for the reason above, it is too fine and will leave over that tiny fraction.
Now we're done with the 1/512 pile, you can now consider the 1/256 pile, which is now the smallest and maybe has a proper 1/256 (say costing 8p) in it, and also our "package" of two 1/512s (costing say 12p). You can then package these up in a pair using the arguments above. It has a cost of 20p and equivalent to a 1/128 piece.
You've not committed to using any of these packages yet, you just know that if you do, you'll use the whole package. And you carry on, up the sizes, making the problem smaller each time, either by reducing the total number of pieces by packaging, or by reducing the target.
So eventually, you will reach your target, or prove that you can't.
The most important thing is to remember that when you build a package you're not, at that point, committing to use it, you're just saying /if/ you do, then you'll use the rods inside as a unit.

It doesn't lend itself to a very short explanation; have you read the Wikipedia page?

Related

Finding the counterfeit coin from a list of 9 coins

Just came across this simple algorithm here to find the odd coin (which weighs heavy) from a list of identical weighing coins.
I can understand that if we take 3 coins at a time, then the minimum number of weighings is just two.
How did I find the answer ?
I manually tried weighing 4 sets of coins at a time, weighing 3 sets of coin at a time, weighing two coins at a time, weighing one coins at a time.
Ofcourse, only if we take 3 coins at a time then the minimum number of steps (two) is achievable.
The question is, how do you know that we have to take 3 coins ?
I am just trying to understand how to approach this puzzle instead of doing all possible combinations and then telling the answer as 2.
1 http://en.wikipedia.org/wiki/Balance_puzzle
In each weighings, exactly three different things can happen, so with two weightings you can only see nine different overall things happening. So with each weighing, you need to be guaranteed of eliminating at least two thirds of the (remaining) possibilities. Weighing three coins on each side is guaranteed to do this. Weighing four coins on each side could maybe eliminate eight coins, but could also eliminate only five.
It can be strictly proved on the ground of Information Theory -- a very beautiful subject, that builds the very foundations of computer science.
There is a proof in those excellent lectures of David MacKay. (sorry but do not remember in which one exactly: probably one of the first five).
The base-case is this:
How do you know that we should take three coins at a time ?
The approach :
First find the base-case.
Here the base-case would be to find the maximum number of coins from which you can find the counterfeit coins in just one-weighing. You can either take two or three coins from which you can find the counterfeit one. So, maximum(two, three) = three.
So, the base-case for this approach would be dividing the available coins by taking three at a time.
2. The generalized formula is 3^n - 3 = (X*2) where X is the available number of coins and n is the number of weighing's required. (Remember n should be floored not ceiled).
Consider X = 9 balls. 3^n = 21 and n is ceiled to 2.
So, the algorithm to tell the minimum number of weighing's would something be similar to:
algo_Min_Weight[int num_Balls]
{
return log base 3 ([num_Balls * 2] + 3);
}

If you know the future prices of a stock, what's the best time to buy and sell?

Interview Question by a financial software company for a Programmer position
Q1) Say you have an array for which the ith element is the price of a given stock on
day i.
If you were only permitted to buy one share of the stock and sell one share
of the stock, design an algorithm to find the best times to buy and sell.
My Solution :
My solution was to make an array of the differences in stock prices between day i and day i+1 for arraysize-1 days and then use Kadane Algorithm to return the sum of the largest continuous sub array.I would then buy at the start of the largest continuous array and sell at the end of the largest
continous array.
I am wondering if my solution is correct and are there any better solutions out there???
Upon answering i was asked a follow up question, which i answered exactly the same
Q2) Given that you know the future closing price of Company x for the next 10 days ,
Design a algorithm to to determine if you should BUY,SELL or HOLD for every
single day ( You are allowed to only make 1 decision every day ) with the aim of
of maximizing profit
Eg: Day 1 closing price :2.24
Day 2 closing price :2.11
...
Day 10 closing price : 3.00
My Solution: Same as above
I would like to know what if theres any better algorithm out there to maximise profit given
that i can make a decision every single day
Q1 If you were only permitted to buy one share of the stock and sell one share of the stock, design an algorithm to find the best times to buy and sell.
In a single pass through the array, determine the index i with the lowest price and the index j with the highest price. You buy at i and sell at j (selling before you buy, by borrowing stock, is in general allowed in finance, so it is okay if j < i). If all prices are the same you don't do anything.
Q2 Given that you know the future closing price of Company x for the next 10 days , Design a algorithm to to determine if you should BUY,SELL or HOLD for every single day ( You are allowed to only make 1 decision every day ) with the aim of of maximizing profit
There are only 10 days, and hence there are only 3^10 = 59049 different possibilities. Hence it is perfectly possible to use brute force. I.e., try every possibility and simply select the one which gives the greatest profit. (Even if a more efficient algorithm were found, this would remain a useful way to test the more efficient algorithm.)
Some of the solutions produced by the brute force approach may be invalid, e.g. it might not be possible to own (or owe) more than one share at once. Moreover, do you need to end up owning 0 stocks at the end of the 10 days, or are any positions automatically liquidated at the end of the 10 days? Also, I would want to clarify the assumption that I made in Q1, namely that it is possible to sell before buying to take advantage of falls in stock prices. Finally, there may be trading fees to be taken into consideration, including payments to be made if you borrow a stock in order to sell it before you buy it.
Once these assumptions are clarified it could well be possible to take design a more efficient algorithm. E.g., in the simplest case if you can only own one share and you have to buy before you sell, then you would have a "buy" at the first minimum in the series, a "sell" at the last maximum, and buys and sells at any minima and maxima inbetween.
The more I think about it, the more I think these interview questions are as much about seeing how and whether a candidate clarifies a problem as they are about the solution to the problem.
Here are some alternative answers:
Q1) Work from left to right in the array provided. Keep track of the lowest price seen so far. When you see an element of the array note down the difference between the price there and the lowest price so far, update the lowest price so far, and keep track of the highest difference seen. My answer is to make the amount of profit given at the highest difference by selling then, after having bought at the price of the lowest price seen at that time.
Q2) Treat this as a dynamic programming problem, where the state at any point in time is whether you own a share or not. Work from left to right again. At each point find the highest possible profit, given that own a share at the end of that point in time, and given that you do not own a share at the end of that point in time. You can work this out from the result of the calculations of the previous time step: In one case compare the options of buying a share and subtracting this from the profit given that you did not own at the end of the previous point or holding a share that you did own at the previous point. In the other case compare the options of selling a share to add to the profit given that you owned at the previous time, or staying pat with the profit given that you did not own at the previous time. As is standard with dynamic programming you keep the decisions made at each point in time and recover the correct list of decisions at the end by working backwards.
Your answer to question 1 was correct.
Your answer to question 2 was not correct. To solve this problem you work backwards from the end, choosing the best option at each step. For example, given the sequence { 1, 3, 5, 4, 6 }, since 4 < 6 your last move is to sell. Since 5 > 4, the previous move to that is buy. Since 3 < 5, the move on 5 is sell. Continuing in the same way, the move on 3 is to hold and the move on 1 is to buy.
Your solution for first problem is Correct. Kadane's Algorithm runtime complexity is O(n) is a optimal solution for maximum subarray problem. And benefit of using this algorithm is that it is easy to implement.
Your solution for second problem is wrong according to me. What you can do is to store the left and right index of maximum sum subarray you find. Once you find have maximum sum subarray and its left and right index. You can call this function again on the left part i.e 0 to left -1 and on right part i.e. right + 1 to Array.size - 1. So, this is a recursion process basically and you can further design the structure of this recursion with base case to solve this problem. And by following this process you can maximize profit.
Suppose the prices are the array P = [p_1, p_2, ..., p_n]
Construct a new array A = [p_1, p_2 - p_1, p_3 - p_2, ..., p_n - p_{n-1}]
i.e A[i] = p_{i+1} - p_i, taking p_0 = 0.
Now go find the maximum sum sub-array in this.
OR
Find a different algorithm, and solve the maximum sub-array problem!
The problems are equivalent.

Finding subsets being used at most k times

Every now and then I read all those conspiracy theories about Lotto-based games being controlled and a computer browsing through the combinations chosen by the players and determining the non-used subset. It got me thinking - how would such algorithm have to work in order to determine such subsets really efficiently? Finding non-used numbers is definitely crossed out as is finding the least used because it's not necesserily providing us with a solution. Also, going deeper, how could an algorithm efficiently choose such a subset that it was used some k times by the players? Saying more formally:
We are given a set of 50 numbers 1 to 50. In the draw 6 numbers are picked.
INPUT: m subsets each consisting of 6 distinct numbers 1 to 50 each,
integer k (0<=k) being the maximum players having all of their 6
numbers correct.
OUTPUT: Subsets which make not more than k players win the jackpot ('winning' means all the numbers they chose were picked in the draw).
Is there any efficient algorithm which could calculate this without using a terrabyte HDD to store all the encounters of every possible 50!/(44!*6!) in the pessimistic case? Honestly, I can't think of any.
If I wanted to run such a conspirancy I would first of all acquire the list of submissions by players. Then I would generate random lottery selections and see how many winners would be produced by each such selection. Then just choose the random lottery selection most attractive to me. There is little point doing anything more sophisticated, because that is probably already powerful enough to be noticed by staticians.
If you want to corrupt the lottery it would probably be easier and safer to select a few competitors you favour and have them win the lottery. In (the book) "1984" I think the state simply announced imaginary lottery winners, with the announcement in each area announcing somebody outside the area. One of the ideas in "The Beckoning Lady" by Margery Allingham is of a gang who attempt to set up a racecourse so they can rig races to allow them to disguise bribes as winnings.
First of all, the total number of combinations (choosing 6 from 50) is not very large. It is about 16 million which can be easily handled.
For each combination keep a count of number of people who played it. While declaring a winner choose the combination that has less than k plays.
If the number within each subset are sorted, then you can treat your subsets as strings - sort them in lexicographical order, then it is easy to count how many players selected each subset (and which subsets were not selected at all). So the time is proportional to the number of players and not the number of numbers in the lottery.

How do I calculate the shanten number in mahjong?

This is a followup to my earlier question about deciding if a hand is ready.
Knowledge of mahjong rules would be excellent, but a poker- or romme-based background is also sufficient to understand this question.
In Mahjong 14 tiles (tiles are like
cards in Poker) are arranged to 4 sets
and a pair. A straight ("123") always
uses exactly 3 tiles, not more and not
less. A set of the same kind ("111")
consists of exactly 3 tiles, too. This
leads to a sum of 3 * 4 + 2 = 14
tiles.
There are various exceptions like Kan
or Thirteen Orphans that are not
relevant here. Colors and value ranges
(1-9) are also not important for the
algorithm.
A hand consists of 13 tiles, every time it's our turn we get to pick a new tile and have to discard any tile so we stay on 13 tiles - except if we can win using the newly picked tile.
A hand that can be arranged to form 4 sets and a pair is "ready". A hand that requires only 1 tile to be exchanged is said to be "tenpai", or "1 from ready". Any other hand has a shanten-number which expresses how many tiles need to be exchanged to be in tenpai. So a hand with a shanten number of 1 needs 1 tile to be tenpai (and 2 tiles to be ready, accordingly). A hand with a shanten number of 5 needs 5 tiles to be tenpai and so on.
I'm trying to calculate the shanten number of a hand. After googling around for hours and reading multiple articles and papers on this topic, this seems to be an unsolved problem (except for the brute force approach). The closest algorithm I could find relied on chance, i.e. it was not able to detect the correct shanten number 100% of the time.
Rules
I'll explain a bit on the actual rules (simplified) and then my idea how to tackle this task. In mahjong, there are 4 colors, 3 normal ones like in card games (ace, heart, ...) that are called "man", "pin" and "sou". These colors run from 1 to 9 each and can be used to form straights as well as groups of the same kind. The forth color is called "honors" and can be used for groups of the same kind only, but not for straights. The seven honors will be called "E, S, W, N, R, G, B".
Let's look at an example of a tenpai hand: 2p, 3p, 3p, 3p, 3p, 4p, 5m, 5m, 5m, W, W, W, E. Next we pick an E. This is a complete mahjong hand (ready) and consists of a 2-4 pin street (remember, pins can be used for straights), a 3 pin triple, a 5 man triple, a W triple and an E pair.
Changing our original hand slightly to 2p, 2p, 3p, 3p, 3p, 4p, 5m, 5m, 5m, W, W, W, E, we got a hand in 1-shanten, i.e. it requires an additional tile to be tenpai. In this case, exchanging a 2p for an 3p brings us back to tenpai so by drawing a 3p and an E we win.
1p, 1p, 5p, 5p, 9p, 9p, E, E, E, S, S, W, W is a hand in 2-shanten. There is 1 completed triplet and 5 pairs. We need one pair in the end, so once we pick one of 1p, 5p, 9p, S or W we need to discard one of the other pairs. Example: We pick a 1 pin and discard an W. The hand is in 1-shanten now and looks like this: 1p, 1p, 1p, 5p, 5p, 9p, 9p, E, E, E, S, S, W. Next, we wait for either an 5p, 9p or S. Assuming we pick a 5p and discard the leftover W, we get this: 1p, 1p, 1p, 5p, 5p, 5p, 9p, 9p, E, E, E, S, S. This hand is in tenpai in can complete on either a 9 pin or an S.
To avoid drawing this text in length even more, you can read up on more example at wikipedia or using one of the various search results at google. All of them are a bit more technical though, so I hope the above description suffices.
Algorithm
As stated, I'd like to calculate the shanten number of a hand. My idea was to split the tiles into 4 groups according to their color. Next, all tiles are sorted into sets within their respective groups to we end up with either triplets, pairs or single tiles in the honor group or, additionally, streights in the 3 normal groups. Completed sets are ignored. Pairs are counted, the final number is decremented (we need 1 pair in the end). Single tiles are added to this number. Finally, we divide the number by 2 (since every time we pick a good tile that brings us closer to tenpai, we can get rid of another unwanted tile).
However, I can not prove that this algorithm is correct, and I also have trouble incorporating straights for difficult groups that contain many tiles in a close range. Every kind of idea is appreciated. I'm developing in .NET, but pseudo code or any readable language is welcome, too.
I've thought about this problem a bit more. To see the final results, skip over to the last section.
First idea: Brute Force Approach
First of all, I wrote a brute force approach. It was able to identify 3-shanten within a minute, but it was not very reliable (sometimes too a lot longer, and enumerating the whole space is impossible even for just 3-shanten).
Improvement of Brute Force Approach
One thing that came to mind was to add some intelligence to the brute force approach. The naive way is to add any of the remaining tiles, see if it produced Mahjong, and if not try the next recursively until it was found. Assuming there are about 30 different tiles left and the maximum depth is 6 (I'm not sure if a 7+-shanten hand is even possible [Edit: according to the formula developed later, the maximum possible shanten number is (13-1)*2/3 = 8]), we get (13*30)^6 possibilities, which is large (10^15 range).
However, there is no need to put every leftover tile in every position in your hand. Since every color has to be complete in itself, we can add tiles to the respective color groups and note down if the group is complete in itself. Details like having exactly 1 pair overall are not difficult to add. This way, there are max around (13*9)^6 possibilities, that is around 10^12 and more feasible.
A better solution: Modification of the existing Mahjong Checker
My next idea was to use the code I wrote early to test for Mahjong and modify it in two ways:
don't stop when an invalid hand is found but note down a missing tile
if there are multiple possible ways to use a tile, try out all of them
This should be the optimal idea, and with some heuristic added it should be the optimal algorithm. However, I found it quite difficult to implement - it is definitely possible though. I'd prefer an easier to write and maintain solution first.
An advanced approach using domain knowledge
Talking to a more experienced player, it appears there are some laws that can be used. For instance, a set of 3 tiles does never need to be broken up, as that would never decrease the shanten number. It may, however, be used in different ways (say, either for a 111 or a 123 combination).
Enumerate all possible 3-set and create a new simulation for each of them. Remove the 3-set. Now create all 2-set in the resulting hand and simulate for every tile that improves them to a 3-set. At the same time, simulate for any of the 1-sets being removed. Keep doing this until all 3- and 2-sets are gone. There should be a 1-set (that is, a single tile) be left in the end.
Learnings from implementation and final algorithm
I implemented the above algorithm. For easier understanding I wrote it down in pseudocode:
Remove completed 3-sets
If removed, return (i.e. do not simulate NOT taking the 3-set later)
Remove 2-set by looping through discarding any other tile (this creates a number of branches in the simulation)
If removed, return (same as earlier)
Use the number of left-over single tiles to calculate the shanten number
By the way, this is actually very similar to the approach I take when calculating the number myself, and obviously never to yields too high a number.
This works very well for almost all cases. However, I found that sometimes the earlier assumption ("removing already completed 3-sets is NEVER a bad idea") is wrong. Counter-example: 23566M 25667P 159S. The important part is the 25667. By removing a 567 3-set we end up with a left-over 6 tile, leading to 5-shanten. It would be better to use two of the single tiles to form 56x and 67x, leading to 4-shanten overall.
To fix, we simple have to remove the wrong optimization, leading to this code:
Remove completed 3-sets
Remove 2-set by looping through discarding any other tile
Use the number of left-over single tiles to calculate the shanten number
I believe this always accurately finds the smallest shanten number, but I don't know how to prove that. The time taken is in a "reasonable" range (on my machine 10 seconds max, usually 0 seconds).
The final point is calculating the shanten out of the number of left-over single tiles. First of all, it is obvious that the number is in the form 3*n+1 (because we started out with 14 tiles and always subtracted 3 tiles).
If there is 1 tile left, we're shanten already (we're just waiting for the final pair). With 4 tiles left, we have to discard 2 of them to form a 3-set, leaving us with a single tile again. This leads to 2 additional discards. With 7 tiles, we have 2 times 2 discards, adding 4. And so on.
This leads to the simple formula shanten_added = (number_of_singles - 1) * (2/3).
The described algorithm works well and passed all my tests, so I'm assuming it is correct. As stated, I can't prove it though.
Since the algorithm removes the most likely tiles combinations first, it kind of has a built-in optimization. Adding a simple check if (current_depth > best_shanten) then return; it does very well even for high shanten numbers.
My best guess would be an A* inspired approach. You need to find some heuristic which never overestimates the shanten number and use it to search the brute-force tree only in the regions where it is possible to get into a ready state quickly enough.
Correct algorithm sample: syanten.cpp
Recursive cut forms from hand in order: sets, pairs, incomplete forms, - and count it. In all variations. And result is minimal Shanten value of all variants:
Shanten = Min(Shanten, 8 - * 2 - - )
C# sample (rewrited from c++) can be found here (in Russian).
I've done a little bit of thinking and came up with a slightly different formula than mafu's. First of all, consider a hand (a very terrible hand):
1s 4s 6s 1m 5m 8m 9m 9m 7p 8p West East North
By using mafu's algorithm all we can do is cast out a pair (9m,9m). Then we are left with 11 singles. Now if we apply mafu's formula we get (11-1)*2/3 which is not an integer and therefore cannot be a shanten number. This is where I came up with this:
N = ( (S + 1) / 3 ) - 1
N stands for shanten number and S for score sum.
What is score? It's a number of tiles you need to make an incomplete set complete. For example, if you have (4,5) in your hand you need either 3 or 6 to make it a complete 3-set, that is, only one tile. So this incomplete pair gets score 1. Accordingly, (1,1) needs only 1 to become a 3-set. Any single tile obviously needs 2 tiles to become a 3-set and gets score 2. Any complete set of course get score 0. Note that we ignore the possibility of singles becoming pairs. Now if we try to find all of the incomplete sets in the above hand we get:
(4s,6s) (8m,9m) (7p,8p) 1s 1m 5m 9m West East North
Then we count the sum of its scores = 1*3+2*7 = 17.
Now if we apply this number to the formula above we get (17+1)/3 - 1 = 5 which means this hand is 5-shanten. It's somewhat more complicated than Alexey's and I don't have a proof but so far it seems to work for me. Note that such a hand could be parsed in the other way. For example:
(4s,6s) (9m,9m) (7p,8p) 1s 1m 5m 8m West East North
However, it still gets score sum 17 and 5-shanten according to formula. I also can't proof this and this is a little bit more complicated than Alexey's formula but also introduces scores that could be applied(?) to something else.
Take a look here: ShantenNumberCalculator. Calculate shanten really fast. And some related stuff (in japanese, but with code examples) http://cmj3.web.fc2.com
The essence of the algorithm: cut out all pairs, sets and unfinished forms in ALL possible ways, and thereby find the minimum value of the number of shanten.
The maximum value of the shanten for an ordinary hand: 8.
That is, as it were, we have the beginnings for 4 sets and one pair, but only one tile from each (total 13 - 5 = 8).
Accordingly, a pair will reduce the number of shantens by one, two (isolated from the rest) neighboring tiles (preset) will decrease the number of shantens by one,
a complete set (3 identical or 3 consecutive tiles) will reduce the number of shantens by 2, since two suitable tiles came to an isolated tile.
Shanten = 8 - Sets * 2 - Pairs - Presets
Determining whether your hand is already in tenpai sounds like a multi-knapsack problem. Greedy algorithms won't work - as Dialecticus pointed out, you'll need to consider the entire problem space.

Finding your own number in a box

100 (or some even number 2N :-) ) prisoners are in a room A. They are numbered from 1 to 100.
One by one (from prisoner #1 to prisoner #100, in order), they will be let into a room B in which 100 boxes (numbered from 1 to 100) await them. Inside the (closed) boxes are numbers from 1 to 100 (the numbers inside the boxes are randomly permuted!).
Once inside room B, each prisoner gets to open 50 boxes (he chooses which one he opens). If he finds the number that was assigned to him in one of these 50 boxes, the prisoner gets to walk into a room C and all boxes are closed again before the next one walks into room B from room A. Otherwise, all prisoners (in rooms A, B and C) gets killed.
Before entering room B, the prisoners can agree on a strategy (algorithm). There is no way to communicate between rooms (and no message can be left in room B!).
Is there an algorithm that maximizes the probability that all prisoners survive? What probability does that algorithm achieve?
Notes:
Doing things randomly (what you call 'no strategy') indeed gives a probability of 1/2 for each prisoner, but then the probability of all of them surviving is 1/2^100 (which is quite low). One can do much better!
The prisoners are not allowed to reorder the boxes!
All prisoners are killed the first time a prisoner fails to find his number. And no communication is possible.
Hint: one can save more than 30 prisoners on average, which is much more that (50/100) * (50/99) * [...] * 1
This puzzle is explained at http://www.math.princeton.edu/~wwong/blog/blog200608191813.shtml and that person does a much better job of explaining the problem.
The "all prisoners are killed" statement is wrong.
The "you can save 30+ on average" is also wrong, the article says that 30% of the time you can save 100% of the prisoners.
I find a low tech solution to this type of problem is always the best way to go.
first we make some assumptions about the situation
The prisoners are not all programmers or mathematicians
They don't want to die
The guards are well armed
So with a 0.005% chance that they will see tomorrow, there is a very simple and low tech solution to this problem. RIOT
its all about losses v potential gain, the chances are the prisoners far out number the guards, and using each other as human shields, as they are all dead men anyway if they don't, they can increase the chances they will over power a guard, once they have his weapon there chance goes up, helping them over power more guards to get more fire power to further increase there survival rate. once the guards realise what's happening, they will probably run for the hills and lock down the prison, this will give the media a heads up and then its a human rights issue.
Implement a sorting algorithm and sort the boxes according to the numbers inside them.
First prisoner sorts 50 boxes, and the second prisoner sorts the other 50 and merges with the first one. (Note that the second prisoner can guess the values inside the first 50 boxes)
After the 2nd prisoner, all of the boxes will be in a sorted order !!!
Everybody else can open the boxes containing their numbers easily then.
I don't know if this is allowed but the best approximation I can find is:
EDIT: Ok, I think this makes it. Of course I'm treating this as a computing problem, I don't think any prisioner will be able to perform this, although is pretty straight forward if you don't.
Find the first 50 primes, let's asume we hold them in an array called primes.
The first prissioner enters room B and opens the first box and finds the number m.
Wait primes[1]^m (that would be 3^m)
Open box 2 and read the number --> n
Wait (primes[2]^n - 1) * primes[1]^m, that would be (5^n - 1) * 3^m and the total time he has been waiting would be 3^n * 5^n
Repeat. After the first prisioner the total time for him would be:
3^m * 5^n * 7^p ... = X
Before the second prisioner enters the room factorize X. You know beforehand the prime numbers that have been used so the factorization is trivial. Doing so you obtain m, n, p, etc so the second prisioner knows every box/number combination the previous prisioner used.
The probability of the first one getting everybody killed is 1/2, the second one will have a 50 / (100 - n) (being n the numbers of attemps of the first one) the third one will have 50 / (100 - n - m) (if n + m = 100 then all positions are known) and so on.
Obviously the next prissioner must skip the already known boxes (except for the last choice if the box which contains his number is already known)
I don't know what's the exact possibility as it dependes on how many choices they have to do but I'd say it's pretty high.
EDIT: Rereading, if the prissioner does not have to stop when he obtains his number then the probability for the whole group is vastly improved, exactly 50%.
EDIT2: #OysterD see it this way. If the first prisioner can open 50 boxes then the second one know if its number is in any of that boxes. If it is, then he can open other 49 (and by doing so learning the box/number comination of the 100 boxes) and finally open his one. So if the first prissioner succeds then everyone succeds. Remember that each prisioner provides a way for the other to know exactly the boxes/number combination for every box he opens.
Maybe I'm not reading it right, but the question seems to be badly constructed or missing information.
If he finds the number that was
assigned to him in one of these 50
boxes, the prisoner gets to walk into
a room C and all boxes are closed
again before the next one walks into
room B from room A. Otherwise, all
prisoners (in rooms A, B and C) gets
killed.
Does the last sentence there mean that all prisoners are killed the first time a prisoner fails to find their number? If not, what happens to prisoners that don't find their number?
If no communication is possible, and whenever a prisoner enters room B it is always in an identical state then there is no possibility for a strategy.
The prisoners could could say before they leave room A which number box they are going to open. But without subsequent prisoners knowing whether they were successful or not (assuming failure for one isn't failure for all) when the next prisoner enters room B they still have the same odds of picking their number as the previous prisoner (always 1:100).
If failure for one is failure for all, then by knowing which box the previous prisoners opened, and by dint of the fact that they haven't all been killed, each successive prisoner could reduce the odds of picking the wrong box by one box. i.e. 1:100 for the first prisoner, 1:99 for the second, down to 1:1 for the last.
The prisoners could agree that prisoner 1 open boxes 1-50.
If they're all still alive, they agree that the next prisoner opens boxes 2-51. (the 2 is arbitrary, but simple to remember this rule) His odds of surviving given that P1 survived are now 50/99. You want to eliminate opening a box when you know that the previous guy found his.
I don't know if that's optimal, but it's lot better than random.
The probability of surviving that looks like
50/100 * 50/99 * 50/98 *. . .50/51 * 1
I think since no communication is possible, the best strategy would involve
distributing the probability of each prisoners as evenly as possible
Am I on the right path or not?
Information available for each prisoner:
The number of survivied prisoners, so if you have an efficient box picking system that utilizes the order any prisoner enters room B, then a strategy is definitely possible
Which boxes the earlier prisoners picked. Of course, no communication is possible during the run and it wouldn't be possible to remember any 100s box picking permutation. But you could use this information to compute in a system before the run starts.
My take:
Draw a table of numbers with 2 columns, the first column contains the box number (from box #1 to box#100). Each prisoner then gets to pick 50 boxes and whatever box they pick, they should put 1 mark on the corresponding row in the second column.
All prisoners are however required to never pick any box twice. And no box may be marked more than 50. Some prisoners may get less options than others since some box may get filled to 50 marks first.
When a prisoner is moved to room B he/she opens whatever boxes he has marked on.
If all prisoners are killed when someone fails to find their number then you either save 100 or 0. There is no way to save 30 people.
Same concept.
Aonther take:
Write down a list of the first 100 binary numbers which has fifty 1s and fifty 0s.
Sort them from lowest to highest.
Prisoner #1 gets the first number, prisoner #2 gets the second, prisoner #3 gets the third and so on...
Each prisoner remembers his/her binary number.
When any prisoner is moved to room B, he/she then match the binary digits of the number he remembered with each of the box, the highest bit is matched with the leftmost box, the next highest bit is matched with the second leftmost box ... the lowest bit is matched with the rightmost box.
He/she opens whatever boxes matched with 1 and leave closed boxes matched with 0.
This would minimizes the probability because early prisoners will get digits that are different from the later prisoners and prisoners which has number close together would get digits close together. This doesn't guarantee survivability but if the early prisoners do survive, chances are the later prisoners would have a higher probability of surviving as well.
I haven't thought out the exact figures and rationale though, but this is one quick solution I can think of at the moment.
There aren't any time limits in the question so I suggest that prisoners should decide to take 1 hour per box and open them in the order presented. If the second prisoner is allowed into the room after 2 hours then he knows that the first prisoner found his own number in box 2. Therefore he knows to skip box 2 in his sequence and opens boxes 1, 3, 4...51
First prisoners odds on losing are 50/100
Give that the first prisoner survived then the second prisoners chance of winning are 50/99
So answer appears to be ((50 ^51)*49!)/100!
which according to google makes 2.89*10^-9
which is pretty much nil
So even if the prisoners knew the boxes the previously lucky ones found their number in there'd be no hope

Resources