I'm coding a board game where there is a bag of possible pieces. Each turn, players remove randomly selected pieces from the bag according to certain rules.
For my implementation, it may be easier to divide up the bag initially into pools for one or more players. These pools would be randomly selected, but now different players would be picking from different bags. Is this any different?
If one player's bag ran out, more would be randomly shuffled into it from the general stockpile.
So long as:
the partition into "pool" bags is random
the assignment of players to a given pool bag is random
the game is such that items drawn by the players are effectively removed from the bag (never returned to the bag, or any other bag, for the duration of the current game)
the players are not cognizant of the content of any of the bags
The two approaches ("original" with one big common bag, "modified" with one pool bag per player) are equivalent with regard to probabilities.
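To see the equivalence empirically, here is a minimal Monte Carlo sketch in Python (the bag contents, player count, and trial count are made-up placeholders); it compares the distribution of a player's first draw under the two procedures:

    import random
    from collections import Counter

    def first_draws_one_bag(bag, players):
        # "Original": shuffle one common bag; each player draws from it in turn.
        bag = bag[:]
        random.shuffle(bag)
        return [bag.pop() for _ in range(players)]

    def first_draws_pooled(bag, players):
        # "Modified": randomly partition the bag into one pool per player,
        # then each player draws from his/her own pool.
        bag = bag[:]
        random.shuffle(bag)
        pools = [bag[i::players] for i in range(players)]
        return [pool.pop() for pool in pools]

    bag = list("AAABBCCCDDE")   # made-up contents
    trials = 100_000
    for procedure in (first_draws_one_bag, first_draws_pooled):
        counts = Counter(procedure(bag, 3)[0] for _ in range(trials))
        print(procedure.__name__,
              sorted((k, round(v / trials, 3)) for k, v in counts.items()))

Both procedures print (up to sampling noise) the same distribution over the bag's contents.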
It only gets a bit tricky towards the end of the game, when some of the players' bags are empty. The fairest solution is to let such players pick from 100% of the items still in play; hence, they should both pick which bag to draw from (weighted by the number of items each bag still holds, so that every remaining item is equally likely) and then [blindly, of course] pick one item from said bag.
This problem illustrates an interesting characteristic of probabilities, which is that probabilities are relative to the amount of knowledge one has about the situation. For example, the game host may well know that the "pool" bag assigned to, say, player X does not include any letter "A" (thinking about Scrabble), but so long as none of the players knows this (and so long as the partition into pool bags was fully random), the game remains fair, and player X still has to assume that his/her probability of drawing an "A" the next time a letter is drawn is the same as if all remaining letters were available to him/her.
Edit:
Notwithstanding the mathematical validity of the assertion that both procedures are fully equivalent, perception is an important factor in games that include a chance component (in particular if the game also includes a pecuniary component). To avoid the ire of players who do not understand this equivalence, you may want to stick to the original procedure...
Depending on the game rules, #mjv is right, the initial random division doesn't affect the probabilities. This is analogous to a game where n players draw cards in turn from a face down deck: the initial shuffle of the deck is the random division into the "bags" of cards for each player.
But if you replace the items after each draw, it does matter if there is one bag or many. With one bag any particular item will eventually be drawn by any player with the same probability. With many bags, that item can only be drawn by the player whose bag it was initially placed in.
Popping up to the software level, if the game calls for a single bag, I'd recommend just programming it that way: it should be no more difficult than n bags, and you don't have to prove the new game equivalent to the old.
My intuition tells me that dividing a random collection of things into smaller random subsets leaves it equally random... it doesn't matter if a player picks from a big pool or a smaller one (that, in turn, is refilled from the big one).
For a game, that is random enough, IMHO!
Depending on how crucial security is, it might be okay (if money is involved, yours or theirs, DO NOT DO THAT). I'm not entirely sure it would be less random from the perspective of an ignorant player.
a) Don't count on them being ignorant, your program could be cracked and then they would know what pieces are coming up
b) It would be very tricky to fill the bags in such a way that you don't introduce vulnerabilities. For instance, take the naive algorithm of picking one item randomly, removing it from the pool, and putting it in the first bucket, then doing the same for the second bucket, and so on. You just ensured that if there are N pieces, the first player had a probability of 1/N of being assigned a given piece, the second had 1/(N-1), the third had 1/(N-2), and so on. Players can then analyze the pieces already played in order to figure out the probabilities that other players are holding certain pieces.
I THINK the following algorithm might work better, but almost all people get probability wrong the first time they come up with a new algorithm. DON'T USE THIS, just understand that it might cover the security vulnerability I talked about:
Create a list of N ordered items and instantiate P players
Mark 1/P of the items randomly (with replacement) for each player
Do this repeatedly until all N items are marked and there are an equal number of items marked for each player (NOTE: this may take much longer than you will live, depending on N and P)
Place the appropriate items in the player's bucket and randomly rearrange (do NOT use a place swapping algorithm)
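Purely to make that description concrete, here is one possible reading of the scheme in Python. As stressed above, this is illustrative only; do not use it, and note that the rejection loop can run for an absurdly long time:

    import random

    def mark_partition(n_items, n_players):
        # One reading of the scheme above: every player marks n_items/n_players
        # random item indices (with replacement, so collisions are possible);
        # if the attempt doesn't end with every item marked exactly once,
        # throw it away and start over. DON'T USE THIS (see above): the
        # rejection loop can take much longer than you will live.
        per_player = n_items // n_players
        assert per_player * n_players == n_items, "items must divide evenly"
        while True:
            marks = {}                     # item index -> owning player
            ok = True
            for player in range(n_players):
                for _ in range(per_player):
                    item = random.randrange(n_items)
                    if item in marks:      # collision spoils this attempt
                        ok = False
                    marks.setdefault(item, player)
            if ok:                         # no collisions means every item is
                return marks               # marked once, per_player per player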
Even then, after all this, you might still have a vulnerability to someone figuring out what's in their bucket from an exploit. Stick with a combined pool; it's still tricky to pick really randomly, but it will make your life easier.
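For the combined pool, a minimal sketch of what I mean (assuming Python and its standard secrets module; the bag contents are made up):

    import secrets

    class Bag:
        # One combined pool; secrets uses the OS CSPRNG, so observed draws
        # don't let players predict future draws (unlike a seeded default PRNG).
        def __init__(self, pieces):
            self.pieces = list(pieces)

        def draw(self):
            i = secrets.randbelow(len(self.pieces))
            # Swap-remove: O(1), and the list's order never matters.
            self.pieces[i], self.pieces[-1] = self.pieces[-1], self.pieces[i]
            return self.pieces.pop()

    bag = Bag("AAABBCCCDDE")   # made-up contents
    print(bag.draw())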
Edit: I know the tone sounds kind of jerky. I mostly included all that bold for people who might read this out of context and try some of these algorithms. I really wish you well :-)
Edit 2: On further consideration, I think that the problem with picking in order might reduce to having players taking turns in the first place. If that's in the rules already it might not matter.
I want to create an algorithm which can most accurately replicate a given input audio source (music) by playing from a predetermined list of sounds. This is an example of what I want to do, but this uses all 88 keys of the piano in various volumes, while I only have a predetermined list of audio files.
Assuming the sounds overlap (multiple instances of the sounds can be played at once) then what's the best algorithm (or existing software, if any) which can do this?
This is a very hard problem, and I don't believe that there is a good algorithm.
First, let's describe the problem to be sure we have it right. You have a collection X of audio samples. You can play any combination of them starting at any time t. So the number of waveforms you could insert is the number of samples in the collection times the number of possible start times.
You want a list of which ones to play, and when to play them (possibly several at the same time), to approximate a given waveform. In other words, you want to find a subset of the options whose sum is a good approximation.
If we just wanted to approximate the sum at one point, you'd get the standard Subset Sum problem, which can be solved by dynamic programming. BUT dynamic programming requires a state space whose size is the number of things in your set times the number of possible values you could sum to. This state space is huge. You can make it smaller by grouping values into a smaller number of buckets. (This is fine for approximating.)
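For reference, the standard subset-sum dynamic program with that bucketing trick looks roughly like this (the bucket size and example values are placeholders you'd tune):

    def subset_sum_close(values, target, bucket=1):
        # `reachable` is the set of (bucketed) sums attainable so far;
        # each value can double the number of options. A larger `bucket`
        # shrinks the state at the cost of accuracy.
        q = lambda x: round(x / bucket)
        reachable = {0}
        for v in values:
            reachable |= {s + q(v) for s in reachable}
        return any(abs(s - q(target)) <= 1 for s in reachable)

    print(subset_sum_close([3, 34, 4, 12, 5, 2], 9))   # True: 4 + 5 = 9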
But we want to approximate not at one point, but at many. And even a small number of buckets at 20 points makes a combinatorial explosion with too much state to handle.
Therefore I wouldn't hope to find a perfect algorithm. Instead I would recommend looking up things like simulated annealing and genetic algorithms to randomly explore and find a reasonable solution in a reasonable time.
Good luck! (Literally, given the randomness I'm recommending.)
A note on what kind of combination I'm thinking of.
Imagine having a pool of 10,000 candidates. Each turn you randomly do one of these things to create a new candidate: take one and remove a sample; take one and move the timing of a sample; take one and add a random sample; or take two and a cutover time, and use the samples from the first that start before the cutover followed by the samples from the other that start after it.
You then compare your new candidate with a random other candidate, using the simulated annealing temperature to randomly decide which to keep (but over time giving a growing preference to the better one).
It will take a lot of calculation, but I believe you'll wind up with some good candidates.
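To make that concrete, here is a skeletal Python version of the loop; the fitness function (how well a candidate's mix-down matches the target waveform), the sample pool, and all the constants are placeholders you would supply:

    import random

    # Candidate = list of (sample_id, start_time) pairs. You supply `fitness`
    # and `pool` (the available sample ids); the rest is a made-up skeleton.

    def mutate(cand, pool, max_t):
        cand = list(cand)
        op = random.randrange(3)
        if op == 0 and cand:                 # remove a sample
            cand.pop(random.randrange(len(cand)))
        elif op == 1 and cand:               # move the timing of a sample
            i = random.randrange(len(cand))
            cand[i] = (cand[i][0], random.uniform(0, max_t))
        else:                                # add a random sample
            cand.append((random.choice(pool), random.uniform(0, max_t)))
        return cand

    def cutover(a, b, max_t):
        # Splice: a's samples before the cut time, b's samples after it.
        t = random.uniform(0, max_t)
        return [s for s in a if s[1] < t] + [s for s in b if s[1] >= t]

    def evolve(population, fitness, pool, max_t, steps=100_000):
        temp = 1.0
        for _ in range(steps):
            parent = random.choice(population)
            if random.random() < 0.25:
                child = cutover(parent, random.choice(population), max_t)
            else:
                child = mutate(parent, pool, max_t)
            i = random.randrange(len(population))
            # Annealing acceptance: early on, accept freely; as the
            # temperature drops, keep the child only if it beats its rival.
            if fitness(child) >= fitness(population[i]) or random.random() < temp:
                population[i] = child
            temp *= 0.9999                   # cool slowly
        return max(population, key=fitness)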
I've been reading this small tutorial on Nimbers and game theory.
Could someone explain why the mex rule governs the nimber of a game position?
See: http://en.wikipedia.org/wiki/Mex_(mathematics)
From "minimal excluded ordinal", it seems to me that the nimber for a state is actually the minimum state that the player 'cannot' reach. How does that help govern the state of the current game?
I see a proof on Wikipedia, but I don't understand anything from it.
http://en.wikipedia.org/wiki/Sprague%E2%80%93Grundy_theorem#Proof
The entire idea of a nimber is to draw an analogy with the well-understood game of Nim. So unless you understand THAT game, nimbers won't make sense to you.
In the game of Nim we have a set of piles of things. On each turn, you take as many things as you want from one pile and one pile only. The winner is the person to take the last thing from the last pile.
Now try to convince yourself of the following facts.
In Nim, the nimber of a single pile is the size of that pile.
If we have a 2 pile game, the nimber of the position is the xor of the sizes of the two piles. (You will need to do a double induction.)
If we take the set of piles and split it into two, then the nimber of the whole position is the xor of the nimbers of the two subsets.
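If it helps to experiment while convincing yourself, here is a tiny Python sketch of the xor rule and the perfect-play move it implies (the example piles are arbitrary):

    from functools import reduce
    from operator import xor

    def nimber(piles):
        # Facts above: the nimber of a Nim position is the xor of pile sizes.
        return reduce(xor, piles, 0)

    def winning_move(piles):
        # Return (pile index, new size) leaving nimber 0, or None if we're lost.
        n = nimber(piles)
        if n == 0:
            return None            # every move hands the opponent a win
        for i, p in enumerate(piles):
            if p ^ n < p:          # this pile can be shrunk to p ^ n
                return i, p ^ n

    print(winning_move([3, 4, 5]))   # nimber is 3^4^5 = 2; shrink pile 0 to 1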
Now here is the point. Replace the piles with arbitrary deterministic games, each with a guaranteed winner. Turn the collection into a combined game where you take turns moving in the individual games, and the person who wins the last game wins. The nimber as defined above tells you, by analogy with Nim, how to play the combined game perfectly.
If you're playing just the regular 2-person game, then the only fact about the nimber that you actually need to know is whether it is 0 (you're in a losing position) or non-zero (you're in a winning position). The exact nimber is only useful when you can break a complex game into a collection of separate games that you choose between on each turn. However, a surprising number of mathematical games do admit such a structure.
For me, it was like this:
Understand Nim, and why the strategy works
Understand Poker Nim, and why the strategy is the same
Understand why the mex is the important number
Poker Nim is just like Nim, except that the players hold onto the "coins" that they remove, and on their turn they may either move any positive number of coins from one stack into their hand, or move any positive number of coins from their hand onto one stack.
Initially, this feels very different. Play can even proceed for infinitely many moves! But that doesn't happen if Bob and Alice are playing hard. Suppose Bob looks at the stacks and sees that he would have a winning strategy if they were playing Nim and not Poker Nim. He can adapt that strategy to Poker Nim as follows: if Alice takes coins off the table, he proceeds as if he is playing Nim; if Alice puts coins onto the table, he immediately removes the coins she just placed. Since she can only have finitely many coins in her hand, she can only stall finitely many times before she is forced to make her losing Nim move.
In Poker Nim, if I have 5 coins in hand and I look at a stack of 3 coins, I can on my move change it to have 0, 1, 2, 4, 5, 6, 7, or 8 coins. What I can't do is leave it at the mex of those options, which is 3. If I move it down, I am playing Nim. If I move it up, you can immediately reverse it back to 3, and I am facing the same situation I was before, except that now I have fewer than 5 coins in hand.
So that's Poker Nim, and the essence of how the mex becomes relevant. Moves above the mex are reversible, and so can never turn a losing position into a winning one. Moving above the mex is never helpful. Unless you are trying to overwhelm the computational power of your opponent, that is.
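To tie this back to the mex rule itself: the nimber (Grundy value) of a position is the mex of the nimbers of its options, and you can compute it directly. A small Python sketch for a toy subtraction game (take 1, 2, or 3 from a single heap; the move set is just an example):

    from functools import lru_cache

    def mex(s):
        # Minimal excludant: the smallest non-negative integer not in s.
        m = 0
        while m in s:
            m += 1
        return m

    @lru_cache(maxsize=None)
    def grundy(n):
        # Nimber of a heap of n in the take-1-2-or-3 subtraction game:
        # the mex of the nimbers of the positions you can move to.
        return mex({grundy(n - k) for k in (1, 2, 3) if k <= n})

    print([grundy(n) for n in range(8)])   # [0, 1, 2, 3, 0, 1, 2, 3]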
I would like to build an AI for the following game:
there are two players on a M x N board
each player can move up/down or left/right
there are different items on the board
the player who wins more categories wins the game (having more items of a category than the other player makes you the winner of that category)
in one turn you can either pick up an item you are standing on or move
player moves are made at the same time
if two players stand on the same field and both try to pick up the item, each has a 0.5 chance of getting it
The game ends if one of the following condition is met:
all the items have been picked up
there is already a clear winner because one player has more than half the items in more than half of the categories
I have no idea of AI, but I have taken a machine learning class some time ago.
How do I get started on such a problem?
Is there a generalization of this problem?
The canonical choice for adversarial search in games like the one you propose (two-player zero-sum games) is Minimax search. From Wikipedia, the goal of Minimax is to
Minimize the possible loss for a worst case (maximum loss) scenario. Alternatively, it can be thought of as maximizing the minimum gain.
Hence, it is called minimax, or maximin. Essentially you build a tree of Max and Min levels, where each node has a branching factor equal to the number of possible actions at each turn (4 in your case). Each level corresponds to one player's turn, and the tree extends until the end of the game, allowing you to search for the optimal choice at each turn, assuming the opponent plays optimally as well. If your opponent does not play optimally, you will only score better. Essentially, at each node you simulate every possible game and choose the best action for the current turn.
If it seems like generating all possible games would take a long time, you are correct: it's an exponential-time algorithm. From here you would want to investigate alpha-beta pruning, which essentially allows you to eliminate some of the possible games you are enumerating, based on the values you have found so far, and is a fairly simple modification of minimax. This solution will still be optimal. I defer to the Wikipedia article for further explanation.
From there, you would want to experiment with different heuristics for eliminating nodes, which can prune away a significant fraction of the tree. Do note, however, that eliminating nodes via heuristics can produce a sub-optimal (but still good) solution, depending on your heuristic. One common tactic is to limit the depth of the search tree: you search, say, 5 moves ahead to determine the best current move, using an estimate of each player's score at 5 moves ahead. Once again, this is a heuristic you can tweak. Something as simple as calculating the score of the game as if it ended on that turn might suffice, and is definitely a good starting point.
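A generic depth-limited alpha-beta skeleton might look like this (the game-specific is_over, actions, result, and evaluate functions are placeholders you would implement):

    import math

    def alphabeta(state, depth, alpha, beta, maximizing, game):
        # Depth-limited minimax with alpha-beta pruning. `game` is assumed
        # to supply is_over(s), actions(s), result(s, a), and evaluate(s).
        if depth == 0 or game.is_over(state):
            return game.evaluate(state), None
        best = None
        if maximizing:
            value = -math.inf
            for a in game.actions(state):
                v, _ = alphabeta(game.result(state, a), depth - 1,
                                 alpha, beta, False, game)
                if v > value:
                    value, best = v, a
                alpha = max(alpha, value)
                if alpha >= beta:
                    break              # prune: Min will never allow this line
        else:
            value = math.inf
            for a in game.actions(state):
                v, _ = alphabeta(game.result(state, a), depth - 1,
                                 alpha, beta, True, game)
                if v < value:
                    value, best = v, a
                beta = min(beta, value)
                if alpha >= beta:
                    break              # prune: Max will never allow this line
        return value, best

    # usage: value, move = alphabeta(start_state, 5, -math.inf, math.inf, True, game)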
Finally, for the nodes where probability is concerned, there is a slight modification of Minimax called Expectiminimax that essentially takes care of probability by adding a "third" player that makes the random choice for you. The nodes for this third player take the expected value of the random event as their value.
The usual approach to any such problem is to play the game against a live opponent long enough to find heuristic rules (short-term goals) that lead you to victory, and then implement those heuristics in your solution. Start with really small boards (1x3) and a small number of categories (1), play them and see what happens, and then advance to more complicated cases.
Without playing the game I can only imagine that categories with fewer items are more valuable, as are categories whose items are currently closer to you, and categories whose items are far away from you but still closer to you than to the opponent.
Every category has a cost, which is the number of moves required to gain control of it; the cost for you is different from the cost for the opponent, and it changes with every move. A category has greater value to you if your cost is near the opponent's cost but still below it.
Every time you make a move, the categories change value, so you have to re-evaluate the board and go from there in deciding your next move. The goal is to maximize your values and minimize the opponent's, assuming that the opponent uses the same algorithm as you.
The search for the best move gets more complicated if you explore more than one turn in advance, but it is also more effective. In this case you have to simulate the opponent's moves using the same algorithm, and then choose the move against which the opponent has the weakest counter-move. This strategy is called minimax.
All this is not really AI, but it is a road map for an algorithm. The neural networks mentioned in the other answer are more AI-like, but I don't know anything about them.
The goal of the AI is to always seek to maintain the win conditions.
If it is practical (depending on how item locations are stored), the distance to all remaining items should be known to the AI at the start of each turn. Ideally, this would be calculated once when the game starts and then simply adjusted as the AI moves, instead of being recalculated each turn. It would also be wise to have the AI track the same information for the player, if the AI is going to consider more than just its own situation.
From there it is a matter of determining which item should be picked up, as an optimization over the following considerations:
What items and item categories does the AI currently have?
What items and item categories does the player currently have?
What items and item categories are near the AI?
What items and item categories are near the Player?
Exactly how you do this largely depends on how difficult to beat you want the AI to be.
A simple way would be to use a greedy approach and simply go after the "current" best choice. This could be done by finding the closest item that is not in a category the player is already winning by more than some margin of items (probably 1-3). This produces an AI that tries to win but doesn't think ahead, making it rather easy to predict.
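A sketch of that greedy choice (the Manhattan distance metric and the margin are placeholder assumptions):

    def greedy_target(ai_pos, items, player_lead, margin=2):
        # items: list of ((x, y), category); player_lead[cat] is the player's
        # item count minus the AI's in that category. `margin` is a made-up
        # "give up on this category" threshold (the 1-3 suggested above).
        def dist(a, b):
            return abs(a[0] - b[0]) + abs(a[1] - b[1])   # Manhattan distance
        viable = [(pos, cat) for pos, cat in items
                  if player_lead.get(cat, 0) <= margin]
        if not viable:
            return None
        return min(viable, key=lambda it: dist(ai_pos, it[0]))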
Allowing the greedy algorithm to check multiple turns ahead will improve it, and considering what the player will do will improve it further.
Heuristics will lead to a more realistic and harder-to-beat AI, possibly even one that is practically impossible to beat.
I was out buying groceries the other day and needed to search through my wallet to find my credit card, my customer rewards (loyalty) card, and my photo ID. My wallet has dozens of other cards in it (work ID, other credit cards, etc.), so it took me a while to find everything.
My wallet has six slots in it where I can put cards, with only the first card in each slot initially visible at any one time. If I want to find a specific card, I have to remember which slot it's in, then look at all the cards in that slot one at a time to find it. The closer it is to the front of a slot, the easier it is to find it.
It occurred to me that this is pretty much a data structures question. Suppose that you have a data structure consisting of k linked lists, each of which can store an arbitrary number of elements. You want to distribute elements into the linked lists in a way that minimizes lookup cost. You can use whatever system you want for distributing elements into the different lists, and can reorder the lists whenever you'd like. Given this setup, is there an optimal way to order the lists, under either of the assumptions:
You are given the probabilities of accessing each element in advance and accesses are independent, or
You have no knowledge in advance what elements will be accessed when?
The informal system I use in my wallet is to "hash" cards into different slots based on use case (IDs, credit cards, loyalty cards, etc.), then keep elements within each slot roughly sorted by access frequency. However, maybe there's a better way to do this (for example, storing the k most frequently-used elements at the front of each slot regardless of their use case).
Is there a known system for solving this problem? Is this a well-known problem in data structures? If so, what's the optimal solution?
(In case this doesn't seem programming-related: I could imagine an application in which the user has several drop-down lists of commonly-used items, and wants to keep those items ordered in a way that minimizes the time required to find a particular item.)
Although not a full answer for general k, this 1985 paper by Sleator and Tarjan gives a helpful analysis of the amortised complexity of several dynamic list update algorithms for the case k=1. It turns out that move-to-front is very good: assuming fixed access probabilities for each item, it never requires more than twice the number of steps (moves and swaps) that would be required by the optimal (static) algorithm, in which all elements are listed in nonincreasing order of probability.
Interestingly, a couple of other plausible heuristics -- namely swapping with the previous element after finding the desired element, and maintaining order according to explicit frequency counts -- don't share this desirable property. OTOH, on p. 2 they mention that an earlier paper by Rivest showed that the expected amortised cost of any access under swap-with-previous is <= the corresponding cost under move-to-front.
I've only read the first few pages, but it looks relevant to me. Hope it helps!
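For reference, move-to-front itself is only a few lines; this sketch covers the single-list (k=1) case the paper analyses, and the wallet contents are just an example:

    class MoveToFrontList:
        # Self-organising list: an access costs its 1-based position, and
        # the accessed element is then moved to the front.
        def __init__(self, items):
            self.items = list(items)

        def access(self, x):
            i = self.items.index(x)
            self.items.insert(0, self.items.pop(i))
            return i + 1                 # cost of this lookup

    wallet = MoveToFrontList(["ID", "credit", "loyalty", "work"])
    print([wallet.access(c) for c in ("loyalty", "loyalty", "credit")])  # [3, 1, 3]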
You need to look at skip lists. There is a similar problem in arranging stations for a train system with express trains and regular trains. An express train stops only at express stations, while regular trains stop at both regular and express stations. Where should the express stops be placed so as to minimize the average number of stops when travelling from a start station to any other station?
The solution is to place express stations at the triangular numbers (i.e., at 1, 3, 6, 10, etc., where T_n = n * (n + 1) / 2).
This is assuming all stops (or cards) are equally likely to be accessed.
If you know the access probabilities of your n cards in advance and you have k wallet slots and accesses are independent, isn't it fairly clear that the greedy solution is optimal? That is, the most frequently-accessed k cards go at the front of the pockets, next-most-frequently accessed k go immediately behind, and so forth? (You never want a lower-probability card ranked before a higher-probability card.)
If you don't know the access probabilities, but you do know they exist and that card accesses are independent, I imagine sorting the cards similarly, but by number-of-accesses-seen-so-far instead is asymptotically optimal. (Move-to-front is cool too, but I don't see an obvious reason to use it here.)
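A quick sketch of that count-based arrangement under the stated assumptions (the card names and counts are made up): rank cards by accesses seen so far and deal them row-by-row across the k slots, so the k most-accessed cards sit at the fronts, the next k behind them, and so on.

    def arrange(cards_with_counts, k):
        # cards_with_counts: list of (card, access_count) pairs.
        ranked = sorted(cards_with_counts, key=lambda c: -c[1])
        slots = [[] for _ in range(k)]
        for rank, (card, _) in enumerate(ranked):
            slots[rank % k].append(card)
        return slots

    print(arrange([("ID", 9), ("credit", 7), ("loyalty", 5), ("work", 1)], 2))
    # [['ID', 'loyalty'], ['credit', 'work']]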
Perhaps you get something interesting if you penalise card moves as well; without such a penalty, given any known probability distribution on card accesses, independent or not, I would just greedily re-sort the cards every time I do an access.
First of all: This is not a question about how to make a program play Five in a Row. Been there, done that.
Introductory explanation
I have made a five-in-a-row game as a framework to experiment with genetically improving AI (ouch, that sounds awfully pretentious). As with most turn-based games, the best move is decided by assigning a score to every possible move and then playing the move with the highest score. The function for assigning a score to a move (a square) goes something like this:
If the square already has a token, the score is 0 since it would be illegal to place a new token in the square.
Each square can be a part of up to 20 different winning rows (5 horizontal, 5 vertical, 10 diagonal). The score of the square is the sum of the score of each of these rows.
The score of a row depends on the number of friendly and enemy tokens already in the row. Examples:
A row with four friendly tokens should have infinite score, because if you place a token there you win the game.
The score for a row with four enemy tokens should be very high, since if you don't put a token there, the opponent will win on his next turn.
A row with both friendly and enemy tokens will score 0, since it can never become a winning row.
Given this algorithm, I have declared a type called TBrain:
type
  TBrain = array[cFriendly..cEnemy, 0..4] of integer;
The values in the array indicate the score of a row with either N friendly tokens and 0 enemy tokens, or 0 friendly tokens and N enemy tokens. If there are 5 tokens in a row there's no score, since the row is full.
It's actually quite easy to decide which values should be in the array. vBrain[0,4] (four friendly tokens) should be "infinite"; let's call that 1,000,000. vBrain[1,4] (four enemy tokens) should be very high, but not so high that the brain would prefer blocking several enemy wins to winning itself.
Consider the following (improbable) board:
0123456789
+----------
0|1...1...12
1|.1..1..1.2
2|..1.1.1..2
3|...111...2
4|1111.1111.
5|...111....
6|..1.1.1...
7|.1..1..1..
8|1...1...1.
Player 2 should place his token at (9,4), winning the game, not at (4,4), even though he would then block 8 potential winning rows for player 1. Ergo, vBrain[1,4] should be (vBrain[0,4]/8)-1. Working like this we can find optimal values for the "brain" by hand, but again, this is not what I'm interested in. I want an algorithm to find the best values.
I have implemented this framework so that it's totally deterministic. There are no random values added to the scores, and if several squares have the same score, the top-left one will be chosen.
Actual problem
That's it for the introduction, now to the interesting part (for me, at least)
I have two "brains", vBrain1 and vBrain2. How should I iteratively make them better? I imagine something like this:
Initialize vBrain1 and vBrain2 with random values.
Simulate a game between them.
Assign the values from the winner to the loser, then randomly change one of them slightly.
This doesn't seem to work. The brains don't get any smarter. Why?
Should the score-method add some small random values to the result, so that two games between the same two brains would be different? How much should the values change for each iteration? How should the "brains" be initialized? With constant values? With random values?
Also, does this have anything to do with AI or genetic algorithms at all?
PS: The question has nothing to do with Five in a Row. That's just something I chose because I can declare a very simple "Brain" to experiment on.
If you want to approach this problem as a genetic algorithm, you will need an entire population of "brains". Then evaluate them against each other, either in every pairing or in a tournament, and select the top X% of the population as the parents of the next generation, where offspring are created via mutation (which you have) or genetic crossover (e.g., swap rows or columns between two "brains").
Also, if you do not see any evolutionary progress, you may need more than just win/loss: come up with some kind of point system so that you can rank the entire population more effectively, which makes selection easier.
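A sketch of such a loop in Python (the brain representation mirrors the question's 2x5 score table; play_match and all the numeric parameters are placeholders you would supply and tune):

    import random

    ROWS, COLS = 2, 5      # friendly/enemy x 0..4 tokens, mirroring TBrain

    def random_brain():
        return [[random.randint(0, 1000) for _ in range(COLS)]
                for _ in range(ROWS)]

    def crossover(a, b):
        # Genetic crossover: each row comes whole from one parent or the other.
        return [random.choice((a[r], b[r]))[:] for r in range(ROWS)]

    def mutate(brain, rate=0.1):
        return [[v + random.randint(-50, 50) if random.random() < rate else v
                 for v in row] for row in brain]

    def evolve(play_match, pop_size=50, generations=200, elite=0.2):
        # play_match(a, b) -> points scored by brain a against brain b
        # (a point system, not just win/loss, so the population ranks well).
        pop = [random_brain() for _ in range(pop_size)]
        for _ in range(generations):
            scored = sorted(pop, key=lambda b: -sum(play_match(b, o)
                                                    for o in pop if o is not b))
            parents = scored[:int(elite * pop_size)]
            pop = parents + [mutate(crossover(random.choice(parents),
                                              random.choice(parents)))
                             for _ in range(pop_size - len(parents))]
        return pop[0]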
Generally speaking, yes you can make a brain smarter by using genetic algorithms techniques.
Randomness, or mutation, plays a significant part in genetic programming.
I like this tutorial, Genetic Algorithms: Cool Name & Damn Simple.
(It uses Python for the examples but it's not difficult to understand them)
Take a look at NeuroEvolution of Augmenting Topologies (NEAT). A fancy acronym which basically means the evolution of neural nets, both their structure (topology) and their connection weights. I wrote a .NET implementation called SharpNEAT that you may wish to look at. SharpNEAT V1 also has a Tic-Tac-Toe experiment.
http://sharpneat.sourceforge.net/