Large foreseeable Sudoku, with 81 integers

I am making a simple Sudoku game for my exams at school. I have decided to have only one sudoku; its numbers are then shuffled around to make it look like a new one every time. The problem is that I need to handle 81 integers, some of which have to be visible and some not. I cannot see an easy way to handle these ints, except with arrays, but that didn't go very well.
If you have any suggestions let me know :)

int[][]
Make it a 9x9 array like the visual sudoku.
Any non-visible number can be negated e.g. -5 instead of 5.
To validate that the grid has a solution, check Math.abs(value) (or whatever the absolute-value function is in your language of choice): for each 3x3 'square', and then for each row and column, check that the values 1 to 9 each appear once.
This will only tell you that you have a starting arrangement that can be filled in in a valid way; it won't tell you that logic alone can be used to find that answer exclusively (e.g. an empty grid is valid but has thousands of solutions).
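A minimal sketch of this idea in Python, assuming the whole solution is stored in a 9x9 list of lists with hidden cells negated (the function names are just for this sketch):

def visible(grid, r, c):
    # Hidden cells are stored negated; show 0 (blank) for them.
    v = grid[r][c]
    return v if v > 0 else 0

def valid_solution(grid):
    # Check rows, columns and 3x3 squares against the absolute values,
    # so hidden (negated) cells are counted too.
    def ok(cells):
        return sorted(abs(v) for v in cells) == list(range(1, 10))
    rows = [grid[r] for r in range(9)]
    cols = [[grid[r][c] for r in range(9)] for c in range(9)]
    boxes = [[grid[br + i][bc + j] for i in range(3) for j in range(3)]
             for br in (0, 3, 6) for bc in (0, 3, 6)]
    return all(ok(group) for group in rows + cols + boxes)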

Related

How to create a logical paper and pencil game like Sudoku?

This question is not limited to Sudoku, but may include Kakuro, Hitori, Nurikabe, etc.
I understand the algorithm to solve Sudoku and other similar puzzles, but I'm having a hard time figuring out how to create them.
Say I want a Sudoku generator (to take the most popular). I guess it needs to work in two steps:
Create a valid solution
Remove parts of the solution until the desired number of clues is left.
Creating a solution isn't trivial: filling cells randomly usually works well until the last steps, where you can end up in a deadlock.
Removing parts of the solution requires being sure to remove only redundant ones, which isn't trivial either.
Is there a generic algorithm to work it out? How can I implement such a thing?
I understand my question is "broad" and that I don't show a lot of what I've got so far (splitting the problem in two), but I don't have any lead to start thinking about the algorithm. I'm not asking for a solution, but rather for hints on how to begin.
You could in general approach this as follows:
Define a set of rules which can assist a human in progressing in a game. For instance, in Sudoku, one of those rules could be:
Call the "field of influence" of a given cell, the cells that are either in the same row, the same column or in the same 3x3 block as that given cell. The rule is that this cell cannot have any of the values that are already used in its field of influence. If that means there is only one valid value left, then place that value in this cell.
Another rule could be:
If there is a value that cannot be used anywhere else in the same 3x3 block, then place that value in this cell. Similarly if a value cannot be used anywhere else in the cell's row; or cannot be used anywhere else in the cell's column.
There are obviously other rules. These rules can be more complicated. Rank the rules by how difficult it is for a human to verify and apply them. Try to be as complete as you can, by looking at how you, as a human, reason when solving the game. Implement these rules as functions in the program. In the Sudoku example, such a rule function can be applied to a given cell, and return success (i.e. the cell gets a value) or failure (the rule cannot be used to deduce its value).
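For illustration, a minimal Python sketch of the first rule as such a function, assuming the grid is a 9x9 list of lists with 0 for an empty cell (all names here are illustrative):

def field_of_influence(grid, r, c):
    # Values already used in the cell's row, column and 3x3 block.
    used = set(grid[r]) | {grid[i][c] for i in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)
    used |= {grid[br + i][bc + j] for i in range(3) for j in range(3)}
    used.discard(0)
    return used

def rule_single_candidate(grid, r, c):
    # Success if exactly one valid value is left for this cell.
    if grid[r][c] != 0:
        return False
    candidates = set(range(1, 10)) - field_of_influence(grid, r, c)
    if len(candidates) == 1:
        grid[r][c] = candidates.pop()
        return True
    return False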
Let's say the program should generate a Sudoku of a given difficulty. We will interpret that to mean that solving the Sudoku will require the player to use at least once a rule that has at least that difficulty, or an exotic rule that was never foreseen.
Now start from a solved Sudoku. Remove randomly 50% of the values. Check if the Sudoku can be solved by only using known rules that are within the difficulty range. If not, restore 25% of the removed cells, and repeat. If it could be solved, remove 25% more cells randomly. Continue halving the number of involved cells (either restoring them or removing them), much like a binary search algorithm, until you arrive at the end of this search. For a Sudoku game, this process would take about 7 iterations. Then you will have a kind of "local minimum", where the rules can be applied to get to a solution.
This is far from perfect, as it could well be there is some other cell that could be cleared, while still allowing the rules to work towards a solution. So, if you want to refine this search, you could add some additional iterations to remove random cells as long as the resulting board can still be solved by applying the rules.
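To make that search concrete, here is a rough sketch, assuming a hypothetical solvable_with_rules(grid, max_difficulty) built from rule functions like the one above (the step sizes and names are illustrative, not a definitive implementation):

import random

def carve_puzzle(solution, max_difficulty, solvable_with_rules):
    grid = [row[:] for row in solution]
    cells = [(r, c) for r in range(9) for c in range(9)]
    random.shuffle(cells)
    removed, step = 0, 40                    # start by clearing roughly 50% of 81 cells
    while step >= 1:
        if solvable_with_rules(grid, max_difficulty):
            for r, c in cells[removed:removed + step]:   # still solvable: clear more
                grid[r][c] = 0
            removed += step
        else:
            for r, c in cells[removed - step:removed]:   # went too far: restore
                grid[r][c] = solution[r][c]
            removed -= step
        step //= 2
    while not solvable_with_rules(grid, max_difficulty): # make sure we end solvable
        removed -= 1
        r, c = cells[removed]
        grid[r][c] = solution[r][c]
    return grid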
You could create a Sudoku puzzle by running a solver on a grid that starts with nothing assigned. If your solver progresses by filling in squares, you could "remove" them by stopping (or rolling back) that process at the appropriate point.

How to generate/iterate all "Connect-Four" games?

I need to iterate all different "Connect-Four" games possible.
The grid has 42 cells, and there are 21 red and 21 yellow pieces.
Every game generated must use every piece, and all pieces of the same color are indistinguishable (e.g. if you swap two reds in a solution, it doesn't count as another solution).
From that I can draw the conclusion that there are 42!/(21!*21!) (about 5.4e11) distinct arrangements.
I'm thinking about generating binary strings containing 21 ones and 21 zeros, but besides generating every 42-character binary string and testing them one by one, I don't have any idea how to do that. That would be 2^42 (about 4.4e12) strings to test, so that's not an option.
How would you go about generating all these possible games ?
It seems that you do not care that some of these games will have ended early. To simply generate all of the possible combinations, you should think of the board as a matrix, where a 1 represents a yellow piece and a 0 represents a red piece.
Now if we vectorize the matrix of values for a full board, then we will get something like
[0,1,1,0,...]
where the exact order depends on the permutation. Now since we have 21 of each color, that means that you are essentially asking for all of the possible permutations of the vector
[ones(1,21),zeros(1,21)]
(in Matlab and Python notation). In Matlab, you would then generate the list of all permutations by using the function
perms([ones(1,21),zeros(1,21)])
I am not sure what you want here because obviously it is not feasible to enumerate all of these in practice. If you are just interested in how to do it, I would suggest that you look in the Matlab implementation. It looks like 10 lines of pretty simple code.
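As an alternative to the Matlab perms call (which enumerates duplicate orderings), here is a Python sketch that yields each distinct filling exactly once by choosing which of the 42 cells hold the 21 "1" pieces. Note this is still C(42, 21), about 5.4e11 boards, so full enumeration remains impractical:

from itertools import combinations

def all_fillings(cells=42, pieces=21):
    # Choose which cells hold the 21 '1' pieces; the rest are '0'.
    for chosen in combinations(range(cells), pieces):
        board = [0] * cells
        for i in chosen:
            board[i] = 1
        yield board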

Poker hand flop evaluator

I want to create a lookup table for texas hold'em poker hands. Right now, I am using prime numbers to represent each card, and want to figure out what those cards represent as a total hand. This is because the order of the cards doesn't matter and the multiplication will provide us with a unique number. Now, I know about the hand evaluators, but they only evaluate the strength of the hand without draws, and do not separate the hands into as many categories as I need.
As an example take the following situation:
Hand: AdKd
Flop: Kc5d3d
(d = diamonds, c = clubs, h = hearts, s = spades)
Now, this would return from the lookup table a pair and also a flush draw.
Now this is more tricky:
Hand: AhAd
Flop: 5c5h3d
This would evaluate to overpair. So, basically, we cannot combine the hand and the flop into a single number, as we want to know exactly how the hand interacts with the flop.
I have already created a way of determining whether a flush or a flush draw exists and whether a straight or a straight draw exists. So after that, the suits no longer matter, and we don't care about non-pair hands. Basically, given two numbers that represent the hand and the flop, we get back a hand category. For the last example, Aces are the prime number 41, so for the hand we get 41*41 = 1681, and for the board we get 7*7*3 = 147. Now we go to our lookup table and call lookup(147, 1681), and it should return OVERPAIR (or whatever constant we set it to), in constant time.
How do I implement the lookup table in the first place? And the lookup function? (I'm already planning on using a perfect hashing algorithm for both the flop and the hand, but don't really know how to combine them.)
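A minimal Python sketch of what such a lookup could look like under the question's prime encoding; the prime assignment and the single table entry below are only illustrative, and populating the full (board, hand) table is the part not shown:

from math import prod

# One prime per rank, with Ace = 41 as in the question.
PRIME = dict(zip("23456789TJQKA", [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41]))

def key(ranks):
    # e.g. key("AA") == 1681, key("553") == 147
    return prod(PRIME[r] for r in ranks)

CATEGORY = {(key("553"), key("AA")): "OVERPAIR"}   # hypothetical table entry

def lookup(board_key, hand_key):
    return CATEGORY.get((board_key, hand_key))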
First, for encoding hands I find your approach very raw (not sure why you would use 41 for Ace, why not 37, and how do you differentiate different flavors of Aces). I would suggest the following.
For hand use a number between 1 and 52*51/2 to denote every combination. For flop use 52*51*50/(2*3) numbers and 52*51*50*49/(2*3*4) for flop+turn.
Each combination of these numbers will denote each unique situation. So store whatever annotation (straight, straight-flop etc) you want for each of these.
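A small sketch of that numbering in Python, assuming cards are numbered 0 to 51 and listed in increasing order: a 2-card hand maps to 0..1325 and a 3-card flop to 0..22099, so a (flop index, hand index) pair can index the annotation table directly.

from math import comb

def hand_index(a, b):            # a < b, both in 0..51
    return comb(b, 2) + comb(a, 1)

def flop_index(a, b, c):         # a < b < c, all in 0..51
    return comb(c, 3) + comb(b, 2) + comb(a, 1)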

Shuffle and deal a deck of card with constraints

Here are the facts first.
In the game of bridge there are 4 players named North, South, East and West.
All 52 cards are dealt, 13 cards to each player.
There is an honour counting system: Ace = 4 points, King = 3 points, Queen = 2 points and Jack = 1 point.
I'm creating a "card dealer" with constraints, where for example you might say that the hand dealt to North has to have exactly 5 spades and between 13 and 16 honour counting points; the rest of the hands are random.
How do I accomplish this without affecting the "randomness" in the best way and also having effective code?
I'm coding in C# and .Net but some idea in Pseudo code would be nice!
Since somebody already mentioned my Deal 3.1, I'd like to point out some of the optimizations I made in that code.
First of all, to get the most flexible constraints, I wanted to add a complete programming language to my dealer, so you could generate whole libraries of constraints with different types of evaluators and rules. I used Tcl for that language, because I was already learning it for work, and, in 1994 when Deal 0.0 was released, Tcl was the easiest language to embed inside a C application.
Second, I needed the constraint language to run fairly fast. The constraints are running deep inside the loop. Quite a lot of code in my dealer is little optimizations with lookup tables and the like.
One of the most surprising and simple optimizations was to not deal cards to a seat until a constraint is checked on that seat. For example, if you want north to match constraint A and south to match constraint B, and your constraint code is:
match constraint A to north
match constraint B to south
Then only when you get to the first line do you fill out the north hand. If it fails, you reject the complete deal. If it passes, next fill out the south hand and check its constraint. If it fails, throw out the entire deal. Otherwise, finish the deal and accept it.
I found this optimization when doing some profiling and noticing that most of the time was spent in the random number generator.
There is one fancy optimization, which can work in some instances, called "smart stacking":
deal::input smartstack south balanced hcp 20 21
This generates a "factory" for the south hand which takes some time to build but which can then very quickly fill out the one hand to match this criteria. Smart stacking can only be applied to one hand per deal at a time, because of conditional probability problems. [*]
Smart stacking takes a "shape class" (in this case, "balanced"), a "holding evaluator" (in this case, "hcp"), and a range of values for the holding evaluator. A "holding evaluator" is any evaluator which is applied to each suit and then totaled, so hcp, controls, losers, hcp_plus_shape, etc. are all holding evaluators.
For smartstacking to be effective, the holding evaluator needs to take a fairly limited set of values. How does smart stacking work? That might be a bit more than I have time to post here, but it's basically a huge set of tables.
One last comment: If you really only want this program for bidding practice, and not for simulations, a lot of these optimizations are probably unnecessary. That's because the very nature of practicing makes it unworthy of the time to practice bids that are extremely rare. So if you have a condition which only comes up once in a billion deals, you really might not want to worry about it. :)
[Edit: Add smart stacking details.]
Okay, there are exactly 8192=2^13 possible holdings in a suit. Group them by length and honor count:
Holdings(length,points) = { set of holdings with this length and honor count }
So
Holdings(3,7) = {AK2, AK3,...,AKT,AQJ}
and let
h(length,points) = |Holdings(length,points)|
Now list all shapes that match your shape condition (spades=5):
5-8-0-0
5-7-1-0
5-7-0-1
...
5-0-0-8
Note that the collection of all possible hand shapes has size 560, so this list is not huge.
For each shape, list the ways you can get the total honor points you are looking for by listing the honor points per suit. For example,
Shape Points per suit
5-4-4-0 10-3-0-0
5-4-4-0 10-2-1-0
5-4-4-0 10-1-2-0
5-4-4-0 10-0-3-0
5-4-4-0 9-4-0-0
...
Using our sets Holdings(length,points), we can compute the number of ways to get each of these rows.
For example, for the row 5-4-4-0 10-3-0-0, you'd have:
h(5,10)*h(4,3)*h(4,0)*h(0,0)
So, pick one of these rows at random, with relative probability based on the count, and then, for each suit, choose a holding at random from the correct Holdings() set.
Obviously, the wider the range of hand shapes and points, the more rows you will need to pre-compute. With a little more code, you can still do this with some cards predetermined, if you know where the ace of spades is, or West's whole hand, or whatever.
[*] In theory, you can solve these conditional probability issues for smart stacking with multiple hands, but the solution to the problem would make it effective only for extremely rare types of deals. That's because the number of rows in the factory's table is roughly the product of the number of rows for stacking one hand times the number of rows for stacking the other hand. Also, the h() table has to be keyed on the number of ways of dividing the n cards amongst hand 1, hand 2, and other hands, which changes the number of values from roughly 2^13 to 3^13 possible values, which is about two orders of magnitude bigger.
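A minimal sketch of the weighted row selection, assuming the rows and the h(length, points) counts have already been precomputed as described above (all names here are illustrative):

import random

def pick_row(rows, h):
    # rows: list of (shape, points) pairs, e.g. ((5, 4, 4, 0), (10, 3, 0, 0)).
    # Weight each row by the number of hands it represents.
    weights = []
    for shape, points in rows:
        count = 1
        for length, pts in zip(shape, points):
            count *= h(length, pts)
        weights.append(count)
    shape, points = random.choices(rows, weights=weights, k=1)[0]
    # Then, for each suit, choose a holding uniformly from Holdings(length, points).
    return shape, points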
Since the numbers are quite small here, you could just take the heuristic approach: Randomly deal your cards, evaluate the constraints and just deal again if they are not met.
Depending on how fast your computer is, it might be enough to do this:
Repeat:
do a random deal
Until the board meets all the constraints
As with all performance questions, the thing to do is try it and see!
edit I tried it and saw:
done 1000000 hands in 12914 ms, 4424 ok
This is without giving any thought to optimisation - and it produces 342 hands per second meeting your criteria of "North has 5 spades and 13-16 honour points". I don't know the details of your application but it seems to me that this might be enough.
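For reference, a rejection-sampling sketch along these lines in Python; the constraint check encodes "exactly 5 spades and 13-16 honour points for North", and you would adapt it to your own constraints:

import random

RANKS = "23456789TJQKA"
SUITS = "SHDC"
DECK = [r + s for s in SUITS for r in RANKS]
HCP = {"A": 4, "K": 3, "Q": 2, "J": 1}

def north_ok(hand):
    spades = sum(1 for card in hand if card[1] == "S")
    points = sum(HCP.get(card[0], 0) for card in hand)
    return spades == 5 and 13 <= points <= 16

def constrained_deal():
    # Deal at random and try again until the constraint is met.
    while True:
        deck = DECK[:]
        random.shuffle(deck)
        hands = [deck[i * 13:(i + 1) * 13] for i in range(4)]
        if north_ok(hands[0]):
            return hands   # north, south, east, west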
I would go for this flow, which I think does not affect the randomness (other than by pruning solutions that do not meet constraints):
List in your program all possible combinations of "valued" cards whose total Honour points count is between 13 and 16. Then pick randomly one of these combinations, removing the cards from a fresh deck.
Count how many spades you already have among the valued cards, and pick randomly among the remaining spades of the deck until you meet the count.
Now pick from the deck as many non-spade, non-valued cards as you need to complete the hand.
Finally pick the other hands among the remaining cards.
You can write a program that generates the combinations of my first point, or simply hardcode them while accounting for color symmetries to reduce the number of lines of code :)
Since you want to practise bidding, I guess you will likely be having various forms of constraints (and not just 1S opening, as I guess for this current problem) coming up in the future. Trying to come up with the optimal hand generation tailored to the constraints could be a huge time sink and not really worth the effort.
I would suggest you use rejection sampling: Generate a random deal (without any constraints) and test if it satisfies your constraints.
In order to make this feasible, I suggest you concentrate on making the random deal generation (without any constraints) as fast as you can.
To do this, map each deal to a 12-byte integer (the total number of bridge deals fits in 12 bytes). Generating a random 12-byte integer can be done with just three 4-byte random numbers. Of course, since the number of deals doesn't exactly fill 12 bytes, you have a bit of processing to do here, but I expect it won't be too much.
Richard Pavlicek has an excellent page (with algorithms) to map a deal to a number and back.
See here: http://www.rpbridge.net/7z68.htm
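A small sketch of drawing such a uniform deal index in Python (converting the index into an actual deal is the mapping Pavlicek's page describes, which is not shown here):

import secrets
from math import factorial

# Total number of distinct deals: 52! / (13!)^4, about 5.4e28 (~96 bits).
TOTAL_DEALS = factorial(52) // factorial(13) ** 4

def random_deal_index():
    # Reject 96-bit values >= TOTAL_DEALS so the index stays uniform.
    while True:
        n = secrets.randbits(96)
        if n < TOTAL_DEALS:
            return n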
I would also suggest you look at existing bridge hand dealing software (like Deal 3.1, which is freely available). Deal 3.1 also supports doing double dummy analysis. Perhaps you could make it work for you without having to roll your own.
Hope that helps.

Optimizing Conway's 'Game of Life'

To experiment, I've (long ago) implemented Conway's Game of Life (and I'm aware of this related question!).
My implementation worked by keeping 2 arrays of booleans, representing the 'last state', and the 'state being updated' (the 2 arrays being swapped at each iteration). While this is reasonably fast, I've often wondered about how to optimize this.
One idea, for example, would be to precompute at iteration N the zones that could be modified at iteration (N+1) (so that if a cell does not belong to such a zone, it won't even be considered for modification at iteration (N+1)). I'm aware that this is very vague, and I never took time to go into the details...
Do you have any ideas (or experience!) of how to go about optimizing (for speed) Game of Life iterations?
I am going to quote my answer from the other question, because the chapters I mention have some very interesting and fine-tuned solutions. Some of the implementation details are in c and/or assembly, yes, but for the most part the algorithms can work in any language:
Chapters 17 and 18 of Michael Abrash's Graphics Programmer's Black Book are one of the most interesting reads I have ever had. It is a lesson in thinking outside the box. The whole book is great really, but the final optimized solutions to the Game of Life are incredible bits of programming.
There are some super-fast implementations that (from memory) represent cells of 8 or more adjacent squares as bit patterns and use that as an index into a large array of precalculated values to determine in a single machine instruction if a cell is live or dead.
Check out here:
http://dotat.at/prog/life/life.html
Also XLife:
http://linux.maruhn.com/sec/xlife.html
You should look into Hashlife, the ultimate optimization. It uses the quadtree approach that skinp mentioned.
As mentioned in Abrash's Black Book, one of the most simple and straightforward ways to get a huge speedup is to keep a change list.
Instead of iterating through the entire cell grid each time, keep a copy of all the cells that you change.
This will narrow down the work you have to do on each iteration.
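A minimal sketch of the sparse version of this idea in Python, keeping only the set of live cells and examining just those cells and their neighbours each generation:

from collections import Counter

def step(live):
    # live: set of (x, y) coordinates of live cells.
    counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # A cell is live next generation if it has 3 live neighbours,
    # or 2 live neighbours and was already live.
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}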
The algorithm itself is inherently parallelizable. Using the same double-buffered method in an unoptimized CUDA kernel, I'm getting around 25ms per generation in a 4096x4096 wrapped world.
Which algorithm is most efficient mainly depends on the initial state.
If the majority of cells are dead, you can save a lot of CPU time by skipping the empty parts and not calculating things cell by cell.
In my opinion it makes sense to check for completely dead spaces first when your initial state is something like "random, but with a chance of life lower than 5%."
I would just divide the matrix up into halves and start checking the bigger ones first.
So if you have a field of 10,000 * 10,000, you'd first accumulate the states of the upper-left quarter of 5,000 * 5,000.
If the sum of states is zero in the first quarter, you can ignore this first quarter completely and check the upper-right 5,000 * 5,000 for life next.
If its sum of states is > 0, you divide the second quarter into 4 pieces again and repeat this check for life in each of these subspaces.
You could go down to subframes of 8*8 or 10*10 (not sure what makes the most sense here).
Whenever you find life, you mark these subspaces as "has life".
Only spaces which "have life" need to be divided into smaller subspaces; the empty ones can be skipped.
When you are finished assigning the "has life" attribute to all possible subspaces, you end up with a list of subspaces which you now simply extend by +1 in each direction (with empty cells) and apply the regular (or modified) Game of Life rules to.
You might think that dividing a 10,000 * 10,000 space into 8*8 subspaces is a lot of tasks, but accumulating their state values is in fact much, much less computing work than running the Game of Life algorithm on each cell plus its 8 neighbours, comparing the counts, and storing the new state for the next iteration somewhere...
But as I said above, for a random initial state with 30% population this won't make much sense, as there will not be many completely dead 8*8 subspaces to find (let alone dead 256*256 subspaces).
And of course, the perfect way to optimise will, last but not least, depend on your language.
Two ideas:
(1) Many configurations are mostly empty space. Keep a linked list (not necessarily in order, that would take more time) of the live cells, and during an update, only update around the live cells (this is similar to your vague suggestion, OysterD :)
(2) Keep an extra array which stores the # of live cells in each row of 3 positions (left-center-right). Now when you compute the new dead/live value of a cell, you need only 4 read operations (top/bottom rows and the center-side positions), and 4 write operations (update the 3 affected row summary values, and the dead/live value of the new cell). This is a slight improvement from 8 reads and 1 write, assuming writes are no slower than reads. I'm guessing you might be able to be more clever with such configurations and arrive at an even better improvement along these lines.
If you don't want anything too complex, then you can use a grid to slice it up, and if that part of the grid is empty, don't try to simulate it (please view Tyler's answer). However, you could do a few optimizations:
Set different grid sizes depending on the amount of live cells, so if there's not a lot of live cells, that likely means they are in a tiny place.
When you randomize it, don't use the grid code until the user changes the data: I've personally tested randomizing it, and even after a long time it still fills most of the board (unless the grid is sufficiently small, at which point it won't help that much anymore).
If you are showing it to the screen, don't use rectangles for pixel size 1 and 2: instead set the pixels of the output. Any higher pixel size and I find it's okay to use the native rectangle-filling code. Also, preset the background so you don't have to fill the rectangles for the dead cells (not live, because live cells disappear pretty quickly)
I don't know exactly how this can be done, but I remember some of my friends had to represent this game's grid with a Quadtree for an assignment. I guess it's really good for optimizing the space usage of the grid since you basically only represent the occupied cells. I don't know about execution speed though.
It's a two dimensional automaton, so you can probably look up optimization techniques. Your notion seems to be about compressing the number of cells you need to check at each step. Since you only ever need to check cells that are occupied or adjacent to an occupied cell, perhaps you could keep a buffer of all such cells, updating it at each step as you process each cell.
If your field is initially empty, this will be much faster. You probably can find some balance point at which maintaining the buffer is more costly than processing all the cells.
There are table-driven solutions for this that resolve multiple cells in each table lookup. A google query should give you some examples.
I implemented this in C#:
All cells have a location, a neighbor count, a state, and access to the rule.
Put all the live cells in array B into array A.
Have all the cells in array A add 1 to the neighbor count of their neighbors.
Have all the cells in array A put themselves and their neighbors in array B.
Have all the cells in array B update according to the rule and their state.
Have all the cells in array B set their neighbor counts back to 0.
Pros:
Ignores cells that don't need to be updated
Cons:
4 arrays: a 2d array for the grid, an array for the live cells, and an array for the active cells.
Can't process rule B0.
Processes cells one by one.
Cells aren't just booleans
Possible improvements:
Cells could also have an "Updated" value; they are updated only if they haven't been updated in the current tick, removing the need for array B as mentioned above.
Instead of array B holding the cells with live neighbors, array B could hold the cells without, and those would be checked for rule B0.
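For illustration, a rough Python sketch of the steps described above (not the poster's actual C#; the Cell class and helper names are just for this sketch):

class Cell:
    def __init__(self):
        self.alive = False
        self.count = 0

def tick(grid, live):
    # grid: dict mapping (x, y) -> Cell; live: list of live coordinates (array A).
    active = set()                                   # array B: live cells plus neighbors
    for (x, y) in live:
        active.add((x, y))
        for dx in (-1, 0, 1):                        # bump neighbor counts
            for dy in (-1, 0, 1):
                pos = (x + dx, y + dy)
                if (dx, dy) != (0, 0) and pos in grid:
                    grid[pos].count += 1
                    active.add(pos)
    new_live = []
    for pos in active:                               # apply the rule, then reset counts
        cell = grid[pos]
        cell.alive = cell.count == 3 or (cell.alive and cell.count == 2)
        if cell.alive:
            new_live.append(pos)
        cell.count = 0
    return new_live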

Resources