Is there any algorithm that can solve ANY traditional sudoku puzzles, WITHOUT guessing (or similar techniques)? - sudoku

Is there any algorithm that solves ANY traditional sudoku puzzle, WITHOUT guessing?
Here "guessing" means trying a candidate and seeing how far it goes; if a contradiction is found with the guess, backtracking to the guessing step and trying another candidate; and when all candidates are exhausted without success, backtracking to the previous guessing step (if there is one; otherwise the puzzle proves invalid), and so on.
EDIT1: Thank you for your replies.
Traditional sudoku means 81-box sudoku, without any other constraints. Let us say we know the solution is unique: is there any algorithm that is GUARANTEED to solve it without backtracking? Backtracking is a universal tool, and I have nothing against it, but using a universal tool to solve sudoku decreases the value and fun of deciphering (manually or by computer) sudoku puzzles.
How can a human being solve the so-called "hardest sudoku in the world"? Does he need to guess?
I heard that some researchers accidentally found that their algorithm for some data analysis can solve all sudokus. Is that true, and do they have to guess too?

You can use the techniques that humans use to solve sudokus. Just keep track of every possible number in every square and place a number if there is only one possibility. Keep updating the possibilities until the sudoku is solved. You can exclude possibilities by using the rules or by using some more complex reasoning. For example, if two squares in one row have only 1 and 2 as possibilities, no other square in that row can be 1 or 2.
However, keep in mind that not every sudoku has a unique solution, and not every sudoku can be solved with this method.
Edit: More complicated human techniques can be found here:
http://www.sudokudragon.com/sudokustrategy.htm
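The simplest form of the candidate tracking described above can be sketched as follows. This is only a minimal illustration, not a full human-style solver: it applies just the "only one possibility" rule, and the names `peers_of`, `PEERS` and `propagate` are made up for this example. The puzzle is assumed to be an 81-character string read row by row, with `.` or `0` for empty squares.

```python
def peers_of(cell):
    """Indices of the 20 squares sharing a row, column or box with `cell`."""
    r, c = divmod(cell, 9)
    same_row = {9 * r + k for k in range(9)}
    same_col = {9 * k + c for k in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)
    same_box = {9 * (br + i) + bc + j for i in range(3) for j in range(3)}
    return (same_row | same_col | same_box) - {cell}


PEERS = [peers_of(i) for i in range(81)]


def propagate(puzzle):
    """Repeatedly fill squares that have exactly one candidate left.

    Returns the grid as an 81-character string; squares that this single
    rule cannot decide are left as '.'.
    """
    grid = [0 if ch in ".0" else int(ch) for ch in puzzle]
    progress = True
    while progress:
        progress = False
        for cell in range(81):
            if grid[cell]:
                continue
            # Candidates are the digits not already used by any peer.
            candidates = set(range(1, 10)) - {grid[p] for p in PEERS[cell]}
            if len(candidates) == 1:
                grid[cell] = candidates.pop()
                progress = True
    return "".join(str(v) if v else "." for v in grid)
```

If the returned string still contains `.`, this rule alone was not enough and a stronger technique (pairs, X-Wing, and so on) would have to take over.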

Not a solid answer, just FYI:
There's an online Sudoku solver that solves puzzles like a human (rather than a computer), using the following strategies.
1: Hidden Singles
2: Naked Pairs/Triples
3: Hidden Pairs/Triples
4: Naked Quads
5: Pointing Pairs
6: Box/Line Reduction
Tough Strategies ==========
7: X-Wing
8: Simple Colouring
9: Y-Wing
10: Sword-Fish
11: XYZ Wing
Diabolical Strategies ==========
12: X-Cycles
13: XY-Chain
14: 3D Medusa
15: Jelly-Fish
16: Unique Rectangles
17: Extended Unique Rect.
18: Hidden Unique Rect's
19: WXYZ Wing
20: Aligned Pair Exclusion
Extreme Strategies ==========
21: Grouped X-Cycles
22: Empty Rectangles
23: Finned X-Wing
24: Finned Sword-Fish
25: Altern. Inference Chains
26: Sue-de-Coq
27: Digit Forcing Chains
28: Nishio Forcing Chains
29: Cell Forcing Chains
30: Unit Forcing Chains
31: Almost Locked Sets
32: Death Blossom
33: Pattern Overlay Method
34: Quad Forcing Chains
"Trial and Error" ==========
35: Bowman's Bingo
I tried it by importing a Sudoku picked from the "very hard" level of an Android Sudoku app, one on which I had been stuck for quite a while. The solver worked it out, and the most advanced strategy it used was "3D Medusa". Really impressive.
About the last strategy,
Bowman’s Bingo doesn’t solve all ‘bifurcating’ Sudokus but if applied thoroughly it will crack more than 80% of them. It’s not a panacea like Tabling or Nishio but it is easier to do and will work better if you are down to your last twenty or so unsolved squares.

If you just want any algorithm that works without guessing, you can write all traditional sudokus and their solution in a big lookup table. Your algorithm would be doing a lookup. No guessing involved (but the lookup table still feels dirty to me).
"[...] Jarvis/Russell computed the number of essentially different (symmetrically distinct) solutions as 5,472,730,538." (From https://en.wikipedia.org/wiki/Mathematics_of_Sudoku#Enumerating_essentially_different_Sudoku_solutions)

An algorithm has been discovered that is deterministic (i.e. no backtracking), and guaranteed to find a solution to all sudoku problems but it's quite complex.
Details can be found here:
http://www.nature.com/srep/2012/121011/srep00725/full/srep00725.html

Related

Questions about Dancing Links/Algorithm X

I'm working on making a sudoku app, and one of the things needed is a way to solve the sudoku. I did a lot of research on some backtracking algorithms, including making my own version, but then came across Dancing Links and Algorithm X. I've seen a few implementations of it, and it looks really cool, but had some questions - I can't quite wrap my head around it fully yet (I don't have much experience coding, so I haven't grasped all of what's needed to fully understand the core of it and how it works, though I am using this as a handy reference)
As far as I understand, you have a sudoku, which you then convert to an array of 1s and 0s - the end goal of which is to find a combination of rows that together will be fully 1s - that then means we've found a valid solution (yay!)
Now, I kinda sorta understand how that works on normal sudokus - for example, if we put a 5 in the top left cell, it removes all the other options along that row and column, and in turn also removes all options of a 5 being in that square too. But what I don't quite understand is if I'm doing a sudoku variation, how will it work? For example, one popular type of variation is X-Sudoku, where, on top of the normal rules, you have to have the numbers 1 to 9 once on each of the main diagonals. Can I just pretend there's an extra 2 rows/columns on the sudoku that also need to be filled from 1-9 and do it that way, or does it not work like that?
Now, the hard question: another variant is anti-knight sudoku. Basically, on top of the normal rules, you can't have the same number a chess knight's move away (2 out and 1 to the side). Since this now gets a bit wonky in terms of the rows and whatnot, can these be added as extra constraints to the algorithm to solve sudokus along these lines?
X-Sudoku can be solved exactly the way you describe.
Anti-knight Sudoku is trickier because the anti-knight constraint does not fit as straightforwardly into the exact cover framework. There's an extension of Algorithm X that handles packing constraints (at most one instead of exactly one) efficiently by treating them as satisfied when choosing an unsatisfied constraint. Then, for each triple consisting of a digit and two squares a knight's jump apart, you have a packing constraint that at most one of those squares is filled with that digit.
If implementing Algorithm X seems like too much of a challenge, you could look into finding a SAT solver library instead.
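To make the "exactly one" versus "at most one" distinction concrete, here is a small sketch of Algorithm X on plain dicts and sets (no dancing links), extended with optional items that are never chosen for branching. It is an illustration under my own naming (`exact_cover`, `primary`, `secondary` are not from any library). For sudoku, each row would be one (square, digit) placement; an X-Sudoku diagonal would add primary items such as "digit d on diagonal 1", and each anti-knight triple would add one secondary item.

```python
def exact_cover(rows, primary, secondary=()):
    """Yield selections of row names that cover every primary item exactly
    once and every secondary item at most once.

    rows: dict mapping a row name to the list of items that row covers.
    """
    cols = {c: set() for c in list(primary) + list(secondary)}
    for name, items in rows.items():
        for c in items:
            cols[c].add(name)
    yield from _search(cols, rows, set(primary), [])


def _search(cols, rows, primary, chosen):
    open_primary = [c for c in cols if c in primary]
    if not open_primary:  # secondary items may stay uncovered
        yield sorted(chosen)
        return
    # Branch on the primary item with the fewest remaining rows.
    c = min(open_primary, key=lambda item: len(cols[item]))
    for r in sorted(cols[c]):
        chosen.append(r)
        removed = _select(cols, rows, r)
        yield from _search(cols, rows, primary, chosen)
        _deselect(cols, rows, r, removed)
        chosen.pop()


def _select(cols, rows, r):
    """Cover every item of row r and drop all rows that clash with it."""
    removed = []
    for j in rows[r]:
        for i in cols[j]:
            for k in rows[i]:
                if k != j:
                    cols[k].remove(i)
        removed.append(cols.pop(j))
    return removed


def _deselect(cols, rows, r, removed):
    """Undo _select in reverse order."""
    for j in reversed(rows[r]):
        cols[j] = removed.pop()
        for i in cols[j]:
            for k in rows[i]:
                if k != j:
                    cols[k].add(i)


# Toy instance: item "s" is optional, so rows "a" and "b" may not both be used.
toy = {"a": ["p", "s"], "b": ["q", "s"], "c": ["q"]}
print(list(exact_cover(toy, primary=["p", "q"], secondary=["s"])))
```

The only change from the textbook version is that termination and branching look at primary items only; a secondary item still removes clashing rows once some chosen row uses it.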

What's the worst-case valid sudoku puzzle for simple backtracking brute force algorithm?

The "simple/naive backtracking brute-force algorithm", or "Straightforward Depth-First Search", for sudoku is commonly known and implemented, and no different implementation seems to exist.
(When I first wrote this question, I meant that we could completely standardize it, but the wording was bad.)
This guy has described the algorithm well, I think: https://stackoverflow.com/a/2075498/3547717
Edit: So let me specify it more precisely with pseudocode...
var field[9][9]
set the givens in 'field'
if brute (first empty grid) = true then
    output solution
else
    output no solution
end if
function brute (cx, cy)
    for n = 1 to 9
        if (n is not present in row cy) and (n is not present in column cx) and (n is not present in block (cx div 3, cy div 3)) then
            let field[cx][cy] = n
            if (cx, cy) is the last empty grid then
                return true
            elseif brute (next empty grid) = true then
                return true
            end if
            let field[cx][cy] = empty
        end if
    next n
    return false
end function
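For anyone who wants to reproduce cycle counts, the pseudocode above can be sketched in Python roughly as follows. This is my own transcription, not the linked C program: the name `solve_sdfs` is made up, the grid is assumed to be an 81-character string read row by row with `.` for empty grids, and the counter also counts the final entrance after the last empty grid is filled, so absolute numbers will differ slightly from other implementations. The sample puzzle is the well-known easy example from Wikipedia's Sudoku article, chosen because the naive search finishes quickly on it.

```python
def solve_sdfs(puzzle):
    """Naive depth-first search: empty grids in reading order, digits 1..9.

    Returns (solution, calls): the solved 81-character string (or None if
    there is no solution) and how often the recursive function was entered.
    """
    grid = [0 if ch in ".0" else int(ch) for ch in puzzle]
    empties = [i for i, v in enumerate(grid) if v == 0]
    calls = 0

    def allowed(pos, n):
        # True if n does not yet occur in the row, column or block of pos.
        r, c = divmod(pos, 9)
        br, bc = 3 * (r // 3), 3 * (c // 3)
        for k in range(9):
            if grid[9 * r + k] == n or grid[9 * k + c] == n:
                return False
            if grid[9 * (br + k // 3) + bc + k % 3] == n:
                return False
        return True

    def brute(idx):
        nonlocal calls
        calls += 1
        if idx == len(empties):
            return True  # every empty grid has been filled
        pos = empties[idx]
        for n in range(1, 10):
            if allowed(pos, n):
                grid[pos] = n
                if brute(idx + 1):
                    return True
                grid[pos] = 0  # undo and try the next digit
        return False

    solved = brute(0)
    return ("".join(map(str, grid)) if solved else None), calls


PUZZLE = (
    "53..7...." "6..195..." ".98....6."
    "8...6...3" "4..8.3..1" "7...2...6"
    ".6....28." "...419..5" "....8..79"
)
solution, calls = solve_sdfs(PUZZLE)
print(solution, calls)
```

To also count the cycles needed to prove uniqueness, the search would have to continue after the first solution instead of returning; that part is left out here.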
I want to find the puzzle that requires the most time. We may call it the "hardest" for this particular "standardized" algorithm, but this question is not like those asking for the "hardest sudoku".
In fact, a "hard" puzzle under this definition may turn super easy when simply rotated or flipped.
Because the rule is "for each grid, try the numbers 1 to 9", the search always starts from 1, so we may be able to make it try more by choosing the digits suitably. By the way, this means there is no digit-permutation issue to worry about.
The sudoku puzzle must be valid, i.e. it should have exactly one solution. Someone found a puzzle requiring 1439 seconds, but it is not valid because it has no solution.
I define the time required (or, say, the time complexity) as the number of times the recursive function is entered. (My implementation differs slightly from the pseudocode above, because of the last entrance, ensuring a unique solution, etc.)
Is there any good way to construct such a puzzle, or do we have to use approximate methods such as heuristic algorithms to find inexact solutions?
I've implemented backtracking with both the naive strategy (the one I referred to as "simple" above; it is unique) and Peter Norvig's "Least Candidates First" strategy (my implementation is deterministic, but not unique; as Peter has also mentioned, the order of a Python dict changes the result a lot when there is a tie on the number of candidates).
https://github.com/farteryhr/labs/blob/master/sudoku.c
The no-solution one:
.....5.8....6.1.43..........1.5........1.6...3.......553.....61........4.........
takes 60 seconds on my laptop to reach the no-solution conclusion, entering the recursive function 2549798781 times (called "cycles" below). With my implementation of LCF, it takes 78308087 cycles in 30 seconds to conclude. Because finding the grid with the fewest candidates needs more operations, a single cycle of the LCF strategy takes about 16x more time.
The topmost one on the Hardest list:
4.....8.5.3..........7......2.....6.....8.4......1.......6.3.7.5..2.....1.4......
takes 3.0s: the solution is found at cycle 9727397, and 142738236 cycles are needed to ensure the solution is unique. (my LCF: 981/7216 in 0.004s)
Many in the "hard" list are still easy for naive, though a larger portion of them needs 10^7 to 10^9 cycles.
Wikipedia's "Sudoku solving algorithms" article (Original) states that puzzles that work against the backtracking algorithm can be constructed by leaving as many grids as possible empty at the beginning and making the top row of the solution the permutation 987654321.
Well, the test:
..............3.85..1.2.......5.7.....4...1...9.......5......73..2.1........4...9
takes 1.4s: 69175317 cycles to find the solution and 69207227 cycles to ensure it is unique. Not as good as the hard one provided by Peter, but OK; the search ends almost right after the solution is found. That's probably the effect of the first row being lexicographically large. (my LCF: 29206/46160 in 0.023s)
Yes these are obvious, I'm just asking for better ways...
There are also other ways of measuring the difficulty of Sudoku (through solving)
Sudoku Analyst will get stuck with the multiple-solution puzzle given by Peter (naive 419195/419256, LCF 2529478/2529482, yes, there are some puzzles that make LCF do worse):
.....6....59.....82....8....45........3........6..3.54...325..6..................
This one is easy for both naive backtracking (10008/76703) and LCF backtracking (313/1144), but also gets Sudoku Analyst stuck.
..53.....8......2..7..1.5..4....53...1..7...6..32...8..6.5....9..4....3......97..
Another update:
The most difficult Sudoku puzzles are quickly solved by a straightforward depth-first search algorithm
Ha, finally someone also looking for it, and a super tough one is given! The following valid puzzle:
9..8...........5............2..1...3.1.....6....4...7.7.86.........3.1..4.....2..
In this paper, the algorithm is named SDFS, Straightforward Depth-First Search. The number of cycles stated by the author is 1553023932/1884424814, and with my implementation it's 1305263522/1584688020. Yes, there will be some difference in precisely where the counter is incremented, but the basic behavior matches. On repl.it's server, it took 97s to find the answer and 119s to finish the search.
You can easily approximate the worst case by recording the time or number of operations your code takes to solve hard sudoku puzzles. You can either use a random generator that produces valid sudoku puzzles, or take hard sudoku puzzles from the internet, and run your code against them to measure the time or number of operations. Once you have run your code against, say, 10000 such cases, the slowest 5 (and the unsolved ones) would be the worst cases for your solution among those tested.

What is a "naive" algorithm, and what is a "closed - form" solution?

I have a few questions regarding the semantics of terminology used when describing algorithms.
Firstly, what is meant by a 'naive' algorithm? How does this differ from other solutions to a given problem? What other forms can solutions take?
Secondly, I have heard much reference to having a 'closed - form' solution. I have no idea what this means either - but often it appears when trying to solve recurrence relations...
Thanks for your time
A Naive algorithm is usually the most obvious solution when one is asked a problem. It may not be a smart algorithm but will probably get the job done (...eventually.)
Eg. Trying to search for an element in a sorted array.
A Naive algorithm would be to use a Linear Search.
A Not-So Naive Solution would be to use the Binary Search.
A better example is substring search, where the naive algorithm is far less efficient than the Boyer–Moore or Knuth–Morris–Pratt algorithms.
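The sorted-array example can be sketched in Python like this (the function names are just for illustration; `bisect_left` comes from the standard library). Both return the index of the target or -1, but the naive version may look at every element while the binary search halves the range each step.

```python
from bisect import bisect_left


def linear_search(items, target):
    """Naive: check every element in turn, O(n) comparisons."""
    for i, value in enumerate(items):
        if value == target:
            return i
    return -1


def binary_search(sorted_items, target):
    """Less naive: relies on the input being sorted, O(log n) comparisons."""
    i = bisect_left(sorted_items, target)
    if i < len(sorted_items) and sorted_items[i] == target:
        return i
    return -1


data = [1, 3, 5, 7, 9, 11]
print(linear_search(data, 7), binary_search(data, 7))
```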
A Closed Form Solution is a single expression that gives the answer directly, without any loops or recursion.
Eg:
Iterative Algorithm for sum of integer from 1 to n
s = 0
for i in 1 to n
    s = s + i
end for
print s
Closed Form (for the same problem)
s = n * (n + 1 ) /2
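In runnable form (Python, with function names chosen only for this example), the two versions look like this. Integer division is used in the closed form because n * (n + 1) is always even.

```python
def sum_iterative(n):
    """Add 1 + 2 + ... + n with a loop: n additions."""
    s = 0
    for i in range(1, n + 1):
        s += i
    return s


def sum_closed_form(n):
    """Same value from a single expression, no loop."""
    return n * (n + 1) // 2


print(sum_iterative(100), sum_closed_form(100))
```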
Naive algorithm is a very simple algorithm, one with very simple rules. Sometimes the first one that comes to mind. It may be stupid and very slow, it may not even solve the problem. It may sometimes be the best possible. Here's an example of a problem and "naive" algorithms:
Problem: You are in a (2-dimensional) maze. Find your way out. (meaning: to a spot with an "EXIT" sign :)
Naive algorithm 1: Start walking and turn right at every intersection you meet (until you find "EXIT").
Naive algorithm 2: Start walking and choose a random direction at every intersection you meet (until you find "EXIT").
Algorithm 1 will not even get you out of some mazes!
Algorithm 2 will get you out of all mazes, at least with probability 1 for a finite maze whose exit is reachable (although this is rather hard to prove).
Closed form means you can give a single expression as the solution, one that solves it without recurrence or recursion. One should note here that it is not always possible to find such a closed form.
Naive means just that what it says: A first, stupid solution to the problem, that solves it, but maybe not very time-/space efficient. What one really considers 'naive' depends on the speaker, the context, and the weather of the next day. Often it is used to distinguish a very sophisticated solution (that uses some kind of trick) from the obvious implementation.

Packing differently sized chunks of data into multiple bins

EDIT: It seems like this problem is called "Cutting stock problem"
I need an algorithm that gives me the (space-)optimal arrangement of chunks in bins. One way would be put the bigger chunks in first. But see how that algorithm fails in this example:
Chunks           Bins
-----------------------------
AAA BBB CC DD    (       ) (   )
Algorithm        Result
-----------------------------
biggest first    (AAABBB ) (CC )
optimal          (AAACCDD) (BBB)
"Biggest first" can't fit in DD. Maybe it helps to build a table like this:
Size 1: ---
Size 2: CC, DD
Size 3: AAA, BBB
Size 4: CCDD
Size 5: AAACC, AAADD, BBBCC, BBBDD
Size 6: AAABBB
Size 7: AAACCDD, BBBCCDD
Size 8: AAABBBCC, AAABBBDD
Size 10: AAABBBCCDD
This is basically a variant of the bin-packing problem. This problem is known to be NP-hard, so don't expect to find an efficient optimal algorithm for complex cases (i.e. with many objects and bins).
However, if your number of objects/bins is relatively small, you will probably be fine just exhaustively searching all the possible combinations with a depth-first search.
This is pretty easy to implement: just take the first object, then recursively re-run the algorithm with the first object placed in each of the bins in turn (i.e. subtracting the size of the object from the available bin space). Finally, you just need to keep track of the best "solution" found so far and return this as your final answer once you have tried all combinations.
You may also be able to make this algorithm run faster by:
Considering all objects of equal size as equivalent
Pruning the search tree (i.e. returning early from a branch) if you can't possibly beat the current best solution e.g. when you have already found a perfect fit
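A minimal sketch of that exhaustive search, with both speed-ups, could look like the following. It assumes the goal is to maximize the total size of the chunks that get packed, and the name `pack` and its return format are invented for this example. The first pruning skips bins with the same free space as one already tried (the bin analogue of treating equal-sized objects as equivalent); the second abandons a branch that cannot beat the best result so far and stops once everything that could possibly fit is packed.

```python
def pack(chunks, capacities):
    """Exhaustive depth-first packing.

    Returns (total, assignment): the largest total chunk size that fits and,
    for each chunk, the index of its bin (None if it was left out).
    """
    order = sorted(range(len(chunks)), key=lambda i: -chunks[i])
    # remaining[k] = total size of the chunks still undecided at depth k
    remaining = [0] * (len(order) + 1)
    for k in range(len(order) - 1, -1, -1):
        remaining[k] = remaining[k + 1] + chunks[order[k]]
    goal = min(sum(chunks), sum(capacities))  # a perfect fit
    free = list(capacities)
    current = [None] * len(chunks)
    best = {"total": -1, "assignment": None}

    def dfs(k, packed):
        if packed + remaining[k] <= best["total"]:
            return  # cannot beat the best solution found so far
        if k == len(order):
            best["total"] = packed
            best["assignment"] = current[:]
            return
        i = order[k]
        tried = set()
        for b, room in enumerate(free):
            if chunks[i] <= room and room not in tried:
                tried.add(room)  # bins with equal free space are equivalent
                free[b] -= chunks[i]
                current[i] = b
                dfs(k + 1, packed + chunks[i])
                free[b] += chunks[i]
                current[i] = None
                if best["total"] == goal:
                    return  # perfect fit found, stop searching
        dfs(k + 1, packed)  # also try leaving chunk i out

    dfs(0, 0)
    return best["total"], best["assignment"]


# The example from the question: AAA BBB CC DD into bins of size 7 and 3.
print(pack([3, 3, 2, 2], [7, 3]))
```

The running time is still exponential in the worst case, so this is only sensible for a small number of chunks and bins.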
Updated based on comments on problem size
Given that it looks like you have a very large number of chunks to deal with, you might want to try the following:
Fit the largest 10-20 chunks using an exhaustive search as above
Allocate the remainder using a largest fit approach
Mikera is right: this multiple Knapsack problem (a variant of the bin packing problem) is NP hard.
Here are a couple of your options (copied from my answer on a similar question):
Brute force, or better yet, branch and bound. Doesn't scale (at all!), but will find you the optimal solution (probably not in our lifetimes though).
Deterministic algorithm: sort the chunks by size, largest first, then go through that list one by one and assign each chunk the best remaining spot. That will finish very fast, but the solution can be far from optimal (or even infeasible). Here's a nice picture showing an example of what can go wrong. But if you want to keep it simple, that's the way to go.
Meta-heuristics, starting from the result of a deterministic algorithm. This will give you a very good result in reasonable time, better than what humans come up with. Depending on how much time you give it and the difficulty of the problem it might or might not be the optimal solution. There are a couple of libraries out there, such as Drools Planner (open source java).
A general best algorithm for this problem doesn't exist yet (see bin packing problem). You can find a few different approaches on wikipedia and/or googling for the "bin packing problem" and maybe "knapsack problem" would also provide some help.
Donald Knuth's Dancing Links algorithm is quick at finding solutions to "exact covering" problems.

Project Euler #163 understanding

I spent quite a long time searching for a solution to this problem. I drew tons of cross-hatched triangles, counted the triangles in simple cases, and searched for some sort of pattern. Unfortunately, I hit the wall. I'm pretty sure my programming/math skills did not meet the prereq for this problem.
So I found a solution online in order to gain access to the forums. I didn't understand most of the methods at all, and some just seemed too complicated.
Can anyone give me an understanding of this problem? One of the methods, found here: http://www.math.uni-bielefeld.de/~sillke/SEQUENCES/grid-triangles (Problem C)
allowed for a single function to be used.
How did they come up with that solution? At this point, I'd really just like to understand some of the concepts behind this interesting problem. I know looking up the solution was not part of the Euler spirit, but I'm fairly sure I would not have solved this problem anyhow.
This is essentially a problem in enumerative combinatorics, which is the art of counting combinations of things. It's a beautiful subject, but probably takes some warming up to before you can appreciate the ninja tricks in the reference you gave.
On the other hand, the comments in the solutions thread for the problem indicate that many have solved the problem using a brute force approach. One of the most common tricks involves taking all possible combinations of three lines in the diagram, and seeing whether they yield a triangle that is inside the largest triangle.
You can cut down the search space considerably by noting that the lines are in one of six directions. Since a combination of lines that includes two lines that are parallel will not yield a triangle, you can iterate over line triples so that each line in the triple has a different direction.
Given three lines, calculate their intersection points. You will have three possibilities
1) the lines are coincident - they all intersect in a common point
2) two of the lines intersect at a point outside the triangle
3) all three points of intersection are distinct, and they all lie within the outer triangle
Just count the combos satisfying condition (3) and you are done. The number of line combos you have to test is O(n^3), which is not prohibitive.
EDIT1: rereading your question, I get the impression you might be more interested in getting an explanation of the combinatorics solution/formula than an outline of a brute force approach. If that's the case, say so and I'll delete this answer. But I'd also say that the question in that case would not be suitable for this site.
EDIT2: See also a combinatorics solution by Bill Daly and others. It is mathematically a little gentler than the other one.
I have not solved this problem for Project Euler and am going off of the question and the solution you provided. In the case of the single function, the methodology presented was ultimately simple pattern finding. The solver broke the presented question into three parts, based on the types of triangles that were present from the intersections. It's a fairly standard approach to this kind of problem: break the larger pattern down into smaller ones to make solving easier. The functions used to express the various forms of triangles I can only assume were generated with either a very acute pattern-finding mind or some number theory / geometry. That is also beyond the scope of this explanation and my knowledge. This problem has nothing to do with programming. It's basically entirely mathematics. If you read through the site you linked you can see the logic that is gone through to reach the questions.
