Solving N-Queens Problem... How far can we go?

The N-Queens Problem:
This problem states that, given a chessboard of size N by N, find the different arrangements in which N queens can be placed on the board without any two threatening each other.
My question is:
What is the maximum value of N for which a program can calculate the answer in a reasonable amount of time? Or, what is the largest N we have seen so far?
Here is my program in CLP(FD) (Prolog):
% Constrain each element of the list to a row in 1..N.
generate([],_).
generate([H|T],N) :-
    H in 1..N,
    generate(T,N).

% lenlist(L,N): list L has length N.
lenlist(L,N) :-
    lenlist(L,0,N).
lenlist([],N,N).
lenlist([_|T],P,N) :-
    P1 is P+1,
    lenlist(T,P1,N).

% queens(N,L): element i of L is the row of the queen in column i,
% such that no two queens attack each other.
queens(N,L) :-
    generate(L,N), lenlist(L,N),
    safe(L),
    !,
    labeling([ffc],L).

% notattack(X,Xs,N): queen X attacks no queen in Xs, where the head
% of Xs is N columns away from X.
notattack(X,Xs) :-
    notattack(X,Xs,1).
notattack(_,[],_).
notattack(X,[Y|Ys],N) :-
    X #\= Y,      % not in the same row
    X #\= Y - N,  % not on the same rising diagonal
    X #\= Y + N,  % not on the same falling diagonal
    N1 is N + 1,
    notattack(X,Ys,N1).

safe([]).
safe([F|T]) :-
    notattack(F,T),
    safe(T).
This program works just fine, but the time it takes keeps increasing with N.
Here is a sample execution:
?- queens(4,L).
L = [2, 4, 1, 3] ;
L = [3, 1, 4, 2] ;
No
This means you place the 4 queens at row 2 in column 1, row 4 in column 2, row 1 in column 3, and row 3 in column 4 (on a 4 by 4 chessboard).
Now let's see how this program performs (time taken to compute the first solution):
For N=4,5,...,10: computes within a second
For N=11..30: takes between 1 and 3 seconds
For N=40..50: still computes within a minute
At N=60: it runs out of global stack (the search space being enormous).
This was a past homework problem. (The original assignment was just to code N-Queens.)
I am also interested in seeing alternate implementations in other languages (which perform better than mine), or whether there is room for improvement in my algorithm/program.

This discussion conflates three different computational problems: (1) finding one solution to the N-queens problem, (2) listing all solutions for some fixed N, and (3) counting all solutions for some fixed N. The first problem looks tricky at first for a board size such as N=8. However, as Wikipedia suggests, in some key ways it is easy when N is large. The queens on a large board don't communicate all that much. Except for memory constraints, a heuristic repair algorithm has an easier and easier job as N increases.
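For illustration, here is a minimal Python sketch of heuristic repair (min-conflicts); the helper names and the random tie-breaking are my own choices, not taken from any answer here:

import random

def conflicts(qs, row, col):
    # count queens attacking square (row, col), ignoring row's own queen
    return sum(1 for r, c in enumerate(qs)
               if r != row and (c == col or abs(c - col) == abs(r - row)))

def min_conflicts(n, max_steps=100000):
    # one queen per row; repeatedly move some attacked queen to the
    # least-conflicted column in its row
    qs = [random.randrange(n) for _ in range(n)]
    for _ in range(max_steps):
        attacked = [r for r in range(n) if conflicts(qs, r, qs[r])]
        if not attacked:
            return qs                        # solved: no queen is attacked
        r = random.choice(attacked)
        best = min(conflicts(qs, r, c) for c in range(n))
        qs[r] = random.choice([c for c in range(n)
                               if conflicts(qs, r, c) == best])
    return None                              # caller should restart

Even with this naive O(n) conflict count per test, boards with thousands of queens are reachable; serious implementations keep per-column and per-diagonal counters instead.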
Listing every solution is a different matter. That can probably be done with a good dynamic programming code up to a size that is large enough that there is no point in reading the output.
The most interesting version of the question is to count the solutions. The state of the art is summarized in a fabulous reference known as the On-Line Encyclopedia of Integer Sequences (OEIS). The count has been computed up to N=26. I would guess that that also uses dynamic programming, but, unlike the case of listing every solution, the algorithmic problem is much deeper and open to further advances.

Loren Pechtel said: "Now for some real insanity: 29 took 9 seconds.
30 took almost 6 minutes!"
This fascinating lack of predictability in backtrack complexity for different board sizes was the part of this puzzle that most interested me. For years I've been building a list of the 'counts' of algorithm steps needed to find the first solution for each board size, using the simple and well-known depth-first algorithm in a recursive C++ function.
Here's a list of all those 'counts' for boards up to N=49 ... minus N=46 and N=48, which are still works in progress:
http://queens.cspea.co.uk/csp-q-allplaced.html
(I've got that listed in the Encyclopedia of Integer Sequences (OEIS) as A140450)
That page includes a link to a list of the matching first solutions.
(My list of First Solutions is OEIS Sequence number A141843)
I don't primarily record how much processing time each solution demands; rather, I record how many failed queen placements were needed prior to discovery of each board's algorithmically-first solution. Of course the rate of queen placements depends on CPU performance, but given a quick test run on a particular CPU and a particular board size, it's an easy matter to calculate how long one of these 'found' solutions took.
For example, on an Intel Pentium D 3.4GHz CPU, using a single CPU thread -
For N=35 my program 'placed' 24 million queens per second and took just 6 minutes to find the first solution.
For N=47 my program 'placed' 20.5 million queens per second and took 199 days.
My current 2.8GHz i7-860 is thrashing through about 28.6 million queens per second, trying to find the first solution for N=48. So far it has taken over 550 days (theoretically, if it had never been interrupted) to UNsuccessfully place 1,369,331,731,000,000 (and rapidly climbing) queens.
My web site doesn't (yet) show any C++ code, but I do give a link on that web page to my simple illustration of each one of the 15 algorithm steps needed to solve the N=5 board.
It's a delicious puzzle indeed!

Which Prolog system are you using? For example, with recent versions of SWI-Prolog, you can readily find solutions for N=80 and N=100 within fractions of a second, using your original code. Many other Prolog systems will be much faster than that.
The N-queens problem is even featured in one of the online examples of SWI-Prolog, available as CLP(FD) queens in SWISH.
Example with 100 queens:
?- time((n_queens(100, Qs), labeling([ff], Qs))).
Qs = [1, 3, 5, 57, 59 | ...] .
2,984,158 inferences, 0.299 CPU in 0.299 seconds (100% CPU, 9964202 Lips)
SWISH also shows you nice images of solutions.
Here is an animated GIF showing the complete solution process for N=40 queens with SWI-Prolog:

A short solution presented by Raymond Hettinger at PyCon: Easy AI in Python.
#!/usr/bin/env python
from itertools import permutations
n = 12
cols = range(n)
for vec in permutations(cols):
    # a permutation already separates rows and columns;
    # the solution is valid if all diagonals are distinct too
    if (n == len(set(vec[i] + i for i in cols))
          == len(set(vec[i] - i for i in cols))):
        print vec
Computing all permutations is not scalable, though (O(n!)).

As to the largest N solved by computers: there are references in the literature in which a solution for N around 3*10^6 has been found using a conflict repair algorithm (i.e. local search). See for example the classical paper of [Sosic and Gu].
As to exact solving with backtracking, there exist some clever branching heuristics which achieve correct configurations with almost no backtracking. These heuristics can also be used to find the first k solutions to the problem: after finding an initial correct configuration, the search backtracks to find other valid configurations in the vicinity.
References for these almost perfect heuristics are [Kale 90] and [San Segundo 2011].

What is the maximum value of N for which a program can calculate the answer in reasonable amount of time? Or what is the largest N we have seen so far?
There is no limit. That is, checking the validity of a solution is more costly than constructing one solution plus its seven symmetric counterparts.
See Wikipedia:
"Explicit solutions exist for placing n queens on an n × n board for all n ≥ 4, requiring no combinatorial search whatsoever‌​.".

I dragged out an old Delphi program that counted the number of solutions for any given board size, did a quick modification to make it stop after one hit, and I'm seeing an odd pattern in the data:
The first board that took over 1 second to solve was n = 20. Yet 21 solved in 62 milliseconds. (Note: these timings are based on Delphi's Now function, not a high-precision timer.) 22 took 10 seconds, a time not to be repeated until 28.
I don't know how good the optimization is, as this was originally a highly optimized routine from back when the rules of optimization were very different. I did do one thing very differently from most implementations, though: it has no board. Rather, I track which columns and diagonals are attacked and add one queen per row. This means 3 array lookups per cell tested and no multiplication at all. (As I said, from when the rules were very different.) A sketch of that representation follows.
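The no-board representation is easy to reconstruct; here is a hedged Python sketch of the same idea (the names are mine, not from the Delphi original). A queen at (row, col) is registered in one column flag and two diagonal flags, so testing a cell is 3 array lookups with no multiplication:

def first_solution(n):
    cols = [False] * n               # attacked columns
    d1 = [False] * (2 * n - 1)       # attacked / diagonals, indexed by row + col
    d2 = [False] * (2 * n - 1)       # attacked \ diagonals, indexed by row - col + n - 1
    placed = []
    def place(row):
        if row == n:
            return True
        for col in range(n):
            # 3 array lookups per cell tested
            if cols[col] or d1[row + col] or d2[row - col + n - 1]:
                continue
            cols[col] = d1[row + col] = d2[row - col + n - 1] = True
            placed.append(col)
            if place(row + 1):
                return True
            placed.pop()
            cols[col] = d1[row + col] = d2[row - col + n - 1] = False
        return False
    return placed if place(0) else None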
Now for some real insanity: 29 took 9 seconds. 30 took almost 6 minutes!

Actually, a constrained random walk (generate and test), like what bakore outlined, is the way to go if you just need a handful of solutions, because these can be generated rapidly. I did this for a class when I was 20 or 21 and published the solution in the Journal of Pascal, Ada & Modula-2, March 1987, "The Queens Problem Revisited". I just dusted off the code from that article today (and it is very inefficient code), and after fixing a couple of problems I have been generating N=26 ... N=60 solutions.

If you only want one solution, then it can be found greedily in linear time, O(N). My code in Python:
import numpy as np

n = int(raw_input("Enter n: "))
rs = np.zeros(n, dtype=np.int64)          # rs[k] = row of the queen in column k
board = np.zeros((n, n), dtype=np.int64)
k = 0
if n % 6 == 2:
    # even rows first, then patch in 3, 1, the odds from 7 up, and finally 5
    for i in range(2, n+1, 2):
        rs[k] = i-1
        k += 1
    rs[k] = 3-1
    k += 1
    rs[k] = 1-1
    k += 1
    for i in range(7, n+1, 2):
        rs[k] = i-1
        k += 1
    rs[k] = 5-1
elif n % 6 == 3:
    # 4 first, the evens from 6 up, then 2, the odds from 5 up, then 1 and 3
    rs[k] = 4-1
    k += 1
    for i in range(6, n+1, 2):
        rs[k] = i-1
        k += 1
    rs[k] = 2-1
    k += 1
    for i in range(5, n+1, 2):
        rs[k] = i-1
        k += 1
    rs[k] = 1-1
    k += 1
    rs[k] = 3-1
else:
    # all other n: the even rows followed by the odd rows
    for i in range(2, n+1, 2):
        rs[k] = i-1
        k += 1
    for i in range(1, n+1, 2):
        rs[k] = i-1
        k += 1
for i in range(n):
    board[rs[i]][i] = 1
print "\n"
for i in range(n):
    for j in range(n):
        print board[i][j],
    print
Printing the board takes O(N^2) time, however, and Python is a slow language, so anyone can try implementing this in other languages like C/C++ or Java. But even with Python it will get the first solution for n=1000 within 1 or 2 seconds.

Related

Number of ways to represent a number as a sum of K numbers in subset S

Let the set S be {1 , 2 , 4 , 5 , 10}
Now I want to find the number of ways to represent x as a sum of K numbers from the set S. (A number can be included any number of times.)
if x = 10 and k = 3
Then the answer should be 2 => (5,4,1), (4,4,2)
The order of the numbers doesn't matter, i.e. (4,4,2) and (4,2,4) count as one.
I did some research and found that the set can be represented as a polynomial x^1+x^2+x^4+x^5+x^10, and after raising the polynomial to the power K, the coefficients of the product polynomial give the answer.
But that answer counts (4,4,2) and (4,2,4) as unique terms, which I don't want.
Is there any way to make (4,4,2) and (4,2,4) count as the same term?
This is NP-complete, a variant of the subset-sum problem as described here.
So frankly, I don't think you can solve it with a non-exponential (iterating through all combinations) solution, without any restrictions on the problem input (such as a maximum number range, etc.).
Without any restrictions on the problem domain, I suggest iterating through all possible k-set instances (as described in the pseudo-polynomial-time dynamic programming solution) and seeing which are solutions.
Checking whether 2 solutions are identical is nothing compared to the complexity of the overall algorithm, so a hash of the solution's set elements will work just fine:
E.g. hash-order-insensitive(4,4,2) == hash-order-insensitive(4,2,4) => check the whole set; otherwise the solutions are distinct.
PS: you can also describe step-by-step your current solution.
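If the only problem with the polynomial (or brute-force) approach is that it counts orderings separately, you can instead force the chosen numbers to be non-decreasing, which counts each multiset exactly once. A minimal Python sketch under that framing (the function names are mine):

from functools import lru_cache

S = (1, 2, 4, 5, 10)   # the example set from the question

def count_multisets(x, k):
    # ways(rem, parts, i): write rem as `parts` numbers drawn from S[i:],
    # where each number may repeat; drawing from a suffix kills orderings
    @lru_cache(maxsize=None)
    def ways(rem, parts, i):
        if parts == 0:
            return 1 if rem == 0 else 0
        if i == len(S) or rem <= 0:
            return 0
        # either use S[i] (and possibly use it again), or move on to S[i+1]
        return ways(rem - S[i], parts - 1, i) + ways(rem, parts, i + 1)
    return ways(x, k, 0)

print(count_multisets(10, 3))   # -> 2: (1,4,5) and (2,4,4)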

Finding minimum number of days

I got this question as part of an interview and I am still unable to solve it.
It goes like this:
A person has to complete N units of work; the nature of the work is the same.
In order to get the hang of the work, he completes only one unit of work on the first day.
He wishes to celebrate the completion of the work, so he decides to complete one unit of work on the last day.
Given that he is only allowed to complete x, x+1 or x-1 units of work in a day, where x is the units of work completed on the previous day, how many minimum days will he take to complete N units of work?
Sample Input:
6
-1
0
2
3
9
13
Here, line 1 represents the number of input test cases.
Sample Output:
0
0
2
3
5
7
Each number represents the minimum days required for each input in the sample input.
I tried the coin-change approach but was not able to make it work.
In 2k days, it's possible to do at most 2T(k) work (where T(k) is the k-th triangular number). In 2k+1 days, it's possible to do at most T(k+1)+T(k) work. That's because with an even number of days (2k), the most work is 1+2+3+...+k + k+(k-1)+...+3+2+1. Similarly, with an odd number of days (2k+1), the most work is 1+2+3+...+k+(k+1)+k+...+3+2+1.
Given this pattern, it's possible to reduce the amount of work to any value (greater than 1) -- simply reduce the work done on the day with the most work, never picking the start or end day. This never invalidates the rule that the amount of work on one day is never more than 1 difference from an adjacent day.
Call this function F. That is:
F(2k) = 2T(k)
F(2k+1) = T(k)+T(k+1)
Recall that T(k) = k(k+1)/2, so the equations simplify:
F(2k) = k(k+1)
F(2k+1) = k(k+1)/2 + (k+1)(k+2)/2 = (k+1)^2
Armed with these observations, you can solve the original problem by finding the smallest number of days where it's possible to do at least N units of work. That is, the smallest d such that F(d) >= N.
You can, for example, use binary search to find d, or solve the equations directly. The minimal even solution has d/2 * (d/2 + 1) >= N, which you can solve as a quadratic equation, and the minimal odd solution has (d+1)^2/4 >= N, which has the solution d = ceil(2*sqrt(N) - 1). Once you've found the minimal even and odd solutions, pick the smaller of the two.
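A minimal Python sketch of that search; the linear scan over d stands in for the binary search or the closed form, and the handling of N <= 0 simply mirrors the sample output above:

def F(d):
    # maximum total work in d days: F(2k) = k(k+1), F(2k+1) = (k+1)^2
    k = d // 2
    return k * (k + 1) if d % 2 == 0 else (k + 1) ** 2

def min_days(N):
    if N <= 0:
        return 0               # the sample maps -1 and 0 to 0 days
    d = 1
    while F(d) < N:
        d += 1
    return d

print([min_days(n) for n in (-1, 0, 2, 3, 9, 13)])   # [0, 0, 2, 3, 5, 7]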
Since you want the minimum number of days, you would just keep doing x+1; but the last day must be 1 unit, so at some point you have to break and go x-1. So now we have to determine the breakpoint.
The breakpoint is located in the middle of the days, since you start at 1 and want to end at 1.
For example, if you have to do 16 units, you distribute your days like:
Work done:
1234321
7 days worked.
When you can't make an even breakpoint like above, repeat some values:
5 = 1211
Samples:
2: 11
3: 111
9: 12321
13: 1234321
If you need to do exactly N units, not more, then you could use dynamic programming on a matrix a[x][y], where x is the amount of work done on the last day, y is the total amount of work done, and a[x][y] is the minimum number of days needed. You could use Dijkstra's algorithm to minimize a[1][N]. Begin with a[1][1]=1. A sketch of this search follows.
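Since every transition costs exactly one day, Dijkstra on a[x][y] degenerates to a breadth-first search; here is a hedged Python sketch of that state search (the encoding is mine):

from collections import deque

def min_days_exact(N):
    # states are (work done on the last day, total work so far)
    if N <= 0:
        return 0
    if N == 1:
        return 1                       # a[1][1] = 1
    seen = {(1, 1)}
    queue = deque([(1, 1, 1)])         # (last, total, days)
    while queue:
        last, total, days = queue.popleft()
        for nxt in (last - 1, last, last + 1):
            if nxt < 1 or total + nxt > N:
                continue
            if nxt == 1 and total + nxt == N:
                return days + 1        # finished on a 1-unit day
            state = (nxt, total + nxt)
            if state not in seen:
                seen.add(state)
                queue.append((nxt, total + nxt, days + 1))
    return -1                          # unreachable for N >= 1

This is only workable for small N (the state space is large); the closed-form analysis above is the one to use for big inputs.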

This puzzle - subset sum?

In today's edition of the Guardian (a newspaper in the UK), in the "Pyrgic puzzles" section on page 43 by Chris Maslanka, the following puzzle was given:
The 3 wise men ... went to Herrods to do their Christmas shopping. Caspar bought Gold, Melchior bought Frankincense, and Balthazar bought a copy of the Daily Myrrh. The cashier tapped in the number of euros each of these things cost, and meant to add the three numbers, but multiplied them instead. ... the marvel of the thing was that the result was exactly the same: €65.52. What were the three sums [I assume he meant the three numbers]?
My interpretation is: Find an a, b and c such that a + b + c = abc = 65.52 (exactly) where a, b and c are positive decimal numbers with no more than two decimal places. It follows that a, b and c must also be less than 65.52 (approximately).
My approach is thus: I shall find all the candidate sets of a, b and c where a + b + c = 6552 and a, b and c are integers from {1 ... 6550} (Notionally I have multiplied all the operands by 100 for convenience). Then, for all the candidate sets, it is trivial to satisfy the other condition by dividing all the operands by 100 then multiplying them (with arbitrary-precision arithmetic).
This, as I see it, is an instance of the subset sum problem. So I implemented a dirty (exponential time) algorithm which found one distinct solution: a=0.52, b=2, c=63.
Ok, there are better algorithms for the subset sum problem, but don't you think this is getting a little out-of-reach for an average Guardian reader?
On page 40 the answer is listed:
This is easy, by trial and error. Guess 52p for the Daily Myrrh. But multiplying by 0.52 is roughly halving, so we need one sum to be about 2; so try 2 × 63 × 0.52. Et voilà. Is this answer unique?
Well, we know that the answer is unique (disregarding the other permutations of 2, 63 and 0.52).
What I want to know is: How can this be "easy"? Am I right in characterising the puzzle as an instance of the subset sum problem? Have I overlooked some characteristic of the puzzle which can be utilized to simplify the solution? Was anyone able to adopt a similar "trial and error" approach and if so can they take me through it? Is Chris Maslanka simply undaunted by NP-complete problems?
No, it is not an instance of the subset-sum problem, because:
The subset size is limited to 3, making naive exhaustive search an O(n^3) solution in the worst case (not exponential).
There is additional data here: the product of the numbers.
You are not actually given an arbitrary set; taking the set of all integers in a range is just a special case of subset-sum, and a much easier one.
The important thing to understand here is: if a problem can be solved by an NP-hard problem, that doesn't mean it is NP-hard as well. The other way around holds: if you have a problem, and you can solve some NP-hard problem (polynomially) with it, then your problem is NP-hard. This is called a polynomial reduction (1).
The approach is easy because all you have to do is "guess" (by iterating over all candidates) a value for a, and from that you can derive the possible solutions for b and c (two variables, two equations once a is known, and in each iteration it is); thus the solution is even linear, not only sub-exponential.
It might even be optimized with a variation of binary search to get a sub-linear solution, but I cannot think of that optimization at the moment.
(1) Note: this is an intuitive explanation, not a formal definition.
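Working in integer cents, "guess a, then solve for b and c" turns into a quadratic whose roots must be integers; here is a minimal Python sketch of that linear scan (the cents framing is mine):

from math import isqrt

S = 6552          # 65.52 euros, in cents
P = 65_520_000    # product condition: (a/100)*(b/100)*(c/100) = 65.52

solutions = set()
for a in range(1, S - 1):
    if P % a:
        continue                     # b*c must be an integer
    s2, p2 = S - a, P // a           # b + c and b * c
    disc = s2 * s2 - 4 * p2          # b, c are roots of t^2 - s2*t + p2 = 0
    if disc < 0:
        continue
    r = isqrt(disc)
    if r * r != disc or (s2 + r) % 2:
        continue                     # roots must be integers
    b, c = (s2 + r) // 2, (s2 - r) // 2
    if c >= 1:
        solutions.add(tuple(sorted((a, b, c))))

for a, b, c in solutions:
    print(a / 100, b / 100, c / 100)   # -> 0.52 2.0 63.0

The scan turns up only one triple, consistent with the uniqueness noted above.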

Programming problem - Game of Blocks

Maybe you would have an idea on how to solve the following problem.
John decided to buy his son Johnny some mathematical toys. One of his favorite toys is blocks of different colors. John has decided to buy blocks of C different colors. For each color he will buy a googol (10^100) blocks. All blocks of the same color are of the same length, but blocks of different colors may vary in length.
Johnny has decided to use these blocks to make a large 1 x n block. He wonders how many ways he can do this. Two ways are considered different if there is a position where the color differs. The example shows a red block of size 5, a blue block of size 3 and a green block of size 3; it shows there are 12 ways of making a large block of length 11.
Each test case starts with an integer 1 ≤ C ≤ 100. The next line consists of C integers; the i-th integer 1 ≤ len_i ≤ 750 denotes the length of the i-th color. The next line is a positive integer N ≤ 10^15.
This problem should be solved in 20 seconds for T ≤ 25 test cases. The answer should be calculated MOD 100000007 (a prime).
It can be reduced to a matrix exponentiation problem, which can be solved relatively efficiently in O(max(len_i)^2.376 * log N) using the Coppersmith-Winograd algorithm and fast exponentiation. But it seems that a more efficient algorithm is required, as Coppersmith-Winograd implies a large constant factor. Do you have any other ideas? It could possibly be a number theory or divide-and-conquer problem.
Firstly, note that the number of blocks of each colour you have is a complete red herring, since 10^100 > N always. So the number of blocks of each colour is practically infinite.
Now notice that at each position p (if there is a valid configuration that leaves no spaces, etc.) there must be a block of some colour c. There are len[c] ways for this block to lie so that it still covers position p.
My idea is to try all possible colours and positions at a fixed position (N/2, since it halves the range), and then for each case there are b cells before this fixed coloured block and a cells after it. So if we define a function ways(i) that returns the number of ways to tile i cells (with ways(0)=1), then the number of ways to tile with a fixed colour block at a position is ways(b)*ways(a). Adding up all possible configurations yields the answer for ways(i).
Now I chose the fixed position to be N/2 since that halves the range, and you can halve a range at most ceil(log(N)) times. Since you are moving a block about N/2, you will have to calculate from N/2-750 to N/2+750, where 750 is the max length a block can have. So you will have to calculate about 750*ceil(log(N)) (a bit more because of the variance) lengths to get the final answer.
So in order to get good performance you have to throw in memoisation, since this is inherently a recursive algorithm.
So using Python (since I was lazy and didn't want to write a big number class):
T = int(raw_input())
for case in xrange(T):
    #read in the data
    C = int(raw_input())
    lengths = map(int, raw_input().split())
    minlength = min(lengths)
    n = int(raw_input())
    #setup memoisation, note all lengths less than the minimum length are
    #set to 0 as the algorithm needs this
    memoise = {}
    memoise[0] = 1
    for length in xrange(1, minlength):
        memoise[length] = 0
    def solve(n):
        global memoise
        if n in memoise:
            return memoise[n]
        ans = 0
        for i in xrange(C):
            if lengths[i] > n:
                continue
            if lengths[i] == n:
                ans += 1
                ans %= 100000007
                continue
            for j in xrange(0, lengths[i]):
                b = n/2-lengths[i]+j
                a = n-(n/2+j)
                if b < 0 or a < 0:
                    continue
                ans += solve(b)*solve(a)
                ans %= 100000007
        memoise[n] = ans
        return memoise[n]
    solve(n)
    print "Case %d: %d" % (case+1, memoise[n])
Note I haven't exhaustively tested this, but I'm quite sure it will meet the 20-second time limit if you translate this algorithm to C++ or some such language.
EDIT: Running a test with N = 10^15 and a block of length 750, I get that memoise contains about 60000 elements, which means the non-lookup part of solve(n) is called about the same number of times.
A word of caution: in the case c=2, len1=1, len2=2, the answer will be the N-th Fibonacci number, and the Fibonacci numbers grow (approximately) exponentially with a growth factor of the golden ratio, phi ~ 1.61803399. For the huge value N=10^15, the answer will be about phi^(10^15), an enormous number. The answer will have storage requirements on the order of (ln(phi^(10^15))/ln(2)) / (8 * 2^40) ~ 79 terabytes. Since you can't even access 79 terabytes in 20 seconds, it's unlikely you can meet the speed requirements in this special case.
Your best hope occurs when C is not too large and leni is large for all i. In such cases, the answer will still grow exponentially with N, but the growth factor may be much smaller.
I recommend that you first construct the integer matrix M which computes the (i+1, ..., i+k) terms of your sequence from the (i, ..., i+k-1) terms (only row k+1 of this matrix is interesting). Compute the first k entries "by hand", then calculate M^(10^15) with the repeated-squaring trick and apply it to terms (0, ..., k-1).
The (integer) entries of the matrix will grow exponentially, perhaps too fast to handle. If this is the case, do the very same calculation, but modulo p, for several moderate-sized prime numbers p. This will allow you to obtain your answer modulo p, for various p, without using a matrix of bigints. After using enough primes so that you know their product is larger than your answer, you can use the so-called "Chinese remainder theorem" to recover your answer from your mod-p answers.
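Since the problem statement already fixes the prime modulus 100000007, a single modulus suffices and the CRT step can be skipped; here is a minimal Python sketch of the companion-matrix approach (the function names are mine):

MOD = 100000007

def mat_mult(A, B):
    k = len(A)
    return [[sum(A[i][t] * B[t][j] for t in range(k)) % MOD
             for j in range(k)] for i in range(k)]

def mat_pow(M, e):
    k = len(M)
    R = [[int(i == j) for j in range(k)] for i in range(k)]   # identity
    while e:                       # repeated squaring
        if e & 1:
            R = mat_mult(R, M)
        M = mat_mult(M, M)
        e >>= 1
    return R

def count_tilings(lengths, n):
    # ways(i) = sum over colours c of ways(i - len[c]); ways(0) = 1
    k = max(lengths)
    ways = [0] * k
    ways[0] = 1
    for i in range(1, k):
        ways[i] = sum(ways[i - L] for L in lengths if L <= i) % MOD
    if n < k:
        return ways[n]
    # companion matrix sends (ways(i-k+1..i)) to (ways(i-k+2..i+1))
    M = [[0] * k for _ in range(k)]
    for r in range(k - 1):
        M[r][r + 1] = 1
    for L in lengths:
        M[k - 1][k - L] += 1       # colours of equal length each count once
    Mn = mat_pow(M, n - k + 1)
    return sum(Mn[k - 1][j] * ways[j] for j in range(k)) % MOD

print(count_tilings([1, 2], 10))   # -> 89, the Fibonacci-style count

With k = max(len_i) up to 750, each naive multiply here is O(k^3); a compiled language (and the symmetry tricks discussed below) would be needed to meet the 20-second limit.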
I'd like to build on the earlier @JPvdMerwe solution with some improvements. In his answer, @JPvdMerwe uses a dynamic programming / memoisation approach, which I agree is the way to go on this problem. Dividing the problem recursively into two smaller problems and remembering previously computed results is quite efficient.
I'd like to suggest several improvements that would speed things up even further:
Instead of going over all the ways the block in the middle can be positioned, you only need to go over the first half and multiply the solution by 2, because the second half of the cases are symmetrical. For odd-length blocks you would still need to take the centered position as a separate case.
In general, iterative implementations can be several magnitudes faster than recursive ones. This is because a recursive implementation incurs bookkeeping overhead for each function call. It can be a challenge to convert a solution to its iterative cousin, but it is usually possible. The @JPvdMerwe solution can be made iterative by using a stack to store intermediate values.
Modulo operations are expensive, as are multiplications to a lesser extent. The number of multiplications and modulos can be decreased by approximately a factor of C=100 by switching the color loop with the position loop. This allows you to add the return values of several calls to solve() before doing a multiplication and modulo.
A good way to test the performance of a solution is with a pathological case. The following could be especially daunting: length 10^15, C=100, prime block sizes.
Hope this helps.
In the above answer
ans += 1
ans %= 100000007
could be much faster without a general modulo:
ans += 1
if ans == 100000007: ans = 0
Please see TopCoder thread for a solution. No one was close enough to find the answer in this thread.

Tips for Project Euler Problem #78

This is the problem in question: Problem #78
This is driving me crazy. I've been working on this for a few hours now, and I've been able to reduce the complexity of finding the number of ways to stack n coins to O(n/2), but even with those improvements, and starting from an n for which p(n) is close to one million, I still can't reach the answer in under a minute. Not at all, actually.
Are there any hints that could help me with this?
Keep in mind that I don't want a full solution and there shouldn't be any functional solutions posted here, so as not to spoil the problem for other people. This is why I haven't included any code either.
Wikipedia can help you here. I assume the solution you already have is a recursion such as the one in the section "Intermediate function". This can be used to find the solution to the Euler problem, but isn't fast.
A much better way is to use the recursion based on the pentagonal number theorem in the next section. The proof of this theorem isn't straightforward, so I don't think the authors of the problem expect you to come up with the theorem by yourself. Rather, it is one of the problems where they expect some literature search.
This problem is really asking you to find the first term in the sequence of integer partition counts that is divisible by 1,000,000.
A partition of an integer n is a way of writing n as a sum of positive integers ≤ n, regardless of order. The function p(n) denotes the number of partitions of n. Below we show our 5 "coins" as addends to enumerate the 7 partitions of 5, that is p(5)=7.
5 = 5
= 4+1
= 3+2
= 3+1+1
= 2+2+1
= 2+1+1+1
= 1+1+1+1+1
We use a generating function to create the series until we find the required n.
The generating function requires at most 500 so-called generalized pentagonal numbers, given by n(3n – 1)/2 with 0, ± 1, ± 2, ± 3…, the first few of which are 0, 1, 2, 5, 7, 12, 15, 22, 26, 35, … (Sloane’s A001318).
We have the following generating function which uses our pentagonal numbers as exponents:
1 - q - q^2 + q^5 + q^7 - q^12 - q^15 + q^22 + q^26 + ...
My blog at blog.dreamshire.com has a Perl program that solves this in under 10 seconds.
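Without giving the full solution away, here is a minimal Python sketch of just the pentagonal-number recurrence, computing p(n) modulo 10^6; the search for the first n with p(n) ≡ 0 (mod 10^6) is left to the reader:

MOD = 10**6

def partition_counts(limit):
    # Euler: p(n) = sum_{k>=1} (-1)^(k+1) * ( p(n - k(3k-1)/2) + p(n - k(3k+1)/2) )
    p = [1]                                   # p(0) = 1
    for n in range(1, limit + 1):
        total, k = 0, 1
        while True:
            g = k * (3 * k - 1) // 2          # generalized pentagonal number
            if g > n:
                break
            sign = 1 if k % 2 else -1
            total += sign * p[n - g]
            g2 = k * (3 * k + 1) // 2         # its +/- partner
            if g2 <= n:
                total += sign * p[n - g2]
            k += 1
        p.append(total % MOD)
    return p

print(partition_counts(5))   # [1, 1, 2, 3, 5, 7] -> p(5) = 7, as above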
Have you done problems 31 or 76 yet? They form a nice set that is a generalization of the same base problem each time. Doing the easier questions may give you insight into a solution for 78.
Here are some hints:
Divisibility by one million is not the same thing as just being larger than one million. 1 million = 1,000,000 = 10^6 = 2^6 * 5^6.
So the question is to find the lowest n such that the factors of p(n) contain six 2's and six 5's.
