Prolog: where to begin solving Minesweeper-like puzzle? - prolog

I need to write something like a minesweeper in prolog. I am able to do that in "normal" language but when i try to start coding with prolog i totally don't know how to start.
I need some sort of tips.
Input specification:
Board size: m × n (m, n ∈ {1,...,16}), list of triples (i, j, k), where i ∈ {1,...,m}, j ∈ {1,...,n}, k ∈ {1,...,8}) describing fields with numbers.
For example:
5
5
[(1,1,1), (2,3,3), (2,5,2), (3,2,2), (3,4,4), (4,1,1), (4,3,1), (5,5,2)].
Output: list of digits and the atoms * (for treasure) and (for blank fields). It is a representation of puzzle solution.
Rules of this puzzle:
In 20 fields of a board there are hidden treasures. A digit in a field represents how many neighbour-fields have a treasure. There are no treasures in fields with a digit. Mark all fields with a treasure.
You need to guess how many treasures are hidden in diagonals.
I would be grateful for any tips. I don't want full solution, I want to write it by my own, but without clues I am not able to do that.

A matrix is usually handled as a list of list, you can build using length/2 and findall/3. A matrix of empty variables (where you will place values while guessing....)
build_matrix(NRows, NCols, Mat) :-
findall(Row, (between(1, NRows, _), length(Row, NCols)), Mat).
Accessing elements via coordinates can be done using nth1 (see here for another answer where you can find some detail: see cell/3).
Then you place all your triples constraints: there is finite number of ways of consuming the 'hidden treasure' counters, let Prolog search all the ways, enumerating adjacents.
Process the list of triples, placing each counter in compatible cells, with a recursive predicate. When the list ends, you have a guess.
To keep your code simpler, don't worry about indexes out of matrix bounds, remember that failures are 'normal' when searching...

Related

Doubts about Alphabetic complexity

I'm beginner in Description complexity so i have some doubts to resolve many exercises. Maybe, can someone explainde me how to resolve this problem? also the main idea that involves this problem and maybe some interesting lectures. Thanks in advance.
Alphabetic complexity
Rank the following structures based on their complexity:
d j x k
a a a a
p q r s
a b c d
w x y z
By default, a series of four letters requires 4 × log(26) bits, but we can do much better for several of these sequences (one needs log2(N) bits in general to represent one object taken among N objects; if the set of objects is ordered, the complexity of the nth one is at most log2(n)).
Provide indications showing how the complexity of the above structures can be computed. Be precise. Imagine a simple "machine" with a few avilable operations such as copy, increment, search, last; the machine may read several lists: list of operations, alphabet, list of lists. One needs a few bits to designate a list, plus more bits to designate an element in that list. The problem is to find the smallest designation of the object.

"Squares" Logic Riddle Solution in Prolog

"Squares"
Input:
Board size m × n (m, n ∈ {1,...,32}), list of triples (i, j, k), where i ∈ {1,...,m}, j ∈ {1,...,n}, k ∈ {0,...,(n-2)·(m-2)}) describing fields with numbers.
Output:
List of triples (i, j, d) showing solved riddle. Triple (i, j, d) describes square with opposite vertices at coordinates (i, j) and (i+d, j+d).
Example:
Input:
7.
7. [(3,3,0), (3,5,0), (4,4,1), (5,1,0), (6,6,3)].
Output:
[(1,1,2), (1,5,2), (2,2,4), (5,1,2), (4,4,3)]
Image:
Explanation:
I have to find placement for x squares(x = fields with numbers). On the
circuit of the each square, exactly at one of his corners should be
only one digit equal to amount of digits inside the square. Sides of
squares can't cover each other, same as corners. Square lines are
"filling fields" so (0,0,1) square fills 4 fields and have 0 fields inside.
I need a little help coding solution to this riddle in Prolog. Could someone direct me in the right direction? What predicates, rules I should use.
Since Naimads life is worth saving(!), here is some more directive help.
Programmers tend to work out their projects either in bottomp-up (starting from "lower" levels of functionality, building up to a complete application) or top-down (starting with a high level "shell" of functionality that provides input/output, but with the solution processing stubbed in for later refinement).
Either way I can recommend an approach for Prolog progrmming that has been championed by the Ruby programming community: Test Driven Development.
Let's pick a couple of functional requirement, one high and one low, and discuss how this bit of philosophy might help.
The high level predicate in Prolog is one which takes all the input and output arguments and semantically defines the problem. Note that we can write down such a predicate without giving any care as to how the solution process will get implemented:
/* squaresRiddle/4 takes inputs of Height and Width of a "board" with
a list of (nonnegative) integers located at cells of the board, and
produces a corresponding list of squares having one corner at each
of those locations "boxing in" exactly the specified counts of them
in their interiors. No edges can overlap nor corners coincide. */
squaresRiddle(Height,Width,BoxedCountList, OutputSquaresList).
So far we have just framed the problem. The useful progress is given by the comment (clearly defining the inputs and outputs at a high level), and the code at this point just guarantees "success" with whatever arguments are passed. So a "test" of this code:
?= squaresRiddle(7,7,ListIn,ListOut).
won't produce anything except free variables.
Next let's give some thought to how the input list and output list ought to be represented. In the Question it is proposed to use triples of integers as entries of these lists. I would recommend instead using a named functor, because the semantics of those input triples (giving an x,y coordinate of a cell with a count) are subtly different from that of the output triples (giving an x,y coordinate of upper left corner and an "extent" (height&width) of the square). Note in particular that an output square may be described by a different corner than the one in the corresponding input item, although the one in the input item needs to be one of the four corners of the output square.
So it tends to make the code more readable if these constructs are distinguished by simple functor names, e.g. BoxedCountList = [box(3,3,0),box(3,5,0),box(4,4,1),box(5,1,0),box(6,6,3)] vs. OutputSquaresList = [sqr(1,1,2), sqr(1,5,2), sqr(2,2,4), sqr(5,1,2), sqr(4,4,3)].
Let's give a little thought to how the search for solutions will proceed. The output items are in 1-to-1 correspondence with the input items, so the lists will have equal length in the end. However choosing the kth output item depends not only on the kth input item, but on all the input items (since we count how many are in the interior of the output square) and on the preceding output items (since we disallow corners that meet and edges that overlap).
There are several ways to manage the flow of decision making consistent with this requirement, and it seems to be the crux of this exercise to pick one approach and make it work. If you are familiar with difference lists or accumulators, you have a head start on ways to do it.
Now let's switch to discussing some lower level functionality. Given an input box there are potentially a number of output sqr that could correspond to it. Given a positive "extent" for the output, the X,Y coordinates of the input can be met with any of the four corners of the output square. While more checking is required to verify the solution, this aspect seems a good place to start testing:
/* check that coordinates X,Y are indeed a corner of candidate square */
hasCorner(sqr(SX,SY,Extent),X,Y) :-
(SX is X ; SX is X + Extent),
(SY is Y ; SY is Y + Extent).
How would we make use of this test? Well we could generate all possible squares (possibly limiting ourselves to those that fit inside the height and width of our "board"), and then check whether any have the required corner. This might be fairly inefficient. A better way (as far as efficiency goes) would use the corner to generate possible squares (by extending from the corner by Extent in any of four directions) subject to the parameters of board size. Implementation of this "optimization" is left as an exercise for the reader.

Finding number of possible sequences in an array, with additional conditions

There is a sequence {a1, a2, a3, a4, ..... aN}. A run is the maximal strictly increasing or strictly decreasing continuous part of the sequence. Eg. If we have a sequence {1,2,3,4,7,6,5,2,3,4,1,2} We have 5 possible runs {1,2,3,4,7}, {7,6,5,2}, {2,3,4}, {4,1} and {1,2}.
Given four numbers N, M, K, L. Count the number of possible sequences of N numbers that has exactly M runs, each of the number in the sequence is less than or equal to K and difference between the adjacent numbers is less than equal to L
The question was asked during an interview.
I could only think of a brute force solution. What is an efficient solution for this problem?
Use dynamic programming. For each number in the substring maintain separate count of maximal increasing and maximally decreasing subsequences. When you incrementally add a new number to the end you can use these counts to update the counts for the new number. Complexity: O(n^2)
This can be rephrased as a recurrence problem. Look at your problem as finding #(N, M) (assume K and L are fixed, they are used in the recurrence conditions, so propagate accordingly). Now start with the more restricted count functions A(N, M; a) and D(N, M, a), where A counts those sets with last run ascending, D counts those with last run descending, and a is the value of the last element in the set.
Express #(N, M) in terms of A(N, M; a) and D(N, M; a) (it's the sum over all allowable a). You might note that there are relations between the two (like the reflection A(N, M; a) = D(N, M; K-a)) but that won't matter much for the calculation except to speed table filling.
Now A(N, M; a) can be expressed in terms of A(N-1, M; w), A(N-1, M-1; x), D(N-1, M; y) and D(N-1, M-1; z). The idea is that if you start with a set of size N-1 and know the direction of the last run and the value of the last element, you know whether adding element a will add to an existing run or add a run. So you can count the number of possible ways to get what you want from the possibilities of the previous case.
I'll let you write this recursion down. Note that this is where you account for L (only add up those that obey the L distance restriction) and K (look for end cases).
Terminate the recursion using the fact that A(1, 1; a) = 1, A(1, x>1; a) = 0 (and similarly for D).
Now, since this is a multiple recursion, be sure your implementation stores results in a table and begins by trying lookup (commonly called dynamic programming).
I suppose you mean by 'brute force solution' what I might mean by 'straightforward solution involving nested-loops over N,M,K,L' ? Sometimes the straightforward solution is good enough. One of the times when the straightforward solution is good enough is when you don't have a better solution. Another of the times is when the numbers are not very large.
With that off my chest I would write the loops in the reverse direction, or something like that. I mean:
Create 2 auxiliary data structures, one to contain the indices of the numbers <=K, one for the indices of the numbers whose difference with their neighbours is <=L.
Run through the list of numbers and populate the foregoing auxiliary data structures.
Find the intersection of the values in those 2 data structures; these will be the indices of interesting places to start searching for runs.
Look in each of the interesting places.
Until someone demonstrates otherwise this is the most efficient solution.

How can write the median of medians algorithm in prolog? [duplicate]

Am writing a logic program for kth_largest(Xs,K) that implements the linear algorithm for finding the
kth largest element K of a list Xs. The algorithm has the following steps:
Break the list into groups of five elements.
Efficiently find the median of each of the groups, which can be done with a fixed number of
comparisons.
Recursively find the median of the medians.
Partition the original list with respect to the median of the medians.
Recursively find the kth largest element in the appropriate smaller list.
How do I go about it? I can select an element from a list but I don't know how to get the largest using the above procedure.Here is my definition for selecting an element from a list
select(X; HasXs; OneLessXs)
% The list OneLessXs is the result of removing
% one occurrence of X from the list HasXs.
select(X,[X|Xs],Xs).
select(X,[Y|Ys],[Y|Zs]) :- select(X,Ys,Zs).
I'm going to jump in since no one has attempted an Answer, and hopefully shed some light on the procedure to be programmed.
I've found the Wikipedia article on Selection algorithm to be quite helpful in understanding the bigger picture of "fast" (worst-case linear time) algorithms of this type.
But what you asked at the end of your Question is a somewhat simpler matter. You wrote "How do i go about it? I can select an element from a list but i dont know how to get the largest using the above procedure." (emphasis added by me)
Now there seems to be a bit of confusion about whether you want to implement "the above procedure", which is a general recipe for finding a kth largest element by successive searches for medians, or whether you ask how to use that recipe to find simply the largest element (a special case). Note that the recipe doesn't specifically use a step of finding the largest element on its way to locating the median or the kth largest element.
But you give the code to find an element of a list and the rest of that list after removing that element, a predicate that is nondeterministic and allows backtracking through all members of the list.
The task of finding the largest element is deterministic (at least if all the elements are distinct), and it is an easier task than the general selection of the kth largest element (a task associated with order statistics among other things).
Let's give some simple, hopefully obviously correct, code to find the largest element, and then talk about a more optimized way of doing it.
maxOfList(H,[H|T]) :- upperBound(H,T), !.
maxOfList(X,[_|T]) :- maxOfList(X,T).
upperBound(X,[ ]).
upperBound(X,[H|T]) :-
X >= H,
upperBound(X,T).
The idea should be understandable. We look at the head of the list and ask if that entry is an upper bound for the rest of the list. If so, that must be the maximum value and we're done (the cut makes it deterministic). If not, then the maximum value must occur later in the list, so we discard the head and continue recursively searching for an entry that is an upper bound of all the subsequent elements. The cut is essential here, because we must stop at the first such entry in order to know it is a maximum of the original list.
We've used an auxiliary predicate upperBound/2, which is not unusual, but the overall complexity of this implementation is worst-case quadratic in the length of the list. So there is room for improvement!
Let me pause here to be sure I'm not going totally off-track in trying to address your question. After all you may have meant to ask how to use "the above procedure" to find the kth largest element, and so what I'm describing may be overly specialized. However it may help to understand the cleverness of the general selection algorithms to understand the subtle optimization of the simple case, finding a largest element.
Added:
Intuitively we can reduce the number of comparisons needed in the worst case
by going through the list and keeping track of the largest value found "so
far". In a procedural language we can easily accomplish this by reassigning
the value of a variable, but Prolog doesn't allow us to do that directly.
Instead a Prolog way of doing this is to introduce an extra argument and
define the predicate maxOfList/2 by a call to an auxiliary predicate
with three arguments:
maxOfList(X,[H|T]) :- maxOfListAux(X,H,T).
The extra argument in maxOfListAux/3 can then be used to track the
largest value "so far" as follows:
maxOfListAux(X,X,[ ]).
maxOfListAux(Z,X,[H|T]) :-
( X >= H -> Y = X ; Y = H ),
maxOfListAux(Z,Y,T).
Here the first argument of maxOfListAux represents the final answer as to
the largest element of the list, but we don't know that answer until we
have emptied the list. So the first clause here "finalizes" the answer
when that happens, unifying the first argument with the second argument
(the largest value "so far") just when the tail of the list has reached
the end.
The second clause for maxOfListAux leaves the first argument unbound and
"updates" the second argument accordingly as the next element of the list
exceeds the previous largest value or not.
It isn't strictly necessary to use an auxiliary predicate in this case,
because we might have kept track of the largest value found by using the
head of the list instead of an extra argument:
maxOfList(X,[X]) :- !.
maxOfList(X,[H1,H2|T]) :-
( H1 >= H2 -> Y = H1 ; Y = H2 ),
maxOfList(X,[Y|T]).

Dynamic Programming: Sum-of-products

Let's say you have two lists, L1 and L2, of the same length, N. We define prodSum as:
def prodSum(L1, L2) :
ans = 0
for elem1, elem2 in zip(L1, L2) :
ans += elem1 * elem2
return ans
Is there an efficient algorithm to find, assuming L1 is sorted, the number of permutations of L2 such that prodSum(L1, L2) < some pre-specified value?
If it would simplify the problem, you may assume that L1 and L2 are both lists of integers from [1, 2, ..., N].
Edit: Managu's answer has convinced me that this is impossible without assuming that L1 and L2 are lists of integers from [1, 2, ..., N]. I'd still be interested in solutions that assume this constraint.
I want to first dispell a certain amount of confusion about the math, then discuss two solutions and give code for one of them.
There is a counting class called #P which is a lot like the yes-no class NP. In a qualitative sense, it is even harder than NP. There is no particular reason to believe that this counting problem is any better than #P-hard, although it could be hard or easy to prove that.
However, many #P-hard problems and NP-hard problems vary tremendously in how long they take to solve in practice, and even one particular hard problem can be harder or easier depending on the properties of the input. What NP-hard or #P-hard mean is that there are hard cases. Some NP-hard and #P-hard problems also have less hard cases or even outright easy cases. (Others have very few cases that seem much easier than the hardest cases.)
So the practical question could depend a lot on the input of interest. Suppose that the threshold is on the high side or on the low side, or you have enough memory for a decent number of cached results. Then there is a useful recursive algorithm that makes use of two ideas, one of them already mentioned: (1) After partially assigning some of the values, the remaining threshold for list fragments may rule out all of the permutations, or it may allow all of them. (2) Memory permitting, you should cache the subtotals for some remaining threshold and some list fragments. To improve the caching, you might as well pick the elements from one of the lists in order.
Here is a Python code that implements this algorithm:
list1 = [1,2,3,4,5,6,7,8,9,10,11]
list2 = [1,2,3,4,5,6,7,8,9,10,11]
size = len(list1)
threshold = 396 # This is smack in the middle, a hard value
cachecutoff = 6 # Cache results when up to this many are assigned
def dotproduct(v,w):
return sum([a*b for a,b in zip(v,w)])
factorial = [1]
for n in xrange(1,len(list1)+1):
factorial.append(factorial[-1]*n)
cache = {}
# Assumes two sorted lists of the same length
def countprods(list1,list2,threshold):
if dotproduct(list1,list2) <= threshold: # They all work
return factorial[len(list1)]
if dotproduct(list1,reversed(list2)) > threshold: # None work
return 0
if (tuple(list2),threshold) in cache: # Already been here
return cache[(tuple(list2),threshold)]
total = 0
# Match the first element of list1 to each item in list2
for n in xrange(len(list2)):
total += countprods(list1[1:],list2[:n] + list2[n+1:],
threshold-list1[0]*list2[n])
if len(list1) >= size-cachecutoff:
cache[(tuple(list2),threshold)] = total
return total
print 'Total permutations below threshold:',
print countprods(list1,list2,threshold)
print 'Cache size:',len(cache)
As the comment line says, I tested this code with a hard value of the threshold. It is quite a bit faster than a naive search over all permutations.
There is another algorithm that is better than this one if three conditions are met: (1) You don't have enough memory for a good cache, (2) the list entries are small non-negative integers, and (3) you're interested in the hardest thresholds. A second situation to use this second algorithm is if you want counts for all thresholds flat-out, whether or not the other conditions are met. To use this algorithm for two lists of length n, first pick a base x which is a power of 10 or 2 that is bigger than n factorial. Now make the matrix
M[i][j] = x**(list1[i]*list2[j])
If you compute the permanent of this matrix M using the Ryser formula, then the kth digit of the permanent in base x tells you the number of permutations for which the dot product is exactly k. Moreover, the Ryser formula is quite a bit faster than the summing over all permutations directly. (But it is still exponential, so it does not contradict the fact that computing the permanent is #P-hard.)
Also, yes it is true that the set of permutations is the symmetric group. It would be great if you could use group theory in some way to accelerate this counting problem. But as far as I know, nothing all that deep comes from that description of the question.
Finally, if instead of exactly counting the number of permutations below a threshold, you only wanted to approximate that number, then probably the game changes completely. (You can approximate the permanent in polynomial time, but that doesn't help here.) I'd have to think about what to do; in any case it isn't the question posed.
I realized that there is another kind of caching/dynamic programming that is missing from the above discussion and the above code. The caching implemented in the code is early-stage caching: If just the first few values of list1 are assigned to list2, and if a remaining threshold occurs more than once, then the cache allows the code to reuse the result. This works great if the entries of list1 and list2 are integers that are not too large. But it will be a failed cache if the entries are typical floating point numbers.
However, you can also precompute at the other end, when most of the values of list1 have been assigned. In this case, you can make a sorted list of the subtotals for all of the remaining values. And remember, you can use up list1 in order, and do all of the permutations on the list2 side. For example, suppose that the last three entries of list1 are [4,5,6], and suppose that three of the values in list2 (somewhere in the middle) are [2.1,3.5,3.7]. Then you would cache a sorted list of the six dot products:
endcache[ [2.1, 3.5, 3.7] ] = [44.9, 45.1, 46.3, 46.7, 47.9, 48.1]
What does this do for you? If you look in the code that I did post, the function countprods(list1,list2,threshold) recursively does its work with a sub-threshold. The first argument, list1, might have been better as a global variable than as an argument. If list2 is short enough, countprods can do its work much faster by doing a binary search in the list endcache[list2]. (I just learned from stackoverflow that this is implemented in the bisect module in Python, although a performance code wouldn't be written in Python anyway.) Unlike the head cache, the end cache can speed up the code a lot even if there are no numerical coincidences among the entries of list1 and list2. Ryser's algorithm also stinks for this problem without numerical coincidences, so for this type of input I only see two accelerations: Sawing off a branch of the search tree using the "all" test and the "none" test, and the end cache.
Probably not (without the simplifying assumption): your problem is NP-Hard. Here's a trivial reduction to SUBSET-SUM. Let count_perms(L1, L2, x) represent the function "count the number of permutations of L2 such that prodSum(L1, L2) < x"
SUBSET_SUM(L2,n): # (determine if any subset of L2 adds up to n)
For i in [1,...,len(L2)]
Set L1=[0]*(len(L2)-i)+[1]*i
calculate count_perms(L1,L2,n+1)-count_perms(L1,L2,n)
if result positive, return true
Return false
Thus, if there were a way to calculate your function count_perms(L1, L2, x) efficiently, then we would have an efficient algorithm to calculate SUBSET_SUM(L2,n).
This also turns out to be an abstract algebra problem. It's been awhile for me, but here's a few things to get started. There's nothing terribly significant about the following (it's all very basic; an expansion on the fact that every group is isomorphic to a permutation group), but it provides a different way of looking at the problem.
I'll try to stick to fairly standard notation: "x" is a vector, and "xi" is the ith component of x. If "L" is a list, L is the equivalent vector. "1n" is a vector with all components = 1. The set of natural numbers ℕ is taken to be the positive integers. "[a,b]" is the set of integers from a through b, inclusive. "θ(x, y)" is the angle formed by x and y
Note prodSum is the dot product. The question is equivalent to finding all vectors L generated by an operation (permuting elements) on L2 such that θ(L1, L) less than a given angle α. The operation is equivalent to reflecting a point in ℕn through a subspace with presentation:
< ℕn | (xixj-1)(i,j) ∈ A >
where i and j are in [1,n], A has at least one element and no (i,i) is in A (i.e. A is a non-reflexive subset of [1,n]2 where |A| > 0). Stated more plainly (and more ambiguously), the subspaces are the points where one or more components are equal to one or more other components. The reflections correspond to matrices whose columns are all the standard basis vectors.
Let's name the reflection group "RPn" (it should have another name, but memory fails). RPn is isomorphic to the symmetric group Sn. Thus
|RPn| = |Sn| = n!
In 3 dimensions, this gives a group of order 6. The reflection group is D3, the triangle symmetry group, as a subgroup of the cube symmetry group. It turns out you can also generate the points by rotating L2 in increments of π/3 around the line along 1n. This is the the modular group ℤ6 and this points to a possible solution: find a group of order n! with a minimal number of generators and use that to generate the permutations of L2 as sequences with increasing, then decreasing, angle with L2. From there, we can try to generate the elements L with θ(L1, L) < α directly (for example we can binsearch on the 1st half of each sequence to find the transition point; with that, we can specify the rest of the sequence that fulfills the condition and count it in O(1) time). Let's call this group RP'n.
RP'4 is constructed of 4 subspaces isomorphic to ℤ6. More generally, RP'n is constructed of n subspaces isomorphic to RP'n-1.
This is where my abstract algebra muscles really begins to fail. I'll try to keep working on the construction, but Managu's answer doesn't leave much hope. I fear that reducing RP3 to ℤ6 is the only useful reduction we can make.
It looks like if l1 and l2 are both ordered high->low (or low->high, whatever, if they have the same order), the result is maximized, and if they are ordered oposite, the result is minimized, and other alterations of order appear to follow some rules; swapping two numbers in a continuous list of integers always reduces the sum by a fixed amount which seems to be related to their distance apart (ie swapping 1 and 3 or 2 and 4 have the same effect). This was just from a little messing around, but the idea is that there is a maximum, a minimum, and if some-pre-specified-value is between them, there are ways to count the permutations that make that possible (although; if the list isn't evenly spaced, then there aren't. Well, not that I know of. If l2 is (1 2 4 5) swapping 1 2 and 2 4 would have different effects)

Resources