How does a SAT solver produce the model (assignment[s])?

Here is a very simple CNF instance: (x1 or x2 or x3) & (x1 or x2) & (x2 or x3). The formula is clearly satisfiable; one solution is x1 = x2 = x3 = 1. So my question is: how does the solver produce the assignment, using DPLL or another procedure? Thanks.

Basically, in the case of CDCL
(CDCL SAT solvers implement DPLL, but can learn new clauses and backtrack non-chronologically. Clause learning with conflict analysis does not affect soundness or completeness. Conflict analysis identifies new clauses using the resolution operation. Therefore each learnt clause can be inferred from the original clauses and other learnt clauses by a sequence of resolution steps. If cN is the new learnt clause, then ϕ is satisfiable if and only if ϕ ∪ {cN} is also satisfiable. Moreover, the modified backtracking step also does not affect soundness or completeness, since backtracking information is obtained from each new learnt clause.) (Source: Wikipedia)
it works as follows:
At first pick a branching variable, x1. A yellow circle means an arbitrary decision.
Now apply unit propagation, which yields that x4 must be 1 (i.e. True). A gray circle means a forced variable assignment during unit propagation. The resulting graph is called the implication graph.
Arbitrarily pick another branching variable, x3.
Apply unit propagation and find the new implication graph.
Here the variable x8 and x12 are forced to be 0 and 1, respectively.
Pick another branching variable, x2.
Find implication graph.
Pick another branching variable, x7.
Find implication graph.
Found a conflict!
Find the cut that led to this conflict. From the cut, find a conflicting condition.
Take the negation of this condition and make it a clause.
Add the conflict clause to the problem.
Non-chronological back jump to appropriate decision level.
Back jump and set variable values accordingly.
(Answer completely from Wikipedia: Conflict-Driven_Clause_Learning#Example)
Here is a (certainly incomplete) list of solvers that use the CDCL algorithm; you should check them out:
MiniSAT.
Zchaff SAT.
Z3.
ManySAT.
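For the small formula in the question, plain DPLL (which CDCL extends) is already enough to see where the model comes from: the assignment is simply the set of decisions and propagated literals recorded on the way to an empty clause set. The following is a minimal sketch, not how MiniSAT or any of the solvers listed above is implemented; the names `simplify` and `dpll` and the signed-integer clause encoding (3 for x3, -3 for not x3) are assumptions made for illustration.

```python
def simplify(clauses, literal):
    """Assign `literal` true: drop satisfied clauses, shrink the others.

    Returns None if some clause becomes empty (a conflict)."""
    result = []
    for clause in clauses:
        if literal in clause:
            continue  # clause is satisfied by this literal
        reduced = [l for l in clause if l != -literal]
        if not reduced:
            return None  # empty clause: conflict
        result.append(reduced)
    return result


def dpll(clauses, model=None):
    """Return a (possibly partial) satisfying assignment, or None if UNSAT."""
    model = dict(model or {})
    # Unit propagation: a one-literal clause forces that literal.
    while True:
        units = [c[0] for c in clauses if len(c) == 1]
        if not units:
            break
        lit = units[0]
        model[abs(lit)] = lit > 0
        clauses = simplify(clauses, lit)
        if clauses is None:
            return None
    if not clauses:
        return model  # every clause is satisfied: this is the model
    # Decision: branch on the first literal of the first clause.
    lit = clauses[0][0]
    for choice in (lit, -lit):
        reduced = simplify(clauses, choice)
        if reduced is not None:
            result = dpll(reduced, {**model, abs(choice): choice > 0})
            if result is not None:
                return result
    return None  # both branches failed


# (x1 or x2 or x3) & (x1 or x2) & (x2 or x3)
formula = [[1, 2, 3], [1, 2], [2, 3]]
model = dpll(formula)
print(model)  # {1: True, 2: True}; x3 is left unassigned
```

Variables missing from the returned dictionary (x3 here) are don't-cares: either value satisfies the formula, so x1 = x2 = x3 = 1 is one of the models this partial assignment covers.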

Related

How the Symbolic State Exploration works in Symbolic Model Checking

The following algorithm is a rough sketch of model checking with Computation Tree Logic (CTL):
It is stated that:
The model-checking problem for CTL is to verify for a given transition system TS and CTL formula Φ whether TS |= Φ... The basic procedure for CTL model checking is rather straightforward:
the set Sat(Φ) of all states satisfying Φ is computed recursively, and
it follows that TS |= Φ if and only if I ⊆ Sat(Φ)
where I is the set of initial states of TS...
The recursive computation of Sat(Φ) basically boils down to a bottom-up traversal of the parse tree of the CTL state formula Φ.
So essentially (from my understanding), you provide the system with a CTL formula Φ, which is a parse tree, and it then searches through the states and through the CTL parse tree, and checks whether any state satisfies Φ.
The question is:
In the Sat(Φ) method, roughly what happens (the symbolic stuff)? They say (2) below, where S is the set of states and A is the set of atomic propositions. I am wondering how they actually check the states, given that the program isn't actually running. It is (at least I think) Symbolic Model Checking. Could someone explain roughly how the state checking works? It seems like some sort of input generation has to occur, but at the same time I'm thinking maybe it shouldn't.
The reason for it being hard to understand for me is this. Say one of the assertions is for a function addTricky(x, y) which is implemented like this:
function addTricky(x, y) {
  if (y >= 1) return 3
  return x + y
}
Then I would have a Boolean expression in some logic that says "before addTricky : z = 0. after z = addTricky(x, y) : y >= 1 -> z = 3 ; y < 1 -> z = x + y".
Basically trying to get at the question of patterns. If Sat(Φ) is doing basically what I just did in that Boolean expression, I wonder if it ever calls/invokes the function addTricky, or if it can do it all symbolically somehow. I don't see how that works yet, wondering if the basics of how the symbolic execution works could be explained a bit. To me I keep imagining it doing some sort of Unit Testing, like plugging in addTricky(1, 1) for example, and checking all the possibilities. Maybe that is "explicit state exploration" vs. symbolic exploration, not sure.
Thank you so much for the help!
(1) For each node of the parse tree, i.e., for each subformula Ψ of Φ, the set Sat(Ψ) of states is computed for which Ψ holds.
(2) Sat(a) = {s ∈ S | a ∈ L(s)}, for any a ∈ A
I think there are two parts to your question: 1) How to go from a software function to a transition system and 2) how is the transition system used to check satisfaction.
1) A transition system is basically an extension of a finite state automaton. If you have a function like the one you described, you first need to transform it into a transition system. This can be done, for example, by introducing a state for each executable line of your code, and transitions between those states that follow the conditions of your code. At the transition system level you do not have the concept of a function call, so you need to take care of this during the translation, e.g. by inlining function definitions. This step is independent of how you verify the transition system. As you can imagine, this can lead to pretty large transition systems.
There are other approaches, that are not based on transition systems, that simulate the execution of the program and collect symbolic constraints along the way. Symbolic execution is such an example.
2) Let's say that you inline your addTricky function and get something along these lines
L0: z=0
if (y>=1)
L1: z=3
else
L2: z=x+y
A possible TS is:
(L0: z=0) --[y >= 1]--> (L1: z=3)
    |
  [y<1]
    |
    \/
(L2: z=x+y)
You have 3 executable statements and this leads to a TS whose symbolic states (S) are:
L0: Z=0; X=?; Y=?
L1: Z=3; X=?; Y>=1
L2: Z=X+Y; X=?; Y<1
where ? means any value. The power of this approach is that you can compactly represent all the values of X and Y in a single symbolic state.
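As a rough illustration of how Sat(a) from (2) can be evaluated over these three symbolic states without ever calling addTricky, here is a small sketch. The labelling `LABELS` (standing in for L), the proposition names such as `z_eq_3`, and the helper `sat_ex` for the CTL operator EX are assumptions made for this example; they are not taken from the quoted textbook.

```python
# Symbolic states of the inlined addTricky, keyed by program location.
# Each label set is an assumed labelling L(s): the atomic propositions
# that hold for every concrete x, y covered by that symbolic state.
LABELS = {
    "L0": {"z_eq_0"},
    "L1": {"z_eq_3", "y_ge_1"},
    "L2": {"z_eq_x_plus_y", "y_lt_1"},
}

# Transition relation of the TS drawn above.
TRANSITIONS = {"L0": {"L1", "L2"}, "L1": set(), "L2": set()}

INITIAL = {"L0"}


def sat_atomic(a):
    """Sat(a) = {s in S | a in L(s)}."""
    return {s for s, labels in LABELS.items() if a in labels}


def sat_ex(target):
    """Sat(EX phi): states with at least one successor in Sat(phi)."""
    return {s for s, succ in TRANSITIONS.items() if succ & target}


# Bottom-up over the parse tree of EX z_eq_3: first the leaf, then EX.
leaf = sat_atomic("z_eq_3")
result = sat_ex(leaf)
print(leaf, result, INITIAL <= result)
```

The check works on sets of states and labels only; no inputs are generated and the function is never executed, which is the difference from unit testing that the question asks about.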

Is there an error in this textbook about Peano Arithmetic?

I encountered this doubt in an online intro-logic open course offered by Stanford Uni.
Under the section 9.4 of this textbook here: http://logic.stanford.edu/intrologic/secondary/notes/chapter_09.html
It says:
The axioms shown here define the same relation in terms of 0 and s.(where the functional constant letter s below represents the successor function, e.g. s(0)=1, s(1)=2, s(2)=3 )
∀x.same(x,x)
∀x.(¬same(0,s(x)) ∧ ¬same(s(x),0))
∀x.∀y.(¬same(x,y) ⇒ ¬same(s(x),s(y)))
My understanding is:
The first sentence says two identical numbers are same. The second and third sentences are used to define what is not same.
The second says no successor of any number is same to 0.
The third says if two numbers are not the same, then their successors are not same. For example, if 1≠3, then 2≠4.
However, I think the third sentence should be bi-conditional because, if I'm not wrong, the definition doesn't cover the case where the numbers being tested are smaller than the given numbers; otherwise it is possible to say that if 2≠4, then 1=3.
So I wonder whether this is an error in the textbook or whether there's something wrong with my reasoning.
There is no error in this textbook. While the statement does hold in both directions, there is no need to state it as an axiom since the other direction follows from the functional property of the successor function and the three axioms listed in the textbook.
A formal proof would involve the axioms that define the successor function. Someone more accustomed to the use of automated provers or just a good student of logic might be able to complete such a formal proof.
Here is just a sketch of a proof. It uses the symbol "=" to denote term equality, i.e. u=v means u and v are syntactically identical terms written using the symbols 0 and s(). Also "u<v" means that u and v are both ground terms and u has strictly fewer applications of s() than v.
Suppose
∀x.∀y.(¬same(s(x),s(y)) ⇒ ¬same(x,y))
does not hold, then there exist some terms x0 and y0 such that
same(x0,y0) and ¬same(s(x0),s(y0)).
Since s is a function, it follows from ¬same(s(x0),s(y0)) and ∀x.same(x,x) that x0 and y0 are two different terms (if they were identical, s(x0) and s(y0) would be identical too, and the first axiom would give same(s(x0),s(y0))). First let us consider the case when x0 < y0; then y0 = s(...s(x0)) where there are n applications of s() and n > 0. The other case, when y0 < x0, can be handled similarly.
Substituting s(...s(x0)) for y0 in same(x0,y0) we get same(x0,s(...s(x0))).
Also x0 = s(...s(0)) where there are m applications of s() for some nonnegative integer m. Using the third axiom in the direction provided we can say that if same(s(u),s(v)) then same(u,v). Thus from same(x0,s(...s(x0))) we can "strip" m applications of s() to obtain
same(0,s(...s(0))) where there are n applications of s() in the second argument. This contradicts the second axiom. Q.E.D.
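The stripping step can be written out a little more explicitly. In the sketch below, s^k(0) is shorthand (introduced here, not in the textbook) for k applications of s to 0, with x0 = s^m(0) and y0 = s^{m+n}(0):

```latex
% Contrapositive of the third axiom
\forall x.\forall y.\,\bigl(\mathit{same}(s(x),s(y)) \Rightarrow \mathit{same}(x,y)\bigr)

% Applying it m times to same(x_0, y_0), where n > 0
\mathit{same}(s^{m}(0),\, s^{m+n}(0))
  \Rightarrow \mathit{same}(s^{m-1}(0),\, s^{m+n-1}(0))
  \Rightarrow \cdots
  \Rightarrow \mathit{same}(0,\, s^{n}(0))

% Second axiom instantiated with x = s^{n-1}(0)
\neg\,\mathit{same}(0,\, s(s^{n-1}(0)))
```

The last two lines contradict each other, which is the contradiction used in the sketch above.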

Example channelling constraints ECLiPSe

Can someone provide a simple example of channelling constraints?
Channelling constraints are used to combine viewpoints of a constraint problem. Handbook of Constraint Programming gives a good explanation of how it works and why it can be useful:
The search variables can be the variables of one of the viewpoints, say X1 (this is discussed further below). As
search proceeds, propagating the constraints C1 removes values from the domains of the
variables in X1. The channelling constraints may then allow values to be removed from
the domains of the variables in X2. Propagating these value deletions using the constraints
of the second model, C2, may remove further values from these variables, and again these
removals can be translated back into the first viewpoint by the channelling constraints. The
net result can be that more values are removed within viewpoint V1 than by the constraints
C1 alone, leading to reduced search.
I do not understand how this is implemented. What are these constraints exactly, and what do they look like in a real problem? A simple example would be very helpful.
As stated in Dual Viewpoint Heuristics for Binary Constraint Satisfaction Problems (P.A. Geelen):
Channelling constraints of two different models allows for the expression of a relationship between two sets of variables, one of each model.
This implies assignments in one of the viewpoints can be translated into assignments in the other and vice versa, as well as, when search initiates,
excluded values from one model can be excluded from the other as well.
Let me throw in an example I implemented a while ago while writing a Sudoku solver.
Classic viewpoint
Here we interpret the problem in the same way a human would: using the
integers between 1 and 9 and a definition that all rows, columns and blocks must contain every integer exactly once.
We can easily state this in ECLiPSe using something like:
% Domain
dim(Sudoku,[N,N]),
Sudoku[1..N,1..N] :: 1..N
% For X = rows, cols, blocks
alldifferent(X)
And this is already sufficient to solve the Sudoku puzzle.
Binary boolean viewpoint
One could however choose to represent integers by their binary boolean arrays (shown in the answer by @jschimpf). In case it's not clear what this does, consider the small example below (this is built-in functionality!):
?- ic_global:bool_channeling(Digit, [0,0,0,1,0], 1).
Digit = 4
Yes (0.00s cpu)
?- ic_global:bool_channeling(Digit, [A,B,C,D], 1), C = 1.
Digit = 3
A = 0
B = 0
C = 1
D = 0
Yes (0.00s cpu)
If we use this model to represent a Sudoku, every number will be replaced by its binary boolean array and corresponding constraints can be written. As they are trivial for this answer, I will not include all the code for the constraints, but a single sum constraint is already enough to solve a Sudoku puzzle in its binary boolean representation.
Channelling
Having these two viewpoints with corresponding constrained models now gives the opportunity to channel between them and see if any improvements were made.
Since both models are still just an NxN board, no difference in dimension of representation exists and channelling becomes real easy.
Let me first give you a last example of what a block filled with integers 1..9 would look like in both of our models:
% Classic viewpoint
1 2 3
4 5 6
7 8 9
% Binary Boolean Viewpoint
[](1,0,0,0,0,0,0,0,0) [](0,1,0,0,0,0,0,0,0) [](0,0,1,0,0,0,0,0,0)
[](0,0,0,1,0,0,0,0,0) [](0,0,0,0,1,0,0,0,0) [](0,0,0,0,0,1,0,0,0)
[](0,0,0,0,0,0,1,0,0) [](0,0,0,0,0,0,0,1,0) [](0,0,0,0,0,0,0,0,1)
We now clearly see the link between the models and simply write the code to channel our decision variables. Using Sudoku and BinBools as our boards, the code would look something like:
( multifor([Row,Col],1,N), param(Sudoku,BinBools,N)
do
    Value is Sudoku[Row,Col],
    ValueBools is BinBools[Row,Col,1..N],
    ic_global:bool_channeling(Value,ValueBools,1)
).
At this point, we have a channelled model where, during search, if values are pruned in one of the models, its impact will also occur in the other model. This can then of course lead to further overall constraint propagation.
Reasoning
To explain the usefulness of the binary boolean model for the Sudoku puzzle, we must first differentiate between the alldifferent/1 implementations provided by ECLiPSe: the forward checking version in the ic library and the bounds consistent version in the ic_global library.
What this means in practice can be shown as follows:
?- [A, B, C] :: [0..1], ic:alldifferent([A, B, C]).
A = A{[0, 1]}
B = B{[0, 1]}
C = C{[0, 1]}
There are 3 delayed goals.
Yes (0.00s cpu)
?- [A, B, C] :: [0..1], ic_global:alldifferent([A, B, C]).
No (0.00s cpu)
As no assignment has occurred yet, the forward checking version (ic library) does not detect that the query is invalid, whereas the bounds consistent version notices this immediately. This behaviour can lead to considerable differences in constraint propagation while searching and backtracking through highly constrained models.
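The reason the second query can fail without any assignment is a counting argument: three variables must take pairwise different values, but only the two values 0 and 1 are available. The sketch below checks exactly that kind of overfull interval. It is a simplified illustration of the idea behind bounds reasoning, not the algorithm ECLiPSe's ic_global library uses, and the name `violates_hall` is made up for this example.

```python
def violates_hall(domains):
    """Return True if some interval [lo, hi] must host more variables
    than it contains values, so alldifferent cannot be satisfied.

    `domains` is a list of non-empty sets of integers; only the bounds
    of each domain are inspected, as in bounds-consistency reasoning."""
    bounds = [(min(d), max(d)) for d in domains]
    lows = {lo for lo, _ in bounds}
    highs = {hi for _, hi in bounds}
    for lo in lows:
        for hi in highs:
            if hi < lo:
                continue
            # Variables whose whole domain lies inside [lo, hi].
            inside = sum(1 for l, h in bounds if lo <= l and h <= hi)
            if inside > hi - lo + 1:
                return True
    return False


# [A, B, C] :: 0..1 with alldifferent: three variables, two values.
print(violates_hall([{0, 1}, {0, 1}, {0, 1}]))  # True
# Two variables over 0..1 are fine.
print(violates_hall([{0, 1}, {0, 1}]))          # False
```

A forward checking propagator only reacts once a variable is assigned, so it has nothing to do in the first query and leaves the three delayed goals shown above.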
On top of these two libraries there is the ic_global_gac library intended for global constraints for which generalized arc consistency (also called hyper arc consistency or domain consistency) is maintained. This alldifferent/1 constraint provides even more pruning opportunities than the bounds consistent one, but preserving full domain consistency has its cost and using this library in highly constrained models generally leads to a loss in running performance.
Because of this, I found it interesting for the Sudoku puzzle to try and work with the bounds consistent (ic_global) implementation of alldifferent to maximise performance and subsequently try to approach domain consistency myself by channelling the binary boolean model.
Experiment results
Below are the backtrack results for the 'platinumblonde' Sudoku puzzle (referenced as being the hardest, most chaotic Sudoku puzzle to solve in The Chaos Within Sudoku, M. Ercsey-Ravasz and Z. Toroczkai) using respectively forward checking, bounds consistency, domain consistency, standalone binary boolean model and finally, the channelled model:
      (ic)    (ic_global)  (ic_global_gac)  (bin_bools)  (channelled)
BT    6582    43           29               143          30
As we can see, our channelled model (using bounds consistency (ic_global)) still needs one backtrack more than the domain consistent implementation, but it definitely performs better than the standalone bounds consistent version.
Now let us look at the running times (averaged over multiple executions to avoid extremes), excluding the forward checking implementation as it has proven to no longer be interesting for solving Sudoku puzzles:
           (ic_global)  (ic_global_gac)  (bin_bools)  (channelled)
Time (ms)  180          510              100          220
Looking at these results, I think we can successfully conclude the experiment (these results were confirmed by 20+ other Sudoku puzzle instances):
Channelling the binary boolean viewpoint to the bounds consistent standalone implementation produces a slightly less strong constraint propagation behaviour than that of the domain consistent standalone implementation, but with running times ranging from just as long to notably faster.
EDIT: attempt to clarify
Imagine some domain variable representing a cell on a Sudoku board has a remaining domain of 4..9. Using bounds consistency, it is guaranteed that for both value 4 and 9 other domain values can be found which satisfy all constraints and thus provides consistency. However, no consistency is explicitly guaranteed for other values in the domain (this is what 'domain consistency' is).
Using a binary boolean model, we define the following two sum constraints:
The sum of every binary boolean array is always equal to 1
The sum of every N'th element of every array in every row/col/block is always equal to 1
The extra constraint strength is enforced by the second constraint which, apart from constraining row, columns and blocks, also implicitly says: "every cell can only contain every digit once". This behaviour is not actively expressed when using just the bounds consistent alldifferent/1 constraint!
Conclusion
It is clear that channelling can be very useful to improve a standalone constrained model; however, if the new model's constraint strength is weaker than that of the current model, obviously no improvements will be made. Also note that having a more constrained model doesn't necessarily mean better overall performance! Adding more constraints will in fact decrease the number of backtracks required to solve a problem, but it might also increase the running time of your program if more constraint checks have to occur.
Channeling constraints are used when, in a model, aspects of a problem are represented in more than one way. They are then necessary to synchronize these multiple representations, even though they do not themselves model an aspect of the problem.
Typically, when modelling a problem with constraints, you have several ways of choosing your variables. For example, in a scheduling problem, you could choose to have
an integer variable for each job (indicating which machine does the job)
an integer variable for each machine (indicating which job it performs)
a matrix of Booleans (indicating which job runs on which machine)
or something more exotic
In a simple enough problem, you choose the representation that makes it easiest to formulate the constraints of the problem. However, in real life problems with many heterogeneous constraints it is often impossible to find such a single best representation: some constraints are best represented with one type of variable, others with another.
In such cases, you can use multiple sets of variables, and formulate each individual problem constraint over the most convenient variable set. Of course, you then end up with multiple independent subproblems, and solving these in isolation will not give you a solution for the whole problem. But by adding channeling constraints, the variable sets can be synchronized, and the subproblems thus re-connected. The result is then a valid model for the whole problem.
As hinted in the quote from the handbook, in such a formulation it is sufficient to perform search on only one of the variable sets ("viewpoints"), because the values of the others are implied by the channeling constraints.
Some common examples for channeling between two representations are:
Integer variable and Array of Booleans:
Consider an integer variable T indicating the time slot 1..N when an event takes place, and an array of Booleans Bs[N] such that Bs[T] = 1 iff an event takes place in time slot T. In ECLiPSe:
T #:: 1..N,
dim(Bs, [N]), Bs #:: 0..1,
Channeling between the two representations can then be set up with
( for(I,1,N), param(T,Bs) do Bs[I] #= (T#=I) )
which will propagate information both ways between T and Bs. Another way of implementing this channeling is the special purpose bool_channeling/3 constraint.
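To make "propagate information both ways" concrete, here is a small sketch of the propagation such a channeling constraint performs between the domain of T and the domains of the Booleans. It is an illustration written for this answer, not ECLiPSe's implementation; the function name `channel` and the representation of domains as Python sets are assumptions.

```python
def channel(domain, bools):
    """Propagate between an integer domain and its Boolean array.

    `domain` is the set of values still possible for T (within 1..N).
    `bools` is a list of N domains, each a subset of {0, 1}; entry i-1
    mirrors the truth value of T == i.
    Returns the pruned (domain, bools), or None if inconsistent."""
    domain = set(domain)
    bools = [set(b) for b in bools]
    changed = True
    while changed:
        changed = False
        for i in range(1, len(bools) + 1):
            b = set(bools[i - 1])
            d = set(domain)
            if i not in d:
                b.discard(1)   # T cannot be i, so Bs[i] must be 0
            if d == {i}:
                b.discard(0)   # T is fixed to i, so Bs[i] must be 1
            if b == {1}:
                d &= {i}       # Bs[i] = 1 forces T = i
            elif b == {0}:
                d.discard(i)   # Bs[i] = 0 removes i from T
            if not b or not d:
                return None    # a domain became empty
            if b != bools[i - 1] or d != domain:
                bools[i - 1], domain = b, d
                changed = True
    return domain, bools


# Boolean side to integer side: fixing the third Boolean to 1.
print(channel({1, 2, 3, 4}, [{0, 1}, {0, 1}, {1}, {0, 1}]))
# Integer side to Boolean side: value 2 was pruned from T.
print(channel({1, 3}, [{0, 1}, {0, 1}, {0, 1}]))
```

The first call ends with T fixed to 3 and all other Booleans fixed to 0; the second call fixes only the second Boolean to 0 and leaves the rest open.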
Start/End integer variables and Array of Booleans (timetable):
We have integer variables S,E indicating the start and end time of an activity. On the other side an array of Booleans Bs[N] such that Bs[T] = 1 iff the activity takes place at time T. In ECLiPSe:
[S,E] #:: 1..N,
dim(Bs, [N]), Bs #:: 0..1,
Channeling can be achieved via
( for(I,1,N), param(S,E,Bs) do Bs[I] #= (S#=<I and I#=<E) ).
Dual representation Job/Machine integer variables:
Here, Js[J] = M means that job J is executed on machine M, while the dual formulation Ms[M] = J means that machine M executes job J
dim(Js, [NJobs]), Js #:: 0..NMach,
dim(Ms, [NMach]), Ms #:: 1..NJobs,
And channeling is achieved via
( multifor([J,M],1,[NJobs,NMach]), param(Js,Ms) do
(Js[J] #= M) #= (Ms[M] #= J)
).
Set variable and Array of Booleans:
If you use a solver (such as library(ic_sets)) that can directly handle set-variables, these can be reflected into an array of booleans indicating membership of elements in the set. The library provides a dedicated constraint membership_booleans/2 for this purpose.
Here is a simple example. It works in SWI-Prolog, but should also work in ECLiPSe Prolog (in the latter you have to use (::)/2 instead of (in)/2):
Constraint C1:
?- Y in 0..100.
Y in 0..100.
Constraint C2:
?- X in 0..100.
X in 0..100.
Channelling Constraint:
?- 2*X #= 3*Y+5.
2*X#=3*Y+5.
All together:
?- Y in 0..100, X in 0..100, 2*X #= 3*Y+5.
Y in 1..65,
2*X#=3*Y+5,
X in 4..100.
So the channel constraint works in both directions: it reduces the domain of C1 as well as the domain of C2.
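The narrowed domains can be reproduced with a few lines of iterative bounds propagation. The sketch below is written specifically for 2*X #= 3*Y+5 and only illustrates the idea; it is not the propagation algorithm of SWI-Prolog's CLP(FD) library or of ECLiPSe, and the names `ceil_div` and `propagate` are made up for this example.

```python
def ceil_div(a, b):
    """Ceiling of a / b for a positive divisor b, using integers only."""
    return -(-a // b)


def propagate(x_bounds, y_bounds):
    """Narrow the bounds of X and Y under 2*X = 3*Y + 5 until nothing changes.

    Returns ((x_lo, x_hi), (y_lo, y_hi)), or None if a domain becomes empty."""
    x_lo, x_hi = x_bounds
    y_lo, y_hi = y_bounds
    while True:
        # X = (3*Y + 5) / 2, so the bounds of Y bound X.
        new_x_lo = max(x_lo, ceil_div(3 * y_lo + 5, 2))
        new_x_hi = min(x_hi, (3 * y_hi + 5) // 2)
        # Y = (2*X - 5) / 3, so the bounds of X bound Y.
        new_y_lo = max(y_lo, ceil_div(2 * new_x_lo - 5, 3))
        new_y_hi = min(y_hi, (2 * new_x_hi - 5) // 3)
        if new_x_lo > new_x_hi or new_y_lo > new_y_hi:
            return None
        new = (new_x_lo, new_x_hi, new_y_lo, new_y_hi)
        if new == (x_lo, x_hi, y_lo, y_hi):
            return (x_lo, x_hi), (y_lo, y_hi)
        x_lo, x_hi, y_lo, y_hi = new


print(propagate((0, 100), (0, 100)))  # ((4, 100), (1, 65))
```

Each round tightens one side using the other side's current bounds, which is why a purely iterative scheme can need many rounds when the coefficients are large.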
Some systems use iterative methods, with the result that this channelling
can take quite some time. Here is an example which needs around
1 minute to compute in SWI-Prolog:
?- time(([U,V] ins 0..1_000_000_000, 36_641*U-24 #= 394_479_375*V)).
% 9,883,559 inferences, 53.616 CPU in 53.721 seconds
(100% CPU, 184341 Lips)
U in 346688814..741168189,
36641*U#=394479375*V+24,
V in 32202..68843.
On the other hand ECLiPSe Prolog does it in a blink:
[eclipse]: U::0..1000000000, V::0..1000000000,
36641*U-24 #= 394479375*V.
U = U{346688814 .. 741168189}
V = V{32202 .. 68843}
Delayed goals:
-394479375 * V{32202 .. 68843} +
36641 * U{346688814 .. 741168189} #= 24
Yes (0.11s cpu)
Bye

3SAT solved in polynomial time?

I have seen a few errors in the CNF files, for both satisfiable and unsatisfiable instances, of the SATLIB Benchmark Problems.
To be more specific I have found out that the 1st file of the zip folder here:
20 variables, 91 clauses - 1000 instances, all satisfiable
contains a file titled "uf20-01", whose formula is clearly unsatisfiable, as the 7th clause (at the 15th line) and the 87th clause (at line number 4) are exact inverses of each other: (5 19 17) and (-5 -19 -17).
Thus an AND operation involving both of them at any point would make the formula unsatisfiable.
I have come to the conclusion that the formula is unsatisfiable if and only if two clauses are exact inverses of each other; otherwise it is satisfiable. I have tried another UNSAT file from the above link by trial and error, and though the MINISAT browser version also reports that file as unsatisfiable, I have found a solution for it, in 1's and 0's for every variable.
The algorithm above was posted to a journal by me but got rejected.
My question is :
Can somebody give me an example of an unsatisfiable 3SAT formula that contains only 3 variables (or maybe a few more) without any clause being an exact inverse of another?
If I can get such a formula then the algorithm is wrong (though it still proves many SAT benchmark problems to be UNSAT), and it would not prove that many UNSAT problems in the 1st link are indeed SAT.
This is teasing my mind and I hope you can all understand it: if the algorithm above is right, then I have proved P=NP! It could even start a revolution.
BTW: I have also sent an email to the SATLIB contact person concerning the file from the 2nd link, but there is still no reply after 2 days.
In 3-SAT in CNF, all clauses are OR-clauses and they are combined by AND. So the two lines you cite define the following two clauses:
x5 or x17 or x19
(not x5) or (not x17) or (not x19)
which can both be satisfied, for example, by setting x5 to true, x17 to false, and x19 arbitrary.
There are many:
(x1 or x2 or x3) and (not x1 or x2) and not x2 and not x3
In general you will need to introduce more variables to show this. But intuitively it does not even seem true that UNSAT requires some clause to be the exact inversion of another. As another answer points out, even in the most basic case the formula is still SAT when such a pair occurs. Perhaps the benchmark test set tends to have this property, but it does not generalize.
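Both points made in the answers can be checked by brute force over all assignments. The sketch below uses signed integers for literals, as in the DIMACS files from SATLIB; the helper names `satisfiable` and `has_inverse_pair` are made up for this example.

```python
from itertools import product


def satisfiable(clauses):
    """Try every assignment; return a satisfying model or None."""
    variables = sorted({abs(l) for c in clauses for l in c})
    for values in product([False, True], repeat=len(variables)):
        model = dict(zip(variables, values))
        # A clause holds if at least one of its literals is true.
        if all(any(model[abs(l)] == (l > 0) for l in c) for c in clauses):
            return model
    return None


def has_inverse_pair(clauses):
    """True if some clause is the literal-wise negation of another."""
    sets = [frozenset(c) for c in clauses]
    return any(frozenset(-l for l in c) in sets for c in sets)


# The two clauses cited from uf20-01: inverse of each other, yet satisfiable.
pair = [[5, 19, 17], [-5, -19, -17]]
print(has_inverse_pair(pair), satisfiable(pair) is not None)  # True True

# (x1 or x2 or x3) and (not x1 or x2) and not x2 and not x3:
# no inverse pair, yet unsatisfiable.
counterexample = [[1, 2, 3], [-1, 2], [-2], [-3]]
print(has_inverse_pair(counterexample), satisfiable(counterexample))  # False None
```

So an inverse pair is neither sufficient nor necessary for unsatisfiability, which rules out the proposed criterion in both directions.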

Finite-state transducer that computes the relation

From http://www.cse.ohio-state.edu/~gurari/theory-bk/theory-bk-twoli1.html#30007-23021r2.2.4:
Let M = <Q, Σ, Δ, δ, q0, F> be the deterministic finite-state transducer whose transition diagram is given in Figure 2.E.2.
For each of the following relations find a finite-state transducer that computes the relation.
a. { (x, y) | x is in L(M), and y is in Δ* }.
b. { (x, y) | x is in L(M), y is in Δ*, and (x, y) is not in R(M) }.
Yes, this is HW, but I have been struggling with these questions and could at least use pointers. If you want to create your own c. and/or d. examples just to show me HOW to do it rather than lead me to the answers for a. and b. then obviously I'm fine with that.
Thanks in advance!
Since you don't indicate what progress you've made so far, I'm going to assume that you've made no progress at all, and will give overall guidance for how you can approach this sort of problem.
First of all, examine the transition diagram. Do you understand what all the notations mean? Note that the transducer is described as deterministic. Do you understand what that means? Convince yourself that the transducer depicted in the transition diagram is, in fact, deterministic. Trace through it; try to get a sense for what inputs are accepted by the transducer, and what outputs it gives.
Next, figure out what L(M), Δ, and R(M) are for this transducer, since the questions refer to them. Do you know what those notations mean?
Do you know what it means for a transducer to compute a certain relation? Do you understand the { (x, y) | ... } notation for describing the relation?
Can you modify the transition diagram to eliminate the ε/0 transition and merge it into adjacent transitions (which then might output multiple symbols at a single transition)? (This can help, IMHO, with creating other transducers that accept the same input language. More so with part b, in this case, than part a.)
Describe for yourself the transducers you need to create, in a way that's independent of the original transducer. Will these transducers be deterministic?
Create the transition diagrams for these transducers.

Resources