Suppose a state in the DFA has a shift/reduce conflict, i.e., both a shift and a reduce apply. Let the next symbol be "t" and suppose the state contains the following items:
X -> F.
Y -> F.tG
and t belongs to FOLLOW(X).
What should I do in this case?
I know that by definition this is not an SLR(1) grammar, but according to the algorithm shown at https://imgur.com/a/yxy9L48, what should the algorithm do? Should it report an error?
The algorithm says to report an error if neither a shift nor a reduce applies, but what happens if both apply?
You should have detected this error when you attempted to construct the parser: while filling in the ACTION table, the generator finds both a shift and a reduce for the entry (state, t). The SLR parser generation algorithm must fail, reporting a conflict, at that point; the parsing algorithm you linked only ever runs on a conflict-free table, so it never faces this choice.
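To see where, here is a rough Python sketch (my own illustration; the single-state table and item encoding are hypothetical) of the table construction detecting exactly the situation you describe:

# FOLLOW sets; t is in FOLLOW(X), as in the question
FOLLOW = {'X': {'t'}, 'Y': {'$'}}

# Items of the problematic state, as (head, body, dot position):
#   X -> F .        complete item
#   Y -> F . t G    terminal t right after the dot
state_items = [('X', ('F',), 1), ('Y', ('F', 't', 'G'), 1)]

ACTION = {}

def set_action(state, symbol, action):
    key = (state, symbol)
    if key in ACTION and ACTION[key] != action:
        raise ValueError('conflict at %s: %s vs %s' % (key, ACTION[key], action))
    ACTION[key] = action

for head, body, dot in state_items:
    if dot == len(body):                  # complete item: reduce on FOLLOW(head)
        for a in FOLLOW[head]:
            set_action(0, a, ('reduce', head, body))
    elif body[dot].islower():             # lowercase = terminal here: shift
        set_action(0, body[dot], ('shift',))

# Raises: conflict at (0, 't'): ('reduce', ...) vs ('shift',)
# i.e. generation fails before the parsing algorithm ever sees the table.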
The following is a rough sketch of model checking with Computation Tree Logic (CTL). It is stated that:
The model-checking problem for CTL is to verify for a given transition system TS and CTL formula Φ whether TS |= Φ... The basic procedure for CTL model checking is rather straightforward:
the set Sat(Φ) of all states satisfying Φ is computed recursively, and
it follows that TS |= Φ if and only if I ⊆ Sat(Φ)
where I is the set of initial states of TS...
The recursive computation of Sat(Φ) basically boils down to a bottom-up traversal of the parse tree of the CTL state formula Φ.
So essentially (from my understanding), you provide the system with a CTL formula Φ, which is a parse tree; it then searches through the states and through the CTL parse tree, and checks which states satisfy Φ.
The question is:
What happens, roughly, in the Sat(Φ) method (the symbolic part)? They state (2) below, where S is the set of states and A the set of atomic propositions. I'm wondering how the states are actually checked, given that the program isn't actually running. This is (at least I think) symbolic model checking. Could someone explain roughly how the state checking works? It seems like some sort of input generation has to occur, but at the same time I suspect it shouldn't.
The reason this is hard for me to understand is the following. Say one of the assertions is for a function addTricky(x, y) which is implemented like this:
function addTricky(x, y) {
if (y >= 1) return 3
return x + y
}
Then I would have a Boolean expression in some logic that says "before addTricky: z = 0. after z = addTricky(x, y): y >= 1 -> z = 3; y < 1 -> z = x + y".
Basically I'm trying to get at the question of patterns. If Sat(Φ) is doing essentially what I just did in that Boolean expression, I wonder whether it ever calls/invokes the function addTricky, or whether it can do it all symbolically somehow. I don't see how that works yet, so I'm wondering if the basics of how the symbolic execution works could be explained a bit. I keep imagining it doing some sort of unit testing, like plugging in addTricky(1, 1) and checking all the possibilities. Maybe that is "explicit state exploration" vs. symbolic exploration; I'm not sure.
Thank you so much for the help!
(1) For each node of the parse tree, i.e., for each subformula Ψ of Φ, the set Sat(Ψ) of states is computed for which Ψ holds.
(2) Sat(a) = {s ∈ S | a ∈ L(s)}, for any a ∈ A
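For concreteness, the recursion described in the quote can be sketched as follows (a rough Python sketch for a tiny CTL fragment with atoms, negation, conjunction, and EX; the three-state transition system and the tuple encoding of the parse tree are made up for illustration):

S = {'s0', 's1', 's2'}                            # states
R = {('s0', 's1'), ('s1', 's2'), ('s2', 's2')}    # transitions
L = {'s0': {'a'}, 's1': {'a', 'b'}, 's2': {'b'}}  # labelling: atoms per state

def sat(phi):
    # phi is a parse tree encoded as tuples:
    # ('atom', a) | ('not', f) | ('and', f, g) | ('EX', f)
    op = phi[0]
    if op == 'atom':                     # Sat(a) = {s in S | a in L(s)}
        return {s for s in S if phi[1] in L[s]}
    if op == 'not':                      # Sat(~f) = S \ Sat(f)
        return S - sat(phi[1])
    if op == 'and':                      # Sat(f & g) = Sat(f) intersect Sat(g)
        return sat(phi[1]) & sat(phi[2])
    if op == 'EX':                       # some successor satisfies f
        target = sat(phi[1])
        return {s for s in S if any((s, t) in R for t in target)}
    raise ValueError('unknown operator: %r' % op)

I = {'s0'}                               # initial states
phi = ('EX', ('and', ('atom', 'a'), ('atom', 'b')))
print(I <= sat(phi))                     # True: s0 -> s1 and L(s1) = {a, b}

Nothing is ever executed here; each case is just a set computation over S, which is why TS |= Φ reduces to the subset check I ⊆ Sat(Φ).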
I think there are two parts to your question: 1) how to go from a software function to a transition system, and 2) how the transition system is used to check satisfaction.
1) A transition system is basically an extension of a finite-state automaton. If you have a function like the one you described, you first need to transform it into a transition system. This can be done, for example, by introducing a state for each executable line of your code, and transitions between those states that follow the conditions in your code. At the transition-system level there is no concept of a function call, so you need to take care of this during the translation, e.g., by inlining function definitions. This step is independent of how you verify the transition system. As you can imagine, this can lead to pretty large transition systems.
There are other approaches, not based on transition systems, that simulate the execution of the program and collect symbolic constraints along the way; symbolic execution is one example.
2) Let's say that you inline your addTricky function and get something along these lines:
L0: z=0
if (y>=1)
L1: z=3
else
L2: z=x+y
A possible TS is:
(L0: z=0) --[y >= 1]--> (L1: z=3)
|
[y<1]
\/
(L2: z=x+y)
You have 3 executable statements, and this leads to a TS whose symbolic states (S) are:
L0: Z=0; X=?; Y=?
L1: Z=3; X=?; Y>=1
L2: Z=X+Y; X=?; Y<1
where ? means any value. The power of this approach is that you can compactly represent all the values of X and Y in a single symbolic state.
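To connect this to the "does it ever invoke addTricky?" part of the question: no, nothing is executed. Questions about a symbolic state are discharged by a constraint solver. A rough sketch using the z3 Python bindings (the z3 calls are real API; the encoding of state L2 as constraints is my own illustration):

from z3 import Ints, Solver, sat   # pip install z3-solver

x, y, z = Ints('x y z')
s = Solver()
s.add(z == x + y, y < 1)           # the constraints of symbolic state L2
s.add(z == 3)                      # question: can L2 ever end with z = 3?
if s.check() == sat:
    print('yes, witness:', s.model())   # e.g. x = 3, y = 0, z = 3
else:
    print('no concrete run of L2 yields z = 3')

One solver query covers every concrete (x, y) pair at once, which is exactly what the unit-test-style enumeration you imagined would otherwise have to approximate.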
Here is a very simple CNF instance: (x1 or x2 or x3) & (x1 or x2) & (x2 or x3). The formula is definitely satisfiable; one solution is x1 = x2 = x3 = 1, and that is enough. So my question is: how does the solver produce the assignment, using DPLL or another procedure? Thanks.
Well, basically, for the case of CDCL
(CDCL SAT solvers implement DPLL, but can learn new clauses and backtrack non-chronologically. Clause learning with conflict analysis does not affect soundness or completeness. Conflict analysis identifies new clauses using the resolution operation. Therefore each learnt clause can be inferred from the original clauses and other learnt clauses by a sequence of resolution steps. If cN is the new learnt clause, then ϕ is satisfiable if and only if ϕ ∪ {cN} is also satisfiable. Moreover, the modified backtracking step also does not affect soundness or completeness, since backtracking information is obtained from each new learnt clause.) (Source: Wikipedia)
it works as follows:
At first, pick a branching variable, x1. (A yellow circle in the Wikipedia figures marks an arbitrary decision.)
Now apply unit propagation, which yields that x4 must be 1 (i.e., true). (A gray circle marks an assignment forced during unit propagation.) The resulting graph is called the implication graph.
Arbitrarily pick another branching variable, x3.
Apply unit propagation and find the new implication graph.
Here the variables x8 and x12 are forced to be 0 and 1, respectively.
Pick another branching variable, x2.
Find implication graph.
Pick another branching variable, x7.
Find implication graph.
Found a conflict!
Find the cut that led to this conflict. From the cut, find a conflicting condition.
Take the negation of this condition and make it a clause.
Add the conflict clause to the problem.
Non-chronologically back-jump to the appropriate decision level.
Back jump and set variable values accordingly.
(Answer completely from Wikipedia: Conflict-Driven_Clause_Learning#Example)
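To answer the original question of where the assignment comes from more directly: in plain DPLL (CDCL's ancestor, without the clause learning above), the model is simply the set of decision and propagated literals in force when every clause is satisfied. A minimal sketch (my own illustration; positive integers are positive literals, negative integers their negations):

def dpll(clauses, assignment):
    # Unit propagation: repeatedly assign literals forced by unit clauses.
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(l in assignment for l in clause):
                continue                      # clause already satisfied
            free = [l for l in clause if -l not in assignment]
            if not free:
                return None                   # conflict: all literals false
            if len(free) == 1:                # unit clause: its literal is forced
                assignment = assignment | {free[0]}
                changed = True
    # Branch: pick any unassigned variable and try both values.
    for clause in clauses:
        for l in clause:
            if l not in assignment and -l not in assignment:
                return (dpll(clauses, assignment | {l})
                        or dpll(clauses, assignment | {-l}))
    return assignment                         # all clauses satisfied: a model

cnf = [[1, 2, 3], [1, 2], [2, 3]]             # the instance from the question
print(dpll(cnf, frozenset()))                 # frozenset({1, 2, 3}): x1 = x2 = x3 = 1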
Here is a list (certainly not complete) of solvers that use the CDCL algorithm; you should check them out:
MiniSAT.
Zchaff SAT.
Z3.
ManySAT.
I'm writing an input file for OTTER that is very simple:
set(auto).
formula_list(usable).
all x y ([Nipah(x) & Encephalitis(y)] -> Causes(x,y)).
exists x y (Nipah(x) & Encephalitis(y)).
end_of_list.
I get this output for the search :
given clause #1: (wt=2) 2 [] Nipah($c2).
given clause #2: (wt=2) 2 [] Encephalitis($c1).
search stopped because sos empty
Why won't OTTER infer Causes($c2,$c1)?
EDIT:
I removed the square brackets from [Nipah(x) & Encephalitis(y)] and it worked. Why does this matter?
I'd answer with a question: Why did you use square brackets in the first place?
Look into the Otter manual, Section 4.3, List Notation. Square brackets are used for lists; they're syntactic sugar that is expanded into special terms. In your case, it expanded to something like
all x y ($cons(Nipah(x) & Encephalitis(y), $nil) -> Causes(x,y)).
Why won't OTTER infer Causes($c2,$c1)?
Note that the resolution calculus is not complete in the sense that every formula provable in a given theory can be inferred by the calculus. This would be highly undesirable! Instead, resolution is only refutationally complete, meaning that if a given theory is contradictory, then resolution will find a proof of the empty clause. So even if a clause C is a logical consequence of a set of clauses T, it doesn't mean that the resolution calculus can derive C from T. In your case, the fact that Causes($c2,$c1) follows from the input doesn't mean Otter has to derive it. If you actually want a proof of it, the standard move is to add its negation to the input and search for a refutation.
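To illustrate the refutational point on this very example: add the denial of the goal and resolution does derive the empty clause. A rough Python sketch over ground clauses (my own encoding, with Otter's $c1, $c2 written c1, c2; this is not Otter's actual procedure):

def neg(lit):
    return lit[1:] if lit.startswith('-') else '-' + lit

def resolvents(c1, c2):
    # all resolvents of two ground clauses (frozensets of literal strings)
    return [(c1 - {l}) | (c2 - {neg(l)}) for l in c1 if neg(l) in c2]

clauses = {
    frozenset({'Nipah(c2)'}),
    frozenset({'Encephalitis(c1)'}),
    # ground instance of: Nipah(x) & Encephalitis(y) -> Causes(x,y)
    frozenset({'-Nipah(c2)', '-Encephalitis(c1)', 'Causes(c2,c1)'}),
    frozenset({'-Causes(c2,c1)'}),            # the DENIAL of the goal
}

while frozenset() not in clauses:
    new = {r for a in clauses for b in clauses for r in resolvents(a, b)}
    if new <= clauses:                        # nothing new: saturated
        print('saturated without the empty clause')
        break
    clauses |= new
else:
    print('empty clause derived: Causes(c2,c1) follows')

Without the denial clause, the loop saturates without ever being obliged to produce Causes(c2,c1), which is exactly what you observed with the empty sos.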
What are finitely failed derivations? Are refutations the same as contradictions in the mathematical sense? What's the difference between general logic programs and definite logic programs?
There are no finitely failed derivations, only failed derivations and finitely failed derivation trees. A failed derivation is a derivation that ends in failure. For example:
p :- q.
p :- p.
q :- fail.
The derivation that consists of the first rule of p and then the only rule of q is a failed derivation. Derivations might fail not only because of an undefined predicate such as fail, but also because some head unification does not completely succeed.
Now what is a finitely failed derivation tree? Well, if you look at all the derivations, you get a tree. In a finitely failed derivation tree, the tree is finite and each derivation is failed. Finitely failed derivation trees have the following nice properties:
- The interpreter terminates.
- The interpreter does not produce any answer substitution.
In practical Prolog systems this means that after posing your query you will get a No after a while (in some Prolog systems a false is displayed). Interestingly, the above program will not terminate for the query p: it is an instance of an infinite derivation tree where each derivation is failed. The derivations are:
p - q - fail
p - p - q - fail
p - p - p - q - fail
Etc..
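A rough sketch of these trees in Python (my own propositional simplification of SLD resolution: atoms only, so no unification), with a depth cut-off so the infinite tree for p becomes visible:

program = {
    'p': [['q'], ['p']],     # p :- q.   p :- p.
    'q': [['fail']],         # q :- fail.
}

def derivations(goal, depth):
    # yields one outcome per derivation in the tree rooted at `goal`
    if not goal:
        yield 'success'
        return
    if depth == 0:
        yield 'cut-off'      # tree still growing here: cannot be finitely failed
        return
    head, rest = goal[0], goal[1:]
    if head not in program:  # no clauses (e.g. fail): this derivation fails
        yield 'failed'
        return
    for body in program[head]:
        yield from derivations(body + rest, depth - 1)

print(list(derivations(['q'], 10)))
# ['failed']                 -> a FINITELY failed derivation tree
print(list(derivations(['p'], 6)))
# ['failed', 'failed', 'failed', 'failed', 'cut-off', 'cut-off', 'cut-off']
# every completed derivation fails, but the tree never stops growing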
The notion of finitely failed derivation trees is defined for definite Prolog programs. One can extend the notion of a Prolog program to general Prolog programs, where a clause body may also contain negative literals. The idea is that the interpreter then recursively checks for finitely failed derivation trees for these negative literals.
One important question is how finitely failed derivation trees relate to mathematical derivations. Under what mathematical semantics should a goal fail? And how could we build an interpreter that implements this semantics? A particular class of semantics is based on the refutation method. Here we explain a derivation as establishing a contradiction:
P, ~G |= f => P |- G
This more or less implies double negation elimination and thus classical logic. But other logics can be instrumental as well. As a start, you might want to look up the following book:
Logic for Applications
Anil Nerode, Richard A. Shore
2nd edition, Springer, 1997
Bye
To preface this, my knowledge of this kind of stuff is puny.
Anyways, I've been developing a context-free grammar to describe the structure of algebraic expressions so I can teach myself how the CYK parsing algorithm works. I understand how such a grammar can work with only infix algebraic expressions, but I cannot understand how to develop a grammar that can handle both the unary and binary definitions of the "-" operator.
For reference, here's the grammar I've written (where S is the start symbol) in CNF:
S -> x
A -> O S
S -> L B
B -> S R
S -> K S
O -> +
O -> -
O -> *
O -> /
O -> ^
K -> -
L -> (
R -> )
The problem is: how can the CYK parsing algorithm know ahead of time whether to choose S -> K S or A -> O S when it encounters the "-" operator? Is such a grammar even context-free anymore? And most importantly, since programming languages can handle both the binary and unary minus sign, how should I reasonably parse this?
This seems like a problem related to finite state automata and I don't remember everything from my coursework, but I wrote a CYK parser in OCaml, so I'll go ahead and take a shot :)
If you're trying to parse an expression like 3 - -4, for example, your S -> K S rule would consume the -4 and then your A -> O S rule would absorb the - -4, and this would eventually work up to the top-most S production rule. Be careful with the grammar you're using, though: the A production you listed cannot be reached from S, so you probably want a rule like S -> S O S (in CNF: S -> S A, which makes your A -> O S reachable).
As for how CYK decides: it doesn't need to "know ahead of time", because it is a dynamic-programming (chart) algorithm rather than a backtracking one. The table cell for the token - simply receives both K and O, and every larger cell records every nonterminal that can derive the corresponding substring. A dead end such as the intermediate parse 3 S is harmless: those entries just never combine with anything, while the parse that reads -4 via S -> K S and attaches the binary - above it keeps combining until S covers the whole string.
Your grammar also remains context-free: a grammar is context-free exactly when every production has a single nonterminal on its left-hand side, and all of yours do (context-sensitive grammars allow extra context symbols on the left). HTH!
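Here is a rough Python sketch of that table-filling (my own illustration), using the grammar from the question plus the added S -> S A rule from the first paragraph. The cell for - receives both O and K, and dead combinations simply never produce anything:

UNARY = {                    # terminal -> nonterminals that produce it
    'x': {'S'}, '(': {'L'}, ')': {'R'},
    '+': {'O'}, '*': {'O'}, '/': {'O'}, '^': {'O'},
    '-': {'O', 'K'},         # the ambiguous token: both alternatives go in
}
BINARY = {                   # (left, right) -> produced nonterminal
    ('S', 'A'): 'S',         # the added rule, in CNF
    ('O', 'S'): 'A',
    ('K', 'S'): 'S',
    ('L', 'B'): 'S',
    ('S', 'R'): 'B',
}

def cyk(tokens):
    n = len(tokens)
    # table[i][w] = nonterminals deriving tokens[i : i+w+1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, tok in enumerate(tokens):
        table[i][0] = set(UNARY.get(tok, ()))
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            for k in range(1, width):        # split point
                for a in table[i][k - 1]:
                    for b in table[i + k][width - k - 1]:
                        c = BINARY.get((a, b))
                        if c:
                            table[i][width - 1].add(c)
    return 'S' in table[0][n - 1]

print(cyk(list('x--x')))     # True:  parsed as x O (K x), like 3 - -4
print(cyk(list('x-')))       # False: the stuck entries never reach S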
The grammar is ambiguous, and the parser cannot decide which case to take.
You should probably use a grammar like the following:
S -> EXPR
EXPR -> (EXPR)
EXPR -> - EXPR
EXPR -> EXPR + EXPR
EXPR -> EXPR - EXPR
// etc...
Grammars based on algebraic expressions are rather difficult to disambiguate. Here are some examples of problems which need to be addressed:
a+b+c naturally creates two parse trees. To resolve this, you need to resolve the ambiguity of the associativity of +. You may wish to let a left-to-right parsing strategy take care of this for you, but be careful: exponentiation should probably associate right-to-left.
a+b*c naturally creates two parse trees. To fix this problem, you need to deal with operator precedence.
implicit multiplication (a+bc), if it is allowed, creates all sorts of nightmares, mostly at tokenization.
unary subtraction is problematic, as you mention.
If we want to solve these problems, but still have a fast-parsing grammar specialized for algebra, one approach is to have various "levels" of EXPR, one for each level of binding required by precedence levels. For example,
TERM -> (S)
EXPO -> TERM ^ EXPO
PROD -> PROD * EXPO
PROD -> PROD / EXPO
PROD -> -PROD
SUM -> SUM + PROD
SUM -> SUM - PROD
S -> SUM
This requires that you also allow "promotion" between levels: SUM -> PROD, PROD -> EXPO, EXPO -> TERM, etc., plus a base case such as TERM -> x, so that derivations can terminate.
Hope this helps!
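If the goal is a practical parser rather than a CYK exercise, those levels translate almost line-for-line into a recursive-descent parser. A rough Python sketch (my own illustration; the token set and tuple-shaped parse trees are made up):

import re

def tokenize(s):
    return re.findall(r'\d+|[a-z]+|[()+\-*/^]', s)

class Parser:
    def __init__(self, tokens):
        self.toks, self.pos = tokens, 0
    def peek(self):
        return self.toks[self.pos] if self.pos < len(self.toks) else None
    def eat(self):
        t = self.peek(); self.pos += 1; return t
    def parse_sum(self):            # SUM -> SUM (+|-) PROD | PROD
        node = self.parse_prod()
        while self.peek() in ('+', '-'):
            node = (self.eat(), node, self.parse_prod())
        return node
    def parse_prod(self):           # PROD -> -PROD | PROD (*|/) EXPO | EXPO
        if self.peek() == '-':      # unary minus lives at this level
            self.eat()
            return ('neg', self.parse_prod())
        node = self.parse_expo()
        while self.peek() in ('*', '/'):
            node = (self.eat(), node, self.parse_expo())
        return node
    def parse_expo(self):           # EXPO -> TERM ^ EXPO | TERM (right-assoc)
        node = self.parse_term()
        if self.peek() == '^':
            self.eat()
            node = ('^', node, self.parse_expo())
        return node
    def parse_term(self):           # TERM -> ( SUM ) | name or number
        if self.peek() == '(':
            self.eat()
            node = self.parse_sum()
            self.eat()              # consume ')'
            return node
        return self.eat()

print(Parser(tokenize('a+b*c')).parse_sum())   # ('+', 'a', ('*', 'b', 'c'))
print(Parser(tokenize('3--4')).parse_sum())    # ('-', '3', ('neg', '4'))

Each precedence level gets its own function, and the recursion order encodes both precedence and associativity, which is the same information the leveled grammar carries.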