OTTER inferences - logic

I'm writing an input file for OTTER that is very simple:
set(auto).
formula_list(usable).
all x y ([Nipah(x) & Encephalitis(y)] -> Causes(x,y)).
exists x y (Nipah(x) & Encephalitis(y)).
end_of_list.
I get this output for the search:
given clause #1: (wt=2) 2 [] Nipah($c2).
given clause #2: (wt=2) 2 [] Encephalitis($c1).
search stopped because sos empty
Why won't OTTER infer Causes($c2,$c1)?
EDIT:
I removed the square brackets from [Nipah(x) & Encephalitis(y)] and it worked. Why does this matter?

I'd answer with a question: Why did you use square brackets in the first place?
Look into the Otter manual, Section 4.3, List Notation. Square brackets are used for lists; they are syntactic sugar that is expanded into special terms. In your case, it expanded to something like
all x y ($cons(Nipah(x) & Encephalitis(y), $nil) -> Causes(x,y)).
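That is, the brackets turned the antecedent into a one-element list rather than a grouped formula. Writing it with ordinary parentheses, as in your edit, gives the intended reading:
all x y ((Nipah(x) & Encephalitis(y)) -> Causes(x,y)).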
Why won't OTTER infer Causes($c2,$c1)?
Note that the resolution calculus is not complete in the sense that every formula provable in a given theory could be inferred by the calculus. This would be highly undesirable! Instead, resolution is only refutationally complete, meaning that if a given theory is contradictory, then resolution will find a proof of the empty clause. (This is why one normally adds the negation of the conjecture to the input and searches for a refutation.) So even if a clause C is a logical consequence of a set of clauses T, that doesn't mean the resolution calculus can derive C from T. In your case, the fact that Causes($c2,$c1) follows from the input doesn't mean Otter has to derive it.

Converting first-order logic to CNF without exponential blowup

When attempting to solve logic problems on a computer, it is usual to first convert them to CNF, because the best solving algorithms expect CNF as input.
For propositional logic, the textbook rules for this conversion are simple, but if you apply them as is, the result is one of the very rare cases where a program encounters double exponential resource consumption without being specifically constructed to do so:
a <=> (b <=> (c <=> ...))
with N variables, generates 2^2^N clauses, one exponential blowup in the conversion of equivalence to AND/OR, and another in the distribution of OR into AND.
The solution to this is to rename subterms. If we rewrite the above as something like
r <=> (c <=> ...)
a <=> (b <=> r)
where r is a fresh symbol that is being defined to be equal to a subterm - in general, we may need O(N) such symbols - the exponential blowups can be avoided.
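For example, the single definition a <=> (b <=> r) from above compiles to just four clauses, (~a | ~b | r), (~a | b | ~r), (a | b | r) and (a | ~b | ~r), so the total number of clauses stays linear in the number of definitions introduced.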
Unfortunately, this runs into a problem when we try to extend it to first-order logic. Using TPTP notation where ? means 'there exists' and variables begin with capital letters, consider
a <=> ?[X]:p(X)
Admittedly this case is simple enough that there is no actual need to rename the subterm, but it's necessary to use a simple case for illustration, so suppose we are using an algorithm that just automatically renames arguments of the equivalence operator; the point generalizes to more complex cases.
If we try the above trick and rename the ? subterm, we get
r <=> ?[X]:p(X)
Existential variables are converted to Skolem symbols, so that ends up as
r <=> p(s)
The original formula, with its right-hand argument now replaced by r, then expands to
(~a | r) & (a | ~r)
Which is by construction equivalent to
(~a | p(s)) & (a | ~p(s))
But this is not correct! Had we not done the renaming and instead expanded the original formula as it was, we would get
(~a | ?[X]:p(X)) & (a | ~?[X]:p(X))
(~a | ?[X]:p(X)) & (a | ![X]:~p(X))
(~a | p(s)) & (a | ~p(X))
which is critically different from the version we got with the renaming.
The problem is that equivalence needs both the positive and negative versions of each argument, but applying negation to terms that contain universal or existential quantifiers structurally changes those terms; you cannot just encapsulate them in a definition and then apply the negation to the defined symbol.
The upshot of this is that when you have equivalence and the arguments may contain such quantifiers, you actually need to recurse through each argument twice, once for the positive version and once for the negative. This suffices to bring back the exponential blowup we hoped to avoid by doing the renaming. As far as I can see, this problem is not caused by the way a particular algorithm works, but by the nature of the task.
So my question:
Given an input formula that may contain arbitrary nesting of equivalence and quantifiers, is there any algorithm that will correctly turn this to CNF with a polynomial rather than exponential number of clauses?
As you observed, an existential such as ∃X.p(X) is not in fact equivalent to a Skolemized expression p(S). Its negation ¬∃X.p(X) is not equivalent to ¬p(S), but to ∀Y.¬p(Y).
Possible approaches that avoid the exponential blow-up:
Convert existentials such as ∃X.p(X) to negated universals such as ¬∀Y.¬p(Y), or vice versa, so you have a canonical form. Skolemize at a later step.
Remember when you convert that your p(S) is a Skolemized existential, and that its negation is ∀Y.¬p(Y).
Define terms equivalent to universals and existentials, such that a represents ∀Y.p(Y) and ¬a then represents ¬∀Y.p(Y), or equivalently, ∃X.¬p(X).
Use the symmetry of Boolean duals: the same transformations apply with AND and OR swapped (De Morgan's Laws) and with existentials exchanged for negated universals, which restores the symmetry between the expansions of r and ~r. The negations introduced by the quantifier conversion and by De Morgan's Laws cancel each other out, and the duality of swapping AND and OR means you can mechanically re-use the expansion on one side to generate the one on the other.
Given that you need to support ALL and NOT ALL statements anyway, this should not create any new problems. Just canonicalize and use the same approach you would for a universal.
If you’re solving by converting to SAT, your terms can represent universals, too. So, in your example, you’re trying to replace a with r, but you can still use ~a, equivalent to the negative universal.
In your expressions, you'd still use (~a | r) & (a | ~r), but expand ~r to its correct rather than the incorrect value. That example is trivial, since that's just ~a, but you'd normally define r as equivalent to a more complex transformation, and in that case you need to remember what both r and ~r represent. It is not really a simple mechanical transformation of the Skolemized expression.
In this example, I’m not sure why it’s a problem that (~a | r) & (a | ~r) is equivalent to (~a | r) & (a | ~a), which simplifies to (~a | r). That’s not going to give you exponential blow-up? When you translate back to first-order predicate logic, make the correct translation.
Update
Thanks for clarifying what the problem was in chat. As I currently think I understand it, what you have is an equivalence with a left and a right side, which contains other nested equivalences, and you want to expand both the equivalence and its negation. The problem is that, because the negation does not have symmetrical form, you need to recurse twice for each nested equivalence in the tree, once when expanding the equivalence and once when expanding its negation?
You should define a transformation that generates the negative expansion from the positive expansion in linear time, and divide-and-conquer the expressions containing nested equivalences using that. This seems to be what you were after with the ~p(S) transformation.
To do this, you recall that ¬∃X.p(X) is equivalent to ∀X.¬p(X), and vice versa. Then, once you've expanded p(X) into normal form as conjunctions and disjunctions, De Morgan's Laws let you turn an expression like ¬(a ∨ ¬b) into ¬a ∧ b. The inner ¬ from the quantifier transformation and the outer ¬ from the De Morgan transformation cancel each other out. Finally, the dual of any Boolean equivalence remains valid when you replace each ∨ and ∧ with the other and any atom a or ¬a with its inverse.
So, while I might be making an error, especially at 1 AM, it looks to me like what you want is the dual transformation that substitutes:
An outer ∃ for ∀ and vice versa
∧ for ∨ and vice versa
Each term t with ¬t and vice versa
Apply this to the expansion of the positive equivalence to generate the negative dual in time proportional to its length, without further recursion.
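To make that concrete, here is a small Prolog sketch of such a dual pass (the term representation all/2, ex/2, and/2, or/2, not/1 is my own choice for illustration, and the input is assumed to be a positive expansion with negation only on atoms):
% negate(+F, -NegF): NegF is the negation of F, computed structurally in one
% pass: quantifiers are flipped, De Morgan is applied to and/or, and literals
% are inverted.
negate(all(X, F), ex(X, NF))   :- !, negate(F, NF).
negate(ex(X, F),  all(X, NF))  :- !, negate(F, NF).
negate(and(F, G), or(NF, NG))  :- !, negate(F, NF), negate(G, NG).
negate(or(F, G),  and(NF, NG)) :- !, negate(F, NF), negate(G, NG).
negate(not(A), A)              :- !.
negate(A, not(A)).             % anything else is treated as an atom
For example, negate(or(not(a), ex(X, p(X))), N) gives N = and(a, all(X, not(p(X)))). The pass is a single structural walk, so the negative expansion is produced in time proportional to the size of the positive one, with no second recursion through nested equivalences.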

What is the difference between Well-formed formula and a preposition in propositional logic

What is the exact difference between Well-formed formula and a proposition in propositional logic?
There's really not much given about Wff in my book.
My book says: "Propositions are also called sentences or statements. Another term formulae or well-formed formulae also refer to the same. That is, we may also call Well formed formula to refer to a preposition". Does that mean they both are the exact same thing?
Proposition: A statement which is true or false, easy for people to read but hard to manipulate using logical equivalences
WFF: An accurate logical statement which is true or false; there should be an official rigorous definition in your textbook. There are 4 rules they must follow. Harder for humans to read, but much more precise and easier to manipulate.
Example:
Proposition : All men are mortal
WFF: Let P be the set of people, let M(x) denote "x is a man" and S(x) denote "x is mortal". Then: for all x in P, M(x) -> S(x).
It is most likely that there is a typo in the book. In the quote "Propositions are also called sentences or statements. Another term formulae or well-formed formulae also refer to the same. That is, we may also call Well formed formula to refer to a preposition", the word "preposition" should be "proposition".
Proposition: A statement which is either true or false, but not both.
Propositional Form (necessary to understand Well Formed Formula): An assertion which contains at least one propositional variable.
Well Formed Formula: A propositional form satisfying the following rules; any wff (Well Formed Formula) can be derived using these rules:
If P is a propositional variable, then it is a wff.
If P is a wff, then ~P is a wff.
If P and Q are two wffs, then (P and Q), (P or Q), (P implies Q), (P is equivalent to Q) are all wffs.
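Written as a small Prolog-style checker (just a sketch of mine, with p and q standing in for propositional variables), the rules read:
prop_var(p).                        % example propositional variables
prop_var(q).
wff(P)             :- prop_var(P).  % rule 1
wff(not(F))        :- wff(F).       % rule 2
wff(and(F, G))     :- wff(F), wff(G).   % rule 3, one clause per connective
wff(or(F, G))      :- wff(F), wff(G).
wff(implies(F, G)) :- wff(F), wff(G).
wff(equiv(F, G))   :- wff(F), wff(G).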

Custom data structure syntax in Prolog

In Prolog, [H|T] is the list that begins with H and where the remaining elements are in the list T (internally represented with '.'(H, '.'(…))).
Is it possible to define new syntax in a similar fashion? For example, is it possible to define that [T~H] is the list that ends with H and where the remaining elements are in the list T, and then use it as freely as [H|T] in heads and bodies of predicates? Is it also possible to define e.g. <H|T> to be a different structure than lists?
One can interpret your question literally: a list-like data structure where accessing the tail can be expressed without any auxiliary predicate. Well, these are the minus-lists which were already used in the very first Prolog system — the one which is sometimes called Prolog 0 and which was written in Algol-W. An example from the original report, p. 32, transliterated into ISO Prolog:
t(X-a-l, X-a-u-x).
?- t(nil-m-e-t-a-l, Pluriel).
Pluriel = nil-m-e-t-a-u-x.
So essentially you take any left-associative operator.
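For example, the last element and the remaining front of such a term can be taken apart by plain unification, with no auxiliary predicate:
?- Xs = nil-m-e-t-a-l, Xs = Front-Last.
Xs = nil-m-e-t-a-l,
Front = nil-m-e-t-a,
Last = l.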
But, I suspect, that's not what you wanted. You probably want an extension to lists.
There have been several attempts to do this; one of the more recent was Prolog III/Prolog IV. However, quite similarly to constraints, you will have to face how to define equality over these operators. In other words, you need to go beyond syntactic unification into E-unification. The problem sounds easy in the beginning, but it is frighteningly complex. A simple example in Prolog IV:
>> L = [a] o M, L = M o [z].
M ~ list,
L ~ list.
Clearly this is an inconsistency. That is, the system should respond false. There is simply no such M, but Prolog IV is not able to deduce this. You would have to solve at least such problems or get along with them somehow.
In case you really want to dig into this, consider the research which started with G. S. Makanin's pioneering work:
The Problem of Solvability of Equations in a Free Semi-Group, Akad. Nauk SSSR, vol.233, no.2, 1977.
That said, it might be the case that there is a simpler way to get what you want. Maybe a fully associative list operator is not needed.
Nevertheless, do not expect too much expressiveness from such an extension compared to what we have in Prolog, that is DCGs. In particular, general left-recursion would still be a problem for termination in grammars.
It is possible to extend or redefine the syntax of Prolog with the ISO predicate
:- op(Precedence, Type, Name).
where Precedence is a number between 0 and 1200, Type describes whether the operator is used as infix, prefix or postfix:
infix: xfx, xfy, yfx
prefix: fx, fy
postfix: xf, yf
and finally Name is the operator's name.
Operator definitions do not specify the meaning of an operator; they only describe how it can be used syntactically. It is only a definition extending the syntax of Prolog, and it doesn't give any information about when a predicate will succeed. So you also need to describe when your predicate succeeds. To answer your question and also give an example, you could define:
:- op( 42, xfy, [ ~ ]).
where you declare an infix operator ~. This doesn't mean that it is a representation of a list (yet). You could define the clause:
[T ~ H]:-is_list([H|T]).
which is intended to match [T~H] with the list that ends with H and where the remaining elements are in the list T.
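Note, however, that the clause above only checks that T is a proper list; it does not by itself relate [T~H] to the list that ends with H. One way to actually get that reading is an auxiliary predicate; a minimal sketch, where ends_with/2 is a name chosen purely for illustration:
:- op(42, xfy, ~).   % repeating the operator declaration so the snippet is self-contained
% ends_with(T~H, List): List consists of the elements of T followed by H.
ends_with(T~H, List) :-
    append(T, [H], List).
For example, ends_with([a,b]~c, L) gives L = [a,b,c], and ends_with(T~H, [a,b,c]) gives T = [a,b] and H = c.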
Note also that it is not very safe to redefine predefined operators like [ ] or ~, because you overwrite their existing functionality. For example, if you want to consult a file with [file]., this will return false because you redefined the operators.

Using induction to tell if the given symbols make a valid formula in prolog

We have just started to learn Prolog in my classroom and our first exercise goes as follows:
Problem:
Note: assume that there are only two atoms such as a and b instead of infinitely many.
a) Write a prolog program that translates the inductive definition of Propositional Logic formula (i.e. that translates Definition 1). For this, you need to use:
1.) a unary predicate “at” to denote atomic formulas (so “at(f)” means “F is an atomic formula”, where f is a constant).
2.) a unary predicate “fmla” to denote formulas (so “fmla(F)” means “F is a formula”).
3.) a unary operation “neg” to denote negation (so “neg(F)” stands for ¬F).
4.) a binary operation “or” to denote disjunction of two formulas (so “or(F,G)” stands for (F∨G)).
Attempt:
at(a). % Our first atom.
at(b). % Our second atom.
fmla(F) :-
    at(F).
neg(F) :-
    fmla(F).
or(F,G) :-
    fmla(F), fmla(G).
fmla(F) :-
    or(F,G),
    neg(F),
    fmla(G).
Example of a valid formula: (~A v B) ---> or(neg(a), b).
I believe that the way I've structured my program is correct, but the recursive part doesn't work (I'm also getting a singleton variable warning there). I've been trying to figure this out for hours but to no avail. Any help would be appreciated.
You've already got the atoms right. Now simply walk through the inductive definition of a propositional logic formula:
at(a). % Our first atom.
at(b). % Our second atom.
fmla(F) :-            % an atom is a formula
    at(F).
fmla(neg(F)) :-       % neg(F) is a formula if F is a formula
    fmla(F).
fmla(or(F,G)) :-      % or(F,G) is a formula if F and G are formulas
    fmla(F),
    fmla(G).
fmla(and(F,G)) :-     % and(F,G) is a formula if F and G are formulas
    fmla(F),
    fmla(G).
If you try to query this with your above example:
?- fmla(or(neg(a), b)).
yes
Whether you need the last rule for the conjunction or not depends on which inductive definition you are sticking to. Some textbooks only use negation and disjunction, as conjunction can be expressed as and(A,B) = neg(or(neg(A),neg(B))).
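For example, with the program above, that encoding of the conjunction of a and b is already accepted by the clauses for negation and disjunction alone:
?- fmla(neg(or(neg(a), neg(b)))).
yes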

Flattening quantification over relations

I have a relation f defined as f: A -> B × C. I would like to write a first-order formula to constrain this relation to be a bijective function from A to B × C.
To be more precise, I would like the first-order counterpart of the following formula (actually the conjunction of the three):
∀a: A, ∃! bc : B × C, f(a)=bc -- f is a function
∀a1,a2: A, f(a1)=f(a2) → a1=a2 -- f is injective
∀(b, c) : B × C, ∃ a : A, f(a)=(b,c) -- f is surjective
As you can see, the above formulae are in higher-order logic, as I quantified over the relations. What is the first-order logic equivalent of these formulae, if it is possible at all?
PS:
This is a more general (math) question, rather than being specific to any theorem prover, but for getting help from these communities --as I think there is a mature understanding of mathematics in these communities-- I put the theorem-provers tag on this question.
(Update: Someone's unhappy with my answer, and SO gets me fired up in general, so I say what I want here, and will probably delete it later, I suppose.
I understand that SO is not a place for debates and soapboxes. On the other hand, the OP, qartal, who I assume is the unhappy one, wants to apply the answer from math.stackexchange.com, where ZFC sets dominate, to a question here which is tagged, at this moment, with isabelle and logic.
First, notation is important, and sloppy notation can result in a question that's ambiguous to the point of being meaningless.
Second, having a B.S. in math, I have full appreciation for the logic of ZFC sets, so I have full appreciation for math.stackexchange.com.
I make the argument here that the answer given on math.stackexchange.com, linked to below, is wrong in the context of Isabelle/HOL. (First hmmm, me making claims under ill-defined circumstances can be annoying to people.)
If I'm wrong, and someone teaches me something, the situation here will be redeemed.
The answerer says this:
First of all in logic B x C is just another set.
There's not just one logic. My immediate reaction when I see the symbol x is to think of a type, not a set. Consider this, which kind of looks like your f: A -> BxC:
definition foo :: "nat => int × real" where "foo x = (x,x)"
I guess I should be proficient in going back and forth between sets and types, and in reading minds, but I did learn something by entering this term:
term "B × C" (* shows it's of type "('a × 'b) set" *)
Feeling paranoid, I did this to see if I had fallen into a major gotcha:
term "f : A -> B × C"
It gives a syntax error. Here I am, getting all pedantic, and our discussion is ill-defined because the notation is ill-defined.
The crux: the formula in the other answer is not first-order in this context
(Another hmmm, after writing what I say below, I'm full circle. Saying things about stuff when the context of the stuff is ill-defined.)
Context is everything. The context of the other site is generally ZFC sets. Here, it's HOL. That answerer says to assume these for his formula, which I give below:
Ax is true iff x∈A
Bx is true iff x∈B×C
Rxy is true iff f(x)=y
Syntax. No one has defined it here, but the tag here is isabelle, so I take it to mean that I can substitute the right-hand side of each iff for the left-hand side.
Also, the expression x ∈ A is what would be in the formula in a typical set theory textbook, not Rxy. Therefore, for the answerer's formula to have meaning, I can rightfully insert f(x) = y into it.
This then is why I did a lot of hedging in my first answer. The variable f cannot be in the formula if it is to be first-order. If it's in the formula, then it's a free variable which is implicitly quantified. Here's the formula in Isar syntax:
term "∀x. (Ax --> (∃y. By ∧ Rxy ∧ (∀z. (Bz ∧ Rxz) --> y = z)))"
Here it is with the substitutions:
∀x. (x∈A --> (∃y. y∈B×C ∧ f(x)=y ∧ (∀z. (z∈B×C ∧ f(x)=z) --> y = z)))
In HOL, f(x) is just f x, and so f is implicitly universally quantified. If this is the case, then it's not first-order.
Really, I should dig deep to recall what I was taught, that f(x)=y means:
(x,f(x)) = (x,y), which means we have to have (x,y) ∈ A × (B×C)
which finally gets me:
∀x. (x∈A -->
(∃y. y∈B×C ∧ (x,y)∈A×(B×C) ∧ (∀z. (z∈B×C ∧ (x,z)∈A×(B×C)) --> y = z)))
Finally, I guess it turns out that in the context of math.stackexchange.com, it's 100% on.
Am I the only one who feels compulsive about questioning what this means in the context of Isabelle/HOL? I don't accept that everything here is defined well enough to show that it's first order.
Really, qartal, your notation should be specific to a particular logic.
First answer
With Isabelle, I answer the question based on my interpretation of your f: A -> B x C, which I take as a ZFC set, in particular a subset of the Cartesian product A x (B x C).
You're sort of mixing notation from the two logics, that of ZFC sets and that of HOL. Consequently, I might be off on what I think you're asking.
You don't define your relation, so I keep things simple. I define a simple ZFC function, and prove the first part of your first condition, that f is a function. The second part would be proving uniqueness. It can be seen that f satisfies that, so once a formula for uniqueness is stated correctly, auto might easily prove it.
Please notice that the theorem is a first-order formula. The characters ! and ? are ASCII equivalents for \<forall> and \<exists>.
(Clarifications must abound when working with HOL. It's first-order logic if the variables are atomic. In this case, the type of the variables is numeral. The basic concept is there. That I'm wrong in some detail is highly likely.)
definition "A = {1,2}"
definition "B = A"
definition "C = A"
definition "f = {(1,(1,1)), (2,(1,1))}"
theorem
"!a. a \<in> A --> (? z. z \<in> (B × C) & (a,z) \<in> f)"
by(auto simp add: A_def B_def C_def f_def)
(To completely give you an example of what you asked for, I would have to redefine my function so it's bijective. Little examples can take a ton of work.)
That's the basic idea, and the rest of proving that f is a function will follow that basic pattern.
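For example, the uniqueness half might be stated in the same style. This is only a sketch on my part, but for a small, concrete f like the one above I would expect auto to handle it:
theorem
"!a. a \<in> A --> (! y. ! z. (a,y) \<in> f & (a,z) \<in> f --> y = z)"
by(auto simp add: A_def f_def)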
If there's a problem, it's that your f is a ZFC set function/relation, and the logical infrastructure of Isabelle/HOL is set up for functions as a type. Functions as ordered pairs, ZFC style, can be formalized in Isabelle/HOL, but it hasn't been done in a reasonably complete way.
Generalizing it all is where the work would be. For a particular relation, as I defined above, I can limit myself to first-order formulas, if I ignore that the foundation, Isabelle/HOL, is, of course, higher-order logic.

Resources