Trying to understand the tableaux method of propositional calculus - logic

I have been reading Raymond Smullyan's wonderful book titled Logical Labyrinths. I'm having trouble understanding the completeness proof of the Tableaux method. In order to show that a sentence X is a tautology, we show that F X results in a closed tableau, i.e., every branch of the completed tableaux is closed. Why can't we equivalently say that X is a tautology if every branch of a completed tableau for T X is open? Is it because different branches could assign different values to the same atom to satisfy the branch?
Because every branch is a conjunction (atleast in my understanding), is it fair to say that a completed tableaux expresses a sentence X in a disjunctive normal form?
Any pointers would be very helpful. Thanks a lot.

Related

How is predicate logic represented in Prolog?

may be a strange and broad question and not a 100% programming question, but I hope this is ok. I recently had a discussion about, that a lot of programs in Prolog don´t follow strict predicate logic (of Frege) but often are "object oriented" which I am trying to grasp.
I know that Prolog is based on first order predicate logic especially Horn Clauses and that they are a special form of modus ponens. A fact and a rule if they occur solo are simply clauses, but as soon as I add more than one occurrence they become a predicate.
How are the quantors of first order predicate logic represented and related to fact , rule , predicate or the Prolog concept in general? What does the functor express and what the arguments in relation to predicate logic. How is predicate logic and first order predicate logic reflected in Prolog and where does prolog leave their concepts? e.g. how would I define a point, a line and a vertical line in predicate logic and first order predicate logic.
How do I formulate this in predicate logic and first order predicate logic what is the semantic and logic difference between
vertical(line).
line(vertical).
Or a line and point in this example. Are point and line not predicate logic?
For me it is " point(X) the set of all points" and when I pick a concrete point "there exists one point(110, 12)."
point(X,Y).
line(point(W,X), point(Y,Z)).
vertical(line(point(X,Y), point(X,Z))).
horizontal(line(point(X,Y), point(Z,Y))).
Any info helps! Many thanks, H
A chapter of Programming in Prolog by W.Clocksin and C.Mellish is devoted to explain the relation of Prolog with logic. Citing from there
If we wish to discuss how Prolog is related to logic, we must first establish what we
mean by logic. Logic was originally devised as a way of representing the form of
arguments, so that it would be possible to check in a formal way whether or not they
are valid. Thus we can use logic to express propositions, the relations between propositions and how one can validly infer some propositions from others. The particular
form of logic that we will be talking about here is called the Predicate Calculus. We
will only be able to say a few words about it here. There are scores of good basic
introductions to logic you can turn to for background reading.
If we wish to express propositions about the world, we must be able to describe
the objects that are involved in them. In Predicate Calculus, we represent objects by
terms. A term is of one of the following forms:
A constant symbol. This is a symbol that stands for a single individual or concept.
We can think of this as a Prolog atom, and we will use the Prolog syntax. So
greek, agatha, and peace are constant symbols.
A variable symbol. This is a symbol that we may want to stand for different
individuals at different times. Variables are really only introduced in conjunction
with quantifiers, which are discussed below. We can think of them as Prolog
variables and will use the Prolog syntax. Thus X, Man, and Greek are variable
symbols.
A compound term. A compound term consists of a function symbol, together
with an ordered set of terms as its arguments. The idea is that the compound
term represents some individual that depends on the individuals represented by
the arguments. The function symbol represents how the first depends on the second. For instance, we could have a function symbol standing for the notion of
"distance" and two arguments. In this case, the compound term stands for the
distance between the objects represented by the arguments. We can think of a
compound term as a Prolog structure with the function symbol as the functor.
We will write Predicate Calculus compound terms using the Prolog syntax, so
that, for instance, wife(henry) might mean Henry's wife, distance(point1, X)
might mean the distance between some particular point and some other place to
be specified, and classes(mary, dayafter(W)) might mean the classes that Mary
teaches on the day after some day W to be specified.
Thus in Predicate Calculus the ways of representing objects are just like the ways available in Prolog.
Seems not appropriate to put the entire chapter here... there is also a program, very explanatory, in appendix B, that performs an automatic translation of WFFs into clauses.
The book is very readable, just a pity it's not among the titles in Free Prolog Programming Books section.
I know that Prolog is based on first order predicate logic especially Horn Clauses and that they are a special form of modus ponens.
In a sense, inverse "modus ponens":
a :- b
You want to show "a true", and to do so, you have to show "b true"
A fact and a rule if they occur solo are simply clauses, but as soon as I add more than one occurrence they become a predicate.
No, they are all predicates. The "predicate" is an object/agent/program/platonic-phenomenon which expresses that there (objectively) is some "relationship" between "things", and you can ask the Prolog Processor about that relationship. There is no direct meaning associated to all of that though, it's "strings related to strings via strings". We are working with syntactic machines after all (i.e. computers).
Enter this logic program:
p(x,y). % Predicate p/2 states that there is a relationship p between x and y
And now, you can query the database about what the program is saying:
?- p(x,y).
true. % a p relationship exists (fact, but could also be rule)
?- p(x,A).
A = y. % the thing related to x via p is y
?- p(A,y).
A = x. % the thing related to x via p is y
?- p(A,B).
A = x, % things related via p are x and y
B = y.
?- p(c,d).
false. % not REALLY "false" but "as far as I can tell, there
% is no relationship p between c and d"
Note the interpretation of "false", which is not the "strong false" of classical logic. Even though it is traditionally state that Prolog works in classical logic, this is not really the case:
From "Logic Programming with Strong Negation" (David Pearce, Gerd Wagner, FU Berlin, 1991), appears in Springer LNAI 475: Extensions of Logic Programming, International Workshop Tübingen, FRG, December 8–10, 1989 Proceedings):
According to the standard view, a logic program is a set of definite Horn clauses. Thus, logic programs are regarded as syntactically restricted first-order theories within the framework of classical logic. Correspondingly, the proof theory of logic programs is considered as the specialized version of classical resolution, known as SLD-resolution. This view, however, neglects the fact that a program clause, a_0 <— a_1, a_2, • • •, a_n, is an expression of a fragment of positive logic (a subsystem of intuitionistic logic) rather than an implicational formula of classical logic. The classical interpretation of logic programs, therefore, seems to be a semantical overkill.
It should be clear that in order to explain the deduction mechanism of Prolog one does not have to refer to the indirect method of SLD-resolution which checks for the refutability of the contrary. It is certainly more natural to view Prolog's proof procedure as a kind of natural deduction, as, for example, in [Hallnäs & Schroeder-Heister 1987] and [Miller 1989]. This also is more in line with the intuitions of a Prolog programmer. Since Prolog is the paradigm, logic programming semantics should take it as a point of departure.
Now:
How are the quantors of first order predicate logic represented and related
to fact, rule, predicate or the Prolog concept in general?
That is a long story. Note that Prolog is primarily about "programming using logic", and also about "modeling using logic". The two aspects certainly overlap well for problems that can be solved using explicit enumeration, but Prolog is not made for specifying general FOL constraints describing a sought-for solution. In fact, certain FOL constraints cannot be represented and other have to be transformed into nominally equivalent expression that are agreeable to the machine. Look up "skolemization". For example: https://www.cs.toronto.edu/~sheila/384/w11/Lectures/csc384w11-KR-tutorial.pdf
On the flip side, Prolog provides "meta-predicates" which generate solutions by calling other predicates, so it's making forays into second-order logic. As it must - nobody can survive in the FOL desert for long.
What does the functor express
Nothing. It just stands for itself. Pure syntax. Look up "Herbrand Universe".
How do I formulate this in predicate logic and first order predicate logic
what is the semantic and logic difference between
vertical(line).
line(vertical).
It's you who imbues vertical and line with meaning. So, feelings. You want a "vertial line", so you would say, the "thing" is the "line" and "vertical" is an attribute of the "line". So vertical(line) sounds appropriate. Or maybe attribute(line,vertical). It depends.
Here:
point(X,Y).
line(point(W,X), point(Y,Z)).
You have to aspects:
Predicates express "relationships". "Function symbols" are used to construct "things with structure": you can form trees of stuff with function symbols on nodes and integers/strings/variables on leaves. These are called "term". But terms can appear as predicates or as things, depending on the context, it's quite fluid. So you can for example construct a Prolog program with another Prolog program.
point(X,Y)
line(point(W,X), point(Y,Z))
These are terms!
If you type this into a file program.pl:
point_on_line(point(X,Y),line(point(W,X), point(Y,Z))).
The terms appear as "things" related by predicate point_on_line/2. The whole line is itself a term.
If you type this into a file program.pl:
point(X,Y).
line(point(W,X), point(Y,Z)).
The terms appear as "predicates", and point appears both as predicate point/2 and as "thing" about which predicate line/2 is talking.
This is actually a vast subject and it takes some time getting used to it, much more than functional programming. I had some Prolog and Logic courses at uni but 20 years later I found out that I had badly misunderstood a lot of aspects.

How to understand the recursive search in Prolog?

Here is a section of Prolog code defining numeral in a recursive way:
numeral(0).
numeral(succ(X)) :- numeral(X).
When given query numeral(X). Prolog will return:
X = 0 ;
X = succ(0) ;
X = succ(succ(0)) ;
X = succ(succ(succ(0))) ;
X = succ(succ(succ(succ(0)))) ;
X = succ(succ(succ(succ(succ(0))))) ;
X = succ(succ(succ(succ(succ(succ(0)))))) ;
X = succ(succ(succ(succ(succ(succ(succ(0))))))) ;
X = succ(succ(succ(succ(succ(succ(succ(succ(0))))))))
yes
Based on what I have learned, when doing the query, prolog will firstly make X into a variable like (_G42), then it will search the facts and rules to find the match.
In this case, it will find 0 (fact) as a right match. Then it will also try to match the rule. That is considering _G42 is not 0, and _G42 is the succ of another number. Thus, another variable is generated(like _G44), _G44 will match 0 and will also go further like _G42. Since _G44 matches 0, then it will go backward to _G42, getting _G42 = succ(_G44) = succ(0).
I am not sure if I am right about the understanding. I made a diagram to show my comprehension on this problem.
If the analysis is correct, I still feel difficult to design the recursive function like this. Since I am new to Prolog, I want to know if this kind of definition always used in application (say building an expert system, verifying protocols) or it is just for beginners to better understanding the basic searching procedure? If it is often used, what is the key point to design this kind of recursive definition?
My personal opinion: Especially as a beginner, you have zero chance to"understand the recursive search in Prolog". Countless beginners are trying to understand Prolog in this way, and they very consistently fail.
The sad part is that this hits hardest workers the hardest: You always think you can somehow understand it, but in the end, you cannot, because there are too many ways to invoke even the simplest predicates, with uninstantiated and (partly) instantiated arguments, and even with aliased variables.
Your graph nicely illustrates that such a procedural reading gets extremely unwieldy very quickly for even the simplest conceivable recursive definitions.
A much more tractable approach for understanding the predicate is to read it declaratively:
0 is a numeral
If X is a numeral (whatever X is!), then succ(X) of X is also a numeral.
Note that :- even means &leftarrow;, i.e., an implication from right to left.
My recommendation is to focus on a clear declarative description of what ought to hold. To overcome the initial barriers with Prolog, you must let go the idea that you can trace the steps that the CPU performs in the extreme detail in which you are currently trying to follow it. Prolog is too high-level to be amenable to tracing in this low-level way. It is like trying to interpret between French and English by tracing only the neuronal activities of the speakers.
Write a clear definition and then leave the search to Prolog. There are many other and working ways to understand and break down declarative definitions without getting swamped in low-level details. See for example program-slicing and failure-slicing. They work as long as you stay in the so-called pure monotonic subset of Prolog. Focus on this area, and you will be able to make very fast progress.

Why don't Standardize variables just violate completeness of resolution

I have been reading some notes on converting first order logic (FOL) sentences to conjunctive normal form (CNF), and then performing resolution.
One of the steps of converting to CNF, is Standardize variables.
I have been searching to find an why complete condition of resolution algorithm violate and soundness doesn't violate, if i don't standardize variables.
anyone could add details why just violate completeness, and soundness is remain?
Here is an example that may help you visualize this. Suppose your theory is
(for all X : nice(X)) or (for all X : smart(X)) (1)
which, if you standardize apart, will result in the CNF:
nice(X) or smart(Y)
that is, everybody in a population is nice, or everybody in a population is smart, or both.
Dropping standardize apart will produce a CNF
nice(X) or smart(X)
which is equivalent to
(for all X : nice(X) or smart(X)) (2)
that is, for each individual in a population, that individual is nice, or smart, or both.
This theory (2) is saying less, it is weaker, than the original theory (1), because it admits a situation in which it is not true that everybody is nice, and it is not true that everybody is smart, but it is true that each individual is one or the other or both. In other words, (2) does not imply (1), but (1) implies (2) (if the whole population is nice, then each individual is nice; if the whole population is smart, then each individual is smart; therefore, each individual is either nice or smart). The set of possible worlds accepted by (2) is strictly larger than the set of possible worlds accepted by (1).
What does this say about completeness and soundness?
Using (2) is not complete because I can show you a counter-example of a true theorem in (1) that is not proven true when using (2). Consider the theorem "If John is not nice but smart, then Liz is smart". This is true given (1) because if the whole population shares one of those two qualities, then it must be "smart" since John is not nice, so Liz (and everyone else) must also be smart. However, given (2) this is not true anymore (it could be that Liz is nice but not smart, and everybody else is one or the other, which still would be ok given (2)). So I can no longer prove a true theorem using (2) (after dropping standardize apart), and therefore my inference is not complete.
Now suppose I prove some theorem T to be true, using (2). This means T holds in every possible world in which (2) holds, including the sub-set of them that hold in (1) (according to the third paragraph above from here). Therefore T is true in (1) as well, so performing inference using (2) is still sound.
In a nutshell: when you do not standardize apart, you "join" statements about entire domains (populations) and make them about individuals, which will be weaker and will not imply some facts that were implied before; they will be lost and your procedure will not be complete.

Prolog: How to tell if a predicate is deterministic or not

So from what I understand about deterministic predicates:
Deterministic predicate = 1 solution
Non-deterministic predicate = multiple solutions
Are there any type of rules as to how you can detect if the predicate is one or the other? Like looking at the search tree, etc.
There is no clear, generally accepted consensus about these notions. However, they are usually based rather on the observed answers and not based on the number of solutions. In certain contexts the notions are very implementation related. Non-determinate may mean: leaves a choice point open. And sometimes determinate means: never even creates a choice point.
Answers vs. solutions
To see the difference, consider the goal length(L, 1). How many solutions does it have? L = [a] is one, L = [23] another... but all of these solutions are compactly represented with a single answer substitution: L = [_] which thus contains infinitely many solutions.
In any case, in all implementations I know of, length(L, 1) is a determinate goal.
Now consider the goal repeat which has exactly one solution, but infinitely many answers. This goal is considered non-determinate.
In case you are interested in constraints, things become even more evolved. In library(clpfd), the goal X #> Y, Y #> X has no solution, but still one answer. Combine this with repeat: infinitely many answers and no solution.
Further, the goal append(Xs, Ys, []) has exactly one solution and also exactly one answer, nevertheless it is considered non-determinate in many implementations, since in those implementations it leaves a choice point open.
In an ideal implementation, there would be no answers without solutions (except false), and there would be non-determinism only when there is more than one answer. But then, all of this is mostly undecidable in the general case.
So, whenever you are using these notions make sure on what level things are meant. Rather explicitly say: multiple answers, multiple solutions, leaves no (unnecessary) choice point open.
You need understand the difference between det, semidet and undet, it is more than just number of solutions.
Because there is no loop control operator in Prolog, looping (not recursion) is constructed as a 'sequence generating' predicate (undet) followed by the loop body. Also you can store solutions with some of findall-group predicates as a list and loop later with the member/2 predicate.
So, any piece of your program is either part of loop construction or part of usual flow. So, there is a difference in designing det and undet predicates almost in the intended usage. If you can work with a sequence you always do undet and comment it as so. There is a nice unit-test extension in swi-prolog which can check wheter your predicate always the same in mean of det/semidet/undet (semidet is for usage the same way as undet but as a head of 'if' construction).
So, the difference is pre-design, and this question should not be arised with already existing predicates. It is a good practice always comment the intended usage in a comment like.
% member(?El, ?List) is undet.
Deterministic: Always succeeds with a single answer that is always the same for the same input. Think a of a static list of three items, and you tell your function to return value one. You will get the same answer every time. Additionally, arithmetic functions. 1 + 1 = 2. X + Y = Z.
Semi-deterministic: Succeeds with a single answer that is always the same for the same input, but it can fail. Think of a function that takes a list of numbers, and you ask your function if some number exists in the list. It either does, or it doesn't, based on the contents of the list given and the number asked.
Non-deterministic: Succeeds with a single answer, but can exhibit different behaviors on different runs, even for the same input. Think any kind of math.random(min,max) function like random/3
In essence, this is entirely separate from the concept of choice points, as choice points are a function of Prolog. Where I think the Prolog confusion of these terms comes from is that Prolog can find a single answer, then go back and try for another solution, and you have to use the cut operator ! to tell it that you want to discard your choice points explicitly.
This is very useful to know when working with Prolog Unit Testing

Eligibility trace algorithm, the update order

I am reading Silver et al (2012) "Temporal-Difference Search in Computer Go", and trying to understand the update order for the eligibility trace algorithm.
In the Algorithm 1 and 2 of the paper, weights are updated before updating the eligibility trace. I wonder if this order is correct (Line 11 and 12 in the Algorithm 1, and Line 12 and 13 of the Algorithm 2).
Thinking about an extreme case with lambda=0, the parameter is not updated with the initial state-action pair (since e is still 0). So I doubt the order possibly should be the opposite.
Can someone clarify the point?
I find the paper very instructive for learning the reinforcement learning area, so would like to understand the paper in detail.
If there is a more suitable platform to ask this question, please kindly let me know as well.
It looks to me like you're correct, e should be updated before theta. That's also what should happen according to the math in the paper. See, for example, Equations (7) and (8), where e_t is first computed using phi(s_t), and only THEN is theta updated using delta V_t (which would be delta Q in the control case).
Note that what you wrote about the extreme case with lambda=0 is not entirely correct. The initial state-action pair will still be involved in an update (not in the first iteration, but they will be incorporated in e during the second iteration). However, it looks to me like the very first reward r will never be used in any updates (because it only appears in the very first iteration, where e is still 0). Since this paper is about Go, I suspect it will not matter though; unless they're doing something unconventional, they probably only use non-zero rewards for the terminal game state.

Resources