Reducing this lambda expression - lambda-calculus

I’m trying to practice beta reduction but I’m stuck on how to reduce this term:
(λx((λy.x)(λx.x))x)y
The x bound by the outermost λx will obviously be substituted with y, but should I still proceed with reducing ((λy.x)(λx.x))? What am I missing here?

To perform the first β-reduction you have to look at the occurrences of x bound by the outer λx, which are:
(λx((λy.x)(λx.x))x)y
        ^        ^
Note that the inner λx.x shadows the outer lambda, since it uses the same variable name. So the x there is bound by the inner lambda, not the outer one.
Another problem is that if you substitute y for x in λy.x, then y is captured by the λy binder. That cannot happen, so you need to perform an α-conversion (renaming) on λy.x to avoid the capture.
So first the α-conversion, then the β-reduction:
(λx((λy.x)(λx.x))x)y
→ (λx((λz.x)(λx.x))x)y
→ ((λz.y)(λx.x))y
If you want to continue then λz.y returns y for any input,
((λz.y)(λx.x))y
→ yy
Note that the λ-calculus doesn't fix a reduction strategy; it only establishes an equivalence relation between terms. So you can pick another reduction order. For instance, you could β-reduce the inner β-redex (λy.x)(λx.x) first:
(λx((λy.x)(λx.x))x)y
→ (λx(xx))y
→ yy
This would have avoided the α-conversion.
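If you want to see this on a machine, here is a quick sanity check in Python (an encoding of mine, not from the original post; Python is untyped, so the self-application y y is unproblematic):
term = lambda y: (lambda x: ((lambda _y: x)(lambda w: w))(x))(y)   # (λx.((λy.x)(λx.x))x) y
sample = lambda arg: ('y applied to', arg)                         # an arbitrary y
assert term(sample) == sample(sample)                              # the term behaves like y y
print(term(sample))
Either reduction order predicts y y, which is what the assertion checks.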

Does SKS equal SKK?

Context
I started teaching myself lambda calculus last night and I am trying to determine if what I understand so far is correct.
Understanding
SKK is equivalent to the Identity combinator, I.
Where L stands for lambda:
S = LxLyLz((xz)(yz))
K = LxLy(x)
K essentially takes the next 2 (lambda) terms and gives back the first of those. S seems a little more complicated in the untyped lambda calculus.
My Interpretation
SK(any-lambda-term) is also equivalent to I.
I.e. the application of the application of S to K to Any-lambda-term is equivalent to the Identity combinator:
((S K)(Any)) = I = S K K = ((S K)(K))
I am using the convention of “left-association” in my above notation, if that helps (and I tried to make that clear in the 4th term above with parentheses; everything I have read so far seems to use this convention).
Reasoning
S K = LyLz((K z)(y z))
The next lambda term will be substituted for y; let the term be Y.
S K Y = Lz((K z)(Y z))
(Y z) is the application of Y to z, also a lambda term.
(K z) returns the constant function that returns z, given another term as input: (Y z).
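A quick way to spot-check this in Python, with S and K as curried lambdas and an arbitrary sample Y (a sketch; all names are mine):
S = lambda x: lambda y: lambda z: (x(z))(y(z))
K = lambda x: lambda y: x
Y = lambda w: ('Y applied to', w)   # any term will do
z = 'z'
assert S(K)(Y)(z) == (K(z))(Y(z)) == z   # (K z)(Y z) ignores (Y z) and returns z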
Is my interpretation true? If not, can you provide an explanation? I would greatly appreciate it. Particularly if a sort of order of operations can be explained—I regularly find myself confused when considering when to evaluate. Perhaps that will be refined with practice.
Your intuition is correct, but an intuition proves nothing (alas...)
So, how can we prove your statement? Simply by showing that SKK and SKS have the same behaviour. "Behaviour" is an informal notion, which is formally captured by "semantics": if SKK and SKS are equal, then they should always reduce to the same term, according to the SKI-calculus semantics.
Now, there is a deeper question, which is: what is the SKI-calculus? Actually, there is not a single way to answer that. What you implicitly do in your question is express SKI in terms of λ-terms and rely on the semantics of the λ-calculus. This is absolutely correct. Another way to do it would have been to define the SKI semantics directly. For instance, if you look at the Wikipedia page, you can see that the semantics is not defined with lambda terms (and the fact that it corresponds to lambda terms is a (nice and expected) side effect). In the rest of this answer, I'll take the same approach as you do, and convert SKI terms into λ-terms. A good exercise for you is to redo the proof using the proper SKI semantics.
So, let's formalize your question: the question is whether, for any SKI term t, SKKt = SKSt. Well... let's see.
SKKt is encoded as (λx.λy.λz.(xz)(yz))(λx.λy.x)(λx.λy.x)t in the λ-calculus. We now just have to reduce it to a normal form (I detail every step, each time reducing the leftmost λ, even though it is not the fastest strategy):
(λx.λy.λz.(xz)(yz))(λx.λy.x)(λx.λy.x)t
= (λy.λz.((λx.λy.x)z)(yz))(λx.λy.x)t
= (λz.((λx.λy.x)z)((λx.λy.x)z))t
= ((λx.λy.x)t)((λx.λy.x)t)
= (λy.t)((λx.λy.x)t)
= t
So, the encoding of SKKt in the λ calculus reduces to t (as a sidenote, we just proved that SKK is equivalent to I here). To conclude our proof, we have to reduce SKSt and see whether it also reduces to t.
SKSt is encoded as (λx.λy.λz.(xz)(yz))(λx.λy.x)(λx.λy.λz.(xz)(yz))t. Let's reduce it (I don't detail as much this time):
(λx.λy.λz.(xz)(yz))(λx.λy.x)(λx.λy.λz.(xz)(yz))t
= ((λx.λy.x) t)((λx.λy.λz.(xz)(yz)) t)
= (λy.t)((λx.λy.λz.(xz)(yz)) t)
= t
Hurrah! It also reduces to t, so indeed SKS and SKK are equivalent. It seems that the third combinator is not important: as soon as you have SK?, it is equivalent to I. As an exercise, you can easily prove it (same strategy: if it is the case, then for any terms t and s, SKts = s). As mentioned above, another good exercise is to redo the proof without using the λ semantics, but the proper SKI semantics.
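The whole argument is also easy to machine-check with an untyped encoding, for instance in Python (my sketch, not part of the original proof):
S = lambda x: lambda y: lambda z: (x(z))(y(z))
K = lambda x: lambda y: x
I = lambda x: x

t = object()               # an arbitrary term
assert S(K)(K)(t) is t     # SKKt = t
assert S(K)(S)(t) is t     # SKSt = t
assert S(K)(I)(t) is t     # SK?t = t, whatever ? is
print('SKK, SKS and SK? all behave like I')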
Finally, my answer should raise a new question for you: we have two semantics, one that encodes SKI terms into λ-terms, and one that does not. The question you may have is: are the two semantics equivalent? What does it mean for two semantics to be equivalent? If you are only starting to teach yourself the λ-calculus, it may be a bit early to try to answer those questions right now, but you can keep them in a corner of your head for when you get more familiar with formal languages.

Flattening quantification over relations

I have a relation f defined as f: A -> B × C. I would like to write a first-order formula that constrains this relation to be a bijective function from A to B × C.
To be more precise, I would like the first-order counterpart of the following formulae (actually the conjunction of the three):
∀a: A, ∃! bc : B × C, f(a)=bc -- f is function
∀a1,a2: A, f(a1)=f(a2) → a1=a2 -- f is injective
∀ bc : B × C, ∃ a : A, f(a)=bc -- f is surjective
As you can see, the above formulae are in higher-order logic, as I have quantified over relations. What is the first-order equivalent of these formulae, if it is possible at all?
PS:
This is a more general (math) question rather than one specific to any theorem prover, but for getting help from these communities --as I think there is a mature understanding of mathematics here-- I put the theorem provers tag on this question.
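For reference, the standard first-order flattening replaces f by its graph, a relation R on A, B and C, and quantifies over elements only. Below is a runnable sketch of that encoding using Z3's Python API (my choice of tool, not implied by the question; the sort and relation names are made up). The sorts play the role of the domains, so no membership predicates are needed:
from z3 import (DeclareSort, Function, BoolSort, Const,
                ForAll, Exists, And, Implies, Solver)

A = DeclareSort('A')
B = DeclareSort('B')
C = DeclareSort('C')
R = Function('R', A, B, C, BoolSort())   # the graph of f : A -> B x C

a, a2 = Const('a', A), Const('a2', A)
b, b2 = Const('b', B), Const('b2', B)
c, c2 = Const('c', C), Const('c2', C)

total      = ForAll(a, Exists([b, c], R(a, b, c)))
functional = ForAll([a, b, c, b2, c2],
                    Implies(And(R(a, b, c), R(a, b2, c2)),
                            And(b == b2, c == c2)))
injective  = ForAll([a, a2, b, c],
                    Implies(And(R(a, b, c), R(a2, b, c)), a == a2))
surjective = ForAll([b, c], Exists(a, R(a, b, c)))

s = Solver()
s.add(total, functional, injective, surjective)
print(s.check())   # a one-element model satisfies all four constraints
Totality plus functionality together replace the ∃!, and injectivity and surjectivity are stated on the graph, so nothing quantifies over relations any more.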
(Update: Someone's unhappy with my answer, and SO gets me fired up in general, so I say what I want here, and will probably delete it later, I suppose.)
I understand that SO is not a place for debates and soapboxes. On the other hand, the OP, qartal, who I assume is the unhappy one, wants to apply the answer from math.stackexchange.com, where ZFC sets dominate, to a question here which is tagged, at this moment, with isabelle and logic.
First, notation is important, and sloppy notation can result in a question that's ambiguous to the point of being meaningless.
Second, having a B.S. in math, I have full appreciation for the logic of ZFC sets, so I have full appreciation for math.stackexchange.com.
I make the argument here that the answer given on math.stackexchange.com, linked to below, is wrong in the context of Isabelle/HOL. (First hmmm, me making claims under ill-defined circumstances can be annoying to people.)
If I'm wrong, and someone teaches me something, the situation here will be redeemed.
The answerer says this:
First of all in logic B x C is just another set.
There's not just one logic. My immediate reaction when I see the symbol × is to think of a type, not a set. Consider this, which kind of looks like your f: A -> B×C:
definition foo :: "nat => int × real" where "foo x = (x,x)"
I guess I should be prolific in going back and forth between sets and types, and reading minds, but I did learn something by entering this term:
term "B × C" (* shows it's of type "('a × 'b) set" *)
Feeling paranoid, I did this to see if I had fallen into a major gotcha:
term "f : A -> B × C"
It gives a syntax error. Here I am, getting all pedantic, and our discussion is ill-defined because the notation is ill-defined.
The crux: the formula in the other answer is not first-order in this context
(Another hmmm, after writing what I say below, I'm full circle. Saying things about stuff when the context of the stuff is ill-defined.)
Context is everything. The context of the other site is generally ZFC sets. Here, it's HOL. That answerer says to assume these for his formula, which I give below:
Ax is true iff x∈A
Bx is true iff x∈B×C
Rxy is true iff f(x)=y
Syntax. No one has defined it here, but the tag here is isabelle, so I take it to mean that I can substitute the left-hand side of the iff for the right-hand side.
Also, the expression x ∈ A is what would be in the formula in a typical set theory textbook, not Rxy. Therefore, for the answerer's formula to have meaning, I can rightfully insert f(x) = y into it.
This then is why I did a lot of hedging in my first answer. The variable f cannot be in the formula. If it's in the formula, then it's a free variable which is implicitly quantified. Here's the formula in Isar syntax:
term "∀x. (Ax --> (∃y. By ∧ Rxy ∧ (∀z. (Bz ∧ Rxz) --> y = z)))"
Here it is with the substitutions:
∀x. (x∈A --> (∃y. y∈B×C ∧ f(x)=y ∧ (∀z. (z∈B×C ∧ f(x)=z) --> y = z)))
In HOL, f(x) = f x, and so f is implicitly, universally quantified. If this is the case, then it's not first-order.
Really, I should dig deep to recall what I was taught, that f(x)=y means:
(x,f(x)) = (x,y), which means we have to have (x,y) ∈ A × (B×C)
which finally gets me:
∀x. (x∈A -->
(∃y. y∈B×C ∧ (x,y)∈A×(B×C) ∧ (∀z. (z∈B×C ∧ (x,z)∈A×(B×C)) --> y = z)))
Finally, I guess it turns out that in the context of math.stackexchange.com, it's 100% on.
Am I the only one who feels compulsive about questioning what this means in the context of Isabelle/HOL? I don't accept that everything here is defined well enough to show that it's first order.
Really, qartal, your notation should be specific to a particular logic.
First answer
With Isabelle, I answer the question based on my interpretation of your f: A -> B x C, which I take as a ZFC set, in particular a subset of the Cartesian product A x (B x C). You're sort of mixing notation from the two logics, that of ZFC sets and that of HOL. Consequently, I might be off on what I think you're asking.
You don't define your relation, so I keep things simple. I define a simple ZFC function and prove the first part of your first condition, that f is a function. The second part would be proving uniqueness. It can be seen that f satisfies that, so once a formula for uniqueness is stated correctly, auto might easily prove it.
Please notice that the theorem is a first-order formula. The characters ! and ? are ASCII equivalents for \<forall> and \<exists>.
(Clarifications must abound when working with HOL. It's first-order logic if the variables are atomic. In this case, the type of the variables is numeral. The basic concept is there. That I'm wrong in some detail is highly likely.)
definition "A = {1,2}"
definition "B = A"
definition "C = A"
definition "f = {(1,(1,1)), (2,(1,1))}"
theorem
"!a. a \<in> A --> (? z. z \<in> (B × C) & (a,z) \<in> f)"
by(auto simp add: A_def B_def C_def f_def)
(To completely give you an example of what you asked for, I would have to redefine my function so it's bijective. Little examples can take a ton of work.)
That's the basic idea, and the rest of proving that f is a function will follow that basic pattern.
If there's a problem, it's that your f is a ZFC set function/relation, and the logical infrastructure of Isabelle/HOL is set up for functions as a type. Functions as ordered pairs, ZFC style, can be formalized in Isabelle/HOL, but it hasn't been done in a reasonably complete way.
Generalizing it all is where the work would be. For a particular relation, as I defined above, I can limit myself to first-order formulas, if I ignore that the foundation, Isabelle/HOL, is, of course, higher-order logic.

Prove ¬(¬a = a)

This looks like such an easy problem, but I still can't figure it out. How do I prove ¬(¬a = a)?
No given premises.
I got this so far (in Fitch):
This is a subproof where I assume the negation of my goal and then try to reach the absurd/contradiction so I can state the negation of my assumption, which would be my goal.
Thanks in advance!
Looking at your screenshot I'd say your =Intro introduces a variable a (that is, a is an object of the domain, rather than a predicate).
I say this because
in all books I've read, the =Intro rule is used for objects rather than predicates, and
for predicates, equality is expressed as "if and only if", which is typically written as ↔ and not =.
So, in other words, the only sensible interpretation of ¬(¬a = a) is that = binds harder than ¬, and the whole formula should be interpreted as ¬(¬(a = a)).
Now you should be able to
1. introduce a = a
2. assume the contrary: ¬(a = a)
3. arrive at a contradiction, ⊥, based on 1 and 2
4. use ¬Intro on 2 and 3 to get ¬(¬(a = a)).
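For comparison, the same argument is a one-liner in Lean 4 (my sketch, outside the Fitch system of the question): ¬P abbreviates P → False, so ¬-introduction is a function, and the contradiction of steps 2 and 3 is the application of the assumption to rfl.
-- assume h : ¬(a = a); applying h to rfl : a = a yields False
example {α : Type} (a : α) : ¬¬(a = a) :=
  fun h => h rfl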

Explanation of Prolog recursive procedure

I'd like someone to explain this procedure if possible (it's from the book Learn Prolog Now). It takes two numerals and adds them together.
add(0,Y,Y).
add(s(X),Y,s(Z)) :- add(X,Y,Z).
In principle I understand, but I have a few issues. Let's say I issue the query
?- add(s(s(0)), s(0), R).
Which results in:
R = s(s(s(0))).
Step 1 is the match with rule 2. Now X becomes s(0) and Y is still s(0). However Z (according to the book) becomes s(_G648), or s() with an uninstantiated variable inside it. Why is this?
On the final step the 1st rule is matched, which ends the recursion. Here the contents of Y somehow end up in the uninstantiated part of what was Z! Very confusing; I need a plain-English explanation.
First premises:
We have s(X) defined as the successor of X so basically s(X) = X+1
The _G### notation is used in the trace for internal variables used for the recursion
Let's first look at another definition of addition with successors that I find more intuitive:
add(0,Y,Y).
add(s(A),B,C) :- add(A,s(B),C).
This does basically the same, but the recursion is easier to see:
we ask
add(s(s(0)),s(0),R).
Now in the first step Prolog says that's equivalent to
add(s(0),s(s(0)),R)
because we have add(s(A),B,C) :- add(A,s(B),C), and looking at the query, A = s(0) and B = s(0). But this still doesn't terminate, so we have to reapply that equivalence with A = 0 and B = s(s(0)), and it becomes
add(0,s(s(s(0))),R)
which, given add(0,Y,Y), means that
R = s(s(s(0)))
Your definition of add basically does the same but with two recursions:
First it runs the first argument down to 0 so it comes down to add(0,Y,Y):
add(s(s(0)),s(0),R)
with X=s(0), Y = s(0) and s(Z) = R and Z = _G001
add(s(0),s(0),_G001)
with X = 0, Y = s(0) and _G001 = s(Z) with Z = _G002, so R = s(_G001) = s(s(_G002))
add(0,s(0),_G002)
So now it knows that _G002 is s(0), from the definition add(0,Y,Y), but it has to trace its steps back: _G001 is s(_G002), and R is s(_G001), which is s(s(_G002)), which is s(s(s(0))).
So the point is: in order to get to the definition add(0,Y,Y), Prolog has to introduce internal variables in a first recursion, from which R is then evaluated in a second one.
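If it helps, here is a rough Python analogue (my sketch, with successor numerals as nested tuples): the book's definition fills in its result on the way back up, which is exactly the role of the s(_G648)-style holes, while the accumulator variant shown at the top of this answer builds the result on the way down.
ZERO = '0'
def s(n):
    return ('s', n)

def add_book(x, y):
    # add(0,Y,Y).  /  add(s(X),Y,s(Z)) :- add(X,Y,Z).
    if x == ZERO:
        return y
    _, x1 = x
    # the outer s(...) is the hole s(Z): it is only filled in
    # after the recursive call returns
    return s(add_book(x1, y))

def add_acc(x, y):
    # add(0,Y,Y).  /  add(s(X),Y,Z) :- add(X,s(Y),Z).
    if x == ZERO:
        return y          # the accumulator already holds the sum
    _, x1 = x
    return add_acc(x1, s(y))

print(add_book(s(s(ZERO)), s(ZERO)))   # ('s', ('s', ('s', '0')))
print(add_acc(s(s(ZERO)), s(ZERO)))    # same value, no holes needed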
If you want to understand the meaning of a Prolog program, you might concentrate first on what the relation describes. Then you might want to understand its termination properties.
If you go into the very details of a concrete execution, as your question suggests, you will soon be lost in the multiplicity of details. After all, Prolog has two different interlaced control flows (AND- and OR-control), and in addition to that it has unification, which subsumes parameter passing, assignment, comparison, and equation solving.
In brief: while computers execute a concrete query effortlessly for zillions of inferences, you will get tired after a screenful of them. You can't beat computers at that. Fortunately, there are better ways to understand a program.
For the meaning, look at the rule first. It reads:
add(s(X),Y,s(Z)) :- add(X,Y,Z).
See the :- in between? It is meant to symbolize an arrow. It is a bit unusual that the arrow points from right to left; in informal writing you would rather write it left to right. Read it as follows:
Provided add(X,Y,Z) is true, then also add(s(X),Y,s(Z)) is true.
So we assume that we have already some add(X,Y,Z) meaning "X+Y=Z". And given that, we can conclude that also "(X+1)+Y=(Z+1)" holds.
After that you might be interested in understanding its termination properties. Let me make this very brief: to understand them, it suffices to look at the rule. The 2nd argument is only handed further on; therefore, the second argument does not influence termination. And both the 1st and 3rd arguments look the same; therefore, they both influence termination in the same manner!
In fact, add/3 terminates, if either the 1st or the 3rd argument will not unify with s(_).
Find more about it in other answers tagged failure-slice, like:
Prolog successor notation yields incomplete result and infinite loop
But now to answer your question for add(s(s(0)), s(0), R): I only look at the first argument. Yes! This will terminate. That's it.
Let's divide the problem into three parts: the issues concerning the instantiation of variables, the accumulator pattern, which I use in this variation of the example:
add(0,Y,Y).
add(s(X),Y,Z):-add(X,s(Y),Z).
and, finally, a comment about your example, which uses composition of substitutions.
To see which rule (i.e. Horn clause) matches (i.e. whose head unifies), Prolog applies the unification algorithm, which tells us, in particular, that a variable, say X, and a functor term, say f(Y), unify (there is a small part about the occurs check to... check, but never mind that for now): there is a substitution that converts one into the other.
When your second rule is called, R indeed gets unified with s(Z). Do not be scared by the internal representation Prolog gives to new, uninstantiated variables: a name starting with '_' is simply how Prolog writes the fresh variables it must constantly generate (_G648, _G649, _G650 and so on).
When you call a Prolog procedure, the uninstantiated parameters you pass (R in this case) are used to contain the result as the procedure completes its execution; at some point during the call they get instantiated to something (always through unification).
If at some point a variable, say K, is instantiated to s(H) (or s(_G567) if you prefer), it is still only partially instantiated, and to obtain your complete output you need H to be recursively instantiated as well.
To see what it will be instantiated to, have a read of the accumulator-pattern paragraph and the one after it: two ways to deal with the problem.
The accumulator pattern is taken from functional programming and, in short, is a way to have a variable, the accumulator (in my case Y itself), that has the burden of carrying the partial computations between procedure calls. The pattern relies on recursion and has roughly this form:
The base step of the recursion (i.e. my first rule) always says that, since you have reached the end of the computation, you can copy the partial result (now total) from your accumulator variable to your output variable (this is the step in which, through unification, your output variable gets instantiated!).
The recursive step tells how to create a partial result and how to store it in the accumulator variable (in my case I 'increment' Y). Note that in the recursive step the output variable is never changed.
Finally, your example follows another pattern, composition of substitutions, which I think you can understand better having thought about accumulators and instantiation via unification.
Its base step is the same as in the accumulator pattern, but in the recursive step Y never changes while Z does.
It unifies the variable inside Z with Y by completing the partial results at the end of each recursive call, once the base step has been reached and the procedure calls are returning. So by the end of the outermost call, the innermost free variable in Z has been replaced, through repeated unification, by the value coming from Y.
Note in the trace below: after the bottom call has been reached, the procedure call stack starts to pop and the partial variables (S1, S2, S3 for simplicity) get unified until R is fully instantiated.
Here is the stack trace:
add(s(s(s(0))),s(0),S1).   ^   S1 = s(S2) = s(s(s(s(0))))
add(  s(s(0)) ,s(0),S2).   |   S2 = s(S3) = s(s(s(0)))
add(    s(0)  ,s(0),S3).   |   S3 = s(S4) = s(s(0))
add(     0    ,s(0),S4).   |   S4 = s(0)
add(     0    ,s(0),s(0)). ____|

OTTER inferences

I'm writing an input file for OTTER that is very simple:
set(auto).
formula_list(usable).
all x y ([Nipah(x) & Encephalitis(y)] -> Causes(x,y)).
exists x y (Nipah(x) & Encephalitis(y)).
end_of_list.
I get this output for the search:
given clause #1: (wt=2) 2 [] Nipah($c2).
given clause #2: (wt=2) 2 [] Encephalitis($c1).
search stopped because sos empty
Why won't OTTER infer Causes($c2,$c1)?
EDIT:
I removed the square brackets from [Nipah(x) & Encephalitis(y)] and it worked. Why does this matter?
I'd answer with a question: Why did you use square brackets in the first place?
Look into the Otter manual, Section 4.3, List Notation. Square brackets are used for lists; it's syntactic sugar that is expanded into special terms. In your case, it expanded to something like
all x y ($cons(Nipah(x) & Encephalitis(y), $nil) -> Causes(x,y)).
Why won't OTTER infer Causes($c2,$c1)?
Note that the resolution calculus is not complete in the sense that every formula provable in a given theory could be inferred by the calculus. This would be highly undesirable! Instead, resolution is only refutationally complete, meaning that if a given theory is contradictory, then resolution will find a proof of the empty clause. So even if a clause C is a logical consequence of a set of clauses T, that doesn't mean the resolution calculus can derive C from T. In your case, the fact that Causes($c2,$c1) follows from the input doesn't mean Otter has to derive it. The usual way to make Otter "derive" such a fact is to add the negated goal (a clause like -Causes(x,y)) to the input and let the prover refute the resulting set.
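To see refutational completeness in action, here is a toy ground-resolution loop in Python (a sketch of mine, not Otter's algorithm; predicate names shortened). Once the negated goal is added, saturation derives the empty clause:
from itertools import combinations

def resolvents(c1, c2):
    # all binary resolvents of two ground clauses
    out = []
    for (sign, atom) in c1:
        if (not sign, atom) in c2:
            out.append((c1 - {(sign, atom)}) | (c2 - {(not sign, atom)}))
    return out

clauses = {
    # clause form of Nipah(x) & Enceph(y) -> Causes(x,y), instantiated
    frozenset({(False, 'Nipah(c2)'), (False, 'Enceph(c1)'), (True, 'Causes(c2,c1)')}),
    frozenset({(True, 'Nipah(c2)')}),
    frozenset({(True, 'Enceph(c1)')}),
    frozenset({(False, 'Causes(c2,c1)')}),   # the negated goal
}

changed = True
while changed:                     # saturate under resolution
    changed = False
    for x, y in combinations(list(clauses), 2):
        for r in resolvents(x, y):
            if r not in clauses:
                clauses.add(r)
                changed = True
print(frozenset() in clauses)      # True: empty clause found, goal refuted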
