Why is the introduction of the Y combinator in λ-calculus necessary? - lambda-calculus

I am reading a book on λ-calculus, "Functional Programming Through Lambda Calculus" (Greg Michaelson). In the book the author introduces a shorthand notation for defining functions. For example:
def identity = λx.x
and goes on to say that we should insist that when using such shorthand "all defined names should be replaced by their definitions before the expression is evaluated".
Later on, when introducing recursion, he uses as an example a definition of the addition function:
def add x y = if iszero y then x else add (succ x) (pred y)
and goes on to say that, had we not had the restriction mentioned above, we would be able to evaluate this function by slowly expanding it. However, since we have the restriction of replacing all defined names before the expression is evaluated, we cannot do that: we would go on indefinitely replacing add. Hence the need to think about recursion in a more detailed way.
My question is thus the following: what are the theoretical or practical reasons for placing this restriction upon ourselves (of having to replace all defined names before the expression is evaluated)? Are there any?

I was trying to show how to build a rich language from a very simple one, by adding successive layers of syntax, where each layer could be translated into the previous one. So it's important to distinguish translation, which must terminate, from evaluation, which needn't. I think it's really interesting that recursion can be translated into non-recursion. I'm sorry if my explanation isn't helpful.

The reason is that we want to stay within the rules of the lambda calculus. Allowing names for terms to mean anything other than immediate substitution would amount to adding a recursive let expression to the language, which would require a genuinely more expressive system (no longer the plain lambda calculus).
You can think of the names as no more than syntactic sugar for the original lambda term. The Y combinator is exactly the way to introduce recursion into a system that does not have it built in.
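For reference, here it is in the book's def shorthand (this is the standard construction, not something specific to any one text):

def Y = λf.(λs.f (s s)) (λs.f (s s))

It satisfies Y f = f (Y f), so each application unfolds one more level of the recursion, without any defined name ever referring to itself.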
If the book you are currently reading confuses you, you might want to search for some additional resources on the internet explaining the Y combinator.

I will try to post my own answer, the way I understand it.
For the untyped lambda calculus there is no practical reason why we need the Y combinator. By practical I mean that if someone wants to build an expression evaluator, it is possible to do it without the combinator, by just slowly expanding the definition.
For theoretical reasons, though, we need to make sure that when we define a function, the definition has some meaning and is not merely defined in terms of itself. E.g. there is not much meaning in the following definition:
def something = something
For this reason, we need to check whether it is possible to rewrite the definition in a way that is not self-referential, i.e. whether it is possible to define something without referring to itself. It turns out that in the untyped lambda calculus we can always do that, through the Y combinator.
Using the Y combinator we can always construct a solution to the equation x = f(x) = f(f(x)) = ... = f(f(f(f(x)))) = ... for any f,
i.e. we can always rewrite a self-referential definition into a definition that does not include itself.
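To make that concrete with the book's add example, here is a sketch in the same def shorthand (ADD is a name I'm introducing for the non-recursive functional):

def ADD = λf.λx.λy. if iszero y then x else f (succ x) (pred y)
def add = Y ADD

With Y as defined in the earlier answer, add no longer appears in its own right-hand side, so every defined name can be substituted away before evaluation, exactly as the book's restriction demands.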

Related

Understanding the syntax of LISP

Although LISP has some of the simplest syntax I've seen, I am still confused about the fundamentals. I've done research, and I've come to the conclusion that there are two datatypes: "atoms" and lists. However, I have also come across the term "S-expression", which seems to describe both atoms and lists. So, what exactly is an S-expression? Is it a datatype?
In addition, I am not sure how to distinguish function calls from data lists in LISP. For example, (1 2 3) is a list, while (f 2 3) could be some function. But how am I supposed to know whether f is a function name or some datatype? Since lists and functions use the same syntax, I am not sure how to differentiate between the two.
Finally, and most importantly, I need a mental model for thinking about how LISP works. For example, what are the fundamental datatypes? What are the built-in procedures used to do things with the fundamental datatypes? How can we see data and procedures as distinct? For example, in Java, instance variables at the top of classes are used to represent data, while methods are the procedures that manipulate the data. What does this look like in LISP?
(I'm new, so I'm not sure if this question is too broad or not)
I second the recommendations for LispBook and Practical Common Lisp. Both great books. Once you have understood the basics, I really, really recommend Paul Graham's "On Lisp".
To guide you toward answers on your specific questions:
Data types: Lisp has a rich set of data types (numeric types such as integer, rational and float; character and string; arrays and hash-tables; plenty more), but in my opinion, assuming you are already comfortable with integer and string, you should start by reading up on symbol and cons.
symbol: Realise that a symbol is both an identifier and a value. Understand that each symbol can "point to" one data value and, at the same time, "point to" one function (and has three other properties that you don't need to worry about yet)
cons: Discover that this magical thing called a 'cons cell' is just a struct with two pointers. One called "the Address part of the Register" and the other called "the Decrement part of the register". Don't worry about what those names mean (they're no longer relevant), just know that the car function returns the Contents of the Address part of the Register, and cdr returns the Decrement part. Now forget all of that and just remember that modern Lispers think of the cons cell as a struct with two pointers, one called the car and the other called the cdr.
Lists: "There is no spoon." There is no list either. Realise that what we think of as a list is nothing more than a set of cons cells, with the car of each cons pointing to one member of the list, and the cdr of each cons pointing to the next cons (except for the last cons, whose cdr is "the null pointer" nil). This is vital to understand list manipulation, nested lists, tree structures, etc.
Atoms: For now, think of an atom as a number, a character, a string or a symbol. That's enough to get you started; you can dig deeper into the details later.
S-expressions: An s-expr is defined as "either an atom, or a cons cell pointing to two s-expressions". Don't worry about what is or isn't an s-expr for now, but check out Wikipedia for a brief intro.
Lists vs function calls: Man, oh, man! I struggled with this when I got started! But it's actually quite easy with a bit of practice: "Five times two" is just an English phrase, right? But if I asked you to "evaluate" it, you would probably say "ten". "1 + 2" is a mathematical expression, but a mathematician would evaluate it to 3. "5" is just a number, but if you type it into a calculator and press "=", the calculator will evaluate it to the answer 5. In Lisp, (+ 1 2) is just a list. It is a list containing the symbol +, the number 1 and the number 2. BUT - if you ask Lisp to evaluate that list, it will call the function +, passing it 1 and 2 as parameters. A large part of getting comfortable with Lisp is learning when and where s-expressions are evaluated, and where they aren't. For now: anything you type into the REPL will be evaluated; parameters to function calls will be evaluated; parameters to macro calls will probably not be evaluated; you can use quote to prevent evaluation. This will become easier with practice.
Essential functions: Although I said the "list" doesn't really exist ("It's conses all the way down, Mr. Fry"), learn the list-based stuff first - car, cdr, cons, list, append, reverse - and the applicative stuff: mapcar, mapcan, apply. These are a great way to start thinking in both lists and functional programming.
Classes, member data and methods: I recommend leaving object orientation for later. First learn the basics of Lisp. Fight the urge to worry about data encapsulation, access control and polymorphism until you have become friends with the language itself. When you are ready, read "On Lisp"; Chapter 25 will walk you through adding object orientation to Lisp yourself, showing how Lisp really is the programmable programming language. The book will then introduce you to CLOS, the standard object-orientation system built into Common Lisp. Get to know CLOS, but certainly browse around for other OO libraries. Lisp is the only language I know where you can actually choose how you want the language to implement OO.
I'm going to stop here. Get comfortable with the first 8 concepts above and a strong mental model will have sorted itself out.
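A minimal REPL sketch of the cons/list point above (any Common Lisp; the > is the prompt):

> (cons 1 (cons 2 (cons 3 nil)))  ; three chained cons cells...
(1 2 3)                           ; ...print as a list
> (cons 1 2)                      ; a lone cons cell prints as a dotted pair
(1 . 2)
> (car '(1 2 3))
1
> (cdr '(1 2 3))
(2 3)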
Everything that is self-evaluating is data, e.g. 2, "hello".
Everything that is quoted is data, e.g. '(f 3 4), 'test, or the longhand version (quote test).
Everything else that is fed to the REPL needs to be code. E.g. (f 3 4) is code. It's an s-expression, indistinguishable from the quoted data above, but it isn't quoted, so it has to be code.
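A quick transcript of those three rules (standard Common Lisp, which prints symbols in uppercase; the > is the prompt):

> 2            ; self-evaluating: data
2
> "hello"
"hello"
> '(f 3 4)     ; quoted: data
(F 3 4)
> (+ 3 4)      ; an unquoted list: code, so the function + is called
7
> (f 3 4)      ; also code - an error unless f names a function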
There are some special forms like if, let, lambda, defun, ... that you just need to learn how they work. How do you know that if isn't a method in a C dialect like Java or C#? You just have to know them by heart.
You also need to know some essential functions. Usually they are introduced in each and every tutorial together with the special forms. I recommend you read Land of Lisp and Practical Common Lisp for Common Lisp, and Realm of Racket and How to Design Programs for Racket. For pure Scheme I recommend the wizard book (SICP). Don't do them all at the same time; learning the other ones is easy once you have learned one well enough.
Now, learning a Lisp is more difficult than learning another C dialect. That's because it is a totally different language with different ways to do stuff. E.g. you won't find for or while loops, and variables don't have types, but the objects they refer to do. You need to learn how to program almost as if you didn't know a C dialect. (Actually, knowing a C dialect might make the learning harder.)
Good luck!
An "S-expression" is 'an atom, or a list of S-expressions' (there are some more complications here, I'll skip those for now, basically boiling down to something called "read macros", where a difference between the textual representation and the internal representation can be done, to simplify human writing).
(1 2 3) is a list. But, (f 1 2 3) is also a list. However, in some (most?) circumstances, it can also be a function call (rather than a function).
I think you mostly have the syntax down; the rest is semantics (in a very technical sense). At this point, I would point you at some good reading material; there are plenty of good books around (I started with an earlier edition of Winston and Horn's "Lisp").

Implementing arithmetic for Prolog

I'm implementing a Prolog interpreter, and I'd like to include some built-in mathematical functions (sum, product, etc). For example, I would like to be able to make calculations using knowledge bases like this one:
NetForce(F) :- Mass(M), Acceleration(A), Product(M, A, F)
Mass(10) :- []
Acceleration(12) :- []
So then I should be able to make queries like ?NetForce(X). My question is: what is the right way to build functionality like this into my interpreter?
In particular, the problem I'm encountering is that, in order to evaluate Sum, Product, etc., all their arguments have to be evaluated (i.e. bound to numerical constants) first. For example, while the code above should evaluate properly, the permuted rule:
NetForce(F) :- Product(M, A, F), Mass(M), Acceleration(A)
wouldn't, because M and A aren't bound when the Product term is processed. My current approach is to simply reorder the terms so that mathematical expressions appear last. This works in simple cases, but it seems hacky, and I would expect problems to arise in situations with multiple mathematical terms, or with recursion. Is there a better solution?
The functionality you are describing exists in existing systems as constraint extensions. There is CLP(Q) over the rationals, CLP(R) over the reals (actually floats), and last but not least CLP(FD), which is often extended to a CLP(Z). See for example
library(clpfd).
In any case, starting a Prolog implementation from scratch will be a non-trivial effort; you will have no time to investigate what you want to implement, because you will be inundated by much lower-level details. So you will have to take a more economical approach and clarify what you actually want to do.
You might study and implement constraint languages in existing systems. Or you might want to use a meta-interpreter based approach. Or maybe you want to implement a Prolog system from scratch. But don't expect that you succeed in all of it.
And to save you another effort: Reuse existing standard syntax. The syntax you use would require you to build an extra parser.
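For illustration, here is the question's knowledge base in standard Prolog syntax (a sketch; net_force/1 and the other lowercase names are just renamings of the original predicates):

mass(10).
acceleration(12).
net_force(F) :- mass(M), acceleration(A), F is M * A.

?- net_force(F).
F = 120.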
You could use coroutining to delay the evaluation of the product:
product(X, A, B) :- freeze(A, freeze(B, X is A*B)).
freeze/2 delays the evaluation of its second argument until its first argument is bound. Nested like this, it only evaluates X is A*B after both A and B are bound to actual terms.
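With that definition, in a system that provides freeze/2 (SWI-Prolog, among others), the goal order from the question no longer matters:

?- product(X, A, B), A = 10, B = 12.
X = 120,
A = 10,
B = 12.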
(Disclaimer: I'm not an expert on advanced Prolog topics, there might be an even simpler way to do this - e.g. I think SICStus Prolog has "block declarations" which do pretty much the same thing in a more concise way and generalized over all declarations of the predicate.)
Your predicates would not be clause order independent, which is pretty important. You need to determine usage modes of your predicates - what will the usage mode of NetForce() be? If I were designing a predicate like Force, I would do something like
force(Mass,Acceleration,Force):- Force is Mass * Acceleration.
This has a usage mode of +,+,- meaning you give me Mass and Acceleration and I will give you the Force.
Otherwise, you are depending on the facts you have defined to unify your variables, and if you pass them to Product first they will continue to unify and unify and you will never stop.
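Usage, respecting the +,+,- mode:

?- force(10, 12, F).
F = 120.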

Theorem Proof Using Prolog

How can I write theorem proofs using Prolog?
I have tried to write it like this:
parallel(X,Y) :-
perpendicular(X,Z),
perpendicular(Y,Z),
X \== Y,
!.
perpendicular(X,Y) :-
perpendicular(X,Z),
parallel(Z,Y),
!.
Can you help me?
I was reluctant to post an Answer because this Question is poorly framed. Thanks to theJollySin for adding clean formatting! Something omitted in the rewrite, indicative of what Aman had in mind, was "I inter in Loop" (sic).
We don't know what query was entered that resulted in this looping, so speculation is required. The two rules suggest that Goal involved either the parallel/2 or the perpendicular/2 predicate.
With practice it's not hard to understand what the Prolog engine will do when a query is posed, especially a single goal query. Prolog uses a pretty simple "follow your nose" strategy in attempting to satisfy a goal. Look for the rules for whichever predicate is invoked. Then see if any of those rules, starting with the first and going down in the list of them, can be applied.
There are three topics that beginning Prolog programmers will typically struggle with. One is the recursive nature of the search the Prolog engine makes. Here the only rule for parallel/2 has a right-hand side that invokes two subgoals for perpendicular/2, while the only rule for perpendicular/2 invokes both a subgoal for itself and another subgoal for parallel/2. One should expect that trying to satisfy either kind of query inevitably leads to a Hydra-like struggle with bifurcating heads.
The second topic we see in this example is the use of free variables. If we are to gain knowledge about perpendicularity or parallelism of two specific lines (geometry), then somehow the query or the rules need to provide "binding" of variables to "ground" terms. Again without the actual Goal being queried, it's hard to guess how Aman expected that to work. Perhaps there should have been "facts" supplied about specific lines that are perpendicular or parallel. Lines could be represented merely as atoms (perhaps lowercase letters), but Prolog variables are names that begin with an uppercase letter (as in the two given rules) or with an underscore (_) character.
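For instance (hypothetical facts; the line names l1, l2 and l3 are mine), ground facts placed before the two recursive rules give the engine something to succeed on before the looping clauses are tried:

perpendicular(l1, l2).
perpendicular(l3, l2).

?- parallel(l1, l3).
true.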
Finally the third topic that can be quite confusing is how Prolog handles negation. There's only a touch of that in these rules, the place where X \== Y is invoked. But even that brief subgoal requires careful understanding. Prolog implements "negation as failure", so that X \== Y succeeds if and only if X == Y does not succeed. This latter goal is also subtle, because it asks whether X and Y are the same without trying to do any unification. Thus if these are different variables, both free, then X == Y fails (and X \== Y succeeds). On the other hand, the only way for X == Y to succeed (and thus for X \== Y to fail) would be if both variables were bound to the same ground term. As discussed above the two rules as stated don't provide a way for that to be the case, though something might have taken care of this in the query Goal.
The homework assignment for Aman is to learn about these Prolog topics:
recursion
free and bound variables
negation
Perhaps more concrete suggestions can then be made about Prolog doing geometry proofs!
Added: PTTP (Prolog Technology Theorem Prover) was written by M.E. Stickel in the late 1980's, and this 2006 web page describes it and links to a download.
It also summarizes succinctly why Prolog alone is not "a full general-purpose theorem-proving system." Pointers to later, more capable theorem provers can be followed there as well.

Better to use "and" or "in" when chaining "let" statements?

I realize this is probably a silly question, but...
If I'm chaining a bunch of let statements which do not need to know each other's values, is it better to use and or in?
For example, which of these is preferable, if any:
let a = "foo"
and b = "bar"
and c = "baz"
in
(* etc. *)
or
let a = "foo" in
let b = "bar" in
let c = "baz"
in
(* etc. *)
My intuition tells me the former ought to be "better" (by a very petty definition of "better") because it creates the minimum number of scopes necessary, whereas the latter is a scope-within-a-scope-within-a-scope which the compiler/interpreter takes care to note but is ultimately unimportant and needlessly deep.
My opinion is that in is better. The use of and implies that the definitions are mutually dependent on each other. I think it is better to be clear that this is not the case. On the other hand, some OCaml programmers do prefer and for very short definitions, where the slightly more compact notation can appear cleaner. This is especially true when you can fit the definitions on a single line:
let a = "foo" and b = "bar" in
The answer to the question "which is better?" only really makes sense when the interpretations do not differ. The ML family of languages (at least SML and OCaml) have both a parallel initialization form (and) and a sequential, essentially nested-scope form (in) because they are both useful in certain situations.
In your case the semantics are the same, so you are left to answer the question "what reads best to you?" and maybe (if this is not premature) "which might be executed more efficiently?" In your case the alternatives are:
For the and case: evaluate a bunch of strings and do a parallel binding to three identifiers.
For the in case: bind foo to a, then bind bar to b, then bind baz to c.
Which reads better? It is a toss up in this case because the outcome does not matter. You can poll some English speakers but I doubt there will be much preference. Perhaps a majority will like and as it separates bindings leaving the sole in before the expression.
As to what executes more efficiently, I think a good compiler will likely produce the same code because it can detect the order of binding will not matter. But perhaps you have a compiler that generates code for a multicore processor that does better with and. Or maybe a naive compiler which writes all the RHSs into temporary storage and then binds them, making the and case slower.
These are likely to be non-essential optimization concerns, especially since they involve bindings and are probably not in tight loops.
Bottom line: it's a true toss-up in this case; just make sure to use parallel vs. sequential correctly when the outcome does matter.
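A small OCaml sketch of the difference, for when the outcome does matter:

let () =
  (* sequential: each "in" opens a new scope, so b sees the a just bound *)
  let a = 1 in
  let b = a + 1 in
  Printf.printf "%d %d\n" a b   (* prints: 1 2 *)

let () =
  (* parallel: the right-hand sides are evaluated before either binding takes
     effect, so "and b = a + 1" here would be an error (or would capture some
     outer a, if one were in scope) *)
  let a = 1 and b = 2 in
  Printf.printf "%d %d\n" a b   (* prints: 1 2 *)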
I would say in is better because it reduces scope and better expresses intent. If I see all these definitions chained together in a shared-scope manner, I would be under the impression that it was done for a reason and would be looking for how they affect each other.
It's the same as the difference between let and let* in Lisp, I believe.
let* (similar in functionality to the in..in.. structure - the second structure in your question) is sometimes used to hide imperative programming since it guarantees sequential execution of expressions (see what Paul Graham had to say about let* in On Lisp).
So, I'd say the former construct is better. But the truth is, I think the latter is more common in the OCaml programs I have seen. Probably just easier to use when building on previously named expressions.

Preventing evaluation of Mathematica expressions

In a recent SO question, three different answers were supplied, each using a different method of preventing the evaluation of the Equal[] expression. They were:
Defer[]
Unevaluated[]
HoldForm[]
Sometimes I still have trouble choosing between these options (and judging by the answers to the aforementioned question, the choice isn't always clear for other people either). Can someone write a clear exposition on the use of these three methods?
There are three other wrappers
Hold[],
HoldPattern[],
HoldComplete[],
and the various Attributes for functions
HoldAll, HoldFirst, HoldRest and the numeric versions NHold* that can also be discussed if you wish!
Edit
I just noticed that this is basically a repeat of the old question (which I had already upvoted, just forgotten...). The accepted answer linked to this talk at the 1999 Mathematica Developer Conference, which doesn't discuss Defer since it is "New in 6". Defer is more closely linked to the front end than the other evaluation control mechanisms. It is used to create an unevaluated output that will be evaluated if supplied in an Input expression. To quote the Documentation Center:
Defer[expr] returns an object which remains unchanged until it is explicitly supplied as Mathematica input, and evaluated using Shift+Enter, Evaluate in Place, etc.
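A quick illustration (the output shown is what a session would print):

In[1]:= Defer[1 + 1]
Out[1]= 1 + 1

The output cell holds 1 + 1 literally; it evaluates to 2 only when that output is re-entered as input.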
Not touching Defer, since I did not work much with it and feel that in any given case its behavior can be reproduced by the other mentioned wrappers, and discussing Hold instead of HoldForm (the difference is really in the way they are printed), here is the link to a MathGroup post where I gave a rather extensive explanation of the differences between Hold and Unevaluated, including differences in usage and in the evaluation process (my second and third posts in particular).
To make a long story short, Hold is used to preserve an expression unevaluated between several evaluations (for an indefinite time, until we need it). It is a visible wrapper, in the sense that, say, Depth[Hold[{1,2,3}]] is not the same as Depth[{1,2,3}] (this is of course a consequence of evaluation), and it is generally nothing special - just a wrapper with the HoldAll attribute like any other, except for being the "official" holding wrapper, integrated much better with the rest of the system, since many system functions use or expect it.
OTOH, Unevaluated[expr] is used to temporarily, just once, make up for a missing Hold* attribute on a function enclosing the expression expr. While it results in the behavior the enclosing function would show if it held expr with a Hold* attribute, Unevaluated belongs to the argument, and works only once, for a single evaluation, since it gets stripped in the process. Also, because it gets stripped, it is often invisible to the surrounding wrappers, unlike Hold. Finally, it is one of very few "magic symbols", along with Sequence and Evaluate - these are deeply wired into the system and cannot be easily replicated or blocked, unlike Hold - in that sense, Unevaluated is more fundamental.
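A standard illustration of the difference (Length does not hold its arguments):

In[1]:= {Length[2 + 2], Length[Unevaluated[2 + 2]], Length[Hold[2 + 2]]}
Out[1]= {0, 2, 1}

In the first case 2 + 2 evaluates to 4 before Length sees it (and the Length of an atom is 0); Unevaluated is stripped, so Length sees the two-element expression 2 + 2; Hold stays visible, so Length sees a Hold expression with a single element.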
HoldComplete is used when one wants to prevent certain stages of the evaluation process which Hold does not prevent. This includes the splicing of sequences, for example:
In[25]:= {Hold[Sequence[1, 2]], HoldComplete[Sequence[1, 2]]}
Out[25]= {Hold[1, 2], HoldComplete[Sequence[1, 2]]}
the search for UpValues, for example:
In[26]:=
ClearAll[f];
f /: Hold[f[x_]] := f[x];
f[x_] := x^2;
In[29]:= {Hold[f[5]], HoldComplete[f[5]]}
Out[29]= {25, HoldComplete[f[5]]}
and immunity to Evaluate:
In[33]:=
ClearAll[f];
f[x_] := x^2;
In[35]:= {Hold[Evaluate[f[5]]], HoldComplete[Evaluate[f[5]]]}
Out[35]= {Hold[25], HoldComplete[Evaluate[f[5]]]}
In other words, it is used when you want to prevent any evaluation of the expression inside, whatsoever. Like Hold, HoldComplete is nothing special in the sense that it is just an "official" wrapper with HoldAllComplete attribute, and you can make your own which would behave similarly.
Finally, HoldPattern is a normal (usual) head with the HoldAll attribute for the purposes of evaluation, but its magic shows in pattern-matching: it is invisible to the pattern-matcher, and it is a very important ingredient of the language, since it allows the pattern-matcher to be consistent with the evaluation process. Whenever there is a danger that the pattern in some rule may evaluate, HoldPattern can be used to ensure that this won't happen, while the pattern remains the same for the pattern-matcher. One thing I'd stress here is that this is its only purpose. Often people use it also as an escape mechanism for the pattern-matcher, where Verbatim must be used instead. This works, but is conceptually wrong.
One very good account of the evaluation process and all these things is David Wagner's book Power Programming with Mathematica: The Kernel, which was written in 1996 for version 3, but most if not all of the discussion there remains valid today. It is out of print, alas, but you might have some luck on Amazon (as I had a few years ago).
Leonid Shifrin's answer is quite nice, but I wanted to touch on Defer, which is really useful for only one thing. In some instances, it's nice to be able to directly construct expressions that won't be evaluated, but that a user will be able to easily edit; the basic example of this kind of behavior is button palettes that you can use to insert expressions or expression templates into input cells, which the user can then edit as needed. This isn't the only way to do this, and for some more sophisticated applications you'll need to get into the hairy world of MakeBoxes, but for the basics Defer will serve nicely.