Why is "1 2 3" a valid Scheme expression? - scheme

This is valid Scheme
1 2 3
and returns 3. But this is not valid
(1 2 3)
and neither is this valid
(1) (2) (3)
It makes sense that the last two are not valid. But I don't see how the first one should be valid? Can someone please explain?

Probably, the REPL of your implementation is reading a non-empty sequence of expressions and evaluating all of them and giving the sequence of results. (I believe that this behavior of the REPL is implementation specific. R7RS does not mention it in its §5.7. It might vary with other implementations of Scheme and I never used it; I agree with coredump's answer that it may be a fancy and useful feature)
Scheme is able to return and handle several values, e.g. with call-with-values & values ; see also this
So technically 1 2 3 is not a single expression, but a sequence of three expressions. A call to (read) won't give you such a sequence. And on guile 2 on Linux (+ 2 (read)) 55 66 gives immediately two results 57 66, not three (after waiting for input).
Read also about call/cc, continuations, and CPS. There might be some indirect relation (how does your REPL deal internally with them when (read)ing expressions....)

The REPL is reading multiple forms in sequence (this is also how Common Lisp implementations work). Given that the user entered multiple forms, what else could happen?
displaying an error would be correct, but unhelpful
discarding previous/next forms would be confusing
IMHO the behavior of reading all the given forms as-if they were in an implicit progn is the most useful for the user. It is also consistent with how files are read, and allow you for example to paste the content of a file directly into the REPL.

The first one is valid because it is considered a simple expression. Scheme doesn't do anything with it and just echoes it back to you. This is true not only for numbers, but also for all constants including strings, symbols and even lists (if you quote them to create an actual list rather than a function call). For ex. if you type in '(1 2 3), it will be just echoed back to you without being interpreted.
The way scheme, and generally other lisps, evaluate expression can be described by two broad rules (and this is probably a naive interpretation):
If the entered expression is a constant (a string, number, symbol, list or something else), the constant itself is the value. Just echo it back.
If the expression is of the form (procedure arg1 ... arg-n), find the value of procedure, evaluate or find the values of all the arguments (arg1 to arg-n), and apply procedure to the arguments.
More details are available at Evaluating Scheme Expressions chapter in the the book The Scheme Programming Language 4e.

The key observation is that it is not allowed to put in extra parenthesis in Scheme. In (+ 1 2) the expression + is evaluated and the result is a plus-function. That function is applied to 1 2 and you get the result 3.
In your example (1) means evaluate 1 and the result is 1. Then apply 1 to no arguments. And since 1 is not a function you will get an error.
In the example (1 2 3) your system attempts to apply 1 to the arguments 2 and 3 and get an error.
In short: Parenthesis in arithmetics is used to group an operation - in Scheme it means function application.
Finally: 1 2 3 is not a single expression but three expressions. The last of which evaluates to 3.

Related

Why are the names of predicates in scheme in the form of questions?

Racket is the first dialect of scheme I am learning, and I’m not that far in, however due to scheme’s minimal syntax, I believe it’s safe to assume that a question mark in variable names is not treated by the interpreter any differently than any other viable character.
With that run on sentence out of the way, why does scheme use the symbol “?” to denote a function that returns true or false (called a predicate)? For example, in racket, there is a built in function called number?. number? returns true when applied to any number (1, 5, -5, 2.7, etc), and false otherwise. I believe that number? is short for something along the lines of is_the_following_argument_a_number?. Assuming that is true, the expression (number? 5) translates into (is_the_following_argument_a_number? 5).
In english (the language this variable was written in), the predicate of “is the following argument a number?” can be found by first translating the question into its statement form by moving the verb: “the following argument is a number”, and then extracting the predicate: “is a number”. Now, I’m not the best at speaking languages as I am at programming languages, but I believe that is correct. Also, sorry if this is turning into an english question more than a scheme question.
What I am having trouble understanding is the fact that if the lisp community calls number? a predicate, why is the variable name not a predicate in english (I say that the variable name isn’t a predicate in english, not the type of function it is in scheme isn’t a predicate) I found what I thought the predicate of what I thought number? translated into, as being “is a number”, not the entire question “is the following argument a number?”, just the predicate. So, why does the lisp community choose to name predicates in scheme as questions in english? I believe that this is because the community mistakes the values of statements (true or false) for the answers to yes/no questions (yes or no (obviously)). Am I wrong to think this?
A predicate in computer science doesn't have anything to do with a predicate in language grammar. They both derive from having to do with thruth but otherwise they are unrelated concepts. A predicate in Scheme is a procedure that checks if something is true or not and in reality it can have any name. However since we can code information in the name it should contain to the point what it is about, which can be any word or even sentence delimited by hyphens, ending with question mark to indicate that it is indeed a predicate procedure. Both the name in the definition and the usage will stand out to the reader so that they know it without looking at the documentation or the implementation.
Scheme predicates in the very first Scheme report and the second looked like Common Lisp and the predicates in Scheme followed the same naming convention as Common Lisp has today. Old procedures that were in LISP 1.5 has the same name without the common p-ending while new introduced ones had it, like procp (called procedure? today). The reason for this is that Scheme run under MacLisp and borrowed all the dull stuff from it while it was lexical closures that were the magic of Scheme. Actually, it looked a lot looked like Common Lisp.
In the RRRS or R2RS they made all predicated end with ? and it worked with eq? and friends but the arithmetic predicates that used symbols, like <?, =?, <=?, etc, was not a success and were removed in the R3RS.
In a conditional we call the parts predicate, consequence and alternative:
(if (< a 0) ; predicate
(- a) ; consequent
a) ; alternative
Here a predicate is just an expression that either turns true or false. Actually all Scheme values are allowed and only #f is false. A predicate procedure is a procedure that always either returns #t or #f and it is as you are writing that number? check whether the argument is a number and string=? checks whether two arguments are strings that look the same. The pattern is very good and you can imagine what it does just by looking at the name being used while keeping the procedure names short. In speech we often do the same, like saying "coffee?" and getting either positive or negative response. It works most of the times and some times people need to spell it out that they are offering them a hot beverage whose name is coffee. In coding that means looking in the documentation or definition of a procedure.
There are other naming conventions in Scheme.
foo->bar is a procedure that takes an argument that is a foo type and it returns it as a bar type. number->string takes a number and makes a string representation of it. (number->string 5) ; ==> "5"
foo! may change the objects you pass it in order to do the job slightly faster than if it was named foo. set! and set-car! are examples.
*variable* are from CL but in Scheme you can be sure it is a global variable.
CONSTANT, +CONSTANT+, +constant+ are common naming for variables that are considered to be constants.
form* does something similar to what form does, but not quite. Special form let* does something similar to let but it binds one variable at a time.
The code works whether you follow these or not, but you are making it easier to read by using this convention and when you try to make a somparison procedure foo=? is just as easy to understand as are-these-two-foo-things-equal and foo? is just as easy as argument-is-a-foo.
Note that other programming languages also does this. In Java one write isFoo and equals so it's not spelled out there either.
It's just a programming convention. Predicates - meaning: those procedures that return true or false, are defined with a name that ends in a question mark. Similarly, Procedures that have side effects (e.g., that mutate state) are defined with a name that ends in exclamation mark.

Difference between multiple values and plain tuples in Racket?

What is the difference between values and list or cons in Racket or Scheme? When is it better to use one over the other? For example, what would be the disadvantage if quotient/remainder returns (cons _ _) rather than (values _ _)?
Back in 2002 George Caswell asked that question in comp.lang.scheme.
The ensuing thread is long, but has many insights. The discussion
reveals that opinions are divided.
https://groups.google.com/d/msg/comp.lang.scheme/ruhDvI9utVc/786ztruIUNYJ
My answer back then:
> What are the motivations behind Scheme's multiple return values feature?
> Is it meant to reflect the difference in intent, or is there a
> runtime-practical reason?
I imagine the reason being this.
Let's say that need f is called by g. g needs several values from f.
Without multiple value return, f packs the values in a list (or vector),
which is passed to g. g then immediately unpacks the list.
With multple values, the values are just pushed on the stack. Thus no
packing and unpacking is done.
Whether this should be called an optimization hack or not, is up to you.
--
Jens Axel Søgaard
We don't need no side-effecting We don't need no allocation
We don't need no flow control We don't need no special-nodes
No global variables for execution No dark bit-flipping for debugging
Hey! did you leave the args alone? Hey! did you leave those bits alone?
(Chorus) -- "Another Glitch in the Call", a la Pink Floyd
They are semantically the same in Scheme and Racket. In both you need to know how the return looks like to use it.
values is connected to call-with-values and special forms like let-values are just syntax sugar with this procedure call. The user needs to know the form of the result to use call-with-values to make use of the result. A return is often done on a stack and a call is also on a stack. The only reason to favor values in Scheme would be that there are no overhead between the producer return and the consumer call.
With cons (or list) the user needs to know how the data structure of the return looks like. As with values you can use apply instead of call-with-values to do the same thing. As a replacement for let-values (and more) it's easy to make a destructuring-bind macro.
In Common Lisp it's quite different. You can use values always if you have more information to give and the user can still use it as a normal procedure if she only wants to use the first value. Thus for CL you wouldn't need to supply quotient as a variant since quotient/remainder would work just as well. Only when you use special forms or procedures that take multiple values will the fact that the procedure does return more values work the same way as with Scheme. This makes values a better choice in CL than Scheme since you get away with writing one instead of more procedures.
In CL you can access a hash like this:
(gethash 'key *hash* 't)
; ==> T; NIL
If you don't use the second value returned you don't know if T was the default value or the actual value found. Here you see the second value indicating the key was not found in the hash. Often you don't use that value if you know there are only numbers the default value would already be an indication that the key was not found. In Racket:
(hash-ref hash 'key #t)
; ==> #t
In racket failure-result can be a thunk so you get by, but I bet it would return multiple values instead if values did work like in CL. I assume there is more housekeeping with the CL version and Scheme, being a minimalistic language, perhaps didn't want to give the implementors the extra work.
Edit: Missed Alexis' comment on the same topic before posting this
One oft-overlooked practical advantage of using multiple return values over lists is that Racket's compose "just works" with functions that return multiple values:
(define (hello-goodbye name)
(values (format "Hello ~a! " name)
(format "Goodbye ~a." name)))
(define short-conversation (compose string-append hello-goodbye))
> (short-conversation "John")
"Hello John! Goodbye John."
The function produced by compose will pass the two values returned by hello-goodbye as two arguments to string-append. If you're writing code in a functional style with lots of compositions, this is very handy, and it's much more natural than explicitly passing values around yourself with call-with-values and the like.
It's also related to your programming style. If you use values, then it usually means you want to explicitly return n values. Using cons, list or vector usually means you want to return one value which contains something.
There are always pros/cons. For values: It may use less memory on some implemenentations. The caller need to use let-values or other multiple values specific syntax. (I wish I could use just let like CL.)
For cons or other types: You can use let or lambda to receive the returning value. You need to explicitly deconstruct it to get the value you want using car or other procedures.
Which to use and when? Again depending on your programming style and case by case but if the returning value can't be represented in one object (e.g. quotient and remainder), then it might be better to use values to make the procedure's meaning clearer. If the returning value is one object (e.g. name and age for a person), then it might be better to use cons or other constructor (e.g. record).

What is the semantic difference between defining a name with and without parentheses?

(Though this is indeed simple question, I find sometimes it is common mistakes that I made when writing Scheme program as a beginner.)
I encountered some confusion about the define special form. A situation is like below:
(define num1
2)
(define (num2)
2)
I find it occurs quite often that I call num2 without the parentheses and program fails. I usually end up spending hours to find the cause.
By reading the r5rs, I realized that definition without parenthesis, e.g. num1, is a variable; while definition with parenthesis, e.g. num2, is a function without formal parameters.
However, I am still blurred about the difference between a "variable" and "function".
From a emacs lisp background, I can only relate above difference to similar idea as in emacs lisp:
In Emacs Lisp, a symbol can have a value attached to it just as it can
have a function definition attached to it.
[here]
Question: Is this a correct way of understanding the difference between enclosed and non-enclosed definitions in scheme?
There is no difference between a value and a function in Scheme. A function is just a value that can be used in a particular way - it can be called (as opposed to other kinds of value, such as numbers, which cannot be called, but can be e.g. added, which a function cannot).
The parentheses are just a syntactic shortcut - they're a faster, more readable (to experts) way of writing out the definition of the name as a variable containing a function:
(define (num)
2)
;is exactly the same as
(define num
(lambda () 2) )
The second of these should make it more visually obvious that the value being assigned to num is not a number.
If you wanted the function to take arguments, they would either go within the parentheses (after num, e.g. (num x y) in the first form, or within lambda's parentheses (e.g. (lambda (x y)... in the second.
Most tutorials for the total beginner actually don't introduce the first form for several exercises, in order to drive home the point that it isn't separate and doesn't really provide any true functionality on its own. It's just a shorthand to reduce the amount of repetition in your program's text.
In Scheme, all functions are values; variables hold any one value.
In Scheme, unlike Common Lisp and Emacs Lisp, there are no different namespaces for functions and other values. So the statement you quoted is not true for Scheme. In Scheme a symbol is associated with at most one value and that value may or may not be a function.
As to the difference between a non-function value and a nullary function returning that value: In your example the only difference is that, as you know, num2 must be applied to get the numeric value and num1 does not have to be and in fact can't be applied.
In general the difference between (define foo bar) and (define (foo) bar) is that the former evaluated bar right now and foo then refers to the value that bar has been evaluated to, whereas in the latter case bar is evaluated each time that (foo) is used. So if the expression foo is costly to calculate, that cost is paid when (and each time) you call the function, not at the definition. And, perhaps more importantly, if the expression has side effects (like, for example, printing something) those effects happen each time the function is called. Side effects are the primary reason you'd define a function without parameters.
Even though #sepp2k has answered the question, I will make it more clearer with example:
1 ]=> (define foo1 (display 23))
23
;Value: foo1
1 ]=> foo1
;Unspecified return value
Notice in the first one, foo1 is evaluated on the spot (hence it prints) and evaluated value is assigned to name foo1. It doesn't evaluate again and again
1 ]=> (define (foo2) (display 23))
;Value: foo2
1 ]=> foo2
;Value 11: #[compound-procedure 11 foo2]
1 ]=> (foo2)
23
;Unspecified return value
Just foo2 will return another procedure (which is (display 23)). Doing (foo2) actually evaluates it. And each time on being called, it re-evaluates again
1 ]=> (foo1)
;The object #!unspecific is not applicable.
foo1 is a name that refers a value. So Applying foo1 doesn't make sense as in this case that value is not a procedure.
So I hope things are clearer. In short, in your former case it is a name that refers to value returned by evaluating expression. In latter case, each time it is evaluated.

Pythonesque blocks and postfix expressions

In JavaScript,
f = function(x) {
return x + 1;
}
(5)
seems at a glance as though it should assign f the successor function, but actually assigns the value 6, because the lambda expression followed by parentheses is interpreted by the parser as a postfix expression, specifically a function call. Fortunately this is easy to fix:
f = function(x) {
return x + 1;
};
(5)
behaves as expected.
If Python allowed a block in a lambda expression, there would be a similar problem:
f = lambda(x):
return x + 1
(5)
but this time we can't solve it the same way because there are no semicolons. In practice Python avoids the problem by not allowing multiline lambda expressions, but I'm working on a language with indentation-based syntax where I do want multiline lambda and other expressions, so I'm trying to figure out how to avoid having a block parse as the start of a postfix expression. Thus far I'm thinking maybe each level of the recursive descent parser should have a parameter along the lines of 'we have already eaten a block in this statement so don't do postfix'.
Are there any existing languages that encounter this problem, and how do they solve it if so?
Python has semicolons. This is perfectly valid (though ugly and not recommended) Python code: f = lambda(x): x + 1; (5).
There are many other problems with multi-line lambdas in otherwise standard Python syntax though. It is completely incompatible with how Python handles indentation (whitespace in general, actually) inside expressions - it doesn't, and that's the complete opposite of what you want. You should read the numerous python-ideas thread about multi-line lambdas. It's somewhere between very hard to impossible.
If you want arbitrarily complex compound statements inside lambdas you can't use the existing rules for multi-line expressions even if you made all statements expressions. You'd have to change the indentation handling (see the language reference for how it works right now) so that expressions can also contain blocks. This is hard to do without breaking perfectly fine Python code, and will certainly result in a language many Python programmers will consider worse in several regards: Harder to understand, more complex to implement, permits some stupid errors, etc.
Most languages don't solve this exact problem at all. Most candidates (Scala, Ruby, Lisps, and variants of these three) have explicit end-of-block tokens. I know of two languages that have the same problem, one of which (Haskell) has been mentioned by another answer. Coffeescript also uses indentation without end-of-block tokens. It parses the transliteration of your example correctly. However, I could not find any specification of how or why it does this (and I won't dig through the parser source code). Both differ significantly from Python in syntax as well as design philosophy, so their solution is of little (if any) use for Python.
In Haskell, there is an implicit semicolon whenever you start a line with the same indentation as a previous one, assuming the parser is in a layout-sensitive mode.
More specifically, after a token is encountered that signals the start of a (layout-sensitive) block, the indentation level of the first token of the first block item is remembered. Each line that is indented more continues the current block item; each line that is indented the same starts a new block item, and the first line that is indented less implies the closure of the block.
How your last example would be treated depends on whether the f = is a block item in some block or not. If it is, then there will be an implicit semicolon between the lambda expression and the (5), since the latter is indented the same as the former. If it is not, then the (5) will be treated as continuing whatever block item the f = is a part of, making it an argument to the lamda function.
The details are a bit messier than this; look at the Haskell 2010 report.

Write a typeof procedure in Scheme

Help me answer the following question appears in Simply Scheme
6.7 Write a procedure type-of that takes anything as its argument and returns one of the words word, sentence, number, or boolean:
> (type-of '(getting better))
SENTENCE
> (type-of 'revolution)
WORD
> (type-of (= 3 3))
BOOLEAN
(Even though numbers are words, your procedure should return number if its argument is a number.)
You can use the form cond to check several conditions and execute an action accordingly. You can use the predicates boolean?, number?, word? and sentence?¹ to find out whether a value is a boolean, number, word or sentence respectively. That's basically all there is to it.
The only thing you need to consider is that the case for number? must come before the case for word? (because word? would also return true for numbers as the exercise helpfully points out).
¹ The first two are standard scheme, the latter two are defined in simply.scm, which comes with the book.

Resources