I am still having some troubles with this concept. The key paragraph in the r7rs standard is:
"Identifiers that appear in the template but are not pattern
variables or the identifier ellipsis are inserted into the output as literal identifiers. If a literal identifier is inserted as a
free identifier then it refers to the binding of that identifier
within whose scope the instance of syntax-rules appears.
If a literal identifier is inserted as a bound identifier then
it is in effect renamed to prevent inadvertent captures of
free identifiers."
By "bound identifier" am I right that it means any argument to a lambda, a top-level define or a syntax definition ie. define-syntax, let-syntax or let-rec-syntax? (I think I could handle internal defines with a trick at compile time converting them to lambdas.)
By "free identifier" does it mean any other identifier that presumably is defined beforehand with a "bound identifier" expression?
I wonder about the output of code like this:
(define x 42)
(define-syntax double syntax-rules ()
((_) ((lambda () (+ x x)))))
(set! x 3)
(double)
Should the result be 84 or 6?
What about this:
(define x 42)
(define-syntax double syntax-rules ()
((_) ((lambda () (+ x x)))))
(define (proc)
(define x 3)
(double))
(proc)
Am I right to suppose that since define-syntax occurs at top-level, all its free references refer to top-level variables that may or may not exist at the point of definition. So to avoid collisions with local variables at the point of use, we should rename the outputted free reference, say append a '%' to the name (and disallow the user to create symbols with % in them). As well as duplicate the reference to the top-level variable, this time with the % added.
If a macro is defined in some form of nested scope (with let-syntax or let-rec-syntax) this is even trickier if it refers to scoped variables. When there is a use of the macro it will have to expand these references to their form at point of definition of the macro rather than point of use. So I'm guessing the best way is expand it naturally and scan the result for lambdas, if it finds one, rename its arguments at point of definition, as the r7rs suggests. But what about references internal to this lambda, should we change these as well? This seems obvious but was not explicitly stated in the standard.
Also I'm still not sure whether it is best to have a separate expansion phase separate from the compiler, or to interweave expanding macros with compiling code.
Thanks, and excuse me if I've missed something obviously, relatively new to this.
Steve
In your first example, properly written:
(define x 42)
(define-syntax double
(syntax-rules ()
( (_) ((lambda () (+ x x))) ) ))
(set! x 3)
(double)
the only possibility is 6 as there is only one variable called x.
In your second example, properly written:
(define x 42)
(define-syntax double
(syntax-rules ()
((_) ((lambda () (+ x x))) )))
(define (proc)
(define x 3)
(double))
(proc)
the hygienic nature of the Scheme macro system prevents capture of the unrelated local x, so the result is 84.
In general, identifiers (like your x) within syntax-rules refer to what they lexically refer to (the global x in your case). And that binding will be preserved because of hygiene. Because of hygiene you do not have to worry about unintended capture.
Thanks, I think I understand... I still wonder how in certain advanced circumstances hygiene is achieved, eg. the following:
(define (myproc x)
(let-syntax ((double (syntax-rules ()
((double) (+ x x)))))
((lambda (x) (double)) 3)))
(myproc 42)
The site comes up with 84 rather than 6. I wonder how this (correct) referential transparency is achieved just by renaming. The transformer output does not bind new variables, yet still when it expands on line 4, we have to find a way to get to the desired x rather than the most recent.
The best way I can think of is simply rename every time a lambda argument or definition shadows another, ie. keep appending %1, %2 etc... macro outputs will have their exact versions named (eg. x%1) while references to identifiers simply have their unadorned name x and the correct variable is found at compile time.
Thanks, I hope for any clarification.
Steve
Related
I tried to evaluate this: (define lambda (lambda (x) x)). MIT Scheme 11.2 gives an error: ;Unbound variable: x. Chez Scheme 9.5 also gives an error: Exception: variable x is not bound. Why is x not bound? I thought that define would evaluate (lambda (x) x) into an anonymous function, and then define lambda to be that anonymous function. Where does x get involved?
I don't get any errors in Racket 7.2 and Guile 3.0.1.
This isn't valid Scheme code:
(define lambda (lambda (x) x))
R7RS seems pretty clear about forbidding this in Section 5.4 about syntax definitions:
However, it is an error for a definition to define an identifier whose binding has to be known in order to determine the meaning of the definition itself, or of any preceding definition that belongs to the same group of internal definitions.
In the posted code the identifer lambda is being redefined, but the binding of lambda itself must be known in order to determine the meaning of the new definition; the above language forbids this.
R6RS has some similar language in Chapter 10 about the expansion process, but the R6RS language is more tightly coupled to the technical details of the expansion process. I'm pretty sure that it applies in the same way to this case, but not 100% sure.
A definition in the sequence of forms must not define any identifier whose binding is used to determine the meaning of the undeferred portions of the definition or any definition that precedes it in the sequence of forms.
OP notes the error message Exception: variable x is not bound and asks: "Where does x get involved?"
Granting that (define lambda (lambda (x) x)) is malformed and thus not valid Scheme, it isn't too meaningful to try to explain such behaviors. Yet it seems that this sort of behavior can be triggered by attempting to redefine other syntactic keywords in similar fashion. Consider:
> (define if (if x y z))
Exception: variable z is not bound
It seems obvious here that z is unbound, so the error doesn't seem unreasonable. But now:
> (define if (if #t 1 2))
Exception: variable if is not bound
Even if is unbound in the right-hand expression of the define form! If we assume that something similar is happening in (define lambda (lambda (x) x)) then lambda is unbound in the right-hand expression, and if that is the case, then (lambda (x) x) is not evaluated as a special form, but as an ordinary procedure call. The order of evaluation for arguments in a procedure call is unspecified, so it is perfectly reasonable that x could be reported as unbound before lambda in this case. We can get rid of the appearances of unbound xs to see if lambda is being bound:
> (define lambda (lambda))
Exception: variable lambda is not bound
> (define lambda lambda)
Exception: variable lambda is not bound
So it seems that the problem here is that attempting to redefine a syntactic keyword using an expression that relies on that keyword for its meaning can cause the binding for the syntactic keyword to become inaccessible.
In any case, take this last bit with a grain of salt because in the end this sort of redefinition is not valid code, and no particular behavior should be expected of it.
Why is it that:
Function definitions can use definitions defined after it
while variable definitions can't.
For example,
a) the following code snippet is wrong:
; Must define function `f` before variable `a`.
#lang racket
(define a (f))
(define (f) 10)
b) While the following snippet is right:
; Function `g` could be defined after function `f`.
#lang racket
(define (f) (g)) ; `g` is not defined yet
(define (g) 10)
c) Right too :
; Variable `a` could be defined after function `f`
#lang racket
(define (f) a) ; `a` is not defined yet
(define a 10)
You need to know several things about Racket:
In Racket, each file (that starts with #lang) is a module, unlike many (traditional, r5rs) schemes that have no modules.
The scoping rules for modules are similar to the rules for a function, so in a sense, these definitions are similar to definitions in a function.
Racket evaluates definitions from left to right. In scheme lingo you say that Racket's definitions have letrec* semantics; this is unlike some schemes that use letrec semantics where mutually recursive definitions never work.
So the bottom line is that the definitions are all created in the module's environment (similarly in a function, for function-local definitions), and then they are initialized from left to right. Back-references therefore always work, so you can always do something like
(define a 1)
(define b (add1 a))
They are created in a single scope -- so in theory forward definitions are valid in the sense that they're in scope. But actually using a value of a forward-reference is not going to work since you get a special #<undefined> value until the actual value is evaluated. To see this, try to run this code:
#lang racket
(define (foo)
(define a a)
a)
(foo)
A module's toplevel is further restricted so that such references are actually errors, which you can see with:
#lang racket
(define a a)
Having all that in mind, things are a bit more lenient with references inside functions. The thing is that the body of a function is not executed until the function is called -- so if a forward reference happens inside a function, it is valid (= won't get an error or #<undefined>) if the function is called after all of the bindings have been initialized. This applies to plain function definitions
(define foo (lambda () a))
definitions that use the usual syntactic sugar
(define (foo) a)
and even other forms that ultimately expand into functions
(define foo (delay a))
With all of these, you won't get any errors by the same rule -- when all uses of the function bodies happen after the definitions were initialized.
One important note, however, is that you shouldn't confuse this kind of initialization with assignment. This means that things like
(define x (+ x 1))
are not equivalent to x = x+1 in mainstream languages. They're more like some var x = x+1 in a language that will fail with some "reference to uninitialized variable" error. This is because define creates a new binding in the current scope, it does not "modify" an existing one.
The following is an approximate general Scheme description, an analogy.
Defining a function
(define (f) (g))
is more or less like
f := (lambda () (g))
so the lambda expression is evaluated, and the resulting functional object (usually a closure) is stored in the new variable f being defined. The function g will have to be defined when the function f will be called.
Similarly, (define (h) a) is like h := (lambda () a) so only when the function h will be called, the reference to the variable a will be checked, to find its value.
But
(define a (f))
is like
a := (f)
i.e. the function f has to be called with no arguments, and the result of that call stored in the new variable a being defined. So the function has to be defined already, at that point.
Each definition in a file is executed in sequence, one after another. Each definition is allowed to refer to any of the variables being defined in a file, both above and below it (they are all said to belong to the same scope), but it is allowed to use values of only those variables that are defined above it.
(there is an ambiguity here: imagine you were using a built-in function, say with (define a (+ 1 2)), but were also defining it later on in the file, say (define + -). Is it a definition, or a redefinition? In the first case, which is Racket's choice, use before definition is forbidden. In the second, the "global" value is used in calculating the value of a, and then the function is redefined. Some Schemes may go that route. Thanks go to Eli Barzilay for showing this to me, and to Chris Jester-Young for helping out).
How the scheme compiler determines, which functions will be available during macroexpansion?
I mean such low level mechanisms, like syntax-case, where you can not only generate patter substitution, but call some functions, at least in a fender part
Edit:
I mean, I need to use an ordinary function in macroexpansion process. E. g.:
(define (twice a)
(declare 'compile-time)
(* 2 a))
(let-syntax ((mac (lambda (x)
(syntax-case x ()
((_ n) (syntax (display (unsyntax (twice n)))))))))
(mac 4))
Where n is known to be a number, and evaluation of (twice n) occurs during expansion.
Every Scheme compiler determines the functions referenced by a macro expansion. In your case, the compilation of 'let-syntax' will result in the compiler determining that 'twice' is free (syntactically out-of-scope within 'let-syntax'). When the macro is applied, the free reference to the 'twice' function will have been resolved.
Different Scheme compilers perform the free value resolution at possibly different times. You can witness this by defining 'twice' as:
(define twice
(begin (display 'bound')
(lambda (x) (* 2 x))))
[In your case, with let-syntax it will be hard to notice. I'd suggest define-syntax and then a later use of '(mac 4'). With that, some compilers (guile) will print 'bound' when the define-syntax is compiled; others (ikarus) will print 'bound' when '(mac 4)' is expanded.]
It depends what macro system you are using. Some of these systems allow you to call regular scheme functions during expansion. For example, Explicit Renaming Macros let you do this:
(define-syntax swap!
(er-macro-transformer
(lambda (form rename compare?)
...
`(let ((tmp ,x))
(set! ,x ,y)
(set! ,y tmp)))))
That said, the macro systems available to you will depend upon what Scheme you are using.
I've been studying Scheme recently and come across a function that is defined in the following way:
(define remove!
(let ((null? null?)
(cdr cdr)
(eq? eq?))
(lambda ... function that uses null?, cdr, eq? ...)
What is the purpose of binding null? to null? or cdr to cdr, when these are built in functions that are available in a function definition without a let block?
In plain R5RS Scheme, there is no module system -- only the toplevel. Furthermore, the mentality is that everything can be modified, so you can "customize" the language any way you want. But without a module system this does not work well. For example, I write
(define (sub1 x) (- x 1))
in a library which you load -- and now you can redefine -:
(define - +) ; either this
(set! - +) ; or this
and now you unintentionally broke my library which relied on sub1 decrementing its input by one, and as a result your windows go up when you drag them down, or whatever.
The only way around this, which is used by several libraries, is to "grab" the relevant definition of the subtraction function, before someone can modify it:
(define sub1 (let ((- -)) (lambda (x) (- x 1))))
Now things will work "more fine", since you cannot modify the meaning of my sub1 function by changing -. (Except... if you modify it before you load my library...)
Anyway, as a result of this (and if you know that the - is the original one when the library is loaded), some compilers will detect this and see that the - call is always going to be the actual subtraction function, and therefore they will inline calls to it (and inlining a call to - can eventually result in assembly code for subtracting two numbers, so this is a big speed boost). But like I said in the above comment, this is more coincidental to the actual reason above.
Finally, R6RS (and several scheme implementations before that) has fixed this and added a library system, so there's no use for this trick: the sub1 code is safe as long as other code in its library is not redefining - in some way, and the compiler can safely optimize code based on this. No need for clever tricks.
That's a speed optimization. Local variable access is usually faster than global variables.
How can I pass a variable by reference in scheme?
An example of the functionality I want:
(define foo
(lambda (&x)
(set! x 5)))
(define y 2)
(foo y)
(display y) ;outputs: 5
Also, is there a way to return by reference?
See http://community.schemewiki.org/?scheme-faq-language question "Is there a way to emulate call-by-reference?".
In general I think that fights against scheme's functional nature so probably there is a better way to structure the program to make it more scheme-like.
Like Jari said, usually you want to avoid passing by reference in Scheme as it suggests that you're abusing side effects.
If you want to, though, you can enclose anything you want to pass by reference in a cons box.
(cons 5 (void))
will produce a box containing 5. If you pass this box to a procedure that changes the 5 to a 6, your original box will also contain a 6. Of course, you have to remember to cons and car when appropriate.
Chez Scheme (and possibly other implementations) has a procedure called box (and its companions box? and unbox) specifically for this boxing/unboxing nonsense: http://www.scheme.com/csug8/objects.html#./objects:s43
You can use a macro:
scheme#(guile-user)> (define-macro (foo var)`(set! ,var 5))
scheme#(guile-user)> (define y 2)
scheme#(guile-user)> (foo y)
scheme#(guile-user)> (display y)(newline)
5
lambda!
(define (foo getx setx)
(setx (+ (getx) 5)))
(define y 2)
(display y)(newline)
(foo
(lambda () y)
(lambda (val) (set! y val)))
(display y)(newline)
Jari is right it is somewhat unscheme-like to pass by reference, at least with variables. However the behavior you want is used, and often encouraged, all the time in a more scheme like way by using closures. Pages 181 and 182(google books) in the seasoned scheme do a better job then I can of explaining it.
Here is a reference that gives a macro that allows you to use a c like syntax to 'pass by reference.' Olegs site is a gold mine for interesting reads so make sure to book mark it if you have not already.
http://okmij.org/ftp/Scheme/pointer-as-closure.txt
You can affect an outer context from within a function defined in that outer context, which gives you the affect of pass by reference variables, i.e. functions with side effects.
(define (outer-function)
(define referenced-var 0)
(define (fun-affects-outer-context) (set! referenced-var 12) (void))
;...
(fun-affects-outer-context)
(display referenced-var)
)
(outer-function) ; displays 12
This solution limits the scope of the side effects.
Otherwise there is (define x (box 5)), (unbox x), etc. as mentioned in a subcomment by Eli, which is the same as the cons solution suggested by erjiang.
You probably have use too much of C, PHP or whatever.
In scheme you don't want to do stuff like pass-by-*.
Understand first what scope mean and how the different implementation behave (in particular try to figure out what is the difference between LISP and Scheme).
By essence a purely functional programming language do not have side effect. Consequently it mean that pass-by-ref is not a functional concept.