Question about Common Lisp compilation order

I wrote a macro and a function in one file like this:
(defun test ()
  (let ((x '(1 2 3)))
    (macro-test (x real-b)
      (print (+ 1 (car real-b))))))
(defmacro macro-test ((a b) &body body)
  `(do ((,b ,a (cdr ,b)))
       ((not ,b))
     ,@body))
Then I load this file in the REPL and run (test). I get this error:
The variable REAL-B is unbound.
However, when I put the defmacro before the defun, everything is fine.
I am confused about Common Lisp's compilation order. I know that if a defmacro uses some functions internally, those functions should be wrapped in (eval-when (:compile-toplevel :load-toplevel :execute) ...); otherwise compilation will fail.
However, if macro definitions and function definitions are both processed at compile time, the order matters, right? A macro must appear before the place where it is used (whereas if I define two functions, the order doesn't matter). Could I get more detail about SBCL's compilation order? Is this specific to SBCL, or is it part of the Common Lisp standard?
Thank you!

The order does always matter: when you want to use a macro, it has to be known. The macro does a source transformation. How would you be able to do that source transformation with an unknown macro?
The Common Lisp standard does not require a multi-pass compilation in such a way that first all source code is read and all macros are collected and then compilation starts from the top of the file. File compilation in Common Lisp just walks through the source code from start to end. There might be multiple compilation phases later, but that is left to the implementations...
How should Lisp compile the function test when the macro macro-test is unknown? The Lisp compiler needs a) to know that it is a macro and b) it needs to have its definition to expand the macro form.
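Applied to the code in the question, this also explains the error message: when test was processed, macro-test was not yet known, so (macro-test (x real-b) ...) was treated as an ordinary function call. The arguments of a function call are evaluated first, and evaluating real-b as a variable fails. A minimal sketch of the reordered file, with the body spliced in via ,@body:

```lisp
;; The macro is defined first, so it is known by the time TEST is processed.
(defmacro macro-test ((a b) &body body)
  ;; Bind B to successive tails of the list A until the list is exhausted.
  `(do ((,b ,a (cdr ,b)))
       ((not ,b))
     ,@body))

;; Now (macro-test ...) is expanded into a DO loop instead of being
;; compiled as a call to a not-yet-defined function.
(defun test ()
  (let ((x '(1 2 3)))
    (macro-test (x real-b)
      (print (+ 1 (car real-b))))))
```

With this order, (test) prints 2, 3 and 4 and returns NIL.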
For Common Lisp this is a basic rule:
if we have a form (foo bar baz) then the evaluation basically looks at foo.
if foo is a special operator -> use that special operator
if foo is a macro operator -> macro expand the code and start again
if foo is a function -> call that function with the evaluated arguments
else -> an error
In compilation it looks similar:
if foo is a special operator -> compile that special form
if foo is a macro operator -> macro expand the macro form and compile that code
if foo is a function -> compile that function form
else -> warn and then assume foo is a function and compile a call to a future function of that name
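The case analysis above can be tried out at the REPL using standard operators. This is only a rough sketch: classify-operator is a hypothetical helper (not part of the standard or of SBCL), and it only consults the global environment, whereas a real compiler also has to take local definitions such as macrolet and flet into account.

```lisp
;; Hypothetical helper: report how a form whose operator is SYMBOL
;; would be treated, based on the global environment at this moment.
(defun classify-operator (symbol)
  (cond ((special-operator-p symbol) :special-operator) ; e.g. IF, LET, QUOTE
        ((macro-function symbol) :macro)                ; a DEFMACRO has been seen
        ((fboundp symbol) :function)                    ; a global function exists
        (t :undefined)))                                ; nothing known (yet)
```

For example, (classify-operator 'if) gives :SPECIAL-OPERATOR and (classify-operator 'car) gives :FUNCTION. For a user-defined macro such as macro-test, the answer depends on whether its defmacro has already been evaluated: :UNDEFINED before, :MACRO after. That is exactly the difference the order of definitions makes.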

Related

Can a program contain unbound variables if they will not be evaluated?

Consider the following program, where foo and bar have not been defined.
(define (f)
  foo)
(if #t
    (display "Hello!")
    bar)
Is this a valid Scheme program? Are Scheme programs allowed to have unbound variables as long as those variables are never evaluated?
No:
"It is a syntax violation to reference an unbound variable" (r6rs report 9.1)
"It is an error to reference an unbound variable." (r7rs report 4.1.1)
The question code may be evaluated by a Scheme implementation, but is not a Scheme program,
and if included in a program the implementation should signal an error:
$ scheme
> (if #t (display "Hello!") bar)
Hello!
> (top-level-program (import (rnrs))
(if #t (display "Hello!") bar))
Exception: attempt to reference unbound identifier bar
>
Notes
(top-level-program ) is a Chez Scheme extension, and can be used to enter a Scheme program interactively.
Extracts from r6rs report related to the meanings of Scheme program, variable, and unbound
(italic emphasis in original, bold added):
5.1 Programs and libraries
A Scheme program consists of a top-level program together with a set of libraries
5.2 Variables, keywords, and regions
... An identifier that names a location is called a variable and is said to be bound to that location ...
Every mention of an identifier refers to the binding of the identifier that establishes the innermost
of the regions containing the use. ...
If there is no binding for the identifier, it is said to be unbound.
9.1. Primitive expression types / Variable references
An expression consisting of a variable (section 5.2) is a variable reference ...
It is a syntax violation to reference an unbound variable.
5.5. Syntax violations
... implementations must detect violations of the syntax. A syntax violation is an error with respect to
the syntax of library bodies, top-level bodies ...
If a top-level or library form in a program is not syntactically correct, then ...
execution of that top-level program or library must not be allowed to begin.
So a Scheme program can contain an unbound variable, but a standard-compliant implementation must not start evaluating
the program, even if evaluation of the variable could never be attempted.
No, but yes, but no...
Interpreting the terms Scheme program, unbound, and variable as used in recent
Scheme reports, it seems that the answer is no (see my other answer for references).
But this depends on one more assumption about the sample code - that if has its usual meaning
as the syntactic keyword described in the language defining reports.
Because identifiers in Scheme programs can shadow keywords, this is a valid
Scheme program in which bar appears to be unbound:
(import (rnrs))
(let-syntax ([if (syntax-rules ()
                   [(if e1 e2 _) (if e1 e2)])])
  (if #t
      (display "Hello!")
      bar))
So:
$ scheme
> (top-level-program (import (rnrs))
    (let-syntax ([if (syntax-rules ()
                       [(if e1 e2 _) (if e1 e2)])])
      (if #t
          (display "Hello!")
          bar)
      (if #f
          (display "Hello!")
          bar)))
Hello!
>

Scheme expression error on DrRacket, but evaluates correctly on other interpreters

The following expression fails in DrRacket but evaluates successfully in other Scheme interpreters:
(define (f x) (g x))
This special form defines a function f(x) that, when invoked, returns the result of calling g(x).
DrRacket complains:
g: unbound identifier in: g
However, g does not need to be defined at that stage, since I only define f and do not invoke it (I can bind g later, before calling f, which is perfectly fine in other interpreters).
If you want to run R5RS code in DrRacket, you need to choose the R5RS language first.
In the menu "Language" choose the menu item "Choose Language...".
Then under other languages choose "R5RS".
Finally make sure "Disallow redefinition of initial bindings" is unselected.
Now you will get the same result as in the other implementations.
When you press Run, DrRacket wraps the definitions in a module and compiles it. It does more analysis than an interpreter would, and because the whole file has been parsed it knows that g will never exist. As long as you define g somewhere in the module, Racket is fine with it, but you have defined it neither before nor after f, and that is the problem.
As an alternative, you can enter all the code in the interactions window.

How to determine if a variable exists in Chicken Scheme?

Is there a way in Chicken Scheme to determine at run-time if a variable is currently defined?
(let ((var 1))
  (print (is-defined? var))) ; #t
(print (is-defined? var)) ; #f
EDIT: XY problem.
I'm writing a macro that generates code. This generated code must call the macro in mutual recursion - having the macro simply call itself won't work. When the macro is recursively called, I need it to behave differently than when it is called initially. I would use a nested function, but uh....it's a macro.
Rough example:
(defmacro m (nested)
(if nested
BACKQUOTE(print "is nested")
BACKQUOTE(m #t)
(yes, I know scheme doesn't use defmacro, but I'm coming from Common Lisp. Also I can't seem to put backquotes in here without it all going to hell.)
I don't want the INITIAL call of the macro to take an extra argument that only has meaning when called recursively. I want it to know by some other means.
Can I get the generated code to call a macro that is nested within the first macro and doesn't exist at the call site, maybe? For example, generating code that calls (,other-macro) instead of (macro)?
But that shouldn't work, because a macro isn't a first-class object like a function is...
When you write recursive macros, I get the impression that you have a macro expansion (m a b ...) that turns into (m-helper a (b ...)), which might in turn become (let (a ...) (m b ...)). That is not directly recursive, since you are turning code into code that just happens to contain a macro call.
With destructuring-bind you really only need to keep track of two variables, one for the car and one for the cdr, and with an implicit renaming macro everything that does not come from the form is renamed and thus hygienic:
(define-syntax destructuring-bind
  (ir-macro-transformer
    (lambda (form inject compare?)
      (define (parse-structure structure expression optional? body)
        ;; actual magic happens here. Returns list structure with a mix of parts from structure as well as introduced variables and globals
        )
      (match form
        [(structure expression) . body ]
        `(let ((tmp ,expression))
           ,(parse-structure structure 'tmp #f body))))))
To check if something from input is the same symbol you use the supplied compare? procedure. eg. (compare? expression '&optional).
There's no way to do that in general, because Scheme is lexically scoped. It doesn't make much sense to ask whether a variable is defined when referencing an undefined variable is an error.
For toplevel/global variables, you can use the symbol-utils egg but it is probably not going to work as you expect, considering that global variables inside modules are also rewritten to be something else.
Perhaps if you can say what you're really trying to do, I can help you with an alternate solution.

The Order of Variable and Function Definitions

Why is it that:
Function definitions can use definitions defined after it
while variable definitions can't.
For example,
a) the following code snippet is wrong:
; Must define function `f` before variable `a`.
#lang racket
(define a (f))
(define (f) 10)
b) While the following snippet is right:
; Function `g` could be defined after function `f`.
#lang racket
(define (f) (g)) ; `g` is not defined yet
(define (g) 10)
c) Right too :
; Variable `a` could be defined after function `f`
#lang racket
(define (f) a) ; `a` is not defined yet
(define a 10)
You need to know several things about Racket:
In Racket, each file (that starts with #lang) is a module, unlike many (traditional, r5rs) schemes that have no modules.
The scoping rules for modules are similar to the rules for a function, so in a sense, these definitions are similar to definitions in a function.
Racket evaluates definitions from left to right. In Scheme lingo, Racket's definitions have letrec* semantics; this is unlike Schemes that use letrec semantics, where no initializer may use the value of another definition in the group, not even an earlier one.
So the bottom line is that the definitions are all created in the module's environment (similarly in a function, for function-local definitions), and then they are initialized from left to right. Back-references therefore always work, so you can always do something like
(define a 1)
(define b (add1 a))
They are created in a single scope -- so in theory forward definitions are valid in the sense that they're in scope. But actually using a value of a forward-reference is not going to work since you get a special #<undefined> value until the actual value is evaluated. To see this, try to run this code:
#lang racket
(define (foo)
  (define a a)
  a)
(foo)
A module's toplevel is further restricted so that such references are actually errors, which you can see with:
#lang racket
(define a a)
Having all that in mind, things are a bit more lenient with references inside functions. The thing is that the body of a function is not executed until the function is called -- so if a forward reference happens inside a function, it is valid (= won't get an error or #<undefined>) if the function is called after all of the bindings have been initialized. This applies to plain function definitions
(define foo (lambda () a))
definitions that use the usual syntactic sugar
(define (foo) a)
and even other forms that ultimately expand into functions
(define foo (delay a))
With all of these, you won't get any errors by the same rule -- when all uses of the function bodies happen after the definitions were initialized.
One important note, however, is that you shouldn't confuse this kind of initialization with assignment. This means that things like
(define x (+ x 1))
are not equivalent to x = x+1 in mainstream languages. They're more like some var x = x+1 in a language that will fail with some "reference to uninitialized variable" error. This is because define creates a new binding in the current scope, it does not "modify" an existing one.
The following is an approximate general Scheme description, an analogy.
Defining a function
(define (f) (g))
is more or less like
f := (lambda () (g))
so the lambda expression is evaluated, and the resulting function object (usually a closure) is stored in the new variable f being defined. The function g only has to be defined by the time f is called.
Similarly, (define (h) a) is like h := (lambda () a), so the reference to the variable a is only resolved, to find its value, when h is called.
But
(define a (f))
is like
a := (f)
i.e. the function f has to be called with no arguments, and the result of that call stored in the new variable a being defined. So the function has to be defined already, at that point.
Each definition in a file is executed in sequence, one after another. Each definition is allowed to refer to any of the variables being defined in a file, both above and below it (they are all said to belong to the same scope), but it is allowed to use values of only those variables that are defined above it.
(there is an ambiguity here: imagine you were using a built-in function, say with (define a (+ 1 2)), but were also defining it later on in the file, say (define + -). Is it a definition, or a redefinition? In the first case, which is Racket's choice, use before definition is forbidden. In the second, the "global" value is used in calculating the value of a, and then the function is redefined. Some Schemes may go that route. Thanks go to Eli Barzilay for showing this to me, and to Chris Jester-Young for helping out).

How do you define a constant in PLT Scheme?

How do I declare that a symbol will always stand for a particular value and cannot be changed throughout the execution of the program?
As far as I know, this isn't possible in Scheme. And, for all intents and purposes, it's not strictly necessary. Just define the value at the toplevel like a regular variable and then don't change it. To help you remember, you can adopt a convention for naming these kinds of constants - I've seen books where toplevel variables are defined with *stars* around their name.
In other languages, there is a danger that some library will override the definition you've created. However, Scheme's lexical scoping coupled with PLT's module system ensure this will never happen.
In PLT Scheme, you write your definitions in your own module -- and if your own code is not using `set!', then the binding can never change. In fact, the compiler uses this to perform various optimizations, so this is not just a convention.
You could define a macro that evaluates to the constant, which would protect you against simple uses of set!
(define-syntax constant
  (syntax-rules ()
    ((_) 25)))
Then you just use (constant) everywhere, which is no more typing than *constant*.
A really hackish answer that I thought of was to define a reader macro that returns your constant:
#lang racket
(current-readtable
(make-readtable (current-readtable)
#\k 'dispatch-macro (lambda (a b c d e f) 5)))
#k ;; <-- is read as 5
It would then be impossible to redefine this (without changing your reader macro):
(set! #k 6) ;; <-- error, 5 is not an identifier
