R6RS Library body: definition after expression - scheme

Consider the following code:
#!r6rs
(library
(test)
(export)
(import (rnrs))
(define a 5)
(begin
(define b 4)
(+ 3 b))
'cont
(define c 5)
'done)
From the R6RS Report 7.1:
A <library body> is like a <body> (see section 11.3) except that a <library body>s need not include any expressions. It must have the following form:
<definition> ... <expression> ...
I thought it would emit error because definition of c is after expression 'cont, but this program is accepted cleanly.
After Then, I thought a and c could be exported. But, not c but b can be exported. (a can be exported as I thought.)
I think there are something I didn't realize about R6RS library rules. What is the point I'm missing? Thanks in advance.
p.s) I'm using Racket v5.3.3

From the R6RS 2007 specification:
A library definition must have the following form:
(library <library name>
(export <export spec> ...)
(import <import spec> ...)
<library body>)
...
The <library body> is the library body, consisting of a sequence of definitions
followed by a sequence of expressions. The definitions may be both for local
(unexported) and exported bindings, and the expressions are initialization
expressions to be evaluated for their effects.
Thus, for your example code, an error should have been raised.

Sorry, this is not the right answer. It is how the program toplevel works, not library toplevel. Leaving it here for reference.
In the program toplevel things work a bit different from normal (normal being the way you interpreted it).
The code will be rewritten by the compiler to look something like:
(define a 5)
(define b 4)
(define dummy1 (+ 3 b))
(define dummy2 'cont)
(define c 5)
'done
Notes:
begin splices in the toplevel
For any non-definition, the expression is assigned to a 'dummy' variable
The toplevel eventually ends up looking like a letrec* and the same rules apply

Related

Can a program contain unbound variables if they will not be evaluated?

Consider the following program, where foo and bar have not been defined.
(define (f)
foo)
(if #t
(display "Hello!")
bar)
Is this a valid Scheme program? Are Scheme programs allowed to have unbound variables as long as those variables are never evaluated?
No:
"It is a syntax violation to reference an unbound variable" (r6rs report 9.1)
"It is an error to reference an unbound variable." (r7rs report 4.1.1)
The question code may be evaluated by a Scheme implementation, but is not a Scheme program,
and if included in a program the implementation should signal an error:
$ scheme
> (if #t (display "Hello!") bar)
Hello!
> (top-level-program (import (rnrs))
(if #t (display "Hello!") bar))
Exception: attempt to reference unbound identifier bar
>
Notes
(top-level-program ) is a Chez Scheme extension, and can be used to enter a Scheme program interactively.
Extracts from r6rs report related to the meanings of Scheme program, variable, and unbound
(italic emphasis in original, bold added):
5.1 Programs and libraries
A Scheme program consists of a top-level program together with a set of libraries
5.2 Variables, keywords, and regions
... An identifier that names a location is called a variable and is said to be bound to that location ...
Every mention of an identifier refers to the binding of the identifier that establishes the innermost
of the regions containing the use. ...
If there is no binding for the identifier, it is said to be unbound.
9.1. Primitive expression types / Variable references
An expression consisting of a variable (section 5.2) is a variable reference ...
It is a syntax violation to reference an unbound variable.
5.5. Syntax violations
... implementations must detect violations of the syntax. A syntax violation is an error with respect to
the syntax of library bodies, top-level bodies ...
If a top-level or library form in a program is not syntactically correct, then ...
execution of that top-level program or library must not be allowed to begin.
So a Scheme program can contain an unbound variable, but a standard-compliant implementation must not start evaluating
the program, even if evaluation of the variable could never be attempted.
No, but yes, but no...
Interpreting the terms Scheme program, unbound, and variable as used in recent
Scheme reports, it seems that the answer is no (see my other answer for references).
But this depends on one more assumption about the sample code - that if has its usual meaning
as the syntactic keyword described in the language defining reports.
Because identifiers in Scheme programs can shadow keywords, this is a valid
Scheme program in which bar appears to be unbound:
(import (rnrs))
(let-syntax ([if (syntax-rules ()
[(if e1 e2 _) (if e1 e2)])])
(if #t
(display "Hello!")
bar))
So:
$ scheme
> (top-level-program (import (rnrs))
(let-syntax ([if (syntax-rules ()
[(if e1 e2 _) (if e1 e2)])])
(if #t
(display "Hello!")
bar)
(if #f
(display "Hello!")
bar)))
Hello!
>

SICP Ch5 eceval compiler in Racket: set-cdr! into quoted list (not a dup)

This is not a duplicate of
set-car!, set-cdr! unbound in racket? or
of
Implement SICP evaluator using Racket
or of
How to install sicp package module in racket?,
but rather a follow-up question because the solutions proposed therein do not
work for me. First, the need: Section 5.5.5 of SICP, the compiler plus
explicit-control evaluator (code here in "ch5-eceval-compiler.scm"), are
entirely dependent on set-car! and set-cdr! into explicit quoted lists. I
would like to copy and modify this code without a complete, bottom-up rewrite in
immutable form. I'd also accept a reference to a scheme implementation that can
run the code out-of-the-box or with some minimal, straightforward adaptation,
i.e., a scheme that has set-car! and set-cdr! or some
work-around. Neither guile nor racket give me an easy time running this code.
EDIT: mit scheme will load the eceval compiler. I'm leaving the question up for those who might want to get it going in racket (I would rather, for example).
Here is a deeper explanation, including the things I explored and tried out, and
how I diagnosed the quoted list as the deepest problem. When I hand-converted
the quoted list into an mquoted nest of mlists, the code broke in much worse
ways and the rabbit hole got much deeper. I had to revert after a couple of
hours of delicate brain surgery that failed.
Here is an MVE of the kind of structure that section 5.5.5 relies on. This is small, but structurally like the real thing:
(define foo '(a b))
(set-cdr! foo '(c))
The real thing starts like this:
(define eceval
(make-machine
'(exp env val proc argl continue unev
compapp ;*for compiled to call interpreted
)
eceval-operations ;; ----------------------------------------------
'( ;; <<<<<<<<======== BIG QUOTED LIST CAUSING TROUBLE / NOT MCONSES!
;;SECTION 5.4.4, as modified in 5.5.7 ;; -------------------------------
;;*for compiled to call interpreted (from exercise 5.47)
(assign compapp (label compound-apply))
;;*next instruction supports entry from compiler (from section 5.5.7)
(branch (label external-entry))
read-eval-print-loop
(perform (op initialize-stack))
(perform
(op prompt-for-input) (const ";;; EC-Eval input:"))
...
and goes on for quite a while. The evaluator is the "machine-code" in the quoted
list, and various generated code does set-car! and set-cdr! into registers
and environment frames and other things. The code fails on load.
There seems to be no easy way to convert the evaluator into immutable form
without a full rewrite, and I'm trying to avoid that. Of course, set-car! and
set-cdr! are not available in #lang racket, and I don't think they're in
guile, either (at least guile refused to load "ch5-eceval-compiler.scm,"
throwing a mutability error, on which I did not dig deeper).
One solution proposed in
set-car!, set-cdr! unbound in racket? is
to rewrite the code using mcons, mcar, mlist, etc. according to (require
compatibility/mlist) (require rnrs/mutable-pairs-6). Those compatibility
packages have no replacement for quote, so I tried writing my own mquote. I
spent a couple of hours on such a refactoring, but the exercise was not
converging, just going deeper and deeper down the rabbit hole and ending up with
even deeper problems. It seems that to pursue even a refactoring I must
understand more semantics about "ch5-eceval-compiler.scm," and if I must, I
might as well rewrite it in immutable form.
Easier solutions proposed in
set-car!, set-cdr! unbound in racket?
are to use #lang sicp or #lang r5rs. There follows three experiments that
referenced other answers on stack overflow:
#lang r5rs
(define foo '(a b))
(set-cdr! foo '(c))
foo
Error: struct:exn:fail:contract:variable
set-cdr!: undefined;
cannot reference an identifier before its definition
in module: "/usr/share/racket/pkgs/r5rs-lib/r5rs/main.rkt"
-----------------------------------------------
which points to a place where set-cdr! is clearly defined:
...
(module main scheme/base
(require scheme/mpair
racket/undefined
(for-syntax scheme/base syntax/kerncase
"private/r5rs-trans.rkt")
(only-in mzscheme transcript-on transcript-off))
(provide (for-syntax syntax-rules ...
(rename-out [syntax-rules-only #%top]
[syntax-rules-only #%app]
[syntax-rules-only #%datum]))
(rename-out
[mcons cons]
[mcar car]
[mcdr cdr]
[set-mcar! set-car!] ;; --------------------------
[set-mcdr! set-cdr!] ;; <<<<<<<<======== LOOK HERE
[mpair? pair?] ;; --------------------------
[mmap map]
[mfor-each for-each])
= < > <= >= max min + - * /
abs gcd lcm exp log sin cos tan not eq?
call-with-current-continuation make-string
symbol->string string->symbol make-rectangular
exact->inexact inexact->exact number->string string->number
...
Here is a similar failure with #lang sicp
#lang sicp
(define foo '(a b))
(set-cdr! foo '(c))
foo
Error: struct:exn:fail:contract:variable
set-cdr!: undefined;
cannot reference an identifier before its definition
in module: "/home/rebcabin/.racket/7.2/pkgs/sicp/sicp/main.rkt"
----------------------------------------------------
pointing to code that only indirectly defines set-cdr!, but clearly in the
appropriate package:
....
#lang racket
(require racket/provide ;; --------------------------------------------
(prefix-in r5rs: r5rs) ;; <<<<<<<<======== PULL IN SET-CDR! ETC. HERE?
(rename-in racket [random racket:random])) ;; ------------------------
(provide (filtered-out (λ (name) (regexp-replace #px"^r5rs:" name ""))
(except-out (all-from-out r5rs) r5rs:#%module-begin))
(rename-out [module-begin #%module-begin]))
(define-syntax (define+provide stx)
(syntax-case stx ()
[(_ (id . args) . body) #'(begin
(provide id)
(define (id . args) . body))]
[(_ id expr) #'(begin
(provide id)
(define id expr))]))
...
I dig deeper into
Implement SICP evaluator using Racket
and find
(require (only-in (combine-in rnrs/base-6
rnrs/mutable-pairs-6)
set-car!
set-cdr!))
(define foo '(a b))
(set-cdr! foo '(c))
foo
yielding
Error: struct:exn:fail:contract
set-mcdr!: contract violation
expected: mpair?
given: '(a b)
argument position: 1st
other arguments...:
'(c)
This error implies that the problem really is the quoted list. I don't have an
easy way to make the big quoted list in eceval into an mlist or chain of
mcons. I tried it and it's very verbose and error prone, plus I think the code
that loads eceval scans and patches that list, so it uses other list operations.
I had to revert after going south in a bad way.
Perhaps I missed some way to automate the transformation, a macro, but that's a
deeper rabbit hole (my scheme macro-fu is too old).
So I'm stuck. Nothing easy or recommended works. I'd like to either (1) know a scheme implementation that will run this code (2) some way I can implement set-car! and set-cdr! in racket (3) some other kind of work around (4) or maybe I just made a stupid mistake that one of you kind people will easily fix.
I tried (by either running racket directly or running it via DrRacket)
#lang sicp
(define foo '(a b))
(set-cdr! foo '(c))
foo
and it outputs (a c).
This also works:
#lang r5rs
(define foo '(a b))
(set-cdr! foo '(c))
(display foo)
To answer your question regarding the implementation of SICP (which I am currently maintaining), you are correct that (prefix-in r5rs: r5rs) will import set-cdr!.
Update: I just installed Geiser and now experienced the same problem that you did when I C-c C-b. However, C-c C-a works as expected.
Personally, I would use racket-mode instead of Geiser.
Mit-scheme will load the eceval compiler from the code drop mentioned in the question. On Ubuntu, mit-scheme loads with sudo apt-install mit-scheme. The geiser package of emacs finds mit via run-mit. The problem is solved.

Racket R6RS support: syntax-case

This simple R6RS program:
#!r6rs
(import (rnrs base)
(rnrs syntax-case)
(rnrs io simple))
(define-syntax stest
(lambda (x)
(syntax-case x ()
((_ z) #'(z 0)))))
(stest display)
works with Chez, Guile and Ypsilon but not Racket. It gives me this:
test.scm:7:3: lambda: unbound identifier in the transformer
environment;
also, no #%app syntax transformer is bound
My question is, is it broken for R6RS or do I have to do something else? I'm testing with version 6.12.
The Racket implementation of R6RS is not non-compliant in this case. Indeed, if anything, it is obeying the standard more closely: your program as-written is not careful about import phases. The problem is that define-syntax evaluates its right-hand side during expand-time, as noted by section 11.2.2 Syntax definitions:
Binds <keyword> to the value of <expression>, which must evaluate, at macro-expansion time, to a transformer.
Unlike other Scheme standards, R6RS takes care to distinguish between phases, since it permits arbitrary programming at compile-time (while other Scheme standards do not). Therefore, section 7.1 Library form specifies how to import libraries at specific phases:
Each <import spec> specifies a set of bindings to be imported into the library, the levels at which they are to be available, and the local names by which they are to be known. An <import spec> must be one of the following:
<import set>
(for <import set> <import level> ...)
An <import level> is one of the following:
run
expand
(meta <level>)
where <level> represents an exact integer object.
Therefore, you need to import (rnrs base) at both the run and expand phases, and you need to import (rnrs syntax-case) at the expand phase. You can do this with the following program:
#!r6rs
(import (for (rnrs base) run expand)
(for (rnrs syntax-case) expand)
(rnrs io simple))
(define-syntax stest
(lambda (x)
(syntax-case x ()
((_ z) #'(z 0)))))
(stest display)
This program works in Racket. I have not tested if it also works on the other Scheme implementations you listed, but it ought to if they are standards-compliant.

The Order of Variable and Function Definitions

Why is it that:
Function definitions can use definitions defined after it
while variable definitions can't.
For example,
a) the following code snippet is wrong:
; Must define function `f` before variable `a`.
#lang racket
(define a (f))
(define (f) 10)
b) While the following snippet is right:
; Function `g` could be defined after function `f`.
#lang racket
(define (f) (g)) ; `g` is not defined yet
(define (g) 10)
c) Right too :
; Variable `a` could be defined after function `f`
#lang racket
(define (f) a) ; `a` is not defined yet
(define a 10)
You need to know several things about Racket:
In Racket, each file (that starts with #lang) is a module, unlike many (traditional, r5rs) schemes that have no modules.
The scoping rules for modules are similar to the rules for a function, so in a sense, these definitions are similar to definitions in a function.
Racket evaluates definitions from left to right. In scheme lingo you say that Racket's definitions have letrec* semantics; this is unlike some schemes that use letrec semantics where mutually recursive definitions never work.
So the bottom line is that the definitions are all created in the module's environment (similarly in a function, for function-local definitions), and then they are initialized from left to right. Back-references therefore always work, so you can always do something like
(define a 1)
(define b (add1 a))
They are created in a single scope -- so in theory forward definitions are valid in the sense that they're in scope. But actually using a value of a forward-reference is not going to work since you get a special #<undefined> value until the actual value is evaluated. To see this, try to run this code:
#lang racket
(define (foo)
(define a a)
a)
(foo)
A module's toplevel is further restricted so that such references are actually errors, which you can see with:
#lang racket
(define a a)
Having all that in mind, things are a bit more lenient with references inside functions. The thing is that the body of a function is not executed until the function is called -- so if a forward reference happens inside a function, it is valid (= won't get an error or #<undefined>) if the function is called after all of the bindings have been initialized. This applies to plain function definitions
(define foo (lambda () a))
definitions that use the usual syntactic sugar
(define (foo) a)
and even other forms that ultimately expand into functions
(define foo (delay a))
With all of these, you won't get any errors by the same rule -- when all uses of the function bodies happen after the definitions were initialized.
One important note, however, is that you shouldn't confuse this kind of initialization with assignment. This means that things like
(define x (+ x 1))
are not equivalent to x = x+1 in mainstream languages. They're more like some var x = x+1 in a language that will fail with some "reference to uninitialized variable" error. This is because define creates a new binding in the current scope, it does not "modify" an existing one.
The following is an approximate general Scheme description, an analogy.
Defining a function
(define (f) (g))
is more or less like
f := (lambda () (g))
so the lambda expression is evaluated, and the resulting functional object (usually a closure) is stored in the new variable f being defined. The function g will have to be defined when the function f will be called.
Similarly, (define (h) a) is like h := (lambda () a) so only when the function h will be called, the reference to the variable a will be checked, to find its value.
But
(define a (f))
is like
a := (f)
i.e. the function f has to be called with no arguments, and the result of that call stored in the new variable a being defined. So the function has to be defined already, at that point.
Each definition in a file is executed in sequence, one after another. Each definition is allowed to refer to any of the variables being defined in a file, both above and below it (they are all said to belong to the same scope), but it is allowed to use values of only those variables that are defined above it.
(there is an ambiguity here: imagine you were using a built-in function, say with (define a (+ 1 2)), but were also defining it later on in the file, say (define + -). Is it a definition, or a redefinition? In the first case, which is Racket's choice, use before definition is forbidden. In the second, the "global" value is used in calculating the value of a, and then the function is redefined. Some Schemes may go that route. Thanks go to Eli Barzilay for showing this to me, and to Chris Jester-Young for helping out).

What exactly is the purpose of syntax objects in scheme?

I am trying to write a small scheme-like language in python, in order to try to better understand scheme.
The problem is that I am stuck on syntax objects. I cannot implement them because I do not really understand what they are for and how they work.
To try to understand them, I played around a bit with syntax objects in DrRacket.
From what I've been able to find, evaluating #'(+ 2 3) is no different from evaluating '(+ 2 3), except in the case that there is a lexical + variable shadowing the one in the top-level namespace, in which case (eval '(+ 2 3)) still returns 5, but (eval #'(+ 2 3)) just throws an error.
For example:
(define (top-sym)
'(+ 2 3))
(define (top-stx)
#'(+ 2 3))
(define (shadow-sym)
(define + *)
'(+ 2 3))
(define (shadow-stx)
(define + *)
#'(+ 2 3))
(eval (top-sym)), (eval (top-stx)), and (eval (shadow-sym)) all return 5, while (eval (shadow-stx)) throws an error. None of them return 6.
If I didn't know better, I would think that the only thing that's special about syntax objects (aside from the trivial fact that they store the location of the code for better error reporting) is that they throw an error under certain circumstances where their symbol counterparts would have returned a potentially unwanted value.
If the story were that simple, there would be no real advantage to using syntax objects over regular lists and symbols.
So my question is: What am I missing about syntax objects that makes them so special?
Syntax objects are the repository for lexical context for the underlying Racket compiler. Concretely, when we enter program like:
#lang racket/base
(* 3 4)
The compiler receives a syntax object representing the entire content of that program. Here's an example to let us see what that syntax object looks like:
#lang racket/base
(define example-program
(open-input-string
"
#lang racket/base
(* 3 4)
"))
(read-accept-reader #t)
(define thingy (read-syntax 'the-test-program example-program))
(print thingy) (newline)
(syntax? thingy)
Note that the * in the program has a compile-time representation as a syntax object within thingy. And at the moment, the * in thingy has no idea where it comes from: it has no binding information yet. It's during the process of expansion, during compilation, that the compiler associates * as a reference to the * of #lang racket/base.
We can see this more easily if we interact with things at compile time. (Note: I am deliberately avoiding talking about eval because I want to avoid mixing up discussion of what happens during compile-time vs. run-time.)
Here is an example to let us inspect more of what these syntax objects do:
#lang racket/base
(require (for-syntax racket/base))
;; This macro is only meant to let us see what the compiler is dealing with
;; at compile time.
(define-syntax (at-compile-time stx)
(syntax-case stx ()
[(_ expr)
(let ()
(define the-expr #'expr)
(printf "I see the expression is: ~s\n" the-expr)
;; Ultimately, as a macro, we must return back a rewrite of
;; the input. Let's just return the expr:
the-expr)]))
(at-compile-time (* 3 4))
We'll use a macro here, at-compile-time, to let us inspect the state of things during compilation. If you run this program in DrRacket, you will see that DrRacket first compiles the program, and then runs it. As it compiles the program, when it sees uses of at-compile-time, the compiler will invoke our macro.
So at compile-time, we'll see something like:
I see the expression is: #<syntax:20:17 (* 3 4)>
Let's revise the program a little bit, and see if we can inspect the identifier-binding of identifiers:
#lang racket/base
(require (for-syntax racket/base))
(define-syntax (at-compile-time stx)
(syntax-case stx ()
[(_ expr)
(let ()
(define the-expr #'expr)
(printf "I see the expression is: ~s\n" the-expr)
(when (identifier? the-expr)
(printf "The identifier binding is: ~s\n" (identifier-binding the-expr)))
the-expr)]))
((at-compile-time *) 3 4)
(let ([* +])
((at-compile-time *) 3 4))
If we run this program in DrRacket, we'll see the following output:
I see the expression is: #<syntax:21:18 *>
The identifier binding is: (#<module-path-index> * #<module-path-index> * 0 0 0)
I see the expression is: #<syntax:24:20 *>
The identifier binding is: lexical
12
7
(By the way: why do we see the output from at-compile-time up front? Because compilation is done entirely before runtime! If we pre-compile the program and save the bytecode by using raco make, we would not see the compiler being invoked when we run the program.)
By the time the compiler reaches the uses of at-compile-time, it knows to associate the appropriate lexical binding information to identifiers. When we inspect the identifier-binding in the first case, the compiler knows that it's associated to a particular module (in this case, #lang racket/base, which is what that module-path-index business is about). But in the second case, it knows that it's a lexical binding: the compiler already walked through the (let ([* +]) ...), and so it knows that uses of * refer back to the binding set up by the let.
The Racket compiler uses syntax objects to communicate that kind of binding information to clients, such as our macros.
Trying to use eval to inspect this sort of stuff is fraught with issues: the binding information in the syntax objects might not be relevant, because by the time we evaluate the syntax objects, their bindings might refer to things that don't exist! That's fundamentally the reason you were seeing errors in your experiments.
Still, here is one example that shows the difference between s-expressions and syntax objects:
#lang racket/base
(module mod1 racket/base
(provide x)
(define x #'(* 3 4)))
(module mod2 racket/base
(define * +) ;; Override!
(provide x)
(define x #'(* 3 4)))
;;;;;;;;;;;;;;;;;;;;;;;;;;;
(require (prefix-in m1: (submod "." mod1))
(prefix-in m2: (submod "." mod2)))
(displayln m1:x)
(displayln (syntax->datum m1:x))
(eval m1:x)
(displayln m2:x)
(displayln (syntax->datum m2:x))
(eval m2:x)
This example is carefully constructed so that the contents of the syntax objects refer only to module-bound things, which will exist at the time we use eval. If we were to change the example slightly,
(module broken-mod2 racket/base
(provide x)
(define x
(let ([* +])
#'(* 3 4))))
then things break horribly when we try to eval the x that comes out of broken-mod2, since the syntax object is referring to a lexical binding that doesn't exist by the time we eval. eval is a difficult beast.

Resources