Why separate syntax analysis from execution? - Scheme

In SICP Chapter 4, the metacircular evaluator is modified by separating the syntax analysis from the execution, making the eval procedure look like:
(define (eval exp env)
  ((analyze exp) env))
and the book says that this will save work since analyze will be called once on an expression, while the execution procedure may be called many times.
My question is, how does this optimization work? It will work for recursive procedure calls, but what about other cases? The evaluator evaluates expressions one after another; eval will still be called on each expression, even if they have identical forms.

You need to see several things:
(a) the analyze function walks over each expression exactly once;
(b) there is no code outside of analyze that scans the syntax;
(c) the function that analyze returns does not call analyze again, so running that function never leads to any further scanning of the syntax;
(d) this is all unlike the usual evaluation functions, where calling a function twice means that its syntax is scanned twice.
BTW, a much better name for analyze is compile -- it really does translate the input language (sexprs) to a target one (a function, acting as the machine code here).
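To make (a)-(c) concrete, here is a minimal sketch of such an analyze/compile, covering only numbers, variables, and if (lookup-variable-value is SICP's environment accessor; a full version dispatches on every expression type):
(define (analyze exp)
  (cond ((number? exp)                              ; self-evaluating
         (lambda (env) exp))
        ((symbol? exp)                              ; variable reference
         (lambda (env) (lookup-variable-value exp env)))
        ((eq? (car exp) 'if)
         ;; the sub-expressions are analyzed here, exactly once
         (let ((pproc (analyze (cadr exp)))
               (cproc (analyze (caddr exp)))
               (aproc (analyze (cadddr exp))))
           ;; the returned procedure never looks at the syntax again
           (lambda (env)
             (if (pproc env) (cproc env) (aproc env)))))
        (else (error "Unknown expression type" exp))))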

The difference between a compiler and an interpreter is this:
A compiler scans your source code only once and translates it into executable code (machine code, perhaps). The next time you run your program, you execute that code directly, without analyzing the source again, which is efficient.
An interpreter, however, analyzes the source code each time you execute your program.
This optimization only pays off when code is executed more than once.
As @Eli Barzilay said, "a much better name for analyze is compile": the analyzed functions are like the executable code, and recursive functions are like programs that are executed more than once.
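For instance, in the analyzing evaluator, the body of fact below is handed to analyze once, when the define is evaluated, yet the resulting execution procedure runs once per call:
(define (fact n)
  (if (= n 1) 1 (* n (fact (- n 1)))))
(fact 10)   ; => 3628800 -- ten calls, zero re-scans of the body's syntax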

analyze does the syntax analysis just once and stores the transformed definition (and so on) in the environment, where it can be fetched directly through lookup-variable-value when the related procedure is executed.
In contrast, the original metacircular evaluator entangles syntax analysis with execution, which makes each execution invoke the syntax analysis as well.
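This is visible in SICP's analyze-lambda (section 4.1.7): the body is analyzed once, when the lambda is evaluated, and the execution procedure bproc, rather than the raw syntax, is what ends up stored in the procedure object:
(define (analyze-lambda exp)
  (let ((vars (lambda-parameters exp))
        (bproc (analyze-sequence (lambda-body exp))))
    (lambda (env) (make-procedure vars bproc env))))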
This link may be helpful:
http://www.cs.brandeis.edu/~mairson/Courses/cs21b/Handouts/feeley-notes.pdf

Clojure: capture runtime value of function arg, to use in REPL

Problem
My web front-end calls back-end queries with complex arguments. During development, to avoid time-consuming manual replication of those arguments, I want to capture their values in Vars, which I can then use in the REPL.
Research
This article shows that inline def is a suitable solution, but I couldn't make it work. After a call from the front-end happened, the Var remained unbound.
I launched the backend with REPL through VS Code + Calva, with the following code:
(defn get-analytics-by-category [params]
  (def params params)
  ...)
And here's the evaluation of the Var in the REPL:
#object[clojure.lang.Var$Unbound 0x1298f89e "Unbound: #'urbest.db.queries/params"]
Question
Why didn't the code above bind the value of the argument to the Var? Is there another solution?
The best way I found is to use the scope-capture library. It captures all the local variables with the addition of one line in a function, and then with another one-liner you can define all those variables as globals, which allows you to evaluate in the REPL any sub-expression in the function using runtime values.
If you have ever spent a lot of time reproducing complex runtime values, I strongly recommend watching their 8-minute demo.
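For reference, here is a sketch of that workflow (the execution-point id 1 is just an example from a typical run, and run-query is a hypothetical stand-in for the real body; see the library's README for the authoritative API):
(require '[sc.api])

(defn get-analytics-by-category [params]
  (sc.api/spy params)   ; records params (and all other locals) at runtime
  (run-query params))   ; hypothetical placeholder for the real query

;; After the front-end triggers the call, spy prints an execution-point
;; id such as [1 -1]. Then, in the REPL:
(sc.api/defsc 1)        ; defines the captured locals as global Vars
params                  ; => the captured value, ready for evaluation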
My issue with inline def was likely caused by reloading the namespace after the Var was bound to a value. After restarting VS Code and carefully doing everything again, the issue went away.
Another way to look at the runtime is to use a debugger.
It is more flexible than scope-capture, but requires a bit more work to make variables available outside an execution.
VS Code's Calva extension comes with one; there are also Emacs packages.

Make-array in SBCL

How does make-array work in SBCL? Are there equivalents of the new and delete operators in C++, or is it something else, perhaps at the assembler level?
I peeked into the source, but didn't understand anything.
When using SBCL compiled from source and an environment like Emacs/Slime, it is possible to navigate the code quite easily using M-. (meta-point). Basically, the make-array symbol is bound to multiple things: deftransform definitions and a defun. The deftransforms are used mostly for optimization, so better to follow the function first.
The make-array function delegates to an internal make-array% one, which is quite complex: it checks the parameters and dispatches to different specialized implementations of arrays based on those parameters: a bit-vector is implemented differently from a string, for example.
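The dispatch is observable from the REPL; on a typical SBCL build, different parameters yield differently specialized array types:
(type-of (make-array 10))                          ; => (SIMPLE-VECTOR 10)
(type-of (make-array 10 :element-type 'bit))       ; => (SIMPLE-BIT-VECTOR 10)
(type-of (make-array 10 :element-type 'character)) ; => (SIMPLE-ARRAY CHARACTER (10))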
If you follow the case for simple-array, you find a function which calls allocate-vector-with-widetag, which in turn calls allocate-vector.
Now, allocate-vector is bound to several objects, multiple defoptimizers forms, a function and a define-vop form.
The function is only:
(defun allocate-vector (type length words)
  (allocate-vector type length words))
Even if it looks like a recursive call, it isn't.
The define-vop form is a way to define how to compile a call to allocate-vector. In the function, and anywhere there is a call to allocate-vector, the compiler knows how to emit the assembly that implements the built-in operation. But the function itself is defined so that there is an entry point with the same name, and a function object that wraps over that code.
define-vop relies on a Domain Specific Language in SBCL that abstracts over assembly. If you follow the definition, you can find different vops (virtual operations) for allocate-vector, like allocate-vector-on-heap and allocate-vector-on-stack.
Allocation on the heap translates into a call to calc-size-in-bytes, a call to allocation and put-header, which most likely allocate memory and tag it (I followed the definition to src/compiler/x86-64/alloc.lisp).
How memory is allocated (and garbage collected) is another problem.
allocation emits assembly code using %alloc-tramp, which in turn emits the following:
(invoke-asm-routine 'call (if to-r11 'alloc-tramp-r11 'alloc-tramp) node)
There are apparently predefined assembly routines called alloc-tramp-r11 and alloc-tramp. A comment says:
;;; Most allocation is done by inline code with sometimes help
;;; from the C alloc() function by way of the alloc-tramp
;;; assembly routine.
There is a base of C code for the runtime, see for example /src/runtime/alloc.c.
The -tramp suffix stands for trampoline.
Also have a look at src/runtime/x86-assem.S.

Why do Julia programmers need to prefix macros with the at-sign?

Whenever I see a Julia macro in use like @assert or @time I'm always wondering about the need to distinguish a macro syntactically with the @ prefix. What should I be thinking of when using @ for a macro? For me it adds noise and distraction to an otherwise very nice language (syntactically speaking).
I mean, for me '@' has a meaning of reference, i.e. a location like a domain or address. In the location sense @ does not have a meaning for macros, other than that it is a different compilation step.
The @ should be seen as a warning sign which indicates that the normal rules of the language might not apply. E.g., a function call
f(x)
will never modify the value of the variable x in the calling context, but a macro invocation
@mymacro x
(or @mymacro f(x) for that matter) very well might.
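For example, here is a hypothetical @setzero macro (not from any library) that expands into an assignment in the caller's scope, something no function call could achieve:
macro setzero(v)
    return esc(:($v = 0))   # expands to a plain assignment at the call site
end

x = 41
@setzero x
x   # => 0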
Another reason is that macros in Julia are not based on textual substitution as in C, but substitution in the abstract syntax tree (which is much more powerful and avoids the unexpected consequences that textual substitution macros are notorious for).
Macros have special syntax in Julia, and since they are expanded after parse time, the parser also needs an unambiguous way to recognise them
(without knowing which macros have been defined in the current scope).
ASCII characters are a precious resource in the design of most programming languages, Julia very much included. I would guess that the choice of # mostly comes down to the fact that it was not needed for something more important, and that it stands out pretty well.
Symbols always need to be interpreted within the context they are used. Having multiple meanings for symbols, across contexts, is not new and will probably never go away. For example, no one should expect #include in a C program to go viral on Twitter.
Julia's Documentation entry Hold up: why macros? explains pretty well some of the things you might keep in mind while writing and/or using macros.
Here are a few snippets:
Macros are necessary because they execute when code is parsed,
therefore, macros allow the programmer to generate and include
fragments of customized code before the full program is run.
...
It is important to emphasize that macros receive their arguments as
expressions, literals, or symbols.
So, if a macro is called with an expression, it gets the whole expression, not just the result.
...
In place of the written syntax, the macro call is expanded at parse
time to its returned result.
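A tiny hypothetical macro makes the last point visible; it receives the unevaluated expression object and acts on it at expansion time:
macro showexpr(e)
    println("got expression: ", e)   # runs at parse/expansion time
    return esc(e)                    # splice the expression back unchanged
end

@showexpr 1 + 2   # prints "got expression: 1 + 2", then evaluates to 3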
It actually fits quite nicely with the semantics of the @ symbol on its own.
If we look up the Wikipedia entry for 'At symbol' we find that it is often used as a replacement for the preposition 'at' (yes, it even reads 'at'). And the preposition 'at' is used to express a spatial or temporal relation.
Because of that we can use the @ symbol as an abbreviation for the preposition 'at' to refer to a spatial relation, i.e. a location like @tony's bar, @france, etc., to some memory location @0x50FA2C (e.g. for pointers/addresses), to the receiver of a message (@user0851, which Twitter and other forums use), but also to a temporal relation, i.e. @05:00 am, @midnight, @compile_time or @parse_time.
And macros are processed at parse time (there you have it), which is totally distinct from the other code that is evaluated at run time (yes, there are many different phases in between, but that's not the point here).
So, in addition to explicitly directing the programmer's attention to the fact that the following code fragment is processed at parse time, as opposed to run time, we use @.
For me this explanation fits nicely in the language.
thanks@all ;)

Seeking clarification on Scheme eval

I am confused about eval. I looked at the specification of eval at schemers.org. It says
procedure: (eval expression environment-specifier)
This indicates to me that the environment-specifier is mandatory. However, when I tested eval with two interpreters -- the one at repl.it and Elk Scheme -- both work without an environment-specifier. My question is: are they both non-conformant interpreters, or did I read the documentation at schemers.org wrong?
And then..
Elk Scheme has no problem evaluating (eval 5) and (eval (list + 5 6)), but the Scheme interpreter at repl.it is not able to evaluate them. The latter will evaluate (eval `(+ 5 6)) fine, but not the first two expressions. My question is: is the behavior of the repl.it interpreter conformant?
How do other Scheme interpreters deal with the first two expressions?
The Scheme report is what you need to follow to write compatible programs. Other than that, implementations can have their own syntax and procedures, and it won't interfere, since a standard-conforming program wouldn't use them. Not all Scheme reports made eval mandatory, so you need to find out which report an implementation should conform to, and whether it needs some switch to follow the standard; e.g. Ikarus needs --r6rs-script as a switch to run R6RS programs correctly. I think Elk is R4RS, so eval is not specified in that report, and BiwaScheme seems to have references to R6RS in its source, so it should take a second argument. That it works is no proof it's conformant, so you should dig a little into their documentation.
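For portable code under R5RS and later, pass the environment explicitly (interaction-environment is listed as optional in R5RS, though widely provided):
(eval '(+ 5 6) (scheme-report-environment 5))   ; => 11
(eval '(+ 5 6) (interaction-environment))       ; => 11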
Also, for everything the report leaves undefined, an implementation may choose some behavior and still conform. E.g., I've seen define return the object that was bound, and all the ! procedures return the object they mutated, and it's all according to the report, since any value qualifies as the report's undefined value.
Also, very few errors in the report actually demand that an error be signaled. Since it's not mandatory to signal errors, it's up to the developer to make sure they don't do anything that is considered an error. An implementation can return an incredibly wrong value, crash, or, if it's a nice implementation, signal an error. Any one of those is equally fine by the report. In fact, this is one of them:
(define (test) "hello")
(string-set! (test) 0 #\H) ; might signal an error
(test) ; might evaluate to "hello", "Hello"
In most Scheme implementations you won't get any error, and test will probably end up returning "Hello". The report specifically states this to be an error, so I guess it means you should never write a program like this for any Scheme interpreter, since the outcome is undefined.

Circumvent EVAL in SCHEME

Peter Norvig in PAIP says:
in modern lisps...eval is used less often (in fact, in Scheme there is
no eval at all). If you find yourself using eval, you are probably
doing the wrong thing
What are some of the ways to circumvent using eval in Scheme? Aren't there cases where eval is absolutely necessary?
There are cases where eval is necessary, but they always involve advanced programs that do things like dynamically loading some code (e.g., a servlet in a web server). As for a way to "circumvent" using it -- that depends on the actual problem you're trying to solve; there's no magic solution to avoiding eval except for ... eval.
(BTW, my guess is that PAIP was written a long time ago, before eval was added to the Scheme Report.)
He's wrong. Of course there is eval in Scheme.
You'll need eval in some very rare cases. The case that comes to mind first is when you build a program with a program and then execute it. This happens mainly with genetic algorithms, for example: you build a lot of randomized programs that you then need to execute. Having eval, in conjunction with code being data, makes Lisp the easiest programming language for genetic algorithms.
Having these properties comes at a great cost (in terms of the speed and size of your program), because you remove all possibility of compile-time optimization on the code that will be evaled, and you must keep the full interpreter in your resulting binary.
As a result it is considered poor design to use eval when it can be avoided.
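As a sketch of that build-then-run pattern (with a fixed expression standing in for a randomly generated one):
(define body (list '* 'x 'x))   ; imagine this list was generated at random
(define f (eval (list 'lambda '(x) body)
                (interaction-environment)))
(f 7)                           ; => 49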
The claim that Scheme has no eval is inaccurate at least for the most recent versions of the Scheme standard (R5RS and later). Usually, what you want is a macro instead, which will generate code at compilation time.
It is true that eval should be avoided. For starters, I've never seen a satisfactory definition of how it should behave, for example:
What environment should expressions be evaluated in when no environment is passed?
When you do pass in an environment, how does that work? For example, the standards specify no way to pre-bind a value in that environment object.
That said, I've worked with a Scheme application that uses eval to generate code dynamically at runtime for cases where the structure of the computation cannot be known at compilation time. The intent has been to get the Scheme system to compile the code at runtime for performance reasons -- and the difficulty is that there is no standard way to tell a Scheme system "compile this code."
It should also go without saying that eval can be a huge security risk. You should never eval anything that doesn't have a huge wall of separation from user input. Basically, if you want to use eval safely, you should be doing so in the context of the code-generation phase of a compiler-like system, after you've parsed some input (using a comprehensively defined grammar!).
First, PAIP is written for Common Lisp, not Scheme, so I don't know that he'd say the same thing. CL macros do much the same thing as eval, although at compile time instead of run time, and there are other things you could do. If you showed me an example of using eval in Common Lisp, I could try to come up with other methods of doing the same thing.
I'm not a Scheme programmer. I can only speak from Norvig's perspective, as a Common Lisp programmer. I don't think he was talking about Scheme, and I don't know if he knew or knows Scheme particularly well.
Second, Norvig says "you are probably doing the wrong thing" rather than "you're doing the wrong thing". This implies that, for all he knows, there are times when eval is the correct thing to use. In a language like C, I'd say the same thing about goto: it's quite useful in some restricted circumstances, but most goto use is by people who don't know any better.
One use I've seen for eval in scripting environments is to parameterize some code with runtime values. For instance, in pseudo-C:
param = read_integer();
fn = eval("int lambda(int x) {
             int param = " + to_string(param) + ";
             return param*x; }");
I hope you find that really ugly. String pasting to create code at runtime? Ick. In Scheme and other lexically scoped Lisps, you can make parameterized functions without using eval.
(define make-my-fn
  (lambda (param)
    (lambda (x) (* param x))))

(let* ([param (read-integer)]
       [fn (make-my-fn param)])
  ;; etc.
  )
As was mentioned, dynamic code loading and such still needs eval, but parameterized code and code composition can be achieved with first-class functions.
You could write a Scheme interpreter in Scheme. That is certainly possible, but it is not practical.
Granted, this is a general answer, as I have not used Scheme, but it may help you nonetheless. :)
