Pattern match function in Scheme Meta Circular Evaluator - scheme

I'm trying to add a pattern matching function to an existing scheme meta circular evaluator (this is homework) and I'm a bit lost on the wording of the instructions. I was hoping someone more skilled in this regard could help me interpret this.
The syntax for match should look like the following: (match a ((p1 v1) (p2 v2) (p3 v3)))
And it could be used to find length like so:
(define length (lambda (x)
  (match x (('() 0)
            (('cons a b) (+ 1 (length b)))))))
The pattern language in the function should contain numeric constants, quoted constants, variables, and cons. If patterns are exhausted without finding a match, an error should be thrown.
I thought I understood the concept of pattern matching but implementing it in a function this way has me a bit thrown off. Would anyone be willing to explain what the above syntax is doing (specifically, how match is used in length above) so I can get a better understanding of what this function should do?

(match x (('() 0)
(('cons a b) (+ 1 (length b)))))
It may be most helpful to consider what this code would need to expand into. For each pattern, you'd need a test to determine whether the object you're trying to match matches, and you'd need code to figure out how to bind variables to its subparts. In this case, you'd want an expansion roughly like:
(if (equal? '() x)
    0
    (if (pair? x)
        (let ((a (car x))
              (b (cdr x)))
          (+ 1 (length b)))
        ;; indicate failure to match
        'NO-MATCH))
You could write that with a cond, too, of course, but if you have to procedurally expand this, it might be easier to use nested if forms.
If you're actually implementing this as a function (and not as a macro, i.e., a source transformation), then you'll need to specify exactly how you can work with environments, etc.
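If match is interpreted procedurally rather than expanded, the test-and-bind step described above could be sketched roughly as follows. The names match-pattern and first-match are hypothetical, and the sketch assumes that patterns arrive as unevaluated list structure, so that 'cons reads as (quote cons). How the resulting bindings extend the evaluator's environment depends on your evaluator and is not shown.

```scheme
;; Hypothetical helper: try to match DATUM against PAT.
;; Returns an association list of (variable . value) pairs on success, #f on failure.
(define (match-pattern pat datum bindings)
  (cond ((not bindings) #f)                          ; an earlier sub-match failed
        ((number? pat)                               ; numeric constant
         (and (number? datum) (= pat datum) bindings))
        ((symbol? pat)                               ; variable: always matches, binds
         (cons (cons pat datum) bindings))
        ((and (pair? pat) (eq? (car pat) 'quote))    ; quoted constant such as '()
         (and (equal? (cadr pat) datum) bindings))
        ((and (pair? pat) (equal? (car pat) ''cons)) ; ('cons p1 p2)
         (and (pair? datum)
              (match-pattern (caddr pat) (cdr datum)
                             (match-pattern (cadr pat) (car datum) bindings))))
        (else #f)))

;; Hypothetical helper: find the first clause (pattern body) that matches.
;; Returns (bindings . body); signals an error when the clauses are exhausted.
(define (first-match datum clauses)
  (cond ((null? clauses) (error "match: no pattern matched" datum))
        ((match-pattern (car (car clauses)) datum '())
         => (lambda (bindings) (cons bindings (cadr (car clauses)))))
        (else (first-match datum (cdr clauses)))))
```

The evaluator would then evaluate the returned body in an environment extended with the returned bindings.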

I suggest you read chapter four, "Structured Types and the Semantics of Pattern Matching", of The Implementation of Functional Languages. The chapter was written by Simon L. Peyton Jones and Philip Wadler.


Main difference between using define and let in scheme [duplicate]

Ok, this is a fairly basic question: I am following the SICP videos, and I am a bit confused about the differences between define, let and set!.
1) According to Sussman in the video, define is allowed to attach a value to a variable only once (except when in the REPL); in particular, two defines in a row are not allowed. Yet Guile happily runs this code
(define a 1)
(define a 2)
(write a)
and outputs 2, as expected. Things are a little bit more complicated because if I try to do this (EDIT: after the above definitions)
(define a (1+ a))
I get an error, while
(set! a (1+ a))
is allowed. Still, I don't think that this is the only difference between set! and define: what is it that I am missing?
2) The difference between define and let puzzles me even more. I know in theory let is used to bind variables in local scope. Still, it seems to me that this works the same with define, for instance I can replace
(define (f x)
(let ((a 1))
(+ a x)))
with
(define (g x)
(define a 1)
(+ a x))
and f and g work the same: in particular the variable a is unbound outside g as well.
The only way I can see this being useful is that let may have a shorter scope than the whole function definition. Still, it seems to me that one can always add an anonymous function to create the necessary scope and invoke it right away, much like one does in JavaScript. So, what is the real advantage of let?
Your confusion is reasonable: 'let' and 'define' both create new bindings. One advantage to 'let' is that its meaning is extraordinarily well-defined; there's absolutely no disagreement between various Scheme systems (incl. Racket) about what plain-old 'let' means.
The 'define' form is a different kettle of fish. Unlike 'let', it doesn't surround the body (region where the binding is valid) with parentheses. Also, it can mean different things at the top level and internally. Different Scheme systems have dramatically different meanings for 'define'. In fact, Racket has recently changed the meaning of 'define' by adding new contexts in which it can occur.
On the other hand, people like 'define'; it has less indentation, and it usually has a "do-what-I-mean" level of scoping allowing natural definitions of recursive and mutually recursive procedures. In fact, I got bitten by this just the other day :).
Finally, 'set!'; like 'let', 'set!' is pretty straightforward: it mutates an existing binding.
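As a small illustration of the three forms side by side (the names counter and local-result are made up for this sketch):

```scheme
(define counter 0)                 ; define: creates a new binding
(set! counter (+ counter 1))       ; set!: mutates the existing binding

(define local-result
  (let ((counter 100))             ; let: a fresh binding, visible only in its body
    (+ counter 1)))

;; Afterwards the top-level counter is 1 and local-result is 101:
;; the let-bound counter never touched the top-level one.
```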
FWIW, one way to understand these scopes in DrRacket (if you're using it) is to use the "Check Syntax" button, and then hover over various identifiers to see where they're bound.
Do you mean (+ 1 a) instead of (1+ a)? The latter is not standard Scheme: Guile provides 1+ as a procedure, but other implementations may not (Racket, for example, uses add1).
The scope of variables bound by let is limited to the body of the let, thus
(define (f x)
(let ((a 1))
(+ a x)))
is syntactically possible, while
(define (f x)
(let ((a 1)))
(+ a x))
is not.
Internal definitions have to appear at the beginning of the function body, thus the following code is possible:
(define (g x)
(define a 1)
(+ a x))
while this code will generate an error:
(define (g x)
(define a 1)
(display (+ a x))
(define b 2)
(+ a x))
because the first expression after the definition implies that there are no other definitions.
set! doesn't define the variable, rather it is used to assign the variable a new value. Therefore these definitions are meaningless:
(define (f x)
(set! ((a 1))
(+ a x)))
(define (g x)
(set! a 1)
(+ a x))
Valid use for set! is as follows:
> (define x 12)
> (set! x (add1 x))
> x
13
Though it's discouraged, as Scheme is a functional language.
John Clements' answer is good. In some cases, you can see what the defines become in each version of Scheme, which might help you understand what's going on.
For example, in Chez Scheme 8.0 (which has its own define quirks, esp. wrt R6RS!):
> (expand '(define (g x)
(define a 1)
(+ a x)))
(begin
(set! g (lambda (x) (letrec* ([a 1]) (#2%+ a x))))
(#2%void))
You see that the "top-level" define becomes a set! (although just expanding define in some cases will change things!), but the internal define (that is, a define inside another block) becomes a letrec*. Different Schemes will expand that expression into different things.
MzScheme v4.2.4:
> (expand '(define (g x)
(define a 1)
(+ a x)))
(define-values
(g)
(lambda (x)
(letrec-values (((a) '1)) (#%app + a x))))
You may be able to use define more than once but it's not idiomatic: define implies that you are adding a definition to the environment and set! implies you are mutating some variable.
I'm not sure about Guile and why it would allow (set! a (1+ a)), but if a isn't defined yet that shouldn't work. Usually one would use define to introduce a new variable and only mutate it with set! later.
You can use an anonymous function application instead of let; in fact, that's usually exactly what let expands into, since it's almost always a macro. These are equivalent:
(let ((a 1) (b 2))
(+ a b))
((lambda (a b)
(+ a b))
1 2)
The reason you'd use let is that it's clearer: the variable names are right next to the values.
In the case of internal defines, I'm not sure that Yasir is correct. At least on my machine, running Racket in R5RS mode and in regular mode allowed internal defines to appear in the middle of the function definition, but I'm not sure what the standard says. In any case, much later in SICP, the trickiness that internal defines pose is discussed in depth. Chapter 4 explores how to implement mutually recursive internal defines and what that means for the implementation of the metacircular interpreter.
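To make the mutual-recursion point concrete, here is a minimal sketch (procedure names are made up, and n is assumed to be a non-negative integer) of two internal defines that refer to each other, next to a letrec version that behaves the same way. This mirrors the letrec* expansion shown for Chez Scheme above, although the exact expansion varies between implementations.

```scheme
;; Two internal defines that call each other.
(define (parity-with-defines n)
  (define (is-even? k) (if (= k 0) #t (is-odd? (- k 1))))
  (define (is-odd? k) (if (= k 0) #f (is-even? (- k 1))))
  (if (is-even? n) 'even 'odd))

;; The same procedure with the bindings written out as a letrec.
(define (parity-with-letrec n)
  (letrec ((is-even? (lambda (k) (if (= k 0) #t (is-odd? (- k 1)))))
           (is-odd? (lambda (k) (if (= k 0) #f (is-even? (- k 1))))))
    (if (is-even? n) 'even 'odd)))
```

A plain let would not work here, because the lambda bound to is-even? could not see is-odd?.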
So stick with it! SICP is a brilliant book and the video lectures are wonderful.

Can any case of using call/cc be rewritten equivalently without using it?

Can any case of using call/cc be rewritten equivalently without using it?
For example
In (g (call/cc f)), is the purpose of f to evaluate the value of some expression, so that g can be applied to the value?
Is (g (call/cc f)) always able to be rewritten equivalently without call/cc, e.g. (g expression)?
In ((call/cc f) arg), is the purpose of f to evaluate the definition of some function g, so that function g can be applied to the value of arg?
Is ((call/cc f) arg) always able to be rewritten equivalently without call/cc, e.g. (g arg)?
If the answers are yes, why do we need to use call/cc?
I am trying to understand the purpose of using call/cc, by contrasting it to not using it.
The key to the direct answer here is the notion of "Turing equivalence". That is, essentially all of the commonly used programming languages (C, Java, Scheme, Haskell, Lambda Calculus etc. etc.) are equivalent in the sense that for any program in one of these languages, there is a corresponding program in each of the other languages which has the same meaning.
Beyond this, though, some of these equivalences may be "nice" and some may be really horrible. This suggests that we reframe the question: which features can be rewritten in a "nice" way into languages without that feature, and which cannot?
A formal treatment of this comes from Matthias Felleisen, in his 1991 paper "On the Expressive Power of Programming Languages" (https://www.sciencedirect.com/science/article/pii/016764239190036W), which introduces a notion of macro expressibility, pointing out that some features can be rewritten in a local way, and some require global rewrites.
The answer to your original question is obviously yes. Scheme is Turing-complete, with or without call/cc, so even without call/cc, you can still compute anything that is computable.
Why "it is more convenient than writing the equivalent expression using lambda"?
The classic paper On the Expressive Power of Programming Languages by Matthias Felleisen gives one answer to this question. Pretty much, to rewrite a program with call/cc to one without it, you might potentially need to transform your whole program (a global transformation). This is in contrast to some other constructs that only need a local transformation (i.e., can be written as a macro) to remove them.
The key is: If your program is written in continuation passing style, you don't need call/cc. If not, good luck.
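A rough sketch of why CPS makes call/cc unnecessary: when every procedure already receives its continuation k as an explicit argument, a call/cc-like operator is just an ordinary procedure. The names call/cc& and product& are hypothetical, and the trailing & only marks procedures written in CPS.

```scheme
;; CPS version of call/cc: pass F an escape procedure that ignores
;; its own continuation and resumes K instead.
(define (call/cc& f k)
  (f (lambda (value ignored-k) (k value)) k))

;; CPS product of a list of numbers that escapes with 0 as soon as
;; a zero is found, skipping all pending multiplications.
(define (product& lst k)
  (call/cc&
   (lambda (escape k2)
     (let loop ((rest lst) (k3 k2))
       (cond ((null? rest) (k3 1))
             ((= (car rest) 0) (escape 0 k3))
             (else (loop (cdr rest)
                         (lambda (r) (k3 (* (car rest) r))))))))
   k))
```

For example, (product& '(2 3 4) (lambda (x) x)) yields 24, while (product& '(2 0 4) (lambda (x) x)) yields 0 without performing any multiplication.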
I whole-heartedly recommend:
Daniel P. Friedman. "Applications of Continuations: Invited Tutorial". 1988 Principles of Programming Languages (POPL88). January 1988
https://cs.indiana.edu/~dfried/appcont.pdf
If you enjoy reading that paper, then check out:
https://github.com/scheme-live/bibliography/blob/master/page6.md
Of course anything that is written with call/cc can be written without it, because everything in Scheme is ultimately written using lambda. You use call/cc because it is more convenient than writing the equivalent expression using lambda.
There are two senses to this question: an uninteresting one and an interesting one:
The uninteresting one. Is there some computation that you can do with call/cc that you can't do in a language which does not have it?
No, there isn't: call/cc doesn't make a language properly more powerful: it is famously the case that a language with only λ and function application is equivalent to a universal Turing machine, and thus there is no (known...) more powerful computational system.
But that's kind of uninteresting from the point of view of programming-language design: subject to the normal constraints on memory &c, pretty much all programming languages are equivalent to UTMs, but people still prefer to use languages which don't involve punching holes in paper tape if they can.
The interesting one. Is it the case that call/cc makes some desirable features of a programming language easier to express?
The answer to this is yes, it does. I'll just give a couple of examples. Let's say you want to have some kind of non-local exit feature in your language, so some deeply-nested bit of program can just say 'to hell with this I want out', without having to climb back out through some great layer of functions. This is trivial with call/cc: the continuation procedure is the escape procedure. You can wrap it in some syntax if you want it to be nicer:
(define-syntax with-escape
(syntax-rules ()
[(_ (e) form ...)
(call/cc (λ (e) form ...))]))
(with-escape (e)
... code in here, and can call e to escape, and return some values ...)
Can you implement this without call/cc? Well, yes, but not without either relying on some other special construct (say block and return-from in CL), or without turning the language inside out in some way.
And you can build on things like this to implement all sorts of non-local escapes.
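As one possible use of such an escape, here is a sketch that searches a nested list and leaves the traversal as soon as a matching element is found. The procedure find-first is made up for illustration, and with-escape is restated with call-with-current-continuation and lambda so the block should also run outside Racket.

```scheme
(define-syntax with-escape
  (syntax-rules ()
    ((_ (e) form ...)
     (call-with-current-continuation (lambda (e) form ...)))))

;; Return the first atom in TREE (a nested list) satisfying PRED,
;; or #f if there is none.
(define (find-first pred tree)
  (with-escape (return)
    (let walk ((t tree))
      (cond ((pair? t) (walk (car t)) (walk (cdr t)))
            ((null? t) #f)
            ((pred t) (return t))   ; jump straight out of every pending walk
            (else #f)))
    #f))
```

For example, (find-first negative? '(1 (2 (3 -4) 5) -6)) returns -4 without visiting 5 or -6.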
Or, well, let's say you want GO TO (the following example is Racket):
(define (test n)
(define m 0)
(define start (call/cc (λ (c) c)))
(printf "here ~A~%" m)
(set! m (+ m 1))
(when (< m n)
(start start)))
Or, with some syntax around this:
(define-syntax-rule (label place)
(define place (call/cc identity)))
(define (go place)
(place place))
(define (horrid n)
(define m 0)
(label start)
(printf "here ~A~%" m)
(set! m (+ m 1))
(when (< m n)
(go start)))
So, OK, this perhaps is not a desirable feature of a programming language. But, well, Scheme doesn't have GO TO right, and yet, here, it does.
So, yes, call/cc (especially when combined with macros) makes a lot of desirable features of a programming language possible to express. Other languages have all these special-purpose, limited hacks; Scheme has this universal thing from which all these special-purpose hacks can be built.
The problem is that call/cc doesn't stop with the good special-purpose hacks: you can also build all the awful horrors that used to blight programming languages out of it. call/cc is like having access to an elder god: it's really convenient if you want dread power, but you'd better be careful what comes with it when you call, because it may well be an unspeakable horror from beyond spacetime.
An easy use of call/cc is as a bail-out, e.g.:
;; (1 2) => (2 4)
;; #f if one element is not a number
(define (double-numbers lst)
(call/cc
(lambda (exit)
(let helper ((lst lst))
(cond ((null? lst) '())
((not (number? (car lst))) (exit #f))
(else (cons (* 2 (car lst)) (helper (cdr lst)))))))))
To understand this: if we evaluate (double-numbers '(1 2 r)), the result is #f, even though the helper was in the middle of building (cons 2 (cons 4 (exit #f))).
The continuation captured as exit is whatever called double-numbers, so calling it abandons those pending cons operations and behaves as if double-numbers had returned normally. Here is an example without call/cc:
;; (1 2) => (2 4)
;; #f if one element is not a number
(define (double-numbers lst)
(define (helper& lst cont)
(cond ((null? lst) (cont '()))
((not (number? (car lst))) #f) ; bail out, not using cont
(else (helper& (cdr lst)
(lambda (result)
(cont (cons (* 2 (car lst)) result)))))))
(helper& lst values)) ; values works as an identity procedure
I imagine it gets harder pretty quickly, e.g. in my generator implementation. The generator relies on having access to continuations to mix the generator code with the code where it's used; without call/cc you'd need to write the generator, the generated generator, and the code that uses it all in CPS.

How to use symbols and lists in scheme to process data?

I am a newbie in Scheme, and I am writing a function that checks pairwise disjointness of grammar rules (for the time being it is incomplete). I use symbols and lists to represent the rules of the grammar: an uppercase symbol is a non-terminal, and a lowercase symbol is a terminal. I am trying to check whether a rule passes the pairwise disjointness test.
I will basically check if a rule has only one unique terminal in it. If that is the case, the rule passes the pairwise disjointness test. In Scheme, I am thinking of doing that by representing terminal symbols in lower case. An example of such a rule would be:
'(A <= (a b c))
I will then check the case of a rule that contains an or. like:
'(A <= (a (OR (a b) (a c))))
Finally, I will check recursively for non terminals. A rule for that case would be
'(A <= (B b c))
However, what is keeping me stuck is how to use those symbols as data so that I can process them and recurse on them. I thought about converting the symbols to strings, but that did not work for a list such as '(a b c). How can I do it?
Here is what I reached so far:
#lang racket
(define grammar
  '(A <= (a A b)))
(define (pairwise-disjoint lst)
  (print (symbol->string (car lst)))
  (print (cddr lst)))
Pairwise Disjoint
As far as I know, the only way to check if a set is pairwise disjoint is to enumerate every possible pair and check for matches. Note that this does not follow the racket syntax, but the meaning should still be pretty clear.
(define (contains-match? x lst)
  (cond ((null? x) #f)                     ; Nothing to do
        ((null? lst) #f)                   ; Finished walking full list
        ((eq? x (car lst)) #t)             ; Found a match, no need to go further
        (else
         (contains-match? x (cdr lst)))))  ; recursive call to keep walking
(define (pairwise-disjoint? lst)
  (cond ((null? lst) #t)                   ; no elements left, so no pair can clash
        ;; check the first element against all later elements in the list
        ((contains-match? (car lst) (cdr lst)) #f)
        (else
         (pairwise-disjoint? (cdr lst))))) ; then repeat for the rest of the list
It's not clear to me what else you're trying to do, but this is going to be the general method. Depending on your data, you may need to use a different (or even custom-made) check for equality, but this works as is for normal symbols:
]=> (pairwise-disjoint? '(a b c d e))
;Value: #t
]=> (pairwise-disjoint? '(a b c d e a))
;Value: #f
Symbols & Data
This section is based on what I perceive to be a pretty fundamental misunderstanding of scheme basics by OP, and some speculation about what their actual goal is. Please clarify the question if this next bit doesn't help you!
However, What is keeping me stuck is how to use those symbols as data...
In scheme, you can associate a symbol with whatever you want. In fact, the define keyword really just tells the interpreter "Whenever I say contains-match? (which is a symbol) I'm actually referring to this big set of instructions over there, so remember that." The interpreter remembers this by storing the symbol and the thing it refers to in a big table so that it can be found later.
Whenever the interpreter runs into a symbol, it will look in its table to see if it knows what it actually means and substitute the real value, in this case a function.
]=> pairwise-disjoint?
;Value 2: #[compound-procedure 2 pairwise-disjoint?]
We tell the interpreter to keep the symbol in place rather than substituting by using the quote operator, ' or (quote ...):
]=> 'pairwise-disjoint?
;Value: pairwise-disjoint?
All that said, using define for your purposes is probably a really poor decision for all of the same reasons that global variables are generally bad.
To hold the definitions of all your particular symbols important to the grammar, you're probably looking for something like a hash table where each symbol you know about is a key and its particulars are the associated value.
And, if you want to pass around symbols, you really need to understand the quote and quasiquote.
Once you have your definitions somewhere that you can find them, the only work that's left to you is writing something like I did above that is maybe a little more tailored to your particular situation.
Data Types
If you have Terminals and Non-Terminals, why not make data-types for each? In #lang racket the way to introduce new data type is with struct.
;; A Terminal just has a name.
;; #:transparent lets equal? compare Terminals by their fields.
(struct Terminal (name) #:transparent)
;; A Non-terminal has a name and a list of terms.
;; The list of terms may contain Terminals, Non-terminals, or both.
(struct Non-terminal (name terms))
Processing Non-terminals
Now we can find the Terminals in a Non-Terminal's list of terms using the predicate Terminal? which is provided automatically when we define the Terminal as a struct.
(define (find-terminals non-terminal)
(filter Terminal? (Non-terminal-terms non-terminal)))
Pairwise Disjoint Terminals
Once we have filtered the list of terms we can determine properties:
;; List(Terminal) -> Boolean
(define (pairwise-disjoint? terminals)
  (define (roundtrip terms)
    (set->list (list->set terms)))
  (= (length (roundtrip terminals))
     (length terminals)))
The round trip list->set->list isn't necessarily optimized for speed, of course and profiling actual working implementations may justify refactoring, but at least it's been black-boxed.
Notes
Defining data types with struct provides all sorts of options for validating data as the type is instantiated. If you look at the Racket code base, you will see struct used frequently in the more recent portions.
Since grammar has a list within a list, I think you'll have to either test via list? before calling symbol->string (since, as you discovered, symbol->string won't work on a list), or else you could do something like this:
(map symbol->string (flatten grammar))
> '("A" "<=" "a" "A" "b")
Edit: For what you're doing, I guess the flatten route might not be that helpful. So yes, test via list? each time when parsing and handle accordingly.
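A minimal sketch of that "test each element and handle accordingly" approach, working directly on the quoted rule from the question. The helper names are hypothetical, and terminal? assumes a case-sensitive reader (as in Racket), so that A and a are different symbols.

```scheme
;; A rule looks like (A <= (a b c)): name, arrow, body.
(define (rule-name rule) (car rule))
(define (rule-body rule) (caddr rule))

;; Walk a body recursively and collect every symbol in it, in order.
(define (collect-symbols body)
  (cond ((null? body) '())
        ((symbol? body) (list body))
        ((pair? body) (append (collect-symbols (car body))
                              (collect-symbols (cdr body))))
        (else '())))

;; Assumed convention from the question: lowercase first letter means terminal.
(define (terminal? sym)
  (char-lower-case? (string-ref (symbol->string sym) 0)))
```

Note that under this convention the OR marker itself counts as a non-terminal, so real code would probably treat it specially.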

How to create a Scheme definition to parse a compound S-expression and put in a normal form

Given an expression of the form (* 3 (+ x y)), how can I evaluate the expression so as to put it in the form (+ (* 3 x) (* 3 y))? (Note: in the general case, 3 is any constant, and "x" or "y" could be single variables or other s-expressions, e.g. (+ 2 x).)
How do I write a lambda expression that will recursively evaluate the items (atoms?) in the original expression and determine whether they are a constant or a term? In the case of a term, it would then need to be evaluated again recursively to determine the type of each item in that term's list.
Again, the crux of the issue for me is the recursive "kernel" of the definition.
I would obviously need a base case that would determine once I have reached the last, single atom in the deepest part of the expression. Then recursively work "back up", building the expression in the proper form according to rules.
Coming from a Java / C++ background I am having great difficulty in understanding how to do this syntactically in Scheme.
Let's take a quick detour from the original problem to something slightly related. Say that you're given the following: you want to write an evaluator that takes "string-building" expressions like (* 3 "hello") and "evaluates" it to "hellohellohello". Other examples that we'd like to make work include things like
(+ "rock" (+ (* 5 "p") "aper")) ==> "rockpppppaper"
(* 3 (+ "scis" "sors")) ==> "scissorsscissorsscissors"
To tackle a problem like this, we need to specify exactly what the shape of the inputs is. Essentially, we want to describe a data-type. We'll say that our inputs are going to be "string-expressions". Let's call them str-exprs for short. Then let's define what it means to be a str-expr.
A str-expr is either:
<string>
(+ <str-expr> <str-expr>)
(* <number> <str-expr>)
By this notation, we're trying to say that str-exprs will all fit one of those three shapes.
Once we have a good idea of what the shape of the data is, we have a better guide to write functions that process str-exprs: they must case-analyze those three alternatives!
;; A str-expr is either:
;; a plain string, or
;; (+ str-expr str-expr), or
;; (* number str-expr)
;; We want to write a definition to "evaluate" such string-expressions.
;; evaluate: str-expr -> string
(define (evaluate expr)
(cond
[(string? expr)
...]
[(eq? (first expr) '+)
...]
[(eq? (first expr) '*)
...]))
where the '...'s are things that we'll be filling in.
Actually, we know how to fill in a little more about the '...': we know that in the second and third cases, the inner parts are themselves str-exprs. Those are prime spots where recursion will probably happen: since our data is described in terms of itself, the programs that process it will also probably need to refer to themselves. In short, programs that process str-exprs will almost certainly follow this shape:
(define (evaluate expr)
(cond
[(string? expr)
... expr
...]
[(eq? (first expr) '+)
... (evaluate (second expr))
... (evaluate (third expr))
...]
[(eq? (first expr) '*)
... (second expr)
... (evaluate (third expr))
...]))
That's all without even doing any real work: we can figure this part out purely because that's what the data's shape tells us. Filling in the remainder of the '...'s to make this all work out is actually not too bad, especially when we also consider the test cases we've cooked up.
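One possible way to fill in the remaining holes is sketched here. The helper repeat-string is made up for this example, and car, cadr, and caddr stand in for first, second, and third so that the block does not depend on Racket.

```scheme
;; Hypothetical helper: concatenate N copies of the string S.
(define (repeat-string n s)
  (if (<= n 0)
      ""
      (string-append s (repeat-string (- n 1) s))))

;; evaluate: str-expr -> string
(define (evaluate expr)
  (cond ((string? expr) expr)
        ((eq? (car expr) '+)
         (string-append (evaluate (cadr expr))
                        (evaluate (caddr expr))))
        ((eq? (car expr) '*)
         (repeat-string (cadr expr)
                        (evaluate (caddr expr))))
        (else (error "evaluate: not a str-expr" expr))))
```

Each branch lines up with one alternative of the data definition, and the recursive calls sit exactly where the definition refers to str-expr.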
It's this kind of standard data-analysis/case-analysis that's at the heart of your question, and it's one that's covered extensively by curricula such as HTDP. This is not Scheme or Racket specific: you'd do the same kind of data analysis in Java, and you see the same kind of approach in many other places. In Java, the low-level mechanism used for the case analysis might be different, perhaps dynamic dispatch, but the core ideas are all the same. You need to describe the data. Once you have a data definition, use it to help you sketch out what the code needs to look like to process that data. Use test cases to triangulate how to fill in the sketch.
You need to break down your problem. I would start by following the HtDP (www.htdp.org) approach. What are your inputs? Can you specify them precisely? In this case, those inputs are going to be self-referential.
Then, specify the output form. In fact, your text above is a little fuzzy on this: I think I know what your output form looks like, but I'm not entirely sure.
Next, write a set of tests. These should be based on the structure of your input terms; start with the simplest ones, and work upward from there.
Once you have a good set of tests, implementing the function should be pretty straightforward. I'd be glad to help if you get stuck!
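Applying the same recipe to the original problem might lead to something like the sketch below. It rests on assumptions the question does not confirm: every + and * has exactly two operands, the constant comes first in a product, and a single pass over the expression is enough. The name distribute is made up.

```scheme
;; An expr is either:
;;   an atom (number or symbol), or
;;   (+ expr expr), or
;;   (* constant expr)
;; distribute: expr -> expr, rewriting (* c (+ a b)) into (+ (* c a) (* c b)).
(define (distribute expr)
  (cond ((not (pair? expr)) expr)                    ; atom: nothing to rewrite
        ((and (eq? (car expr) '*)
              (pair? (caddr expr))
              (eq? (car (caddr expr)) '+))           ; (* c (+ a b))
         (let ((c (cadr expr))
               (sum (caddr expr)))
           (list '+
                 (distribute (list '* c (cadr sum)))
                 (distribute (list '* c (caddr sum))))))
        (else                                        ; recur into the operands
         (cons (car expr) (map distribute (cdr expr))))))
```

With these assumptions, (distribute '(* 3 (+ x y))) gives (+ (* 3 x) (* 3 y)). Nested products such as (* 3 (* 2 (+ x y))) are only partly rewritten by one pass, which is the kind of case a test suite would expose.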

Why does this code not work in Scheme?

(define a 42)
(set! 'a 10)
(define a 42)
(define (symbol) 'a)
(set! (symbol) 10)
(define a (cons 1 2))
(set! (car a) 10)
I tried running them in DrScheme and they don't work. Why?
Think of set! as a special form like define, which does not evaluate its first operand. You are telling the Scheme interpreter to set that variable exactly as you write it. In your example, it will not evaluate the expression 'a to the symbol a. Instead, it will look for a variable binding named "'a" (or, depending on your interpreter, it might just break before then, since I think 'a is not a valid variable name).
For the last set of expressions, if you want to set the car of a pair, use the function (set-car! pair val) which works just like any scheme function in that it evaluates all of its operands. It takes in two values, a pair and some scheme value, and mutates the pair so that the car is now pointing to the scheme value.
So for example.
> (define pair (cons 1 2))
> pair
(1 . 2)
> (set-car! pair 3)
> pair
(3 . 2)
Because the first argument of set! is a variable name, not an "lvalue", so to speak.
For the last case, use (set-car! a 10).
The issue is with (set! 'a 10), as you shouldn't be quoting the symbol a.
It sounds like you're trying to learn Scheme, and you don't know Lisp, yes? If so, I strongly recommend trying Clojure as an easier to learn Lisp. I failed to grasp the interaction between the reader, evaluation, symbols, special forms, macros, and so forth in both Common Lisp and Scheme because those things all seemed to interact in tangled ways, but I finally really understand them in Clojure. Even though it's new, I found Clojure documentation is actually clearer than anything I found for Scheme or CL. Start with the videos at http://clojure.blip.tv/ and then read the docs at clojure.org.
