How to create a Scheme definition to parse a compound S-expression and put in a normal form - scheme

Given an expression in the form of : (* 3 (+ x y)), how can I evaluate the expression so as to put it in the form (+ (* 3 x) (* 3 y))? (note: in the general case, 3 is any constant, and "x" or "y" could be terms of single variables or other s-expressions (e.g. (+ 2 x)).
How do I write a lambda expression that will recursively evaluate the items (atoms?) in the original expression and determine whether they are a constant or a term? In the case of a term, it would then need to be evaluated again recursively to determine the type of each item in that term's list.
Again, the crux of the issue for me is the recursive "kernel" of the definition.
I would obviously need a base case that would determine once I have reached the last, single atom in the deepest part of the expression. Then recursively work "back up", building the expression in the proper form according to rules.
Coming from a Java / C++ background I am having great difficulty in understanding how to do this syntactically in Scheme.

Let's take a quick detour from the original problem to something slightly related. Say that you're given the following: you want to write an evaluator that takes "string-building" expressions like (* 3 "hello") and "evaluates" it to "hellohellohello". Other examples that we'd like to make work include things like
(+ "rock" (+ (* 5 "p") "aper")) ==> "rockpppppaper"
(* 3 (+ "scis" "sors")) ==> "scissorsscissorsscissors"
To tackle a problem like this, we need to specify exactly what the shape of the inputs are. Essentially, we want to describe a data-type. We'll say that our inputs are going to be "string-expressions". Let's call them str-exprs for short. Then let's define what it means to be a str-expr.
A str-expr is either:
<string>
(+ <str-expr> <str-expr>)
(* <number> <str-expr>)
By this notation, we're trying to say that str-exprs will all fit one of those three shapes.
Once we have a good idea of what the shape of the data is, we have a better guide to write functions that process str-exprs: they must case-analyze those three alternatives!
;; A str-expr is either:
;; a plain string, or
;; (+ str-expr str-expr), or
;; (* number str-expr)
;; We want to write a definition to "evaluate" such string-expressions.
;; evaluate: str-expr -> string
(define (evaluate expr)
(cond
[(string? expr)
...]
[(eq? (first expr) '+)
...]
[(eq? (first expr) '*)
...]))
where the '...'s are things that we'll be filling in.
Actually, we know how to fill in a little more about the '...': we know that in the second and third cases, the inner parts are themselves str-exprs. Those are prime spots where recurrence will probably happen: since our data is described in terms of itself, the programs that process them will also probably need to refer to themselves. In short, programs that process str-exprs will almost certainly follow this shape:
(define (evaluate expr)
(cond
[(string? expr)
... expr
...]
[(eq? (first expr) '+)
... (evaluate (second expr))
... (evaluate (third expr))
...]
[(eq? (first expr) '*)
... (second expr)
... (evaluate (third expr))
...]))
That's all without even doing any real work: we can figure this part out just purely because that's what the data's shape tells us. Filling in the remainder of the '...'s to make this all work out is actually not too bad, especially when we also consider the test cases we've cooked up. (Code)
It's this kind of standard data-analysis/case-analysis that's at the heart of your question, and it's one that's covered extensively by curricula such as HTDP. This is not Scheme or Racket specific: you'd do the same kind of data analysis in Java, and you see the same kind of approach in many other places. In Java, the low-mechanism used for the case analysis might be done differently, perhaps with dynamic dispatch, but the core ideas are all the same. You need to describe the data. Once you have a data definition, use it to help you sketch out what the code needs to look like to process that data. Use test cases to triangulate how to fill in the sketch.

You need to break down your problem. I would start by following the HtDP (www.htdp.org) approach. What are your inputs? Can you specify them precisely? In this case, those inputs are going to be self-referential.
Then, specify the output form. In fact, your text above is a little fuzzy on this: I think I know what your output form looks like, but I'm not entirely sure.
Next, write a set of tests. These should be based on the structure of your input terms; start with the simplest ones, and work upward from there.
Once you have a good set of tests, implementing the function should be pretty straightforward. I'd be glad to help if you get stuck!

Related

How to use symbols and lists in scheme to process data?

I am a newbie in scheme, and I am in the process of writing a function that checks pairwise disjointess of rules (for the time being is incomplete), I used symbols and lists in order to represent the rues of the grammar. Uppercase symbol is a non-terminal in the grammar, and lowercase is a terminal. I am trying to check if a rule passes the pairwise disjointness test.
I will basically check if a rule has only one unique terminal in it. if it is the case, that rule passes the pairwise disjointness test. In scheme, I am thinking to realize that by representing the terminal symbol in lower case. An example of that rule would be:
'(A <= (a b c))
I will then check the case of a rule that contains an or. like:
'(A <= (a (OR (a b) (a c))))
Finally, I will check recursively for non terminals. A rule for that case would be
'(A <= (B b c))
However, What is keeping me stuck is how to use those symbols as data in order to be processed and recurse upon it. I thought about converting the symbols to strings, but that did not in case of having a list like that for example '(a b c) How can I do it?
Here is what I reached so far:
#lang racket
(define grammar
'(A <= (a A b))
)
(define (pairwise-disjoint lst)
(print(symbol->string (car lst)))
(print( cddr lst))
)
Pairwise Disjoint
As far as I know, the only way to check if a set is pairwise disjoint is to enumerate every possible pair and check for matches. Note that this does not follow the racket syntax, but the meaning should still be pretty clear.
(define (contains-match? x lst)
(cond ((null? x) #f) ; Nothing to do
((null? lst) #f) ; Finished walking full list
((eq? x (car lst)) #t) ; Found a match, no need to go further
(else
(contains-match? x (cdr lst))))) ; recursive call to keep walking
(define (pairwise-disjoint? lst)
(if (null? lst) #f
(let ((x (car lst)) ; let inner vars just for readability
(tail (cdr lst)))
(not
;; for each element, check against all later elements in the list
(or (contains-match? x tail)
(contains-match? (car tail) (cdr tail)))))))
It's not clear to me what else you're trying to do, but this is the going to be the general method. Depending on your data, you may need to use a different (or even custom-made) check for equality, but this works as is for normal symbols:
]=> (pairwise-disjoint? '(a b c d e))
;Value: #t
]=> (pairwise-disjoint? '(a b c d e a))
;Value: #f
Symbols & Data
This section is based on what I perceive to be a pretty fundamental misunderstanding of scheme basics by OP, and some speculation about what their actual goal is. Please clarify the question if this next bit doesn't help you!
However, What is keeping me stuck is how to use those symbols as data...
In scheme, you can associate a symbol with whatever you want. In fact, the define keyword really just tells the interpreter "Whenever I say contains-match? (which is a symbol) I'm actually referring to this big set of instructions over there, so remember that." The interpreter remembers this by storing the symbol and the thing it refers to in a big table so that it can be found later.
Whenever the interpreter runs into a symbol, it will look in its table to see if it knows what it actually means and substitute the real value, in this case a function.
]=> pairwise-disjoint?
;Value 2: #[compound-procedure 2 pairwise-disjoint?]
We tell the interpreter to keep the symbol in place rather than substituting by using the quote operator, ' or (quote ...):
]=> 'pairwise-disjoint?
;Value: pairwise-disjoint?
All that said, using define for your purposes is probably a really poor decision for all of the same reasons that global variables are generally bad.
To hold the definitions of all your particular symbols important to the grammar, you're probably looking for something like a hash table where each symbol you know about is a key and its particulars are the associated value.
And, if you want to pass around symbols, you really need to understand the quote and quasiquote.
Once you have your definitions somewhere that you can find them, the only work that's left to you is writing something like I did above that is maybe a little more tailored to your particular situation.
Data Types
If you have Terminals and Non-Terminals, why not make data-types for each? In #lang racket the way to introduce new data type is with struct.
;; A Terminal is just has a name.
(struct Terminal (name))
;; A Non-terminal has a name and a list of terms
;; The list of terms may contain Terminals, Non-Terminals, or both.
(struct Non-terminal (name terms))
Processing Non-terminals
Now we can find the Terminals in a Non-Terminal's list of terms using the predicate Terminal? which is provided automatically when we define the Terminal as a struct.
(define (find-terminals non-terminal)
(filter Terminal? (Non-terminal-terms non-terminal)))
Pairwise Disjoint Terminals
Once we have filtered the list of terms we can determine properties:
;; List(Terminal) -> Boolean
define (pairwise-disjoint? terminals)
(define (roundtrip terms)
(set->list (list->set terms)))
(= (length (roundtrip terminals)
(length terminals))))
The round trip list->set->list isn't necessarily optimized for speed, of course and profiling actual working implementations may justify refactoring, but at least it's been black-boxed.
Notes
Defining data types with struct provides all sorts of options for validating data as the type is instantiated. If you look at the Racket code base, you will see struct used frequently in the more recent portions.
Since grammar has a list within a list, I think you'll have to either test via list? before calling symbol->string (since, as you discovered, symbol->string won't work on a list), or else you could do something like this:
(map symbol->string (flatten grammar))
> '("A" "<=" "a" "A" "b")
Edit: For what you're doing, i guess the flatten route might not be that helpful. so ya, test via list? each time when parsing and handle accordingly.

Pattern match function in Scheme Meta Circular Evaluator

I'm trying to add a pattern matching function to an existing scheme meta circular evaluator (this is homework) and I'm a bit lost on the wording of the instructions. I was hoping someone more skilled in this regard could help me interpret this.
The syntax for match should look like the following: (match a ((p1 v1) (p2 v2) (p3 v3)))
And it could be used to find length like so:
(define length (lambda (x)
(match x (('() 0)
(('cons a b) (+ 1 (length b))))))
The pattern language in the function should contain numeric constants, quoted constants, variables, and cons. If patterns are exhausted without finding a match, an error should be thrown.
I thought I understood the concept of pattern matching but implementing it in a function this way has me a bit thrown off. Would anyone be willing to explain what the above syntax is doing (specifically, how match is used in length above) so I can get a better understanding of what this function should do?
(match x (('() 0)
(('cons a b) (+ 1 (length b)))))
It may be most helpful to consider what this code would need to expand into. For each pattern, you'd need a test to determine whether the object you're trying to match matches, and you'd need code to figure out how to bind variables to its subparts. In this case, you'd want an expansion roughly like:
(if (equal? '() x)
0
(if (pair? x)
(let ((a (car x))
(b (cdr x)))
(+ 1 (length b)))
;; indicate failure to match
'NO-MATCH))
You could write that with a cond, too, of course, but if you have to procedurally expand this, it might be easier to use nested if forms.
If you're actually implementing this as a function (and not as a macro (i.e., source transformation), then you'll need to specify exactly how you can work with environments, etc.
I suggest you read the chapter four, Structured Types and the Semantics of Pattern Matching, from The Implementation of Functional Languages. The chapter is written by Simon L. Peyton Jones and Philip Wadler.

Why is a Lisp file not a list of statements?

I've been learning Scheme through the Little Schemer book and it strikes me as odd that a file in Scheme / Lisp isn't a list of statements. I mean, everything is supposed to be a list in Lisp, but a file full of statements doesn't look like a list to me. Is it represented as a list underneath? Seems odd that it isn't a list in the file.
For instance...
#lang scheme
(define atom?
(lambda (x)
(and (not (pair? x)) (not (null? x)))))
(define sub1
(lambda (x y)
(- x y)))
(define add1
(lambda (x y)
(+ x y)))
(define zero?
(lambda (x)
(= x 0)))
Each define statement is a list, but there is no list of define statements.
It is not, because there is no practical reasons for it. In fact, series of define statements change internal state of the language. Information about the state can be accessible via functions. For example , you can ask Lisp if some symbol is bound to a function.
There is no practical benefit in traversing all entered forms (for example, define forms). I suppose that this approach (all statements are elements of a list) would lead to code that would be hard to read.
Also, I think it not quite correct to think that "everything is supposed to be a list in Lisp", since there are also some atomic types, which are quite self-sufficient.
When you evaluate a form, if the form defines something, that definition is added to the environment, and that environment is (or can be) a single list. You can build a program without using files, by just typing definitions into the REPL. In Lisp as in any language, the program “lives” in the run-time environment, not the source files.

Little Schemer "S-expression" predicate

Is it true that this is an S-expression?
xyz
asks The Little Schemer. but how to test?
syntactically, i get how to test other statements like
> (atom? 'turkey)
and
> (list? '(atom))
not entirely sure how to test this...
> (list? '(atom turkey) or)
as it just returns...
or: bad syntax in: or
but anyway, knowing how to test for S-expressions is foxing me
so, as per usual, any illumination much appreciated
An "S-expression" is built of atoms via several (possibly zero) cons applications:
(define (sexp? expr)
(or
; several cases:
(atom? expr)
; or
(and (pair? expr) ; a pair is built by a cons
(sexp? (car expr)) ; from a "car"
(sexp? .........)) ; and a "cdr"
)))
This is practically in English. Nothing more to say about it (in code, I mean). Except, after defining the missing
(define (atom? x)
(not (pair? x)))
we see that (sexp? ...) can only return #t. This is the whole point to it: in Lisp, everything is an S-expression – either an atom, or a pair of S-expressions.
The previous answer is correct -- Scheme (and Lisp) are languages that are based on S-expressions. And the provided code is a great start.
But it's not quite correct that everything is an S-expression in those languages. In this case you have an expression that isn't syntactically correct, so the language chokes when it tries to read it in. In other words, it's not an S-expression.
I know it's frustrating, and honestly, not that great of an answer, but this is a great lesson in one of the Golden Rules of Computer Programming: Garbage In, Garbage Out. The fat is that you are going to get these kinds of errors simply because it isn't really feasible, when starting out programming, to test for every possible way that something isn't an S-expression without using the language itself.

Why does this code not work in Scheme?

(define a 42)
(set! 'a 10)
(define a 42)
(define (symbol) 'a)
(set! (symbol) 10)
(define a (cons 1 2))
(set! (car a) 10)
I tried running them in DrScheme and they don't work. Why?
Think of set! is a special form like define which does not evaluate its first operand. You are telling the scheme interpreter to set that variable exactly how you write it. In your example, it will not evaluate the expression 'a to the word a. Instead, it will look for a variable binding named "'a" (or depending on your interpreter might just break before then since I think 'a is not a valid binding).
For the last set of expressions, if you want to set the car of a pair, use the function (set-car! pair val) which works just like any scheme function in that it evaluates all of its operands. It takes in two values, a pair and some scheme value, and mutates the pair so that the car is now pointing to the scheme value.
So for example.
>(define pair (cons 1 2))
>pair
(1 . 2)
>(set-car! pair 3)
(3 . 2)
Because the first argument of set! is a variable name, not an "lvalue", so to speak.
For the last case, use (set-car! a 10).
The issue is with (set! 'a 10), as you shouldn't be quoting the symbol a.
It sounds like you're trying to learn Scheme, and you don't know Lisp, yes? If so, I strongly recommend trying Clojure as an easier to learn Lisp. I failed to grasp the interaction between the reader, evaluation, symbols, special forms, macros, and so forth in both Common Lisp and Scheme because those things all seemed to interact in tangled ways, but I finally really understand them in Clojure. Even though it's new, I found Clojure documentation is actually clearer than anything I found for Scheme or CL. Start with the videos at http://clojure.blip.tv/ and then read the docs at clojure.org.

Resources