Idiomatic scheme and generic programming, why only on numbers? - scheme

In Scheme, procedures like +, -, *, / work on different types of numbers, but we don't see many other generic procedures.
For example, length works only on lists, so vector-length and string-length are needed.
I guess this comes from the fact that the language doesn't really offer any mechanism for defining generic procedures (except cond, of course), such as "type classes" in Haskell or a standardized object system.
Is there an idiomatic Scheme way to handle generic procedures that I'm not aware of?

Keeping in mind that all the "different types of numbers" are Scheme numbers (i.e. (number? n) evaluates to #t), this behavior actually makes sense. +, -, *, / and all other arithmetic operators operate on numbers only (even though in other languages these would be classified as different number types: int, long, float, etc.). This is because you can't explicitly declare number types in Scheme.
If you really need a generic solution, besides using external libraries, the easiest way is to roll your own:
(define org-length length)
(define (length x)
  (cond
    ((string? x) (string-length x))
    ((vector? x) (vector-length x))
    ; keep going ...
    (else (org-length x))))

No, but you can build your own. Welcome to Scheme!
In the past I've used Swindle to provide generic functions. It's bundled with PLT Scheme. It worked well for me, but it's been a few years. There may be other alternatives out there now.

Read SICP, sections 2.4 and 2.5, which cover the implementation of procedures that can operate on generic data types by means of attaching "tags" to data objects. It's also in lecture 4-B of that MIT video series.
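As a rough illustration of the tagging idea from those SICP sections, here is a minimal sketch in Racket. The names attach-tag, type-tag, contents, put!, get and apply-generic follow the spirit of the book, but the hash-table-based dispatch table and the size operation are assumptions made for this example, not the book's exact code.

```scheme
#lang racket
;; Tagged data: a datum is a pair (tag . contents).
(define (attach-tag tag contents) (cons tag contents))
(define (type-tag datum) (car datum))
(define (contents datum) (cdr datum))

;; Hypothetical dispatch table: (operation . tag) -> procedure.
(define dispatch-table (make-hash))
(define (put! op tag proc) (hash-set! dispatch-table (cons op tag) proc))
(define (get op tag) (hash-ref dispatch-table (cons op tag) #f))

;; Generic entry point: pick the implementation from the argument's tag.
(define (apply-generic op datum)
  (let ((proc (get op (type-tag datum))))
    (if proc
        (proc (contents datum))
        (error "no method for" op (type-tag datum)))))

;; Register one implementation per type.
(put! 'size 'list length)
(put! 'size 'string string-length)
(put! 'size 'vector vector-length)

(define (size x) (apply-generic 'size x))

(size (attach-tag 'list '(a b c)))   ; 3
(size (attach-tag 'string "abcd"))   ; 4
```

New types can be added with further put! calls without touching size itself, which is the main attraction over a growing cond.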

You really want to have an object system for that. You may want to have a look at Tiny CLOS, for instance, which is the de-facto standard object system for Chicken Scheme (see the reference manual), but seems to be available for most Scheme implementations.

Finally, I found a very neat solution in PLT Scheme:
(require (rename-in scheme [length list-length]))
(define length
  (λ (x)
    ((cond [(list? x) list-length]
           [(string? x) string-length]
           [(vector? x) vector-length]
           [else (error "whatever")]) x)))
(length '(a b c))   ; 3
(length "abc")      ; 3
(length #(1 2 3))   ; 3

Related

Can any case of using call/cc be rewritten equivalently without using it?

Can any case of using call/cc be rewritten equivalently without using it?
For example
In (g (call/cc f)), is the purpose of f to evaluate the value of
some expression, so that g can be applied to the value?
Is (g (call/cc f)) always able to be rewritten equivalently
without call/cc e.g. (g expression)?
In ((call/cc f) arg), is the purpose of f to evaluate the
definition of some function g, so that function g can be
applied to the value of arg?
Is ((call/cc f) arg) always able to be rewritten equivalently
without call/cc e.g. (g arg)?
If the answers are yes, why do we need to use call/cc?
I am trying to understand the purpose of using call/cc, by contrasting it to not using it.
The key to the direct answer here is the notion of "Turing equivalence". That is, essentially all of the commonly used programming languages (C, Java, Scheme, Haskell, Lambda Calculus etc. etc.) are equivalent in the sense that for any program in one of these languages, there is a corresponding program in each of the other languages which has the same meaning.
Beyond this, though, some of these equivalences may be "nice" and some may be really horrible. This suggests that we reframe the question: which features can be rewritten in a "nice" way into languages without that feature, and which cannot?
A formal treatment of this comes from Matthias Felleisen, in his 1991 paper "On the Expressive Power of Programming Languages" (https://www.sciencedirect.com/science/article/pii/016764239190036W), which introduces a notion of macro expressibility, pointing out that some features can be rewritten in a local way, and some require global rewrites.
The answer to your original question is obviously yes. Scheme is Turing-complete, with or without call/cc, so even without call/cc, you can still compute anything that is computable.
Why "it is more convenient than writing the equivalent expression using lambda"?
The classic paper "On the Expressive Power of Programming Languages" by Matthias Felleisen gives one answer to this question. In short, to rewrite a program that uses call/cc into one that doesn't, you may need to transform your whole program (a global transformation). This is in contrast to other constructs that need only a local transformation (i.e., they can be written as a macro) to remove them.
The key is: If your program is written in continuation passing style, you don't need call/cc. If not, good luck.
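To make the CPS remark concrete, here is a small sketch in Racket. It assumes a convention where every procedure takes its continuation as a last argument; the names add&, call/cc& and example& are made up for this illustration. In that style call/cc stops being special: it is an ordinary procedure that passes the current continuation along as a value.

```scheme
#lang racket
;; In CPS every procedure receives its continuation k explicitly.
(define (add& a b k) (k (+ a b)))

;; call/cc in CPS: give f a procedure that ignores the continuation
;; it is called with and resumes the captured one (k) instead.
(define (call/cc& f k)
  (f (lambda (v ignored-k) (k v)) k))

;; CPS version of (+ 1 (call/cc (lambda (c) (+ 10 (c 5))))).
(define (example& k)
  (call/cc& (lambda (c k1)
              ;; Calling c abandons the pending (+ 10 ...) step.
              (c 5 (lambda (v) (add& 10 v k1))))
            (lambda (r) (add& 1 r k))))

(example& (lambda (x) x))   ; 6
```

The cost is visible even here: every procedure in the program has to follow the convention, which is the global transformation mentioned above.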
I whole-heartedly recommend:
Daniel P. Friedman. "Applications of Continuations: Invited Tutorial". 1988 Principles of Programming Languages (POPL88). January 1988
https://cs.indiana.edu/~dfried/appcont.pdf
If you enjoy reading that paper, then check out:
https://github.com/scheme-live/bibliography/blob/master/page6.md
Of course anything that is written with call/cc can be written without it, because everything in Scheme is ultimately written using lambda. You use call/cc because it is more convenient than writing the equivalent expression using lambda.
There are two senses to this question: an uninteresting one and an interesting one:
The uninteresting one. Is there some computation that you can do with call/cc that you can't do in a language which does not have it?
No, there isn't: call/cc doesn't make a language properly more powerful: it is famously the case that a language with only λ and function application is equivalent to a universal Turing machine, and thus there is no (known...) more powerful computational system.
But that's kind of uninteresting from the point of view of programming-language design: subject to the normal constraints on memory &c, pretty much all programming languages are equivalent to UTMs, but people still prefer to use languages which don't involve punching holes in paper tape if they can.
The interesting one. Is it the case that call/cc makes some desirable features of a programming language easier to express?
The answer to this is yes, it does. I'll just give a couple of examples. Let's say you want to have some kind of non-local exit feature in your language, so some deeply-nested bit of program can just say 'to hell with this I want out', without having to climb back out through some great layer of functions. This is trivial with call/cc: the continuation procedure is the escape procedure. You can wrap it in some syntax if you want it to be nicer:
(define-syntax with-escape
  (syntax-rules ()
    [(_ (e) form ...)
     (call/cc (λ (e) form ...))]))
(with-escape (e)
  ... code in here, and can call e to escape, and return some values ...)
Can you implement this without call/cc? Well, yes, but not without either relying on some other special construct (say block and return-from in CL), or without turning the language inside out in some way.
And you can build on things like this to implement all sorts of non-local escapes.
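For instance, an early return from the middle of an iteration might look like the following Racket sketch. The helper name find-first is hypothetical and only serves to show the escape in action.

```scheme
#lang racket
;; Hypothetical helper: return the first element satisfying pred, or #f.
(define (find-first pred lst)
  (call/cc
   (lambda (return)
     (for-each (lambda (x)
                 (when (pred x)
                   (return x)))   ; jump straight out of for-each
               lst)
     #f)))                        ; reached only if nothing matched

(find-first even? '(1 3 4 5 6))   ; 4
(find-first even? '(1 3 5))       ; #f
```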
Or, well, let's say you want GO TO (the following example is Racket):
(define (test n)
  (define m 0)
  (define start (call/cc (λ (c) c)))
  (printf "here ~A~%" m)
  (set! m (+ m 1))
  (when (< m n)
    (start start)))
Or, with some syntax around this:
(define-syntax-rule (label place)
  (define place (call/cc identity)))
(define (go place)
  (place place))
(define (horrid n)
  (define m 0)
  (label start)
  (printf "here ~A~%" m)
  (set! m (+ m 1))
  (when (< m n)
    (go start)))
So, OK, this perhaps is not a desirable feature of a programming language. But, well, Scheme doesn't have GO TO right, and yet, here, it does.
So, yes, call/cc (especially when combined with macros) makes a lot of desirable features of a programming language possible to express. Other languages have all these special-purpose, limited hacks, Scheme has this universal thing from which all these special-purpose hacks can be built.
The problem is that call/cc doesn't stop with the good special-purpose hacks: you can also build all the awful horrors that used to blight programming languages out of it. call/cc is like having access to an elder god: it's really convenient if you want dread power, but you'd better be careful what comes with it when you call, because it may well be an unspeakable horror from beyond spacetime.
An easy use of call/cc is as a bail-out, e.g.:
;; (1 2) => (2 4)
;; #f if one element is not a number
(define (double-numbers lst)
  (call/cc
   (lambda (exit)
     (let helper ((lst lst))
       (cond ((null? lst) '())
             ((not (number? (car lst))) (exit #f))
             (else (cons (* 2 (car lst)) (helper (cdr lst)))))))))
To understand this: if we evaluate (double-numbers '(1 2 r)), the result is #f, even though the helper already has the pending computation (cons 2 (cons 4 (exit #f))).
Without call/cc, the continuation is whatever called double-numbers, since the procedure actually returns normally. Here is an example without call/cc:
;; (1 2) => (2 4)
;; #f if one element is not a number
(define (double-numbers lst)
  (define (helper& lst cont)
    (cond ((null? lst) (cont '()))
          ((not (number? (car lst))) #f) ; bail out, not using cont
          (else (helper& (cdr lst)
                         (lambda (result)
                           (cont (cons (* 2 (car lst)) result)))))))
  (helper& lst values)) ; values works as an identity procedure
I imagine it gets harder pretty quickly, e.g. with my generator implementation. The generator relies on having access to continuations to interleave the generator code with the code that uses it; without call/cc you'll need to do CPS in the generator, the generated generator, and the code that uses it.

Data Structures in Scheme

I am learning Scheme, coming from a background of Haskell, and I've run into a pretty surprising issue: Scheme doesn't seem to have custom data types (i.e. objects, structs, etc.). I know some implementations have their own custom macros implementing structs, but R6RS itself doesn't seem to provide any such feature.
Given this, I have two questions:
Is this correct? Am I missing a feature that allows creation of custom data types?
If not, how do scheme programmers structure a program?
For example, any function trying to return multiple items of data needs some way of encapsulating the data. Is the best practice to use a hash map?
(define (read-user-input)
(display "1. Add todo\n2. Delete todo\n3. Modify todo\n")
(let ((cmd-num (read)))
(if (equal? cmd-num "1") '(("command-number" . cmd-num) ("todo-text" . (read-todo)))
(if (equal? cmd-num "2") '(("command-number" . cmd-num) ("todo-id" . (read-todo-id)))
'(("command-number" . cmd-num) ("todo-id" . (read-todo-id)))))))
In order to answer your question, I think it might help to give you a slightly bigger-picture comment.
Scheme has often been described as not so much a single language as a family of languages. This is particularly true of R5RS, which is still what many people mean when they say "Scheme."
Nearly every one of the languages in the Scheme family has structures. I'm personally most familiar with Racket, where you can define structures with
struct or define-struct.
"But", you might say, "I want to write my program so that it runs in all versions of Scheme." Several very smart people have succeeded in doing this: Dorai Sitaram and Oleg Kiselyov both come to mind. However, my observation about their work is that generally, maintaining compatibility with many versions of scheme without sacrificing performance usually requires a high level of macro expertise and a good deal of Serious Thinking.
It's true that several of the SRFIs describe structure facilities. My own personal advice to you is to pick a Scheme implementation and allow yourself to feel good about using whatever structure facilities it provides. In some ways, this is not unlike Haskell; there are features that are specific to ghc, and generally, I claim that most Haskell programmers are happy to use these features without worrying that they don't work in all versions of Haskell.
Absolutely not. Scheme has several SRFIs for custom types, a.k.a. record types, and with R7RS Red Edition it will be SRFI-136; but since you mention R6RS, it has records defined in the standard too.
Example using R6RS:
#!r6rs
(import (rnrs))
(define-record-type (point make-point point?)
  (fields (immutable x point-x)
          (immutable y point-y)))
(define test (make-point 3 7))
(point-x test) ; ==> 3
(point-y test) ; ==> 7
Early Scheme (and lisp) didn't have record types and you usually made constructors and accessors:
Example:
(define (make-point x y)
...)
(define (point-x p)
...)
(define (point-y p)
...)
This is the same contract the record types actually create. How it is implemented is really not important. Here are some ideas:
(define make-point cons)
(define point-x car)
(define point-y cdr)
This works most of the time, but is not really very safe. Perhaps this is better:
(define tag (list 'point))
(define idx-tag 0)
(define idx-x 1)
(define idx-y 2)
(define (point? p)
  (and (vector? p)
       (positive? (vector-length p))
       (eq? tag (vector-ref p idx-tag))))
(define (make-point x y)
  (vector tag x y))
;; just an abstraction. Might not be exported
(define (point-acc p idx)
  (if (point? p)
      (vector-ref p idx)
      (raise "not a point")))
(define (point-x p)
  (point-acc p idx-x))
(define (point-y p)
  (point-acc p idx-y))
Now if you look at the reference implementation for record types, you'll find that it uses vectors, so the vector version and R6RS's aren't that different.
Lookup? You can use a vector, list or a case:
Example:
;; list is good for a few elements
(define ops `((+ . ,+) (- . ,-)))
(let ((found (assq '+ ops)))
(if found
((cdr found) 1 2)
(raise "not found")))
; ==> 3
;; case (switch)
((case '+
((+) +)
((-) -)
(else (raise "not found"))) 1 2) ; ==> 3
Of course you have hash tables in SRFI-125, so for a large number of elements that's probably wise. Know that it probably uses a vector to store the elements :-)
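A hash-table version of the same lookup could look like the sketch below. It uses Racket's built-in hash tables (make-hasheq, hash-set!, hash-ref) rather than the SRFI-125 API, and the helper name apply-op is an assumption made for this example.

```scheme
#lang racket
;; Operator table keyed by symbol; make-hasheq compares keys with eq?.
(define ops (make-hasheq))
(hash-set! ops '+ +)
(hash-set! ops '- -)
(hash-set! ops '* *)

;; Hypothetical helper: look up op and apply it, raising if it is unknown.
(define (apply-op op . args)
  (let ((proc (hash-ref ops op #f)))
    (if proc
        (apply proc args)
        (raise "not found"))))

(apply-op '+ 1 2)   ; ==> 3
(apply-op '* 3 4)   ; ==> 12
```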

Scheme: what is the intuition for COND to support multiple expressions in its body?

All of the following are correct. But version 2 seems a bit confusing, as it suggests an order/sequence of execution, which I think is discouraged in functional programming. So I wonder what the intuition/benefit of allowing version 2 is. Is it just for simpler code than version 3?
; version 1
(define (foo x)
  (cond ((> x 0) 1)))
; version 2
(define (foo x)
  (cond ((> x 0) 1 2 3)))
; version 3
(define (foo x)
  (cond ((> x 0)
         (begin 1 2 3))))
It is not only discouraged, but pointless, for functional programming (either of version 2 or 3). But it is useful if you need to produce side-effects (for example, printing), and version 2 is a bit simpler than version 3.
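A small Racket sketch of the side-effect case: each clause prints something and then yields the value of its last expression. The procedure name classify is made up for the example.

```scheme
#lang racket
;; Each clause has several body expressions; only the last one is the result.
(define (classify x)
  (cond ((> x 0)
         (display "positive")   ; side effect
         (newline)
         'pos)                  ; value of the clause
        (else
         (display "not positive")
         (newline)
         'non-pos)))

(classify 5)    ; prints "positive", returns 'pos
(classify -1)   ; prints "not positive", returns 'non-pos
```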
Scheme isn't a functional language, let alone a non-strictly evaluated one. Scheme directly provides sequenced evaluation of forms. The cond form itself isn't strictly functional: it evaluates the test clauses in strict order, and when it finds one which yields true, it skips the remaining ones. So even without using multiple forms in a single cond clause, we can express imperative programming:
(cond
  ((> x 10)
   (set! y 3))
  ((< x 0)
   (set! z 5)))
The cond form has a long history in Lisp. It was present in some of the earliest versions of Lisp and is described in the 1960 Lisp 1 manual. In that manual, the cond which is described in fact doesn't allow multiple forms: its arguments are strict pairs. It is still that way in the Lisp 1.5 manual. At some point, Lisp dialects started exhibiting the multiple-forms support in cond clauses. Curiously, though, the "cond pair" terminology refuses to die.
The intuition behind allowing (cond (test1 e1 e2 .. en)) is that if you do not provide this, the programmer will get the desired behavior anyway, at the cost of extra verbiage, as your example shows with the explicit begin: another level of parenthesis nesting accompanied by an operator symbol.
It is a backward-compatible extension to the original cond. Allowing the extra forms doesn't change the meaning of cond expressions that were previously correct; it adds meaning to cond expressions that were previously ill-formed.
Other dialects of Lisp, such as Common Lisp and Emacs Lisp, have multiple form evaluation in their cond clauses, so not allowing it in Scheme would only reduce compatibility, adding to someone's workload when they convert code from another dialect to Scheme.

How to use symbols and lists in scheme to process data?

I am a newbie in Scheme, and I am in the process of writing a function that checks pairwise disjointness of rules (for the time being it is incomplete). I used symbols and lists to represent the rules of the grammar. An uppercase symbol is a non-terminal in the grammar, and a lowercase symbol is a terminal. I am trying to check if a rule passes the pairwise disjointness test.
I will basically check if a rule has only one unique terminal in it. If that is the case, the rule passes the pairwise disjointness test. In Scheme, I am thinking of realizing that by representing the terminal symbol in lower case. An example of such a rule would be:
'(A <= (a b c))
I will then check the case of a rule that contains an or. like:
'(A <= (a (OR (a b) (a c))))
Finally, I will check recursively for non terminals. A rule for that case would be
'(A <= (B b c))
However, what is keeping me stuck is how to use those symbols as data in order to process them and recurse on them. I thought about converting the symbols to strings, but that did not work in the case of a list such as '(a b c). How can I do it?
Here is what I reached so far:
#lang racket
(define grammar
  '(A <= (a A b)))
(define (pairwise-disjoint lst)
  (print (symbol->string (car lst)))
  (print (cddr lst)))
Pairwise Disjoint
As far as I know, the only way to check if a set is pairwise disjoint is to enumerate every possible pair and check for matches. Note that this does not follow the racket syntax, but the meaning should still be pretty clear.
(define (contains-match? x lst)
  (cond ((null? x) #f)          ; Nothing to do
        ((null? lst) #f)        ; Finished walking full list
        ((eq? x (car lst)) #t)  ; Found a match, no need to go further
        (else
         (contains-match? x (cdr lst))))) ; recursive call to keep walking
(define (pairwise-disjoint? lst)
  (cond ((null? lst) #t)        ; no elements left, so no pair can clash
        ;; for each element, check against all later elements in the list
        ((contains-match? (car lst) (cdr lst)) #f)
        (else (pairwise-disjoint? (cdr lst)))))
It's not clear to me what else you're trying to do, but this is going to be the general method. Depending on your data, you may need to use a different (or even custom-made) check for equality, but this works as is for normal symbols:
]=> (pairwise-disjoint? '(a b c d e))
;Value: #t
]=> (pairwise-disjoint? '(a b c d e a))
;Value: #f
Symbols & Data
This section is based on what I perceive to be a pretty fundamental misunderstanding of scheme basics by OP, and some speculation about what their actual goal is. Please clarify the question if this next bit doesn't help you!
However, What is keeping me stuck is how to use those symbols as data...
In scheme, you can associate a symbol with whatever you want. In fact, the define keyword really just tells the interpreter "Whenever I say contains-match? (which is a symbol) I'm actually referring to this big set of instructions over there, so remember that." The interpreter remembers this by storing the symbol and the thing it refers to in a big table so that it can be found later.
Whenever the interpreter runs into a symbol, it will look in its table to see if it knows what it actually means and substitute the real value, in this case a function.
]=> pairwise-disjoint?
;Value 2: #[compound-procedure 2 pairwise-disjoint?]
We tell the interpreter to keep the symbol in place rather than substituting by using the quote operator, ' or (quote ...):
]=> 'pairwise-disjoint?
;Value: pairwise-disjoint?
All that said, using define for your purposes is probably a really poor decision for all of the same reasons that global variables are generally bad.
To hold the definitions of all your particular symbols important to the grammar, you're probably looking for something like a hash table where each symbol you know about is a key and its particulars are the associated value.
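As a sketch of that idea in Racket: the table name rules, the helper rule-for, and the two sample rules are all assumptions made for this illustration, and the right-hand sides are stored as quoted lists.

```scheme
#lang racket
;; Hypothetical grammar table: non-terminal symbol -> right-hand side.
(define rules (make-hasheq))
(hash-set! rules 'A '(a A b))
(hash-set! rules 'B '(b c))

;; Look up a rule; unknown symbols yield #f instead of an error.
(define (rule-for sym)
  (hash-ref rules sym #f))

(rule-for 'A)   ; '(a A b)
(rule-for 'Z)   ; #f
```

The symbols stay plain data here; nothing is bound with define except the table itself.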
And, if you want to pass around symbols, you really need to understand the quote and quasiquote.
Once you have your definitions somewhere that you can find them, the only work that's left to you is writing something like I did above that is maybe a little more tailored to your particular situation.
Data Types
If you have Terminals and Non-Terminals, why not make data-types for each? In #lang racket the way to introduce new data type is with struct.
;; A Terminal just has a name.
(struct Terminal (name))
;; A Non-terminal has a name and a list of terms
;; The list of terms may contain Terminals, Non-Terminals, or both.
(struct Non-terminal (name terms))
Processing Non-terminals
Now we can find the Terminals in a Non-Terminal's list of terms using the predicate Terminal? which is provided automatically when we define the Terminal as a struct.
(define (find-terminals non-terminal)
(filter Terminal? (Non-terminal-terms non-terminal)))
Pairwise Disjoint Terminals
Once we have filtered the list of terms we can determine properties:
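;; List(Terminal) -> Boolean
;; Note: sets compare members with equal?, which compares opaque structs
;; by identity unless the struct is declared #:transparent.
(define (pairwise-disjoint? terminals)
  (define (roundtrip terms)
    (set->list (list->set terms)))
  (= (length (roundtrip terminals))
     (length terminals)))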
The round trip list->set->list isn't necessarily optimized for speed, of course and profiling actual working implementations may justify refactoring, but at least it's been black-boxed.
Notes
Defining data types with struct provides all sorts of options for validating data as the type is instantiated. If you look at the Racket code base, you will see struct used frequently in the more recent portions.
Since grammar has a list within a list, I think you'll have to either test via list? before calling symbol->string (since, as you discovered, symbol->string won't work on a list), or else you could do something like this:
(map symbol->string (flatten grammar))
> '("A" "<=" "a" "A" "b")
Edit: For what you're doing, I guess the flatten route might not be that helpful. So yes, test via list? each time when parsing and handle accordingly.
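A minimal sketch of the list? approach in Racket, working on symbols directly instead of converting everything to strings. It assumes, as in the question, that a terminal is a symbol written in lower case; the names terminal? and collect-terminals are made up for this example.

```scheme
#lang racket
;; Assumption: a terminal is a symbol whose name is entirely lower case.
;; The upcase check excludes symbols without letters, such as <=.
(define (terminal? x)
  (and (symbol? x)
       (let ((s (symbol->string x)))
         (and (string=? s (string-downcase s))
              (not (string=? s (string-upcase s)))))))

;; Walk a right-hand side, descending into nested lists,
;; and collect the terminals in order of appearance.
(define (collect-terminals rhs)
  (cond ((null? rhs) '())
        ((list? (car rhs))
         (append (collect-terminals (car rhs))
                 (collect-terminals (cdr rhs))))
        ((terminal? (car rhs))
         (cons (car rhs) (collect-terminals (cdr rhs))))
        (else (collect-terminals (cdr rhs)))))   ; skip OR, non-terminals, <=

(collect-terminals '(a A b))                  ; '(a b)
(collect-terminals '(a (OR (a b) (a c))))     ; '(a a b a c)
```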

Why is a Lisp file not a list of statements?

I've been learning Scheme through the Little Schemer book and it strikes me as odd that a file in Scheme / Lisp isn't a list of statements. I mean, everything is supposed to be a list in Lisp, but a file full of statements doesn't look like a list to me. Is it represented as a list underneath? Seems odd that it isn't a list in the file.
For instance...
#lang scheme
(define atom?
  (lambda (x)
    (and (not (pair? x)) (not (null? x)))))
(define sub1
  (lambda (x y)
    (- x y)))
(define add1
  (lambda (x y)
    (+ x y)))
(define zero?
  (lambda (x)
    (= x 0)))
Each define statement is a list, but there is no list of define statements.
It is not, because there is no practical reason for it. In fact, a series of define statements changes the internal state of the language. Information about that state is accessible via functions. For example, you can ask Lisp whether some symbol is bound to a function.
There is no practical benefit in traversing all entered forms (for example, define forms). I suppose that this approach (all statements are elements of a list) would lead to code that would be hard to read.
Also, I think it is not quite correct to think that "everything is supposed to be a list in Lisp", since there are also some atomic types, which are quite self-sufficient.
When you evaluate a form, if the form defines something, that definition is added to the environment, and that environment is (or can be) a single list. You can build a program without using files, by just typing definitions into the REPL. In Lisp as in any language, the program “lives” in the run-time environment, not the source files.
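If you do want the source as a list, the reader makes that easy: read returns one form at a time, and collecting the results gives a list of forms. Here is a small Racket sketch, where read-all-forms and the sample source string are assumptions made for this example.

```scheme
#lang racket
;; Read every top-level form from a port and return them as one list.
(define (read-all-forms port)
  (let loop ((forms '()))
    (let ((form (read port)))
      (if (eof-object? form)
          (reverse forms)
          (loop (cons form forms))))))

;; A string port stands in for a source file here.
(define source "(define x 1) (define (f y) (+ x y))")
(define forms (read-all-forms (open-input-string source)))

(length forms)      ; 2
(car (car forms))   ; 'define
```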
