Self-referential data structures in Lisp/Scheme - data-structures

Is there a way to construct a self-referential data structure (say a graph with cycles) in lisp or scheme? I'd never thought about it before, but playing around I can find no straightforward way to make one due to the lack of a way to make destructive modification. Is this just an essential flaw of functional languages, and if so, what about lazy functional languages like haskell?

In Common Lisp you can modify list contents, array contents, slots of CLOS instances, etc.
Common Lisp also allows to read and write circular data structures. Use
? (setf *print-circle* t)
T
; a list of two symbols: (foo bar)
? (defvar *ex1* (list 'foo 'bar))
*EX1*
; now let the first list element point to the list,
; Common Lisp prints the circular list
? (setf (first *ex1*) *ex1*)
#1=(#1# BAR)
; one can also read such a list
? '#1=(#1# BAR)
#1=(#1# BAR)
; What is the first element? The list itself
? (first '#1=(#1# BAR))
#1=(#1# BAR)
?
So-called pure Functional Programming Languages don't allow side-effects. Most Lisp dialects are not pure. They allow side-effects and they allow to modify data-structures.
See Lisp introduction books for more on that.

In Scheme, you can do it easily with set!, set-car!, and set-cdr! (and anything else ending in a bang ('!'), which indicates modification):
(let ((x '(1 2 3)))
(set-car! x x)
; x is now the list (x 2 3), with the first element referring to itself
)

Common Lisp supports modification of data structures with setf.
You can build a circular data structure in Haskell by tying the knot.

You don't need `destructive modification' to construct self-referential data structures; e.g., in Common Lisp, '#1=(#1#) is a cons-cell that contains itself.
Scheme and Lisp are capable of making destructive modifications: you can construct the circular cons above alternatively like this:
(let ((x (cons nil nil)))
(rplaca x x) x)
Can you let us know what material you're using while learning Lisp/Scheme? I'm compiling a target list for our black helicopters; this spreading of misinformation about Lisp and Scheme has to be stopped.

Yes, and they can be useful. One of my college professors created a Scheme type he called Medusa Numbers. They were arbitrary precision floating point numbers that could include repeating decimals. He had a function:
(create-medusa numerator denominator) ; or some such
which created the Medusa Number that represented the rational. As a result:
(define one-third (create-medusa 1 3))
one-third => ; scheme hangs - when you look at a medusa number you turn to stone
(add-medusa one-third (add-medusa one-third one-third)) => 1
as said before, this is done with judicious application of set-car! and set-cdr!

Not only is it possible, it's pretty central to the Common Lisp Object System: standard-class is an instance of itself!

I upvoted the obvious Scheme techniques; this answer addresses only Haskell.
In Haskell you can do this purely functionally using let, which is considered good style. One nice example is regexp-to-NFA conversion. You can also do it imperatively using IORefs, which is considered poor style as it forces all your code into the IO monad.
In general Haskell's lazy evaluation lends itself to lovely functional implementations of both cyclic and infinite data structures. In any complex let binding, all things bound may be used in all definitions. For example translating a particular finite-state machine into Haskell is a snap, no matter how many cycles it may have.

CLOS example:
(defclass node ()
((child :accessor node-child :initarg :child)))
(defun make-node-cycle ()
(let* ((node1 (make-instance 'node))
(node2 (make-instance 'node :child node1)))
(setf (node-child node1) node2)))

Tying the Knot (circular data structures in Haskell) on StackOverflow
See also the Haskell Wiki page: Tying the Knot

Hmm, self referential data structures in Lisp/Scheme, and SICP streams are not mentioned? Well, to summarize, streams == lazily evaluated list. It might be exactly the kind of self reference you've intended, but it's a kind of self reference.
So, cons-stream in SICP is a syntax that delays evaluating its arguments. (cons-stream a b) will return immediately without evaluating a or b, and only evaluates a or b when you invoke car-stream or cdr-stream
From SICP, http://mitpress.mit.edu/sicp/full-text/sicp/book/node71.html:
>
(define fibs
(cons-stream 0
(cons-stream 1
(add-streams (stream-cdr fibs)
fibs))))
This definition says that fibs is a
stream beginning with 0 and 1, such
that the rest of the stream can be
generated by adding fibs to itself
shifted by one place:
In this case, 'fibs' is assigned an object whose value is defined lazily in terms of 'fibs'
Almost forgot to mention, lazy streams live on in the commonly available libraries SRFI-40 or SRFI-41. One of these two should be available in most popular Schemes, I think

I stumbled upon this question while searching for "CIRCULAR LISTS LISP SCHEME".
This is how I can make one (in STk Scheme):
First, make a list
(define a '(1 2 3))
At this point, STk thinks a is a list.
(list? a)
> #t
Next, go to the last element (the 3 in this case) and replace the cdr which currently contains nil with a pointer to itself.
(set-cdr! (cdr ( cdr a)) a)
Now, STk thinks a is not a list.
(list? a)
> #f
(How does it work this out?)
Now if you print a you will find an infinitely long list of (1 2 3 1 2 3 1 2 ... and you will need to kill the program. In Stk you can control-z or control-\ to quit.
But what are circular-lists good for?
I can think of obscure examples to do with modulo arithmetic such as a circular list of the days of the week (M T W T F S S M T W ...), or a circular list of integers represented by 3 bits (0 1 2 3 4 5 6 7 0 1 2 3 4 5 ..).
Are there any real-world examples?

Related

How to create a Scheme definition to parse a compound S-expression and put in a normal form

Given an expression in the form of : (* 3 (+ x y)), how can I evaluate the expression so as to put it in the form (+ (* 3 x) (* 3 y))? (note: in the general case, 3 is any constant, and "x" or "y" could be terms of single variables or other s-expressions (e.g. (+ 2 x)).
How do I write a lambda expression that will recursively evaluate the items (atoms?) in the original expression and determine whether they are a constant or a term? In the case of a term, it would then need to be evaluated again recursively to determine the type of each item in that term's list.
Again, the crux of the issue for me is the recursive "kernel" of the definition.
I would obviously need a base case that would determine once I have reached the last, single atom in the deepest part of the expression. Then recursively work "back up", building the expression in the proper form according to rules.
Coming from a Java / C++ background I am having great difficulty in understanding how to do this syntactically in Scheme.
Let's take a quick detour from the original problem to something slightly related. Say that you're given the following: you want to write an evaluator that takes "string-building" expressions like (* 3 "hello") and "evaluates" it to "hellohellohello". Other examples that we'd like to make work include things like
(+ "rock" (+ (* 5 "p") "aper")) ==> "rockpppppaper"
(* 3 (+ "scis" "sors")) ==> "scissorsscissorsscissors"
To tackle a problem like this, we need to specify exactly what the shape of the inputs are. Essentially, we want to describe a data-type. We'll say that our inputs are going to be "string-expressions". Let's call them str-exprs for short. Then let's define what it means to be a str-expr.
A str-expr is either:
<string>
(+ <str-expr> <str-expr>)
(* <number> <str-expr>)
By this notation, we're trying to say that str-exprs will all fit one of those three shapes.
Once we have a good idea of what the shape of the data is, we have a better guide to write functions that process str-exprs: they must case-analyze those three alternatives!
;; A str-expr is either:
;; a plain string, or
;; (+ str-expr str-expr), or
;; (* number str-expr)
;; We want to write a definition to "evaluate" such string-expressions.
;; evaluate: str-expr -> string
(define (evaluate expr)
(cond
[(string? expr)
...]
[(eq? (first expr) '+)
...]
[(eq? (first expr) '*)
...]))
where the '...'s are things that we'll be filling in.
Actually, we know how to fill in a little more about the '...': we know that in the second and third cases, the inner parts are themselves str-exprs. Those are prime spots where recurrence will probably happen: since our data is described in terms of itself, the programs that process them will also probably need to refer to themselves. In short, programs that process str-exprs will almost certainly follow this shape:
(define (evaluate expr)
(cond
[(string? expr)
... expr
...]
[(eq? (first expr) '+)
... (evaluate (second expr))
... (evaluate (third expr))
...]
[(eq? (first expr) '*)
... (second expr)
... (evaluate (third expr))
...]))
That's all without even doing any real work: we can figure this part out just purely because that's what the data's shape tells us. Filling in the remainder of the '...'s to make this all work out is actually not too bad, especially when we also consider the test cases we've cooked up. (Code)
It's this kind of standard data-analysis/case-analysis that's at the heart of your question, and it's one that's covered extensively by curricula such as HTDP. This is not Scheme or Racket specific: you'd do the same kind of data analysis in Java, and you see the same kind of approach in many other places. In Java, the low-mechanism used for the case analysis might be done differently, perhaps with dynamic dispatch, but the core ideas are all the same. You need to describe the data. Once you have a data definition, use it to help you sketch out what the code needs to look like to process that data. Use test cases to triangulate how to fill in the sketch.
You need to break down your problem. I would start by following the HtDP (www.htdp.org) approach. What are your inputs? Can you specify them precisely? In this case, those inputs are going to be self-referential.
Then, specify the output form. In fact, your text above is a little fuzzy on this: I think I know what your output form looks like, but I'm not entirely sure.
Next, write a set of tests. These should be based on the structure of your input terms; start with the simplest ones, and work upward from there.
Once you have a good set of tests, implementing the function should be pretty straightforward. I'd be glad to help if you get stuck!

Common lisp macro syntax keywords: what do I even call this?

I've looked through On Lisp, Practical Common Lisp and the SO archives in order to answer this on my own, but those attempts were frustrated by my inability to name the concept I'm interested in. I would be grateful if anyone could just tell me the canonical term for this sort of thing.
This question is probably best explained by an example. Let's say I want to implement Python-style list comprehensions in Common Lisp. In Python I would write:
[x*2 for x in range(1,10) if x > 3]
So I begin by writing down:
(listc (* 2 x) x (range 1 10) (> x 3))
and then defining a macro that transforms the above into the correct comprehension. So far so good.
The interpretation of that expression, however, would be opaque to a reader not already familiar with Python list comprehensions. What I'd really like to be able to write is the following:
(listc (* 2 x) for x in (range 1 10) if (> x 3))
but I haven't been able to track down the Common Lisp terminology for this. It seems that the loop macro does exactly this sort of thing. What is it called, and how can I implement it? I tried macro-expanding a sample loop expression to see how it's put together, but the resulting code was unintelligible. Could anyone guide me in the right direction?
Thanks in advance.
Well, what for does is essentially, that it parses the forms supplied as its body. For example:
(defmacro listc (expr &rest forms)
;;
;;
;; (listc EXP for VAR in GENERATOR [if CONDITION])
;;
;;
(labels ((keyword-p (thing name)
(and (symbolp thing)
(string= name thing))))
(destructuring-bind (for* variable in* generator &rest tail) forms
(unless (and (keyword-p for* "FOR") (keyword-p in* "IN"))
(error "malformed comprehension"))
(let ((guard (if (null tail) 't
(destructuring-bind (if* condition) tail
(unless (keyword-p if* "IF") (error "malformed comprehension"))
condition))))
`(loop
:for ,variable :in ,generator
:when ,guard
:collecting ,expr)))))
(defun range (start end &optional (by 1))
(loop
:for k :upfrom start :below end :by by
:collecting k))
Apart from the hackish "parser" I used, this solution has a disadvantage, which is not easily solved in common lisp, namely the construction of the intermediate lists, if you want to chain your comprehensions:
(listc x for x in (listc ...) if (evenp x))
Since there is no moral equivalent of yield in common lisp, it is hard to create a facility, which does not require intermediate results to be fully materialized. One way out of this might be to encode the knowledge of possible "generator" forms in the expander of listc, so the expander can optimize/inline the generation of the base sequence without having to construct the entire intermediate list at run-time.
Another way might be to introduce "lazy lists" (link points to scheme, since there is no equivalent facility in common lisp -- you had to build that first, though it's not particularily hard).
Also, you can always have a look at other people's code, in particular, if they tries to solve the same or a similar problem, for example:
Iterate
Loop in SBCL
Pipes (which does the lazy list thing)
Macros are code transformers.
There are several ways of implementing the syntax of a macro:
destructuring
Common Lisp provides a macro argument list which also provides a form of destructuring. When a macro is used, the source form is destructured according to the argument list.
This limits how macro syntax looks like, but for many uses of Macros provides enough machinery.
See Macro Lambda Lists in Common Lisp.
parsing
Common Lisp also gives the macro the access to the whole macro call form. The macro then is responsible for parsing the form. The parser needs to be provided by the macro author or is part of the macro implementation done by the author.
An example would be an INFIX macro:
(infix (2 + x) * (3 + sin (y)))
The macro implementation needs to implement an infix parser and return a prefix expression:
(* (+ 2 x) (+ 3 (sin y)))
rule-based
Some Lisps provide syntax rules, which are matched against the macro call form. For a matching syntax rule the corresponding transformer will be used to create the new source form. One can easily implement this in Common Lisp, but by default it is not a provided mechanism in Common Lisp.
See syntax case in Scheme.
LOOP
For the implementation of a LOOP-like syntax one needs to write a parser which is called in the macro to parse the source expression. Note that the parser does not work on text, but on interned Lisp data.
In the past (1970s) this has been used in Interlisp in the so-called 'Conversational Lisp', which is a Lisp syntax with a more natural language like surface. Iteration was a part of this and the iteration idea has then brought to other Lisps (like Maclisp's LOOP, from where it then was brought to Common Lisp).
See the PDF on 'Conversational Lisp' by Warren Teitelmann from the 1970s.
The syntax for the LOOP macro is a bit complicated and it is not easy to see the boundaries between individual sub-statements.
See the extended syntax for LOOP in Common Lisp.
(loop for i from 0 when (oddp i) collect i)
same as:
(loop
for i from 0
when (oddp i)
collect i)
One problem that the LOOP macro has is that the symbols like FOR, FROM, WHEN and COLLECT are not the same from the "COMMON-LISP" package (a namespace). When I'm now using LOOP in source code using a different package (namespace), then this will lead to new symbols in this source namespace. For that reason some like to write:
(loop
:for i :from 0
:when (oddp i)
:collect i)
In above code the identifiers for the LOOP relevant symbols are in the KEYWORD namespace.
To make both parsing and reading easier it has been proposed to bring parentheses back.
An example for such a macro usage might look like this:
(iter (for i from 0) (when (oddp i) (collect i)))
same as:
(iter
(for i from 0)
(when (oddp i)
(collect i)))
In above version it is easier to find the sub-expressions and to traverse them.
The ITERATE macro for Common Lisp uses this approach.
But in both examples, one needs to traverse the source code with custom code.
To complement Dirk's answer a little:
Writing your own macros for this is entirely doable, and perhaps a nice exercise.
However there are several facilities for this kind of thing (albeit in a lisp-idiomatic way) out there of high quality, such as
Loop
Iterate
Series
Loop is very expressive, but has a syntax not resembling the rest of common lisp. Some editors don't like it and will indent poorly. However loop is defined in the standard. Usually it's not possible to write extentions to loop.
Iterate is even more expressive, and has a familiar lispy syntax. This doesn't require any special indentation rules, so all editors indenting lisp properly will also indent iterate nicely. Iterate isn't in the standard, so you'll have to get it yourself (use quicklisp).
Series is a framework for working on sequences. In most cases series will make it possible not to store intermediate values.

Common Lisp: What is the downside to using this filter function on very large lists?

I want to filter out all elements of list 'a from list 'b and return the filtered 'b. This is my function:
(defun filter (a b)
"Filters out all items in a from b"
(if (= 0 (length a)) b
(filter (remove (first a) a) (remove (first a) b))))
I'm new to lisp and don't know how 'remove does its thing, what kind of time will this filter run in?
There are two ways to find out:
you could test it with data
you could analyze your source code
Let's look at the source code.
lists are built of linked cons cells
length needs to walk once through a list
for EVERY recursive call of FILTER you compute the length of a. BAD!
(Use ENDP instead.)
REMOVE needs to walk once through a list
for every recursive call you compute REMOVE twice: BAD!
(Instead of using REMOVE on a, recurse with the REST.)
the call to FILTER will not necessarily be an optimized tail call.
In some implementations it might, in some you need to tell the compiler
that you want to optimize for tail calls, in some implementations
no tail call optimization is available. If not, then you get a stack
overflow on long enough lists.
(Use looping constructs like DO, DOLIST, DOTIMES, LOOP, REDUCE, MAPC, MAPL, MAPCAR, MAPLIST, MAPCAN, or MAPCON instead of recursion, when applicable.)
Summary: that's very naive code with poor performance.
Common Lisp provides this built in: SET-DIFFERENCE should do what you want.
http://www.lispworks.com/documentation/HyperSpec/Body/f_set_di.htm#set-difference
Common Lisp does not support tail-call optimization (as per the standard) and you might just run out of memory with an abysmal call-stack (depending on the implementation).
I would not write this function, becuase, as Rainer Joswig says, the standard already provides SET-DIFFERENCE. Nonetheless, if I had to provide an implementation of the function, this is the one I would use:
(defun filter (a b)
(let ((table (make-hash-table)))
(map 'nil (lambda (e) (setf (gethash e table) t)) a)
(remove-if (lambda (e) (gethash e table)) b)))
Doing it this way provides a couple of advantages, the most important one being that it only traverses b once; using a hash table to keep track of what elements are in a is likely to perform much better if a is long.
Also, using the generic sequence functions like MAP and REMOVE-IF mean that this function can be used with strings and vectors as well as lists, which is an advantage even over the standard SET-DIFFERENCE function. The main downside of this approach is if you want extend the function with a :TEST argument that allows the user to provide an equality predicate other than the default EQL, since CL hash-tables only work with a small number of pre-defined equality predicates (EQ, EQL, EQUAL and EQUALP to be precise).
(defun filter (a b)
"Filters out all items in a from b"
(if (not (consp a)) b
(filter (rest a) (rest b))))

Scheme Infix to Postfix

Let me establish that this is part of a class assignment, so I'm definitely not looking for a complete code answer. Essentially we need to write a converter in Scheme that takes a list representing a mathematical equation in infix format and then output a list with the equation in postfix format.
We've been provided with the algorithm to do so, simple enough. The issue is that there is a restriction against using any of the available imperative language features. I can't figure out how to do this in a purely functional manner. This is our fist introduction to functional programming in my program.
I know I'm going to be using recursion to iterate over the list of items in the infix expression like such.
(define (itp ifExpr)
(
; do some processing using cond statement
(itp (cdr ifExpr))
))
I have all of the processing implemented (at least as best I can without knowing how to do the rest) but the algorithm I'm using to implement this requires that operators be pushed onto a stack and used later. My question is how do I implement a stack in this function that is available to all of the recursive calls as well?
(Updated in response to the OP's comment; see the new section below the original answer.)
Use a list for the stack and make it one of the loop variables. E.g.
(let loop ((stack (list))
... ; other loop variables here,
; like e.g. what remains of the infix expression
)
... ; loop body
)
Then whenever you want to change what's on the stack at the next iteration, well, basically just do so.
(loop (cons 'foo stack) ...)
Also note that if you need to make a bunch of "updates" in sequence, you can often model that with a let* form. This doesn't really work with vectors in Scheme (though it does work with Clojure's persistent vectors, if you care to look into them), but it does with scalar values and lists, as well as SRFI 40/41 streams.
In response to your comment about loops being ruled out as an "imperative" feature:
(let loop ((foo foo-val)
(bar bar-val))
(do-stuff))
is syntactic sugar for
(letrec ((loop (lambda (foo bar) (do-stuff))))
(loop foo-val bar-val))
letrec then expands to a form of let which is likely to use something equivalent to a set! or local define internally, but is considered perfectly functional. You are free to use some other symbol in place of loop, by the way. Also, this kind of let is called 'named let' (or sometimes 'tagged').
You will likely remember that the basic form of let:
(let ((foo foo-val)
(bar bar-val))
(do-stuff))
is also syntactic sugar over a clever use of lambda:
((lambda (foo bar) (do-stuff)) foo-val bar-val)
so it all boils down to procedure application, as is usual in Scheme.
Named let makes self-recursion prettier, that's all; and as I'm sure you already know, (self-) recursion with tail calls is the way to go when modelling iterative computational processes in a functional way.
Clearly this particular "loopy" construct lends itself pretty well to imperative programming too -- just use set! or data structure mutators in the loop's body if that's what you want to do -- but if you stay away from destructive function calls, there's nothing inherently imperative about looping through recursion or the tagged let itself at all. In fact, looping through recursion is one of the most basic techniques in functional programming and the whole point of this kind of homework would have to be teaching precisely that... :-)
If you really feel uncertain about whether it's ok to use it (or whether it will be clear enough that you understand the pattern involved if you just use a named let), then you could just desugar it as explained above (possibly using a local define rather than letrec).
I'm not sure I understand this all correctly, but what's wrong with this simpler solution:
First:
You test if your argument is indeed a list:
If yes: Append the the MAP of the function over the tail (map postfixer (cdr lst)) to the a list containing only the head. The Map just applies the postfixer again to each sequential element of the tail.
If not, just return the argument unchanged.
Three lines of Scheme in my implementation, translates:
(postfixer '(= 7 (/ (+ 10 4) 2)))
To:
(7 ((10 4 +) 2 /) =)
The recursion via map needs no looping, not even tail looping, no mutation and shows the functional style by applying map. Unless I'm totally misunderstanding your point here, I don't see the need for all that complexity above.
Edit: Oh, now I read, infix, not prefix, to postfix. Well, the same general idea applies except taking the second element and not the first.

Append! in Scheme?

I'm learning R5RS Scheme at the moment (from PocketScheme) and I find that I could use a function that is built into some variants of Scheme but not all: Append!
In other words - destructively changing a list.
I am not so much interested in the actual code as an answer as much as understanding the process by which one could pass a list as a function (or a vector or string) and then mutate it.
example:
(define (append! lst var)
(cons (lst var))
)
When I use the approach as above, I have to do something like (define list (append! foo (bar)) which I would like something more generic.
Mutation, though allowed, is strongly discouraged in Scheme. PLT even went so far as to remove set-car! and set-cdr! (though they "replaced" them with set-mcar! and set-mcdr!). However, a spec for append! appeared in SRFI-1. This append! is a little different than yours. In the SRFI, the implementation may, but is not required to modify the cons cells to append the lists.
If you want to have an append! that is guaranteed to change the structure of the list that's being appended to, you'll probably have to write it yourself. It's not hard:
(define (my-append! a b)
(if (null? (cdr a))
(set-cdr! a b)
(my-append! (cdr a) b)))
To keep the definition simple, there is no error checking here, but it's clear that you will need to pass in a list of length at least 1 as a, and (preferably) a list (of any length) as b. The reason a must be at least length 1 is because you can't set-cdr! on an empty list.
Since you're interested in how this works, I'll see if I can explain. Basically, what we want to do is go down the list a until we get to the last cons pair, which is (<last element> . null). So we first see if a is already the last element in the list by checking for null in the cdr. If it is, we use set-cdr! to set it to the list we're appending, and we're done. If not, we have to call my-append! on the cdr of a. Each time we do this we get closer to the end of a. Since this is a mutation operation, we're not going to return anything, so we don't need to worry about forming our modified list as the return value.
Better late than never for putting in a couple 2-3 cents on this topic...
(1) There's nothing wrong with using the destructive procedures in Scheme while there is a single reference to the stucture being modified. So for example, building a large list efficiently, piecemeal via a single reference - and when complete, making that (now presumably not-to-be-modified) list known and referred to from various referents.
(2) I think APPEND! should behave like APPEND, only (potentially) destructively. And so APPEND! should expect any number of lists as arguments. Each list but the last would presumably be SET-CDR!'d to the next.
(3) The above definition of APPEND! is essentially NCONC from Mac Lisp and Common Lisp. (And other lisps).

Resources