How does one make symbols that refer to structure slots in lisp? - binary-tree

I am self learning lisp and thought a nice non-trivial program would be to write a set of standard tree insertion and manipulation routines. I figured it could be done with CONS but wanted to try it with a structure.
I put together one version that worked:
(defstruct treenode data left right)
(defun tree-insert ( value tree )
"Insert data into tree"
(if tree
(if (< value (treenode-data tree))
(setf (treenode-left tree) (tree-insert value (treenode-left tree)))
(setf (treenode-right tree) (tree-insert value (treenode-right tree))))
(setf tree (make-treenode :data value)))
tree)
which rebuilt the tree at every step which seemed computationally inefficient. By inefficient, I mean that I have to use setf every time I do another level of recursion. So I wanted to try a scheme that passed the tree by reference rather than by value so I could make assignments in the subroutine that inserts into a tree.
I cobbled the following together, which does not work (but give me credit for having comments):
(defstruct treenode data left right)
(defun tree-insert ( value tree )
"Insert data value into tree, using pass by reference.
value A datum to insert, in this version has to be a number.
tree The tree passed as a symbol."
(setq tval (symbol-value tree))
(if (eq tval nil)
(set tree (make-treenode :data value)) ; Empty tree. Place data here.
(if (< value (treenode-data tval)) ; Non-empty node. Decide which subtree for insert.
(tree-insert value (treenode-left tval)) ; Left side
(tree-insert value (treenode-right tval)))) ; Right side. This is a stable sort.
nil)
? (setf tr nil)
NIL
? (tree-insert 10 'tr)
NIL
? tr
#S(TREENODE :DATA 10 :LEFT NIL :RIGHT NIL)
?
The initial insert works fine. Passing a symbol the (set tree ...) correctly inserts the structure with left and right porinters nil.
Of course, the problem that follows is that on the recursive call to tree-insert I am not passing a symbol.
That is the hangup. I haven't found a way to refer to a structure slot as a symbol that I can then pass to tree-insert.
I've been looking around for a couple of days and found this interesting comment about the defstruct macro: "defstruct not only defines an access function for each slot, but also arranges for setf to work properly on such access functions, defines a predicate named name-p, defines a constructor function named make-name, and defines a copier function named copy-name. All names of automatically created functions are interned in whatever package is current at the time the defstruct form is processed (see package). Also, all such functions may be declared inline at the discretion of the implementation to improve efficiency; if you do not want some function declared inline, follow the defstruct form with a notinline declaration to override any automatic inline declaration."
So, what could I do to do the magic that setf does? I know I can do assignments to slots with setf, but I haven't gotten setf to work in a function due to lexical scope rules. Maybe like adding automatic functions to allow symbol generating, like (treenode-data-symbol tr)?
Of course, lisp programmers have dealt with binary trees since before my first PDP-8/L. What's the lispy way to do this?
This is an edited question. User Rainer Joswig gave a very fast and concise response. I learned a lot from the example he gave. I was interested in the issue of modifying the tree directly rather than using a return value from a function.
From the comments I've seen here, and the one answer by Rainer Joswig, should I draw the conclusion that the pointer manipulation is computationally low in cost, and that the best lisp approach is to use a function that returns a tree rather than relying on an approach of modifying the argument?

simple version for your inspiration:
(defstruct node a b v)
(defun insert-tree (tree value)
(cond ((null tree)
(setf tree (make-node :v value)))
((> (node-v tree)
value)
(setf (node-a tree)
(insert-tree (node-a tree) value)))
(t
(setf (node-b tree)
(insert-tree (node-b tree) value))))
tree)
using it:
CL-USER 171 > (let ((tree nil))
(loop for i in '(4 7 3 5 9 10 11 8)
do (setf tree (insert-tree tree i)))
(pprint tree)
(values))
#S(NODE :A #S(NODE :A NIL :B NIL :V 3)
:B #S(NODE :A #S(NODE :A NIL :B NIL :V 5)
:B #S(NODE :A #S(NODE :A NIL :B NIL :V 8)
:B #S(NODE :A NIL
:B #S(NODE :A NIL
:B NIL
:V 11)
:V 10)
:V 9)
:V 7)
:V 4)
Now, if wanted to do less setf operations, we could check whether the returned subtree is the same which we passed. This will only not be the case when we create a new node.
(defun insert-tree (tree value)
(cond ((null tree)
(setf tree (make-node :v value)))
((> (node-v tree)
value)
(let ((new-tree (insert-tree (node-a tree) value)))
(unless (eql new-tree (node-a tree))
(setf (node-a tree) new-tree))))
(t
(setf (node-b tree)
(insert-tree (node-b tree) value))))
tree)
or with a local macro hiding part of the code:
(defun insert-tree (tree value)
(macrolet ((insert (place call &aux (new-value-sym (gensym "new-value")))
`(let ((,new-value-sym ,call))
(unless (eql ,place ,new-value-sym)
(setf ,place ,new-value-sym)))))
(cond ((null tree)
(setf tree (make-node :v value)))
((> (node-v tree)
value)
(insert (node-a tree) (insert-tree (node-a tree) value)))
(t
(insert (node-b tree) (insert-tree (node-b tree) value))))
tree))

Trying to add an answer from another angle.
In standard Common Lisp structures have a bunch of limitations to make them low-level and efficient to use. Among those limitations:
access to structure slots via slot names is undefined. some implementations do it, others not.
redefining a structure definition has undefined consequences. This means that in some cases, one best restarts Lisp to do that...
The idea behind that: all operations to structures should be able to be inlined and an executing program should not need any further information about structure slots (their names, their memory locations, ...). There would be no dynamic lookup at runtime.
Then Common Lisp in general has this further limitation: it has no first class pointers. There is no mechanism to provide a pointer only pointing directly to the slot of a structure. In some older Lisp dialects this might be possible via a concept of locatives - pointers in those languages. Common Lisp does not support that.
This means practically: to update a slot of a structure, one needs access to the structure and a setter operation.
How do update the slot of a structure?
I can think of two simple ways:
pass the structure, a new value and an indicator what to update -> then dispatch on the indicator and call the right updater
Example
(defun update (s indicator value)
(case indicator
(:a (setf (node-a s) value))
(:b (setf (node-b s) value))))
(update tree :a (make-node :v 100))
Pass a closure, which does the update
Example:
(let ((tree ...))
(flet ((do-something (updater)
(funcall updater (make-node :v 100))))
(do-something (lambda (value) (setf (node-a tree) value) ...)))

With much thanks to Rainer and Will, I understand Common Lisp better, now. The point about not having first class pointers is huge. I don't have to keep looking for that anymore, though I did see a package that implemented refs in my searches.
The key problem in my approach was that I defined an empty tree as nil. Since passing nil doesn't allow any manipulation of the argument, nil being immutable, the algorithm was bound to fail.
Redefining the empty tree as '(nil) allows the program to work.
;; Make list of 5 random numbers.
(setf r5 (loop for i from 1 to 5 collect (random 100)))
;; Initialize tr to empty tree.
;; Empty tree is '(nil). Tree with data is '(data left right),
;; where left and right are either empty tree or tree with data.
(setf tr '(nil))
(defun tree-insert ( value tree )
"Insert data into tree. tree is modified with an insertion."
(if (equal tree '(nil))
(progn ; Empty (sub)tree. Insert value.
(setf (car tree) value)
(setf (cdr tree) (list (list nil)(list nil))))
(progn ; Non-empty subtree.
(if (< value (car tree))
(tree-insert value (second tree)) ; Insert on left.
(tree-insert value (third tree))))) ; Insert on right.
nil)
;; Load tree with the list of random numbers defined above.
(mapc (lambda (val) (tree-insert val tr)) r5)
(defun tree-walk (tree)
"Retrieve keys in sorted order."
(if (car tree)
(progn
(tree-walk (second tree)) ; Left subtree.
(format t " ~d" (car tree))
(tree-walk (third tree))))) ; Right subtree.
;; Walk the tree.
(tree-walk tr)
Example in use:
? (setf r5 (loop for i from 1 to 5 collect (random 100)))
(22 50 76 20 49)
? (setf tr '(nil))
(NIL)
? (mapc (lambda (val) (tree-insert val tr)) r5)
;Compiler warnings :
; In an anonymous lambda form at position 37: Undeclared free variable TR
(22 50 76 20 49)
? tr
(22 (20 (NIL) (NIL)) (50 (49 (NIL) (NIL)) (76 (NIL) (NIL))))
? (tree-walk tr)
20 22 49 50 76
NIL
?
So, several things to make this work. A mutable object has to be passed to the procedure. In this case I redesigned the structure to be a list, either '(nil) for empty, or '(data left right), where left and right are either '(nil) or '(data left right). A list containing nil can be manipulated. However, I had to use car and cdr to access the structure so as to preserve Lisp's pointer that was passed to the procedure.
Another thing I had to do was not use a list constant in the functon definitions. I'm sure knowledgeable people will be in the know about this, and chucle a bit about the opaque error that follow until the issue is understood, but if I had used '((nil)(nil)) rather than (list (list nil)(list nil)) in tree-insert it would not work. Looks like Lisp compiles list shorthand notation to a pointer to an abject in memory which is used on all subsequent calls of the function.
Oh, there is a leftover progn function call in tree-insert. That was from when I wrapped everything with progn to let me add print statements during debugging.
Running timing on the functions was interesting. It's fast! I'll run some timing comparisons to compare the functional reassignment approach vs the search and insert algorithms.
Thanks, again, to the expert comments. Since last contributing I've learned a little about map, loop/collect, and that variables leak out of functions into global space when let isn't used in function definitions. Also wrapping a funciton with a lot of output with (progn ... nil) saves screen space after large data structures are used. I have learned a lot with just this exercise.

Related

Is there any way to create a copy of a binary tree using a recursive breadth first implementation?

Essentially what I am trying to do is take a binary tree with data definition
binary_tree: number | (symbol binary_tree binary_tree)
and create a new version of the tree where each leaf (a number) is replaced with a counter number. I am trying to do this left-to-right and then top-down, so using a breadth first search seems like the obvious choice to visit every node in order. However, my problem is this. I need to accumulate a new binary tree to return it. Is there any possible way to do this since we are visiting each node in order?
So in short if I have a tree defined like this:
(define bt '(foo (bar 26 12) (baz 11 (quux 117 14))))
i need to process and accumulate a new list such that
(define bt '(foo (bar 0 1) (baz 2 (quux 3 4))))
Here is my code:
(define (number-leaves bst)
(define (helper queue counter)
(cond[(non-empty-queue? queue)
(define x (dequeue! queue))
(cond [(number? x)(cons counter (helper queue (+ 1 counter)))]
[(symbol? (car x))(begin (enqueue! queue (car(cdr x)))
(enqueue! queue (car(cdr(cdr x))))
(cons (list(car x)) (helper queue counter)))])]
['()]))
(begin (define q (make-queue))
(enqueue! q bst)
(helper q 0)))
as of now this function returns
(foo bar baz 0 1 2 quux 3 4)
It seems to me that it is impossible to accumulate into a recursive data definition while processing the tree breadth first. What can I do? (NB: car = first and cdr = rest in the EOPL racket dialect)
I think you will need to do two passes.
breath first while making a lookup based on the pairs that hold the leaf and their incremental new value. It's important to use the pairs since the numbers by themselves are not guaranteed unique, thus (eq? 2 2) can be #t even though they are different places.. Comparing the pairs that holds the twos are guaranteed to be only eq? with the very same value.
A standard post order tree traversal where you fetch the new values from the lookup.
The lookup should be a hash for efficiency, but it can be a assoc list for small trees. If you expect the hash to do O(1) when iterating over all the element twice will still only make it O(n).

Tree sorting function in Scheme

Im struggling with this problem for over 2 days now and still somehow I cannot solve it.
I have to write a function in SCHEME that takes a list in a tree and displays items in sorted order.
The way I define trees is '(6 (left... ) (right...))
My function to choose a tree:
(define (tree-sort tree)
(cond ((null? tree) '())
((> (car tree) (cadr tree))
(tree-sort (cadr tree)))
(else
(tree-sort (caddr tree))))
)
So I guess I should also have a function that sorts the most indepth list?
I really dont get it and this is the last time I will ever have to deal with scheme. I have never used stackoverflow so please excuse me if the formating is wrong.
Kindly thank you!
Now that you clarified that the tree is already sorted, then you're looking for an in-order traversal of the tree, which returns a sorted list of the elements - I'm assuming that you're interested in a list as the output, because of the base case shown in the question. Try something like this:
(define (tree-sort tree)
(if (empty-tree? tree)
'()
(append (tree-sort (left-subtree tree))
(list (value tree))
(tree-sort (right-subtree tree)))))
Use the appropriate procedures for testing if the tree is empty and for accessing each node's value, left and right subtrees. The above procedure will return a sorted list with the trees' values.

Scheme binary search tree

I have already defined helper functions to be:
;; returns value of node
(define (value node)
(if (null? node) '()
(car node)))
;; returns left subtree of node
(define (left node)
(if (null? node) '()
(cadr node)))
;; returns right subtree of node
(define (right node)
(if (null? node) '()
(caddr node)))
and I am trying to write a function leaves that returns a list with the leaves of the tree in order of left to right.
(define (leaves tree)
(if (and (?null (left tree)) (?null (right tree)))
???
(leaves (left tree)) (leaves (right tree))))
but that is as far as I can get
ex: (leaves '(1 (2 () ()) (3 () ()))) should evaluate to '(2 3)
In what you have so far, the ??? is going to need to evaluate to the value of the leaf, ie. (value node) because it is the base case of your iteration. Also, you're going to need to combine the values you get back from the base case in your iteration case. list is usually a good first candidate to try when you need to combine multiple results cons is usually my second try. Taking those suggestions, your leaves function looks like this:
(define (leaves tree)
(if (and (null? (left tree)) (null? (right tree)))
(value tree)
(list (leaves (left tree)) (leaves (right tree)))))
which, when run on your sample of (leaves '(1 (2 () ()) (3 () ()))) does indeed evaluate to (2 3).
HOWEVER; YOU'RE NOT DONE! We're only testing with 1 level of recursion. What if we make a bigger tree? Something like: (leaves '(1 (2 (4 () ()) (5 () ())) (3 (6 () ()) (7 () ())))) Running this gives ((4 5) (6 7)). Those are the right values in the right order, but we have too much structure in there, too many parenthesis. This is a typical problem you will encounter throughout your scheme career, so let me explain why it happens, and how you can go about attacking the problem.
If look at the two branches of our if form, you'll notice that (value tree) returns an atom, or a number in this case. The else branch takes two of ??? and turns it into a list of ???. We're going to be executing the else branch multiple times - any time we're not in the base case. This means we're going to continue to wrap, and wrap, and wrap into a deeper and deeper list structure. So here's what we do about it.
Lets return a list in our base case, and keep our list flat in the recursion case. To return a list in our base case it is as simple as returning (list (value tree)) instead of just (value tree). In the recursion case, we need a function that takes 2 lists and combines them without making a deeper list. Such a function does exist - append. So let's look at what our leaves function looks like now:
(define (leaves tree)
(if (and (null? (left tree)) (null? (right tree)))
(list (value tree))
(append (leaves (left tree)) (leaves (right tree)))))
Intermezzo - Test cases
Racket has test suite library that has a very low barrier to entry called rackunit. Let's throw together a few quick test cases at the bottom of the file.
(require rackunit)
;;empty tree
(check-equal? (leaves '()) '())
;;simple balanced tree
(check-equal?
(leaves '(1 (2 () ()) (3 () ())))
'(2 3))
;;larger balanced tree
(check-equal?
(leaves '(1 (2 (4 () ()) (5 () ())) (3 (6 () ()) (7 () ()))))
'(4 5 6 7))
;;unbalanced tree
(check-equal?
(leaves '(1 (2 (4 () ()) ()) (3 () ())))
'(4 3))
Recently, racket has added support for submodules and specific support for test submodules if you are curious and want to look into them.
Back to our leaves problem. Running our tests, we notice our function doesn't behave well on unbalanced trees. We get extra ()s when we have a node that only has 1 leaf. That is because we are traversing both the left and the right subtrees whenever we're at a node that isn't a leaf. What we really need are two more cases in our if. We could nest the ifs, but scheme's cond form makes better sense.
Now, the template we're aiming to fill out is:
(define (leaves tree)
(cond [(leaf? tree) (...)]
[(and (has-left? tree) (has-right? tree))
(...)]
[(has-left? tree) (...)]
[(has-right? tree) (...)]
[else (error "should never get here")]))
I'll stop there in-case this is homework, and to give you the satisfaction of understanding and solving this the rest of the way. I hope my explanations have given you more direction that just "here's the code" answers.
Well this seems like you're doing Breadth First Search but with the alteration that you don't print yourself if you have two children (or just one, if you don't want to print nodes that only have one child).
I would aim for solving that first, and then changing your solution to that to solve this problem.
(define (list-of-leaves tree)
(if(leaf? tree)
(list (node tree))
(cond((right-branch-only? tree)(list-of-leaves (right-branch tree)))
((left-branch-only? tree)(list-of-leaves (left-branch tree)))
(else(append (list-of-leaves (left-branch tree))
(list-of-leaves (right-branch tree)))))))

Walking through a binary tree

I'm having issues writing a function to walk through a binary tree, the function takes in a search_term, list and returns true or false. Here is what I have and it's essentially the same thing I found googling how to implement binary search in Scheme.
(define (tree-lookup val tree)
(if (empty-tree? tree)
#f
(let ((curr-val (node-value tree))
(left (node-left tree))
(right (node-right tree)))
(cond ((equal? val curr-val) #t)
((< val curr-val))
(tree-lookup val left)
(else
(tree-lookup val right))))))
(define tree-test '(((1 2) 3)(4 (5 6)) 7 (8 9 10))) ; Test tree
The problem comes when it tries to compare the "val" variable with the node. This means I am comparing a real number to a list, like (< 2 '((1 2) 3)). I tried testing for atom values only but then I am stuck on how to go back up the tree when I reach a leaf.
Here's the error message:
<: contract violation
expected: real?
given: '{{1 2} 3}
argument position: 2nd
other arguments...:
8
At first glance the procedure looks fine. I suspect that the problem is in the node-value, node-left and node-right procedures, or in the way you're building the tree - for starters, the sample tree provided in the question doesn't seem right to me.
Think about it, the error message signals that the < operator was applied to the list '{{1 2} 3}, meaning that curr-val is a list, but it should be a value.

Idiomatic way to recurse through collections in Clojure

I'm trying to understand what is the idiomatic way in Clojure to recurse through a tree or a list represented by a Clojure list (or another collection type).
I could write the following to count the elements in a flat collection (ignore the fact that it's not tail-recursive):
(defn length
([xs]
(if (nil? (seq xs))
0
(+ 1 (length (rest xs))))))
Now in Scheme or CL all the examples only ever do this over lists, so the idiomatic base case test in those languages would be (nil? xs). In Clojure we'd like this function to work on all collection types, so is the idiomatic test (nil? (seq xs)), or maybe (empty? xs), or something completely different?
The other case I'd like to consider is tree traversal, i.e. traversing through a list or vector that represents a tree, e.g. [1 2 [3 4].
For example, counting the nodes in a tree:
(defn node-count [tree]
(cond (not (coll? tree)) 1
(nil? (seq tree)) 0
:else (+ (node-count (first tree)) (node-count (rest tree)))))
Here we use (not (coll? tree)) to check for atoms, whereas in Scheme/CL we'd use atom?. We also use (nil? (seq tree)) to check for an empty collection. And finally we use first and rest to destructure the current tree to the left branch and the rest of the tree.
So to summarise, are the following forms idiomatic in Clojure:
(nil? (seq xs)) to test for the empty collection
(first xs) and (rest xs) to dig into the collection
(not (coll? xs)) to check for atoms
The idiomatic test for a non-empty seqable is (seq coll):
(if (seq coll)
...
)
The nil? is unnecessary, since a non-nil return value from seq is guaranteed to be a seq and thus neither nil nor false and therefore truthy.
If you want to deal with the nil case first, you can change the if to if-not or seq to empty?; the latter is implemented as a composition of seq with not (which is why it is not idiomatic to write (not (empty? xs)), cf. the docstring of empty?).
As for first / rest -- it's useful to remember about the strict variant of rest, next, the use of which is more idiomatic than wrapping rest in a seq.
Finally, coll? checks if its argument is a Clojure persistent collection (an instance of clojure.lang.IPersistentCollection). Whether this is an appropriate check for "non-atoms" depends on whether the code needs to handle Java data structures as non-atoms (via interop): e.g. (coll? (java.util.HashSet.)) is false, as is (coll? (into-array [])), but you can call seq on both. There is a function called seqable? in core.incubator in the new modular contrib which promises to determine whether (seq x) would succeed for a given x.
I personally like the following approach to recurse through a collection:
(defn length
"Calculate the length of a collection or sequence"
([coll]
(if-let [[x & xs] (seq coll)]
(+ 1 (length xs))
0)))
Features:
(seq coll) is idiomatic for testing whether a collection is empty (as per Michal's great answer)
if-let with (seq coll) automatically handles both the nil and empty collection case
You can use destructuring to name the first and next elements as you like for use in your function body
Note that in general it is better to write recursive functions using recur if possible, so that you get the benefits of tail recursion and don't risk blowing up the stack. So with this in mind, I'd actually probably write this specific function as follows:
(defn length
"Calculate the length of a collection or sequence"
([coll]
(length coll 0))
([coll accumulator]
(if-let [[x & xs] (seq coll)]
(recur xs (inc accumulator))
accumulator)))
(length (range 1000000))
=> 1000000

Resources