Updating a Deep Data Structure (Common Lisp) - data-structures

I’d like some advice about accessing and updating a data structure that is several layers deep. I originally wanted to use a generalized-reference setf function, but couldn’t get this to work. As an alternative, the functions listed below do work correctly to query and update the data structure (called *kb*). But it seems more complex than it needs to be. Is there a simpler solution?
(defun fetch (node edge)
"Retrieves all the nodes in the kb associated with a node and edge."
(alexandria:hash-table-keys (cdr (assoc edge (gethash node *kb*)))))
(defun associate (node1 edge node2)
"Adds an association to the kb."
(declare (symbol node1 edge node2))
(when (not (gethash edge *edges*))
(error "Mentioned edge ~A was not predefined" edge))
(let ((alist (gethash node1 *kb*)))
(if alist
(let ((pair (assoc edge alist)))
(if pair
(setf (gethash node2 (cdr pair)) t)
(let ((ht (make-hash-table :test #'eq)))
(setf (gethash node2 ht) t)
(setf (gethash node1 *kb*) (acons edge ht alist)))))
(let ((ht (make-hash-table :test #'eq)))
(setf (gethash node2 ht) t)
(setf (gethash node1 *kb*) (acons edge ht nil))))))
As background, the *kb* data structure is a hash table. The hash table keys are symbols and the values are alists. Each alist consists of key.value pairs, where the keys are symbols and the values are hash tables. Each of the latter hash tables simply represents a set of symbols as keys with t values. (ps: I chose alists for the intermediate structure, since there are only 20 or so possible edge symbols.) Thanks for any insights…

Related

How does one make symbols that refer to structure slots in lisp?

I am self learning lisp and thought a nice non-trivial program would be to write a set of standard tree insertion and manipulation routines. I figured it could be done with CONS but wanted to try it with a structure.
I put together one version that worked:
(defstruct treenode data left right)
(defun tree-insert ( value tree )
"Insert data into tree"
(if tree
(if (< value (treenode-data tree))
(setf (treenode-left tree) (tree-insert value (treenode-left tree)))
(setf (treenode-right tree) (tree-insert value (treenode-right tree))))
(setf tree (make-treenode :data value)))
tree)
which rebuilt the tree at every step which seemed computationally inefficient. By inefficient, I mean that I have to use setf every time I do another level of recursion. So I wanted to try a scheme that passed the tree by reference rather than by value so I could make assignments in the subroutine that inserts into a tree.
I cobbled the following together, which does not work (but give me credit for having comments):
(defstruct treenode data left right)
(defun tree-insert ( value tree )
"Insert data value into tree, using pass by reference.
value A datum to insert, in this version has to be a number.
tree The tree passed as a symbol."
(setq tval (symbol-value tree))
(if (eq tval nil)
(set tree (make-treenode :data value)) ; Empty tree. Place data here.
(if (< value (treenode-data tval)) ; Non-empty node. Decide which subtree for insert.
(tree-insert value (treenode-left tval)) ; Left side
(tree-insert value (treenode-right tval)))) ; Right side. This is a stable sort.
nil)
? (setf tr nil)
NIL
? (tree-insert 10 'tr)
NIL
? tr
#S(TREENODE :DATA 10 :LEFT NIL :RIGHT NIL)
?
The initial insert works fine. Passing a symbol the (set tree ...) correctly inserts the structure with left and right porinters nil.
Of course, the problem that follows is that on the recursive call to tree-insert I am not passing a symbol.
That is the hangup. I haven't found a way to refer to a structure slot as a symbol that I can then pass to tree-insert.
I've been looking around for a couple of days and found this interesting comment about the defstruct macro: "defstruct not only defines an access function for each slot, but also arranges for setf to work properly on such access functions, defines a predicate named name-p, defines a constructor function named make-name, and defines a copier function named copy-name. All names of automatically created functions are interned in whatever package is current at the time the defstruct form is processed (see package). Also, all such functions may be declared inline at the discretion of the implementation to improve efficiency; if you do not want some function declared inline, follow the defstruct form with a notinline declaration to override any automatic inline declaration."
So, what could I do to do the magic that setf does? I know I can do assignments to slots with setf, but I haven't gotten setf to work in a function due to lexical scope rules. Maybe like adding automatic functions to allow symbol generating, like (treenode-data-symbol tr)?
Of course, lisp programmers have dealt with binary trees since before my first PDP-8/L. What's the lispy way to do this?
This is an edited question. User Rainer Joswig gave a very fast and concise response. I learned a lot from the example he gave. I was interested in the issue of modifying the tree directly rather than using a return value from a function.
From the comments I've seen here, and the one answer by Rainer Joswig, should I draw the conclusion that the pointer manipulation is computationally low in cost, and that the best lisp approach is to use a function that returns a tree rather than relying on an approach of modifying the argument?
simple version for your inspiration:
(defstruct node a b v)
(defun insert-tree (tree value)
(cond ((null tree)
(setf tree (make-node :v value)))
((> (node-v tree)
value)
(setf (node-a tree)
(insert-tree (node-a tree) value)))
(t
(setf (node-b tree)
(insert-tree (node-b tree) value))))
tree)
using it:
CL-USER 171 > (let ((tree nil))
(loop for i in '(4 7 3 5 9 10 11 8)
do (setf tree (insert-tree tree i)))
(pprint tree)
(values))
#S(NODE :A #S(NODE :A NIL :B NIL :V 3)
:B #S(NODE :A #S(NODE :A NIL :B NIL :V 5)
:B #S(NODE :A #S(NODE :A NIL :B NIL :V 8)
:B #S(NODE :A NIL
:B #S(NODE :A NIL
:B NIL
:V 11)
:V 10)
:V 9)
:V 7)
:V 4)
Now, if wanted to do less setf operations, we could check whether the returned subtree is the same which we passed. This will only not be the case when we create a new node.
(defun insert-tree (tree value)
(cond ((null tree)
(setf tree (make-node :v value)))
((> (node-v tree)
value)
(let ((new-tree (insert-tree (node-a tree) value)))
(unless (eql new-tree (node-a tree))
(setf (node-a tree) new-tree))))
(t
(setf (node-b tree)
(insert-tree (node-b tree) value))))
tree)
or with a local macro hiding part of the code:
(defun insert-tree (tree value)
(macrolet ((insert (place call &aux (new-value-sym (gensym "new-value")))
`(let ((,new-value-sym ,call))
(unless (eql ,place ,new-value-sym)
(setf ,place ,new-value-sym)))))
(cond ((null tree)
(setf tree (make-node :v value)))
((> (node-v tree)
value)
(insert (node-a tree) (insert-tree (node-a tree) value)))
(t
(insert (node-b tree) (insert-tree (node-b tree) value))))
tree))
Trying to add an answer from another angle.
In standard Common Lisp structures have a bunch of limitations to make them low-level and efficient to use. Among those limitations:
access to structure slots via slot names is undefined. some implementations do it, others not.
redefining a structure definition has undefined consequences. This means that in some cases, one best restarts Lisp to do that...
The idea behind that: all operations to structures should be able to be inlined and an executing program should not need any further information about structure slots (their names, their memory locations, ...). There would be no dynamic lookup at runtime.
Then Common Lisp in general has this further limitation: it has no first class pointers. There is no mechanism to provide a pointer only pointing directly to the slot of a structure. In some older Lisp dialects this might be possible via a concept of locatives - pointers in those languages. Common Lisp does not support that.
This means practically: to update a slot of a structure, one needs access to the structure and a setter operation.
How do update the slot of a structure?
I can think of two simple ways:
pass the structure, a new value and an indicator what to update -> then dispatch on the indicator and call the right updater
Example
(defun update (s indicator value)
(case indicator
(:a (setf (node-a s) value))
(:b (setf (node-b s) value))))
(update tree :a (make-node :v 100))
Pass a closure, which does the update
Example:
(let ((tree ...))
(flet ((do-something (updater)
(funcall updater (make-node :v 100))))
(do-something (lambda (value) (setf (node-a tree) value) ...)))
With much thanks to Rainer and Will, I understand Common Lisp better, now. The point about not having first class pointers is huge. I don't have to keep looking for that anymore, though I did see a package that implemented refs in my searches.
The key problem in my approach was that I defined an empty tree as nil. Since passing nil doesn't allow any manipulation of the argument, nil being immutable, the algorithm was bound to fail.
Redefining the empty tree as '(nil) allows the program to work.
;; Make list of 5 random numbers.
(setf r5 (loop for i from 1 to 5 collect (random 100)))
;; Initialize tr to empty tree.
;; Empty tree is '(nil). Tree with data is '(data left right),
;; where left and right are either empty tree or tree with data.
(setf tr '(nil))
(defun tree-insert ( value tree )
"Insert data into tree. tree is modified with an insertion."
(if (equal tree '(nil))
(progn ; Empty (sub)tree. Insert value.
(setf (car tree) value)
(setf (cdr tree) (list (list nil)(list nil))))
(progn ; Non-empty subtree.
(if (< value (car tree))
(tree-insert value (second tree)) ; Insert on left.
(tree-insert value (third tree))))) ; Insert on right.
nil)
;; Load tree with the list of random numbers defined above.
(mapc (lambda (val) (tree-insert val tr)) r5)
(defun tree-walk (tree)
"Retrieve keys in sorted order."
(if (car tree)
(progn
(tree-walk (second tree)) ; Left subtree.
(format t " ~d" (car tree))
(tree-walk (third tree))))) ; Right subtree.
;; Walk the tree.
(tree-walk tr)
Example in use:
? (setf r5 (loop for i from 1 to 5 collect (random 100)))
(22 50 76 20 49)
? (setf tr '(nil))
(NIL)
? (mapc (lambda (val) (tree-insert val tr)) r5)
;Compiler warnings :
; In an anonymous lambda form at position 37: Undeclared free variable TR
(22 50 76 20 49)
? tr
(22 (20 (NIL) (NIL)) (50 (49 (NIL) (NIL)) (76 (NIL) (NIL))))
? (tree-walk tr)
20 22 49 50 76
NIL
?
So, several things to make this work. A mutable object has to be passed to the procedure. In this case I redesigned the structure to be a list, either '(nil) for empty, or '(data left right), where left and right are either '(nil) or '(data left right). A list containing nil can be manipulated. However, I had to use car and cdr to access the structure so as to preserve Lisp's pointer that was passed to the procedure.
Another thing I had to do was not use a list constant in the functon definitions. I'm sure knowledgeable people will be in the know about this, and chucle a bit about the opaque error that follow until the issue is understood, but if I had used '((nil)(nil)) rather than (list (list nil)(list nil)) in tree-insert it would not work. Looks like Lisp compiles list shorthand notation to a pointer to an abject in memory which is used on all subsequent calls of the function.
Oh, there is a leftover progn function call in tree-insert. That was from when I wrapped everything with progn to let me add print statements during debugging.
Running timing on the functions was interesting. It's fast! I'll run some timing comparisons to compare the functional reassignment approach vs the search and insert algorithms.
Thanks, again, to the expert comments. Since last contributing I've learned a little about map, loop/collect, and that variables leak out of functions into global space when let isn't used in function definitions. Also wrapping a funciton with a lot of output with (progn ... nil) saves screen space after large data structures are used. I have learned a lot with just this exercise.

Taking a name and adding it to a binary tree structure

A node in this tree is a list of three items (name left right), where name is a string, and left and right are the child trees i feel like i have gotten off track is there an easy way to write this with just (define(insert name left right))
(define tree
(lambda (node word)
(cond
((null? node) (make-tree word))
((string=? word (tree-word node))
(set-tree-count! node (+ (tree-count node) 1))
node)
((string<? word (tree-word node))
(set-tree-left! node (tree (tree-left node) word))
node)
(else
(set-tree-right! node (tree (tree-right node) word))
node))))
There's no need to use mutation operations, in general in Scheme we avoid them, in this case in particular is easy (and recommended) to build a new tree as we go. And, why the count? this problem has nothing to do with adding numbers. Also notice that the definition (insert name left right) doesn't make much sense, we want to insert a word starting from a tree's root node; left and right aren't useful as parameters. Let's start again from scratch.
(define (insert node word)
(cond ((null? node) (make-tree word '() '()))
((string=? word (tree-word node)) node)
((string>=? word (tree-word node))
(make-tree (tree-word node)
(tree-left node)
(insert (tree-right node) word)))
(else
(make-tree (tree-word node)
(insert (tree-left node) word)
(tree-right node)))))

Accessing a variable field within a node

Hi I am new to Racket using it for a binary tree structure.
Using the following structure
(define-struct human(age hight))
I have created the following object/variable/human
(define James(make-human 10 50))
If I have a node in a binary tree structure
(define-struct node (left human right))
How can I compare a different object's hight (say Michael) with James given that James is within the node, so for an example:
(define (insert-human-into-tree human node)
(cond
[(empty? node)(make-node empty human empty)]
[(<= human-hight( **node-human-hight**))
I need to know how to access the hight field of the human object which will be within the node ( node-human-hight).
Use the accessor procedures in the struct, for example:
(define-struct human(age hight))
(define james (make-human 10 50))
(define-struct node (left human right))
(define anode (make-node null james null))
; access the human in the node, and then the height of that human
(human-hight (node-human anode))
=> 50
... And it's spelled "height", not "hight". So, to answer the question, the comparison would look like this:
(<= (human-hight (node-human node)) (human-hight human))
Your insert-human-into-tree is going to need a bit more beyond the comparison to actually insert a new node. It will look something like this (which also answers your prime question):
(define (insert-human-into-tree human node)
(if (empty? node)
(make-node empty human empty)
(let ((other (node-human node)))
(cond ((< (human-height other) (human-height human))
(insert-human-into-tree human (node-left node))) ; go left
((> (human-height other) (human-height human))
(insert-human-into-tree human (node-right node))) ; go right
(else ;; same height
...)))))
This will get you started.

How would I go about improving the efficiency (Scheme)?

I have this code:
(define tree `(A (B (C)) (D (E)) (C (E))))
(define (prog1 graph)
(let ([seen `()])
(define (sub g)
(cond
[(member (car g) seen) `()]
[else
(set! seen (cons (car g) seen))
(cond
[(null? (cdr g)) (list (car g))]
[else
(cons (car g) (map sub (cdr g)))])]))
(delete `() (sub graph))))
(define delete
(lambda (x y)
(if (null? y )
`()
(if (eqv? (car y) x)
(delete x (cdr y))
(cons (car y) (delete x (cdr y)))))))
It prints a connected graph where all the nodes appear once.
Running (prog1 tree)
prints: (A (B (C)) (D (E)))
I have looked at various depth-first searches in lisp (something which is similar to what I'm trying to do) and they appear to be much more elegant to this, some using iterative approaches. I am aware that the program isn't very efficient (on huge trees it runs pretty slow) so how would I go about improving the efficiency of this code?
Thanks, James
The member procedure performs an O(n) lookup on lists every time it's invoked. That's not what we want for quickly testing set membership, for that you should use a data structure providing an O(1) complexity for both adding elements and testing element membership in a collection, ideally a Set data structure or in its place a Hash Table. For example, in Racket try replacing these lines (or use the default hash table implementation in your Scheme interpreter):
(let ([seen `()]) => (let ([seen (make-hash)])
[(member (car g) seen) `()] => [(hash-has-key? seen (car g)) '()]
(set! seen (cons (car g) seen)) => (hash-set! seen (car g) 'ok)
Also, in general you want to use quotes in your code: '() instead of quasiquotes: `(), see the links to understand the difference and when it's appropriate to use quasiquoting.
Finally, you can use the built-in remove procedure, there's no need to implement your own delete.
In most cases the bottleneck in this code will not be the tree traversal, but the member lookup. The complexity of your function seems to be roughly O(M*N), where M is the number of distinct nodes and N is the number of total nodes. The reason why M goes into this as a factor is because you're looking up nodes in a linear list, which takes time proportional to the length of the list (which in your case is proportional to the number of distinct nodes).
The way to get rid of the M is to use a more efficient data structure for the lookup. R6RS defines hash tables, for instance.

Finding all paths in a directed graph with cycles

I am working on a problem which requires finding all paths between two nodes in a directed graph. The graph may have cycles.
Notice that this particular implementation approach is an iterative DFS.
Several approaches I've considered are as follows -
BFS does not appear to have a way to neatly manage this kind of pathing relationships between nodes.
I don't see an easy mechanism for a DFS recursive algorithm to pass up the path when the terminating node is found. (Likely enough it could be done, if I implement a Maybe monad kind of thing).
Creating a GRAPH-PARENT routine. That would add a decent amount of churn (& bugs) in the existing code.
Abstractly, what needs to happen is a tree needs to be generated, with the start node as root, and all leafs are the terminating nodes. Each path from leaf to root is a legal path. That is what a recursive DFS would trace out.
I'm reasonably sure it can be done here, but I don't see exactly how to do it.
I've defined a protocol for this algorithm where GRAPH-EQUAL and GRAPH-NEXT can be defined for arbitrary objects.
The debug node type is a SEARCH-NODE, and it has the data accessor SEARCH-NODE-DATA.
(defun all-paths (start end)
(let ((stack (list start))
(mark-list (list start)) ;I've chosen to hold marking information local to all-paths, instead of marking the objects themselves.
(all-path-list '())) ; Not used yet, using debug statements to think about the problem
(do () ;; intializing no variables
;; While Stack still has elements
((not stack))
(let ((item (pop stack)))
;; I'm looking at the item.
(format t "I: ~a~%" (search-node-data item))
(cond ((graph-equal item end)
(format t "*Q: ~a~%" (loop for var in stack collect (search-node-data var)))
;;Unmark the terminal node so we can view it it next time.
(setf mark-list (remove item mark-list))))
(loop for next in (graph-next item)
do
(cond ((not (in next mark-list :test #'graph-equal))
;; mark the node
(push next mark-list)
;;Put it on the stack
(push next stack))))))))
See A Very General Method for Computing Shortest Paths for an algorithm that can return all paths in a graph (even when there are cycles) as regular expressions over the alphabet of edges in finite time (assuming a finite graph).
You need to pass the path list (mark-list) along with the nodes, since that is part of the state. I've renamed it path in this code:
(defun all-paths (start end)
(let ((stack (list '(start (start)))) ; list of (node path) elements
(all-path-list '()))
(do ()
((not stack))
(let ((item (pop stack)))
(let ((node (first item))
(path (second item)))
(format t "I: ~a~%" (search-node-data node))
(cond ((graph-equal node end)
(format t "*Q: ~a~%"
(loop for var in path
collect (search-node-data var)))))
(loop for next in (graph-next node)
do
(cond ((not (in next path :test #'graph-equal))
(push (list next (cons next path)) stack)))))))))

Resources