Could someone explain to me what the tree traversal is doing here? - algorithm

I am having a hard time understanding the tree_traverse function and what it does. Could someone please explain?
(peek and pop are just from a stack implementation that I made.)
type 'a tree = Leaf | Branch of ('a tree * 'a * 'a tree);;
let rec tree_traverse (t,mem,prim,seco) = match t with
| Leaf ->
if is_empty mem
then []
else tree_traverse (peek mem, pop mem, prim, seco)
| Branch (l,nd,r) ->
let mem1 = add (prim(l,r),mem) in
let mem2 = add (seco(l,r), mem1) in
nd :: tree_traverse (peek mem2, pop mem2, prim, seco)
where an example of a tree is
let t = Branch (Branch(Branch(Leaf,1,Leaf), 2, Leaf), 3,
Branch (Branch(Leaf,5,Leaf), 6, Branch(Leaf,7,Leaf)))

This function implements a kind of worklist algorithm: it returns a list of nodes in an order that depends on the implementation of the prim and seco functions, as well as add and pop.
Neither the prim nor the seco parameter changes during the recursion, so they could be removed from the list of parameters. If we assume the following implementations
let add (x,xs) = x :: xs
let pop (x::xs) = xs
let peek (x::xs) = x
let prim (x,y) = x
let seco (x,y) = y
let is_empty = function [] -> true | _ -> false
then the tree_traverse function will return the list of nodes in the depth-first order.
Given your example, and fixing the implementation to the functions specified above, we can now follow the execution of the function:
tree_traverse Branch (Branch(Branch(Leaf,1,Leaf), 2, Leaf), 3,
Branch (Branch(Leaf,5,Leaf), 6, Branch(Leaf,7,Leaf)))
doesn't match with the Leaf case so we go to the second case and get it deconstructed as
| Branch (l,nd,r) ->
(* l is bound to Branch (Branch(Leaf,1,Leaf), 2, Leaf) *)
(* nd is bound to 3 *)
(* r is bound to Branch (Branch(Leaf,5,Leaf), 6, Branch(Leaf,7,Leaf)) *)
let mem1 = add (prim(l,r),mem) in
let mem2 = add (seco(l,r), mem1) in
nd :: tree_traverse (peek mem2, pop mem2, prim, seco)
We push the left sub-branch l onto the stack, giving mem1, and then push the right sub-branch r on top of it, giving mem2. Then we prepend 3 to the result and recurse into the top of mem2, popping it off the stack.
In the next step, we match on the top of our second stack, which contains the right branch, namely Branch (Branch(Leaf,5,Leaf), 6, Branch(Leaf,7,Leaf)). We land in the second case again, with
Branch(Leaf,5,Leaf) bound to the l variable and Branch(Leaf,7,Leaf) to r. We add 6 to the result, push l, then push and immediately pop r, and recurse into it.
On the third step of recursion we are called with Branch(Leaf,7,Leaf), we add 7 to our result, and push left and right Leaf to our stack.
On the fourth step, we peek the Leaf node, and finally get into the first case, where we look into the stack, and if it is not empty, then we recurse into the top. In our case the stack contains our sibling left Leaf, then the left sibling of the parent node, etc.
You can do the same using the OCaml toplevel, e.g.,
#trace tree_traverse;;
tree_traverse (t,[],prim,seco);;
Use the definitions of the helper functions that I provided above.
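To check the walkthrough without tracing by hand, here is a minimal self-contained sketch. It assumes the list-backed stack helpers suggested in this answer (rewritten with explicit failure cases to avoid non-exhaustive-match warnings); the names right_first and left_first are only illustrative.

```ocaml
type 'a tree = Leaf | Branch of ('a tree * 'a * 'a tree)

(* List-backed stack helpers. These mirror the assumed definitions from the
   answer, but fail with a message instead of a non-exhaustive match. *)
let add (x, xs) = x :: xs
let pop = function [] -> failwith "pop: empty stack" | _ :: xs -> xs
let peek = function [] -> failwith "peek: empty stack" | x :: _ -> x
let is_empty = function [] -> true | _ -> false

let prim (x, _) = x
let seco (_, y) = y

let rec tree_traverse (t, mem, prim, seco) = match t with
  | Leaf ->
    if is_empty mem then []
    else tree_traverse (peek mem, pop mem, prim, seco)
  | Branch (l, nd, r) ->
    let mem1 = add (prim (l, r), mem) in
    let mem2 = add (seco (l, r), mem1) in
    nd :: tree_traverse (peek mem2, pop mem2, prim, seco)

let t = Branch (Branch (Branch (Leaf, 1, Leaf), 2, Leaf), 3,
                Branch (Branch (Leaf, 5, Leaf), 6, Branch (Leaf, 7, Leaf)))

(* prim pushes the left subtree first, so the right subtree is popped first. *)
let right_first = tree_traverse (t, [], prim, seco)   (* [3; 6; 7; 5; 2; 1] *)

(* Swapping the two selectors visits the left subtree first instead. *)
let left_first = tree_traverse (t, [], seco, prim)    (* [3; 2; 1; 6; 5; 7] *)
```

With these helpers the traversal is a depth-first preorder; which child is visited first depends only on the order in which prim and seco push the subtrees.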

Related

How to add index numbers to the nodes of a tree

Suppose I have this data type for representing a tree (a rose tree):
type tree =
| Function of string * tree list
| Terminal of int
For example:
Function ("+", [Function ("*", [Terminal 5; Terminal 6]);
Function ("sqrt", [Terminal 3])])
represents the following tree ((5 * 6) + sqrt(3)):
I want to convert this tree into another tree data structure called an "indexed tree" that contains the depth-first (or breadth-first) index of each node. In the image above, I have labelled all nodes with their depth-first index.
This is the data type for indexed trees:
type index = int
type indexed_tree =
| IFunction of index * string * indexed_tree list
| ITerminal of index * int
This represents the indexed tree (depth-first) for the image above:
IFunction (0, "+", [IFunction (1, "*", [ITerminal (2, 5); ITerminal (3, 6)]);
IFunction (4, "sqrt", [ITerminal (5, 3)])])
This represents the indexed tree (breadth-first) for the image above:
IFunction (0, "+", [IFunction (1, "*", [ITerminal (3, 5); ITerminal (4, 6)]);
IFunction (2, "sqrt", [ITerminal (5, 3)])])
Now the problem is: how do I define a function tree -> indexed_tree?
I tried to adapt DFS and BFS techniques of keeping a stack, but I soon realized that this problem is completely different. DFS and BFS are only searching for one item, and they can ignore the rest of the tree. Here, I am trying to label the nodes of a tree with their index numbers. How can I do this?
EDIT
Below is my implementation for getting the subtree rooted at the specified index (both depth-first indexing and breadth-first indexing are implemented). I am unable to see how I could adapt this implementation to convert a given tree into an indexed tree. I tried to make use of counter (see implementation below), but the complication is that a depth-first traversal has to backtrack, and I don't know how to pass the counter around when backtracking.
(* Helper function for subtree_index_dfs and subtree_index_bfs.
* join_func should either be prepend (for depth-first), or postpend
* (for breadth-first). *)
let subtree_index tree index join_func =
let node_children = function
| Terminal _ -> []
| Function (_, children) -> children in
let rec loop counter stack =
match stack with
| [] -> failwith "Index out of bounds"
| (hd::_) when counter = index -> hd
| (hd::tl) -> loop (counter + 1) (join_func (node_children hd) tl)
in
loop 0 [tree]
(* Get the subtree rooted at the specified index.
* Index starts at 0 at the root of the tree and is ordered depth-first. *)
let subtree_index_dfs tree index =
let prepend a b =
a @ b
in
subtree_index tree index prepend
(* Get the subtree rooted at the specified index.
* Index starts at 0 at the root of the tree and is ordered breadth-first. *)
let subtree_index_bfs tree index =
let append a b =
b @ a
in
subtree_index tree index append
(* Misc. *)
let rec string_of_tree t =
match t with
| Terminal i -> string_of_int i
| Function (sym, children) ->
let children_str = List.map (fun child -> string_of_tree child) children
in
"(" ^ sym ^ " " ^ String.concat " " children_str ^ ")"
let print_tree t =
print_endline (string_of_tree t)
Example usage:
let () =
let t1 = Function ("+", [Function ("*", [Terminal 5; Terminal 6]);
Function ("sqrt", [Terminal 3])])
in
print_tree (subtree_index_dfs t1 0); (* (+ ( * 5 6) (sqrt 3)) *)
print_tree (subtree_index_dfs t1 1); (* ( * 5 6) *)
print_tree (subtree_index_dfs t1 2); (* 5 *)
print_tree (subtree_index_dfs t1 3); (* 6 *)
print_tree (subtree_index_dfs t1 4); (* (sqrt 3) *)
print_tree (subtree_index_dfs t1 5); (* 3 *)
print_tree (subtree_index_dfs t1 6); (* Exception: Failure "Index out of bounds". *)
This isn't a complete answer, and I think asking for both depth-first and breadth-first is a bit too broad, as I don't see there being much to generalize, but I hope it will at least get you a bit further, and perhaps spark some ideas.
I think the problem you're having stems from your current code being overgeneralized. You're basically transforming the tree, rather inefficiently, into a list and then indexing into that list. You can't then transform the list back into a tree, because you've thrown that information away.
Depth-first traversal is really very simple, and lends itself naturally to a recursive implementation. If we ignore the index for now, the tree -> indexed_tree function is just this:
let indexed tree =
let rec loop = function
| Terminal value -> ITerminal (0, value)
| Function (name, children) -> IFunction (0, name, loop_children children)
and loop_children = function
| [] -> []
| child :: rest -> loop child :: loop_children rest
in loop tree
Then we just have to fill in the index. This version does so by passing an incrementing index down while recursing, and returning the node count along with the constructed subtree on the way back up, so that we know how much to increment the index by:
let indexed_dfs tree =
let rec loop i = function
| Function (name, children) ->
let (child_count, children') = loop_children (i + 1) children in
(child_count + 1, IFunction (i, name, children'))
| Terminal value -> (1, ITerminal (i, value))
and loop_children i = function
| [] -> (0, [])
| child :: rest ->
let (count, child') = loop i child in
let (rest_count, rest') = loop_children (i + count) rest in
(count + rest_count, child' :: rest')
in snd (loop 0 tree)
Instead of the count each call could also just have returned the last index, but I thought it was simpler to follow two separate concepts instead of overloading a single one.
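For comparison, here is a sketch of that single-value alternative, in which each call returns the next free index (a close variant of "the last index") instead of a count. The name indexed_dfs_next is made up for this illustration; the types are copied from the question.

```ocaml
type tree =
  | Function of string * tree list
  | Terminal of int

type index = int
type indexed_tree =
  | IFunction of index * string * indexed_tree list
  | ITerminal of index * int

(* Depth-first labelling: [loop i t] labels [t] starting at index [i] and
   returns the next unused index together with the labelled subtree. *)
let indexed_dfs_next tree =
  let rec loop i = function
    | Terminal value -> (i + 1, ITerminal (i, value))
    | Function (name, children) ->
      (* Thread the next free index through the children, left to right. *)
      let (next, rev_children) =
        List.fold_left
          (fun (j, acc) child ->
             let (j', child') = loop j child in
             (j', child' :: acc))
          (i + 1, []) children
      in
      (next, IFunction (i, name, List.rev rev_children))
  in
  snd (loop 0 tree)

let example =
  Function ("+", [Function ("*", [Terminal 5; Terminal 6]);
                  Function ("sqrt", [Terminal 3])])

let labelled = indexed_dfs_next example
```

On the question's example this should reproduce the depth-first indexed tree given there.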
Breadth-first transformation is unfortunately significantly trickier, as we can't easily construct a new tree breadth-first. But we also shouldn't need to. I think instead of keeping track of the total count it might be an idea to keep track of the accumulated counts, or offsets, of each level by passing around a stack of offsets for the levels seen thus far, and then use that to calculate the index of the node currently being constructed.
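One way to make the level-offset idea concrete is the two-pass sketch below. It is only one possible reading of the suggestion, not a definitive implementation: the first pass counts the nodes on each level, the counts are turned into the first index of each level, and a second depth-first pass hands out indices from a per-level list of "next free index" values. The helper names count_levels, offsets_of_counts, label and indexed_bfs are hypothetical.

```ocaml
type tree =
  | Function of string * tree list
  | Terminal of int

type index = int
type indexed_tree =
  | IFunction of index * string * indexed_tree list
  | ITerminal of index * int

(* Pass 1: add the nodes of [t] to a list of per-level node counts. *)
let rec count_levels counts t =
  let (c, rest) = match counts with
    | [] -> (0, [])
    | c :: rest -> (c, rest) in
  match t with
  | Terminal _ -> (c + 1) :: rest
  | Function (_, children) ->
    (c + 1) :: List.fold_left count_levels rest children

(* Turn per-level counts into the first index of each level. *)
let offsets_of_counts counts =
  let rec go acc = function
    | [] -> []
    | c :: rest -> acc :: go (acc + c) rest
  in
  go 0 counts

(* Pass 2: [offsets] holds the next free index for this level and every
   deeper one; return the updated list and the labelled subtree. *)
let rec label offsets t =
  match offsets with
  | [] -> invalid_arg "label: offsets shorter than tree depth"
  | i :: deeper ->
    match t with
    | Terminal v -> ((i + 1) :: deeper, ITerminal (i, v))
    | Function (name, children) ->
      let (deeper', rev_children) =
        List.fold_left
          (fun (offs, acc) child ->
             let (offs', child') = label offs child in
             (offs', child' :: acc))
          (deeper, []) children
      in
      ((i + 1) :: deeper', IFunction (i, name, List.rev rev_children))

let indexed_bfs tree =
  snd (label (offsets_of_counts (count_levels [] tree)) tree)

let example =
  Function ("+", [Function ("*", [Terminal 5; Terminal 6]);
                  Function ("sqrt", [Terminal 3])])

let labelled = indexed_bfs example
```

The tree is still rebuilt depth-first; only the numbering is breadth-first, which sidesteps the difficulty of constructing a tree level by level. The cost is one extra traversal to collect the counts.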

Magic code for level binary tree traversal - what is going on?

We have a definition of binary tree:
type 'a tree =
| Node of 'a tree * 'a * 'a tree
| Null;;
And also a helpful function for traversing the tree:
let rec fold_tree f a t =
match t with
| Null -> a
| Node (l, x, r) -> f x (fold_tree f a l) (fold_tree f a r);;
And here is a "magic" function which, when given a binary tree, returns a list in which we have lists of elements on particular levels, for example, when given a tree:
(source: ernet.in)
the function returns [[1];[2;3];[4;5;6;7];[8;9]].
let levels tree =
let aux x fl fp =
fun l ->
match l with
| [] -> [x] :: (fl (fp []))
| h :: t -> (x :: h) :: (fl (fp t))
in fold_tree aux (fun x -> x) tree [];;
And apparently it works, but I can't wrap my mind around it. Could anyone explain in simple terms what is going on? Why does this function work?
How do you combine two layer lists of two subtrees and get a layer list of a bigger tree? Suppose you have this tree
  a
 / \
x   y
where x and y are arbitrary trees, and they have their layer lists as [[x00,x01,...],[x10,x11,...],...] and [[y00,y01,...],[y10,y11,...],...] respectively.
The layer list of the new tree will be [[a],[x00,x01,...]++[y00,y01,...],[x10,x11,...]++[y10,y11,...],...]. How does this function build it?
Let's look at this definition
let rec fold_tree f a t = ...
and see what kind of arguments we are passing to fold_tree in our definition of levels.
... in fold_tree aux (fun x -> x) tree []
So the first argument, aux, is some kind of long and complicated function. We will return to it later.
The second argument is also a function — the identity function. This means that fold_tree will also return a function, because fold_tree always returns the same type of value as its second argument. We will argue that the function fold_tree applied to this set of arguments takes a list of layers, and adds layers of a given tree to it.
The third argument is our tree.
Wait, what's the fourth argument? Isn't fold_tree only supposed to take three? Yes, but since it returns a function (see above), that function gets applied to the fourth argument, the empty list.
So let's return to aux. This aux function accepts three arguments. One is the element of the tree, and two others are the results of the folds of the subtrees, that is, whatever fold_tree returns. In our case, these two things are functions again.
So aux gets a tree element and two functions, and returns yet another function. Which function is that? It takes a list of layers, and adds the layers of a given tree to it. How does it do that? It prepends the root of the tree to the first element (which is the top layer) of the list, then adds the layers of the right subtree to the tail of the list (which is all the layers below the top) by calling the right function on it, and then adds the layers of the left subtree to the result by calling the left function on it. Or, if the incoming list is empty, it just builds the layers list afresh by applying the above step to the empty list.
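To see the two views side by side, the sketch below runs the "magic" levels function next to a direct version that merges the layer lists of the subtrees exactly as described at the start of this answer. The names merge, levels_direct and example are made up for this illustration, and the example tree is a small hypothetical one, not the tree from the question's picture.

```ocaml
type 'a tree =
  | Node of 'a tree * 'a * 'a tree
  | Null

let rec fold_tree f a t =
  match t with
  | Null -> a
  | Node (l, x, r) -> f x (fold_tree f a l) (fold_tree f a r)

(* The function from the question, unchanged apart from layout. *)
let levels tree =
  let aux x fl fp =
    fun l ->
      match l with
      | [] -> [x] :: (fl (fp []))
      | h :: t -> (x :: h) :: (fl (fp t))
  in
  fold_tree aux (fun x -> x) tree []

(* Layer-wise concatenation of two layer lists. *)
let rec merge xs ys =
  match xs, ys with
  | [], rest | rest, [] -> rest
  | x :: xs', y :: ys' -> (x @ y) :: merge xs' ys'

(* Direct version: the root's layer, then the merged layers of the subtrees. *)
let rec levels_direct = function
  | Null -> []
  | Node (l, x, r) -> [x] :: merge (levels_direct l) (levels_direct r)

let example =
  Node (Node (Node (Null, 4, Null), 2, Null), 1,
        Node (Null, 3, Node (Null, 5, Null)))

let by_fold = levels example            (* [[1]; [2; 3]; [4; 5]] *)
let by_merge = levels_direct example
```

The fold-based version avoids the repeated list appends of merge by passing the partially built layer list through the subtrees, right subtree first, and consing each element onto the front of its layer.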

The right way to use a data structure in OCaml

Ok, I have written a binary search tree in OCaml.
type 'a bstree =
|Node of 'a * 'a bstree * 'a bstree
|Leaf
let rec insert x = function
|Leaf -> Node (x, Leaf, Leaf)
|Node (y, left, right) as node ->
if x < y then
Node (y, insert x left, right)
else if x > y then
Node (y, left, insert x right)
else
node
I guess the above code does not have problems.
When using it, I write
let root = insert 4 Leaf
let root = insert 5 root
...
Is this the correct way to use/insert to the tree?
I mean, I guess I shouldn't declare the root and every time I again change the variable root's value, right?
If so, how can I always keep a root and can insert a value into the tree at any time?
This looks like good functional code for inserting into a tree. It doesn't mutate the tree during insertion, but instead it creates a new tree containing the value. The basic idea of immutable data is that you don't "keep" things. You calculate values and pass them along to new functions. For example, here's a function that creates a tree from a list:
let tree_of_list l = List.fold_right insert l Leaf
It works by passing the current tree along to each new call to insert.
It's worth learning to think this way, as many of the benefits of FP derive from the use of immutable data. However, OCaml is a mixed-paradigm language. If you want to, you can use a reference (or mutable record field) to "keep" a tree as it changes value, just as in ordinary imperative programming.
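As a minimal sketch of the reference-based option, assuming the bstree type and insert function from the question: root and add are hypothetical names for the mutable cell and its update function, and to_list is a small helper added only to inspect the result.

```ocaml
type 'a bstree =
  | Node of 'a * 'a bstree * 'a bstree
  | Leaf

let rec insert x = function
  | Leaf -> Node (x, Leaf, Leaf)
  | Node (y, left, right) as node ->
    if x < y then Node (y, insert x left, right)
    else if x > y then Node (y, left, insert x right)
    else node

(* A mutable cell that always holds the current version of the tree. *)
let root : int bstree ref = ref Leaf

(* Replace the stored tree with a new one that also contains x. *)
let add x = root := insert x !root

(* In-order listing, used only to inspect the contents. *)
let rec to_list = function
  | Leaf -> []
  | Node (v, l, r) -> to_list l @ (v :: to_list r)

let () = List.iter add [4; 5; 2]
let snapshot = !root          (* an immutable value: later adds don't touch it *)
let () = add 9
```

Only the reference changes; every tree it has ever pointed to is still an ordinary immutable value, so snapshot keeps its three elements after the final add.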
Edit:
You might think the following session shows a modification of a variable x:
# let x = 2;;
val x : int = 2
# let x = 3;;
val x : int = 3
#
However, the way to look at this is that these are two different values that happen to both be named x. Because the names are the same, the old value of x is hidden. But if you had another way to access the old value, it would still be there. Maybe the following will show how things work:
# let x = 2;;
val x : int = 2
# let f () = x + 5;;
val f : unit -> int = <fun>
# f ();;
- : int = 7
# let x = 8;;
val x : int = 8
# f ();;
- : int = 7
#
Creating a new thing named x with the value 8 doesn't affect what f does. It's still using the same old x that existed when it was defined.
Edit 2:
Removing a value from a tree immutably is analogous to adding a value. I.e., you don't actually modify an existing tree. You create a new tree without the value that you don't want. Just as inserting doesn't copy the whole tree (it re-uses large parts of the previous tree), so deleting won't copy the whole tree either. Any parts of the tree that aren't changed can be re-used in the new tree.
Edit 3
Here's some code to remove a value from a tree. It uses a helper function that adjoins two trees that are known to be disjoint (furthermore all values in a are less than all values in b):
let rec adjoin a b =
match a, b with
| Leaf, _ -> b
| _, Leaf -> a
| Node (v, al, ar), _ -> Node (v, al, adjoin ar b)
let rec delete x = function
| Leaf -> Leaf
| Node (v, l, r) ->
if x = v then adjoin l r
else if x < v then Node (v, delete x l, r)
else Node (v, l, delete x r)
(Hope I didn't just spoil your homework!)
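A short usage sketch, assuming the insert, tree_of_list, adjoin and delete definitions from this thread. The to_list helper and the shares_left check are additions for illustration; the physical-equality test (==) is there to show that the untouched subtree is re-used rather than copied.

```ocaml
type 'a bstree =
  | Node of 'a * 'a bstree * 'a bstree
  | Leaf

let rec insert x = function
  | Leaf -> Node (x, Leaf, Leaf)
  | Node (y, left, right) as node ->
    if x < y then Node (y, insert x left, right)
    else if x > y then Node (y, left, insert x right)
    else node

let tree_of_list l = List.fold_right insert l Leaf

(* Join two trees where every value in a is less than every value in b. *)
let rec adjoin a b =
  match a, b with
  | Leaf, _ -> b
  | _, Leaf -> a
  | Node (v, al, ar), _ -> Node (v, al, adjoin ar b)

let rec delete x = function
  | Leaf -> Leaf
  | Node (v, l, r) ->
    if x = v then adjoin l r
    else if x < v then Node (v, delete x l, r)
    else Node (v, l, delete x r)

(* In-order listing, used only to inspect the contents. *)
let rec to_list = function
  | Leaf -> []
  | Node (v, l, r) -> to_list l @ (v :: to_list r)

let t = tree_of_list [1; 5; 3; 4]   (* 4 ends up at the root *)
let without_root = delete 4 t
let without_five = delete 5 t

(* Deleting from the right subtree leaves the left subtree physically shared. *)
let shares_left =
  match t, without_five with
  | Node (_, l, _), Node (_, l', _) -> l == l'
  | _ -> false
```

The original tree t is unchanged by both deletions, which is the same behaviour as with insert.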

Find the deepest element of a Binary Tree in SML

This is a homework question.
My question is simple: Write a function btree_deepest of type 'a btree -> 'a list that returns the list of the deepest elements of the tree. If the tree is empty, then deepest should return []. If there are multiple elements of the input tree at the same maximal depth, then deepest should return a list containing those deepest elements, ordered according to a preorder traversal. Your function must use the provided btree_reduce function and must not be recursive.
Here is my code:
(* Binary tree datatype. *)
datatype 'a btree = Leaf | Node of 'a btree * 'a * 'a btree
(* A reduction function. *)
(* btree_reduce : ('b * 'a * 'b -> 'b) -> 'b -> 'a btree -> 'b *)
fun btree_reduce f b bt =
case bt of
Leaf => b
| Node (l, x, r) => f (btree_reduce f b l, x, btree_reduce f b r)
(* btree_size : 'a btree -> int *)
fun btree_size bt =
btree_reduce (fn (l, _, r) => l + 1 + r) 0 bt
(* btree_height : 'a btree -> int *)
fun btree_height bt =
btree_reduce (fn(l,n,r) => Int.max(l, r)+1) 0 bt
I know that I have to create a function to pass to btree_reduce to build the list of deepest elements and that is where I am faltering.
If I were allowed to use recursion then I would just compare the heights of the left and right node then recurse on whichever branch was higher (or recurse on both if they were the same height) then return the current element when the height is zero and throw these elements into a list.
I think I just need a push in the right direction to get started...
Thanks!
Update:
Here is an attempt at a solution that doesn't compile:
fun btree_deepest bt =
let
val (returnMe, height) = btree_reduce (fn((left_ele, left_dep),n,(right_ele, right_dep)) =>
if left_dep = right_dep
then
if left_dep = 0
then ([n], 1)
else ([left_ele::right_ele], left_dep + 1)
else
if left_dep > right_dep
then (left_ele, left_dep+1)
else (right_ele, right_dep+1)
)
([], 0) bt
in
returnMe
end
In order to get the elements of maximum depth, you will need to keep track of two things simultaneously for every subtree visited by btree_reduce: The maximum depth of that subtree, and the elements found at that depth. Wrap this information up in some data structure, and you have your type 'b (according to btree_reduce's signature).
Now, when you need to combine two subtree results in the function you provide to btree_reduce, you have three possible cases: the "left" sub-result is "deeper", "less deep", or "of equal depth" compared to the "right" sub-result. Remember that the sub-results represent the depths and node values of the deepest nodes in each subtree, and think about how to combine them to get the depth and the values of the deepest nodes for the current tree.
If you need more pointers, I have an implementation of btree_deepest ready which I'm just itching to share; I've not posted it yet since you specifically (and honorably) asked for hints, not the solution.
Took a look at your code; it looks like there is some confusion about whether the X_ele values are single elements or lists, which causes the type error. Try using the "@" operator in your first 'else' branch above:
if left_dep = 0
then ([n], 1)
else (left_ele @ right_ele, left_dep + 1)
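For readers who want to see the three cases spelled out, here is a sketch in OCaml rather than SML (the rest of this page is mostly OCaml). It is a translation made under assumptions, not the answerer's withheld implementation: the accumulator is a pair of (deepest elements, depth), and combine is a made-up name for the function passed to btree_reduce.

```ocaml
type 'a btree = Leaf | Node of 'a btree * 'a * 'a btree

let rec btree_reduce f b bt =
  match bt with
  | Leaf -> b
  | Node (l, x, r) -> f (btree_reduce f b l, x, btree_reduce f b r)

(* Each sub-result is (deepest elements, depth); an empty tree has depth 0. *)
let btree_deepest bt =
  let combine ((l_elems, l_depth), x, (r_elems, r_depth)) =
    if l_depth = 0 && r_depth = 0 then ([x], 1)              (* x has no children *)
    else if l_depth > r_depth then (l_elems, l_depth + 1)    (* left is deeper *)
    else if l_depth < r_depth then (r_elems, r_depth + 1)    (* right is deeper *)
    else (l_elems @ r_elems, l_depth + 1)                    (* tie: left first *)
  in
  fst (btree_reduce combine ([], 0) bt)

let example =
  Node (Node (Node (Leaf, 4, Leaf), 2, Leaf), 1,
        Node (Leaf, 3, Node (Leaf, 5, Leaf)))

let deepest = btree_deepest example   (* [4; 5] *)
```

btree_deepest itself is not recursive; all recursion happens inside btree_reduce. Appending the left elements before the right ones keeps ties in preorder.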

binary search tree for finding more than one object

I've just read about binary search trees from the "Learn You a Haskell" book, and I'm wondering whether it is effective to search more than one element using this tree? For example, suppose I have a bunch of objects where every object has some index, and
      5
    /   \
   3     7
  / \   / \
 1   4 6   8
if I need to find an element by index 8, I need to do only three steps, 5 -> 7 -> 8, instead of iterating over the whole list until the end. But what if I need to find several objects, say 1, 4, 6, 8? It seems like I'd need to repeat the same action for each element: 5 -> 3 -> 1, 5 -> 3 -> 4, 5 -> 7 -> 6 and 5 -> 7 -> 8.
So my question is: does it still make sense to use binary search tree for finding more than one element? Could it be better than checking each element for condition (which leads only to O(n) in the worst case)?
Also, what kind of data structure is better to use if I need to check more than one attribute. E.g. in the example above, I was looking only for the id attribute, but what if I also need to search by name, or color, etc?
You can share some of the work. See members, which takes in a list of values and outputs a list of exactly those values of the input list that are in the tree. Note: The order of the input list is not preserved in the output list.
EDIT: I'm actually not sure whether members gives better performance (from a theoretical standpoint) than doing map member. I think that if the input list is sorted it could, because splitting the list in three (lts, eqs, gts) could then be done easily.
data BinTree a
  = Branch (BinTree a) a (BinTree a)
  | Leaf
  deriving (Show, Eq, Ord)
empty :: BinTree a
empty = Leaf
singleton :: a -> BinTree a
singleton x = Branch Leaf x Leaf
add :: (Ord a) => a -> BinTree a -> BinTree a
add x Leaf = singleton x
add x tree@(Branch left y right) = case compare x y of
  EQ -> tree
  LT -> Branch (add x left) y right
  GT -> Branch left y (add x right)
member :: (Ord a) => a -> BinTree a -> Bool
member x Leaf = False
member x (Branch left y right) = case compare x y of
  EQ -> True
  LT -> member x left
  GT -> member x right
members :: (Ord a) => [a] -> BinTree a -> [a]
members xs Leaf = []
members xs (Branch left y right) = eqs ++ members lts left ++ members gts right
  where
    comps = map (\x -> (compare x y, x)) xs
    grab ordering = map snd . filter ((ordering ==) . fst)
    eqs = grab EQ comps
    lts = grab LT comps
    gts = grab GT comps
A quite acceptable solution when searching for multiple elements is to search for them one at a time with the most efficient algorithm (which is O(log n) in your case). However, it can be quite advantageous to step through the entire tree and pool all the elements that match a certain condition. It really depends on where and how often you search inside your code. If you only search at one point in your code, it would make sense to collect all the elements in the tree in one shot instead of searching for them one by one. If you decide to opt for that solution, then you could feasibly use other data structures such as a list.
If you need to check for multiple attributes I suggest replacing "id" with a tuple containing all the different possible identifiers (id, color, ...). You can then unpack the tuple and compare whichever identifiers you want.
Assuming your binary tree is balanced, if you have a constant number k of search items, then k searches with a total time of O(k * log(n)) is still better than a single O(n) search, where at each element you still have to do k comparisons, making it O(k*n). Even if the list of search items is sorted, and you can binary search in O(log(k)) time to see if your current item is a match, you're still at O(n * log(k)), which is worse than the tree unless k is Theta(n).
No.
A single search is O(log n). 4 searches is O(4 log n). A linear search, which would pick up all items, is O(n). The tree structure of a btree means finding more than one datum requires a walk (which is actually worse than a list walk).

Resources