How to add index numbers to the nodes of a tree - algorithm

Suppose I have this data type for representing a tree (a rose tree):
type tree =
| Function of string * tree list
| Terminal of int
For example:
Function ("+", [Function ("*", [Terminal 5; Terminal 6]);
Function ("sqrt", [Terminal 3])])
represents the following tree ((5 * 6) + sqrt(3)):
I want to convert this tree into another tree data structure called an "indexed tree" that contains the depth-first (or breadth-first) index of each node. In the image above, I have labelled all nodes with their depth-first index.
This is the data type for indexed trees:
type index = int
type indexed_tree =
| IFunction of index * string * indexed_tree list
| ITerminal of index * int
This represents the indexed tree (depth-first) for the image above:
IFunction (0, "+", [IFunction (1, "*", [ITerminal (2, 5); ITerminal (3, 6)]);
IFunction (4, "sqrt", [ITerminal (5, 3)])])
This represents the indexed tree (breadth-first) for the image above:
IFunction (0, "+", [IFunction (1, "*", [ITerminal (3, 5); ITerminal (4, 6)]);
IFunction (2, "sqrt", [ITerminal (5, 3)])])
Now the problem is: how do I define a function tree -> indexed_tree?
I tried to adapt DFS and BFS techniques of keeping a stack, but I soon realized that this problem is completely different. DFS and BFS are only searching for one item, and they can ignore the rest of the tree. Here, I am trying to label the nodes of a tree with their index numbers. How can I do this?
EDIT
Below is my implementation for getting the subtree rooted at the specified index (both depth-first indexing and breadth-first indexing are implemented). I am unable to see how I could adapt this implementation to convert a given tree into an indexed tree. I tried to make use of counter (see implementation below), but the complication is that a depth-first traversal has to backtrack, and I don't know how to pass the counter around when backtracking.
(* Helper function for subtree_index_dfs and subtree_index_bfs.
 * join_func should either be prepend (for depth-first), or append
 * (for breadth-first). *)
let subtree_index tree index join_func =
  let node_children = function
    | Terminal _ -> []
    | Function (_, children) -> children in
  let rec loop counter stack =
    match stack with
    | [] -> failwith "Index out of bounds"
    | (hd::_) when counter = index -> hd
    | (hd::tl) -> loop (counter + 1) (join_func (node_children hd) tl)
  in
  loop 0 [tree]
(* Get the subtree rooted at the specified index.
 * Index starts at 0 at the root of the tree and is ordered depth-first. *)
let subtree_index_dfs tree index =
  let prepend a b =
    a @ b
  in
  subtree_index tree index prepend
(* Get the subtree rooted at the specified index.
 * Index starts at 0 at the root of the tree and is ordered breadth-first. *)
let subtree_index_bfs tree index =
  let append a b =
    b @ a
  in
  subtree_index tree index append
(* Misc. *)
let rec string_of_tree t =
match t with
| Terminal i -> string_of_int i
| Function (sym, children) ->
let children_str = List.map (fun child -> string_of_tree child) children
in
"(" ^ sym ^ " " ^ String.concat " " children_str ^ ")"
let print_tree t =
print_endline (string_of_tree t)
Example usage:
let () =
let t1 = Function ("+", [Function ("*", [Terminal 5; Terminal 6]);
Function ("sqrt", [Terminal 3])])
in
print_tree (subtree_index_dfs t1 0); (* (+ ( * 5 6) (sqrt 3)) *)
print_tree (subtree_index_dfs t1 1); (* ( * 5 6) *)
print_tree (subtree_index_dfs t1 2); (* 5 *)
print_tree (subtree_index_dfs t1 3); (* 6 *)
print_tree (subtree_index_dfs t1 4); (* (sqrt 3) *)
print_tree (subtree_index_dfs t1 5); (* 3 *)
print_tree (subtree_index_dfs t1 6); (* Exception: Failure "Index out of bounds". *)

This isn't a complete answer, and I think asking for both depth-first and breadth-first is a bit too broad as I don't see there being much to generalize, but I hope it will at least get you a bit further, and perhaps spark some ideas.
I think the problem you're having stems from your current code being overgeneralized. You're basically transforming the tree, rather inefficiently, into a list and then indexing into that list. You can't then transform the list back into a tree, because you've thrown that information away.
Depth-first traversal is really very simple, and lends itself naturally to a recursive implementation. If we ignore the index for now, the tree -> indexed_tree function is just this:
let indexed tree =
  let rec loop = function
    | Terminal value -> ITerminal (0, value)
    | Function (name, children) -> IFunction (0, name, loop_children children)
  and loop_children = function
    | [] -> []
    | child :: rest -> loop child :: loop_children rest
  in loop tree
Then we just have to fill in the index. This does so by passing an incrementing index down while recursing, and returning the node count along with the constructed subtree on the way back up, so that we know how much to increment the index by:
let indexed_dfs tree =
  let rec loop i = function
    | Function (name, children) ->
      let (child_count, children') = loop_children (i + 1) children in
      (child_count + 1, IFunction (i, name, children'))
    | Terminal value -> (1, ITerminal (i, value))
  and loop_children i = function
    | [] -> (0, [])
    | child :: rest ->
      let (count, child') = loop i child in
      let (rest_count, rest') = loop_children (i + count) rest in
      (count + rest_count, child' :: rest')
  in snd (loop 0 tree)
Instead of the count each call could also just have returned the last index, but I thought it was simpler to follow two separate concepts instead of overloading a single one.
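For comparison, here is a minimal sketch of that alternative, where each call returns the next unused index instead of a node count. The name indexed_dfs_next is made up for this illustration, and the type definitions are repeated from the question so the snippet stands alone.

```ocaml
type tree =
  | Function of string * tree list
  | Terminal of int

type index = int

type indexed_tree =
  | IFunction of index * string * indexed_tree list
  | ITerminal of index * int

(* [loop next t] labels [t] starting at index [next]. It returns the first
   index that is still unused, together with the labelled subtree.
   indexed_dfs_next is a hypothetical name, not from the original answer. *)
let indexed_dfs_next tree =
  let rec loop next = function
    | Terminal value -> (next + 1, ITerminal (next, value))
    | Function (name, children) ->
      (* Thread the next free index through the children, left to right. *)
      let (next', rev_children) =
        List.fold_left
          (fun (n, acc) child ->
             let (n', child') = loop n child in
             (n', child' :: acc))
          (next + 1, [])
          children
      in
      (next', IFunction (next, name, List.rev rev_children))
  in
  snd (loop 0 tree)
```

Whether this reads better than the count-based version is mostly a matter of taste; the fold just makes the "pass the counter along" step explicit.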
Breadth-first transformation is unfortunately significantly trickier, as we can't easily construct a new tree breadth-first. But we also shouldn't need to. I think instead of keeping track of the total count it might be an idea to keep track of the accumulated counts, or offsets, of each level by passing around a stack of offsets for the levels seen thus far, and then use that to calculate the index of the node currently being constructed.
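As a rough sketch of the offsets idea (one possible reading of it, not a reference implementation): first count how many nodes sit at each depth, turn those counts into the starting index of each level, and then do an ordinary depth-first rebuild while threading one counter per level. The helper names add_widths, widths, offsets and indexed_bfs are assumptions made for this example.

```ocaml
type tree =
  | Function of string * tree list
  | Terminal of int

type index = int

type indexed_tree =
  | IFunction of index * string * indexed_tree list
  | ITerminal of index * int

(* Pointwise sum of two lists, keeping the tail of the longer one. *)
let rec add_widths a b =
  match a, b with
  | [], rest -> rest
  | rest, [] -> rest
  | x :: xs, y :: ys -> (x + y) :: add_widths xs ys

(* Number of nodes at depth 0, 1, 2, ... of the tree. *)
let rec widths = function
  | Terminal _ -> [1]
  | Function (_, children) ->
    1 :: List.fold_left (fun acc c -> add_widths acc (widths c)) [] children

(* Starting breadth-first index of each level: [1; 2; 3] gives [0; 1; 3]. *)
let offsets ws =
  let (_, rev) =
    List.fold_left (fun (total, acc) w -> (total + w, total :: acc)) (0, []) ws
  in
  List.rev rev

let indexed_bfs tree =
  (* [counters] holds the next free index for the current depth (head)
     and for every deeper level (tail). *)
  let rec loop counters t =
    match counters, t with
    | [], _ -> invalid_arg "indexed_bfs: tree deeper than its width list"
    | i :: deeper, Terminal value -> ((i + 1) :: deeper, ITerminal (i, value))
    | i :: deeper, Function (name, children) ->
      let (deeper', rev_children) =
        List.fold_left
          (fun (cs, acc) child ->
             let (cs', child') = loop cs child in
             (cs', child' :: acc))
          (deeper, [])
          children
      in
      ((i + 1) :: deeper', IFunction (i, name, List.rev rev_children))
  in
  snd (loop (offsets (widths tree)) tree)
```

This works because a depth-first, left-to-right walk still visits the nodes of any single level in left-to-right order, so a per-level counter hands out exactly the breadth-first indices. The cost is two passes over the tree.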


Could someone explain to me what the tree traversal is doing here?

I am having a hard time understanding the tree_traverse function and what it does, could someone please explain.
(Peek and pop is just from a stack implementation which I made.)
type 'a tree = Leaf | Branch of ('a tree * 'a * 'a tree);;
let rec tree_traverse (t,mem,prim,seco) = match t with
| Leaf ->
if is_empty mem
then []
else tree_traverse (peek mem, pop mem, prim, seco)
| Branch (l,nd,r) ->
let mem1 = add (prim(l,r),mem) in
let mem2 = add (seco(l,r), mem1) in
nd :: tree_traverse (peek mem2, pop mem2, prim, seco)
where an example of a tree is
let t = Branch (Branch(Branch(Leaf,1,Leaf), 2, Leaf), 3,
Branch (Branch(Leaf,5,Leaf), 6, Branch(Leaf,7,Leaf)))
This function implements some sort of the worklist algorithm, it returns a list of nodes in some order that depends on the implementation of prim and seco functions, as well as add and pop.
Neither the prim nor the seco parameter is changed during the recursion, so they could be removed from the list of parameters. If we assume the following implementations
let add (x,xs) = x :: xs
let pop (x::xs) = xs
let peek (x::xs) = x
let prim (x,y) = x
let seco (x,y) = y
let is_empty = function [] -> true | _ -> false
then the tree_traverse function will return the list of nodes in the depth-first order.
Given your example and fixing the implementation to the specified above functions, we can now follow the execution of the function:
tree_traverse Branch (Branch(Branch(Leaf,1,Leaf), 2, Leaf), 3,
Branch (Branch(Leaf,5,Leaf), 6, Branch(Leaf,7,Leaf)))
doesn't match with the Leaf case so we go to the second case and get it deconstructed as
| Branch (l,nd,r) ->
(* l is bound to Branch (Branch(Leaf,1,Leaf), 2, Leaf) *)
(* nd is bound to 3 *)
(* r is bound to Branch (Branch(Leaf,5,Leaf), 6, Branch(Leaf,7,Leaf)) *)
let mem1 = add (prim(l,r),mem) in
let mem2 = add (seco(l,r), mem1) in
nd :: tree_traverse (peek mem2, pop mem2, prim, seco)
we push the left sub-branch l onto mem to get mem1, and then push the right sub-branch r on top of that to get mem2, so mem2 holds both sub-branches with r on top. Then we prepend 3 to the result and recurse into the top of our stack mem2 while dropping it.
In the next step, we match on the top of that stack, which contains the right branch, namely Branch (Branch(Leaf,5,Leaf), 6, Branch(Leaf,7,Leaf)). We land in the second case again, with Branch(Leaf,5,Leaf) bound to the l variable and Branch(Leaf,7,Leaf) to r. We add 6 to the result, push l, then push and immediately pop r, and recurse into it.
On the third step of recursion we are called with Branch(Leaf,7,Leaf), we add 7 to our result, and push left and right Leaf to our stack.
On the fourth step, we peek the Leaf node, and finally get into the first case, where we look into the stack, and if it is not empty, then we recurse into the top. In our case the stack contains our sibling left Leaf, then the left sibling of the parent node, etc.
You can do the same using the OCaml toplevel, e.g.,
#trace tree_traverse;;
tree_traverse (t,[],prim,seco);;
Use the definitions of the helper functions that I provided above.
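To make the walkthrough easy to replay, here is a self-contained sketch that bundles the helpers with the function and runs it on the example tree. The only liberty taken is that pop and peek are written with an explicit failure case for the empty stack, to avoid non-exhaustive match warnings; that is an assumption of this sketch, not part of the original helpers.

```ocaml
type 'a tree = Leaf | Branch of ('a tree * 'a * 'a tree)

(* List-based stack helpers, following the definitions assumed above. *)
let add (x, xs) = x :: xs
let pop = function [] -> failwith "pop: empty stack" | _ :: xs -> xs
let peek = function [] -> failwith "peek: empty stack" | x :: _ -> x
let is_empty = function [] -> true | _ -> false
let prim (x, _) = x
let seco (_, y) = y

let rec tree_traverse (t, mem, prim, seco) = match t with
  | Leaf ->
    if is_empty mem
    then []
    else tree_traverse (peek mem, pop mem, prim, seco)
  | Branch (l, nd, r) ->
    let mem1 = add (prim (l, r), mem) in
    let mem2 = add (seco (l, r), mem1) in
    nd :: tree_traverse (peek mem2, pop mem2, prim, seco)

let t = Branch (Branch (Branch (Leaf, 1, Leaf), 2, Leaf), 3,
                Branch (Branch (Leaf, 5, Leaf), 6, Branch (Leaf, 7, Leaf)))

(* The right sub-branch ends up on top of the stack, so it is visited first. *)
let right_first = tree_traverse (t, [], prim, seco)   (* [3; 6; 7; 5; 2; 1] *)

(* Swapping prim and seco pushes the left sub-branch last instead. *)
let left_first = tree_traverse (t, [], seco, prim)    (* [3; 2; 1; 6; 5; 7] *)
```

Both results are depth-first pre-order listings; the two selector functions only decide which child is explored first.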

Generate all unique directed graphs with 2 inputs to each node

I'm trying to generate all unique digraphs that fit a spec:
each node must have exactly 2 inputs
and are allowed arbitrarily many outputs to other nodes in the graph
My current solution is slow. Eg for 6 nodes, the algo has taken 1.5 days to get where I think it's complete, but it'll probably be checking for a few more days still.
My algorithm for a graph with n nodes:
generate all n-length strings of 0, where one symbol is a 1, eg, for n=3, [[0,0,1], [0,1,0], [1,0,0]]. These can be thought of as rows from an identity matrix.
generate all possible n * n matrixes where each row is all possible combinations of step 1. + step 1.
This is the connectivity matrix where each cell represents a connection from column-index to row-index
So, for n=3, these are possible:
[0,1,0] + [1,0,0] = [1,1,0]
[1,0,0] + [1,0,0] = [2,0,0]
These represent the inputs to a node, and by adding step 1 to itself, the result will always represent 2 inputs.
For ex:
A B C
A' [[0,1,1],
B' [0,2,0],
C' [1,1,0]]
So B and C connect to A once each: B -> A', C -> A',
And B connects to itself twice: B => B'
I only want unique ones, so for each connectivity matrix generated, I can only keep it if it is not isomorphic to an already-seen graph.
This step is expensive. I need to convert the graph to a "canonical form" by running through each permutation of isomorphic graphs, sorting them, and considering the first one as the "canonical form".
If anyone dives into testing any of this out, here are the count of unique graphs for n nodes:
2 - 6
3 - 44
4 - 475
5 - 6874
6 - 109,934 (I think, it's not done running yet but I haven't found a new graph in >24 hrs.)
7 - I really wanna know!
Possible optimizations:
since I get to generate the graphs to test, is there a way of ruling them out, without testing, as being isomorphic to already-seen ones?
is there a faster graph-isomorphism algorithm? I think this one is related to "Nauty", and there are others I've read of in papers, but I haven't had the expertise (or bandwidth) to implement them yet.
Here's a demonstrable connectivity matrix that can be plotted at graphonline.ru for fun, showing self connections, and 2 connections to the same node:
1, 0, 0, 0, 0, 1,
1, 0, 0, 0, 1, 0,
0, 1, 0, 1, 0, 0,
0, 1, 2, 0, 0, 0,
0, 0, 0, 1, 0, 1,
0, 0, 0, 0, 1, 0,
here's the code in haskell if you want to play with it, but I'm more concerned about getting the algorithm right (eg pruning down the search space), than the implementation:
import Data.List (foldl', permutations)
import qualified Data.Set as Set
-- | generate all permutations of length n given symbols from xs
npermutations :: [a] -> Int -> [[a]]
npermutations xs size = mapM (const xs) [1..size]
identity :: Int -> [[Int]]
identity size = scanl
  (\xs _ -> take size $ 0 : xs)      -- keep shifting right
  (1 : (take (size - 1) (repeat 0))) -- initial, [1,0,0,...]
  [1 .. size-1]                      -- correct size
-- | return all possible pairings of [Column]
columnPairs :: [[a]] -> [([a], [a])]
columnPairs xs = (map (\x y -> (x,y)) xs)
  <*> xs
-- | remove duplicates
rmdups :: Ord a => [a] -> [a]
rmdups = rmdups' Set.empty where
  rmdups' _ [] = []
  rmdups' a (b : c) = if Set.member b a
    then rmdups' a c
    else b : rmdups' (Set.insert b a) c
-- | all possible patterns for inputting 2 things into one node.
-- eg [0,1,1] means cells B, and C project into some node
-- [0,2,0] means cell B projects twice into one node
binaryInputs :: Int -> [[Int]]
binaryInputs size = rmdups $ map -- rmdups because [1,0]+[0,1] is same as flipped
  (\(x,y) -> zipWith (+) x y)
  (columnPairs $ identity size)
transposeAdjMat :: [[Int]] -> [[Int]]
transposeAdjMat ([]:_) = []
transposeAdjMat m = (map head m) : transposeAdjMat (map tail m)
-- | AdjMap [(name, inbounds)]
data AdjMap a = AdjMap [(a, [a])] deriving (Show, Eq)
addAdjColToMap :: Int -- index
               -> [Int] -- inbound
               -> AdjMap Int
               -> AdjMap Int
addAdjColToMap ix col (AdjMap xs) =
  let conns = foldl (\c (cnt, i) -> case cnt of
                        1 -> i:c
                        2 -> i:i:c
                        _ -> c
                    )
                    []
                    (zip col [0..]) in
    AdjMap ((ix, conns) : xs)
adjMatToMap :: [[Int]] -> AdjMap Int
adjMatToMap cols = foldl
  (\adjMap@(AdjMap nodes) col -> addAdjColToMap (length nodes) col adjMap)
  (AdjMap [])
  cols
-- | a graph's canonical form : http://mfukar.github.io/2015/09/30/haskellxiii.html
-- very expensive algo, of course
canon :: (Ord a, Enum a, Show a) => AdjMap a -> String
canon (AdjMap g) = minimum $ map f $ Data.List.permutations [1..(length g)]
  where
    -- Graph vertices:
    vs = map fst g
    -- Find, via brute force on all possible orderings (permutations) of vs,
    -- a mapping of vs to [1..(length g)] which is minimal.
    -- For example, map [1, 5, 6, 7] to [1, 2, 3, 4].
    -- Minimal is defined lexicographically, since `f` returns strings:
    f p = let n = zip vs p
          in (show [(snd x, sort id $ map (\x -> snd $ head $ snd $ break ((==) x . fst) n)
                                   $ snd $ take_edge g x)
                   | x <- sort snd n])
    -- Sort elements of N in ascending order of (map f N):
    sort f n = foldr (\x xs -> let (lt, gt) = break ((<) (f x) . f) xs
                               in lt ++ [x] ++ gt) [] n
    -- Get the first entry from the adjacency list G that starts from the given node X
    -- (actually, the vertex is the first entry of the pair, hence `(fst x)`):
    take_edge g x = head $ dropWhile ((/=) (fst x) . fst) g
-- | all possible matrixes where each node has 2 inputs and arbitrary outs
binaryMatrixes :: Int -> [[[Int]]]
binaryMatrixes size = let columns = binaryInputs size
                          unfiltered = mapM (const columns) [1..size] in
  fst $ foldl'
    (\(keep, seen) x -> let can = canon . adjMatToMap $ x in
        (if Set.member can seen
           then keep
           else id $! x : keep
        , Set.insert can seen))
    ([], Set.fromList [])
    unfiltered
There are a number of approaches you could try. One thing that I do note is that having loops with multi-edges (colored loops?) is a little unusual, but it probably just needs a refinement of existing techniques.
Filter the output of another program
The obvious candidate here is of course nAUTy/traces (http://pallini.di.uniroma1.it/) or similar (saucy, bliss, etc). Depending on how you want to do this, it could be as simple as run nauty (for example) and output to file, then read in the list filtering as you go.
For larger values of n this could start to be a problem if you are generating huge files. I'm not sure whether you start to run out of space before you run out of time, but still. What might be better is to generate and test them as you go, throwing away candidates. For your purposes, there may be an existing library for generation - I found this one but I have no idea how good it is.
Use graph invariants
A very easy first step to more efficient listing of graphs is to filter using graph invariants. An obvious one would be degree sequence (the ordered list of degrees of the graph). Others include the number of cycles, the girth, and so on. For your purposes, there might be some indegree/outdegree sequence you could use.
The basic idea is to use the invariant as a filter to avoid expensive checks for isomorphism. You can store the (list of) invariants for already generated graphs, and check the new one against the list first. The canonical form of a structure is a kind of invariant.
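As a concrete but deliberately simple illustration of such a filter, here is a sketch in OCaml. Since every node has in-degree 2 by construction, in-degree tells us nothing, so this example uses the self-loop count and the out-degree of each node instead. The names invariant and may_be_isomorphic are made up for this sketch, and the matrix layout follows the question (rows are inputs). It only rules candidates out; graphs with equal invariants still need a real isomorphism check.

```ocaml
(* m.(i).(j) counts the edges from node j into node i, as in the question. *)
let invariant (m : int array array) : (int * int) list =
  let n = Array.length m in
  (* Per-node signature: (number of self-loops, out-degree). *)
  let signature j =
    let out_degree = ref 0 in
    for i = 0 to n - 1 do
      out_degree := !out_degree + m.(i).(j)
    done;
    (m.(j).(j), !out_degree)
  in
  (* Sorting makes the result independent of how the nodes are labelled. *)
  List.sort compare (List.init n signature)

(* Cheap pre-check: false means "certainly not isomorphic". *)
let may_be_isomorphic a b = invariant a = invariant b

(* The 3-node example from the question. *)
let g = [| [|0; 1; 1|]; [|0; 2; 0|]; [|1; 1; 0|] |]
(* The same graph with nodes A and B swapped. *)
let g_relabelled = [| [|2; 0; 0|]; [|1; 0; 1|]; [|1; 1; 0|] |]
(* A different graph: every node takes both inputs from node A. *)
let h = [| [|2; 0; 0|]; [|2; 0; 0|]; [|2; 0; 0|] |]
```

Here may_be_isomorphic g g_relabelled holds while may_be_isomorphic g h does not, so h could be rejected against g without ever computing a canonical form. Using the invariant as a key to bucket already-seen graphs means the expensive comparison only runs within one bucket.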
Implement an algorithm
There are lots of GI algorithms, including the ones used by nauty and friends. However, they do tend to be quite hard! The description given in this answer is an excellent overview, but the devil is in the details of course.
Also note that the description is for general graphs, while you have a specific subclass of graph that might be easier to generate. There may be papers out there for digraph listing (generating) but I have not checked.

Improvement of the Greedy Algorithm

I've been working on an abstract chess algorithm using Haskell (trying to expand my understanding of different paradigms), and I've hit a challenge that I've been pondering about for weeks.
Here's the problem:
Given a board (represented by a list of lists of integers; each
integer represents a subsequent point value), with dimensions n x n,
determine the path that provides the most points. If there is a tie
for best path, return either of them.
Here are the specifics:
A = [[5,4,3,1],[10,2,1,0],[0,1,2,0],[2,3,4,20]]
which renders as:
R1: 5 4 3 1, R2: 10 2 1 0, R3: 0 1 2 0, R4: 2 3 4 20.
The rules are:
You may start anywhere on the top row
You may move one square at a time, either straight down, down-left (diagonal) , or down-right (diagonal).
The output must be a tuple of integers.
First element is a list representing the columns vs. row, and the second element is the total number of points. Eg. for the above board, the best solution is to travel from top-left (5) and go diagonally for the remaining steps (until the 20 point square). This would result in the tuple ([1,2,3,4], 29).
Remember, this is all in Haskell so it is a functional-paradigm recursive problem. At first, I was thinking about using the greedy algorithm, that is, choosing the highest value in r1, and recursing through comparing the next 3 possibilities; choosing the highest of the 3. However, the downfall is that the greedy algorithm doesn't have the ability to see potential ahead of the next row.
How would I go about this? I'm not looking for code per se, since I enjoy solving things on my own. However, pseudocode or some algorithmic guidance would be much appreciated!
I saw your previous question on the same topic, and I started to work on it.
Since you don't want the direct solution, I can share my thoughts on your problem; I hope they help.
Some basic properties:
1. The number of moves is always equal to the length of the list: m = length A
2. The number of starting points is equal to the length of the head of the list: n = length (head A)
3. The current position can never be negative, so:
- if the current position is equal to 0, you can go either down or right
- otherwise you can go left, down or right
This leads us to this pseudocode:
generate_path :: [[Int]] -> [[Int]]
generate_path [] = [[]]
generate_path A = ... -- You have to put something here
where
m = length A
n = length (head A)
It should look something like this:
move pos0 count0
  | count0 == 0 =
  | pos0 == 0 = move (down count) ++ move (right count)
  | otherwise = move (left count) ++ move (down count) ++ move (right count)
  where
    count = count0 - 1
    down = pos0
    left = pos0 - 1
    right = pos0 + 1
In fact, keeping all of this in mind and adding the (!!) operator, we shouldn't be far from the solution. To convince yourself, play with A, a list comprehension and (!!), as in
[A !! x !! y | x <- [1..2], y <- [0..2]] -- I take random range
Or play with another version:
[[A !! x !! y | x <- [1..2]] | y <- [0..2]] -- I take random range
In fact you have two recursions: the main one works on the parameter n = length (head A), repeating the same action from 0 to (n-1) and retrieving the result at (n-1); embedded in it is another one that works on m, repeating the same action from 0 to (m-1).
Hope it helps.
Good luck.
Keep a list of the paths to each column in the row just reached with the highest score to that cell.
You'd start (in your example), with the list
[([1],5), ([2],4), ([3],3), ([4],1)]
Then, when checking the next row, for each column you pick the path with the highest score in the previous row that can reach that column. Here, for the second row, in columns 1 and 2 you'd pick the path ending in column 1 on the row above; in column 3, the path ending in column 2; and in column 4, the path ending in column 3. That would give you
[([1,1],15), ([1,2],7), ([2,3],5), ([3,4],3)]
for the third row, [0,1,2,0], you'd again pick the path ending in column 1 for the first two columns, the path ending in column 2 for the third, and the path ending in column 3 for the fourth,
[([1,1,1],15), ([1,1,2],16), ([1,2,3],9), ([2,3,4],5)]
for the fourth row, [2,3,4,20], you'd pick the path ending in column 2 for the first three columns, and the path ending in column 3 for the last,
[([1,1,2,1],18), ([1,1,2,2],19), ([1,1,2,3],20), ([1,2,3,4],29)]
Then, when you've reached the last row, you pick the path with the highest total.
Why it works:
Let the highest-scoring path end in column c. The part above the last row must be the highest-scoring path ending in one of the columns c-1, c, c+1 on the penultimate row, since column c in the last row can only be reached from those.
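For anyone who wants to check the tables above mechanically, here is a minimal sketch of the row-by-row bookkeeping, written in OCaml. The name best_path is made up for this illustration, and the sketch assumes a non-empty board whose rows all have the same non-zero length. On ties it keeps the path coming from straight above, then the one from the left.

```ocaml
(* Returns (path, total): the 1-based column chosen in each row, and the score.
   Assumes a non-empty, rectangular board (an assumption of this sketch). *)
let best_path (board : int list list) : int list * int =
  match List.map Array.of_list board with
  | [] -> invalid_arg "best_path: empty board"
  | first :: rest ->
    (* best.(c) = (reversed path, score) of the best path ending in column c. *)
    let start = Array.mapi (fun c v -> ([c + 1], v)) first in
    let step best row =
      let n = Array.length best in
      Array.mapi
        (fun c v ->
           (* Keep [acc] unless column k exists and scores strictly higher. *)
           let pick acc k =
             if k < 0 || k >= n then acc
             else if snd best.(k) > snd acc then best.(k)
             else acc
           in
           (* Candidates: columns c, c-1 and c+1 of the previous row. *)
           let (path, score) = pick (pick best.(c) (c - 1)) (c + 1) in
           ((c + 1) :: path, score + v))
        row
    in
    let final = List.fold_left step start rest in
    (* Pick the highest total in the last row. *)
    let (path, score) =
      Array.fold_left
        (fun acc r -> if snd r > snd acc then r else acc)
        final.(0) final
    in
    (List.rev path, score)

let example = best_path [[5; 4; 3; 1]; [10; 2; 1; 0]; [0; 1; 2; 0]; [2; 3; 4; 20]]
(* example = ([1; 2; 3; 4], 29) *)
```

Each row is processed once and each cell looks at no more than three neighbours, so the work grows linearly with the number of cells rather than with the number of possible paths.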
The best solution is not a greedy algorithm from the top down, but rather an approach that starts with the last row and works up:
import Data.Function
import Data.List
-- All elements of Board are lists of equal lengths
-- valid b = 1 == length (group (map length b))
type Value = Int
type Board = [[Value]]
type Index = Int
type Result = ([Index], Value)
p :: Board
p = [[5,4,3,1],[10,2,1,0],[0,1,2,0],[2,3,4,20]]
best_from :: Board -> Result
best_from [] = undefined
best_from xs | any null xs = undefined
best_from b = best_of . best_list $ b
best_list :: Board -> [Result]
best_list b = foldr1 layer (map label b)
  where label = zipWith (\index value -> ([index],value)) [1..]
        layer new rest = zipWith (\(i1,v1) (i2,v2) -> (i1++i2, v1+v2)) new best
          where temp = head rest : map best_pair (zip rest (tail rest))
                best = map best_pair (zip temp (tail rest)) ++ [last temp]
best_pair :: (Result,Result) -> Result
best_pair (a@(_,a1), b@(_,b1)) | a1 >=b1 = a
                               | otherwise = b
best_of :: [Result] -> Result
best_of = maximumBy (compare `on` snd)
main = do
  print (best_from p)
It is easy to solve if there is one row. So this converts each row into a list of Result with a simple [#] solution path.
Given the results (rest) for the part of the puzzle below a new row, adding the new row is a matter of finding the best solution from rest (by checking down, down-left, down-right) and combining it with the new row.
This makes foldr, or here foldr1 the natural structure.
I chose a different path, no pun intended. I listed the allowed index combinations and mapped the board to them. Perhaps someone can find a way to generalize it to a board of any size.
import Data.List
import Data.Ord
import Data.Maybe
a = [[5,4,3,1],[10,2,1,0],[0,1,2,0],[2,3,4,20]]
r1 = a !! 0
r2 = a !! 1
r3 = a !! 2
r4 = a !! 3
i = [0,1,2,3]
index_combinations = [[a,b,c,d] | a <- i, b <- i, c <- i, d <- i,
                      abs (b-a) < 2, abs (c-b) < 2, abs (d-c) < 2]
mapR xs = [r1 !! (xs !! 0), r2 !! (xs !! 1),
           r3 !! (xs !! 2), r4 !! (xs !! 3)]
r_combinations = map mapR index_combinations
r_combinations_summed = zip r_combinations $ map (foldr (+) 0) r_combinations
result = maximumBy (comparing snd) r_combinations_summed
path = index_combinations !! fromJust (elemIndex result r_combinations_summed)

The right way to use a data structure in OCaml

Ok, I have written a binary search tree in OCaml.
type 'a bstree =
|Node of 'a * 'a bstree * 'a bstree
|Leaf
let rec insert x = function
|Leaf -> Node (x, Leaf, Leaf)
|Node (y, left, right) as node ->
if x < y then
Node (y, insert x left, right)
else if x > y then
Node (y, left, insert x right)
else
node
I guess the above code does not have problems.
When using it, I write
let root = insert 4 Leaf
let root = insert 5 root
...
Is this the correct way to use/insert to the tree?
I mean, I guess I shouldn't declare the root and every time I again change the variable root's value, right?
If so, how can I always keep a root and can insert a value into the tree at any time?
This looks like good functional code for inserting into a tree. It doesn't mutate the tree during insertion, but instead it creates a new tree containing the value. The basic idea of immutable data is that you don't "keep" things. You calculate values and pass them along to new functions. For example, here's a function that creates a tree from a list:
let tree_of_list l = List.fold_right insert l Leaf
It works by passing the current tree along to each new call to insert.
It's worth learning to think this way, as many of the benefits of FP derive from the use of immutable data. However, OCaml is a mixed-paradigm language. If you want to, you can use a reference (or mutable record field) to "keep" a tree as it changes value, just as in ordinary imperative programming.
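If you do want the imperative style, a minimal sketch could look like the following. The names root and add are just illustrative, and the tree is fixed to int elements here to keep the reference monomorphic.

```ocaml
type 'a bstree =
  | Node of 'a * 'a bstree * 'a bstree
  | Leaf

let rec insert x = function
  | Leaf -> Node (x, Leaf, Leaf)
  | Node (y, left, right) as node ->
    if x < y then Node (y, insert x left, right)
    else if x > y then Node (y, left, insert x right)
    else node

(* A mutable cell that always holds the current version of the tree. *)
let root : int bstree ref = ref Leaf

(* Hypothetical helper: build a new tree containing x and store it in root.
   The old tree is not modified; the reference simply points at the new one. *)
let add x = root := insert x !root

let () =
  add 4;
  add 5;
  add 4   (* already present, so the stored tree stays the same *)
```

After this runs, !root is Node (4, Leaf, Node (5, Leaf, Leaf)). Only the reference is mutable; insert itself is still the pure function from the question.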
Edit:
You might think the following session shows a modification of a variable x:
# let x = 2;;
val x : int = 2
# let x = 3;;
val x : int = 3
#
However, the way to look at this is that these are two different values that happen to both be named x. Because the names are the same, the old value of x is hidden. But if you had another way to access the old value, it would still be there. Maybe the following will show how things work:
# let x = 2;;
val x : int = 2
# let f () = x + 5;;
val f : unit -> int = <fun>
# f ();;
- : int = 7
# let x = 8;;
val x : int = 8
# f ();;
- : int = 7
#
Creating a new thing named x with the value 8 doesn't affect what f does. It's still using the same old x that existed when it was defined.
Edit 2:
Removing a value from a tree immutably is analogous to adding a value. I.e., you don't actually modify an existing tree. You create a new tree without the value that you don't want. Just as inserting doesn't copy the whole tree (it re-uses large parts of the previous tree), so deleting won't copy the whole tree either. Any parts of the tree that aren't changed can be re-used in the new tree.
Edit 3
Here's some code to remove a value from a tree. It uses a helper function that adjoins two trees that are known to be disjoint (furthermore all values in a are less than all values in b):
let rec adjoin a b =
match a, b with
| Leaf, _ -> b
| _, Leaf -> a
| Node (v, al, ar), _ -> Node (v, al, adjoin ar b)
let rec delete x = function
| Leaf -> Leaf
| Node (v, l, r) ->
if x = v then adjoin l r
else if x < v then Node (v, delete x l, r)
else Node (v, l, delete x r)
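As a quick check that deleting really leaves the old tree untouched, here is a small self-contained sketch. It repeats insert, adjoin and delete from above so that it can be run on its own; to_list and the example values are additions made for this illustration.

```ocaml
type 'a bstree =
  | Node of 'a * 'a bstree * 'a bstree
  | Leaf

let rec insert x = function
  | Leaf -> Node (x, Leaf, Leaf)
  | Node (y, left, right) as node ->
    if x < y then Node (y, insert x left, right)
    else if x > y then Node (y, left, insert x right)
    else node

(* Join two trees where every value in a is less than every value in b. *)
let rec adjoin a b =
  match a, b with
  | Leaf, _ -> b
  | _, Leaf -> a
  | Node (v, al, ar), _ -> Node (v, al, adjoin ar b)

let rec delete x = function
  | Leaf -> Leaf
  | Node (v, l, r) ->
    if x = v then adjoin l r
    else if x < v then Node (v, delete x l, r)
    else Node (v, l, delete x r)

(* In-order listing, used only to inspect the trees. *)
let rec to_list = function
  | Leaf -> []
  | Node (v, l, r) -> to_list l @ (v :: to_list r)

let before = List.fold_left (fun t x -> insert x t) Leaf [4; 2; 6; 5]
let after = delete 4 before

let before_items = to_list before   (* [2; 4; 5; 6] *)
let after_items = to_list after     (* [2; 5; 6] *)
```

Both before and after remain usable afterwards, which is the practical meaning of "you create a new tree without the value".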
(Hope I didn't just spoil your homework!)

Find the deepest element of a Binary Tree in SML

This is a homework question.
My question is simple: Write a function btree_deepest of type 'a btree -> 'a list that returns the list of the deepest elements of the tree. If the tree is empty, then deepest should return []. If there are multiple elements of the input tree at the same maximal depth, then deepest should return a list containing those deepest elements, ordered according to a preorder traversal. Your function must use the provided btree_reduce function and must not be recursive.
Here is my code:
(* Binary tree datatype. *)
datatype 'a btree = Leaf | Node of 'a btree * 'a * 'a btree
(* A reduction function. *)
(* btree_reduce : ('b * 'a * 'b -> 'b) -> 'b -> 'a btree -> 'b *)
fun btree_reduce f b bt =
case bt of
Leaf => b
| Node (l, x, r) => f (btree_reduce f b l, x, btree_reduce f b r)
(* btree_size : 'a btree -> int *)
fun btree_size bt =
btree_reduce (fn(x,a,y) => x+a+y) 1 bt
(* btree_height : 'a btree -> int *)
fun btree_height bt =
btree_reduce (fn(l,n,r) => Int.max(l, r)+1) 0 bt
I know that I have to create a function to pass to btree_reduce to build the list of deepest elements and that is where I am faltering.
If I were allowed to use recursion then I would just compare the heights of the left and right node then recurse on whichever branch was higher (or recurse on both if they were the same height) then return the current element when the height is zero and throw these elements into a list.
I think I just need a push in the right direction to get started...
Thanks!
Update:
Here is an attempt at a solution that doesn't compile:
fun btree_deepest bt =
let
val (returnMe, height) = btree_reduce (fn((left_ele, left_dep),n,(right_ele, right_dep)) =>
if left_dep = right_dep
then
if left_dep = 0
then ([n], 1)
else ([left_ele::right_ele], left_dep + 1)
else
if left_dep > right_dep
then (left_ele, left_dep+1)
else (right_ele, right_dep+1)
)
([], 0) bt
in
returnMe
end
In order to get the elements of maximum depth, you will need to keep track of two things simultaneously for every subtree visited by btree_reduce: The maximum depth of that subtree, and the elements found at that depth. Wrap this information up in some data structure, and you have your type 'b (according to btree_reduce's signature).
Now, when you need to combine two subtree results in the function you provide to btree_reduce, you have three possible cases: "Left" sub-result is "deeper", "less deep", or "of equal depth" to the "right" sub-result. Remember that the sub-result represent the depths and node values of the deepest nodes in each subtree, and think about how to combine them to gain the depth and the values of the deepest nodes for the current tree.
If you need more pointers, I have an implementation of btree_deepest ready which I'm just itching to share; I've not posted it yet since you specifically (and honorably) asked for hints, not the solution.
Took a look at your code; it looks like there is some confusion about whether the X_ele values are single elements or lists, which causes the type error. Try using the "@" operator in your first 'else' branch above:
if left_dep = 0
then ([n], 1)
else (left_ele @ right_ele, left_dep + 1)
