I have the following implementation of a Depth First Search algorithm:
public static void printDFS(Node root) {
    Stack<Node> stack = new Stack<Node>();
    stack.push(root);
    while (!stack.isEmpty()) {
        Node curr = stack.pop();
        System.out.println(curr.getValue());
        if (curr.getLeft() != null) {
            stack.push(curr.getLeft());
        }
        if (curr.getRight() != null) {
            stack.push(curr.getRight());
        }
    }
}
and when I run it on a tree that looks like this:
      0
    /   \
   6     7
  / \   / \
 5   4 3   2
I get the visited output as: 0 -> 7 -> 2 -> 3 -> 6 -> 4 -> 5
Is this a 'correct' DFS ordering? I would have expected the output to be a pre-order traversal (i.e. 0 -> 6 -> 5 -> 4 -> 7 -> 3 -> 2), and I know I can get this by pushing the right node of each subtree first. But what I want to know is: what is the correct visitation order in a DFS algorithm?
As already mentioned in another answer, the reason your traversal order comes out "inverted" is that you are using a Stack to keep track of the nodes that are still to be visited.
Let me walk you through your example tree:
      0
    /   \
   6     7
  / \   / \
 5   4 3   2
stack.push(root) leads to the following stack state:
0: 0 <-- (root) and Top of stack
You pop the stack and store the result in curr. In traversal terms you are now in this state:
      0 <--
    /   \
   6     7
  / \   / \
 5   4 3   2
You then push curr.getLeft() onto the stack, followed by curr.getRight(). This leads to the following stack state:
1: 7 <--(curr.getRight()) <-- Top of stack
0: 6 <--(curr.getLeft())
Repeating the same step, we get the following traversal state:
      0
    /   \
   6     7 <--
  / \   / \
 5   4 3   2
and after adding the nodes:
2: 2 <-- Top of stack
1: 3
0: 6 <-- (initial getLeft())
As both nodes have no children, popping them from the stack and outputting them gets us to the following traversal state:
      0
    /   \
-->6     7
  / \   / \
 5   4 3   2
The rest of this is history ;)
As you specifically asked about a "correct" way (or ordering) for a DFS: there is none. You define which side you traverse depth-first.
It's a stack. What you push last will pop first.
There is no such thing as a "correct DFS ordering". The main idea of DFS is to go deep, visiting children before siblings for any given node. Once you have gone deep into the graph and encounter a leaf, you backtrack and examine the nearest sibling in the same way.
The way you choose which child to examine first results in different traversal orders (or trees). Needless to say, all traversal methods result in a spanning tree over the graph. Pre-order traversal, the one you are comparing with, is probably the best-known order for DFS (or at least this is what I have seen). Others are valid but less popular.
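A minimal Python sketch of how the push order alone decides the visiting order. The Node class and the dfs_order function are made up for this illustration (they are not the asker's Java classes); the tree is the one from the question.

```python
class Node:
    """Hypothetical binary tree node, used only for this illustration."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def dfs_order(root, left_first=True):
    """Iterative DFS; returns the values in the order they are visited."""
    order = []
    stack = [root] if root is not None else []
    while stack:
        node = stack.pop()
        order.append(node.value)
        # The stack is LIFO, so the child pushed last is visited first.
        if left_first:
            children = (node.right, node.left)
        else:
            children = (node.left, node.right)
        for child in children:
            if child is not None:
                stack.append(child)
    return order


tree = Node(0, Node(6, Node(5), Node(4)), Node(7, Node(3), Node(2)))
print(dfs_order(tree, left_first=True))   # [0, 6, 5, 4, 7, 3, 2]
print(dfs_order(tree, left_first=False))  # [0, 7, 2, 3, 6, 4, 5]
```

Both outputs are depth-first orders; the second one is what the code in the question produces.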
Here is pseudocode for DFS:
'''
This is much more of a traversal algorithm than search.
'''
Algorithm DFS(Tree):
    initialize stack to contain Tree.root()
    while stack is not empty:
        p = stack.pop()
        perform 'action' for p
        for each child 'c' in Tree.children(p):
            stack.push(c)
This will visit all the nodes of the tree, whether binary or not.
To implement search and return, modify the algorithm accordingly.
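One way the "search and return" modification could look, as a rough Python sketch. TreeNode and dfs_find are hypothetical names chosen for this example; the only change to the traversal is that it returns as soon as the target is popped.

```python
class TreeNode:
    """Hypothetical N-ary tree node for this sketch."""
    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []


def dfs_find(root, target):
    """Return the first node whose value equals target, or None."""
    stack = [root] if root is not None else []
    while stack:
        node = stack.pop()
        if node.value == target:
            return node  # stop early instead of visiting the rest
        # Push children in reverse so the leftmost child is popped first.
        stack.extend(reversed(node.children))
    return None


tree = TreeNode(1, [TreeNode(2, [TreeNode(4)]), TreeNode(3)])
found = dfs_find(tree, 4)
print(found.value if found else None)
```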
There are a few options to consider for your DFS algorithm:
First, use a backtracking algorithm to do the DFS.
If using backtracking, add only leftChild while traversing downwards. Otherwise, reverse the order in which you push() a node's children onto the Stack, i.e. rightChild first and then leftChild.
Again, if using backtracking, avoid cycles by creating a variable nodeVisited which is set to true once a node has been pushed onto the stack. This is not needed otherwise.
Try these changes, or let me know and I will post code for DFS.
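A rough sketch of the backtracking variant, reading "backtracking" here as a plain recursive DFS (that reading is my assumption). The graph is an adjacency dict invented for the example, with back edges to node 0 added on purpose so the visited set has something to do.

```python
def dfs_recursive(graph, node, visited, order):
    """Recursive DFS over an adjacency dict; visited guards against cycles."""
    visited.add(node)
    order.append(node)
    for neighbour in graph.get(node, []):
        if neighbour not in visited:
            dfs_recursive(graph, neighbour, visited, order)
    # Returning here is the "backtrack" step: control goes back to the caller.
    return order


# The question's tree, plus edges pointing back to the root to form cycles.
graph = {0: [6, 7], 6: [5, 4, 0], 7: [3, 2, 0]}
print(dfs_recursive(graph, 0, set(), []))  # [0, 6, 5, 4, 7, 3, 2]
```

For a real tree there are no cycles, so the visited set can be dropped.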
Given an N-ary tree, I have to generate all the leaf-to-leaf paths in it. Each path should also denote the direction. As an example:
Tree:
      1
     / \
    2   6
   / \
  3   4
 /
5
Paths:
5 UP 3 UP 2 DOWN 4
4 UP 2 UP 1 DOWN 6
5 UP 3 UP 2 UP 1 DOWN 6
These paths can be in any order, but all paths need to be generated.
I kind of see the pattern:
looks like I have to do in order traversal and
need to save what I have seen so far.
However, can't really come up with an actual working algorithm.
Can anyone nudge me to the correct algorithm?
I am not looking for the actual implementation, just the pseudo code and the conceptual idea would be much appreciated.
The first thing I would do is perform an in-order traversal. As a result, we accumulate all the leaves in order from the leftmost to the rightmost (in your case this would be [5,4,6]).
Along the way, I would also record the mapping between nodes and their parents so that we can perform DFS later. We can keep this mapping in a HashMap (or an analogue). Apart from this, we need a mapping between nodes and their priorities, which we can compute from the result of the in-order traversal. In your example the in-order sequence would be [5,3,2,4,1,6] and the list of priorities would be [0,1,2,3,4,5] respectively.
Here I assume that our node looks like this (we may not have the node -> parent mapping a priori):
class TreeNode {
    int val;
    TreeNode[] nodes;
    TreeNode(int x) {
        val = x;
    }
}
If we have n leaves, then we need to find n * (n - 1) / 2 paths. Obviously, once we have found a path from leaf A to leaf B, we can easily derive the path from B to A (by reversing it and swapping UP and DOWN).
Then we start traversing the array of leaves we computed earlier. For each leaf in the array we look for paths to the leaves situated to the right of the current one (since we have already found the paths from the leaves on its left to the current leaf).
To perform the DFS, we go upwards and, for each node encountered, check whether we can go to its children. We should NOT go to a child whose priority is less than the priority of the current leaf (doing so would lead us to paths we already have). In addition, we should not visit nodes we have already visited along the way.
As we are performing DFS from some node, we can maintain a structure that keeps the nodes we have come across so far (for instance, a StringBuilder if you program in Java). In our case, if we have reached leaf 4 from leaf 5, we accumulate the path = 5 UP 3 UP 2 DOWN 4. Since we have reached a leaf, we can discard the last visited node and proceed with the DFS and the path = 5 UP 3 UP 2.
There might be a more advanced technique for solving this problem, but I think it is a good starting point. I hope this approach will help you out.
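A minimal Python sketch that produces the UP/DOWN paths from the question. Note that it swaps in a simpler technique than the priority-pruned DFS described above: it keeps the parent map and the left-to-right leaf list, but then pairs the leaves and joins their ancestor chains at the lowest common ancestor. Node and leaf_to_leaf_paths are names I made up for the example.

```python
from itertools import combinations


class Node:
    """Hypothetical N-ary node for this sketch."""
    def __init__(self, val, *children):
        self.val = val
        self.children = list(children)


def leaf_to_leaf_paths(root):
    # One DFS pass: record each node's parent and collect leaves left to right.
    parent = {root: None}
    leaves = []
    stack = [root]
    while stack:
        node = stack.pop()
        if not node.children:
            leaves.append(node)
        for child in reversed(node.children):
            parent[child] = node
            stack.append(child)

    def chain(node):
        """Nodes from node up to the root, inclusive."""
        out = []
        while node is not None:
            out.append(node)
            node = parent[node]
        return out

    paths = []
    for a, b in combinations(leaves, 2):
        up, down = chain(a), chain(b)
        # Drop the shared ancestors from the root end; the last one dropped
        # is the lowest common ancestor of the two leaves.
        lca = None
        while up and down and up[-1] is down[-1]:
            lca = up.pop()
            down.pop()
        text = " UP ".join([str(n.val) for n in up] + [str(lca.val)])
        for n in reversed(down):
            text += " DOWN " + str(n.val)
        paths.append(text)
    return paths


tree = Node(1, Node(2, Node(3, Node(5)), Node(4)), Node(6))
for path in leaf_to_leaf_paths(tree):
    print(path)
```

This does more repeated climbing than the pruned DFS would, so treat it as a way to check results rather than as the efficient version.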
I didn't manage to create a solution without programming it out in Python. UNDER THE ASSUMPTION that I didn't overlook a corner case, my attempt goes like this:
In a depth-first search every node receives the down-paths, and either emits them (plus itself) if the node is a leaf or passes them on to its children. The only thing to consider is that a leaf node is the starting point of an up-path, so these up-paths are fed from the left children to the right children as well as returned to the parent node.
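def print_leaf2leaf(root, path_down):
    # Extend every path that is currently heading downwards with this node.
    for st in path_down:
        st.append(root)
    if all([x is None for x in root.children]):
        # Leaf: every incoming down-path is now a complete leaf-to-leaf path.
        for st in path_down:
            for n in st: print(n.d,end=" ")
            print()
        path_up = [[root]]
    else:
        path_up = []
        for child in root.children:
            # Up-paths from earlier children become down-paths for later ones.
            path_up += child is not None and [st+[root] for st in print_leaf2leaf(child, path_down + path_up)] or []
    # Backtrack: remove this node from the caller's paths again.
    for st in path_down:
        st.pop()
    return path_up
class node:
    def __init__(self,d,*children):
        self.d = d
        self.children = children
##          1
##        /   \
##       2     6
##      / \   /
##     3   4 7
##    /    / | \
##   5    8  9  10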
five = node(5)
three = node(3,five)
four = node(4)
two = node(2,three,four)
eight = node(8)
nine = node(9)
ten = node(10)
seven = node(7,eight,nine,ten)
six = node(6,None,seven)
one = node(1,two,six)
print_leaf2leaf(one,[])
I'm trying to define an algorithm that returns the number of leaves at the lowest level of a complete binary tree. By a complete binary tree, I mean a binary tree whose every level, except possibly the last, is filled, and all nodes in the last level are as far left as possible.
For example, if I had the following complete binary tree,
           _7_
          /   \
        4       9
       / \     / \
     2     6  8   10
    / \   /
   1   3 5
the algorithm would return '3' since there are three leaves at the lowest level of the tree.
I've been able to find numerous solutions for finding the count of all the leaves in regular or balanced binary trees, but so far I haven't had any luck with the particular case of finding the count of the leaves at the lowest level of a complete binary tree. Any help would be appreciated.
Do a breadth-first search, so that you can also find the number of nodes on each level.
Some pseudocode:
q <- new queue of (node, level) data
add (root, 0) in q
nodesPerLevel <- new vector of integers
while q is not empty:
    (currentNode, currentLevel) <- take from front of q
    nodesPerLevel[currentLevel] += 1
    for each child in currentNode's children:
        add (child, currentLevel + 1) in q
return last value of nodesPerLevel
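The pseudocode above could be sketched in Python roughly as follows. Node and lowest_level_count are hypothetical names; the list of per-level counts is grown on demand, which the pseudocode leaves implicit.

```python
from collections import deque


class Node:
    """Hypothetical binary tree node for this sketch."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def lowest_level_count(root):
    """Breadth-first search that counts nodes per level; returns the last count."""
    if root is None:
        return 0
    nodes_per_level = []
    queue = deque([(root, 0)])
    while queue:
        node, level = queue.popleft()
        if level == len(nodes_per_level):
            nodes_per_level.append(0)  # first node seen on a new level
        nodes_per_level[level] += 1
        for child in (node.left, node.right):
            if child is not None:
                queue.append((child, level + 1))
    return nodes_per_level[-1]


# The complete binary tree from the question.
tree = Node(7,
            Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5))),
            Node(9, Node(8), Node(10)))
print(lowest_level_count(tree))  # 3
```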
I need to define a seek(u,v) function, where u is the new node within the tree (the node where I want to start searching) and v is the number of descendants below the new node; the function should return the index of the highest key value. The tree doesn't have to be a BST, and there can be nodes with many children. Example:
input:
5 // tree with 5 nodes
1 3 5 2 7 // the nodes' keys
1 2 // 1 and 2 are linked
2 3 // 2 and 3 are linked
1 4 // 1 and 4 are linked
3 5 // 3 and 5 are linked
4 // # of seek() requests
2 3 // index 2 of tree, which would be key 5, 3 descendants from index 3
4 1 // index 4 of tree, but only return highest from index 4 to 4 (it would
// return itself)
3 2 // index 3, next 2 descendants
3 2 // same
output:
5 // Returned index 5 because the 7 is the highest key from array[3 'til 5]
4 // Returned index 4 because the range is one, plus 4's children are null
5 // Returned index 5 because the 7 is the highest key from array[4 'til 5]
5 // Same as prior line
I was thinking about putting the new root into a new Red-Black Tree, but I can't find a way to efficiently save successor or predecessor information for each node. I also thought about putting the nodes into an array, but because the tree is unbalanced and unsorted, the array is not guaranteed to be sorted, and because it's not a BST I can't perform an in-order tree walk. Any suggestions as to how I can get the highest key from a specific range?
I don't understand very well what you mean by "the number of descendants below the new node". The way you say it implies there is some sort of imposed tree walk, or at least an order in which you have to visit the nodes. In that case it would be best to explain more thoroughly what you mean.
In the rest of the answer I assume you mean distance from u.
From a purely algorithmic point of view, since you cannot assume anything about your tree, you have to visit all concerned vertices of the graph (i.e. vertices at a distance < v from u, so that v = 1 means u alone) to get your result. This means any partial tree traversal (such as depth-first or breadth-first) is both necessary and sufficient, since you have to visit all concerned nodes below u and the order in which you visit them doesn't matter.
If you can, it's simpler to use a recursive function seek'(u,v) which returns a pair (index, key) defined as follows:
if v > 1, seek'(u,v) is the pair with the largest second component among (u, key(u)) and seek'(w,v-1) for each child w of u.
else (v = 1), seek'(u,v) is (u, key(u)).
You then have seek(u,v) = first(seek'(u,v)).
All of what I said presumes you have built a tree from the input, or that you can easily get the key of a node and its sons from its index.
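A small Python sketch of this recursion, under the answer's "distance from u" reading. The children and keys dictionaries are my own encoding of the sample input, and rooting the tree at node 1 is an assumption (the input only lists undirected links).

```python
def seek_prime(children, keys, u, v):
    """Return the (index, key) pair with the largest key within v levels of u."""
    best = (u, keys[u])
    if v > 1:
        for w in children.get(u, []):
            candidate = seek_prime(children, keys, w, v - 1)
            if candidate[1] > best[1]:
                best = candidate
    return best


def seek(children, keys, u, v):
    """Index of the highest key, i.e. first(seek'(u, v))."""
    return seek_prime(children, keys, u, v)[0]


# Sample input: keys 1 3 5 2 7 on nodes 1..5, links 1-2, 2-3, 1-4, 3-5.
keys = {1: 1, 2: 3, 3: 5, 4: 2, 5: 7}
children = {1: [2, 4], 2: [3], 3: [5]}  # assumes node 1 is the root

for u, v in [(2, 3), (4, 1), (3, 2), (3, 2)]:
    print(seek(children, keys, u, v))
```

With this reading the four sample requests give 5, 4, 5, 5, which agrees with the expected output in the question.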
      1
    /   \
   2     3
  / \   / \
 4   5 6   7
For the given binary tree we need to create a matrix a[7][7] satisfying the ancestor property, e.g. a[2][1]=1 since 1 is an ancestor of 2.
I solved it by using an extra array; the solution I came up with is:
int a[n][n]={0};
void updatematrix(int a[][n],struct node *root,int temp[],int index){
    if(root == NULL)
        return ;
    int i;
    for(i=0;i< index;i++)
        a[root->data][temp[i]]=1;
    temp[index]=root->data;
    updatematrix(a,root->left,temp,index+1);
    updatematrix(a,root->right,temp,index+1);
}
Is there any mistake in my solution?
Can we do this in place (i.e. without using the temp array)?
temp contains the path from root to current node, i.e. the set of nodes visited while walking down the tree to arrive at the current node.
If you have a parent pointer in each node (but I guess not), you can follow those pointers and walk up the tree to traverse the same set of nodes as temp. But this uses more memory.
You can also walk down the tree several times (pseudocode):
def updatematrix(a,node):
    if node is null: return
    walkDown(a,node.data,node.left)
    walkDown(a,node.data,node.right)
    updatematrix(a,node.left)
    updatematrix(a,node.right)
def walkDown(a,data,node):
    if node is null: return
    a[node.data][data] = 1
    walkDown(a,data,node.left)
    walkDown(a,data,node.right)
Same complexity, but the pattern of memory access looks less cache friendly.
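A runnable Python sketch of the walk-down idea, for comparison. Node and ancestor_matrix are hypothetical names. Because the node values in the example run from 1 to 7, the sketch allocates an (n+1) x (n+1) matrix and leaves row and column 0 unused; that sizing is my choice, not something stated in the question.

```python
class Node:
    """Hypothetical binary tree node holding an integer label."""
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def ancestor_matrix(root, n):
    """a[x][y] == 1 exactly when y is a proper ancestor of x (labels 1..n)."""
    a = [[0] * (n + 1) for _ in range(n + 1)]

    def walk_down(ancestor, node):
        # Mark `ancestor` for every node in this subtree.
        if node is None:
            return
        a[node.data][ancestor] = 1
        walk_down(ancestor, node.left)
        walk_down(ancestor, node.right)

    def update(node):
        if node is None:
            return
        walk_down(node.data, node.left)
        walk_down(node.data, node.right)
        update(node.left)
        update(node.right)

    update(root)
    return a


tree = Node(1, Node(2, Node(4), Node(5)), Node(3, Node(6), Node(7)))
a = ancestor_matrix(tree, 7)
print(a[2][1], a[4][2], a[4][3])  # 1 1 0
```

No temp array is needed, at the price of re-walking each subtree once per ancestor.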
I have multiple binary trees stored as an array. In each slot is either nil (or null; pick your language) or a fixed tuple storing two numbers: the indices of the two "children". No node will have only one child -- it's either none or two.
Think of each slot as a binary node that only stores pointers to its children, and no inherent value.
Take this system of binary trees:
      0            1
     / \          / \
    2   3        4   5
   / \              / \
  6   7            8   9
     / \
    10  11
The associated array would be:
0 1 2 3 4 5 6 7 8 9 10 11
[ [2,3] , [4,5] , [6,7] , nil , nil , [8,9] , nil , [10,11] , nil , nil , nil , nil ]
I've already written simple functions to find direct parents of nodes (simply by searching from the front until there is a node that contains the child).
Furthermore, let us say that at relevant times, all trees are anywhere between a few and a few thousand levels deep.
I'd like to find a function
P(m,n)
to find the lowest common ancestor of m and n. To put it more formally, the LCA is defined as the "lowest", or deepest, node which has both m and n as descendants (children, or children of children, etc.). If there is none, a nil would be a valid return.
Some examples, given our given tree:
P( 6,11) # => 2
P( 3,10) # => 0
P( 8, 6) # => nil
P( 2,11) # => 2
The main method I've been able to find is one that uses an Euler trace, which turns the given tree (Adding node A as the invisible parent of 0 and 1, with a "value" of -1), into:
A-0-2-6-2-7-10-7-11-7-2-0-3-0-A-1-4-1-5-8-5-9-5-1-A
And from that, simply find the node between your given m and n that has the lowest number. For example, to find P(6,11), look for a 6 and an 11 on the trace. The number between them that is the lowest is 2, and that's your answer. If A (-1) is in between them, return nil.
-- Calculating P(6,11) --
A-0-2-6-2-7-10-7-11-7-2-0-3-0-A-1-4-1-5-8-5-9-5-1-A
      ^ ^        ^
      | |        |
      m lowest   n
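For reference, the trace-based lookup described above can be sketched in Python like this. The function names are invented for the example, -1 stands in for the invisible parent A, and the recursive visit would hit Python's default recursion limit on trees thousands of levels deep, so this only illustrates the idea.

```python
def find_roots(tree):
    """Indices that never appear as anyone's child."""
    kids = {c for pair in tree if pair is not None for c in pair}
    return [i for i in range(len(tree)) if i not in kids]


def euler_trace(tree, roots):
    """Euler trace of the forest, with -1 as the invisible common parent."""
    trace = [-1]

    def visit(i):
        trace.append(i)
        if tree[i] is not None:
            for child in tree[i]:
                visit(child)
                trace.append(i)  # come back to i after each child

    for root in roots:
        visit(root)
        trace.append(-1)
    return trace


def lca_by_trace(trace, m, n):
    """Smallest entry between an occurrence of m and one of n; None if it is -1."""
    i, j = sorted((trace.index(m), trace.index(n)))
    lowest = min(trace[i:j + 1])
    return None if lowest == -1 else lowest


tree = [[2, 3], [4, 5], [6, 7], None, None, [8, 9],
        None, [10, 11], None, None, None, None]
trace = euler_trace(tree, find_roots(tree))
print("-".join("A" if x == -1 else str(x) for x in trace))
print(lca_by_trace(trace, 6, 11))  # 2
```

Taking the minimum only works because every child has a larger number than its parent.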
Unfortunately, I do believe that finding the Euler trace of a tree that can be several thousand levels deep is a bit machine-taxing, and because my tree is constantly being changed throughout the course of the program, every time I wanted to find the LCA I'd have to re-calculate the Euler trace and hold it in memory.
Is there a more memory-efficient way, given the framework I'm using? One that maybe iterates upwards? One way I could think of would be to "count" the generation/depth of both nodes, climb from the deeper node until it matches the depth of the shallower one, and then move both up together until they meet.
But that'd involve climbing up from level, say, 3025, back to 0, twice, to count the generation, using a terribly inefficient climbing-up algorithm in the first place, and then re-climbing back up.
Are there any other better ways?
Clarifications
In the way this system is built, every child will have a number greater than their parents.
This does not guarantee that if n is in generation X, there are no nodes in generation (X-1) that are greater than n. For example:
0
/ \
/ \
/ \
1 2 6
/ \ / \ / \
2 3 9 10 7 8
/ \ / \
4 5 11 12
is a valid tree system.
Also, an artifact of the way the trees are built is that the two immediate children of the same parent will always be consecutively numbered.
Are the nodes ordered like in your example, where the children have a larger id than the parent? If so, you might be able to do something similar to the merge step of a merge sort to find them. For your example, the ancestor chains of 6 and 11 are:
6 -> 2 -> 0
11 -> 7 -> 2 -> 0
So perhaps the algorithm would be (assuming parent() returns nil for a root):
left = left_start
right = right_start
while left != nil and right != nil
    if left = right
        return left
    else if left > right
        left = parent(left)
    else
        right = parent(right)
return nil
Which would run as:
left right
---- -----
6 11 (right -> 7)
6 7 (right -> 2)
6 2 (left -> 2)
2 2 (return 2)
Is this correct?
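A rough Python sketch of this climbing idea, run against the array from the question. build_parents and lca are names I made up, and the precomputed parent table is my addition (the question searches the array for each parent instead); it would have to be updated whenever the forest changes. The approach relies on every child having a larger index than its parent: the larger of two different indices can never be an ancestor of the other, so it is always safe to replace it with its parent.

```python
def build_parents(tree):
    """One pass over the array: parents[i] is i's parent, or None for a root."""
    parents = [None] * len(tree)
    for index, kids in enumerate(tree):
        if kids is not None:
            for kid in kids:
                parents[kid] = index
    return parents


def lca(parents, m, n):
    """Climb from the larger index until both meet or one runs out of parents."""
    while m is not None and n is not None:
        if m == n:
            return m
        if m > n:
            m = parents[m]
        else:
            n = parents[n]
    return None


tree = [[2, 3], [4, 5], [6, 7], None, None, [8, 9],
        None, [10, 11], None, None, None, None]
parents = build_parents(tree)
for m, n in [(6, 11), (3, 10), (8, 6), (2, 11)]:
    print(m, n, lca(parents, m, n))
```

On the question's examples this gives 2, 0, nil (None) and 2.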
Maybe this will help: Dynamic LCA Queries on Trees, by Richard Cole and Ramesh Hariharan.
Abstract:
We show how to maintain a data structure on trees which allows for the following operations, all in worst-case constant time. 1. Insertion of leaves and internal nodes. 2. Deletion of leaves. 3. Deletion of internal nodes with only one child. 4. Determining the Least Common Ancestor of any two nodes.
Conference: Symposium on Discrete Algorithms - SODA 1999
I've solved your problem in Haskell. Assuming you know the roots of the forest, the solution takes time linear in the size of the forest and constant additional memory. You can find the full code at http://pastebin.com/ha4gqU0n.
The solution is recursive, and the main idea is that you can call a function on a subtree which returns one of four results:
The subtree contains neither m nor n.
The subtree contains m but not n.
The subtree contains n but not m.
The subtree contains both m and n, and the index of their least common ancestor is k.
A node without children may contain m, n, or neither, and you simply return the appropriate result.
If a node with index k has two children, you combine the results as follows:
join :: Int -> Result -> Result -> Result
join _ (HasBoth k) _ = HasBoth k
join _ _ (HasBoth k) = HasBoth k
join _ HasNeither r = r
join _ r HasNeither = r
join k HasLeft HasRight = HasBoth k
join k HasRight HasLeft = HasBoth k
After computing this result you have to check the index k of the node itself; if k is equal to m or n, you will "extend" the result of the join operation.
My code uses algebraic data types, but I've been careful to assume you need only the following operations:
Get the index of a node
Find out if a node is empty, and if not, find its two children
Since your question is language-agnostic I hope you'll be able to adapt my solution.
There are various performance tweaks you could put in. For example, if you find a root that has exactly one of the two nodes m and n, you can quit right away, because you know there's no common ancestor. Also, if you look at one subtree and it has the common ancestor, you can ignore the other subtree (that one I get for free using lazy evaluation).
Your question was primarily about how to save memory. If a linear-time solution is too slow, you'll probably need an auxiliary data structure. Space-for-time tradeoffs are the bane of our existence.
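Since the linked Haskell code is not reproduced here, this is only my Python approximation of the four-result idea, not a translation of it. The result is one of three string tags or a ("both", k) tuple; all names are invented for the sketch. Unlike the lazy Haskell version it always evaluates both subtrees, and the recursion would need to be made iterative for trees thousands of levels deep.

```python
NEITHER, HAS_M, HAS_N = "neither", "has_m", "has_n"


def join(k, left, right):
    """Combine the results of the two children of node k."""
    if isinstance(left, tuple):
        return left
    if isinstance(right, tuple):
        return right
    if left == NEITHER:
        return right
    if right == NEITHER:
        return left
    if left != right:
        return ("both", k)  # one side has m, the other has n
    return left


def search(tree, k, m, n):
    """Classify the subtree rooted at k with respect to m and n."""
    result = NEITHER
    if tree[k] is not None:
        a, b = tree[k]
        result = join(k, search(tree, a, m, n), search(tree, b, m, n))
    if isinstance(result, tuple):
        return result
    # "Extend" the joined result with node k itself.
    if k == m and k == n:
        return ("both", k)
    if k == m:
        return ("both", k) if result == HAS_N else HAS_M
    if k == n:
        return ("both", k) if result == HAS_M else HAS_N
    return result


def lca(tree, roots, m, n):
    for root in roots:
        result = search(tree, root, m, n)
        if isinstance(result, tuple):
            return result[1]
    return None


tree = [[2, 3], [4, 5], [6, 7], None, None, [8, 9],
        None, [10, 11], None, None, None, None]
print(lca(tree, [0, 1], 6, 11))  # 2
```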
I think that you can simply loop backwards through the array, always replacing the higher of the two indices by its parent, until they are either equal or no further parent is found:
(defun lowest-common-ancestor (array node-index-1 node-index-2)
  (cond ((or (null node-index-1)
             (null node-index-2))
         nil)
        ((= node-index-1 node-index-2)
         node-index-1)
        ((< node-index-1 node-index-2)
         (lowest-common-ancestor array
                                 node-index-1
                                 (find-parent array node-index-2)))
        (t
         (lowest-common-ancestor array
                                 (find-parent array node-index-1)
                                 node-index-2))))