Print all root to leaf paths with their relative positions - algorithm

Given a binary tree, how do we print the root to the leaf path, but add “_” to indicate the relative position?
Example:
Input : Root of below tree
        A
      /   \
     B     C
    / \   / \
   D   E F   G
Output : All root to leaf paths
_ _ A
_ B
D

_ A
B
_ E

A
_ C
F

A
_ C
_ _ G

You can use a preorder traversal to visit the tree, recording the path together with an indent for each node.
When visiting a left child, decrease the indent; when visiting a right child, increase it. This gives paths like:
(0, A), (-1, B), (-2, D)
(0, A), (-1, B), (0, E)
...
During the output phase, normalize each path: find the smallest indent among its nodes and shift every node by that amount, giving:
(2, A), (1, B), (0, D)
(1, A), (0, B), (1, E)
...
And then print the path accordingly.
Here's the sample code in python,
def travel(node, indent, path):
    if not node:
        return
    path.append((indent, node))
    if not node.left and not node.right:
        # leaf reached: the path from the root is complete
        print_path(path)
    travel(node.left, indent - 1, path)
    travel(node.right, indent + 1, path)
    del path[-1]

def print_path(path):
    # the smallest indent is <= 0 because the root has indent 0
    min_indent = abs(min(x[0] for x in path))
    for indent, node in path:
        # one '_' per column to the right of the leftmost node
        p = ['_'] * (min_indent + indent)
        p.append(node.val)
        print(' '.join(p))

The idea is based on printing a binary tree in vertical order.
1) Do a preorder traversal of the given binary tree. While traversing, recursively compute the horizontal distance (HD) of each node: pass 0 for the root, the parent's HD minus 1 for a left subtree, and the parent's HD plus 1 for a right subtree. Maintain a vector for the current path that stores, for each node, its HD and its key value, in the order in which the nodes appear on the path from root to leaf.
2) On reaching a leaf node during the traversal, print that path with underscores "_":
a) First find the minimum horizontal distance of the current path.
b) Then traverse the current path and, for each node,
print abs(current_node_HD - minimum_HD) underscores "_", then
print the current node's value.
Do this for every root-to-leaf path.
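The normalization in step 2 can be sketched in isolation. This is a minimal illustration, not the answerer's code: `normalize` and `render` are hypothetical helper names, and a path is assumed to be a list of `(hd, label)` pairs with string labels.

```python
def normalize(path):
    """Shift horizontal distances so the leftmost node of the path is at column 0."""
    min_hd = min(hd for hd, _ in path)
    return [(hd - min_hd, label) for hd, label in path]


def render(path):
    """Return one output line per node: one '_' per column, then the label."""
    return [" ".join(["_"] * hd + [label]) for hd, label in normalize(path)]


print(normalize([(0, "A"), (-1, "B"), (-2, "D")]))
# [(2, 'A'), (1, 'B'), (0, 'D')]
print("\n".join(render([(0, "A"), (-1, "B"), (0, "E")])))
# _ A
# B
# _ E
```

Subtracting the minimum HD is the same as adding its absolute value whenever the minimum is at most 0, which always holds here because the root has HD 0.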

I could not get Qiang Jin's response to work quite right. Switched a few things around.
class Node:
    def __init__(self, val):
        self.value = val
        self.right = None
        self.left = None

def printTreeRelativePaths(root):
    indent = 0
    path = []
    preOrder(root, indent, path)

def preOrder(node, indent, path):
    path.append((node, indent))
    # a node without children is a leaf: the current path is complete
    if not node.left and not node.right:
        processPath(path)
    if node.left:
        preOrder(node.left, indent - 1, path)
    if node.right:
        preOrder(node.right, indent + 1, path)
    del path[-1]

def processPath(path):
    # find the smallest (leftmost) indent on this path
    minIndent = 0
    for element in path:
        if element[1] < minIndent:
            minIndent = element[1]
    offset = abs(minIndent)
    for element in path:
        print('_' * (offset + element[1]) + element[0].value)

root = Node('A')
root.left = Node('B')
root.right = Node('C')
root.left.left = Node('D')
root.left.right = Node('E')
root.right.left = Node('F')
root.right.right = Node('G')
printTreeRelativePaths(root)

Related

Binary Tree - Most elegant way to traverse from the last level to root

I am looking for an implementation that allows me to traverse through a Binary Search tree, starting from the last level from left to right to the root, e.g.:
A
B C
D E G
Should return: [D, E, G, B, C, A]. I am interested in both, a recursive approach or an iterative approach.
I'm not sure whether my solution in Python is elegant enough, but maybe it will be helpful, nevertheless.
Introduction
Let's consider an example as follows:
    8
   / \
  5   10
 / \    \
4   6    12
The expected output is 4, 6, 12, 5, 10, 8. But how to achieve this?
Step 1 - BFS
Let's do a BFS with a slight modification - first traverse a right child, and then a left one.
def bfs(node):
    q = []
    q.append(node)
    while q:
        current = q.pop(0)
        print (current.value, end = ' ')
        if current.right:
            q.append(current.right)
        if current.left:
            q.append(current.left)
The output is as follows:
8, 10, 5, 12, 6, 4
The output is basically a reverse of the expected output!
Step 2 - Reverse BFS output
To do this, introduce a stack variable that saves the current element of the queue.
def bfsFromBottomToTop(node):
    q = []
    q.append(node)
    st = [] # create a stack variable
    while q:
        current = q.pop(0)
        st.append(current.value) # push the current element to the stack
        if current.right:
            q.append(current.right)
        if current.left:
            q.append(current.left)
Then, you can pop all elements off the stack at the end of the method as below:
...
    while st:
        print(st.pop(), end = ' ')
...
4 6 12 5 10 8
Full Code
Here's the full code that can be used for trying it out yourself.
class Node:
    def __init__(self, value):
        self.left = None
        self.right = None
        self.value = value

def insert(node, value):
    if node is None:
        return Node(value)
    if node.value > value:
        node.left = insert(node.left, value)
    else:
        node.right = insert(node.right, value)
    return node

def bfsFromBottomToTop(node):
    q = []
    q.append(node)
    st = []
    while q:
        current = q.pop(0)
        st.append(current.value)
        if current.right:
            q.append(current.right)
        if current.left:
            q.append(current.left)
    while st:
        print(st.pop(), end = ' ')

root = Node(8)
insert(root, 5)
insert(root, 10)
insert(root, 6)
insert(root, 4)
insert(root, 12)
bfsFromBottomToTop(root)
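One possible variation, offered only as a sketch: `q.pop(0)` on a Python list costs O(n) per call, so the same idea can be written with `collections.deque`, whose `popleft()` is O(1). The function name `bottom_up_level_order` and the `Node` constructor with optional children are assumptions made for this example; it returns the values as a list instead of printing them.

```python
from collections import deque


class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def bottom_up_level_order(root):
    """Return values from the last level up to the root, each level left to right."""
    if root is None:
        return []
    queue = deque([root])
    visited = []
    while queue:
        current = queue.popleft()  # O(1), unlike list.pop(0)
        visited.append(current.value)
        # Enqueue the right child first so that reversing the visit order
        # restores left-to-right order within each level.
        if current.right:
            queue.append(current.right)
        if current.left:
            queue.append(current.left)
    visited.reverse()
    return visited


root = Node(8, Node(5, Node(4), Node(6)), Node(10, None, Node(12)))
print(bottom_up_level_order(root))  # [4, 6, 12, 5, 10, 8]
```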

print all paths from root to leaves n-ary tree

I am trying to print all paths from root to all leaves in n-ary tree. This code prints the paths to the leaves, but it also prints subpaths too.
For example, let's say one path is 1-5-7-11. It prints 1-5-7-11, but it also prints 1-5-7, 1-5, so on.
How can I avoid printing these subpaths?
Here is my code in matlab
Thanks
stack=java.util.Stack();
stack.push(0);
CP = [];
Q = [];
labels = ones(1,size(output.vertices,2));
while ~stack.empty()
    x = stack.peek();
    for e = 1:size(output.edges,2)
        if output.edges{e}(1) == x && labels(output.edges{e}(2)+1) == 1
            w = output.edges{e}(2);
            stack.push(w);
            CP = union(CP,w);
            break
        end
    end
    if e == size(output.edges,2)
        Q = [];
        for v=1:size(CP,2)
            Q = union(Q,CP(v));
        end
        disp(Q)
        stack.pop();
        labels(x+1) = 0;
        CP = CP(find(CP~=x));
    end
end
Let's split the problem in two parts.
1. Find all leaf-nodes in a tree
    input: Tree (T), with nodes N
    output: subset of N (L), such that each node in L is a leaf
    initialize an empty stack
    push the root node on the stack
    while the stack is not empty
    do
        pop a node from the stack, call it M
        if M is a leaf, add it to L
        if M is not a leaf, push all its children on the stack
    done
2. Given a leaf, find its path to the root
    input: leaf node L
    output: a path from L to R, with R being the root of the tree
    initialize an empty list (P)
    while L is not the root of the tree
    do
        append L to P
        L = parent of L
    done
    append R to P
    return P
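The two parts above might look like this in Python. This is a sketch under assumptions: the `Node` class with a `parent` pointer and a `children` list is hypothetical (the question stores the tree as an edge list), and the tree built at the end only mirrors the 1-5-7-11 example from the question.

```python
class Node:
    def __init__(self, label, parent=None):
        self.label = label
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def find_leaves(root):
    """Part 1: collect every node that has no children."""
    leaves = []
    stack = [root]
    while stack:
        node = stack.pop()
        if node.children:
            # reversed() keeps the leaves in left-to-right order
            stack.extend(reversed(node.children))
        else:
            leaves.append(node)
    return leaves


def path_to_root(leaf):
    """Part 2: follow parent pointers from a leaf up to and including the root."""
    path = []
    node = leaf
    while node is not None:
        path.append(node.label)
        node = node.parent
    return path


def root_to_leaf_paths(root):
    """Combine both parts; each path is reversed to read root first."""
    return [path_to_root(leaf)[::-1] for leaf in find_leaves(root)]


n1 = Node(1)
n5 = Node(5, n1)
n7 = Node(7, n5)
Node(11, n7)
Node(12, n7)
Node(6, n1)
for p in root_to_leaf_paths(n1):
    print("-".join(map(str, p)))
# 1-5-7-11
# 1-5-7-12
# 1-6
```

Because a path is only produced once a leaf has been identified, no subpath such as 1-5-7 is ever printed.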

Find algorithm : Reconstruct a sequence with the minimum length combination of disjointed subsequences chosen from a list of subsequences

I do not know if it is appropriate to ask this question here, so sorry if it is not.
I have a sequence ALPHA, for example:
A B D Z A B X
I have a list of subsequences of ALPHA, for example:
A B D
B D
A B
D Z
A
B
D
Z
X
I am looking for an algorithm that finds the smallest set of disjoint subsequences that reconstructs ALPHA, for example in our case:
{A B D} {Z} {A B} {X}
Any ideas? My guess is something already exists.
You can transform this problem into finding a minimum path in a graph.
The nodes will correspond to prefixes of the string, including one for the empty string. There will be an edge from a node A to a node B if there is an allowed sub-sequence that, when appended to the string prefix A, the result is the string prefix B.
The question is now transformed into finding the minimum path in the graph starting from the node corresponding to the empty string, and ending in the node corresponding to the entire input string.
You can now apply e.g. BFS (since the edges have uniform costs), or Dijkstra's algorithm to find this path.
The following python code is an implementation based on the principles above:
def reconstruct(seq, subseqs):
    n = len(seq)
    # set of allowed subsequences, for fast membership tests
    d = set(subseqs)
    # in this solution, the node with value v will correspond
    # to the substring seq[0: v]. Thus node 0 corresponds to the empty string
    # and node n corresponds to the entire string
    # this will keep track of the predecessor for each node
    predecessors = [-1] * (n + 1)
    reached = [False] * (n + 1)
    reached[0] = True
    # initialize the queue and add the first node
    # (the node corresponding to the empty string)
    q = []
    qstart = 0
    q.append(0)
    while True:
        # test if we already found a solution
        if reached[n]:
            break
        # test if the queue is empty
        # (>= rather than >, otherwise q[qstart] below can go out of range)
        if qstart >= len(q):
            break
        # poll the first value from the queue
        v = q[qstart]
        qstart += 1
        # try appending a subsequence to the current node
        for n2 in range(1, n - v + 1):
            # the destination node was already added into the queue
            if reached[v + n2]:
                continue
            if seq[v: (v + n2)] in d:
                q.append(v + n2)
                predecessors[v + n2] = v
                reached[v + n2] = True
    if not reached[n]:
        return []
    # reconstruct the path, starting from the last node
    pos = n
    solution = []
    while pos > 0:
        solution.append(seq[predecessors[pos]: pos])
        pos = predecessors[pos]
    solution.reverse()
    return solution

print(reconstruct("ABDZABX", ["ABD", "BD", "AB", "DZ", "A", "B", "D", "Z", "X"]))
I don't have much experience with python, that's the main reason why I preferred to stick to the basics (e.g. implementing a queue with a list + an index to the start).

Improving the time complexity of DFS using recursion such that each node only works with its descendants

Problem
There is a perfectly balanced m-ary tree that is n levels deep. Each inner node has exactly m child nodes. The root is said to be at depth 0 and the leaf nodes are said to be at depth n, so there are exactly n ancestors of every leaf node. Therefore, the total number of nodes in the tree is:
T = 1 + m + m^2 + ... + m^n
  = (m^(n+1) - 1) / (m - 1)
Here is an example with m = 3 and n = 2.
                    a                       (depth 0)
         ___________|___________
        |           |           |
        b           c           d           (depth 1)
     ___|___     ___|___     ___|___
    |   |   |   |   |   |   |   |   |
    e   f   g   h   i   j   k   l   m       (depth 2)
I am writing a depth first search function to traverse the entire tree in deepest node first and leftmost node first manner, and insert the value of each node to an output list.
I wrote this function in two different ways and want to compare the time complexity of both functions.
Although this question is language agnostic, I am using Python code below to show my functions because Python code looks almost like pseudocode.
Solutions
The first function is dfs1. It accepts the root node as node argument and an empty output list as output argument. The function descends into the tree recursively, visits every node and appends the value of the node to the output list.
def dfs1(node, output):
    """Visit each node (DFS) and place its value in output list."""
    output.append(node.value)
    for child_node in node.children:
        dfs1(child_node, output)
The second function is dfs2. It accepts the root node as node argument but does not accept any list argument. The function descends into the tree recursively. At every level of recursion, on visiting every node, it creates a list with the value of the current node and all its descendants and returns this list to the caller.
def dfs2(node):
    """Visit nodes (DFS) and return list of values of visited nodes."""
    output = [node.value]
    for child_node in node.children:
        for s in dfs2(child_node):
            output.append(s)
    return output
Analysis
There are two variables that define the problem size.
m -- The number of child nodes each inner node has.
n -- The number of ancestors each leaf node has (height of the tree).
In dfs1, O(1) time is spent while visiting each node, so the total time spent in visiting all nodes is
O(1 + m + m^2 + ... + m^n).
I don't bother about simplifying this expression further.
In dfs2, the time spent while visiting each node is directly proportional to all leaf nodes reachable from that node. In other words, the time spent while visiting a node at depth d is O(m^(n - d)). Therefore, the total spent time in visiting all nodes is
1 * O(m^n) + m * O(m^(n - 1)) + m^2 * O(m^(n - 2)) + ... + m^n * O(1)
= (n + 1) * O(m^n)
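As a rough check of the analysis above, the following sketch counts how many times a value is written into a list by each function, without building the lists. The helper names `build`, `dfs1_appends` and `dfs2_writes` are hypothetical, and "one write per element placed in a list" is an assumed cost model.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.children = []


def build(m, n):
    """Build a perfectly balanced m-ary tree of depth n (all values are 0)."""
    node = Node(0)
    if n > 0:
        node.children = [build(m, n - 1) for _ in range(m)]
    return node


def dfs1_appends(node):
    """dfs1 appends each node's value exactly once."""
    return 1 + sum(dfs1_appends(child) for child in node.children)


def dfs2_writes(node):
    """Return (total element writes, size of returned list) for dfs2.

    Each call writes its own value once and then copies every element
    of every list returned by its children.
    """
    writes, size = 1, 1
    for child in node.children:
        child_writes, child_size = dfs2_writes(child)
        writes += child_writes + child_size
        size += child_size
    return writes, size


tree = build(3, 2)
print(dfs1_appends(tree), dfs2_writes(tree)[0])  # 13 34
```

For m = 3 and n = 2 a node at depth d is written d + 1 times by dfs2, so the total is 1*1 + 3*2 + 9*3 = 34, compared with 13 for dfs1.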
Question
Is it possible to write dfs2 in such a manner that its time complexity is
O(1 + m + m^2 + ... + m^n)
without changing the essence of the algorithm, i.e. each node only creates an output list for itself and all its descendants, and does not have to bother with a list that may have values of its ancestors?
Complete working code for reference
Here is a complete Python code that demonstrates the above functions.
class Node:
    def __init__(self, value):
        """Initialize current node with a value."""
        self.value = value
        self.children = []

    def add(self, node):
        """Add a new node as a child to current node."""
        self.children.append(node)

def make_tree():
    """Create a perfectly balanced m-ary tree with depth n.
    (m = 3 and n = 2)
                        1                       (depth 0)
             ___________|___________
            |           |           |
            2           3           4           (depth 1)
         ___|___     ___|___     ___|___
        |   |   |   |   |   |   |   |   |
        5   6   7   8   9   10  11  12  13      (depth 2)
    """
    # Create the nodes
    a = Node( 1)
    b = Node( 2); c = Node( 3); d = Node( 4)
    e = Node( 5); f = Node( 6); g = Node( 7)
    h = Node( 8); i = Node( 9); j = Node(10)
    k = Node(11); l = Node(12); m = Node(13)
    # Create the tree out of the nodes
    a.add(b); a.add(c); a.add(d)
    b.add(e); b.add(f); b.add(g)
    c.add(h); c.add(i); c.add(j)
    d.add(k); d.add(l); d.add(m)
    # Return the root node
    return a

def dfs1(node, output):
    """Visit each node (DFS) and place its value in output list."""
    output.append(node.value)
    for child_node in node.children:
        dfs1(child_node, output)

def dfs2(node):
    """Visit nodes (DFS) and return list of values of visited nodes."""
    output = [node.value]
    for child_node in node.children:
        for s in dfs2(child_node):
            output.append(s)
    return output

a = make_tree()
output = []
dfs1(a, output)
print(output)
output = dfs2(a)
print(output)
Both dfs1 and dfs2 functions produce the same output.
[1, 2, 5, 6, 7, 3, 8, 9, 10, 4, 11, 12, 13]
[1, 2, 5, 6, 7, 3, 8, 9, 10, 4, 11, 12, 13]
If the output list in dfs1 is passed by reference, the complexity of dfs1 is O(total nodes).
In dfs2, by contrast, the output list is returned and its elements are appended one by one to the parent's output list, which takes O(size of list) for each return and increases the overall complexity. You can avoid this overhead if both appending and returning the output list take constant time.
This can be done if your output list is a double-ended linked list (one that keeps both a head and a tail pointer). You can then return a reference to the output list and, instead of appending element by element, concatenate the two linked lists, which is O(1).
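A minimal sketch of that idea, assuming a hand-rolled singly linked list with head and tail pointers (the `LinkedList` class and `dfs2_linked` are hypothetical names, not part of the original question or of any library):

```python
class LinkedList:
    """Linked list that keeps head and tail pointers; a cell is [value, next]."""

    def __init__(self, value):
        self.head = self.tail = [value, None]

    def extend(self, other):
        """Concatenate other onto self in O(1). other must not be reused afterwards."""
        self.tail[1] = other.head
        self.tail = other.tail

    def __iter__(self):
        cell = self.head
        while cell is not None:
            yield cell[0]
            cell = cell[1]


class Node:
    def __init__(self, value):
        self.value = value
        self.children = []


def dfs2_linked(node):
    """Like dfs2, but joins each child's list in O(1) instead of copying it."""
    output = LinkedList(node.value)
    for child in node.children:
        output.extend(dfs2_linked(child))
    return output


# Same 13-node tree as in the question (m = 3, n = 2).
root = Node(1)
next_value = 5
for v in (2, 3, 4):
    child = Node(v)
    root.children.append(child)
    for _ in range(3):
        child.children.append(Node(next_value))
        next_value += 1

print(list(dfs2_linked(root)))
# [1, 2, 5, 6, 7, 3, 8, 9, 10, 4, 11, 12, 13]
```

Each node still builds a list only for itself and its descendants, but every call now does O(1) work per child, so the total is proportional to the number of nodes.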

Best way to find the most costly path in graph

I have a directed acyclic graph in which every vertex has a weight >= 0. One vertex is the "start" of the graph and another vertex is the "end" of the graph. The idea is to find the path from the start to the end whose sum of vertex weights is the greatest. For example, I have the following graph:
I(0) -> V1(3) -> F(0)
I(0) -> V1(3) -> V2(1) -> F(0)
I(0) -> V3(0.5) -> V2(1) -> F(0)
The most costly path would be I(0) -> V1(3) -> V2(1) -> F(0), whose cost is 4.
Right now, I am using BFS to just enumerate every path from I to F as in the example above, and then, I choose the one with the greatest sum. I am afraid this method can be really naive.
Is there a better algorithm to do this? Can this problem be reduced to another one?
Since your graph has no cycles* , you can negate the weights of your edges, and run Bellman-Ford's algorithm.
* Shortest path algorithms such as Floyd-Warshall and Bellman-Ford do not work on graphs with negative cycles, because you can build a path of arbitrarily small weight by staying in a negative cycle.
You can perform a topological sort, then iterate through the list of vertices returned by the topological sort, from the start vertex to the end vertex and compute the costs. For each directed edge of the current vertex check if you can improve the cost of destination vertex, then move to the next one. At the end cost[end_vertex] will contain the result.
class grph:
    def __init__(self):
        self.no_nodes = 0
        self.a = []

    def build(self, path):
        # first line: number of nodes; remaining lines: "source target cost"
        with open(path, "r") as file:
            # parse the whole first line, not only its first character
            self.no_nodes = int(file.readline().strip())
            # adjacency matrix; 10000 marks "no edge"
            self.a = []
            for i in range(self.no_nodes):
                self.a.append([10000] * self.no_nodes)
            for line in file:
                package = line.split()
                if not package:
                    continue
                source = int(package[0])
                target = int(package[1])
                cost = int(package[2])
                self.a[source][target] = cost

def tarjan(graph):
    # topological sort by DFS; returns [] if a cycle is found
    visited = [0] * graph.no_nodes
    path = []
    for i in range(graph.no_nodes):
        if visited[i] == 0:
            if not dfs(graph, visited, path, i):
                return []
    return path[::-1]

def dfs(graph, visited, path, start):
    # visited: 0 = unvisited, 1 = on the current DFS stack, 2 = finished
    visited[start] = 1
    for i in range(graph.no_nodes):
        if graph.a[start][i] != 10000:
            if visited[i] == 1:
                return False
            elif visited[i] == 0:
                visited[i] = 1
                if not dfs(graph, visited, path, i):
                    return False
    visited[start] = 2
    path.append(start)
    return True

def lw(graph, start, end):
    topological_sort = tarjan(graph)
    # -inf marks vertices not known to be reachable from start
    costs = [float('-inf')] * graph.no_nodes
    costs[start] = 0
    i = 0
    while i < len(topological_sort) and topological_sort[i] != start:
        i += 1
    while i < len(topological_sort) and topological_sort[i] != end:
        vertex = topological_sort[i]
        # skip vertices that cannot be reached from start
        if costs[vertex] != float('-inf'):
            for destination in range(graph.no_nodes):
                if graph.a[vertex][destination] != 10000:
                    new_cost = costs[vertex] + graph.a[vertex][destination]
                    if new_cost > costs[destination]:
                        costs[destination] = new_cost
        i += 1
    return costs[end]
Input file:
6
0 1 6
1 2 2
3 0 10
1 4 4
2 5 9
4 2 2
0 2 10
In general, the longest path problem is NP-hard, but since the graph is a DAG, it can be solved by first negating the weights and then running a shortest-path algorithm.
Because the weights reside on the vertices, before computing, you might want to move the weights to the in edges of the vertices.
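For the vertex-weighted case in the question, a sketch that keeps the weights on the vertices and relaxes them in topological order could look like this. The function name `longest_path`, the dictionary-based graph representation and the use of Kahn's algorithm for the topological order are choices made for this illustration, not taken from the answers above.

```python
from collections import deque


def longest_path(weights, edges, start, end):
    """Return (cost, path) of the heaviest start-to-end path in a DAG.

    weights: dict mapping vertex -> weight (assumed >= 0)
    edges:   list of (u, v) pairs, each a directed edge u -> v
    """
    successors = {v: [] for v in weights}
    indegree = {v: 0 for v in weights}
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1

    # Topological order via Kahn's algorithm.
    order = []
    ready = deque(v for v in weights if indegree[v] == 0)
    while ready:
        u = ready.popleft()
        order.append(u)
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)

    # best[v] = largest vertex-weight sum of any path from start to v.
    best = {v: float("-inf") for v in weights}
    best[start] = weights[start]
    pred = {}
    for u in order:
        if best[u] == float("-inf"):
            continue  # u is not reachable from start
        for v in successors[u]:
            if best[u] + weights[v] > best[v]:
                best[v] = best[u] + weights[v]
                pred[v] = u

    if best[end] == float("-inf"):
        return None, []
    path = [end]
    while path[-1] != start:
        path.append(pred[path[-1]])
    return best[end], path[::-1]


weights = {"I": 0, "V1": 3, "V2": 1, "V3": 0.5, "F": 0}
edges = [("I", "V1"), ("V1", "F"), ("V1", "V2"),
         ("V2", "F"), ("I", "V3"), ("V3", "V2")]
print(longest_path(weights, edges, "I", "F"))
# (4, ['I', 'V1', 'V2', 'F'])
```

Adding `weights[v]` when relaxing the edge u -> v has the same effect as moving each vertex weight onto its incoming edges.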
