In a generic tree represented by the common node structure with parent and child pointers, how can one find a list of paths that share no edges with each other and each terminate at a leaf node?
For example, given a tree like this:
        1
      / | \
     2  3  4
    / \ |  / \
   5  6 7 8   9
The desired output would be a list of paths as follows:
1  2  1  1  4
|  |  |  |  |
2  6  3  4  9
|     |  |
5     7  8
Or in list form:
[[1, 2, 5], [2, 6], [1, 3, 7], [1, 4, 8], [4, 9]]
Obviously the path lists themselves and their order can vary based on the order of processing of tree branches. For example, the following is another possible solution if we process right branches first:
[[1, 4, 9], [4, 8], [1, 3, 7], [1, 2, 6], [2, 5]]
For the sake of this question, no specific order is required.
You can use a recursive DFS algorithm with some modifications.
You didn't say what language you use, so, I hope that C# is OK for you.
Let's define a class for our tree node:
public class Node
{
    public int Id;
    public bool UsedOnce = false;
    public bool Visited = false;
    public Node[] Children;
}
Take a look at the UsedOnce variable; it can look pretty ambiguous.
UsedOnce is true if this node has been used once in an output. Since we have a tree, it also means that the edge from this node to its parent has been used once in an output (in a tree, every node has only one parent edge). Read this carefully to avoid confusion later.
Here we have a simple, basic depth-first search algorithm implementation.
All the magic will be covered in an output method.
List<Node> currentPath = new List<Node>(); // list of visited nodes
public void DFS(Node node)
{
    if (node.Children.Length == 0) // if it is a leaf (no children) - output
    {
        OutputAndMarkAsUsedOnce(); // Here goes the magic...
        return;
    }
    foreach (var child in node.Children)
    {
        if (!child.Visited) // for every not visited children call DFS
        {
            child.Visited = true;
            currentPath.Add(child);
            DFS(child);
            currentPath.Remove(child);
            child.Visited = false;
        }
    }
}
If OutputAndMarkAsUsedOnce just output the contents of currentPath, then we would have a plain DFS output like this:
1 2 5
1 2 6
1 3 7
1 4 8
1 4 9
Now, we need to use our UsedOnce. Let's find the last used-once node (one which has already appeared in an output) in the current path and output the path from this node inclusive; if no node is marked yet, output the whole path. The output always contains at least one new edge because the last node in the path, the leaf, has never been met before and cannot be marked as used once.
For instance, if the current path is "1 2 3 4 5" and 1, 2, 3 are marked as used once - then output "3 4 5".
In your example:
We are at "1 2 5". All of them are unused, output "1 2 5" and mark 1, 2, 5 as used once
Now, we are at "1 2 6". 1, 2 are used - 2 is the last one. Output from 2 inclusively, "2 6", mark 2 and 6 as used.
Now we are at "1 3 7", 1 is used, the only and the last. Output from 1 inclusively, "1 3 7". Mark 1, 3, 7 as used.
Now we are at "1 4 8". 1 is used, the only and the last. Output "1 4 8".
Now we are at "1 4 9". 1, 4 are used. Output from 4 - "4 9".
It works because in a tree "used node" means "used (the only parent) edge between it and its parent". So, we actually mark used edges and do not output them again.
For example, when we mark 2, 5 as used - it means that we mark edges 1-2 and 2-5. Then, when we go for "1 2 6" - we don't output edges "1-2" because it is used, but output "2-6".
Marking root node (node 1) as used once doesn't affect the output because its value is never checked. It has a physical explanation - root node has no parent edge.
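The output step described above is not shown in the snippets, so here is a minimal Python sketch of the whole idea. It is only an illustration under a few assumptions: the names edge_disjoint_paths, used_once and node_id are made up for this sketch, the root is put on the path before the search starts, and paths are collected into a list instead of being printed.

```python
class Node:
    def __init__(self, node_id, children=None):
        self.node_id = node_id
        self.children = children or []
        self.used_once = False  # True once this node (its parent edge) has been output


def edge_disjoint_paths(root):
    """Collect leaf-terminated paths that share no edge (DFS plus used-once marks)."""
    result = []
    current_path = [root]  # assumption: the root is on the path from the start

    def output_and_mark_as_used_once():
        # Start at the deepest node already marked, or at the first node if none is.
        start = 0
        for i, node in enumerate(current_path):
            if node.used_once:
                start = i
        result.append([node.node_id for node in current_path[start:]])
        for node in current_path[start:]:
            node.used_once = True

    def dfs(node):
        if not node.children:  # leaf: emit the unused tail of the path
            output_and_mark_as_used_once()
            return
        for child in node.children:
            current_path.append(child)
            dfs(child)
            current_path.pop()

    dfs(root)
    return result


tree = Node(1, [
    Node(2, [Node(5), Node(6)]),
    Node(3, [Node(7)]),
    Node(4, [Node(8), Node(9)]),
])
print(edge_disjoint_paths(tree))
# [[1, 2, 5], [2, 6], [1, 3, 7], [1, 4, 8], [4, 9]]
```

Note that the marks stay on the nodes, so calling the function twice on the same tree gives a different second result.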
Sorry for a poor explanation. It is pretty difficult to explain an algorithm on trees without drawing :) Feel free to ask any questions concerning algorithms or C#.
Here is the working IDEOne demo.
P.S. This code is, probably, not a good and proper C# code (avoided auto-properties, avoided LINQ) in order to make it understandable to other coders.
Of course, this algorithm is not perfect - we can remove currentPath because in a tree the path is easily recoverable; we can improve output; we can encapsulate this algorithm in a class. I just have tried to show the common solution.
This is a tree. The other solutions probably work but are unnecessarily complicated. Represent a tree structure in Python.
class Node:
    def __init__(self, label, children):
        self.label = label
        self.children = children
Then the tree
  1
 / \
2   3
   / \
  4   5
is Node(1, [Node(2, []), Node(3, [Node(4, []), Node(5, [])])]). Make a recursive procedure as follows. We guarantee that the root appears in the first path.
def disjointpaths(node):
    if node.children:
        paths = []
        for child in node.children:
            childpaths = disjointpaths(child)
            childpaths[0].insert(0, node.label)
            paths.extend(childpaths)
        return paths
    else:
        return [[node.label]]
This can be optimized (first target: stop inserting at the front of a list).
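One possible way to address that first target, sketched under the assumption that building each path leaf-first and reversing it once at the end is acceptable. The helper name collect and the function name disjoint_paths are invented for this sketch; the decomposition it returns is meant to be the same as the recursive version above.

```python
class Node:
    def __init__(self, label, children):
        self.label = label
        self.children = children


def disjoint_paths(root):
    """Build every path leaf-first with O(1) appends, then reverse each path once."""

    def collect(node):
        # Returns reversed paths; the first one is still open and ends at `node`.
        if not node.children:
            return [[node.label]]
        paths = []
        for child in node.children:
            child_paths = collect(child)
            child_paths[0].append(node.label)  # append instead of insert(0, ...)
            paths.extend(child_paths)
        return paths

    return [path[::-1] for path in collect(root)]


tree = Node(1, [Node(2, []), Node(3, [Node(4, []), Node(5, [])])])
print(disjoint_paths(tree))  # [[1, 2], [1, 3, 4], [3, 5]]
```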
For all vertices, if the vertex is a leaf (has no child pointers), go through the parent chain until you find a marked vertex or a vertex with no parent. Mark all visited vertices. Collect the vertices in an intermediate list, then reverse it and add it to the result.
If you cannot add a mark to the vertex object itself, you may implement the marking as a separate set of visited vertices and consider all the vertices added to the set as marked.
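A rough Python sketch of this parent-chain walk, using the separate set for marks. The Node class with a parent pointer and the function name paths_from_leaves are assumptions made for the example. The sketch also assumes the already-marked vertex is included as the first element of the path, since otherwise the edge leading down from it would be lost.

```python
class Node:
    def __init__(self, label, parent=None):
        self.label = label
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def paths_from_leaves(vertices):
    """Walk up from every leaf until a marked vertex or the root, then reverse."""
    marked = set()  # separate mark set, so the vertex objects stay untouched
    result = []
    for vertex in vertices:
        if vertex.children:
            continue  # not a leaf
        chain = []
        current = vertex
        while current is not None:
            chain.append(current.label)
            if current in marked:
                break  # keep the marked vertex as the start of the path, then stop
            marked.add(current)
            current = current.parent
        result.append(chain[::-1])
    return result


one = Node(1)
two, three, four = Node(2, one), Node(3, one), Node(4, one)
five, six = Node(5, two), Node(6, two)
seven = Node(7, three)
eight, nine = Node(8, four), Node(9, four)
vertices = [one, two, three, four, five, six, seven, eight, nine]
print(paths_from_leaves(vertices))
# [[1, 2, 5], [2, 6], [1, 3, 7], [1, 4, 8], [4, 9]]
```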
This can be very easily accomplished using DFS.
We call the DFS from root.
DFS(root,list)
where the list initially contains
list = {root}
Now the algorithm is as follows:
DFS(ptr,list)
{
    if(ptr is a leaf)
        print the list and return
    else
    {
        for ith children of ptr do
        {
            if(ptr is root)
            {
                add the child to list
                DFS(ith child of ptr,list)
                remove the added child and everything added after it, so that list = {root} again
            }
            else if(i equals 1 that is first child)
            {
                add the child to list
                DFS(ith child of ptr,list)
            }
            else
            {
                initialize a new empty list list2
                add the ptr node and then the ith child to list2
                DFS(ith child of ptr,list2)
            }
        }
    }
}
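For reference, a Python rendering of this pseudocode might look like the sketch below. It is an illustration only: the names Node, dfs and disjoint_paths are invented here, paths are collected rather than printed, and the list is truncated back to just the root after each of the root's subtrees, because the first-child branch leaves descendants on the list.

```python
class Node:
    def __init__(self, label, children=None):
        self.label = label
        self.children = children or []


def dfs(node, path, root, out):
    if not node.children:
        out.append(list(path))  # "print the list", collected instead of printed
        return
    for i, child in enumerate(node.children):
        if node is root:
            path.append(child.label)
            dfs(child, path, root, out)
            del path[1:]  # back to just the root before the next subtree
        elif i == 0:
            # The first child keeps extending the current path.
            path.append(child.label)
            dfs(child, path, root, out)
        else:
            # Every other child starts a new path at its parent.
            dfs(child, [node.label, child.label], root, out)


def disjoint_paths(root):
    out = []
    dfs(root, [root.label], root, out)
    return out


tree = Node(1, [
    Node(2, [Node(5), Node(6)]),
    Node(3, [Node(7)]),
    Node(4, [Node(8), Node(9)]),
])
print(disjoint_paths(tree))
# [[1, 2, 5], [2, 6], [1, 3, 7], [1, 4, 8], [4, 9]]
```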
I transformed a graph with cycles and multiple parents to XML such that I can use XQuery on it.
The graph is on the left and the XML-tree is on the right.
I transformed the graph by writing down all child nodes from the first node (node 1) and repeat that on the returned nodes until no more children exist or a node has already been visited (like node 2).
Furthermore, I added the constraint that all nodes with the same number have to be selected if one of them is selected. (For example, if node 2 (child of 1) is selected, then we also have to select node 2 (child of 6) in the XML-tree.)
The operations I can use on the graph are: getParents, getChildren, readValue(node).
In the graph, all information is stored in the node, and in the XML-tree all information of a node is stored as attributes.
My Question: I want to synchronize both structures, such that I can apply an axis like ancestor (or descendant) on the graph and on the XML-tree and get the same result. (I can parse the graph with Python and the XML-tree with XQuery.)
My Problem: If I select node 8 on the graph and apply the ancestor function, it'll return: 4, 5, 2, 1, 6, 3 (6 and 3 because of the cycle).
The ancestor axis on the XML-tree would return (we have to select both 8s): 4, 5, 2, 1 (the second 2, (child of 6) would also be selected due to the constraint, but not node 6 and 3).
My Solution: Changing the ancestor axis such that it returns all parents of the selected nodes, then applies the constraint and then selects again all parents and so on. But this solution seems to be very complicated and inefficient. Is there any better way?
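On the graph side, the "parents, then parents again" idea amounts to a breadth-first walk over parent links with a visited set. A minimal Python sketch follows; the function name ancestors and the get_parents callback are placeholders for the real getParents operation, and the parent map is hypothetical, chosen only so that the result matches the order quoted above (the actual graph was given as a picture).

```python
from collections import deque


def ancestors(start, get_parents):
    """All nodes reachable by repeatedly following parent links, in BFS order."""
    seen = {start}  # the visited set is what keeps cycles from looping forever
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for parent in get_parents(node):
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


# Hypothetical parent map, NOT taken from the original picture.
parents = {8: [4, 5], 4: [2], 5: [2], 2: [1, 6], 6: [3], 3: [1]}
print(ancestors(8, lambda n: parents.get(n, [])))  # [4, 5, 2, 1, 6, 3]
```

Each node is expanded at most once, so the cost is linear in the number of nodes and parent links that are reachable.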
Thanks for your help
I think this is not easy to solve for that particular format with XSLT/XQuery/XPath: the document order imposed by most step expressions and by except or intersect, and the arbitrary order that XQuery grouping gives, make it hard to establish the nodes you want in the order they are traversed. The easiest I could come up with is:
declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";
declare option output:method 'text';
declare option output:item-separator ', ';
declare variable $main-root := /;
declare function local:eliminate-duplicates($nodes as node()*) as node()*
{
  for $node at $p in $nodes
  group by $id := generate-id($node)
  order by head($p)
  return head($node)
};
declare function local:get-parents($nodes as element(node)*, $collected as element(node)*) as element(node)*
{
  let $new-parents :=
    for $p in local:eliminate-duplicates($nodes ! ..)
    return $main-root//node[@value = $p/@value][not(. intersect $collected)]
  return
    if ($new-parents)
    then local:get-parents($new-parents, ($collected, $new-parents))
    else $collected
};
local:get-parents(//node[@value = 8], ()) ! @value ! string()
https://xqueryfiddle.liberty-development.net/gWmuPs8 gives 4, 5, 2, 2, 1, 6, 3.
How efficiently that works will partly depend on any index used for the node[@value = $p/@value] comparison. In XSLT you could ensure that with a key (https://xsltfiddle.liberty-development.net/aiyneS); in database-oriented XQuery processors, probably with an attribute-based index.
Given an N-ary tree, I have to generate all the leaf-to-leaf paths. The path should also denote the direction. As an example:
Tree:
      1
     / \
    2   6
   / \
  3   4
 /
5
Paths:
5 UP 3 UP 2 DOWN 4
4 UP 2 UP 1 DOWN 6
5 UP 3 UP 2 UP 1 DOWN 6
These paths can be in any order, but all paths need to be generated.
I kind of see the pattern:
looks like I have to do in order traversal and
need to save what I have seen so far.
However, can't really come up with an actual working algorithm.
Can anyone nudge me to the correct algorithm?
I am not looking for the actual implementation, just the pseudo code and the conceptual idea would be much appreciated.
The first thing I would do is to perform an in-order traversal. As a result, we will accumulate all the leaves in order from the leftmost to the rightmost. (In your case this would be [5,4,6].)
Along the way, I would also record the mapping from each node to its parent so that we can perform DFS later. We can keep this mapping in a HashMap (or its analogue). Apart from this, we will need a mapping from each node to its priority, which we can compute from the result of the in-order traversal. In your example the in-order would be [5,3,2,4,1,6] and the list of priorities would be [0,1,2,3,4,5] respectively.
Here I assume that our node looks like(we may not have the mapping node -> parent a priori):
class TreeNode {
    int val;
    TreeNode[] nodes;
    TreeNode(int x) {
        val = x;
    }
}
If we have n leaves, then we need to find n * (n - 1) / 2 paths. Obviously, if we have managed to find a path from leaf A to leaf B, then we can easily calculate the path from B to A. (by transforming UP -> DOWN and vice versa)
Then we start traversing over the array of leaves we computed earlier. For each leaf in the array we should be looking for paths to leaves which are situated to the right of the current one. (since we have already found the paths from the leftmost nodes to the current leaf)
To perform the dfs search, we should be going upwards and for each encountered node check whether we can go to its children. We should NOT go to a child whose priority is less than the priority of the current leaf. (doing so will lead us to the paths we already have) In addition to this, we should not visit nodes we have already visited along the way.
As we are performing dfs from some node, we can maintain a certain structure to keep the nodes(for instance, StringBuilder if you program in Java) we have come across so far. In our case, if we have reached leaf 4 from leaf 5, we accumulate the path = 5 UP 3 UP 2 DOWN 4. Since we have reached a leaf, we can discard the last visited node and proceed with dfs and the path = 5 UP 3 UP 2.
There might be a more advanced technique for solving this problem, but I think it is a good starting point. I hope this approach will help you out.
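If it helps to see the expected output generated, here is a small Python sketch. Note that it is not the priority-based DFS described above but a simpler variant: it collects the root-to-leaf paths and joins each pair of them at their lowest common ancestor. The names TreeNode and leaf_to_leaf_paths are chosen for this sketch, and it assumes node values are unique.

```python
from itertools import combinations


class TreeNode:
    def __init__(self, val, children=None):
        self.val = val
        self.children = children or []


def leaf_to_leaf_paths(root):
    root_paths = []  # one root-to-leaf path per leaf, left to right

    def walk(node, prefix):
        prefix = prefix + [node.val]
        if not node.children:
            root_paths.append(prefix)
        for child in node.children:
            walk(child, prefix)

    walk(root, [])

    result = []
    for a, b in combinations(root_paths, 2):
        # Length of the shared prefix; its last node is the lowest common ancestor.
        common = 0
        while common < min(len(a), len(b)) and a[common] == b[common]:
            common += 1
        up = a[common - 1:][::-1]  # leaf of a, up to and including the ancestor
        down = b[common:]          # below the ancestor, down to the leaf of b
        text = " UP ".join(str(v) for v in up)
        text += "".join(" DOWN " + str(v) for v in down)
        result.append(text)
    return result


tree = TreeNode(1, [TreeNode(2, [TreeNode(3, [TreeNode(5)]), TreeNode(4)]), TreeNode(6)])
for line in leaf_to_leaf_paths(tree):
    print(line)
# 5 UP 3 UP 2 DOWN 4
# 5 UP 3 UP 2 UP 1 DOWN 6
# 4 UP 2 UP 1 DOWN 6
```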
I didn't manage to create a solution without programming it out in Python. UNDER THE ASSUMPTION that I didn't overlook a corner case, my attempt goes like this:
In a depth-first search every node receives the down-paths, emits them (plus itself) if the node is a leaf or passes the down-paths to its children - the only thing to consider is that a leaf node is a starting point of a up-path, so these are input from the left to right children as well as returned to the parent node.
def print_leaf2leaf(root, path_down):
    for st in path_down:
        st.append(root)
    if all([x is None for x in root.children]):
        for st in path_down:
            for n in st: print(n.d,end=" ")
            print()
        path_up = [[root]]
    else:
        path_up = []
        for child in root.children:
            path_up += child is not None and [st+[root] for st in print_leaf2leaf(child, path_down + path_up)] or []
    for st in path_down:
        st.pop()
    return path_up
class node:
    def __init__(self,d,*children):
        self.d = d
        self.children = children
##         1
##       /   \
##      2     6
##     / \   /
##    3   4 7
##   /     / | \
##  5     8  9  10
five = node(5)
three = node(3,five)
four = node(4)
two = node(2,three,four)
eight = node(8)
nine = node(9)
ten = node(10)
seven = node(7,eight,nine,ten)
six = node(6,None,seven)
one = node(1,two,six)
print_leaf2leaf(one,[])
A BST is generated (by successive insertion of nodes) from each permutation of keys from the set
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11}.
How many permutations determine trees of height three?
The number of permutations of nodes you have to check is 11! = 39,916,800, so you could just write a program to brute-force this. Here's a skeleton of one, written in C++:
vector<int> values = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11};
unsigned numSuccesses = 0;
do {
    if (bstHeightOf(values) == 3) numSuccesses++;
} while (next_permutation(values.begin(), values.end()));
Here, you just need to write the bstHeightOf function, which computes the height of a BST formed by inserting the given nodes in the specified order. I'll leave this as an exercise.
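For anyone who wants to check their answer to the exercise, one possible shape of that function is sketched below in Python rather than C++. The name bst_height_of and the dictionary-based child links are choices made for this sketch. It assumes distinct keys and measures height in edges, so that a single node has height 0.

```python
def bst_height_of(values):
    """Height in edges of the BST built by inserting `values` in order; -1 if empty."""
    left, right = {}, {}  # key -> key of its left / right child
    root = None
    height = -1
    for value in values:
        if root is None:
            root = value
            height = 0
            continue
        node, depth = root, 0
        while True:
            branch = left if value < node else right
            depth += 1
            if node not in branch:
                branch[node] = value  # free slot found: attach the new key here
                break
            node = branch[node]
        height = max(height, depth)
    return height


print(bst_height_of([6, 3, 8, 2, 5, 7, 10, 1, 4, 9, 11]))  # 3
print(bst_height_of([1, 2, 3]))                            # 2
```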
You can prune down the search space a bunch by using these observations:
The maximum number of nodes in a BST of height 2 is 7.
The root can't be 1, 2, 3, 9, 10, or 11, because if it were, one subtree would have more than 7 nodes in it and therefore the overall tree would have height greater than three.
Given that you know the possible roots, one option would be to generate all BSTs with the keys {1, 2, 3, ..., 11} (not by listing off all orderings, but by listing off all trees), filter it down to just the set of trees with height 3, and then use this recursive algorithm to count the number of ways each tree can be built by inserting values. This would probably be significantly faster than the above approach, since the number of trees to check is much lower than the number of orderings and each tree can be checked in linear time.
Hope this helps!
An alternative to templatetypedef's answer that might be more tricky but can be done completely by hand.
Consider the complete binary tree of height 3: it has 15 nodes. You're looking for trees with 11 nodes; that means that four of those 15 nodes are missing. The patterns in which these missing nodes can occur can be enumerated with fairly little effort. (Hint: I did this by dividing the patterns into two groups.) This will give you all the shapes of trees of height 3 with 11 nodes.
Once you've done this, you just need to reason about the relationship between these tree shapes and the actual trees you're looking for. (Hint: this relationship is extremely simple - don't overthink it.)
This allows you to enumerate the resulting trees that satisfy the requirements. If you get to 96, you have the same result as I do. For each of these trees, we now need to find how many permutations give rise to that tree.
This part is the tricky part; you might now need to split these trees up into smaller groups for which you know, by symmetry, that the number of permutations that gives rise to that tree is the same for all trees in a group. For example,
        6
       / \
      /   \
     3     8
    / \   / \
   2   5 7   10
  /   /     /  \
 1   4     9    11
is going to have the same number of permutations that give rise to it as
        6
       / \
      /   \
     4     9
    / \   / \
   2   5 7   11
  / \     \  /
 1   3     8 10
You'll also need to find out how many trees occur in each group; the class of this example contains 16 trees. (Hint: I split them up into 7 groups of between 2 and 32 trees.) Now you'll need to find the number of permutations that give rise to such a tree, for each group. You can determine this "recursively", still on paper; for the class containing the two example trees above, I get 12096 permutations. Since that class contains 16 trees, the total number of permutations leading to such a tree is 16*12096 = 193536. Do the same for the six other classes and add the numbers up to get the total.
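The "recursive" count can be double-checked with a few lines of Python. This is a sketch under the assumption that a tree is encoded as None or a tuple (key, left, right); the function names are made up. It relies on the standard observation that the root must be inserted first, after which the insertion sequences of the two subtrees can be interleaved freely.

```python
from math import comb


def size(tree):
    # A tree is None or a tuple (key, left, right); this encoding is an assumption.
    return 0 if tree is None else 1 + size(tree[1]) + size(tree[2])


def insertion_orders(tree):
    """Number of insertion sequences that build exactly this BST."""
    if tree is None:
        return 1
    _, left, right = tree
    n_left, n_right = size(left), size(right)
    # Choose which of the remaining positions belong to the left subtree's keys.
    return comb(n_left + n_right, n_left) * insertion_orders(left) * insertion_orders(right)


def leaf(key):
    return (key, None, None)


first_example = (6,
                 (3, (2, leaf(1), None), (5, leaf(4), None)),
                 (8, leaf(7), (10, leaf(9), leaf(11))))
second_example = (6,
                  (4, (2, leaf(1), leaf(3)), leaf(5)),
                  (9, (7, None, leaf(8)), (11, leaf(10), None)))
print(insertion_orders(first_example))   # 12096
print(insertion_orders(second_example))  # 12096
```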
If any particular part of this solution has you stumped or anything is unclear, don't hesitate to ask!
Since this site is about programming, I'll provide code to determine this. We can use a backtracking algorithm, that backtracks as soon as the height constraint is violated.
We can implement the BST as a flat array, where the children of a node at index k are stored at indices 2*k and 2*k + 1. The root is at index 1. Index 0 is not used. When an index is not occupied we can store a special value there, like -1.
The algorithm is quite brute force, and on my laptop it takes about 1.5 seconds to complete:
function insert(tree, value) {
    let k = 1;
    while (k < tree.length) {
        if (tree[k] == -1) {
            tree[k] = value;
            return k;
        }
        k = 2*k + (value > tree[k] ? 1 : 0);
    }
    return -1;
}
function populate(tree, values) {
    if (values.length == 0) return 1; // All values were inserted! Count this permutation
    let count = 0;
    for (let i = 0; i < values.length; i++) {
        let value = values[i];
        let node = insert(tree, value);
        if (node >= 0) { // Height is OK
            values.splice(i, 1); // Remove this value from remaining values
            count += populate(tree, values);
            values.splice(i, 0, value); // Backtrack
            tree[node] = -1; // Free the node
        }
    }
    return count;
}
function countTrees(n) {
    // Create an empty tree as flat array of height 3,
    // and provide n unique values to insert
    return populate(Array(16).fill(-1), [...Array(n).keys()]);
}
console.log(countTrees(11));
Output: 1056000
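As a cross-check that does not enumerate permutations, the same number can be obtained with a small recurrence, sketched here in Python. This is a different technique from the backtracking above: it counts permutations by the rank of the first inserted key (the root), which fixes the sizes of both subtrees, and multiplies by the number of ways to interleave the two subsequences. The function names are invented for this sketch, and height is measured in edges.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def orders_with_height_at_most(n, h):
    """Permutations of n distinct keys whose BST has height <= h (height in edges)."""
    if n == 0:
        return 1  # the empty tree fits under any bound
    if h < 0:
        return 0  # a non-empty tree has height at least 0
    total = 0
    for n_left in range(n):  # the root's rank fixes the subtree sizes
        n_right = n - 1 - n_left
        total += (comb(n - 1, n_left)
                  * orders_with_height_at_most(n_left, h - 1)
                  * orders_with_height_at_most(n_right, h - 1))
    return total


def orders_with_height(n, h):
    return orders_with_height_at_most(n, h) - orders_with_height_at_most(n, h - 1)


print(orders_with_height(11, 3))  # 1056000
```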
I need to define a seek(u,v) function, where u is the new node within the tree (the node where I want to start searching) and v is the number of descendants below the new node; the function should return the index of the highest key value. The tree doesn't have to be a BST, and there can be nodes with many children. Example:
input:
5 // tree with 5 nodes
1 3 5 2 7 // the nodes' keys
1 2 // 1 and 2 are linked
2 3 // 2 and 3 are linked
1 4 // 1 and 4 are linked
3 5 // 3 and 5 are linked
4 // # of seek() requests
2 3 // index 2 of tree, which would be key 5, 3 descendants from index 3
4 1 // index 4 of tree, but only return highest from index 4 to 4 (it would
// return itself)
3 2 // index 3, next 2 descendants
3 2 // same
output:
5 // Returned index 5 because the 7 is the highest key from array[3 'til 5]
4 // Returned index 4 because the range is one, plus 4's children are null
5 // Returned index 5 because the 7 is the highest key from array[4 'til 5]
5 // Same as prior line
I was thinking about putting the new root into a new Red Black Tree, but can't find a way to efficiently save successor or predecessor information for each node. Also thinking about putting into an array, but due to the nature of an unbalanced and unsorted tree, it doesn't guarantee that my tree would be sorted, plus because it's not a BST i can't perform an inorder tree walk. Any suggestions as to how I can get the highest key from a specific range?
I don't understand very well what you mean by "the number of descendants below the new node". The way you say it, it implies there is some sort of imposed tree walk, or at least an order in which you have to visit the nodes. In that case it would be best to explain more thoroughly what you mean.
In the rest of the answer I assume you mean distance from u.
From a purely algorithmic point of view, since you cannot assume anything about your tree, you have to visit all the relevant vertices (i.e. vertices at a distance < v from u) to get your result. Any partial tree traversal below u (such as depth-first or breadth-first) is therefore both necessary and sufficient, and the order in which the nodes are visited doesn't matter.
If you can, it's simpler to use a recursive function seek'(u,v) which returns a pair (index, key) defined as follows:
if v > 1, seek'(u,v) is the pair that maximizes its second component among the pair (u, key(u)) and the pairs seek'(w,v-1) for each child w of u.
else (v = 1), seek'(u,v) is (u, key(u)).
You then have seek(u,v) = first(seek'(u,v)).
All of what I said presumes you have built a tree from the input, or that you can easily get the key of a node and its children from its index.
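A direct Python transcription of that definition could look like the sketch below. It assumes the tree from the example input has been built with node 1 as the root, and that keys and children are plain dictionaries indexed from 1; these containers and the helper name best are choices made for this sketch, and the whole thing rests on the distance interpretation stated above.

```python
def seek(u, v, keys, children):
    """Index of the largest key among u and its descendants fewer than v levels below."""

    def best(node, levels):
        # Plays the role of seek'(u, v): returns the (index, key) pair with the largest key.
        result = (node, keys[node])
        if levels > 1:
            for child in children.get(node, []):
                candidate = best(child, levels - 1)
                if candidate[1] > result[1]:
                    result = candidate
        return result

    return best(u, v)[0]


# Example input from the question, rooted at node 1 (assumption).
keys = {1: 1, 2: 3, 3: 5, 4: 2, 5: 7}
children = {1: [2, 4], 2: [3], 3: [5]}
for u, v in [(2, 3), (4, 1), (3, 2), (3, 2)]:
    print(seek(u, v, keys, children))
# 5
# 4
# 5
# 5
```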
Consider the following array, which is claimed to have represented a binary tree:
[1, 2, 5, 6, -1, 8, 11]
Given that the index with value -1 indicates the root element, I have the following questions:
a) How is this actually represented?
Should we follow below formulae (source from this link) to figure out the tree?
Three simple formulae allow you to go from the index of the parent to the index of its children and vice versa:
* if index(parent) = N, index(left child) = 2*N+1
* if index(parent) = N, index(right child) = 2*N+2
* if index(child) = N, index(parent) = (N-1)/2 (integer division with truncation)
If we use above formulae, then index(root) = 3, index(left child) = 7, which doesn't exist.
b) Is it important to know whether it's a complete binary tree or not?
N=0 must be the root node since by the rules listed, it has no parent. 0 cannot be created from either of the expressions (2*N + 1) or (2*N + 2), assuming no negative N.
Note, index is not the value stored in the array, it is the place in the array.
For [1, 2, 5, 6, -1, 8, 11]
Index 0 = 1
Index 1 = 2
Index 2 = 5, etc.
If it is a complete tree, then -1 is a valid value and tree is
      1
    /   \
   2     5
  / \   / \
 6  -1 8   11
-1 could also be a "NULL" pointer, indicating no value exists at that node.
So the Tree would look like
      1
    /   \
   2     5
  /     / \
 6     8   11
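To make the index arithmetic concrete, here is a small Python sketch that applies the 2*N+1 / 2*N+2 rules quoted in the question to the array and treats -1 as "no node here". The function name describe and the choice of -1 as the empty marker are assumptions for this example; as said above, nothing in the array itself tells you that this is the intended reading.

```python
def describe(array, empty=-1):
    """List (value, left child, right child) for each occupied slot of a heap-style array."""
    rows = []
    for n, value in enumerate(array):
        if value == empty:
            continue  # treat the sentinel as "no node here"
        kids = []
        for k in (2 * n + 1, 2 * n + 2):
            # A child exists only if its index is in range and the slot is not empty.
            kids.append(array[k] if k < len(array) and array[k] != empty else None)
        rows.append((value, kids[0], kids[1]))
    return rows


for row in describe([1, 2, 5, 6, -1, 8, 11]):
    print(row)
# (1, 2, 5)
# (2, 6, None)
# (5, 8, 11)
# (6, None, None)
# (8, None, None)
# (11, None, None)
```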
Given an array, you could think of any number of ways that array could represent a binary tree. So there is no way to know; you have to go to the source of that array (whatever that is).
One of those ways is the way binary heap is usually represented, as per your link. If this was the representation used, -1 would not be the root element. And the node at position 3 would have no children, i.e. it would be a leaf.
And, yeah, it's probably important to know whether it's supposed to be a complete tree or not.
In general, you shouldn't try to figure out what some data means like this. You should be given documentation or the source code that uses the data. If you don't have that and you really need to reverse-engineer it, you most likely need to know more about the data. Observing the behavior of the code that uses it should help you. Or decompiling the code.
It may not be a complete binary tree, but it may not be an arbitrary one either. You can represent a tree in which at most a few of the rightmost leaves are missing (or, if you exchange the convention for left and right children, at most a few of the leftmost leaves are missing).
You can't represent this in your array:
      A
     / \
    B   C
   /   /
  D   E
But you can represent this
      A
     / \
    B   C
   / \
  D   E
or this:
      A
     / \
    B   C
       / \
      D   E
(for the last, have 2k+1 be the right child and 2k+2 the left child)
You only need to know the number of nodes in the tree.