Directed tree mutation algorithms - algorithm

I came up with a problem of detecting "mutations" between two directed trees.
Example:
tree1:
A
/ \
B C - D
/ \ / \ \
G A 2 A 3
| \ |\
1 3 2 3
tree2:
A
/ \
B C - F
/ \ / \
G A 2 3
| \ |\
1 3 2 3
The algorithm should find that there is a mutation with
R
|
C - D
|\ \
X Y Z
Substituted with
R
|
C - D
| \
X Z
Where R, Y and Z are the respective values
I am looking for any ideas, which might be:
link to algorithm or book with some algorithms
pseudocode
code in any language (preferably python)
library in any language (preferably Python)

Have you looked at any tree difference problems?
Most tree diff problems produce a list of changes (e.g. insertion, deletion, moving, and relabelling of nodes) rather than a template subtree, but they might give you a starting place.

Related

Number of binary trees that have the given inorder sequence

I got this question in a test series:
Five nodes labeled P, Q, R, S, T are used to construct a binary tree. Determine the number of distinct binary trees that can be formed such that each of those in-order traversal gives P, Q, R, S, T.
I do not know the exact answer. The solution they gave was incorrect. How do I solve such problems?
The number of ways to construct a binary (search) tree with 𝑛 values -- such that the inorder traversal gives them in their proper order -- is the same as the number of binary tree shapes you can make with 𝑛 nodes (so here the values play no role).
This is the case because for every shape of binary tree there is exactly one way to label the nodes with 𝑛 values such that the inorder traversal gives the desired order.
The number of shapes of a binary tree with 𝑛 nodes is the Catalan number for the same 𝑛, quoting Wikipedia:
𝐶𝑛 is the number of full binary trees with 𝑛 + 1 leaves, or, equivalently, with a total of 𝑛 internal nodes
What are called internal nodes here, map to all the nodes of our binary trees.
For your concrete case 𝑛 is 5, and 𝐶𝑛 is 42.
I list the 14 trees that have P as root:
P P P P P
\ \ \ \ \
Q Q Q Q Q
\ \ \ \ \
R R T T S
\ \ / / / \
S T R S R T
\ / \ /
T S S R
P P P P
\ \ \ \
R R S S
/ \ / \ / \ / \
Q S Q T Q T R T
\ / \ /
T S R Q
P P P P P
\ \ \ \ \
T T T T T
/ / / / /
Q Q S S R
\ \ / / / \
R S Q R Q S
\ / \ /
S R R Q
...5 with Q as root
Q Q Q Q Q
/ \ / \ / \ / \ / \
P R P R P T P T P S
\ \ / / / \
S T R S R T
\ / \ /
T S S R
...4 with R as root
R R R R
/ \ / \ / \ / \
Q S Q T P S P T
/ \ / / \ \ \ /
P T P S Q T Q S
...5 with S as root
S S S S S
/ \ / \ / \ / \ / \
R T R T P T P T Q T
/ / \ \ / \
Q P Q R P R
/ \ \ /
P Q R Q
...and 14 with T as root:
T T T T T
/ / / / /
S S S S S
/ / / / /
R R P P Q
/ / \ \ / \
Q P Q R P R
/ \ \ /
P Q R Q
T T T T
/ / / /
R R Q Q
/ \ / \ / \ / \
Q S P S P R P S
/ \ \ /
P Q S R
T T T T T
/ / / / /
P P P P P
\ \ \ \ \
S S Q Q R
/ / \ \ / \
R Q R S Q S
/ \ \ /
Q R S R
42 in total.
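As a quick check of the count above, here is a minimal sketch that counts binary tree shapes recursively and compares the result with the closed-form Catalan number. The function names `count_trees` and `catalan` are my own, chosen for illustration. Choosing the root fixes how many nodes go into the left and right subtrees, which is exactly the per-root breakdown listed above.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_trees(n):
    """Number of binary tree shapes with n nodes."""
    if n == 0:
        return 1  # the empty tree
    # k nodes go to the left subtree, n - 1 - k to the right subtree.
    return sum(count_trees(k) * count_trees(n - 1 - k) for k in range(n))


def catalan(n):
    """Closed form of the n-th Catalan number."""
    return comb(2 * n, n) // (n + 1)


# With inorder P, Q, R, S, T, a root at index i has i nodes on its left
# and 4 - i nodes on its right.
by_root = {label: count_trees(i) * count_trees(4 - i)
           for i, label in enumerate("PQRST")}

print(by_root)
print(count_trees(5), catalan(5))
```

The per-root counts should add up to the same total as the closed form.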
First, observe that a complete binary search tree has only a single possible arrangement. For example with seven nodes and the requirement that in-order traversal is sorted alphabetically:
      D
  B       F
A   C   E   G
But your problem is with 5 nodes, so the tree is not complete. Here is one possible such tree:
      S
  Q       T
P   R
Reading from left to right, the letters are in the correct order, but this is clearly not the only possible tree. Here's another one:
      R
  Q       T
P       S
There will always be exactly two empty leaves out of four possible leaf nodes. If we use 1 to denote a populated leaf and 0 to denote empty, the possibilities are:
0011
0101
1001
1100
1010
0110
That is "4 choose 2", which gives six different trees of this three-level shape. The last question is whether there are two distinct trees with the same nodes populated but having different values (letters) at some nodes. The answer is no, that is not possible, for the same reason as with a complete tree: swapping any two values would always make the tree out of order. Just like in an array, there is only one ordering which is sorted when the values are all distinct.

Why does AA tree do the operation first skew and then split?

Why does an AA tree apply skew first and then split? What is the reason for this, and why shouldn't the balancing functions be called in the opposite order?
Consider this sub-tree:
     |
     v
  L<-T->R
 / \   / \
A   B C   D
If you apply skew first, and then split, you will get a legal tree.
     |
     v
     T
   /   \
  L     R
 / \   / \
A   B C   D
If you apply split first, then skew, you will get an illegal tree:
  |
  v
  L->T->R
 /  /  / \
A  B  C   D
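To make the two orders concrete, here is a minimal sketch of `skew` and `split` on level-annotated nodes, run on the sub-tree from the example. The `Node` class, the `is_valid` check and the level numbers (2 for L, T and R, 1 for A to D) are assumptions of mine for illustration. `is_valid` only checks the two link rules that matter here: no left horizontal link, and no two right horizontal links in a row.

```python
class Node:
    def __init__(self, name, level, left=None, right=None):
        self.name, self.level = name, level
        self.left, self.right = left, right


def skew(t):
    """Remove a left horizontal link by rotating right."""
    if t.left is not None and t.left.level == t.level:
        l = t.left
        t.left, l.right = l.right, t
        return l
    return t


def split(t):
    """Remove two consecutive right horizontal links by rotating left."""
    if (t.right is not None and t.right.right is not None
            and t.right.right.level == t.level):
        r = t.right
        t.right, r.left = r.left, t
        r.level += 1  # the middle node moves up one level
        return r
    return t


def is_valid(t):
    """Check the two link rules in the subtree rooted at t."""
    if t is None:
        return True
    if t.left is not None and t.left.level >= t.level:
        return False  # left horizontal link
    if (t.right is not None and t.right.right is not None
            and t.right.right.level >= t.level):
        return False  # two right horizontal links in a row
    return is_valid(t.left) and is_valid(t.right)


def example():
    """Build L<-T->R with leaves A, B under L and C, D under R."""
    a, b, c, d = (Node(x, 1) for x in "ABCD")
    return Node("T", 2, Node("L", 2, a, b), Node("R", 2, c, d))


good = split(skew(example()))  # skew first, then split
bad = skew(split(example()))   # split first, then skew

print(good.name, good.level, is_valid(good))
print(bad.name, bad.level, is_valid(bad))
```

In the second order, `split` finds nothing to do because the double right link only appears after `skew` has run, so nothing is left to repair it.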

Removing edges to eliminate all but one cycle-free path between two nodes

I have a connected undirected graph having n nodes. Given two nodes, I want to find the minimum number of edges that would have to be removed in order to ensure that there's only one cycle-free path between those two nodes.
For example, if this is the graph:
1------------2------------5
| |
| |
3-------------------------4
then given the nodes 1 and 5, the answer will be 1: just remove (for example) the edge between node 3 and node 4.
The brute-force approach is, for each subset of the set of edges, to try removing those edges and test if there's a unique cycle-free path between the two nodes of interest.
Is there a more efficient approach? (I Googled it, but did not find anything relevant.)
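For reference, the brute-force approach could be sketched as follows. It reads "only one cycle-free path" as "exactly one simple path between the two nodes", which is only one possible reading of the requirement. The function names are hypothetical, and the edge list assumes that node 4 is connected to node 5 in the small example.

```python
from itertools import combinations


def count_simple_paths(adj, src, dst, seen=None):
    """Count simple (cycle-free) paths from src to dst by DFS."""
    seen = (seen or set()) | {src}
    if src == dst:
        return 1
    return sum(count_simple_paths(adj, nxt, dst, seen)
               for nxt in adj[src] if nxt not in seen)


def min_removals(nodes, edges, src, dst):
    """Try edge subsets by increasing size; exponential in len(edges)."""
    for k in range(len(edges) + 1):
        for removed in combinations(edges, k):
            adj = {n: set() for n in nodes}
            for a, b in edges:
                if (a, b) not in removed:
                    adj[a].add(b)
                    adj[b].add(a)
            if count_simple_paths(adj, src, dst) == 1:
                return k, removed
    return None


nodes = [1, 2, 3, 4, 5]
# Assumed edges for the small example graph (4-5 is my assumption).
edges = [(1, 2), (2, 5), (1, 3), (3, 4), (4, 5)]
print(min_removals(nodes, edges, 1, 5))
```

This is only practical for very small graphs, since both the subset enumeration and the path counting are exponential.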
(Dear cryptomanic, I added these examples to help in the discussion about the exact requirements; please edit this part and indicate which of these solutions are valid. m69)
Input graph: (going from X to Y)
O---O---O---O O
/ \ / \ / \
O---O---X O Y---O---O
\ /
O---O---O---O
/ \ \
O---O O
Solution A: (no cycles in between X and Y)
O---O---O---O O
/ / \ / \
O---O---X O Y---O---O
/
O---O---O---O
/ \ \
O---O O
Solution B: (no side-paths in between X and Y)
O---O---O---O O
/ \ / \
O---O---X O Y---O---O
/
O---O---O---O
/ \ \
O---O O
Solution C: (no cycles connected to X and Y)
O---O---O---O O
/ / \ \
O---O---X O Y---O---O
/
O---O---O O
/ \ \
O---O O
Solution D: (completely isolate path from X to Y)
O---O---O---O O
/ \ / \
O---O X O Y O---O
O---O---O---O
/ \ \
O---O O
Solution E: (P can only be used once, so P-Q-R-P is not part of an alternative path)
O---O---O---O O
\ / / \
O---O---X O Y O---O
\ /
O---P---O---O
/ \ \
Q---R O
Solution F:
O---O---O---O O
\ / \ / \
O---O---X O Y---O---O
\ /
O---O---O---O
/ \ \
O---O O

How to find the weight of heaviest edge in the path between two nodes in a tree?

The idea I have heard about is finding the lowest common ancestor (LCA) of these two nodes using the binary lifting method. More about it here:
https://www.topcoder.com/community/data-science/data-science-tutorials/range-minimum-query-and-lowest-common-ancestor/#Lowest%20Common%20Ancestor%20(LCA)
But I don't know where in that algorithm I can store the weight information. Any ideas?
Construct a tree for LCA as follows. In the weighted input tree, find the heaviest edge, delete it, and construct two (output) trees recursively, one for each remaining component of the input. Make these output trees the children of a newly created root labeled with the weight of the deleted edge. (The base case is to turn a single vertex into a single vertex.) The heaviest edge on the path between two vertices is then the label of their LCA in the output tree.
Say we have an unrooted weighted tree:
1 5 4
A-----B-----C-----D
| |
|2 |3
| |
E F
The rooted tree that we prepare for LCA is:
         5
        / \
       /   \
      /     \
     2       4
    / \     / \
   1   E   D   3
  / \         / \
 A   B       C   F
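A minimal sketch of this idea follows. Instead of recursively deleting the heaviest edge, it builds the same kind of rooted tree bottom-up: edges are processed from lightest to heaviest and components are merged with union-find, which is a swapped-in but equivalent construction. The function names are hypothetical, the LCA is found by a naive walk up parent pointers (binary lifting could replace it), and the edge list assumes E hangs off B and F hangs off C.

```python
def build_lca_tree(vertices, edges):
    """edges: list of (weight, u, v). Returns (parent, weight_of) dicts."""
    parent = {v: None for v in vertices}  # parent in the new rooted tree
    weight_of = {}                        # internal node -> edge weight
    top = {v: v for v in vertices}        # union-find: current subtree root

    def find(x):
        while top[x] != x:
            top[x] = top[top[x]]          # path halving
            x = top[x]
        return x

    for i, (w, u, v) in enumerate(sorted(edges)):
        ru, rv = find(u), find(v)
        node = ("edge", i)                # new internal node for this edge
        top[node] = node
        top[ru] = top[rv] = node
        parent[ru] = parent[rv] = node
        parent[node] = None
        weight_of[node] = w
    return parent, weight_of


def heaviest_edge(parent, weight_of, u, v):
    """Weight stored at the LCA of u and v (requires u != v)."""
    ancestors = set()
    x = u
    while x is not None:
        ancestors.add(x)
        x = parent[x]
    x = v
    while x not in ancestors:
        x = parent[x]
    return weight_of[x]


# Assumed reading of the example: E attached to B, F attached to C.
edges = [(1, "A", "B"), (5, "B", "C"), (4, "C", "D"),
         (2, "B", "E"), (3, "C", "F")]
parent, weight_of = build_lca_tree("ABCDEF", edges)
print(heaviest_edge(parent, weight_of, "A", "D"))
print(heaviest_edge(parent, weight_of, "D", "F"))
```

With binary lifting on the built tree, each query drops from linear to logarithmic time in the number of vertices.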

B tree insertion

Hey, I have a question on my homework. I am able to solve it; I just want someone to check whether I am doing it right or wrong.
A B-tree with a minimum branching factor of t = 3:
[D][G][K][N][V]
/ / / | \ \
/ / / | \ \
/ / / | \ \
AC EF HI LM OPRST WX
Now, when I insert J into the above tree, this is the output I am getting:
[K]
/ \
/ \
/ \
[D][G] [N][V]
/ / / / \ \
/ / / / \ \
/ / / / \ \
AC EF HIJ LM OPRST WX
After inserting Q into the above tree, this is the final tree I am getting:
[K]
/ \
/ \
/ \
[D][G] [N][Q][V]
/ / / / / \ \
/ / / / / \ \
/ / / / / \ \
AC EF HIJ LM OP RST WX
Is this final tree correct?
No, the final B-tree is not correct, though the intermediate one is. The last one should look like this:
[K]
/ \
/ \
/ \
[D][G] [N][R][V]
/ / / / / \ \
/ / / / / \ \
/ / / / / \ \
AC EF HIJ LM OPQ ST WX
You missed something very important. In a B-tree, insertions are only done in the leaf node and every full node on the way is split. You inserted Q in a level 2 node in your final tree.
Edit: I think you are confused about the insertion algorithm. Insertions only take place in a leaf node. On the downward path from root to leaf, any full node that is encountered is split first. If the leaf node is full, it is split first and then the key is inserted. In your case the leaf node OPRST will be split when it is encountered, because it has 5 keys and is full. Thus R will be moved up and a new leaf node containing the keys ST will be created. The older leaf node will now only have the keys OP. Q is then compared with R, and the search moves leftward to the OP node, where Q finally gets inserted.
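The procedure described above can be sketched like this. It is a minimal version of insertion with a preemptive split on the way down, assuming t = 3 so that a node is full at 2t - 1 = 5 keys. The `BNode` class and the helpers `split_child`, `insert` and `show` are hypothetical names of mine, and the starting tree is typed in by hand from the question.

```python
import bisect

T = 3  # minimum degree: a node holds at most 2*T - 1 = 5 keys


class BNode:
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = children or []  # empty list means leaf

    @property
    def leaf(self):
        return not self.children


def split_child(parent, i):
    """Split the full child parent.children[i]; its median key moves up."""
    child = parent.children[i]
    median = child.keys[T - 1]
    right = BNode(child.keys[T:], child.children[T:])
    child.keys = child.keys[:T - 1]
    child.children = child.children[:T]
    parent.keys.insert(i, median)
    parent.children.insert(i + 1, right)


def insert(root, key):
    """Insert key into a leaf and return the (possibly new) root."""
    if len(root.keys) == 2 * T - 1:       # full root: grow the tree upward
        root = BNode([], [root])
        split_child(root, 0)
    node = root
    while not node.leaf:
        i = bisect.bisect_left(node.keys, key)
        if len(node.children[i].keys) == 2 * T - 1:
            split_child(node, i)          # split full nodes before descending
            if key > node.keys[i]:
                i += 1
        node = node.children[i]
    bisect.insort(node.keys, key)
    return root


def show(node):
    """Compact text form: keys, then children in parentheses."""
    s = "".join(node.keys)
    if node.leaf:
        return s
    return s + "(" + " ".join(show(c) for c in node.children) + ")"


root = BNode("DGKNV", [BNode(k) for k in ("AC", "EF", "HI", "LM", "OPRST", "WX")])
root = insert(root, "J")
after_j = show(root)
root = insert(root, "Q")
after_q = show(root)
print(after_j)
print(after_q)
```

Note that Q never lands in an internal node: the only key that moves up is the median of a node that is being split.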
If the branching factor is 3, doesn't that mean that 3 is the minimum number of keys in a non-root node? How can the initial tree be correct?
Initial state would be:
└── E, I, N, S
    ├── A, C, D
    ├── F, G, H
    ├── K, L, M
    ├── O, P, R
    └── T, V, W, X

Resources