Finding closest parent for two nodes - binary-tree

What is the best way to find the closest common parent of two given nodes in the tree?
So if I have:
      1
    /   \
   2     3
  / \   / \
 4   5 6   7
the closest parent for 5 and 6 would be 1.
Thanks

This problem is called the Lowest Common Ancestor (LCA) problem. (Google it)
One query can be answered by simply climbing up along the parent links until the two nodes meet:
The first step is to let the lower node climb until both nodes are at the same depth.
The second step is to let them climb simultaneously until they meet at the same node.
That node is the LCA of the two nodes.
If you need to process multiple queries, you need a more advanced algorithm. The most time-efficient algorithms use O(n) time to preprocess and O(1) time per query, where n is the number of nodes in the tree.
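Here is a minimal sketch of the single-query climb described above, assuming each node stores a link to its parent and its depth (the Node class and its field names are illustrative, not part of the question):

class Node:
    def __init__(self, value, parent=None):
        self.value = value
        self.parent = parent
        self.depth = parent.depth + 1 if parent else 0

def lca(a, b):
    # Step 1: bring the deeper node up to the depth of the other.
    while a.depth > b.depth:
        a = a.parent
    while b.depth > a.depth:
        b = b.parent
    # Step 2: climb both nodes simultaneously until they meet.
    while a is not b:
        a = a.parent
        b = b.parent
    return a

# The sample tree from the question: the LCA of 5 and 6 is the root 1.
n1 = Node(1)
n2, n3 = Node(2, n1), Node(3, n1)
n4, n5 = Node(4, n2), Node(5, n2)
n6, n7 = Node(6, n3), Node(7, n3)
print(lca(n5, n6).value)   # 1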

Related

How to count the maximum number of times any node has been visited while traveling through a tree several times?

We travel through a given tree (not binary) several times. How do we calculate the maximum number of times any node in the tree has been visited?
For example: in the tree:
    1
   / \
  2   3
     / \
    4   5
Suppose we are told to travel 2 times, from 2 to 3 and then from 5 to 3. The travel paths will be 2->1->3 and 5->3. The maximum number of times a node has been visited is 2 (the node is 3). All travels are independent of each other; a given travel starts at a given node A and ends at B.
How to efficiently travel (if we even need to) in order to calculate that, considering that we have over 50,000 nodes and 75,000 paths to cover (like 2 to 3 and 3 to 4 in the example)?
Based on what you are saying, the answer is the number of children that node has...
Also, in your example, going by what you have said, both 1 and 3 are visited the most.
In your example each node is only going to get visited once. The only way you could get multiple visits to one node would be with a tree like:
 1   3
  \ /
   2
Edit: the most efficient way of traversing is if you have a perfect binary tree
       4
     /   \
    2     6
   / \   / \
  1   3 5   7
where the maximum depth is log base 2 of (number of nodes + 1) levels (3 levels for the 7 nodes shown above).
Why not store the travel count of each node separately?
Maintain a HashMap<Node, long> keeping track of how many times each node has been visited.
Then maintain a TreeMap<long, List<Node>> that is keyed on the count and contains the list of nodes having that count.
This way, the TreeMap entry with the highest key would contain all the nodes that have the highest count, because there can certainly be more than one node with that highest visit count.
All you now need to do is add bookkeeping code for properly updating the two maps whenever a node is visited as part of a tree traversal.
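A rough Python analogue of that two-map bookkeeping (the answer above phrases it with Java's HashMap and TreeMap; the names below are illustrative):

from collections import defaultdict

visit_count = {}                    # node -> number of times it was visited
nodes_by_count = defaultdict(set)   # visit count -> set of nodes with that count
max_count = 0

def record_visit(node):
    # Bookkeeping to run every time a traversal touches `node`.
    global max_count
    old = visit_count.get(node, 0)
    new = old + 1
    visit_count[node] = new
    if old:
        nodes_by_count[old].discard(node)
        if not nodes_by_count[old]:
            del nodes_by_count[old]
    nodes_by_count[new].add(node)
    if new > max_count:
        max_count = new

# After all traversals, nodes_by_count[max_count] is the set of nodes
# that were visited the maximum number of times.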
There's an XY problem here.
Your question states you want to store number of node visits. What you really want though is an efficient traversal strategy.
You have options here. Since the edges are bi-directional the best strategy IMO would be a bi-directional search.
But the search strategy itself is a toss-up.
Consider a slightly more elaborate tree such as
1 -> 2,3,4; 2 -> 5,6,7; 3 -> 8,9; 4 -> 10,11; 10 -> 12,13. If you have an efficient path from 5 to 4, it doesn't mean you can just start from there to find an efficient path from 5 to 13, because you don't know that 13 comes under 4 unless you have already found an efficient path from 4 to 13.
So I would suggest memoizing your traversals in a dictionary of the form <Node Pair>: [Traversal list],
where you start at one of the two nodes and perform a breadth-first search; each time you visit a node, check whether an entry for <curnode, targetnode> exists in the dictionary. If such an entry exists, you are done. If not, proceed to the current node's siblings and children.
CAVEAT: THIS IS UNDER THE ASSUMPTION THAT ALL NODES HAVE ONLY 1 PARENT AND CYCLES DON'T HAPPEN
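Here is a rough sketch of that memoized search, assuming the tree is stored as an undirected adjacency dictionary (all names below are illustrative):

from collections import deque

path_memo = {}   # (node, target) -> list of nodes from node to target

def find_path(tree, start, target):
    # BFS from start towards target over an undirected tree given as an
    # adjacency dictionary, reusing previously discovered paths via path_memo.
    if (start, target) in path_memo:
        return path_memo[(start, target)]
    parent = {start: None}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        if cur == target or (cur, target) in path_memo:
            # Rebuild the path back to start, then splice in the memoized tail.
            head = []
            node = cur
            while node is not None:
                head.append(node)
                node = parent[node]
            head.reverse()
            path = head + path_memo.get((cur, target), [cur])[1:]
            # Memoize every suffix of the path for later queries.
            for i, n in enumerate(path):
                path_memo[(n, target)] = path[i:]
            return path
        for nxt in tree[cur]:
            if nxt not in parent:
                parent[nxt] = cur
                queue.append(nxt)
    return None

# The tree from this answer: 1 -> 2,3,4; 2 -> 5,6,7; 3 -> 8,9; 4 -> 10,11; 10 -> 12,13
edges = [(1, 2), (1, 3), (1, 4), (2, 5), (2, 6), (2, 7),
         (3, 8), (3, 9), (4, 10), (4, 11), (10, 12), (10, 13)]
tree = {}
for a, b in edges:
    tree.setdefault(a, []).append(b)
    tree.setdefault(b, []).append(a)

print(find_path(tree, 5, 13))   # [5, 2, 1, 4, 10, 13]
print(find_path(tree, 6, 13))   # [6, 2, 1, 4, 10, 13] -- reuses the memoized (2, 13) path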
I think people are misunderstanding the question. He/she wants to ask which node is visited the maximum number of times, given x travels. So from his tree the edges are {(2-1), (1-3), (3-4), (3-5)}; now, for example, we travel the following paths: {(1,5), (2,4), (1,3), (1,2), (1,3), (4,5)}. In this example node 3 is visited 5 times, node 1 is visited 5 times, and so on.
Since it is a tree, only one path exists from one node to another. Find the paths for all combinations of nodes in a DP fashion and store them.
Then count the visits for each node. I know there is a more efficient way of counting, but I cannot think of it right now.
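A straightforward, not especially efficient, sketch of that counting using the edges and travels from this answer: it walks each travel's unique path with a BFS and counts every node on it (the adjacency-dictionary representation is an assumption):

from collections import deque, Counter

def unique_path(adj, a, b):
    # BFS recovers the tree's unique a-to-b path (returned b-to-a; the
    # order does not matter for counting).
    parent = {a: None}
    queue = deque([a])
    while queue:
        cur = queue.popleft()
        if cur == b:
            path = []
            while cur is not None:
                path.append(cur)
                cur = parent[cur]
            return path
        for nxt in adj[cur]:
            if nxt not in parent:
                parent[nxt] = cur
                queue.append(nxt)

# Edges and travels from this answer.
edges = [(2, 1), (1, 3), (3, 4), (3, 5)]
adj = {}
for u, v in edges:
    adj.setdefault(u, []).append(v)
    adj.setdefault(v, []).append(u)

visits = Counter()
for a, b in [(1, 5), (2, 4), (1, 3), (1, 2), (1, 3), (4, 5)]:
    visits.update(unique_path(adj, a, b))

print(visits.most_common())   # nodes 3 and 1 lead with 5 visits each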

How to find the shortest path of a weighted tree?

I have a weighted tree which looks like (weights are in brackets)
             A1
           /    \
        B1(3)   B2(2)
        /   \   /   \
    C1(1)   C2(3)   C3(4)
    /   \   /   \   /   \
D1(8)   D2(7)   D3(2)   D4(5)
......
So, each node has two children, and each node shares a child with its neighbouring node. The depth of the tree can be very large.
Path weights are summed, for example:
3 + 1 + 8 = 12
3 + 1 + 7 = 11
3 + 3 + 7 = 13 ... and so on
What is the best way to find the shortest path? As a result I need not just the sum of weights but the full path (let's say A1-B2-C3-D3).
I would be more than happy if you could point me to the right algorithm, or provide a Java/pseudocode solution.
Thank you!
Update
I am looking for a full path from top to bottom
This may be a natural Dynamic Programming (DP) problem due to the child sharing property. I suggest using a bottom-up DP algorithm to solve this problem.
Define the state of each node as SP(n), meaning the weight of the shortest path from that node down to the bottom. Notice that SP(n) depends only on SP(c), where c is a child of n, and because of the child-sharing property, SP(n) may be reused by both of n's parents.
The state transition equation is:
SP(n) = min { SP(c) + weight(c) : c is a child of n }
As for implementation, we scan bottom-up from the leaves, computing SP(n) for each node until we reach the root. The time cost is O(n), since each value is computed once.
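Here is a minimal bottom-up sketch of that recurrence on the example tree. The list-of-levels representation, in which node i of one level has children i and i+1 of the next, is an assumption about how the tree is stored:

levels = [
    [("A1", 0)],                          # the root carries no weight
    [("B1", 3), ("B2", 2)],
    [("C1", 1), ("C2", 3), ("C3", 4)],
    [("D1", 8), ("D2", 7), ("D3", 2), ("D4", 5)],
]

def shortest_path(levels):
    # sp[i] holds SP(n) for node i of the level currently being processed;
    # choice[level][i] remembers which child index the minimum came from.
    sp = [0] * len(levels[-1])
    choice = []
    for level in range(len(levels) - 2, -1, -1):
        nxt = levels[level + 1]
        new_sp = []
        picks = []
        for i in range(len(levels[level])):
            left = sp[i] + nxt[i][1]           # go to child i
            right = sp[i + 1] + nxt[i + 1][1]  # go to child i + 1
            if left <= right:
                new_sp.append(left)
                picks.append(i)
            else:
                new_sp.append(right)
                picks.append(i + 1)
        sp = new_sp
        choice.append(picks)
    choice.reverse()
    # Walk the recorded choices from the root to recover the full path.
    path = [levels[0][0][0]]
    i = 0
    for level, picks in enumerate(choice):
        i = picks[i]
        path.append(levels[level + 1][i][0])
    return sp[0], path

print(shortest_path(levels))   # (7, ['A1', 'B2', 'C2', 'D3'])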
You may want to look at alpha-beta pruning. This algorithm basically removes parts of the search tree as soon as they are known to be obsolete, i.e. when a shorter path to the same position is already known.

Disjoint-set forests - why should the rank be increased by one when the find of two nodes are of same rank?

I am implementing the disjoint-set datastructure to do union find. I came across the following statement in Wikipedia:
... whenever two trees of the same rank r are united, the rank of the result is r+1.
Why should the rank of the joined tree be increased by only one when the trees are of the same rank? What happens if I simply add the two ranks (i.e. 2*r)?
First, what is rank? It is almost the same as the height of a tree. In fact, for now, pretend that it is the same as the height.
We want to keep trees short, so keeping track of the height of every tree helps us do that. When unioning two trees of different height, we make the root of the shorter tree a child of the root of the taller tree. Importantly, this does not change the height of the taller tree. That is, the rank of the taller tree does not change.
However, when unioning two trees of the same height, we make one root the child of the other, and this increases the height of that overall tree by one, so we increase the rank of that root by one.
Now, I said that rank was almost the same as the height of the tree. Why almost? Because of path compression, a second technique used by the union-find data structure to keep trees short. Path compression can alter an existing tree to make it shorter than indicated by its rank. In principle, it might be better to make decisions based on the actual height than using rank as a proxy for height, but in practice, it is too hard/too slow to keep track of the true height information, whereas it is very easy/fast to keep track of rank.
You also asked "What happens if I simply add the two ranks (i.e. 2*r)?" This is an interesting question. The answer is probably nothing, meaning everything will still work just fine, with the same efficiency as before. (Well, assuming that you use 1 as your starting rank rather than 0.) Why? Because the way rank is used, what matters is the relative ordering of ranks, not their absolute magnitudes. If you add them, then your ranks will be 1,2,4,8 instead of 1,2,3,4 (or more likely 0,1,2,3), but they will still have exactly the same relative ordering so all is well. Your rank is simply 2^(the old rank). The biggest danger is that you run a larger risk of overflowing the integer used to represent the rank when dealing with very large sets (or, put another way, that you will need to use more space to store your ranks).
On the other hand, notice that by adding the two ranks, you are approximating the size of the trees rather than the heights of the trees. By always adding the two ranks, whether they are equal or not, then you are exactly tracking the sizes of the trees. Again, everything works just fine, with the same caveats about the possibility of overflowing integers if your trees are very large.
In fact, union-by-size is widely recognized as a legitimate alternative to union-by-rank. For some applications, you actually want to know the sizes of the sets, and for those applications union-by-size is actually preferable to union-by-rank.
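A minimal sketch contrasting the two union rules in Python (the parent/rank/size arrays are the usual representation; nothing here is taken from the question):

parent = list(range(10))
rank = [0] * 10    # upper bound on the height of the tree rooted at each root
size = [1] * 10    # number of nodes in the tree rooted at each root

def find(x):
    while parent[x] != x:
        x = parent[x]
    return x

def union_by_rank(a, b):
    ra, rb = find(a), find(b)
    if ra == rb:
        return
    if rank[ra] < rank[rb]:
        ra, rb = rb, ra        # attach the shorter tree under the taller one
    parent[rb] = ra
    if rank[ra] == rank[rb]:   # equal ranks: the merged tree gets one level taller
        rank[ra] += 1

def union_by_size(a, b):
    ra, rb = find(a), find(b)
    if ra == rb:
        return
    if size[ra] < size[rb]:
        ra, rb = rb, ra        # attach the smaller tree under the larger one
    parent[rb] = ra
    size[ra] += size[rb]       # sizes always add, whether or not they were equal

In practice you would pick one of the two rules (and pair find with path compression); they are shown side by side here only to contrast the bookkeeping.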
Because in this case you attach one tree as a "sub tree" of the other, which makes the resulting tree increase its height (rank).
Have a look at the following example:
1     3
|     |
2     4
In the above, the "rank" of each tree is 2.
Now, let's say 1 is going to be the new unified root; you will get the following tree:
    1
   / \
  3   2
      |
      4
After the join, the rank of 1 is 3 = rank_old(1) + 1, as expected.(1)
As for your second question: because it would yield a false height for the trees.
If we take the above example and merge the trees, we get a tree of rank 3. What would happen if we then want to merge it with this tree(2):
   9
  / \
 10  11
 |
 13
 |
 14
We'll find that both ranks are 4 and try to merge them the same way we did before, without favoring the 'shorter' tree, which will result in taller trees and, ultimately, worse time complexity.
(1) Disclaimer: The first part of this answer is taken from my answer to a similar question (though not identical due to your last part of the question)
(2) Note that the above tree is synthetically made; it cannot be created by an optimized disjoint-set forest algorithm, but it still demonstrates the issue needed for the answer.
If you read that paragraph in a little more depth, you'll realize that rank is more like depth, not size:
Since it is the depth of the tree that affects the running time, the tree with smaller depth gets added under the root of the deeper tree, which only increases the depth if the depths were equal. In the context of this algorithm, the term "rank" is used instead of "depth" ...
and a merge of equal depth trees only increases the depth of the tree by one since the root of the one is added to the root of the other.
Consider:
  A                  D
 / \   merged with  / \
B   C              E   F

is:

    A
   /|\
  B C D
     / \
    E   F
The depth was 2 for both, and it's 3 for the merged one.
Rank represents the depth of the tree, not the number of nodes in it. When you join a tree with a smaller rank with a tree with a larger rank, the overall rank remains the same.
Consider adding a tree with rank 4 to the root of a tree of rank 6: since we added a node above the root of the depth-4 tree, that subtree now has a rank of 5. The tree to which we've added our depth-4 tree, however, has rank 6, so the overall rank does not change.
Now consider adding a tree with rank 6 to the root of a second tree of rank 6: since the root of the first depth-6 tree now has an extra node above it, the rank of that subtree (and the tree overall) changes to 7.
Since the rank of the tree determines the processing speed, the algorithm tries to keep the rank as low as possible by always attaching a shorter tree to the taller one, keeping the overall rank unchanged. The rank changes only when the trees have identical ranks, in which case one of them gets attached to the root of the other, bumping up the rank by one.
Actually, two important things should be understood well here:
1) What is rank?
2) Why is rank used?
Rank is nothing but the depth of a tree; you can think of rank as the depth (level) of a tree. When we union nodes, these (graph) nodes form a tree with an ultimate root node, and rank is stored only for those root nodes.
A merged with D:
Initially A has rank (level) 0 and D has rank (level) 0, so you can merge them making either of them the root, because if you make A the root the rank (level) will be 1,
and if you make D the root then the rank will also be 1.
A
 `D
Here the rank (level) is 1 when the root is A.
Now consider another merge:
A           B               A
 `D  merge   `C  ----->    / \
                          D   B
                               \
                                C
So the level is increased by 1; see, excluding the root A, the maximum height/depth/rank is 2: level 1 -> {D, B} and level 2 -> {C}.
Now our main objective is to keep the rank (depth) of the merged tree as small as possible while merging.
Now, when two trees of different rank merge:
A (rank 0)   merge   B (rank 1)   --->      B
                      `C                   / \
                                          A   C
Here the merged tree's rank is 1, the same as the higher rank (1).
When the lower-rank tree goes under the higher-rank tree, the merged tree's rank (height/depth) will be the same as the rank of the higher-rank tree. That means the rank does not increase; the merged tree's rank is the same as the higher rank was before.
But if we do the reverse, i.e. the higher-rank tree goes under the lower-rank tree, then see:
A (rank 0)   merge   B (rank 1)   --->      A      (merged tree rank 2, greater than both)
                      `C                     `B
                                               `C
So what follows from this observation is that if we try to keep the rank (height) of the merged tree as small as possible, we have to choose the first process. I think this part is clear!
Now you have to understand why our objective is to keep the tree's height as small as possible:
when we use disjoint-set union, then for path compression (finding the ultimate root a node is connected to) we traverse from a node up to its root node; if the height (rank) is large, that processing will be slow. That's why, when we merge two trees, we try to keep the height/depth/rank as small as possible.
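Here is a small sketch of find with path compression, the operation described above (a rough illustration; the parent array is the usual representation, not taken from the question):

def find(parent, x):
    # First pass: locate the root.
    root = x
    while parent[root] != root:
        root = parent[root]
    # Second pass: point every node on the path directly at the root,
    # so the next find for any of them is a single step.
    while parent[x] != root:
        parent[x], x = root, parent[x]
    return root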

A linear-time algorithm for finding the longest distance between two nodes in a free tree?

Given a free tree, find an algorithm to find the longest path between two nodes that runs in linear time. Is this possible to do if the nodes don't store their level? If yes, how?
If the nodes do store their level, then I would move the lower node up the tree to the same level as the other. Then I would keep moving both nodes up the tree until they overlap. The distance would be the total number of times a node was moved up the tree.
Since no edge between the two nodes can be used more than once, the path is fixed. So the problem is to find the lowest common ancestor; you can read about it here: http://en.wikipedia.org/wiki/Lowest_common_ancestor
There's a famous algorithm to solve it, and it's here:
http://en.wikipedia.org/wiki/Tarjan%27s_off-line_least_common_ancestors_algorithm
I solved http://www.spoj.pl/problems/PT07Z/ with the following code as an exercise to learn python:
# Python 2; computes the number of edges on the longest path in the tree.
def func(node):
    global M
    if len(node) == 0:
        return 0
    else:
        # depths (in edges) of the longest downward path through each child
        s = [func(nodes[n]) for n in node]
        s.sort()
        m1 = s[-1] + 1       # deepest branch below this node
        m2 = 0
        if len(s) > 1:
            m2 = s[-2] + 1   # second-deepest branch
        M = max(M, m1 + m2)  # best path passing through this node
        return m1

t = input()
nodes = {}
for node in range(1, t + 1):
    nodes[node] = []
for i in range(t - 1):
    s = raw_input().split()
    a, b = int(s[0]), int(s[1])
    nodes[a].append(b)
M = 0
func(nodes[1])
print M
Note that you can sort in linear time because you know the values go from 0 to N, so you move value 0 to position 0, value 5 to position 5, etc. (i.e. a counting sort).
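For example (assuming the edges are listed parent-first, since the code above stores each edge only in the direction it is given), the input

4
1 2
2 3
2 4

prints 2, the length of the longest path 3-2-4.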

IOI 2003 : how to calculate the node that has the minimum balance in a tree?

Here is the Balancing Act problem, which asks for the node that has the minimum balance in a tree. Balance is defined as:
Deleting any node from a tree T yields a forest: a collection of one or more trees. Define the balance of a node to be the size of the largest tree in the forest created by deleting that node from T.
For the sample 7-node tree given by the edges:
(2,6), (1,2), (1,4), (4,5), (3,7), (3,1)
the explanation is:
Deleting node 4 yields two trees whose member nodes are {5} and {1,2,3,6,7}. The
larger of these two trees has five nodes, thus the balance of node 4 is five. Deleting node
1 yields a forest of three trees of equal size: {2,6}, {3,7}, and {4,5}. Each of these trees
has two nodes, so the balance of node 1 is two.
What kind of algorithm can you offer for this problem?
Thanks
I am going to assume that you have had a looong look at this problem: reading the solution does not help, you only get better at solving these problems by solving them yourself.
So one thing to observe is, the input is a tree. That means that each edge joins 2 smaller trees together. Removing an edge yields 2 disconnected trees (a forest of 2 trees).
So, if you calculate the size of the tree on one side of the edge, and then on the other, you should be able to look at a node's edges and ask "What is the size of the tree on the other side of this edge?"
You can calculate the sizes of trees using dynamic programming - your recurrence state is "What edge am I on? What side of the edge am I on?" and it calculates the size of the tree "hung" at that node. That is the crux of the problem.
Having that data, it is sufficient to iterate through all the nodes, look at their edges and ask "What is the size of the tree on the other side of this edge?" From there, you just pick the minimum.
Hope that helps.
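A rough sketch of that idea; representing the tree as an edge list and rooting it at node 1 are assumptions made for this example:

from collections import defaultdict

def min_balance(n, edges):
    # Root the tree at node 1, compute every subtree size in one pass, then
    # the balance of v is max(largest child subtree, n - size of v's subtree).
    adj = defaultdict(list)
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)

    parent = {1: None}
    order = []
    stack = [1]
    while stack:                              # iterative DFS for a processing order
        v = stack.pop()
        order.append(v)
        for w in adj[v]:
            if w != parent[v]:
                parent[w] = v
                stack.append(w)

    size = {}
    for v in reversed(order):                 # children are processed before parents
        size[v] = 1 + sum(size[w] for w in adj[v] if w != parent[v])

    best_node, best_balance = None, n
    for v in order:
        pieces = [size[w] for w in adj[v] if w != parent[v]]   # child subtrees
        pieces.append(n - size[v])                             # the rest of the tree
        balance = max(p for p in pieces if p > 0) if any(pieces) else 0
        if balance < best_balance:
            best_node, best_balance = v, balance
    return best_node, best_balance

# Sample tree from the problem statement.
edges = [(2, 6), (1, 2), (1, 4), (4, 5), (3, 7), (3, 1)]
print(min_balance(7, edges))   # (1, 2): deleting node 1 leaves trees of size 2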
You basically want to check 3 things for every node:
The size of its left subtree.
The size of its right subtree.
The size of the rest of the tree. (size of tree - left - right - 1)
You can use this algorithm and expand it to any kind of tree (different number of subnodes).
Go over the tree in an in-order sequence.
Do this recursively:
Every time, just before you back up from a node to its "father" node, add 1 + the total size of the node's subtrees to the "father" node.
Then store a value, let's call it maxTree, in each node, holding the maximum between all its subtree sizes and (size of the whole tree) - (size of the subtree rooted at that node).
This way you can calculate all the subtree sizes in O(N).
While traversing the tree, you can hold a variable that holds the minimum value found so far.

Resources