Can you say please, do we have a solution to find LCA of two nodes in binary tree visiting every tree node only once? Nodes only have left, right references without parent.
(O(n) time complexity).
Related
I'm having trouble conceptualising the process of deletion from a splay tree. Given this intial, tree, I want to delete the node 78.
Based on the information from my course (derived from Goodrich, Tamassia and Goldwasser), the deleted node in a BST should be replaced by the next node reached by performing an in-order traversal from the node which should be 91. This node should then be splayed to the top of the tree. However, this is not the case as shown on this visualiser here. https://www.cs.usfca.edu/~galles/visualization/SplayTree.html
The visualizer replaced 78 by its in order predecessor (70) instead and splayed that node. (The in order successor, i.e., the next key in sorted order is 83, not 91.) In general, splay trees are wonderfully malleable and as long as you approximately halve the length of the path you just descended while making every other path at most a little bit longer, you’re doing it right from an asymptotic performance standpoint (your professor may have different ideas, however).
Your textbook description:
the deleted node in a BST should be replaced by the next node reached by performing an in-order traversal from the node which should be 91
That description applies to unbalanced BST (binary search trees) but does not apply to most of the various kinds of balanced binary trees, and also does not apply to Splay Trees. To delete a node in a splay tree do the following:
Splay the node to be deleted to the root and dispose of it. This leaves two trees, call the left tree A and the right tree B.
The new root of the recombined tree will come from A. Splay the largest (rightmost) node of A tree to its root.
Because A's new root has the greatest key in A, it has no right child. Set the right child of A's new root to B.
A is the new combined tree.
This is what the visualization at https://www.cs.usfca.edu/%7Egalles/visualization/SplayTree.html did.
You said in comments to the other answer:
So in practice, the node that you choose to replace the deleted node doesn't really matter, i.e. affect performance etc.
In the typical splay tree deletion algorithm the node to replace will be the predecessor or successor node, in key order.
The rule of thumb is to always splay whenever a specific node is accessed. Find the node to delete, then splay it to the root. Find its predecessor, then splay it to the root. There are variations where you can splay less aggressively, too.
I'm implementing a Digital Search Tree's 'delete' function.
What tree traversal algorithms should I use to find a leaf node that is a child of the deleted node?
The leaf not doesn't have to be the closest one or anything. Just a leaf.
Thank you
Can someone help me out with an algorithm to traverse a binary tree iteratively without using any other data structure like a stack
I read somewhere we can have a flag named visited for each node and turn in on if the node is visited but my BinaryTreeNode class does not have a visited variable defined. So I can not potentially do something like node.left.visited = false
Is there any other way to traverse iteratively?
One option would be to thread the binary tree.
Whenever some node points to NULL (be it left or right), make that node point to the node which comes next in its traversal (pre-order, post-order, etc). In this way, you can traverse the entire tree in one iteration.
Sample threaded binary tree:
Note that left node of each node points to the largest value smaller than it. And the right node of each node points to the smallest value larger than it. So this gives an in-order traversal.
I can't seem to figure out how the in-order traversal of a threaded binary tree is O(N)..
Because you have to descend the links to find the the leftmost child and then go back by the thread when you want to add the parent to the traversal path. would not that be O(N^2)?
Thanks!
The traversal of a tree (threaded or not) is O(N) because visiting any node, starting from its parent, is O(1). The visitation of a node consists of three fixed operations: descending to the node from parent, the visitation proper (spending time at the node), and then returning to the parent. O(1 * N) is O(N).
The ultimate way to look at it is that the tree is a graph, and the traversal crosses each edge in the graph only twice. And the number of edges is proportional to the number of nodes since there are no cycles or redundant edges (each node can be reached by one unique path). A tree with N nodes has exactly N-1 edges: each node has an edge leading to it from its parent node, except for the root node of the tree.
At times it appears as if visiting a node requires more than one descent. For instance, after visiting the rightmost node in a subtree, we have to pop back up numerous levels before we can march to the right into the next subtree. But we did not descend all the way down just to visit that node. Each one-level descent can be accounted for as being necessary for visiting just the node immediately below, and the opposite ascent's
cost is lumped with that. By visiting a node V, we also gain access to all the nodes below it, but all those nodes benefit from and share the edge traversal from V's parent down to V, and back up again.
This is related to amortized analysis, which applies in situations where we can globally understand the overall cost based on some general observation about the structure of the problem, but at the detailed level of the individual operations, the costs are distributed in an uneven way that appears confusing.
Amortized analysis helps us understand that, for instance, N insertions into a hash table which resizes itself by growing exponentially are O(N). Most of the insertion operations are quick, but from time to time, we grow the table and process its contents. This is similar to how, from time to time during a tree traversal, we have to perform numerous consecutive ascents to climb out of a deep subtree.
The global observation about the hash table is that each item inserted into the table will move to a larger table on average about three times in three resize operations, and so each insertion can be regarded as "pre paying" for three re-insertions, which is a fixed cost. Of course, "older" items will be moved more times, but this is offset by "younger" entries that move fewer times, diluting the cost. And the global observation about the tree was already noted above: it has N-1 edges, each of which are traversed exactly twice during the traversal, so the visitation of each node "pays" for the double traversal of its respective edge. Because this is so easy to see, we don't actually have to formally apply amortized analysis to tree traversal.
Now suppose we performed an individual searches for each node (and the tree is a balanced search tree). Then the traversal would still not be O(N*N), but rather O(N log N). Suppose we have an ordered search tree which holds consecutive integers. If we increment over the integers and perform individual searches for each value, then each search is O(log N), and we end up doing N of these. In this situation, the edge traversals are no longer shared, so amortization does not apply. To reach some given node that we are searching for which is found at depth D, we have to cross D edges twice, for the sake of that node and that node alone. The next search in the loop for another integer will be completely independent of the previous one.
It may also help you to think of a linked list, which can be regarded as a very unbalanced tree. To visit all the items in a linked list of length N and return back to the head node is obviously O(N). Searching for each item individually is O(N*N), but in a traversal, we are not searching for each node individually, but using each predecessor as a springboard into finding the next node.
There is no loop to find the parent. Otherwise said, you are going through each arc between two node twice. That would be 2*number of arc = 2*(number of node -1) which is O(N).
Problem: I have a binary tree, all leaves are numbered (from left to right, starting from 0) and no connection exists between them.
I want an algorithm that, given two indices (of 2 distinct leaves), visits the tree starting from the greater leaf (the one with the higher index) and gets to the lower one.
The internal nodes of the tree do not contain any useful information.
I should chose the path based only on the leaves indices. The path start from a leaf and terminates on a leaf, and of course I can access a leaf if I know its index (through an array of pointers)
The tree is static, no insertion or deletion of nodes is allowed.
I have developed an algorithm to do it but it really sucks... any ideas?
One option would be to find the least common ancestor of the two nodes, along with the sequence of nodes you should take from each node to get to that ancestor. Here's a sketch of the algorithm:
Starting from each node, walk back up to that node's parent until you reach the root. Count the number of nodes on the path from each node to the root. Let the height of the first node be h1 and the height of the second node be h2.
Let h = min(h1, h2). This is the height of the higher of the two nodes.
Starting from each node, keep following the node's parent pointer until both nodes are at height h. Record the nodes you followed during this step. At this point, both nodes are at the same height.
Until you find a common node, keep marching upwards from each node to its parent. Eventually you will hit their common ancestor. At this point, follow the path from the first node up to this ancestor, then down the path from the ancestor down to the second node.
In the worst case, this takes O(h) time and O(h) space, where h is the height of the tree. For a balanced binary tree is this O(lg n) time and space, which is quite good.
If you're interested in a Much More Hardcore version of this algorithm, consider looking into Tarjan's Least Common Ancestors algorithm, which with linear preprocessing time, can be used to find the least common ancestor much more rapidly than this.
Hope this helps!
Distance between any two nodes can be calculated with the help of lowest common ancestor:
Dist(n1, n2) = Dist(root, n1) + Dist(root, n2) - 2*Dist(root, lca)
where lca is lowest common ancestor.
see this for more help about this algorithm and see this video for learning how to calculate lca.