Idiomatic Traversal of a Binary Tree (Perhaps Any Tree) - algorithm

A Doubly Linked List enables idiomatic traversal of a Linked List, and I thought: why not for a Binary Tree? Traditionally, Binary Trees, or Trees in general, are unidirectional, which implies that, given a large tree with a sufficient number of nodes, the running time to find a leaf node can be costly.
If, after finding such a node, I could traverse the tree back toward the root to find the next one, would that not be advantageous compared to another depth-first search through every node of the tree? I had never considered this before, until realizing that the marriage of a Doubly Linked List and a Binary Tree could potentially add benefit.
For example, suppose I employed an inner class like this:
class Tree<T> {
    private class TwoWayNode {
        var data: T
        var left: TwoWayNode?      // left subtree
        var right: TwoWayNode?     // right subtree
        var previous: TwoWayNode?  // parent, for walking back toward the root

        init(data: T) {
            self.data = data
        }
    }
}
The use of left and right is as normal, to traverse the respective subtrees from each node, and previous holds a pointer to the parent node to enable idiomatic traversal. Would something like this work well, and what are some of the potential problems or pitfalls?

Given that you store a previous reference, you can walk to the leftmost node first. Upon arriving at a leaf node, you back up one level and traverse right.
You can always compare the current node, your "walker", with the child nodes, so you can check whether you went left or right the last time. This makes your traversal stateless, and you do not even need recursion; it is suitable for very large datasets.
Now, every time you have just left the right leaf, you back up one level again.
This algorithm is a Depth-First-Search.*
Making it faster:
Given that you could define some deterministic condition for the order of traversal, this can become quite flexible, and even be used in applications like ray tracing.
*: http://en.wikipedia.org/wiki/Depth-first_search
Bonus: This paper on traversal algorithms for kd-trees in ray tracing: Review: Kd-tree Traversal Algorithms for Ray Tracing (http://dcgi.felk.cvut.cz/home/havran/ARTICLES/cgf2011.pdf)
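To make the walk concrete, here is a minimal C++ sketch of the idea (illustrative only; the TwoWayNode layout mirrors the question's class, and the "visit" is just a print). It keeps a single "came from" pointer and compares it against the parent and children to decide the next step - no stack, no recursion, no visited flags:

#include <iostream>

struct TwoWayNode {
    int data;
    TwoWayNode* left;
    TwoWayNode* right;
    TwoWayNode* previous;   // parent pointer, as in the question
};

// Depth-first walk driven only by parent/child pointer comparisons.
void depthFirstWalk(TwoWayNode* root) {
    TwoWayNode* node = root;
    TwoWayNode* cameFrom = nullptr;            // the node we were at one step ago
    while (node) {
        if (cameFrom == node->previous) {      // arrived from above: visit, then go down
            std::cout << node->data << ' ';    // pre-order visit
            cameFrom = node;
            node = node->left  ? node->left
                 : node->right ? node->right
                 : node->previous;             // leaf: back up one level
        } else if (cameFrom == node->left) {   // finished the left subtree: try the right one
            cameFrom = node;
            node = node->right ? node->right : node->previous;
        } else {                               // finished the right subtree: keep climbing
            cameFrom = node;
            node = node->previous;
        }
    }
}

The root's previous pointer is assumed to be null, which is also the initial value of cameFrom, so the walk starts by visiting the root.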

Indeed, nodes of a binary tree are often implemented with pointers to the left and right child and the parent (see this implementation of red-black trees).
But you do not always need a parent pointer:
For an inorder traversal you can use a recursive algorithm, so that the call stack takes care of that for you.
If you want to access the min or max node, you can simply maintain an extra pointer to it.
Sometimes you can use a finger tree.
Or organize your pointers extra cleverly (see Self-adjusting binary search trees, page 666):
The left pointer of a node points to the first (left) child
The right pointer of a node points either to the sibling (if the node is a left child) or back to the parent (if it is a right child)
Extra cool: Threaded binary search trees for extra easy inorder (and reverse order) traversal without a stack - so O(1) space!
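For illustration, here is a small C++ sketch of what that threaded-tree traversal looks like. It is only a sketch, under the assumption that each node carries a rightIsThread flag and that a null thread marks the last node in order:

#include <iostream>

struct ThreadedNode {
    int value;
    ThreadedNode* left;
    ThreadedNode* right;   // real right child, or a thread to the in-order successor
    bool rightIsThread;    // true when 'right' is a thread (null on the last node)
};

// In-order traversal of a right-threaded tree: no stack, no recursion, O(1) space.
void inorder(ThreadedNode* root) {
    ThreadedNode* node = root;
    while (node && node->left) node = node->left;   // start at the leftmost node
    while (node) {
        std::cout << node->value << ' ';
        if (node->rightIsThread) {
            node = node->right;                     // thread jumps straight to the successor
        } else {
            node = node->right;                     // real child: descend, then go all the way left
            while (node && node->left) node = node->left;
        }
    }
}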

Related

How to check if two binary trees share a node

Given an array of binary trees, find whether any two trees share a node - not value-wise, but "pointer"-wise. At the bottom I provided an example.
My approach was to iterate through all the trees and store all the leaves (pointers) from each tree in a list, then check whether the list has any duplicates, but that's a rather slow approach. Is there perhaps a quicker way to solve this?
In the worst case you will have to traverse all nodes (all pointers) to find a shared node (pointer), as it might happen to be the last one visited. So the best time complexity we can expect is O(m+n), where m and n represent the number of nodes in either tree.
We can achieve this time complexity if we store the pointers from the first tree in a hash set and then traverse the pointers of the second tree to see whether any of them is in the set. Assuming that get/set operations on a hash set have an amortized constant time complexity, the overall time complexity will be O(m+n).
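A minimal C++ sketch of that approach (the Node struct and function names are illustrative, not from the question):

#include <unordered_set>

struct Node { int value; Node* left; Node* right; };

// Record every node pointer of the first tree.
void collect(const Node* n, std::unordered_set<const Node*>& seen) {
    if (!n) return;
    seen.insert(n);
    collect(n->left, seen);
    collect(n->right, seen);
}

// Scan the second tree and stop as soon as a recorded pointer is found.
bool containsAny(const Node* n, const std::unordered_set<const Node*>& seen) {
    if (!n) return false;
    return seen.count(n) > 0
        || containsAny(n->left, seen)
        || containsAny(n->right, seen);
}

// O(m + n) overall, assuming amortized O(1) hash-set operations.
bool shareNode(const Node* a, const Node* b) {
    std::unordered_set<const Node*> seen;
    collect(a, seen);
    return containsAny(b, seen);
}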
If the same program is responsible for constructing the trees, then a reuse of the same node can be detected upon insertion. For instance, reuse of the same node in multiple trees can be completely avoided by having the insert method of your tree only take a value as argument, never a node instance. The method will then encapsulate the actual creation of the node, guaranteeing its uniqueness.
An idea for O(#nodes) time and O(1) space. It does more traversal work than a simple traversal using a hash table, but it doesn't have the cost of the hash table. I don't know which is better; it might depend on the language.
For two trees
Create one extra node. Do a Morris traversal of the first tree. It only modifies right child pointers, so we can use left child pointers for marking nodes as seen. For every tree node without a left child, set our extra node as its left child. Whenever checking a left child pointer, treat our extra node like a null pointer, i.e., don't visit it. After the traversal, the tree structure is restored, and all originally left-child-less tree nodes now point to our extra node as their left child. That includes all leaf nodes.
Do a Morris traversal of the second tree. Again treat pointers to our extra node like null pointers. If we ever encounter our extra node, we know the trees share a node. If not, then we know the trees don't share a node, since if they did share any, they'd also share a leaf node (just go down from any shared node to a leaf node; that's also shared), and all leaf nodes of the first tree are marked. After the traversal, the second tree is restored.
Do a Morris traversal of the first tree again, this time removing our extra node, restoring the original null pointers.
For an array of more than two trees
Mark the first tree as above. Check the second tree as above. Mark the second tree. Check the third. Mark the third. Check the fourth. Mark the fourth. Etc. When you found a shared node or there are no more trees, unmark the marked trees.
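For readers unfamiliar with it, here is a plain Morris in-order traversal in C++ (a sketch; the node layout is assumed). The marking scheme above layers on top of this idea, which only ever rewrites right child pointers and restores them before it finishes:

#include <iostream>

struct Node { int value; Node* left; Node* right; };

void morrisInorder(Node* root) {
    Node* cur = root;
    while (cur) {
        if (!cur->left) {
            std::cout << cur->value << ' ';
            cur = cur->right;
        } else {
            // Find the in-order predecessor of cur within its left subtree.
            Node* pred = cur->left;
            while (pred->right && pred->right != cur) pred = pred->right;
            if (!pred->right) {
                pred->right = cur;        // plant a temporary thread back to cur
                cur = cur->left;
            } else {
                pred->right = nullptr;    // second visit: remove the thread, restoring the tree
                std::cout << cur->value << ' ';
                cur = cur->right;
            }
        }
    }
}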
Every shared node must have two parents, or an ancestor with two parents.
LOOP over nodes
    IF node has two parents
        MARK node as shared
        MARK all descendants as shared

Rope and self-balancing binary tree hybrid? (i.e. Sorted set with fast n-th element lookup)

Is there a data structure for a sorted set that allows quick lookup of the n-th (i.e. the n-th least) item? That is, something like a hybrid between a rope and a red-black tree.
Seems like it should be possible to either keep track of the size of the left subtree and update it through rotations, or do something else clever, and I'm hoping someone smart has already worked this out.
Seems like it should be possible to either keep track of the size of the left subtree and update it through rotations […]
Yes, this is quite possible; but instead of keeping track of the size of the left subtree, it's a bit simpler to keep track of the size of the complete subtree rooted at a given node. (You can then get the size of its left subtree by examining its left-child's size.) It's not as tricky as you might think, because you can always re-calculate a node's size as long as its children are up-to-date, so you don't need any extra bookkeeping beyond making sure that you recalculate sizes by working your way up the tree.
Note that, in most mutable red-black tree implementations, 'put' and 'delete' stop walking back up the tree once they've restored the invariants, whereas with this approach you need to walk all the way back up the tree in all cases. That'll be a small performance hit, but at least it's not hard to implement. (In purely functional red-black tree implementations, even that isn't a problem, because those always have to walk the full path back up to create the new parent nodes. So you can just put the size-calculation in the constructor — very simple.)
Edited in response to your comment:
I was vaguely hoping this data structure already had a name, so I could just find some implementations out there, and that there was something clever one could do to minimize the updating. But (while I can find plenty of papers on data structures that are variations of balanced binary trees) I can't figure out a good search term to look for papers that let one look up the nth least element.
The fancy term for the nth smallest value in a collection is order statistic; so a tree structure that enables fast lookup by order statistic is called an order statistic tree. That second link includes some references that may help you — not sure, I haven't looked at them — but regardless, that should give you some good search terms. :-)
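As a concrete illustration of the lookup such a size-augmented tree supports, here is a C++ sketch of select-by-rank (the names and Node layout are assumptions, not a particular library's API):

#include <cstddef>

struct Node {
    int value;
    std::size_t size;   // number of nodes in the subtree rooted here
    Node* left;
    Node* right;
};

std::size_t sizeOf(const Node* n) { return n ? n->size : 0; }

// Return the node holding the rank-th smallest value (0-indexed), or null.
const Node* select(const Node* root, std::size_t rank) {
    while (root) {
        std::size_t leftSize = sizeOf(root->left);
        if (rank < leftSize) {
            root = root->left;              // the answer lies in the left subtree
        } else if (rank == leftSize) {
            return root;                    // this node is exactly the rank-th
        } else {
            rank -= leftSize + 1;           // skip the left subtree and this node
            root = root->right;
        }
    }
    return nullptr;                         // rank out of range
}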
Yes, this is fully possible. Self-balancing tree algorithms do not actually need to be search trees; that is simply the typical presentation. The actual requirement is that nodes be ordered in some fashion (which a rope provides).
What is required is to update the tree weight on insert and erase. Rotations do not require a full update; a local one is enough. For example, a left rotate requires that the weight of the parent be added to the new parent (since that new parent is the old parent's right child, it is not necessary to walk down the new parent's right descent tree, since that was already the new parent's left descent tree). Similarly, for a right rotate it is necessary to subtract the weight of the new parent only, since the new parent's right descent tree will become the left descent tree of the old parent.
I suppose it would be possible to create an insert that updates the weight as it does rotations then adds the weight up any remaining ancestors but I didn't bother when I was solving this problem. I simply added the new node's weight all the way up the tree then did rotations as needed. Similarly for erase, I did the fix-up rotations then subtracted the weight of the node being removed before finally unhooking the node from the tree.
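One way to write the local fix-up during a rotation (a sketch; it simply recomputes the two affected sizes from the children rather than doing the add/subtract bookkeeping described above, which comes to the same thing):

#include <cstddef>

struct Node {
    int value;
    std::size_t size;   // subtree size maintained alongside the tree
    Node* left;
    Node* right;
};

std::size_t sizeOf(const Node* n) { return n ? n->size : 0; }

// Left rotation around x (x->right must exist). Only x and y change their
// subtrees, so only their sizes need to be refreshed.
Node* rotateLeft(Node* x) {
    Node* y = x->right;
    x->right = y->left;
    y->left = x;
    x->size = 1 + sizeOf(x->left) + sizeOf(x->right);
    y->size = 1 + sizeOf(y->left) + sizeOf(y->right);
    return y;   // y is the new root of this subtree
}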

Why nodes of a binary tree have links only from parent to children?

Why do nodes of a binary tree have links only from parent to children? I know that there are threaded binary trees, but those are harder to implement. A binary tree with links in both directions would allow iterative traversal without a stack or queue.
I do not know of any such design. If there is one please let me know.
Edit1: Let me conjure a problem for this. I want to do traversal without recursion and without using extra memory in the form of a stack or queue.
PS: I am afraid that I am going to get flak and downvotes for this stupid question.
Some binary trees do require children to keep up with their parent, or even their grandparent, e.g. Splay Trees. However this is only to balance or splay the tree. The reason we only traverse a tree from the parent to the children is because we are usually searching for a specific node, and as long as the binary tree is implemented such that all left children are less than the parent, and all right children are greater than the parent (or vice-versa), we only need links in one direction to find that node. We start the search at the root and then iterate down, and if the node is in the tree, we are guaranteed to find it. If we started at a leaf, there is no guarantee we would find the node we want by going back to the root. The reason we don't have links from the child to the parent is because it is unnecessary for searches. Hope this helps.
It can be done; however, we should consider the trade-off between memory usage and complexity.
Yes, you can traverse the binary tree with an extra link in each node, but then you are using the same amount of extra memory as you would for a traversal with a queue, which even runs faster.
What a binary search tree is good at is solving many search problems in O(log N). It is fast enough and memory-saving.
Let me conjure a problem for this. I want to do traversal without recursion and without using extra memory in the form of a stack or queue.
Have you considered that the parent pointers in the tree occupy space themselves?
They add O(N) memory to the tree to store parent pointers in order not to use O(log N) space during recursion.
What parent pointers allow us to do is to support an API whereby the caller can pass a pointer to a node and request an operation on it like "find the next node in order" (for example).
In this situation, we do not have a stack which holds the path to the root; we just receive a node "out of the blue" from the caller. With parent pointers, given a tree node, we can find its successor in amortized constant time O(1).
Implementations which don't require this functionality can save space by not including the parent pointers in the tree, and using recursion or an explicit stack structure for the root to leaf traversals.
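A sketch of the kind of "next node in order" operation meant here, in C++ (a node layout with a parent pointer is assumed):

struct Node { int value; Node* left; Node* right; Node* parent; };

// In-order successor using only parent pointers; amortized O(1) over a full scan.
Node* inorderSuccessor(Node* node) {
    if (node->right) {
        node = node->right;                 // successor is the leftmost node of the right subtree
        while (node->left) node = node->left;
        return node;
    }
    Node* p = node->parent;                 // otherwise climb until we arrive from a left child
    while (p && node == p->right) {
        node = p;
        p = p->parent;
    }
    return p;                               // null if 'node' was the maximum
}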

Find a loop in a binary tree

How do you find a loop in a binary tree? I am looking for a solution other than marking the visited nodes as visited or doing address hashing. Any ideas?
Suppose you have a binary tree but you don't trust it and you think it might be a graph; the general case will dictate that you remember the visited nodes. It is somewhat the same as the algorithm for constructing a minimum spanning tree from a graph, and this means the space and time complexity will be an issue.
Another approach would be to consider the data you store in the tree. Suppose you have numbers or hashes, so you can compare them.
A pseudocode would test for these conditions:
1. Every node must have at most 2 children and 1 parent (at most 3 connections). More than 3 connections => not a binary tree.
2. The parent must not also be a child.
3. If a node has two children, then the left child has a smaller value than the parent and the right child has a bigger value. Considering this, if a leaf or inner node has as a child some node on a higher level (like its parent's parent), you can determine a loop based on the values. If a child is a right child, then its value must be bigger than its parent's, but if that child forms a loop, it means it comes from the left part or the right part of the parent.
3.a. So if it comes from the left part, then its value is smaller than its sibling's, so => not a binary tree. The idea is much the same for the other part.
Testing aside, in what form is the tree that you want to test? Remember that every node has a pointer to its parent, and this pointer points to a single parent. So, depending on the format your tree is in, you can take advantage of this.
As mentioned already: A tree does not (by definition) contain cycles (loops).
To test whether your directed graph contains cycles (references to nodes already added to the tree), you can iterate through the tree and add each node to a visited list (or its hash, if you prefer) and check each new node against that list.
Plenty of algorithms for cycle-detection in graphs are just a google-search away.
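A sketch of that visited-set check in C++ (node layout assumed); it reports a cycle, or a node reachable along two different paths, as soon as any pointer is seen twice:

#include <stack>
#include <unordered_set>

struct Node { int value; Node* left; Node* right; };

bool hasCycle(Node* root) {
    std::unordered_set<const Node*> seen;
    std::stack<Node*> todo;
    if (root) todo.push(root);
    while (!todo.empty()) {
        Node* n = todo.top();
        todo.pop();
        if (!seen.insert(n).second) return true;   // pointer already visited
        if (n->left)  todo.push(n->left);
        if (n->right) todo.push(n->right);
    }
    return false;
}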

Stackless pre-order traversal in a binary tree

Is it possible to perform iterative *pre-order* traversal on a binary tree without using node-stacks or "visited" flags?
As far as I know, such approaches usually require the nodes in the tree to have pointers to their parents. Now, to be sure, I know how to perform pre-order traversal using parent pointers and visited flags, thus eliminating any requirement of stacks of nodes for iterative traversal.
But, I was wondering if visited-flags are really necessary. They would occupy a lot of memory if the tree has a lot of nodes. Also, having them would not make much sense if many pre-order tree traversals of a binary-tree are going on simultaneously in parallel.
If it is possible to perform this, some pseudo-code or better a short C++ code sample would be really useful.
EDIT: I specifically do not want to use recursion for pre-order traversal. The context for my question is that I have an octree (which is like a binary tree) which I have constructed on the GPU. I want to launch many threads, each of which does a tree-traversal independently and in parallel.
Firstly, CUDA does not support recursion.
Secondly, the concept of visited flags applies only to a single traversal. Since many traversals are going on simultaneously, having a visited-flags field in the node data structure is of no use. They would make sense only on the CPU, where all independent tree traversals are (or can be) serialised. To be more specific, after every tree traversal we can reset the visited flags to false before performing another pre-order tree traversal.
You can use this algorithm, which only needs parent pointers and no additional storage:
For an inner node, the next node in a pre-order traversal is its leftmost child.
For a leaf node: Keep going upwards in the tree until you are coming from the left child of a node with two children. That node's right child will then be the next node to traverse.
function nextNode(node):
    # inner node: return leftmost child
    if node.left != null:
        return node.left
    if node.right != null:
        return node.right
    # leaf node
    while node.parent != null:
        if node == node.parent.left and node.parent.right != null:
            return node.parent.right
        node = node.parent
    return null  # no more nodes
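Since the question asks for C++, here is a short rendering of the same idea together with the driving loop (a sketch; the Node layout with a parent pointer is assumed):

#include <iostream>

struct Node {
    int value;
    Node* left   = nullptr;
    Node* right  = nullptr;
    Node* parent = nullptr;
};

// Pre-order successor using only parent pointers, as in the pseudocode above.
Node* nextNode(Node* node) {
    if (node->left)  return node->left;
    if (node->right) return node->right;
    while (node->parent) {
        if (node == node->parent->left && node->parent->right)
            return node->parent->right;
        node = node->parent;
    }
    return nullptr;   // traversal finished
}

// Full pre-order traversal: start at the root and keep asking for the successor.
void preorder(Node* root) {
    for (Node* n = root; n != nullptr; n = nextNode(n))
        std::cout << n->value << ' ';
}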
You can give each leaf node a pointer to the node that would come next according to a pre-order traversal.
For example, given the binary tree:
    A
   / \
  B   C
 / \
D   E
     \
      F
D would need to store a pointer to E, and F would need to store a pointer to C. Then you can simply traverse the tree iteratively as if it were a linked list.
You can do it with no extra storage by storing the same pointer in both the left and right child fields of a leaf node. Since such a structure is not allowed in a tree (it would make it a DAG), you can safely infer that any node whose "child" pointers both point to the same place is a leaf node.
You could add a single bit at each node signifying whether the first sub-branch addition went leftward or rightward... Then, iterating through the tree allows choosing the original direction at every branch.
If you insist on doing this, you could number every possible path through the tree, and then set each worker to follow that path.
Your numbering scheme can simply be that each zero-bit means take the left child, and each one-bit means take the right child. To execute a depth-first search, process your number from least-significant bit to most-significant.
While it is not necessary to know the depth of the tree in advance, if you don't, you will need to handle the case where all further numbers hit a leaf before the number is fully consumed.
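A sketch of how one worker might follow its assigned number, per the scheme above (illustrative C++; the names are made up): bit i of the number decides the direction of step i, and the walk stops early if it runs out of tree.

struct Node { int value; Node* left; Node* right; };

// Follow a root-to-node path encoded in 'path', least-significant bit first
// (0 = take the left child, 1 = take the right child), for at most maxDepth steps.
Node* followPath(Node* root, unsigned path, int maxDepth) {
    Node* n = root;
    for (int i = 0; i < maxDepth && n; ++i) {
        Node* next = ((path >> i) & 1u) ? n->right : n->left;
        if (!next) break;    // reached a leaf before the number was fully consumed
        n = next;
    }
    return n;
}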
There is a hack using the absolute values of the {->left,->right} pointers to encode one bit per node. It needs a first pass to get the initial pointer-"polarity" right.
It seems to be called DSW.
You can find more in this usenet thread: https://groups.google.com/group/comp.programming/browse_thread/thread/3552ea0af2006b28/6323076923faec26?hl=nl&q=tree+transversal&lnk=nl&
I don't know if it can be expanded to quad-trees or oct-trees, and I seriously doubt if it can be extended to multithreaded access. Adding a parent pointer is probably easier...
One direction you might want to consider is to delete the nodes of the tree as you traverse them and insert those nodes into a new tree. If you insert the nodes in pre-order, the new tree is going to be exactly the same. But the problem here is how you maintain the integrity of the original tree as you delete items.

Resources