Data structure for handling intervals - algorithm

I have got a series of time intervals (t_start, t_end) that cannot overlap, i.e. t_end(i) < t_start(i+1). I want to do the following operations:
1) Add new intervals (union) [ {(1,4),(8,10)} U (3,7) = {(1,7),(8,10)} ]
2) Take intervals out (subtraction) [ (1,7) - (3,5) = {(1,3),(5,7)} ]
3) Check whether a point or an interval overlaps with an interval in my series (intersection)
4) Find the first "non-interval" of a minimum length after some point [ in {(1,4),(7,8)} there is a "non-interval" of length 3 between 4 and 7 ].
I want to know good ways of implementing this, with low complexities (O(log n) for all operations would do it).
Related question: Data structure for quick time interval look up

It sounds like you could just use a balanced binary tree of all the boundary times.
For example, represent {(1,4), (8,10), (12,15)} as a tree containing 1, 4, 8, 10, 12, and 15.
Each node needs to say whether it's the start or end of an interval. So:
            8 (start)
           /         \
    1 (start)         12 (start)
           \          /         \
        4 (end)   10 (end)    15 (end)
(Here all the "end" nodes ended up at the bottom by coincidence.)
Then I think you can have all your operations in O(log n) time. To add an interval:
Find the start time. If it's already in the tree as a start time, you can leave it there. If it's already in the tree as an end time, you'll want to remove it. If it's not in the tree and it doesn't fall during an existing interval, you'll want to add it. Otherwise you don't want to add it.
Find the stop time, using the same method to find out if you need to add it, remove it, or neither.
Now you just want to add or remove the abovementioned start and stop nodes and, at the same time, delete all the existing nodes in between. To do this you only need to rebuild the tree nodes at or directly above those two places in the tree. If the height of the tree is O(log n), which you can guarantee by using a balanced tree, this takes O(log n) time.
(Disclaimer: If you're in C++ and doing explicit memory management, you might end up freeing more than O(log n) pieces of memory as you do this, but really the time it takes to free a node should be billed to whoever added it, I think.)
Removing an interval is largely the same.
Checking a point or interval is straightforward.
Finding the first gap of at least a given size after a given time can be done in O(log n) too, if you also cache two more pieces of information per node:
In each start node (other than the leftmost), the size of the gap immediately to the left.
In every node, the size of the largest gap that appears in that subtree.
To find the first gap of a given size that appears after a given time, first find that time in the tree. Then walk up until you reach a node that claims to contain a large enough gap. If you came up from the right, you know this gap is to the left, so you ignore it and keep walking up. Otherwise you came from the left. If the node is a start node, check to see if the gap to its left is large enough. If so, you're done. Otherwise, the large-enough gap must be somewhere to the right. Walk down to the right and continue down until you find the gap. Again, because the height of the tree is O(log n), walking it three times (down, up, and possibly down again) is O(log n).
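A minimal sketch of this boundary-tree idea, built on Java's TreeMap (a red-black tree) and keyed on interval starts rather than on separate start/end nodes (the class and method names are illustrative, not part of the answer). Union and point lookup are O(log n) amortized over absorbed intervals; the gap search shown here is a plain scan, since making it O(log n) needs the cached gap sizes described above.
import java.util.Map;
import java.util.TreeMap;

class IntervalSet {
    // Disjoint half-open intervals [start, end), keyed by start.
    private final TreeMap<Integer, Integer> ivals = new TreeMap<>();

    // Union with [s, e): merge with any interval that overlaps or touches it.
    void add(int s, int e) {
        if (s >= e) return;
        Map.Entry<Integer, Integer> prev = ivals.floorEntry(s);
        if (prev != null && prev.getValue() >= s) {      // extends an earlier interval
            s = prev.getKey();
            e = Math.max(e, prev.getValue());
        }
        // Absorb every interval starting inside [s, e].
        for (Map.Entry<Integer, Integer> nxt = ivals.ceilingEntry(s);
             nxt != null && nxt.getKey() <= e;
             nxt = ivals.ceilingEntry(s)) {
            e = Math.max(e, nxt.getValue());
            ivals.remove(nxt.getKey());
        }
        ivals.put(s, e);
    }

    // Point query: is p covered by some interval?
    boolean contains(int p) {
        Map.Entry<Integer, Integer> prev = ivals.floorEntry(p);
        return prev != null && p < prev.getValue();
    }

    // First gap of length >= len starting at or after 'from' (linear scan).
    int firstGap(int from, int len) {
        int cursor = from;
        Map.Entry<Integer, Integer> prev = ivals.floorEntry(from);
        if (prev != null && prev.getValue() > cursor) cursor = prev.getValue();
        for (Map.Entry<Integer, Integer> iv : ivals.tailMap(cursor).entrySet()) {
            if (iv.getKey() - cursor >= len) return cursor;
            cursor = Math.max(cursor, iv.getValue());
        }
        return cursor;   // the gap after the last interval is unbounded
    }
}
On the question's example, add(1,4); add(8,10); add(3,7) leaves {[1,7), [8,10)}.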

Without knowing any more specifics, I'd suggest reading about interval trees. Interval trees are a special one-dimensional case of the more generic kd-trees; they have O(n log n) construction time and O(log n) typical operation times. Exact algorithm implementations you'd need to find yourself, but you can start by looking at CGAL.

I know you've already accepted an answer, but since you indicated that you will probably be implementing in C++, you could also have a look at Boost's Interval Container Library (http://www.boost.org/doc/libs/1_46_1/libs/icl/doc/html/index.html).

My interval tree implementation using an AVL tree.
public class IntervalTreeAVL<T>{
    private static class TreeNode<T>{
        private T low;
        private T high;
        private TreeNode<T> left;
        private TreeNode<T> right;
        private T max;
        private int height;
        private TreeNode(T l, T h){
            this.low=l;
            this.high=h;
            this.max=high;
            this.height=1;
        }
    }

    private TreeNode<T> root;

    public void insert(T l, T h){
        root=insert(root, l, h);
    }

    private TreeNode<T> insert(TreeNode<T> node, T l, T h){
        if(node==null){
            return new TreeNode<T>(l, h);
        }
        else{
            int k=((Comparable)node.low).compareTo(l);
            if(k>0){
                node.left=insert(node.left, l, h);
            }
            else{
                node.right=insert(node.right, l, h);
            }
            node.height=Math.max(height(node.left), height(node.right))+1;
            node.max=findMax(node);
            int hd = heightDiff(node);
            if(hd<-1){
                int kk=heightDiff(node.right);
                if(kk>0){
                    node.right=rightRotate(node.right);
                    return leftRotate(node);
                }
                else{
                    return leftRotate(node);
                }
            }
            else if(hd>1){
                if(heightDiff(node.left)<0){
                    node.left = leftRotate(node.left);
                    return rightRotate(node);
                }
                else{
                    return rightRotate(node);
                }
            }
        }
        return node;
    }

    private TreeNode<T> leftRotate(TreeNode<T> n){
        TreeNode<T> r = n.right;
        n.right = r.left;
        r.left=n;
        n.height=Math.max(height(n.left), height(n.right))+1;
        r.height=Math.max(height(r.left), height(r.right))+1;
        n.max=findMax(n);
        r.max=findMax(r);
        return r;
    }

    private TreeNode<T> rightRotate(TreeNode<T> n){
        TreeNode<T> r = n.left;
        n.left = r.right;
        r.right=n;
        n.height=Math.max(height(n.left), height(n.right))+1;
        r.height=Math.max(height(r.left), height(r.right))+1;
        n.max=findMax(n);
        r.max=findMax(r);
        return r;
    }

    private int heightDiff(TreeNode<T> a){
        if(a==null){
            return 0;
        }
        return height(a.left)-height(a.right);
    }

    private int height(TreeNode<T> a){
        if(a==null){
            return 0;
        }
        return a.height;
    }

    private T findMax(TreeNode<T> n){
        if(n.left==null && n.right==null){
            return n.max;
        }
        if(n.left==null){
            if(((Comparable)n.right.max).compareTo(n.max)>0){
                return n.right.max;
            }
            else{
                return n.max;
            }
        }
        if(n.right==null){
            if(((Comparable)n.left.max).compareTo(n.max)>0){
                return n.left.max;
            }
            else{
                return n.max;
            }
        }
        Comparable c1 = (Comparable)n.left.max;
        Comparable c2 = (Comparable)n.right.max;
        Comparable c3 = (Comparable)n.max;
        T max=null;
        if(c1.compareTo(c2)<0){
            max=n.right.max;
        }
        else{
            max=n.left.max;
        }
        if(c3.compareTo((Comparable)max)>0){
            max=n.max;
        }
        return max;
    }
    // Search for any stored interval that contains the point t1.
    TreeNode intervalSearch(T t1){
        TreeNode<T> t = root;
        while(t!=null && !isInside(t, t1)){
            if(t.left!=null){
                // descend left only if the left subtree's max endpoint can still reach t1
                if(((Comparable)t.left.max).compareTo(t1)>=0){
                    t=t.left;
                }
                else{
                    t=t.right;
                }
            }
            else{
                t=t.right;
            }
        }
        return t;
    }
    private boolean isInside(TreeNode<T> node, T t){
        Comparable cLow=(Comparable)node.low;
        Comparable cHigh=(Comparable)node.high;
        int i = cLow.compareTo(t);
        int j = cHigh.compareTo(t);
        if(i<=0 && j>=0){
            return true;
        }
        return false;
    }
}

I've just found Guava's Range and RangeSet which do exactly that.
It implements all the operations cited:
Union
RangeSet<Integer> intervals = TreeRangeSet.create();
intervals.add(Range.closedOpen(1,4)); // stores {[1,4)}
intervals.add(Range.closedOpen(8,10)); // stores {[1,4), [8,10)}
// Now unite 3,7
intervals.add(Range.closedOpen(3,7)); // stores {[1,7), [8,10)}
Subtraction
intervals.remove(Range.closedOpen(3,5)); //stores {[1,3), [5, 7), [8, 10)}
Intersection
intervals.contains(3); // returns false
intervals.contains(5); // returns true
intervals.encloses(Range.closedOpen(2,4)); //returns false
intervals.subRangeSet(Range.closedOpen(2,4)); // returns {[2,3)} (isEmpty returns false)
intervals.subRangeSet(Range.closedOpen(3,5)).isEmpty(); // returns true
Finding empty spaces (this will be the same complexity as a set iteration in the worst case):
Range<Integer> freeSpace(RangeSet<Integer> ranges, int size) {
    RangeSet<Integer> frees = ranges.complement().subRangeSet(Range.atLeast(0));
    for (Range<Integer> free : frees.asRanges()) {
        if (!free.hasUpperBound()) {
            return free;
        }
        if (free.upperEndpoint() - free.lowerEndpoint() >= size) {
            return free;
        }
    }
    return null; // no gap of the requested size
}

Related

How to implement range search in KD-Tree

I have built a d-dimensional KD-Tree. I want to do a range search on this tree. Wikipedia mentions range search in KD-Trees, but doesn't talk about the implementation/algorithm in any way. Can someone please help me with this? If not for arbitrary d, any help for at least d = 2 and d = 3 would be great. Thanks!
There are multiple variants of kd-trees. The one I used had the following specs:
Each internal node has at most two children.
Each leaf node can hold at most maxCapacity points.
No internal node stores any points.
Side note: there are also versions where each node (irrespective of whether it's internal or a leaf) stores exactly one point. The algorithm below can be tweaked for those too; it's mainly buildTree where the key difference lies.
I wrote an algorithm for this some two years back, thanks to the resource pointed to by #9mat.
Suppose the task is to find the number of points which lie in a given hyper-rectangle ("d" dimensions). The task could also be to list all points, or all points which lie in the given range and satisfy some other criteria, etc., but that would be a straightforward change to my code.
Define a base node class as:
template <typename T>
class kdNode{
public:
    kdNode(){}
    virtual long rangeQuery(const T* q_min, const T* q_max) const{ return 0; }
};
Then, an internal node (non-leaf node) can look like this:
template <typename T>
class internalNode : public kdNode<T>{
    const kdNode<T> *left = nullptr, *right = nullptr; // left and right sub trees
    int axis;  // the axis on which the split of points is done
    T value;   // the value based on which points are split
public:
    internalNode(){}
    void buildTree(...){
        // builds the tree recursively
    }
    // returns the number of points in this sub tree that lie inside the hyper rectangle formed by q_min and q_max
    long rangeQuery(const T* q_min, const T* q_max) const{
        // num of points that satisfy range query conditions
        long rangeCount = 0;
        // check the left subtree
        if(q_min[axis] <= value) {
            rangeCount += left->rangeQuery(q_min, q_max);
        }
        // check the right subtree
        if(q_max[axis] >= value) {
            rangeCount += right->rangeQuery(q_min, q_max);
        }
        return rangeCount;
    }
};
Finally, the leaf node would look like:
// array is std::array; d and maxCapacity are assumed to be compile-time constants defined elsewhere
template <typename T>
class leaf : public kdNode<T>{
    // maxCapacity is a hyper-param, the max num of points you allow a node to hold
    array<T, d> points[maxCapacity];
    int keyCount = 0; // the actual num of points in this leaf (keyCount <= maxCapacity)
public:
    leaf(){}
    void addPoint(const T* p){
        // add a point p to the leaf node
    }
    // check if points[index] lies inside the hyper rectangle formed by q_min and q_max
    inline bool containsPoint(const int index, const T* q_min, const T* q_max) const{
        for (int i=0; i<d; i++) {
            if (points[index][i] > q_max[i] || points[index][i] < q_min[i]) {
                return false;
            }
        }
        return true;
    }
    // returns the number of points in this leaf node that lie inside the hyper rectangle formed by q_min and q_max
    long rangeQuery(const T* q_min, const T* q_max) const{
        // num of points that satisfy range query conditions
        long rangeCount = 0;
        for(int i=0; i < this->keyCount; i++) {
            if(containsPoint(i, q_min, q_max)) {
                rangeCount++;
            }
        }
        return rangeCount;
    }
};
In the code for the range query inside the leaf node, it is also possible to do a binary search instead of a linear search. Since the points will be sorted along the split axis, you can do a binary search to find the positions l and r using q_min and q_max, and then do a linear search from l to r instead of 0 to keyCount-1 (of course in the worst case it won't help, but practically, especially if maxCapacity is fairly large, it may).
This is my solution for a KD-tree where each node stores a point (so not just the leaves). (Note that adapting it so that points are stored only in the leaves is really easy.)
I leave some of the optimizations out and explain them at the end, to reduce the complexity of the solution.
The get_range function has varargs at the end, and can be called like,
x1, y1, x2, y2 or
x1, y1, z1, x2, y2, z2 etc., where first the low values of the range are given and then the high values.
(You can use as many dimensions as you like).
static public <T> void get_range(K_D_Tree<T> tree, List<T> result, float... range) {
    if (tree.root == null) return;
    float[] node_region = new float[tree.DIMENSIONS * 2];
    for (int i = 0; i < tree.DIMENSIONS; i++) {
        node_region[i] = -Float.MAX_VALUE;
        node_region[i + tree.DIMENSIONS] = Float.MAX_VALUE;
    }
    _get_range(tree, result, tree.root, node_region, 0, range);
}
The node_region represents the region covered by the node; we start as large as possible, since for all we know this could be the region we are dealing with.
Here is the recursive _get_range implementation:
static public <T> void _get_range(K_D_Tree<T> tree, List<T> result, K_D_Tree_Node<T> node, float[] node_region, int dimension, float[] target_region) {
    if (dimension == tree.DIMENSIONS) dimension = 0;
    if (_contains_region(tree, node_region, target_region)) {
        _add_whole_branch(node, result);
    }
    else {
        float value = _value(tree, dimension, node);
        if (node.left != null) {
            float[] node_region_left = new float[tree.DIMENSIONS * 2];
            System.arraycopy(node_region, 0, node_region_left, 0, node_region.length);
            node_region_left[dimension + tree.DIMENSIONS] = value;
            if (_intersects_region(tree, node_region_left, target_region)){
                _get_range(tree, result, node.left, node_region_left, dimension + 1, target_region);
            }
        }
        if (node.right != null) {
            float[] node_region_right = new float[tree.DIMENSIONS * 2];
            System.arraycopy(node_region, 0, node_region_right, 0, node_region.length);
            node_region_right[dimension] = value;
            if (_intersects_region(tree, node_region_right, target_region)){
                _get_range(tree, result, node.right, node_region_right, dimension + 1, target_region);
            }
        }
        if (_region_contains_node(tree, target_region, node)) {
            result.add(node.point);
        }
    }
}
One important thing that the other answer does not provide is this part:
if (_contains_region(tree, node_region, target_region)) {
    _add_whole_branch(node, result);
}
With a range search in a KD-Tree there are three possibilities for a node's region; it is:
fully outside
intersecting
fully contained
Once you know a region is fully contained, then you can add the whole branch without doing any dimension checks.
To make it more clear, here is the _add_whole_branch:
static public <T> void _add_whole_branch(K_D_Tree_Node<T> node, List<T> result) {
    result.add(node.point);
    if (node.left != null) _add_whole_branch(node.left, result);
    if (node.right != null) _add_whole_branch(node.right, result);
}
In this image, all the big white dots were added using _add_whole_branch and only for the red dots did a check on all dimensions have to be done.
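The region-test helpers referenced above (_contains_region, _intersects_region and _region_contains_node) aren't shown in the answer. As a sketch only, assuming the region arrays hold the DIMENSIONS low bounds followed by the DIMENSIONS high bounds (as in get_range above) and that _value(tree, axis, node) yields a node's coordinate on an axis, they might look like this:
static public <T> boolean _contains_region(K_D_Tree<T> tree, float[] node_region, float[] target_region) {
    // true when the query (target) region fully contains the node's region
    for (int i = 0; i < tree.DIMENSIONS; i++) {
        if (node_region[i] < target_region[i]) return false;
        if (node_region[i + tree.DIMENSIONS] > target_region[i + tree.DIMENSIONS]) return false;
    }
    return true;
}

static public <T> boolean _intersects_region(K_D_Tree<T> tree, float[] a, float[] b) {
    // true when the two axis-aligned regions overlap on every axis
    for (int i = 0; i < tree.DIMENSIONS; i++) {
        if (a[i + tree.DIMENSIONS] < b[i] || b[i + tree.DIMENSIONS] < a[i]) return false;
    }
    return true;
}

static public <T> boolean _region_contains_node(K_D_Tree<T> tree, float[] region, K_D_Tree_Node<T> node) {
    // true when the node's point lies inside the query region on every axis
    for (int i = 0; i < tree.DIMENSIONS; i++) {
        float v = _value(tree, i, node);   // same coordinate accessor assumed by _get_range
        if (v < region[i] || v > region[i + tree.DIMENSIONS]) return false;
    }
    return true;
}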
Optimization
1)
Instead of starting the _get_range function at the root node, you can find the split node first. This is the first node whose point lies within the query range. To find the split node you still start at the root node, but the calculations are a bit cheaper (because you only go either left or right until you reach it).
2)
Now I create the float[] node_region_left and float[] node_region_right, and since this happens in a recursive function it can lead to quite a few arrays. However, you can reuse the array created for the left side for the right side as well. I didn't do that in this example for clarity reasons.
I can also imagine storing the region size in the node, but this takes quite a bit more memory and might lead to a lot of cache misses.

Last remaining number

I was asked this question in an interview.
Given an array 'arr' of positive integers and a starting index 'k' of the array. Delete the element at k and jump arr[k] steps in the array in circular fashion. Do this repeatedly until only one element remains. Find the last remaining element.
I thought of an O(n log n) solution using an ordered map. Is any O(n) solution possible?
My guess is that there is not an O(n) solution to this problem based on the fact that it seems to involve doing something that is impossible. The obvious thing you would need to solve this problem in linear time is a data structure like an array that exposes two operations on an ordered collection of values:
O(1) order-preserving deletes from the data structure.
O(1) lookups of the nth undeleted item in the data structure.
However, such a data structure has been formally proven to not exist; see "Optimal Algorithms for List Indexing and Subset Rank" and its citations. It is not a proof to say that if the natural way to solve some problem involves using a data structure that is impossible, the problem itself is probably impossible, but such an intuition is often correct.
Anyway there are lots of ways to do this in O(n log n). Below is an implementation of maintaining a tree of undeleted ranges in the array. GetIndex() below returns an index into the original array given a zero-based index into the array if items had been deleted from it. Such a tree is not self-balancing so will have O(n) operations in the worst case but in the average case Delete and GetIndex will be O(log n).
using System;

namespace CircleGame
{
    class Program
    {
        class ArrayDeletes
        {
            private class UndeletedRange
            {
                private int _size;
                private int _index;
                private UndeletedRange _left;
                private UndeletedRange _right;

                public UndeletedRange(int i, int sz)
                {
                    _index = i;
                    _size = sz;
                }

                public bool IsLeaf()
                {
                    return _left == null && _right == null;
                }

                public int Size()
                {
                    return _size;
                }

                public void Delete(int i)
                {
                    if (i >= _size)
                        throw new IndexOutOfRangeException();
                    if (!IsLeaf())
                    {
                        int left_range = _left._size;
                        if (i < left_range)
                            _left.Delete(i);
                        else
                            _right.Delete(i - left_range);
                        _size--;
                        return;
                    }
                    if (i == _size - 1)
                    {
                        _size--; // Can delete the last item in a range by decrementing its size
                        return;
                    }
                    if (i == 0) // Can delete the first item in a range by incrementing the index
                    {
                        _index++;
                        _size--;
                        return;
                    }
                    _left = new UndeletedRange(_index, i);
                    int right_index = i + 1;
                    _right = new UndeletedRange(_index + right_index, _size - right_index);
                    _size--;
                    _index = -1; // the index field of a non-leaf is no longer necessarily valid.
                }

                public int GetIndex(int i)
                {
                    if (i >= _size)
                        throw new IndexOutOfRangeException();
                    if (IsLeaf())
                        return _index + i;
                    int left_range = _left._size;
                    if (i < left_range)
                        return _left.GetIndex(i);
                    else
                        return _right.GetIndex(i - left_range);
                }
            }

            private UndeletedRange _root;

            public ArrayDeletes(int n)
            {
                _root = new UndeletedRange(0, n);
            }

            public void Delete(int i)
            {
                _root.Delete(i);
            }

            public int GetIndex(int indexRelativeToDeletes)
            {
                return _root.GetIndex(indexRelativeToDeletes);
            }

            public int Size()
            {
                return _root.Size();
            }
        }

        static int CircleGame(int[] array, int k)
        {
            var ary_deletes = new ArrayDeletes(array.Length);
            while (ary_deletes.Size() > 1)
            {
                int next_step = array[ary_deletes.GetIndex(k)];
                ary_deletes.Delete(k);
                k = (k + next_step - 1) % ary_deletes.Size();
            }
            return array[ary_deletes.GetIndex(0)];
        }

        static void Main(string[] args)
        {
            var array = new int[] { 5, 4, 3, 2, 1 };
            int last_remaining = CircleGame(array, 2); // third element, this call is zero-based...
        }
    }
}
Also note that if the values in the array are known to be bounded such that they are always less than some m less than n, there are lots of O(nm) algorithms -- for example, just using a circular linked list.
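A Java sketch of that O(nm) circular-list idea, with the circle kept in a plain next[] array (the method name is illustrative, not from the answer):
static int lastRemainingCircular(int[] arr, int k) {
    int n = arr.length;
    int[] next = new int[n];                 // next[i] = index that follows i in the circle
    for (int i = 0; i < n; i++) next[i] = (i + 1) % n;
    int prev = (k + n - 1) % n;              // predecessor of k in the circle
    for (int remaining = n; remaining > 1; remaining--) {
        int steps = arr[k];                  // value of the element being deleted
        next[prev] = next[k];                // unlink k from the circle
        for (int s = 1; s < steps; s++) prev = next[prev];   // walk steps-1 links...
        k = next[prev];                      // ...so next[prev] is arr[k] steps past the deleted slot
    }
    return arr[k];
}
Each deletion walks at most m links, which is where the O(nm) bound comes from.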
I couldn't think of an O(n) solution. However, we could have O(n log n) average time by using a treap or an augmented BST with a value in each node for the size of its subtree. The treap enables us to find and remove the kth entry in O(log n) average time.
For example, A = [1, 2, 3, 4] and k = 3 (as Sumit reminded me in the comments, use the array indexes as values in the tree since those are ordered):
       2(0.9)
      /      \
 1(0.81)   4(0.82)
           /
       3(0.76)
Find and remove 3rd element. Start at 2 with size = 2 (including the left subtree). Go right. Left subtree is size 1, which together makes 3, so we found the 3rd element. Remove:
       2(0.9)
      /      \
 1(0.81)   4(0.82)
Now we're starting on the third element in an array with n - 1 = 3 elements and looking for the 3rd element from there. We'll use zero-indexing to correlate with our modular arithmetic, so the third element in modulus 3 would be 2 and 2 + 3 = 5 mod 3 = 2, the second element. We find it immediately since the root with its left subtree is size 2. Remove:
   4(0.82)
   /
1(0.81)
Now we're starting on the second element in modulus 2, so 1, and we're adding 2. 3 mod 2 is 1. Removing the first element we are left with 4 as the last element.
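For reference, the selection step that walkthrough relies on, written against a size-augmented node (a minimal sketch, not a full treap; deletion and the size updates it needs are left out, and the node class here is illustrative):
class SizedNode {
    int key;
    int size = 1;          // number of nodes in this subtree
    SizedNode left, right;
}

// Return the k-th smallest node (0-based) by comparing k with the left subtree's size.
static SizedNode select(SizedNode root, int k) {
    int leftSize = (root.left == null) ? 0 : root.left.size;
    if (k < leftSize) return select(root.left, k);
    if (k == leftSize) return root;                 // root itself is the k-th smallest
    return select(root.right, k - leftSize - 1);    // skip the left subtree and the root
}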

How to calculate a height of a tree

I am trying to learn DSA and got stuck on one problem.
How to calculate the height of a tree. I mean a normal tree, not any specific implementation of a tree like a BT or BST.
I have tried Google but it seems everyone is talking about binary trees and nothing is available for a normal tree.
Can anyone redirect me to some page or article about calculating the height of a tree?
Let's say a typical node in your tree is represented by this Java class:
class Node{
    Entry entry;
    ArrayList<Node> children;
    Node(Entry entry, ArrayList<Node> children){
        this.entry = entry;
        this.children = children;
    }
    ArrayList<Node> getChildren(){
        return children;
    }
}
Then a simple height function can be:
int getHeight(Node node){
    if(node == null){
        return 0;
    } else if(node.getChildren() == null){
        return 1;
    } else{
        int childrenMaxHeight = 0;
        for(Node n : node.getChildren()){
            childrenMaxHeight = Math.max(childrenMaxHeight, getHeight(n));
        }
        return 1 + childrenMaxHeight;
    }
}
Then you just need to call this function, passing the root of the tree as the argument. Since it traverses all the nodes exactly once, the run time is O(n).
1. If the height of a leaf node is considered 0 (i.e. height is measured by the number of edges on the longest path from the root to a leaf):
int maxHeight(treeNode<int>* root){
    if(root == NULL)
        return -1; // -1 because if a leaf node is 0 then a NULL node should be -1
    int h = -1;    // so a node with no children (a leaf) ends up returning 0
    for(size_t i = 0; i < root->childNodes.size(); i++){
        int temp = maxHeight(root->childNodes[i]);
        if(temp > h){
            h = temp;
        }
    }
    return h + 1;
}
2. If the height of the root node is considered 1 (i.e. height is measured by the number of nodes on the longest path):
int maxHeight(treeNode<int>* root){
    if(root == NULL)
        return 0;
    int h = 0;
    for(size_t i = 0; i < root->childNodes.size(); i++){
        int temp = maxHeight(root->childNodes[i]);
        if(temp > h){
            h = temp;
        }
    }
    return h + 1;
}
The above code is based upon the following class:
template <typename T>
class treeNode{
public:
    T data;
    vector<treeNode<T>*> childNodes; // vector for storing pointers to child tree nodes
    // constructor creating a tree node
    treeNode(T data){
        this->data = data;
    }
};
In the case of a 'normal' tree you can recursively calculate the height in a similar fashion to a binary tree, but here you have to consider all children at a node instead of just two.
To find a tree height, a BFS iteration will work fine.
Edited from Wikipedia:
Breadth-First-Search(Graph, root):
    create empty set S
    create empty queues Q1, Q2
    root.parent = NIL
    height = -1
    add root to S
    Q1.enqueue(root)
    while Q1 is not empty:
        height = height + 1
        switch Q1 and Q2
        while Q2 is not empty:
            current = Q2.dequeue()
            for each node n that is adjacent to current:
                if n is not in S:
                    add n to S
                    n.parent = current
                    Q1.enqueue(n)
You can see that adding another queue allows me to know which level of the tree I am currently on.
It iterates once per level, and over each node in that level.
This is an iterative way to do it (as opposed to recursive), so you don't have to worry about recursion depth either.
Run time is O(|V|+ |E|).
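For completeness, a small Java sketch of the same two-queue idea, reusing the Node class from the first answer; it counts levels, so a single root gives height 1 (subtract 1 if you count edges instead):
import java.util.ArrayDeque;

static int getHeightBFS(Node root) {
    if (root == null) return 0;
    ArrayDeque<Node> current = new ArrayDeque<>();
    ArrayDeque<Node> next = new ArrayDeque<>();
    current.add(root);
    int height = 0;
    while (!current.isEmpty()) {
        height++;                            // one more level discovered
        while (!current.isEmpty()) {
            Node n = current.poll();
            if (n.getChildren() != null) next.addAll(n.getChildren());
        }
        ArrayDeque<Node> tmp = current;      // swap the queues: the next level becomes the current one
        current = next;
        next = tmp;
    }
    return height;
}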

How to validate if a B-tree is sorted

I just had this as an interview question and was wondering if anyone knows the answer?
Write a method that validates whether a B-tree is correctly sorted. You do NOT need to validate whether
the tree is balanced. Use the following model for a node in the B-tree.
It was to be done in Java and use this model:
class Node {
    List<Integer> keys;
    List<Node> children;
}
One (space-inefficient but simple) way to do this is to do a generalized inorder traversal of the B-tree to get back the keys in what should be sorted order, then to check whether that sequence actually is in sorted order. Here's some quick code for this:
public static boolean isSorted(Node root) {
    ArrayList<Integer> values = new ArrayList<Integer>();
    performInorderTraversal(root, values);
    return isArraySorted(values);
}

private static void performInorderTraversal(Node root, ArrayList<Integer> result) {
    /* An empty tree has no values. */
    if (root == null) return;
    /* A leaf contributes just its keys, in order. */
    if (root.children == null || root.children.isEmpty()) {
        result.addAll(root.keys);
        return;
    }
    /* Process the first subtree here, then loop, processing the interleaved
     * keys and subtrees.
     */
    performInorderTraversal(root.children.get(0), result);
    for (int i = 1; i < root.children.size(); i++) {
        result.add(root.keys.get(i - 1));
        performInorderTraversal(root.children.get(i), result);
    }
}

private static boolean isArraySorted(ArrayList<Integer> array) {
    for (int i = 0; i < array.size() - 1; i++) {
        if (array.get(i) >= array.get(i + 1)) return false;
    }
    return true;
}
This takes time O(n) and uses space O(n), where n is the number of elements in the B-tree. You can cut the space usage down to O(h), where h is the height of the B-tree, by not storing all the elements in the traversal and instead just tracking the very last one, stopping the search early if the next-encountered value is not larger than the previous one. I didn't do that here because it takes more code, but conceptually it's not too hard.
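A sketch of that O(h)-space variant, threading the last key seen through the recursion and failing fast on the first violation (the helper names are mine, the Node class is the one from the question):
public static boolean isSortedStreaming(Node root) {
    return visit(root, new Integer[1]);        // holder[0] = last key seen, or null if none yet
}

private static boolean visit(Node root, Integer[] last) {
    if (root == null) return true;
    boolean leaf = root.children == null || root.children.isEmpty();
    int keyCount = root.keys.size();
    for (int i = 0; i <= keyCount; i++) {
        // visit the i-th child (if any), then check the key that follows it
        if (!leaf && !visit(root.children.get(i), last)) return false;
        if (i < keyCount) {
            int key = root.keys.get(i);
            if (last[0] != null && last[0] >= key) return false;
            last[0] = key;
        }
    }
    return true;
}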
Hope this helps!

how to find lowest common ancestor of a nary tree?

Is there a way to find the LCA of an n-ary tree without using extra space?
I did it using a string, saving the preorder of both nodes and finding the common prefix.
If nodes "know" their depth - or you're willing to allow the space to compute the depth of your nodes, you can back up from the lower node to the same depth of the higher node, and then go up one level at a time until they meet.
Depends on what "extra space" means in this context. You can do it with one integer - the difference in depths of the two nodes. Is that too much space?
Another possibility: given that you don't have a parent pointer, you can use pointer reversal. Every time you traverse a pointer, remember the location from which you came and the pointer you will next traverse, and just before the next traversal, replace that pointer with the back pointer. You have to undo this when going back up the tree to restore it. This takes the space of one pointer as a temporary, plus an integer to keep the depth as you work your way down and up. Do this synchronously for the two nodes you seek, so that you can work your way back up from the lower one until you're at the same height in both traversals, and then work up from both until you're at the common node. This takes three extra pieces of memory - one for each of the current depths, one for the temporary used during a pointer reversal. Very space efficient. Is it worth it?
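A Java sketch of the equal-depth walk-up described above, assuming each node carries parent and depth fields (both hypothetical here - they are exactly the extra space being debated):
class NodeWithParent {                        // hypothetical node type, not part of the question
    NodeWithParent parent;
    int depth;
}

static NodeWithParent lca(NodeWithParent a, NodeWithParent b) {
    while (a.depth > b.depth) a = a.parent;   // lift the deeper node first
    while (b.depth > a.depth) b = b.parent;
    while (a != b) {                          // rise in lockstep until the paths join
        a = a.parent;
        b = b.parent;
    }
    return a;
}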
Go back and do it for a binary tree. If you can do it for a binary tree you can do it for an n-ary tree.
Here's a link to LCA in a binary tree:
And here's how it looks after converting it to an n-ary tree LCA:
public class LCA {
    public static <V> Node<V> lowestCommonAncestor(Node<V> argRoot, Node<V> a, Node<V> b) {
        if (argRoot == null) {
            return null;
        }
        if (argRoot.equals(a) || argRoot.equals(b)) {
            // if at least one matched, no need to continue
            // this is the LCA for this root
            return argRoot;
        }
        Iterator<Node<V>> it = argRoot.childIterator();
        // nr of branches that a or b are on,
        // could be max 2 (considering unique nodes)
        int i = 0;
        Node<V> lastFoundLCA = null;
        while (it.hasNext()) {
            Node<V> node = lowestCommonAncestor(it.next(), a, b);
            if (node != null) {
                lastFoundLCA = node;
                i++;
            }
            if (i >= 2) {
                return argRoot;
            }
        }
        return lastFoundLCA;
    }
}
}
Do a synchronous walk down to both nodes.
Start with LCA = root;
loop:
    find the step to take for A and the step to take for B
    if these are equal { LCA = the step; descend A; descend B; goto loop; }
done:
    LCA now contains the LCA for A and B
Pseudocode in C:
struct node {
    struct node *offspring[1234];
    int payload;
};

/* compare function returning the slot in which this should be found/placed */
int find_index (struct node *par, struct node *this);

struct node *lca(struct node *root, struct node *one, struct node *two)
{
    struct node *lca;
    int idx1, idx2;
    for (lca = root; lca; lca = lca->offspring[idx1]) {
        idx1 = find_index(lca, one);
        idx2 = find_index(lca, two);
        if (idx1 != idx2 || idx1 < 0) break;
        if (lca->offspring[idx1] == NULL) break;
    }
    return lca;
}
