BST - interval deletion/multiple nodes deletion - algorithm

Suppose I have a binary search tree in which I'm supposed to insert N unique-numbered keys in the order given to me on standard input, then I am to delete all nodes with keys in interval I = [min,max] and also all connections adjacent to these nodes. This gives me a lot of smaller trees that I am to merge together in a particular way. More precise description of the problem:
Given a BST, which contains distinct keys, and interval I, the interval deletion works in two phases. During the first phase it removes all nodes whose key is in I and all edges adjacent to the removed nodes. Let the resulting graph contain k connected components T1,...,Tk. Each of the components is a BST where the root is the node with the smallest depth among all nodes of this component in the original BST. We assume that the sequence of trees Ti is sorted so that for each i < j all keys in Ti are smaller than keys in Tj. During the second phase, trees Ti are merged together to form one BST. We denote this operation by Merge(T1,...,Tk). Its output is defined recurrently as follows:
EDIT: I am also supposed to delete any edge that connects nodes, that are separated by the given interval, meaning in example 2 the edge connecting nodes 10 and 20 is deleted because the interval[13,15] is 'in between them' thus separating them.
For an empty sequence of trees, Merge() gives an empty BST.
For a one-element sequence containing a tree T, Merge(T) = T.
For a sequence of trees T1,...,Tk where k > 1, let A1< A2< ... < An be the sequence of keys stored in the union of all trees T1,...,Tk, sorted in ascending order. Moreover, let m = ⌊(1+k)/2⌋ and let Ts be the tree which contains Am. Then, Merge(T1,...,Tk) gives a tree T created by merging three trees Ts, TL = Merge(T1,...,Ts-1) and TR = Merge(Ts+1,...,Tk). These trees are merged by establishing the following two links: TL is appended as the left subtree of the node storing the minimal key of Ts and TR is appended as the right subtree of the node storing the maximal key of Ts.
After I do this, my task is to find the depth D of the resulting merged tree and the number of nodes at depth D-1. My program should finish within a few seconds even for a tree with hundreds of thousands of nodes (4th example).
My problem is that I haven't got a clue how to do this or even where to start. I managed to construct the desired tree before deletion, but that's about it.
I'd be grateful for an implementation of a program that solves this, or any advice at all, preferably in some C-ish programming language.
examples:
input (the first number is the number of keys to be inserted into the empty tree, the second line contains the unique keys to be inserted in the order given, the third line contains two numbers giving the interval to be deleted):
13
10 5 8 6 9 7 20 15 22 13 17 16 18
8 16
correct output of the program: 3 3 , first number being the depth D, the second number of nodes in depth D-1
input:
13
10 5 8 6 9 7 20 15 22 13 17 16 18
13 15
correct output: 4 3
pictures of the two examples
example 3: https://justpaste.it/1du6l
correct output: 13 6
example 4: link
correct output: 58 9

This is a big answer, so I'll talk at a high level. Please examine the source for details, or ask in a comment for clarification.
Global Variables :
vector<Node*> roots : To store roots of all new trees.
map<Node*,int> smap : for each new tree, stores its size
vector<int> prefix : prefix sum of roots vector, for easy binary search in merge
Functions:
inorder : find size of a BST (all calls combinedly O(N))
delInterval : The main idea is that if root is within the interval, both of its children might become roots of new trees. The last two if statements handle the special edge from your edit. This is done for every node, in post-order. (O(N))
merge : Merges all new roots positioned from index start to index end in roots. First we find the total number of nodes in the new tree (using the prefix sums of the root sizes, i.e. prefix). mid denotes m in your question. ind is the index of the root whose tree contains the mid-th node; we retrieve it into the root variable. Then we recursively build the left/right subtrees and attach them to the leftmost/rightmost node. O(N) complexity.
traverse: fills the level map with the number of nodes at every depth of the tree. (O(N log N); an unordered_map would make it O(N))
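To make the prefix-sum lookup in merge easier to follow, here is a minimal standalone sketch of just that step. The function name medianTreeIndex is made up for illustration; it assumes prefix[i] holds the total node count of trees 0..i, with the trees already sorted by key range.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: returns the index of the tree that holds the median key
// of the trees start..end, where prefix[i] is the node count of trees 0..i.
int medianTreeIndex(const std::vector<int>& prefix, int start, int end) {
    assert(0 <= start && start <= end && end < static_cast<int>(prefix.size()));
    int before = start > 0 ? prefix[start - 1] : 0; // nodes in trees left of the range
    int total = prefix[end] - before;               // nodes inside the range
    int mid = (total + 1) / 2 + before;             // global rank of the median key
    // First tree whose cumulative size reaches that rank.
    return static_cast<int>(
        std::lower_bound(prefix.begin(), prefix.end(), mid) - prefix.begin());
}
```

For trees of sizes 3, 1 and 2 the prefix sums are {3, 4, 6}; the median of all six keys has rank 3, so the first tree (index 0) is picked.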
Now the code (Don't panic!!!):
#include <bits/stdc++.h>
using namespace std;
int N = 12;
struct Node
{
Node* parent=NULL,*left=NULL,*right = NULL;
int value;
Node(int x,Node* par=NULL) {value = x;parent = par;}
};
void insert(Node* root,int x){
if(x<root->value){
if(root->left) insert(root->left,x);
else root->left = new Node(x,root);
}
else{
if(root->right) insert(root->right,x);
else root->right = new Node(x,root);
}
}
int inorder(Node* root){
if(root==NULL) return 0;
int l = inorder(root->left);
return l+1+inorder(root->right);
}
vector<Node*> roots;
map<Node*,int> smap;
vector<int> prefix;
Node* delInterval(Node* root,int x,int y){
if(root==NULL) return NULL;
root->left = delInterval(root->left,x,y);
root->right = delInterval(root->right,x,y);
if(root->value<=y && root->value>=x){
if(root->left) roots.push_back(root->left);
if(root->right) roots.push_back(root->right);
return NULL;
}
if(root->value<x && root->right && root->right->value>y) {
roots.push_back(root->right);
root->right = NULL;
}
if(root->value>y && root->left && root->left->value<x) {
roots.push_back(root->left);
root->left = NULL;
}
return root;
}
Node* merge(int start,int end){
if(start>end) return NULL;
if(start==end) return roots[start];
int total = prefix[end] - (start>0?prefix[start-1]:0);//make sure u get this line
int mid = (total+1)/2 + (start>0?prefix[start-1]:0); //or this won't make sense
int ind = lower_bound(prefix.begin(),prefix.end(),mid) - prefix.begin();
Node* root = roots[ind];
Node* TL = merge(start,ind-1);
Node* TR = merge(ind+1,end);
Node* temp = root;
while(temp->left) temp = temp->left;
temp->left = TL;
temp = root;
while(temp->right) temp = temp->right;
temp->right = TR;
return root;
}
void traverse(Node* root,int depth,map<int, int>& level){
if(!root) return;
level[depth]++;
traverse(root->left,depth+1,level);
traverse(root->right,depth+1,level);
}
int main(){
srand(time(NULL));
cin>>N;
int* arr = new int[N],start,end;
for(int i=0;i<N;i++) cin>>arr[i];
cin>>start>>end;
Node* tree = new Node(arr[0]); //Building initial tree
for(int i=1;i<N;i++) {insert(tree,arr[i]);}
Node* x = delInterval(tree,start,end); //deleting the interval
if(x) roots.push_back(x);
//sort the disconnected roots, and find their size
sort(roots.begin(),roots.end(),[](Node* r,Node* v){return r->value<v->value;});
for(auto& r:roots) {smap[r] = inorder(r);}
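if(roots.empty()) return 0; //every node was deleted: nothing to merge (the question does not say what to print here)
prefix.resize(roots.size()); //prefix sum root sizes, to cheaply find 'root' in merge
prefix[0] = smap[roots[0]];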
for(int i=1;i<roots.size();i++) prefix[i]= smap[roots[i]]+prefix[i-1];
Node* root = merge(0,roots.size()-1); //merge all trees
map<int, int> level; //key=depth, value = no of nodes in depth
traverse(root,0,level); //find number of nodes in each depth
int depth = level.rbegin()->first; //access last element's key i.e total depth
int at_depth_1 = level[depth-1]; //no of nodes before
cout<<depth<<" "<<at_depth_1<<endl; //hoorray
return 0;
}
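Since the merged tree can be very deep for large inputs, the recursive traverse might run into stack limits. The following is only a sketch of a non-recursive alternative: it uses an explicit stack and a vector indexed by depth instead of a map, which also gives the O(N) bound mentioned above. The Node struct is a simplified stand-in for the one in the answer, and levelCounts is a made-up name.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Simplified stand-in for the answer's Node (no parent link needed here).
struct Node {
    int value;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int v) : value(v) {}
};

// levelCounts(root)[d] is the number of nodes at depth d (the root has depth 0).
std::vector<int> levelCounts(Node* root) {
    std::vector<int> counts;
    std::vector<std::pair<Node*, int>> stack; // explicit stack of (node, depth)
    if (root) stack.push_back({root, 0});
    while (!stack.empty()) {
        Node* node = stack.back().first;
        int depth = stack.back().second;
        stack.pop_back();
        if (static_cast<std::size_t>(depth) >= counts.size())
            counts.resize(depth + 1, 0);      // first node seen at this depth
        ++counts[depth];
        if (node->left) stack.push_back({node->left, depth + 1});
        if (node->right) stack.push_back({node->right, depth + 1});
    }
    return counts;
}
```

With this, the deepest level is counts.size() - 1 and the node count one level above it is counts[counts.size() - 2], provided the tree has at least two levels.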

Related

Find the Size of Each Node

I have a tree rooted at 1, and I need to find the size of the subtree of each of its nodes.
I am using this recursive call to do so:
void find_size(int curr, int parent){
S[curr]=1;
for(int j:Children[curr]){
if(j==parent) continue;
find_size(j,curr);
S[curr]+=S[j];
}
}
How can I turn my solution into a non-recursive one, using a stack or something similar? The recursive solution does not work for large data sets.
You denote the node by indices, therefore I guess you have them represented as two arrays as follows:
int[] parent; // index of parent (the parent of the root is negative, e.g. -1)
int[][] children; // indices of children for each node
You can collect the sums starting from the leaf nodes and proceed upwards as soon as you know the results of all children (this takes O(n) time):
int[] s = new int[parent.length];
int[] processed = new int[parent.length]; // the number of children that are processed
for (int i = 0; i < parent.length; i++) // initialize
s[i] = 1;
for (int i = 0; i < parent.length; i++) {
if (children[i].length == 0) { // leaf node
int p = parent[i], j = i;
while (p >= 0 && processed[j] == children[j].length) { // all children are processed
s[p] += s[j]; // adjust parent score
processed[p]++; // increase the number of processed child nodes for parent
j = p; // parent becomes the current node
p = parent[j]; // and its parent the parent
}
}
}
I will describe one possible iterative approach, which consists of two steps:
use a queue to determine the depth of each node.
process the nodes in decreasing depth order.
This approach is based on a BFS traversal of the tree, thus it does not directly mimic the DFS traversal which is done recursively, and has the advantage of being easier to implement iteratively.
For step 1:
initially, add into the queue only the root node, mark it with depth = 0.
while the queue is not empty, extract the first node from the queue, look at its depth (denoted here as currentDepth), and add its children to the end of the queue by marking each with childDepth = currentDepth + 1.
For step 2:
process the nodes in the reverse depth order. The processing of a node involves computing its sub-tree size, by adding the sizes of all children (plus 1, for the current node).
note that each time a node is processed, the children were already processed (because all nodes with higher depth were already processed), thus we already know the sizes of the children sub-trees.
Remark:
For step 2, sorting the nodes in decreasing depth order can be done efficiently by implementing the queue from step 1 with a list from which we never actually remove elements (e.g. the queue head can be kept using a pointer, and only this pointer can be incremented when polling).
Processing this list in reverse order is all that is needed in order to traverse through the nodes in decreasing depth order. Thus, it is not really necessary to explicitly use the depth field.
The implementation of the above ideas would look like this:
void find_size() {
// Step 1
int queue[numNodes];
queue[0] = 1; // add the root in the queue
int start = 0;
int end = 1;
while (start < end) {
int node = queue[start++]; // poll one node from the queue
for (int i: Children[node]) { // add its children to the end (assumes Children[node] holds only children, not the parent)
queue[end++] = i;
}
}
// Step 2
for (int i = end - 1; i >= 0; i--) {
int node = queue[i];
S[node] = 1;
for (int j: Children[node]) {
S[node] += S[j];
}
}
}
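If Children is really an adjacency list that also contains each node's parent (which the `if(j==parent) continue;` check in the question suggests), the parent has to be tracked explicitly. Below is a rough C++ sketch under that assumption; subtreeSizes and adj are illustrative names, and it uses a stack rather than a queue, although any order in which parents come before their children works.

```cpp
#include <vector>

// adj[v] lists all neighbours of v (children and parent alike).
// Returns size[v] = number of nodes in the subtree of v for the given root.
std::vector<int> subtreeSizes(const std::vector<std::vector<int>>& adj, int root) {
    std::vector<int> size(adj.size(), 1);
    std::vector<int> parent(adj.size(), -1);
    std::vector<int> order;          // nodes in the order they are first visited
    std::vector<int> stack{root};
    while (!stack.empty()) {
        int v = stack.back();
        stack.pop_back();
        order.push_back(v);
        for (int u : adj[v]) {
            if (u == parent[v]) continue; // skip the edge back to the parent
            parent[u] = v;
            stack.push_back(u);
        }
    }
    // Every node appears after its parent in `order`, so walking it backwards
    // finishes all children before their parent is read.
    for (auto it = order.rbegin(); it != order.rend(); ++it) {
        if (parent[*it] != -1) size[parent[*it]] += size[*it];
    }
    return size;
}
```

For a 1-indexed tree, adj would have numNodes + 1 entries with index 0 left unused.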

How to calculate a height of a tree

I am trying to learn DSA and got stuck on one problem.
How do I calculate the height of a tree? I mean a general tree, not a specific kind of tree like a BT or BST.
I have tried Google, but it seems everyone is talking about binary trees and nothing is available for general trees.
Can anyone point me to a page or article on calculating the height of a tree?
Let's say a typical node in your tree is represented as a Java class.
class Node{
Entry entry;
ArrayList<Node> children;
Node(Entry entry, ArrayList<Node> children){
this.entry = entry;
this.children = children;
}
ArrayList<Node> getChildren(){
return children;
}
}
Then a simple Height Function can be -
int getHeight(Node node){
if(node == null){
return 0;
}else if(node.getChildren() == null){
return 1;
} else{
int childrenMaxHeight = 0;
for(Node n : node.getChildren()){
childrenMaxHeight = Math.max(childrenMaxHeight, getHeight(n));
}
return 1 + childrenMaxHeight;
}
}
Then you just need to call this function, passing the root of the tree as the argument. Since it visits every node exactly once, the run time is O(n).
1. If the height of a leaf node is considered 0 (i.e. the height is the number of edges on the longest path from the root to a leaf):
int maxHeight(treeNode<int>* root){
if(root == NULL)
return -1; // -1 because a leaf node is 0, so a NULL node should be -1
int h=-1; // a node without children must end up with height 0
for(int i=0;i<root->childNodes.size();i++){
int temp=maxHeight(root->childNodes[i]);
if(temp>h){
h=temp;
}
}
return h+1;
}
2. If the height of the root node is considered 1:
int maxHeight(treeNode<int>* root){
if(root == NULL)
return 0;
int h=0;
for(int i=0;i<root->childNodes.size();i++){
int temp=maxHeight(root->childNodes[i]);
if(temp>h){
h=temp;
}
}
return h+1;
}
The code above is based on the following class:
template <typename T>
class treeNode{
public:
T data;
vector<treeNode<T>*> childNodes; // vector storing pointers to the child nodes
// constructor creating a tree node
treeNode(T data){
this->data = data;
}
};
In case of 'normal tree' you can recursively calculate the height of tree in similar fashion to a binary tree but here you will have to consider all children at a node instead of just two.
To find the height of a tree, a BFS iteration works fine.
Edited from Wikipedia:
Breadth-First-Search(Graph, root):
create empty set S
create empty queues Q1, Q2
root.parent = NIL
height = -1
add root to S
Q1.enqueue(root)
while Q1 is not empty:
height = height + 1
switch Q1 and Q2
while Q2 is not empty:
current = Q2.dequeue()
for each node n that is adjacent to current:
if n is not in S:
add n to S
n.parent = current
Q1.enqueue(n)
You can see that adding another queue lets me know which level of the tree is being processed.
It iterates over each level, and over each node in that level.
This is an iterative way to do it (the opposite of recursive), so you don't have to worry about recursion either.
Run time is O(|V|+ |E|).
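The same level-by-level idea can be sketched in C++ with a single queue, by remembering how many nodes belong to the current level. This is only an illustration: TreeNode is a simplified, non-template stand-in for the treeNode class above, and heightByLevels is a made-up name. Height is counted in edges here, so an empty tree gives -1 and a single node gives 0.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// Simplified stand-in for the treeNode<T> class shown earlier.
struct TreeNode {
    int data;
    std::vector<TreeNode*> children;
    explicit TreeNode(int d) : data(d) {}
};

// Height counted in edges: -1 for an empty tree, 0 for a single node.
int heightByLevels(const TreeNode* root) {
    if (!root) return -1;
    std::queue<const TreeNode*> q;
    q.push(root);
    int height = -1;
    while (!q.empty()) {
        ++height;                              // one more level reached
        std::size_t levelSize = q.size();      // nodes belonging to this level
        for (std::size_t i = 0; i < levelSize; ++i) {
            const TreeNode* cur = q.front();
            q.pop();
            for (const TreeNode* child : cur->children) q.push(child);
        }
    }
    return height;
}
```

Because the tree stores only child pointers, no visited set is needed; a general graph would still need one.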

Fastest non-recursive implementation of LCA?

Here's the algorithm I came up with for non-recursively finding the lowest common ancestor of two nodes in a binary tree. Here's the basic strategy:
Use a dictionary/hashtable to store the tree. Each key-value pair represents a node and its parent.
Starting from each of the two nodes, walk up the tree by setting the variable representing each node's value to that of its parent, storing traversed values in a hashset (one for each of the two nodes).
The search is complete when any of the following conditions are reached: (a) the value of the two nodes is equal; or (b) when the two paths cross each other (i.e., the hashset of node 1's traversed values contains the current value for node 2, or vice versa); or (c) the node passed in doesn't exist in the tree (in which case the algorithm terminates and returns -1).
My understanding is that the worst-case time and space complexity of my algorithm is O(log(n)), since we never need to make more than 2 * height traversals or store more than 2 * height values in our hashsets (and since the lookup time for the hashsets and the tree dictionary are O(1)).
Following is my code (C#). Please advise if I am correct in my analysis, or if there is a more efficient (non-recursive) way to do this:
int LowestCommonAncestor(int value1, int value2, Dictionary<int, int> tree)
{
var value1Visited = new HashSet<int>();
var value2Visited = new HashSet<int>();
while (true)
{
if (value1 == value2) return value1;
if (value1Visited.Contains(value2)) return value2;
if (value2Visited.Contains(value1)) return value1;
int nextValue1;
int nextValue2;
if (tree.TryGetValue(value1, out nextValue1))
{
//Walk node 1 up the tree:
value1 = nextValue1;
value1Visited.Add(value1);
}
else
{
//Node doesn't exist in tree:
return -1;
}
if (tree.TryGetValue(value2, out nextValue2))
{
//Walk node 2 up the tree:
value2 = nextValue2;
value2Visited.Add(value2);
}
else
{
//Node doesn't exist in tree:
return -1;
}
}
}
Go up from each node to the root to measure its depth
Move up the path from the deeper node until you get to the same depth as the shallower one.
Move up the paths from both nodes (i.e., keeping the same depth on both paths) until they meet.
You don't need two hash sets.
Go up and collect in a single hash set the ancestors of one node
Go up from the second node and at each of its ancestors, check if the path collected at step 1 contains the current ancestor of the second. Stop at the first common one.
With D being the max depth of the tree, the worst-case complexity is O(D).
In terms of N, the number of nodes, the worst case is O(N): it occurs when the tree degenerates into a list, with one of the nodes at the head of the list and the other at the tail.
If the tree is balanced, D = log(N), with the log's base being the number of children per node (binary: log2, ternary: log3, etc.).
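As a rough illustration of the single-hash-set variant described above, here is a C++ sketch. It assumes the tree is given as a child-to-parent map in which the root's parent is stored as -1 and node values are non-negative; lcaSingleSet is a made-up name.

```cpp
#include <unordered_map>
#include <unordered_set>

// parent maps each node to its parent; the root is assumed to map to -1.
// Returns -1 if either node is missing from the map.
int lcaSingleSet(int a, int b, const std::unordered_map<int, int>& parent) {
    if (!parent.count(a) || !parent.count(b)) return -1;
    std::unordered_set<int> ancestors;
    // Step 1: collect a and all of its ancestors.
    for (int v = a; v != -1; v = parent.at(v)) ancestors.insert(v);
    // Step 2: walk up from b and stop at the first node already collected.
    for (int v = b; v != -1; v = parent.at(v))
        if (ancestors.count(v)) return v;
    return -1; // only reached if the two nodes do not share a root
}
```

This uses O(D) extra space for the set, whereas the depth-equalizing approach needs only a couple of integers.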
Here, then, is my revised algorithm (it relies on the root's parent being stored as -1 in the dictionary):
int LCA(int value1, int value2, Dictionary<int, int> tree)
{
if (!tree.ContainsKey(value1) || !(tree.ContainsKey(value2))) return -1;
int depth1 = 0;
int depth2 = 0;
int tmpVal1 = value1;
int tmpVal2 = value2;
while (tmpVal1 != -1)
{
tmpVal1 = tree[tmpVal1];
depth1++;
}
while (tmpVal2 != -1)
{
tmpVal2 = tree[tmpVal2];
depth2++;
}
if (depth1 > depth2)
{
while (depth1 > depth2)
{
value1 = tree[value1];
depth1--;
}
}
else if (depth2 > depth1)
{
while (depth2 > depth1)
{
value2 = tree[value2];
depth2--;
}
}
while (value1 != value2)
{
value1 = tree[value1];
value2 = tree[value2];
}
return value1;
}

how to find lowest common ancestor of a nary tree?

Is there a way to find the LCA of an n-ary tree without using extra space?
I did it by saving the preorder of both nodes in a string and finding the common prefix.
If nodes "know" their depth - or you're willing to allow the space to compute the depth of your nodes, you can back up from the lower node to the same depth of the higher node, and then go up one level at a time until they meet.
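A minimal sketch of backing up from the deeper node, assuming every node has a parent pointer (the question does not say whether it does). PNode, depthOf and lcaByDepth are hypothetical names; the only extra storage is the integer holding the depth difference plus the loop counter inside depthOf.

```cpp
// Hypothetical node type: only the parent link matters for this approach.
struct PNode {
    PNode* parent = nullptr;
};

// Number of edges between n and the root.
static int depthOf(const PNode* n) {
    int d = 0;
    while (n->parent) { n = n->parent; ++d; }
    return d;
}

// Assumes both nodes are non-null; returns nullptr if they are in different trees.
const PNode* lcaByDepth(const PNode* a, const PNode* b) {
    int diff = depthOf(a) - depthOf(b);
    for (; diff > 0; --diff) a = a->parent;  // a is deeper: lift it
    for (; diff < 0; ++diff) b = b->parent;  // b is deeper: lift it
    while (a != b) {                         // same depth: climb together
        a = a->parent;
        b = b->parent;
    }
    return a;
}
```

The number of children per node never appears, so the same code serves binary and n-ary trees alike.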
Depends on what "extra space" means in this context. You can do it with one integer - the difference in depths of the two nodes. Is that too much space?
Another possibility is given you don't have a parent pointer, you can use pointer reversal - every time you traverse a pointer, remember the location from which you came, remember the pointer you will next traverse, and then just before the next pointer traversal, replace that pointer with the back pointer. You have to reverse this when going up the tree to restore it. This takes the space of one pointer as a temporary. And another integer to keep the depth as you work your way down and up. Do this synchronously for the two nodes you seek, so that you can work your way back up from the lower one until you're at the same height in both traversals, and then work back up from both until you're at the common node. This takes three extra pieces of memory - one for each of the current depths, one for the temporary used during a pointer reversal. Very space efficient. Is it worth it?
Go back and do it for a binary tree. If you can do it for a binary tree you can do it for an n-ary tree.
Here's a link to LCA in a binary tree:
And here's how it looks after converting it to a n-ary tree LCA:
public class LCA {
public static <V> Node<V>
lowestCommonAncestor(Node<V> argRoot, Node<V> a, Node<V> b) {
if (argRoot == null) {
return null;
}
if (argRoot.equals(a) || argRoot.equals(b)) {
// if at least one matched, no need to continue
// this is the LCA for this root
return argRoot;
}
Iterator<Node<V>> it = argRoot.childIterator();
// nr of branches that a or b are on,
// could be max 2 (considering unique nodes)
int i = 0;
Node<V> lastFoundLCA = null;
while (it.hasNext()) {
Node<V> node = lowestCommonAncestor(it.next(), a, b);
if (node != null) {
lastFoundLCA = node;
i++ ;
}
if (i >= 2) {
return argRoot;
}
}
return lastFoundLCA;
}
}
Do a synchronous walk to both the nodes.
Start with LCA=root;
loop:
find the step to take for A and the step for B
if these are equal { LCA= the step; descend A; descend B; goto loop; }
done: LCA now contains the lca for A and B
Pseudocode in C:
struct node {
struct node *offspring[1234];
int payload;
};
/* compare function returning the slot in which this should be found/placed */
int find_index (struct node *par, struct node *this);
struct node *lca(struct node *root, struct node *one, struct node *two)
{
struct node *lca;
int idx1,idx2;
for (lca=root; lca; lca=lca->offspring[idx1] ) {
idx1 = find_index(lca, one);
idx2 = find_index(lca, two);
if (idx1 != idx2 || idx1 < 0) break;
if (lca->offspring[idx1] == NULL) break;
}
return lca;
}

algorithm to use to return a specific range of nodes in a directed graph

I have a class Graph with two lists types namely nodes and edges
I have a function
List<int> GetNodesInRange(Graph graph, int Range)
when I get these parameters I need an algorithm that will go through the graph and return the list of nodes only as deep (the level) as the range.
The algorithm should be able to accommodate large number of nodes and large ranges.
Atop this, should I use a similar function
List<int> GetNodesInRange(Graph graph, int Range, int selected)
I want to be able to search outwards from it, to the number of nodes outwards (range) specified.
alt text http://www.freeimagehosting.net/uploads/b110ccba58.png
So in the first function, should I pass the nodes and require a range of say 2, I expect the results to return the nodes shown in the blue box.
The other function, if I pass the nodes as in the graph with a range of 1 and it starts at node 5, I want it to return the list of nodes that satisfy this criteria (placed in the orange box)
What you need seems to be simply a depth-limited breadth-first search or depth-first search, with an option of ignoring edge directionality.
Here's a recursive definition that may help you:
I'm the only one of range 1 from myself.
I know who my immediate neighbors are.
If N > 1, then those of range N from myself are
The union of all that is of range N-1 from my neighbors
It should be a recursive function that finds the neighbours of the selected node, then the neighbours of each neighbour, until the range is 0. A DFS-style search, something like this:
List<int> GetNodesInRange(Graph graph, int Range, int selected){
var result = new List<int>();
result.Add( selected );
if (Range > 0){
foreach ( int neighbour in GetNeighbours( graph, selected ) ){
result.AddRange( GetNodesInRange(graph, Range-1, neighbour) );
}
}
return result;
}
You should also check for cycles, if they are possible. This code is for tree structure.
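One way to handle cycles is a depth-limited breadth-first search with a visited set. The sketch below is written in C++ rather than C#, and it assumes the graph is available as a map from node id to the list of its successors; Adjacency and nodesInRange are illustrative names, not part of the question's Graph class.

```cpp
#include <cassert>
#include <queue>
#include <unordered_map>
#include <unordered_set>
#include <utility>
#include <vector>

// Assumed representation: node id -> ids of its direct successors.
using Adjacency = std::unordered_map<int, std::vector<int>>;

// Returns all nodes reachable from `selected` in at most `range` steps,
// in breadth-first order. Each node is reported once, even with cycles.
std::vector<int> nodesInRange(const Adjacency& adj, int range, int selected) {
    assert(range >= 0);
    std::vector<int> result;
    std::unordered_set<int> visited{selected};
    std::queue<std::pair<int, int>> q; // (node, distance from selected)
    q.push({selected, 0});
    while (!q.empty()) {
        int node = q.front().first;
        int dist = q.front().second;
        q.pop();
        result.push_back(node);
        if (dist == range) continue;       // do not expand past the limit
        auto it = adj.find(node);
        if (it == adj.end()) continue;     // node has no outgoing edges
        for (int next : it->second)
            if (visited.insert(next).second) q.push({next, dist + 1});
    }
    return result;
}
```

Because BFS reaches every node by a shortest path first, marking nodes as visited never cuts off anything that is still within range. To ignore edge direction, each edge could be entered into the map in both directions.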
// get all the nodes that are within Range distance of the root node of graph
Set<int> GetNodesInRange(Graph graph, int Range)
{
Set<int> out = new Set<int>();
GetNodesInRange(graph.root, Range, out);
return out;
}
// get all the nodes that are within Range successor distance of node
// accepted nodes are placed in out
void GetNodesInRange(Node node, int Range, Set<int> out)
{
boolean alreadyVisited = !out.add(node.value); // add() returns false when the value was already in the set
if (alreadyVisited) return;
if (Range == 0) return;
// for each successor node
{
GetNodesInRange(successor, Range-1, out);
}
}
// get all the nodes that are within Range distance of selected node in graph
Set<int> GetNodesInRange(Graph graph, int Range, int selected)
{
Set<int> out = new Set<int>();
GetNodesInRange(graph.root, Range, selected, out);
return out;
}
// get all the nodes that are successors of node and within Range distance
// of selected node
// accepted nodes are placed in out
// returns distance to selected node
int GetNodesInRange(Node node, int Range, int selected, Set<int> out)
{
if (node.value == selected)
{
GetNodesInRange(node, Range-1, out);
return 1;
}
else
{
int shortestDistance = Range + 1;
// for each successor node
{
int distance = GetNodesInRange(successor, Range, selected, out);
if (distance < shortestDistance) shortestDistance = distance;
}
if (shortestDistance <= Range)
{
out.add(node.value);
}
return shortestDistance + 1;
}
}
I modified your requirements somewhat to return a Set rather than a List.
The GetNodesInRange(Graph, int, int) method will not handle graphs that contain cycles. This can be overcome by maintaining a collection of nodes that have already been visited. The GetNodesInRange(Graph, int) method makes use of the fact that the out set is a collection of visited nodes to overcome cycles.
Note: This has not been tested in any way.

Resources