Job Interview Question Using Trees, What data to save? - algorithm

I was working on the following job interview question and solved most of it, but failed at the last requirement.
Q: Build a data structure which supports the following functions:
Init - Initialise an empty DS. O(1) time complexity.
SetPositiveInDay(d,x) - Record in the DS that on day d exactly x new people were infected with covid-19. O(log n) time complexity.
WorstBefore(d) - Among the days stored in the DS that are earlier than d, return the latest one that has more newly infected people than day d. O(log n) time complexity.
For example:
Init()
SetPositiveInDay(1,10)
SetPositiveInDay(2,20)
SetPositiveInDay(3,15)
SetPositiveInDay(5,17)
SetPositiveInDay(23,180)
SetPositiveInDay(8,13)
SetPositiveInDay(13,18)
WorstBefore(13) // Returns day #2
SetPositiveInDay(10,19)
WorstBefore(13) // Returns day #10
Important note: you can't assume that days will be entered in order, nor that there won't be "gaps" between days (some days may be missing from the DS while later ones are present).
What I did:
I used an AVL tree (I could have used a 2-3 tree too).
For each node I store:
Sick - Number of newly infected people on that day.
maxLeftSick - Max number of infected people in the left son's subtree.
maxRightSick - Max number of infected people in the right son's subtree.
When inserting a new node I made sure that no data gets lost during rotations; in addition, for every node on the path from the new node up to the root I did:
But I wasn't successful in implementing WorstBefore(d).

Where to search?
First you need to find the node corresponding to d in the tree ordered by days; call it node. Let x = Sick(node). This can be done in O(log n).
If maxLeftSick(node) > x, the solution must be in the left subtree of node. Search for the solution there and return the answer. This can be done in O(log n) - see below.
Otherwise, traverse the tree upwards towards the root, starting from node, until you find the first node nextPredecessor satisfying this property (this takes O(log n)):
the day of nextPredecessor is smaller than the day of node,
and either
Sick(nextPredecessor) > x (case 1) or
maxLeftSick(nextPredecessor) > x (case 2).
If no such node exists, there is no solution. In case 1, just return the day of nextPredecessor since that is the best solution.
In case 2, we know that the solution must be in the left subtree of nextPredecessor, so search there and return the answer. Again, this takes O(log n) - see below.
Note that there is no need to search in the right subtree of nextPredecessor: the only nodes in that subtree that are smaller than node are the left subtree of node itself and the smaller ancestors (with their left subtrees) that we passed on the way up, and we have already excluded all of those.
Note also that it is not necessary to traverse further up the tree than nextPredecessor since those nodes are even smaller, and we are looking for the largest node satisfying all constraints.
How to search?
OK, so how do we search for the solution in a subtree? Finding the largest day within a subtree rooted in q whose infection number is larger than x is simple using the maxLeftSick and maxRightSick information:
If q has a right child and maxRightSick(q) > x then search in the right subtree of q.
Otherwise, if Sick(q) > x, return Day(q).
Otherwise, if q has a left child and maxLeftSick(q) > x then search in the left subtree of q.
Otherwise there is no solution within the subtree q.
We are effectively using maxLeftSick and maxRightSick to prune the search tree to include only "worse" nodes, and within that pruned tree we get the rightmost node, i.e. the one with the largest day.
It is easy to see that this algorithm runs in O(log n) where n is the total number of nodes since the number of steps is bounded by the height of the tree.
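The descent described above can be sketched in Python. This is only an illustration under stated assumptions: the Node class and the name find_last_worse_than are made up for this example, and each node stores a single max_sick value for its whole subtree (so maxRightSick(q) corresponds to q.right.max_sick). The sketch checks the right subtree first, then the node itself, then the left subtree.

```python
class Node:
    """Hypothetical node type for this sketch, keyed by day."""
    def __init__(self, day, sick, left=None, right=None):
        self.day, self.sick = day, sick
        self.left, self.right = left, right
        # Largest infection count anywhere in this subtree (node included).
        self.max_sick = max(sick,
                            left.max_sick if left else -1,
                            right.max_sick if right else -1)


def find_last_worse_than(q, x):
    """Return the largest day in the subtree rooted at q whose count exceeds x, or -1."""
    while q is not None:
        if q.right is not None and q.right.max_sick > x:
            q = q.right      # a later qualifying day exists on the right
        elif q.sick > x:
            return q.day     # nothing later qualifies, but q itself does
        elif q.left is not None and q.left.max_sick > x:
            q = q.left       # only earlier days can qualify
        else:
            return -1
    return -1


# Days 1, 2, 3 with counts 10, 20, 15.
root = Node(2, 20, Node(1, 10), Node(3, 15))
print(find_last_worse_than(root, 12))  # 3
print(find_last_worse_than(root, 17))  # 2
```

Each loop iteration moves one level down, so the number of steps is bounded by the height of the subtree.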
Pseudocode
Here is the pseudocode (assuming maxLeftSick and maxRightSick return -1 if no corresponding child node exists):
// Returns the largest day smaller than d such that its
// infection number is larger than the infection number on day d.
// Returns -1 if no such day exists.
int WorstBefore(int d) {
    node = find(d);
    // try to find the solution in the left subtree
    if (maxLeftSick(node) > Sick(node)) {
        return FindLastWorseThan(node -> left, Sick(node));
    }
    // move up towards root until we find the first node
    // that is smaller than `node` and such that
    // Sick(nextPredecessor) > Sick(node) or
    // maxLeftSick(nextPredecessor) > Sick(node).
    nextPredecessor = findNextPredecessor(node);
    if (nextPredecessor == null) return -1;
    // Case 1
    if (Sick(nextPredecessor) > Sick(node)) return Day(nextPredecessor);
    // Case 2: maxLeftSick(nextPredecessor) > Sick(node)
    return FindLastWorseThan(nextPredecessor -> left, Sick(node));
}
// Finds the latest day within the given subtree with root q where
// the infection number is larger than x. Runs in O(log(size(q))).
int FindLastWorseThan(Node q, int x) {
    if (q == null) return -1;
    // later days first, then q itself, then earlier days
    if (maxRightSick(q) > x) return FindLastWorseThan(q -> right, x);
    if (Sick(q) > x) return Day(q);
    if (maxLeftSick(q) > x) return FindLastWorseThan(q -> left, x);
    return -1;
}

First of all, your chosen data structure looks fine to me. You did not mention it explicitly, but I assume that the "key" you use in the AVL tree is the day number, i.e. an in-order traversal of the tree would list the nodes in their chronological order.
I would just suggest a cosmetic change: store the maximum value of sick in the node itself, so that you don't have two similar pieces of information (maxLeftSick and maxRightSick) stored in one node instance, but move them to the child nodes, so that your node.maxLeftSick is actually stored in node.left.maxSick, and similarly node.maxRightSick is stored in node.right.maxSick. This is of course not done when that child does not exist, but then we don't need that information either. In your structure maxLeftSick would be 0 when left is not defined. In my proposed structure, you would not have that value -- the 0 would follow naturally from the fact that there is no left child. In my proposal, the root node would hold a maxSick value which is not present in yours, and which would be the maximum of your root.Sick, root.maxLeftSick and root.maxRightSick. This information would not really be used, but it is just there to make the structure consistent throughout the tree.
So you would just store one maxSick, which considers the current node's sick value also in that maximum. The processing you do during rotations will need to change accordingly, but will not become more complex.
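As a rough illustration of how a rotation can keep a single maxSick field consistent, here is a minimal Python sketch. The names Node, refresh and rotate_right are assumptions made for this example and not part of the original code; the sketch uses 0 for a missing child, as described above.

```python
class Node:
    """Hypothetical node with one max_sick field covering its whole subtree."""
    def __init__(self, day, sick):
        self.day = day
        self.sick = sick
        self.left = None
        self.right = None
        self.max_sick = sick


def refresh(node):
    """Recompute node.max_sick from the node itself and its children."""
    node.max_sick = max(
        node.sick,
        node.left.max_sick if node.left else 0,
        node.right.max_sick if node.right else 0,
    )


def rotate_right(y):
    """Rotate the subtree rooted at y to the right and return the new root."""
    x = y.left
    y.left = x.right
    x.right = y
    refresh(y)  # y is now the lower node, so it must be refreshed first
    refresh(x)
    return x


# A left-leaning chain 3 -> 2 -> 1, rotated so that day 2 becomes the root.
a, x, y = Node(1, 10), Node(2, 20), Node(3, 15)
x.left = a
refresh(x)
y.left = x
refresh(y)
root = rotate_right(y)
print(root.day, root.max_sick, root.right.max_sick)  # 2 20 15
```

Only the two nodes whose children change need refreshing, so a rotation stays O(1).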
I will assume that the nodes of your AVL tree don't keep track of parent pointers. So create a find method which will return the path to the node to be found. For instance, in Python syntax, it could look like this:
def find(self, day):
    node = self.root
    path = []  # an array of nodes
    while node:
        path.append(node)
        if node.day == day:  # bingo
            return path
        if day < node.day:
            node = node.left
        else:
            node = node.right
Then the worstBefore method could look like this:
def worstBefore(self, day):
    path = self.find(day)
    if not path:
        return  # day not found
    # get number of sick people on that day:
    sick = path[-1].sick
    # look for recent day with greater number of sick
    while path:
        node = path.pop()  # walk upward, starting with found node
        if node.day < day and node.sick > sick:
            return node.day
        # only a node that is not later than day has a left subtree
        # consisting solely of earlier days
        if node.day <= day and node.left and node.left.maxSick > sick:
            # we will find the result in this subtree
            node = node.left
            while True:
                if node.right and node.right.maxSick > sick:
                    node = node.right
                elif node.sick > sick:  # bingo
                    return node.day
                else:
                    node = node.left
So the path returned by the find method will be used to get the parents of a node when you need to backtrack upwards in the tree along that path.
If along that path you find a node, not later than the given day, whose left child has a greater maxSick, then you know that the targeted node must be in that left subtree. (An ancestor with a later day must be skipped here: its left subtree contains the given day itself and possibly later days.) It is then a matter of walking down that subtree in a controlled way, choosing the right child when it still has a greater maxSick. Otherwise check the current node's sick value and return that node if the value is greater. Otherwise go left, and repeat.
While there is no such left subtree, go up along the path. If that parent is a match, then return it (make sure to verify the day number). Keep checking for left subtrees that have a larger maxSick.
This runs in O(log n) because you first walk zero or more steps upward and then zero or more steps downward (in a left subtree).
You can see your example scenario run on repl.it. There I focussed on this question, and didn't implement the rotations.
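For readers without access to that page, here is a self-contained sketch that reproduces the example scenario. It is an illustration under assumptions: the class and method names are made up, rotations are left out (a plain BST insert is used), and every day is assumed to be set at most once. When walking up, the sketch only inspects the left subtree of path nodes whose day is not later than the queried day, since an ancestor with a later day holds the queried day inside its own left subtree.

```python
class Node:
    def __init__(self, day, sick):
        self.day, self.sick, self.max_sick = day, sick, sick
        self.left = self.right = None


class Tree:
    def __init__(self):
        self.root = None

    def set_positive_in_day(self, day, sick):
        """Plain BST insert (no rebalancing) that keeps max_sick up to date."""
        new = Node(day, sick)
        if self.root is None:
            self.root = new
            return
        node = self.root
        while True:
            node.max_sick = max(node.max_sick, sick)
            if day < node.day:
                if node.left is None:
                    node.left = new
                    return
                node = node.left
            else:
                if node.right is None:
                    node.right = new
                    return
                node = node.right

    def worst_before(self, day):
        """Latest earlier day with more infections than `day`, or None."""
        path, node = [], self.root
        while node is not None and node.day != day:
            path.append(node)
            node = node.left if day < node.day else node.right
        if node is None:
            return None  # day not stored
        path.append(node)
        sick = node.sick
        while path:
            node = path.pop()  # walk upward, starting with the found node
            if node.day < day and node.sick > sick:
                return node.day
            if node.day <= day and node.left and node.left.max_sick > sick:
                node = node.left
                while True:  # rightmost qualifying node of this subtree
                    if node.right and node.right.max_sick > sick:
                        node = node.right
                    elif node.sick > sick:
                        return node.day
                    else:
                        node = node.left
        return None


tree = Tree()
for d, x in [(1, 10), (2, 20), (3, 15), (5, 17), (23, 180), (8, 13), (13, 18)]:
    tree.set_positive_in_day(d, x)
print(tree.worst_before(13))  # 2
tree.set_positive_in_day(10, 19)
print(tree.worst_before(13))  # 10
```

Without rebalancing the path can be as long as the number of nodes, so the O(log n) bound only holds once the rotations of the AVL tree are added.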

Related

What is the pseudocode for this binary tree

Basically I am required to come up with pseudocode for this. What I currently have is
dictionary = {}
if node.left == none and node.right == none
    visit(node)
    dictionary[node] = 1
This only handles the leaf nodes; how do I get the size for each node (parents and root)?
You can do a post-order traversal to find the size of each node.
The idea is to first handle both the left and right subtrees. Then, after they are processed, you can use this data to process the current node.
This should look something like:
visit(node):
    count = 0
    if (node.left != none)
        count += visit(node.left)
    if (node.right != none)
        count += visit(node.right)
    // self is included.
    count += 1
    // update the node
    node.size = count
    return count
The dictionary for visited nodes is not needed: since this is a tree, the traversal is guaranteed to end.
As a side note, the size attribute of each node is an important one. It basically upgrades your tree to an order statistic tree.
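To illustrate why the size attribute is useful, here is a small Python sketch that fills in the sizes with a post-order pass and then uses them to pick the k-th smallest key of a binary search tree. The names Node, compute_sizes and select are assumptions made for this example, and the tree is assumed to be a BST.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right
        self.size = 1


def compute_sizes(node):
    """Post-order pass: store each subtree's node count in node.size."""
    if node is None:
        return 0
    node.size = 1 + compute_sizes(node.left) + compute_sizes(node.right)
    return node.size


def select(node, k):
    """Return the k-th smallest key (1-based) of a BST whose sizes are valid."""
    while node is not None:
        left_size = node.left.size if node.left else 0
        if k == left_size + 1:
            return node.key
        if k <= left_size:
            node = node.left
        else:
            k -= left_size + 1  # skip the left subtree and this node
            node = node.right
    raise IndexError("k out of range")


root = Node(5, Node(3, Node(1)), Node(8))
print(compute_sizes(root))  # 4
print(select(root, 2))      # 3
```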
The concept is that each node learns its subtree size by first learning the subtree sizes of all its children (at most two here, as it is a binary tree). Once it knows those, it adds them up and finally adds 1 for itself. Its parent then does the same, and so on up to the root. A leaf node has no children, so its subtree size is just 1, counting itself.
Once this idea is clear, the code is easy to write. Below is the pseudocode of a traverse function which finds the subtree size of each node and stores it in the dictionary subtreeSizeDictionary; a visited dictionary/array with a larger scope is used to keep track of visited nodes.
traverse(Tree curNode, dictionary subtreeSizeDictionary)
    visited[curNode] = true
    subtreeSizeDictionary[curNode] = 0
    for child of curNode
        if (not visited[child])
            traverse(child, subtreeSizeDictionary)
        subtreeSizeDictionary[curNode] += subtreeSizeDictionary[child]
    subtreeSizeDictionary[curNode] += 1
Here it is a binary tree, but as you can see from the pseudocode this concept can be used for any valid tree. The time complexity is O(n), as we visit each node only once.

Exercise review Trees binary c++

Given a binary tree whose root holds a treasure, and whose internal nodes may or may not contain a dragon, you are asked to design an algorithm that returns the leaf of the tree whose path to the root has the lowest number of dragons. If there are multiple paths with the same number of dragons, the algorithm should return the leftmost of them. To do this, implement a function which receives a binary tree whose nodes store integers:
The root contains the integer 0, which represents the treasure.
The internal nodes contain the integer 1 to indicate that there is a dragon in the node, or the integer 2 to indicate that there is no dragon.
Each leaf stores an integer greater than or equal to 3, and no value is repeated. The function must return the integer stored in the leaf of the selected path. The tree has at least a root node and one leaf different from the root. For example, given the following tree (the second test case shown in the example), the algorithm returns the integer 4.
I cannot upload a picture of the example tree, but I would appreciate it if someone could explain in words how to go through all the branches and work out which path has the fewest dragons.
Regards!
You want to think about these problems recursively: if you're at a node with...
no children: you are a leaf, so you hold no dragon, only a leaf number; you consider yourself to have 0 dragons and to be the best node, and that is what you tell your parent if asked
a left branch and/or a right branch: you ask your children for their dragon-count and which node they consider best, and IF the left child reports a lesser or equal dragon count (or there is no right child)...
you take your best-node and dragon-count from it, ELSE
you take your best-node and dragon-count from the right child
then you add 1 to the dragon-count if your node stores the integer 1
By starting that processing at the root node, you get the result for the entire tree.
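The recursion described above might look like this in Python. This is a sketch under assumptions: a node is represented as a hypothetical tuple (value, left, right) with None for a missing child, and best_leaf is a name chosen for this example.

```python
def best_leaf(node):
    """Return (dragon_count, leaf_value) for the best leaf below node.

    A node is a tuple (value, left, right); a missing child is None.
    """
    value, left, right = node
    if left is None and right is None:
        return 0, value  # a leaf: no dragons, and it is its own best node
    candidates = [best_leaf(child) for child in (left, right) if child is not None]
    # min() keeps the first of several equal counts, so the left branch wins ties.
    count, leaf = min(candidates, key=lambda c: c[0])
    return count + (1 if value == 1 else 0), leaf


def leaf(value):
    return (value, None, None)


# Left branch passes a dragon (1), right branch does not (2).
tree = (0, (1, leaf(3), leaf(4)), (2, leaf(5), None))
print(best_leaf(tree))  # (0, 5)

# Both branches are dragon-free, so the leftmost leaf is chosen.
tie = (0, (2, leaf(3), None), (2, leaf(5), None))
print(best_leaf(tie))   # (0, 3)
```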
This is the first algorithm that comes to mind. Assume that you have an array node_value[NODE_NUM] that stores the values in the nodes, where NODE_NUM is the number of nodes in your tree, that you store the indices of each node's children in the arrays left[NODE_NUM] and right[NODE_NUM], and that your root has index root_index. We will store the number of dragons on the path to the root in the array dragon[NODE_NUM]. So the algorithm pseudocode is:
# the recursive function itself
process(node_index):
    if node_index has a left child:
        n_left <- 0
        if node_value[left[node_index]] = 1
            n_left <- 1
        dragon[left[node_index]] <- dragon[node_index] + n_left
        process(left[node_index])
    if node_index has a right child:
        n_right <- 0
        if node_value[right[node_index]] = 1
            n_right <- 1
        dragon[right[node_index]] <- dragon[node_index] + n_right
        process(right[node_index])
# the number of dragons in path from root to root is, obviously, zero:
dragon[root_index] <- 0
# Call the function itself
process(root_index)
After that, dragon holds the number of dragons on the way to the root for every node in the tree. Now all you have to do is loop through all nodes and find the leaf whose value is minimal (visit the leaves from left to right, so that the strict comparison below keeps the leftmost leaf in case of a tie):
min <- infinity
node_min <- unknown
for each node:
    if node_value[node] >= 3:
        if dragon[node] < min:
            min <- dragon[node]
            node_min <- node
return node_min
Now node_min is the leaf that has the fewest dragons on the path to the root.

Pseudo Code and conditions for deleting a Node in Binary Search Tree

I'm trying to write a function to remove a node from a binary tree. I haven't coded the function yet, and I am trying to think about the different conditions I should consider for removing a node. I am guessing that the possible conditions are:
The node has no children
The node has one child
The node has 2 children
In each of these cases what would be the algorithm to perform a delete function?
This is something you would find in any standard textbook about algorithms, but let's suppose you are interested in the unbalanced case (balanced trees usually perform some rebalancing operations called "rotations" after a removal) and you use the "obvious" data structure (a tree_node structure that holds the value and two pointers to other tree_node):
No children: release the memory held by the node and set the parent's child link that pointed to it to NULL;
One child: release the memory held by the node and set the parent's child link that pointed to it to the address of its only child;
Two children: this is indeed the "complicated" case. Find the rightmost node of the left subtree (or the leftmost node of the right subtree), take its value, remove it (it has at most one child, so it falls under one of the two cases above and can be removed recursively) and set the current node's value to the one of that node. This is O(tree_height), which is O(n) in the worst case, but that is not a problem (at least in theory) because this is the complexity of finding a node anyway.
Does your tree have any additional properties?
Is it an AVL?
If not, there are some pretty obvious and straightforward ways to do what you want (which will depend on your data representation, as Vitalij said).
And if it is an AVL tree, for example, there are ALSO some well-known methods for doing that (Wikipedia will tell you more on that topic).
The first task is to find whether the node exists, which is done during the search; the rest of your conditions are correct.
Leaf node: set the parent's child (right/left) to NULL.
Has one child: make the child of the node to be deleted the child of its parent.
Has two children: here you basically have to re-arrange the subtree, finding a replacement for the node to be deleted among its descendants.
Assuming you are dealing with general binary trees, do the following:
Node has no child, i.e. it is a leaf: simply delete it.
Node has one child: make the parent of the node to be deleted the parent of its child, then delete the node. I.e., if A->Parent = B; C->Parent = A; and A has to be deleted, then 1. Make C->Parent = B; 2. Delete A;
Node has two children: the tricky one. Yes, replacing the node to be deleted by the leftmost node of the right subtree works, or by the rightmost node of the left subtree; either will do, because it can be seen like this:
When a node is deleted, it has to be replaced by a node which satisfies some properties...
Let's say our binary tree yields sorted numbers (in increasing order) in an inorder traversal; then the deleted node should be replaced by some node from either of its subtrees. That node should be larger in value than the whole remaining left subtree, and smaller than the whole remaining right subtree (remaining means the subtree remaining after adjusting for the deleted node successfully). Only two such nodes exist: the leftmost node of the right subtree, or the rightmost node of the left one.
Hence, replacing the deleted node with either one suffices.
Delete the given keys one at a time from the binary search tree. Possible equal keys were inserted into the left branch of the existing node. Please note that the insertion strategy also affects how the deletion is performed.
BinarySearchTree-Delete
Node Delete(Node root, Key k)
    if (root == null) // failed search
        return null;
    if (k == root.key) // successful search
        return DeleteThis(root);
    if (k < root.key) // k in the left branch
        root.left = Delete(root.left, k);
    else // k > root.key, i.e., k in the right branch
        root.right = Delete(root.right, k);
    return root;
Node DeleteThis(Node root)
    if root has two children
        p = Largest(root.left); // replace root with its immediate predecessor p
        root.key = p.key;
        root.left = Delete(root.left, p.key)
        return root;
    if root has only left child
        return root.left
    if root has only right child
        return root.right
    else root has no children
        return null
Node Largest(Node root)
    if root has no right child
        return root
    return Largest(root.right)

Sum up child values and save the values calculated in intermediate steps

struct node {
    int value;
    struct node* left;
    struct node* right;
    int left_sum;
    int right_sum;
};
In a binary tree, from a particular node, there is a simple recursive algorithm to sum up all its child values. Is there a way to save the values calculated in the intermediate steps and store them as left_sum and right_sum in child nodes?
Would it be easier to do this bottom up by adding a struct node* parent link to the node definition?
No, this is clearly an exercise in recursion. Think about what the sum means. It's zero plus the "sum of all values from the root down".
Interestingly enough, the "sum of all values from the root down" is the value of the root node plus the "sum of all values from its left node down" plus the "sum of all values from its right node down".
Hopefully, you can see where I'm going here.
The essence of recursion is to define an operation in terms of similar, simpler, operations with a terminating condition.
The terminating condition, in this case, is the leaf nodes of the tree or, to make the code simpler, beyond the leaf nodes.
Examine the following pseudo-code:
def sumAllNodes (node):
    if node == NULL:
        return 0
    return node.value + sumAllNodes (node.left) + sumAllNodes (node.right)
fullSum = sumAllNodes (rootnode)
That's really all there is to it. With the following tree:
         __A(9)__
        /        \
     B(3)        C(2)
    /    \           \
D(21)    E(7)        F(1)
Using the pseudo-code, the sum is the value of A (9) plus the sums of the left and right subtrees.
The sum of the left subtree of A is the value of B (3) plus the sums of its left and right subtrees.
The sum of the left subtree of B is the value of D (21) plus the sums of its left and right subtrees.
The sum of the left subtree of D is 0, since it is NULL.
Later on, the sum of the right subtree of A is the value of C (2) plus the sums of its left and right subtrees, its left subtree being empty and its right subtree being F (1).
Because you're doing this recursively, you don't explicitly ever walk your way up the tree. It's the fact that the recursive calls are returning with the summed values which gives that ability. In other words, it happens under the covers.
And the other part of your question is not really useful though, of course, there may be unstated requirements that I'm not taking into account, because they're, well, ... unstated :-)
Is there a way to save the values calculated in the intermediate steps and store them as left_sum and right_sum in child nodes?
You never actually re-use the sums for a given sub-tree. During a sum calculation, you would calculate the B-and-below subtree only once as part of adding it to A and the C-and-below subtree.
You could store those values so that B contained both the value and the two sums (left and right) - this would mean that every change to the tree would have to propagate itself up to the root as well but it's doable.
Now there are some situations where that may be useful. For example, if the tree itself changes very rarely but you want the sum very frequently, it makes sense performance wise to do it on update so that the cost is amortised across lots of reads.
I sometimes use this method with databases (which are mostly read far more often than written) but it's unusual to see it in "normal" binary trees.
Another possible optimisation: just maintain the sum as a separate variable in the tree object. Initialise it to zero then, whenever you add a node, add its value to the sum.
When you delete a node, subtract its value from the sum. That gives you your very fast O(1) "return sum" function without having to propagate upwards on update.
The downside is that you only have a sum for the tree as a whole but I'm having a hard time coming up with a valid use case for needing the sum of subtrees. If you have such a use case, then I'd go for something like:
def updateAllNodes (node):
    if node == NULL:
        return 0
    node.leftSum = updateAllNodes (node.left)
    node.rightSum = updateAllNodes (node.right)
    return node.value + node.leftSum + node.rightSum
# change the tree somehow (possibly many times)
fullSum = updateAllNodes (root)
In other words, just update the entire tree after each change (or batch the changes then update if you know there's quite a few changes happening). This will probably be a little simpler than trying to do it as part of the tree update itself.
You can even use a separate dirtyFlag which is set to true whenever the tree changes and set to false whenever you calculate and store the sum. Then use that in the sum calculation code to only do the recalc if it's dirty (in other words, a cache of the sums).
That way, code like:
fullSum = updateAllNodes (root)
fullSum = updateAllNodes (root)
fullSum = updateAllNodes (root)
fullSum = updateAllNodes (root)
fullSum = updateAllNodes (root)
will only incur a cost on the first invocation. The other four should be blindingly fast since the sum is cached.
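A minimal Python sketch of that dirty-flag cache could look as follows. The class SummedTree, its method names and the recomputations counter are assumptions introduced for illustration; any code that changes the tree is expected to call mark_changed().

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right
        self.left_sum = self.right_sum = 0


def update_all_nodes(node):
    """Store the subtree sums in every node and return the total."""
    if node is None:
        return 0
    node.left_sum = update_all_nodes(node.left)
    node.right_sum = update_all_nodes(node.right)
    return node.value + node.left_sum + node.right_sum


class SummedTree:
    def __init__(self, root=None):
        self.root = root
        self._dirty = True        # nothing has been computed yet
        self._total = 0
        self.recomputations = 0   # only here to make the caching visible

    def mark_changed(self):
        """Call after every insertion, deletion or value change."""
        self._dirty = True

    def full_sum(self):
        if self._dirty:
            self._total = update_all_nodes(self.root)
            self._dirty = False
            self.recomputations += 1
        return self._total


# The example tree from above: A(9), B(3), C(2), D(21), E(7), F(1).
tree = SummedTree(Node(9, Node(3, Node(21), Node(7)), Node(2, None, Node(1))))
for _ in range(5):
    total = tree.full_sum()
print(total, tree.recomputations)  # 43 1
```

Only the first of the five calls walks the tree; the others return the cached total until mark_changed() is called again.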

Create Balanced Binary Search Tree from Sorted linked list

What's the best way to create a balanced binary search tree from a sorted singly linked list?
How about creating nodes bottom-up?
This solution's time complexity is O(N). Detailed explanation in my blog post:
http://www.leetcode.com/2010/11/convert-sorted-list-to-balanced-binary.html
Two traversals of the linked list are all we need. The first traversal gets the length of the list (which is then passed in as the parameter n into the function); the second creates the nodes in the list's order.
BinaryTree* sortedListToBST(ListNode *& list, int start, int end) {
    if (start > end) return NULL;
    // same as (start+end)/2, avoids overflow
    int mid = start + (end - start) / 2;
    BinaryTree *leftChild = sortedListToBST(list, start, mid-1);
    BinaryTree *parent = new BinaryTree(list->data);
    parent->left = leftChild;
    list = list->next;
    parent->right = sortedListToBST(list, mid+1, end);
    return parent;
}
BinaryTree* sortedListToBST(ListNode *head, int n) {
    return sortedListToBST(head, 0, n-1);
}
You can't do better than linear time, since you have to at least read all the elements of the list, so you might as well copy the list into an array (linear time) and then construct the tree efficiently in the usual way, i.e. if you had the list [9,12,18,23,24,51,84], then you'd start by making 23 the root, with children 12 and 51, then 9 and 18 become children of 12, and 24 and 84 become children of 51. Overall, should be O(n) if you do it right.
The actual algorithm, for what it's worth, is "take the middle element of the list as the root, and recursively build BSTs for the sub-lists to the left and right of the middle element and attach them below the root".
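The array-based approach can be sketched in a few lines of Python. The Node class and the function name build_balanced are assumptions made for this example; a Python list stands in for the array the linked list was copied into.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def build_balanced(values, lo=0, hi=None):
    """Build a balanced BST from the sorted slice values[lo:hi]."""
    if hi is None:
        hi = len(values)
    if lo >= hi:
        return None
    mid = (lo + hi) // 2  # the middle element becomes the root
    return Node(values[mid],
                build_balanced(values, lo, mid),
                build_balanced(values, mid + 1, hi))


root = build_balanced([9, 12, 18, 23, 24, 51, 84])
print(root.key, root.left.key, root.right.key)  # 23 12 51
```

Passing index bounds instead of slicing the list keeps the construction linear, since every element is visited exactly once.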
Best isn't only about asymptotic run time. The sorted linked list has all the information needed to create the binary tree directly, and I think this is probably what they are looking for.
Note that the first and third entries become children of the second, then the fourth node has children of the second and sixth (which has children the fifth and seventh) and so on...
In pseudocode:
read three elements, make a node from them, mark as level 1, push on stack
loop
    read three elements and make a node of them
    mark as level 1
    push on stack
    loop while top two entries on stack have same level (n)
        make node of top two entries, mark as level n + 1, push on stack
while elements remain in list
(with a bit of adjustment for when there are fewer than three elements left or an unbalanced tree at any point)
EDIT:
At any point, there is a left node of height N on the stack. Next step is to read one element, then read and construct another node of height N on the stack. To construct a node of height N, make and push a node of height N -1 on the stack, then read an element, make another node of height N-1 on the stack -- which is a recursive call.
Actually, this means the algorithm (even as modified) won't produce a balanced tree. If there are 2N+1 nodes, it will produce a tree with 2N-1 values on the left, and 1 on the right.
So I think @sgolodetz's answer is better, unless I can think of a way of rebalancing the tree as it's built.
Trick question!
The best way is to use the STL, and take advantage of the fact that the sorted associative container ADT, of which set is an implementation, demands that insertion of sorted ranges have amortized linear time. Any passable set of core data structures for any language should offer a similar guarantee. For a real answer, see the quite clever solutions others have provided.
What's that? I should offer something useful?
Hum...
How about this?
The smallest possible meaningful tree in a balanced binary tree is 3 nodes.
A parent, and two children. The very first instance of such a tree is the first three elements. Child-parent-Child. Let's now imagine this as a single node. Okay, well, we no longer have a tree. But we know that the shape we want is Child-parent-Child.
Done for a moment with our imaginings, we want to keep a pointer to the parent in that initial triumvirate. But it's singly linked!
We'll want to have four pointers, which I'll call A, B, C, and D. So, we move A to 1, set B equal to A and advance it one. Set C equal to B, and advance it two. The node under B already points to its right-child-to-be. We build our initial tree. We leave B at the parent of Tree one. C is sitting at the node that will have our two minimal trees as children. Set A equal to C, and advance it one. Set D equal to A, and advance it one. We can now build our next minimal tree. D points to the root of that tree, B points to the root of the other, and C points to the... the new root from which we will hang our two minimal trees.
How about some pictures?
[A][B][-][C]
With our image of a minimal tree as a node...
[B = Tree][C][A][D][-]
And then
[Tree A][C][Tree B]
Except we have a problem. The node two after D is our next root.
[B = Tree A][C][A][D][-][Roooooot?!]
It would be a lot easier on us if we could simply maintain a pointer to it instead of to it and C. Turns out, since we know it will point to C, we can go ahead and start constructing the node in the binary tree that will hold it, and as part of this we can enter C into it as a left-node. How can we do this elegantly?
Set the pointer of the Node under C to the node Under B.
It's cheating in every sense of the word, but by using this trick, we free up B.
Alternatively, you can be sane, and actually start building out the node structure. After all, you really can't reuse the nodes from the SLL, they're probably POD structs.
So now...
[TreeA]<-[C][A][D][-][B]
[TreeA]<-[C]->[TreeB][B]
And... Wait a sec. We can use this same trick to free up C, if we just let ourselves think of it as a single node instead of a tree. Because after all, it really is just a single node.
[TreeC]<-[B][A][D][-][C]
We can further generalize our tricks.
[TreeC]<-[B][TreeD]<-[C][-]<-[D][-][A]
[TreeC]<-[B][TreeD]<-[C]->[TreeE][A]
[TreeC]<-[B]->[TreeF][A]
[TreeG]<-[A][B][C][-][D]
[TreeG]<-[A][-]<-[C][-][D]
[TreeG]<-[A][TreeH]<-[D][B][C][-]
[TreeG]<-[A][TreeH]<-[D][-]<-[C][-][B]
[TreeG]<-[A][TreeJ]<-[B][-]<-[C][-][D]
[TreeG]<-[A][TreeJ]<-[B][TreeK]<-[D][-]<-[C][-]
[TreeG]<-[A][TreeJ]<-[B][TreeK]<-[D][-]<-[C][-]
We are missing a critical step!
[TreeG]<-[A]->([TreeJ]<-[B]->([TreeK]<-[D][-]<-[C][-]))
Becomes :
[TreeG]<-[A]->[TreeL->([TreeK]<-[D][-]<-[C][-])][B]
[TreeG]<-[A]->[TreeL->([TreeK]<-[D]->[TreeM])][B]
[TreeG]<-[A]->[TreeL->[TreeN]][B]
[TreeG]<-[A]->[TreeO][B]
[TreeP]<-[B]
Obviously, the algorithm can be cleaned up considerably, but I thought it would be interesting to demonstrate how one can optimize as you go by iteratively designing your algorithm. I think this kind of process is what a good employer should be looking for more than anything.
The trick, basically, is that each time we reach the next midpoint, which we know is a parent-to-be, we know that its left subtree is already finished. The other trick is that we are done with a node once it has two children and something pointing to it, even if all of the sub-trees aren't finished. Using this, we can get what I am pretty sure is a linear time solution, as each element is touched only 4 times at most. The problem is that this relies on being given a list that will form a truly balanced binary search tree. There are, in other words, some hidden constraints that may make this solution either much harder to apply, or impossible. For example, if you have an odd number of elements, or if there are a lot of non-unique values, this starts to produce a fairly silly tree.
Considerations:
Render the element unique.
Insert a dummy element at the end if the number of nodes is odd.
Sing longingly for a more naive implementation.
Use a deque to keep the roots of completed subtrees and the midpoints in, instead of mucking around with my second trick.
This is a Python implementation:
def sll_to_bbst(sll, start, end):
    """Build a balanced binary search tree from sorted linked list.

    This assumes that you have a class BinarySearchTree, with properties
    'l_child' and 'r_child'.

    Params:
        sll: sorted linked list, any data structure with 'popleft()' method,
            which removes and returns the leftmost element of the list. The
            easiest thing to do is to use 'collections.deque' for the sorted
            list.
        start: int, start index, on initial call set to 0
        end: int, on initial call should be set to len(sll)

    Returns:
        A balanced instance of BinarySearchTree

    This is a python implementation of solution found here:
    http://leetcode.com/2010/11/convert-sorted-list-to-balanced-binary.html
    """
    if start >= end:
        return None
    middle = (start + end) // 2
    l_child = sll_to_bbst(sll, start, middle)
    root = BinarySearchTree(sll.popleft())
    root.l_child = l_child
    root.r_child = sll_to_bbst(sll, middle+1, end)
    return root
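A minimal usage sketch for the function above, so it can be run end to end. The `BinarySearchTree` class here is a hypothetical stand-in: the answer only assumes `l_child` and `r_child`, while the `value` attribute and the `in_order` and `height` helpers are names I made up for the illustration.

```python
from collections import deque


class BinarySearchTree:
    """Hypothetical minimal node class; only l_child/r_child come from the answer."""

    def __init__(self, value):
        self.value = value
        self.l_child = None
        self.r_child = None


def sll_to_bbst(sll, start, end):
    """Same recursion as above: the left subtree is built before the root is popped."""
    if start >= end:
        return None
    middle = (start + end) // 2
    l_child = sll_to_bbst(sll, start, middle)
    root = BinarySearchTree(sll.popleft())
    root.l_child = l_child
    root.r_child = sll_to_bbst(sll, middle + 1, end)
    return root


def in_order(node):
    """Return the values of the tree in in-order (sorted) sequence."""
    if node is None:
        return []
    return in_order(node.l_child) + [node.value] + in_order(node.r_child)


def height(node):
    """Number of nodes on the longest root-to-leaf path."""
    if node is None:
        return 0
    return 1 + max(height(node.l_child), height(node.r_child))


values = deque([1, 2, 3, 4, 5, 6, 7])
tree = sll_to_bbst(values, 0, len(values))
print(in_order(tree), height(tree))
```

Note that the call consumes the deque: every element is popped exactly once, in sorted order, which is what makes the construction linear.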
Instead of a sorted linked list, I was asked about a sorted array (logically it makes no difference, although the run time varies) and had to create a BST of minimal height. This is the code I came up with:
#include <stdio.h>
#include <stdlib.h>

typedef struct Node {
    struct Node *left;
    int info;
    struct Node *right;
} Node_t;

/* The sorted input array; Bin() reads it as a global. */
int a[] = {6, 7, 8, 9, 10, 11, 12};

/* Allocate a leaf node holding info. */
Node_t* CreateNode(int info) {
    Node_t* node = malloc(sizeof(Node_t));
    if (node == NULL) {
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    node->info = info;
    node->left = NULL;
    node->right = NULL;
    return node;
}//CreateNode(info)

/* Build a minimal-height BST from a[low..high] (both bounds inclusive). */
Node_t* Bin(int low, int high) {
    Node_t* node = NULL;
    int mid = 0;
    if (low <= high) {
        mid = (low + high) / 2;
        node = CreateNode(a[mid]);
        printf("DEBUG: creating node for %d\n", a[mid]);
        node->left = Bin(low, mid - 1);
        node->right = Bin(mid + 1, high);
        return node;
    }//if(low <=high)
    else {
        return NULL;
    }
}//Bin(low,high)

// call function for an array example: 6 7 8 9 10 11 12, it gets you desired
// result
int main(void) {
    Node_t* root = Bin(0, 6);
    (void)root;
    return 0;
}
HTH Somebody..
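To check the "minimal height" claim for the array version, here is a small Python sketch of the same midpoint recursion. The tuple representation `(left, info, right)` and the function names are my own assumptions, not part of the answer. For n nodes the smallest possible height is `n.bit_length()`, and the sketch compares against that.

```python
def build_min_height(a, low, high):
    """Build a BST from the sorted list a[low..high], both bounds inclusive.

    Nodes are plain (left, info, right) tuples, a hypothetical representation.
    """
    if low > high:
        return None
    mid = (low + high) // 2
    return (
        build_min_height(a, low, mid - 1),
        a[mid],
        build_min_height(a, mid + 1, high),
    )


def height(node):
    """Number of nodes on the longest root-to-leaf path."""
    if node is None:
        return 0
    left, _, right = node
    return 1 + max(height(left), height(right))


a = [6, 7, 8, 9, 10, 11, 12]
tree = build_min_height(a, 0, len(a) - 1)
print(tree[1], height(tree))

# Compare with the smallest height any binary tree with n nodes can have.
for n in range(1, 100):
    t = build_min_height(list(range(n)), 0, n - 1)
    assert height(t) == n.bit_length()
```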
This is the recursive algorithm, in pseudocode, that I would suggest.
// Builds a tree from the half-open list range [start, end), attaches it
// to parent (if any) and returns its root. Assumes distinct values.
treenode* createTree(treenode *parent, linknode *start, linknode *end)
{
    if(start == end)
    {
        return null;
    }
    ptrsingle=start;
    ptrdouble=start;
    while(ptrdouble != end and ptrdouble->next != end)
    {
        ptrsingle=ptrsingle->next;
        ptrdouble=ptrdouble->next->next;
    }
    //ptrsingle will now be at the middle element.
    treenode *cur_node=Allocatememory;
    cur_node->data = ptrsingle->data;
    if(parent != null)
    {
        if(cur_node->data (less than) parent->data)
            parent->left=cur_node
        else
            parent->right=cur_node
    }
    createTree(cur_node, start, ptrsingle);
    // Skip the middle element itself, otherwise the recursion never ends.
    createTree(cur_node, ptrsingle->next, end);
    return cur_node;
}
The initial call will be Root = createTree(null, list, null);
We are doing the recursive building of the tree, but without using the intermediate array.
To get to the middle element every time we are advancing two pointers, one by one element, other by two elements. By the time the second pointer is at the end, the first pointer will be at the middle.
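The running time will be O(n log n), because every level of the recursion walks its part of the list to find the middle. The extra space will be O(log n) for the recursion stack. This is not an efficient solution for a real situation, where you could use a red-black tree, which guarantees O(log n) per insertion and so O(n log n) in total. But it is good enough for an interview.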
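The two-pointer walk can be sketched in Python as follows. `ListNode`, `from_values`, `find_middle` and `build` are hypothetical names for this illustration, and the tree is returned as nested `(left, data, right)` tuples rather than attached to a parent, which is a simplification of the pseudocode above.

```python
class ListNode:
    """Hypothetical singly linked list node."""

    def __init__(self, data, next=None):
        self.data = data
        self.next = next


def from_values(values):
    """Build a linked list from a Python list and return its head."""
    head = None
    for v in reversed(values):
        head = ListNode(v, head)
    return head


def find_middle(start, end=None):
    """Return the middle node of the half-open range [start, end).

    The slow pointer moves one step for every two steps of the fast one.
    """
    slow = fast = start
    while fast is not end and fast.next is not end:
        slow = slow.next
        fast = fast.next.next
    return slow


def build(start, end=None):
    """Recursively build nested (left, data, right) tuples over [start, end)."""
    if start is end:
        return None
    mid = find_middle(start, end)
    return (build(start, mid), mid.data, build(mid.next, end))


head = from_values([1, 2, 3, 4, 5])
print(find_middle(head).data)
print(build(from_values([1, 2, 3])))
```

For an even number of elements this variant picks the upper of the two middle nodes, so the left subtree gets the extra element.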
As in the answers from #Stuart Golodetz and #Jake Kurzer, the important thing is that the list is already sorted.
In #Stuart's answer, the array he presented is the backing data structure for the BST. The find operation, for example, would just need to perform array index calculations to traverse the tree. Growing the array and removing elements would be the trickier part, so I'd prefer a vector or another data structure with constant-time lookup.
#Jake's answer also uses this fact, but unfortunately requires you to traverse the list each time you do a get(index) operation. On the other hand, it requires no additional memory.
Unless it was specifically mentioned by the interviewer that they wanted an object structure representation of the tree, I would use #Stuart's answer.
In a question like this you'd be given extra points for discussing the tradeoffs and all the options that you have.
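One way to read "the array is the backing data structure" is that the sorted array is an implicit BST whose root, for any index range, is the midpoint of that range. This is my interpretation and may not match the exact layout used in the answer referred to. Under that assumption, a find is just index arithmetic (the function name `find` is hypothetical):

```python
def find(sorted_values, target):
    """Search a sorted list as an implicit BST; return the index or -1."""
    low, high = 0, len(sorted_values) - 1
    while low <= high:
        mid = (low + high) // 2  # root of the implicit subtree [low, high]
        if sorted_values[mid] == target:
            return mid
        if target < sorted_values[mid]:
            high = mid - 1  # descend into the implicit left subtree
        else:
            low = mid + 1  # descend into the implicit right subtree
    return -1


data = [6, 7, 8, 9, 10, 11, 12]
print(find(data, 9), find(data, 5))
```

The tradeoff mentioned above shows up here: lookups need no node objects at all, but inserting or removing a value means shifting array elements.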
Hope the detailed explanation on this post helps:
http://preparefortechinterview.blogspot.com/2013/10/planting-trees_1.html
A slightly improved implementation from #1337c0d3r in my blog.
// create a balanced BST using #len elements starting from #head & move #head forward by #len
TreeNode *sortedListToBSTHelper(ListNode *&head, int len) {
    if (0 == len) return NULL;
    auto left = sortedListToBSTHelper(head, len / 2);
    auto root = new TreeNode(head->val);
    root->left = left;
    head = head->next;
    root->right = sortedListToBSTHelper(head, (len - 1) / 2);
    return root;
}

TreeNode *sortedListToBST(ListNode *head) {
    // length() is assumed to return the number of nodes in the list
    int n = length(head);
    return sortedListToBSTHelper(head, n);
}
If you know how many nodes are in the linked list, you can do it like this:
// Gives path to subtree being built. If branch[N] is false, branch
// less from the node at depth N, if true branch greater.
bool branch[max depth];

// If rem[N] is true, then for the current subtree at depth N, its
// greater subtree has one more node than its less subtree.
bool rem[max depth];

// Depth of root node of current subtree.
unsigned depth = 0;

// Number of nodes in current subtree.
unsigned num_sub = Number of nodes in linked list;

// The algorithm relies on a stack of nodes whose less subtree has
// been built, but whose greater subtree has not yet been built. The
// stack is implemented as linked list. The nodes are linked
// together by having the "greater" handle of a node set to the
// next node in the list. "less_parent" is the handle of the first
// node in the list.
Node *less_parent = nullptr;

// h is root of current subtree, child is one of its children.
Node *h, *child;

Node *p = head of the sorted linked list of nodes;

LOOP // loop unconditionally

    LOOP WHILE (num_sub > 2)
        // Subtract one for root of subtree.
        num_sub = num_sub - 1;
        rem[depth] = !!(num_sub & 1); // true if num_sub is an odd number
        branch[depth] = false;
        depth = depth + 1;
        num_sub = num_sub / 2;
    END LOOP

    IF (num_sub == 2)
        // Build a subtree with two nodes, slanting to greater.
        // I arbitrarily chose to always have the extra node in the
        // greater subtree when there is an odd number of nodes to
        // split between the two subtrees.
        h = p;
        p = the node after p in the linked list;
        child = p;
        p = the node after p in the linked list;
        make h and child into a two-element AVL tree;
    ELSE // num_sub == 1
        // Build a subtree with one node.
        h = p;
        p = the next node in the linked list;
        make h into a leaf node;
    END IF

    LOOP WHILE (depth > 0)
        depth = depth - 1;
        IF (not branch[depth])
            // We've completed a less subtree, exit while loop.
            EXIT LOOP;
        END IF

        // We've completed a greater subtree, so attach it to
        // its parent (that is less than it). We pop the parent
        // off the stack of less parents.
        child = h;
        h = less_parent;
        less_parent = h->greater_child;
        h->greater_child = child;
        num_sub = 2 * (num_sub - rem[depth]) + rem[depth] + 1;
        IF (num_sub & (num_sub - 1))
            // num_sub is not a power of 2
            h->balance_factor = 0;
        ELSE
            // num_sub is a power of 2
            h->balance_factor = 1;
        END IF
    END LOOP

    IF (num_sub == number of nodes in original linked list)
        // We've completed the full tree, exit outer unconditional loop
        EXIT LOOP;
    END IF

    // The subtree we've completed is the less subtree of the
    // next node in the sequence.
    child = h;
    h = p;
    p = the next node in the linked list;
    h->less_child = child;

    // Put h onto the stack of less parents.
    h->greater_child = less_parent;
    less_parent = h;

    // Proceed to creating greater than subtree of h.
    branch[depth] = true;
    num_sub = num_sub + rem[depth];
    depth = depth + 1;

END LOOP

// h now points to the root of the completed AVL tree.
For an encoding of this in C++, see the build member function (currently at line 361) in https://github.com/wkaras/C-plus-plus-intrusive-container-templates/blob/master/avl_tree.h . It's actually more general, a template using any forward iterator rather than specifically a linked list.

Resources