Exercise review Trees binary c++ - algorithm

Given a binary tree, whose root is located a treasure, and whose internal nodes can contain a dragon or does not contain anything, you are asked to design an algorithm that tells us the leaf of the tree whose path to the root has the lowest number of dragons. In the event that there are multiple paths with the same number of dragons, the algorithm will return that which is more to the left of all them. To do this, implement a function which gets a binary tree whose nodes store integers:
The root contains the integer 0, which represents the treasure.
The internal nodes contain the integer 1 to indicate that the node there is a dragon or the integer 2 to indicate that there is no dragon.
In each leaf stores an integer greater than or equal to 3 that cannot be repeated. and return the whole sheet to the path selected. The tree has at least one root node and a leaf node different from the root. For example, given the following tree (the second test case shown in the example), the algorithm return the integer 4.
I can not upload a picture of the tree of example, but someone tell me with words that I can do to go through all the branches, and to know which is the path with less dragons I'd appreciate it.
A greeting!

You want to think about these problems recursively: if you're at a parent node with...
no children you must have no dragon and a node counter, and you consider yourself to have 0 dragons and be the best node: you'd tell your parent that if asked
a left branch and/or a right branch, then you ask your children for their dragon-count and which node they consider best, and IF the left node reports a lesser or equal dragon count...
you take your best-node and dragon-count from it, ELSE
you take your best-node and dragon-count from the right node
then you add 1 to the dragon-count if your node's storing the integer 1
By starting that processing at the root node, you get the result for the entire tree.

This is the first algorithm that comes to mind. Assuming that you have an array that stores the values in nodes node_value[NODE_NUM], where NODE_NUM is the number of nodes in your tree, and you store index of childs of each node with the arrays left[NODE_NUM] and right[NODE_NUM], and your root will have index root_index. We will store information about the number of dragons in the path to root in the array dragon[NODE_NUM] So the algorithm pseudocode is:
# the recursive function itself
process(node_index):
n_left <- 0
if node_value[left[node_index]] = 1
n_left <- 1
n_right <- 0
if node_value[right[node_index]] = 1
n_right <- 1
dragon[left[node_index]] <- dragon[node_index] + n_left
dragon[right[node_index]] <- dragon[node_index] + n_right
process(left[node_index])
process(right[node_index])
# the number of dragons in path from root to root is, obviously, zero:
dragon[root_index] <- 0
# Call the function itself
process(root_index)
After that, in dragon we will have the number of dragons in the way to root from every nodes in tree. Now, all you have to do is to loop through all nodes and find the node that is a leaf and that its values is minimal:
min <- infinity
node_min <- unknown
for each node:
if node_value[node] >= 3:
if dragon[node] < min:
min <- dragon[node]
node_min <- node
return node_min
Now, the node_min is the node that has least dragons in the path to root.

Related

Calculating the number of black nodes in each path in Red Black Tree

What recursion algorithm should I use to calculate the number of black nodes in each path?
In a red-black tree, the number of black nodes on any path is constant. The process can be broken into two parts.
Part 1: get a path. Here, a path is defined as a sequence of nodes.
To get a path from the empty tree, return the path containing only the empty tree.
To get a path from a non-empty tree, get the path of its left child and append the non-empty tree to this path.
Part 2: count the number of black nodes in the path.
Hopefully, this one is self-explanatory.
What recursion algorithm should I use to calculate the number of black nodes in each path?
None. Use an iterative approach. So if you start from the root, increment a counter initialised to 0 every time you see a black node including the root and keep going leftwards until the left node pointer is null. Whatever the count is, that is the black depth down any given path. In pseudo code:
int bcount = 0;
Node *currnode = GetRoot();
while (currnode != NULL)
{
if (currnode->GetColour() == BLACK)
++bcount:
currnode = currnode->Left();
}

Generate all the leaf to leaf path in an n-array tree

Given an N-ary tree, I have to generate all the leaf to leaf paths in an n-array tree. The path should also denote the direction. As an example:
Tree:
1
/ \
2 6
/ \
3 4
/
5
Paths:
5 UP 3 UP 2 DOWN 4
4 UP 2 UP 1 DOWN 6
5 UP 3 UP 2 UP 1 DOWN 6
These paths can be in any order, but all paths need to be generated.
I kind of see the pattern:
looks like I have to do in order traversal and
need to save what I have seen so far.
However, can't really come up with an actual working algorithm.
Can anyone nudge me to the correct algorithm?
I am not looking for the actual implementation, just the pseudo code and the conceptual idea would be much appreciated.
The first thing I would do is to perform in-order traversal. As a result of this, we will accumulate all the leaves in the order from the leftmost to the rightmost nodes.(in you case this would be [5,4,6])
Along the way, I would certainly find the mapping between nodes and its parents so that we can perform dfs later. We can keep this mapping in HashMap(or its analogue). Apart from this, we will need to have the mapping between nodes and its priorities which we can compute from the result of the in-order traversal. In your example the in-order would be [5,3,2,4,1,6] and the list of priorities would be [0,1,2,3,4,5] respectively.
Here I assume that our node looks like(we may not have the mapping node -> parent a priori):
class TreeNode {
int val;
TreeNode[] nodes;
TreeNode(int x) {
val = x;
}
}
If we have n leaves, then we need to find n * (n - 1) / 2 paths. Obviously, if we have managed to find a path from leaf A to leaf B, then we can easily calculate the path from B to A. (by transforming UP -> DOWN and vice versa)
Then we start traversing over the array of leaves we computed earlier. For each leaf in the array we should be looking for paths to leaves which are situated to the right of the current one. (since we have already found the paths from the leftmost nodes to the current leaf)
To perform the dfs search, we should be going upwards and for each encountered node check whether we can go to its children. We should NOT go to a child whose priority is less than the priority of the current leaf. (doing so will lead us to the paths we already have) In addition to this, we should not visit nodes we have already visited along the way.
As we are performing dfs from some node, we can maintain a certain structure to keep the nodes(for instance, StringBuilder if you program in Java) we have come across so far. In our case, if we have reached leaf 4 from leaf 5, we accumulate the path = 5 UP 3 UP 2 DOWN 4. Since we have reached a leaf, we can discard the last visited node and proceed with dfs and the path = 5 UP 3 UP 2.
There might be a more advanced technique for solving this problem, but I think it is a good starting point. I hope this approach will help you out.
I didn't manage to create a solution without programming it out in Python. UNDER THE ASSUMPTION that I didn't overlook a corner case, my attempt goes like this:
In a depth-first search every node receives the down-paths, emits them (plus itself) if the node is a leaf or passes the down-paths to its children - the only thing to consider is that a leaf node is a starting point of a up-path, so these are input from the left to right children as well as returned to the parent node.
def print_leaf2leaf(root, path_down):
for st in path_down:
st.append(root)
if all([x is None for x in root.children]):
for st in path_down:
for n in st: print(n.d,end=" ")
print()
path_up = [[root]]
else:
path_up = []
for child in root.children:
path_up += child is not None and [st+[root] for st in print_root2root(child, path_down + path_up)] or []
for st in path_down:
st.pop()
return path_up
class node:
def __init__(self,d,*children):
self.d = d
self.children = children
## 1
## / \
## 2 6
## / \ /
## 3 4 7
## / / | \
## 5 8 9 10
five = node(5)
three = node(3,five)
four = node(4)
two = node(2,three,four)
eight = node(8)
nine = node(9)
ten = node(10)
seven = node(7,eight,nine,ten)
six = node(6,None,seven)
one = node(1,two,six)
print_leaf2leaf(one,[])

Job Interview Question Using Trees, What data to save?

I was solving the following job interview question and solved most of it but failed at the last requirement.
Q: Build a data structure which supports the following functions:
Init - Initialise Empty DS. O(1) Time complexity.
SetPositiveInDay(d,x) - Add to the DS that in day d exactly x new people were infected with covid-19. O(log n)Time complexity.
WorseBefore(d) - From the days inserted into the DS and smaller than d return the last one which has more newly infected people than d. O(log n)Time complexity.
For example:
Init()
SetPositiveInDay(1,10)
SetPositiveInDay(2,20)
SetPositiveInDay(3,15)
SetPositiveInDay(5,17)
SetPositiveInDay(23,180)
SetPositiveInDay(8,13)
SetPositiveInDay(13,18)
WorstBefore(13) // Returns day #2
SetPositiveInDay(10,19)
WorstBefore(13) // Returns day #10
Important note: you can't suppose that days will be entered by order and can't suppose too that there won't be "gaps" between days. (Some days may not be saved in the DS while those after it may be).
What I did?
I used AVL tree (I could use 2-3 tree too).
For each node I have:
Sick - Number of new infected people in that day.
maxLeftSick - Max number of infected people for left son.
maxRightSick - Max number of infected people for right son.
When inserted a new node I made sure that in rotation data won't get missed plus, for each single node from the new one till the root I did:
But I wasn't successful implementing WorseBefore(d).
Where to search?
First you need to find the node node corresponding to d in the tree ordered by days. Let x = Sick(node). This can be done in O(log n).
If maxLeftSick(node) > x, the solution must be in the left subtree of node. Search for the solution there and return the answer. This can be done in O(log n) - see below.
Otherwise, traverse the tree upwards towards the root, starting from node, until you find the first node nextPredecessor satisfying this property (this takes O(log n)):
nextPredecessor is smaller than node,
and either
Sick(nextPredecessor) > x or
maxLeftSick(nextPredecessor) > x.
If no such node exists, we give up. In case 1, just return nextPredecessor since that is the best solution.
In case 2, we know that the solution must be in the left subtree of nextPredecessor, so search there and return the answer. Again, this takes O(log n) - see below.
Note that there is no need to search in the right subtree of nextPredecessor since the only nodes that are smaller than node in that subtree would be the left subtree of node itself, and we have already excluded that.
Note also that it is not necessary to traverse further up the tree than nextPredecessor since those nodes are even smaller, and we are looking for the largest node satisfying all constraints.
How to search?
OK, so how do we search for the solution in a subtree? Finding the largest day within a subtree rooted in q that is worse than an infection number x is simple using the maxLeftSick and maxRightSick information:
If q has a right child and maxRightSick(q) > x then search in the right subtree of q.
If q has no right child and Sick(q) > x, return Day(q).
If q has a left child and maxLeftSick(q) > x then search in the left subtree of q.
Otherwise there is no solution within the subtree q.
We are effectively using maxLeftSick and maxRightSick to prune the search tree to include only "worse" nodes, and within that pruned tree we get the right most node, i.e. the one with the largest day.
It is easy to see that this algorithm runs in O(log n) where n is the total number of nodes since the number of steps is bounded by the height of the tree.
Pseudocode
Here is the pseudocode (assuming maxLeftSick and maxRightSick return -1 if no corresponding child node exists):
// Returns the largest day smaller than d such that its
// infection number is larger than the infection number on day d.
// Returns -1 if no such day exists.
int WorstBefore(int d) {
node = find(d);
// try to find the solution in the left subtree
if (maxLeftSick(node) > Sick(node)) {
return FindLastWorseThan(node -> left, Sick(node));
}
// move up towards root until we find the first node
// that is smaller than `node` and such that
// Sick(nextPredecessor) > Sick(node) or
// maxLeftSick(nextPredecessor) > Sick(node).
nextPredecessor = findNextPredecessor(node);
if (nextPredecessor == null) return -1;
// Case 1
if (Sick(nextPredecessor) > Sick(node)) return nextPredecessor;
// Case 2: maxLeftSick(nextPredecessor) > Sick(node)
return FindLastWorseThan(nextPredecessor -> left, Sick(node));
}
// Finds the latest day within the given subtree with root "node" where
// the infection number is larger than x. Runs in O(log(size(q)).
int FindLastWorseThan(Node q, int x) {
if ((q -> right) = null and Sick(q) > x) return Day(q);
if (maxRightSick(q) > x) return FindLastWorseThan(q -> right, x);
if (maxLeftSick(q) > x) return FindLastWorseThan(q -> left, x);
return -1;
}
First of all, your chosen data structure looks fine to me. You did not mention it explicitly, but I assume that the "key" you use in the AVL tree is the day number, i.e. an in-order traversal of the tree would list the nodes in their chronological order.
I would just suggest a cosmetic change: store the maximum value of sick in the node itself, so that you don't have two similar informations (maxLeftSick and maxRightSick) stored in one node instance, but move those two informations to the child nodes, so that your node.maxLeftSick is actually stored in node.left.maxSick, and similarly node.maxRightSick is stored in node.right.maxSick. This is of course not done when that child does not exist, but then we don't need that information either. In your structure maxLeftSick would be 0 when left is not defined. In my proposed structure, you would not have that value -- the 0 would follow naturally from the fact that there is no left child. In my proposal, the root node would have an information in maxSick which is not present in yours, and which would be the sum of your root.maxLeftSick and root.maxRightSick. This information would not really be used, but it is just there to make the structure consistent throughout the tree.
So you would just store one maxSick, which considers the current node's sick value also in that maximum. The processing you do during rotations will need to change accordingly, but will not become more complex.
I will assume that your AVL tree is single-threaded, i.e. you don't keep track of parent-pointers. So create a find method which will return the path to the node to be found. For instance, in Python syntax, it could look like this:
def find(self, day):
node = self.root
path = [] # an array of nodes
while node:
path.append(node)
if node.day == day: # bingo
return path
if day < node.day:
node = node.left
else:
node = node.right
Then the worstBefore method could look like this:
def worstBefore(self, day):
path = self.find(day)
if not path:
return # day not found
# get number of sick people on that day:
sick = path[-1].sick
# look for recent day with greater number of sick
while path:
node = path.pop() # walk upward, starting with found node
if node.day < day and node.sick > sick:
return node.day
if node.left and node.left.maxSick > sick:
# we will find the result in this subtree
node = node.left
while True:
if node.right and node.right.maxSick > sick:
node = node.right
elif node.sick > sick: # bingo
return node.day
else:
node = node.left
So the path returned by the find method will be used to get the parents of a node when you need to backtrack upwards in the tree along that path.
If along that path you find a left child whose maxSick is greater, then you know that the targeted node must be in that subtree. It is then a matter to walk down that subtree in a controlled way, choosing the right child when it still has maxSick greater. Otherwise check the current node's sick value and return that one if that value is greater. Otherwise go left, and repeat.
While there is no such left sub tree, go up along the path. If that parent would be a match, then return it (make sure to verify the day number). Keep checking for left sub trees that have a larger maxSick.
This runs in O(logn) because you first will walk zero or more steps upward and then zero or more steps downward (in a left subtree).
You can see your example scenario run on repl.it. There I focussed on this question, and didn't implement the rotations.

What is the pseudocode for this binary tree

basically i am required to come out with a pseudocode for this. What i currently have is
dictionary = {}
if node.left == none and node.right == none
visit(node)
dictionary[node] = 1
This is only the leaf nodes, how do i get the size for each node(parent and root)?
You can do a post-order traversal to find the size of each node.
The idea is to first handle both left and right trees. Then, after they are processed - you can use this data to process the current node.
This should look something like:
count = 0
if (node.left != none)
count += visit(node.left)
if (node.right != none)
count += visit(node.right)
// self is included.
count += 1
// update the node
node.size = count
return count
The dictionary for visited nodes is not needed since this is a tree, it guarantees to end.
As a side note - the size attribute of each node, is an important one. It basically upgrades your tree to a Order Statistics Tree
well the concept is that each node will know it's subtree size by first knowing the subtree size of all it's child which is maximum two child here as it is a binary tree, so once it knows subtree size of all child it can then add up all of them and atlast add 1 to it's
result and then the same thing will be done by it's parent also and so on upto root node. if we think about leaf node, it
has no child, so result subtree size will be only 1 in which it include itself.
one this idea is clear, it is easy to write code
that while traversing we will first know the subtree size of child nodes of current node then add 1 in it, in case of leaf node it will have subtree size of 1 only, below is the pseudocode of traverse funtion which finds the subtree size of each node and store them in dictionary sizeDictionary and a visited dictionary/array having larger scope has been used to keep track of visited nodes.
traverse(Tree curNode, dictionary subTreeSizeDictionary)
visited[curNode] = true
subtreeSizeDictionary[curNode] = 0
for child of curNode
if(notVisited[child])
traverse(child , sizeDictionary)
subtreeSizeDictionary[curNode] += subtreeSizeDictionary[child]
subtreeSizeDictionary[curNode] += 1;
here it is binary tree, but as you can see from pseudocode this concept can be used for any valid tree, the time complexity is O(n) as we visited each node only once.

Algorithm for finding the size of the largest connected region of nodes of the same value in a tree

This is a question I got in an interview, and I'm still not fully sure how to solve it.
Let's say we have a tree of numbers, and we want to find the size of the largest connected region in the tree whose nodes have the same value. For example, in this tree
3
/ \
3 3
/ \ / \
1 2 3 4
The answer is 4, because you have a region of 4 connected 3s.
I would suggest a depth first search with a function that takes two inputs:
A target value
A start node
and returns two outputs:
the size of the subtree of node with values equal to the target value
the largest size of connected region within the subtree of node
You can then call this function with a dummy target value (e.g. -1) and the root node and it will return the answer in the second output.
In pseudocode:
dfs(target_value,start_node):
if start_node.value == target_value:
total = 1
best = 0
for each child of start_node:
x,m = dfs(target_value,child)
best = max(m,best)
total += x
return total,best
else
x,m = dfs(start_node.value,start_node)
return 0,max(x,m)
_,ans = dfs(-1, root_node)
print ans
Associate a counter with each node to represent the largest connected region rooted at that node where all the nodes are the same value. Initialize this counter to 1 for every node.
Run DFS on the tree.
When you back up from any node, if both nodes have the same value, add the child node's counter to the parent's counter.
When you're done, the largest counter associated with a node is your answer. You can keep track of this as you run the algorithm.

Resources