This question was taken from a bigger one preparing for a job interview (Solved the rest of it successfully)
Question: Suggest a data structure to handle boxes, where each box has a unique ID, a weight and a size.
I want to save those boxes in an AVL tree and be able to solve the following problem:
Among all boxes with size at most v (in other words, size <= v), I want to find the heaviest one.
How can I do this in O(log n), where n is the total number of saved boxes?
I know that the solution would be saving some extra data in each node but I'm not sure which data would be helpful (no need to explain how to fix the data in rotation etc)
Example of extra data saved in each node: Id of heaviest box in right sub-tree.
It sounds like you're already on the right track: each node stores its heaviest descendant. The only missing piece is coming up with a set of log(n) nodes such that the target node is the descendant of one of them.
In other words, you need to identify all the subtrees of the AVL tree which consist entirely of nodes whose size is less than (i.e. are to the left of) your size=v node.
Which ones are those? Well, for one thing, the left subtree of your size=v node, of course. Then walk from that node up to the root. For every node on that path which is a right child, consider its parent together with its left sibling's subtree. The nodes and subtrees you examine along the way together cover exactly the nodes to the left of the size=v node.
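As a simplification, you can combine the upper-bound size search with the search for the target node. Walk down from the root. Whenever the current node satisfies the size constraint, so does its entire left subtree, so compare the node and the left subtree's stored heaviest descendant against the best box found so far, then continue into the right subtree. Otherwise, continue into the left subtree. Note that the right child's maxWeightDescendant cannot be used to pick a direction, since it may belong to a box with size > v. Here maxWeightDescendant is the heaviest node in a subtree, the subtree's root included.
max = null
x = root
while x is not null:
    if x.size <= v:
        # x and everything in its left subtree have size <= v
        max = heaviest of (max, x, x.left.maxWeightDescendant if x.left exists)
        x = x.right
    else:
        x = x.left
return max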
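The descent can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a full AVL implementation: the tree is keyed by size, rebalancing is omitted, and the names `Node`, `best`, `heavier`, `insert` and `heaviest_with_size_at_most` are hypothetical. The `best` field plays the role of the extra per-node data (heaviest box in the subtree, the node itself included).

```python
class Node:
    def __init__(self, box_id, size, weight):
        self.box_id, self.size, self.weight = box_id, size, weight
        self.left = self.right = None
        self.best = self  # heaviest box in the subtree rooted here (self included)


def heavier(a, b):
    """Return the heavier of two nodes; None stands for 'no box'."""
    if a is None or b is None:
        return a if b is None else b
    return a if a.weight >= b.weight else b


def insert(root, node):
    """Plain BST insert keyed by size. AVL rotations are deliberately omitted;
    a real implementation would also recompute `best` after each rotation."""
    if root is None:
        return node
    if node.size < root.size:
        root.left = insert(root.left, node)
    else:
        root.right = insert(root.right, node)
    root.best = heavier(root.best, node)
    return root


def heaviest_with_size_at_most(root, v):
    """Heaviest box with size <= v, visiting one node per tree level."""
    best, x = None, root
    while x is not None:
        if x.size <= v:
            # x and its whole left subtree satisfy the size constraint.
            best = heavier(best, x)
            if x.left is not None:
                best = heavier(best, x.left.best)
            x = x.right
        else:
            x = x.left
    return best


root = None
for box in [("A", 5, 1), ("B", 3, 10), ("C", 7, 100), ("D", 6, 2)]:
    root = insert(root, Node(*box))
print(heaviest_with_size_at_most(root, 6).box_id)  # B
```

With v = 6 the heaviest box overall (C, size 7) is excluded, and the answer comes from the left subtree's stored maximum rather than from any node visited after turning right.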
Suppose I have a binary tree in which a node can have either 0,1 or 2 children. A cost value is associated with each node, and it can be {5,10,20,40}. The most optimal placement of a new node is under a node with same or lower cost value. For example- a new node with cost value 20 is best placed under a node with cost value 20, but can also be placed under nodes with cost values 5 and 10.
The primary requirement of this algorithm is to complete the left and right children of a node where required, i.e. if a node with cost value 10 has a left child with cost value 10, then a new node with cost value 10 will be made the right child of that node. The secondary requirement is to maximize the overall depth of the tree.
The tree cannot be rearranged at any point in time. If an incoming node is of lesser value, then there is no penalty involved.
Given the above requirements, how can we decide the best position of an incoming new node in the tree? Can we write a general algorithm for it?
Initially, I thought to complete each level of the tree first, but I don't think it would be optimal.
The secondary requirement is to maximize the overall depth of the tree.
That's a bit unusual.
The quickest way:
1. Sort your input values.
2. Fill in all the minimal-value nodes (the 5s) while respecting the first requirement. It is still unclear whether both the left and right children must be filled in before going down a level. If they must, the max depth will be log2(N5), where N5 is the number of 5-valued nodes. If "going deep on the left" is allowed without filling in the right, the max-depth tree degenerates into a list with all right children null. Call this the master tree.
3. Make a tree from the next values (say, the 10-valued nodes) and attach this tree to the deepest branch of the master tree.
4. Repeat step 3 as necessary.
Note: this is the simplest concept; an implementation may take advantage of the fact that the master tree is sorted at all times and skip the initial sort.
I'm implementing the Bentley-Ottmann algorithm to find the set of segment intersection points, but unfortunately there are some things I don't understand.
For example: how can I get the neighbours of the segment Sj in the image?
I'm using a balanced binary search tree for the sweep-line status, with the segments stored in the leaves. After reading this Wikipedia article I didn't find an explanation for this operation.
From the reference book (de Berg & al.: "Computational Geometry", ill. at p.25):
Suppose we search in T for the segment immediately to the left of some point p that lies on the sweep line.
At each internal node v we test whether p lies left or right of the segment stored at v.
Depending on the outcome we descend to the left or right subtree of v,
eventually ending up in a leaf.
Either this leaf, or the leaf immediately to the left of it, stores the segment we are searching for.
In my example, if I follow this I will arrive at the leaf Sj, but I will only know the leaf to its left, i.e. Sk. How can I get Si?
Edit
I found this discussion that looks like my problem, unfortunately there are no answers about how can I implement some operations in such data structure.
The operations are:
inserting a node in such data structure.
deleting a node.
swapping two nodes.
searching for neighbours' node.
I know how to implement these operations in a balanced binary search tree when data is also stored in the internal nodes, but with this type of AVL tree I don't know if it is the same thing.
Thank you
I stumbled upon the same problem when reading Computational Geometry from DeBerg (see p. 25 for the quote and the image). My understanding is the following:
say you need the right neighbor of a segment S which is in the tree. If you store data in the nodes, the pseudo code is:
locate node S
if S has a right subtree:
    return the left-most node of the right subtree of S
else if S is in the left sub-tree of any ancestor:
    return the lowest/nearest such ancestor
else
    return not found
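For the data-in-nodes case, the pseudo code above can be sketched in Python like this. It is only an illustration under assumptions: plain comparable keys stand in for segments, the tree is an unbalanced BST with parent pointers, and `Node`, `insert`, `find` and `right_neighbor` are hypothetical names.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None


def insert(root, key):
    """Unbalanced BST insert that maintains parent pointers; returns the root."""
    new = Node(key)
    if root is None:
        return new
    cur = root
    while True:
        if key < cur.key:
            if cur.left is None:
                cur.left, new.parent = new, cur
                return root
            cur = cur.left
        else:
            if cur.right is None:
                cur.right, new.parent = new, cur
                return root
            cur = cur.right


def find(root, key):
    """Locate the node holding `key`, or None if it is absent."""
    while root is not None and root.key != key:
        root = root.left if key < root.key else root.right
    return root


def right_neighbor(s):
    """In-order successor of node s, or None if s is the right-most node."""
    if s.right is not None:
        # Left-most node of the right subtree.
        n = s.right
        while n.left is not None:
            n = n.left
        return n
    # Otherwise climb until we arrive at a parent from its left child.
    n = s
    while n.parent is not None and n is n.parent.right:
        n = n.parent
    return n.parent


root = None
for k in [4, 2, 6, 1, 3, 5, 7]:
    root = insert(root, k)
print(right_neighbor(find(root, 3)).key)  # 4
```

The left neighbor is the mirror image: swap `left` and `right` throughout `right_neighbor`.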
If you store the data in the leaves, the pseudo-code becomes:
let p be the point of S currently on the sweep line
let n be the segment at the root of the tree
while n != null && n is not a leaf:
    if n = S:
        n = right child of S
    else:
        determine if p is on the right or left of n
        update n accordingly (normal descent)
In the end, either n is null and it means there is no right neighbor, or n points to the proper leaf.
The same logic applies for the left neighbor.
Like you, I ran into the same problem while reading de Berg et al., "Computational Geometry". But I think the C++ Standard Template Library (STL) has a container called "map" which can do the job.
You just need to define your own classes for line segments and event points, along with their comparison functions. Then use std::map to build the tree, call map.find() to get an iterator to the segment, and step that iterator backward and forward to reach the two neighboring elements.
For example, this is the tree:
10
12 -1
5 1 1 -2
2 3 10 -9
How to find the node with maximum value?
Given the problem as stated, you need to traverse the entire tree. See proof below.
Traversing the entire tree should be a fairly trivial process.
Proof that we need to traverse the entire tree:
Assume we're able to identify which side of a tree the maximum is on without traversing the entire tree.
Take any tree whose maximum node is on the left, and call this maximum x.
Pick one of the leaf nodes on the right. Add 2 children to it: x+1 and -x-1.
Since x+1-x-1 = 0, adding these won't change the sum at the leaf we added it to, thus nor the sums at any other nodes in the tree.
Since this can be added to any leaf in the tree, and it doesn't affect the sums, we'd need to traverse the entire tree to find out if this occurs anywhere.
Thus our assumption that we can identify which side of a tree the maximum is on without traversing the entire tree is incorrect.
Thus we need to traverse the entire tree.
In the general case, you need to traverse the entire tree. Unless the values in the tree are constrained (for example, to be all non-negative, which is not the case in your example since it contains negative values), the value in a node tells you nothing about the individual values below it.
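A full traversal is short to write. The sketch below assumes a simple node type with `value`, `left` and `right` fields (hypothetical names), and the tree built at the end is only one possible arrangement of some of the values from the question, since the exact shape is not clear from the listing.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right


def max_value(root):
    """Visit every node once with an explicit stack and return the largest value."""
    if root is None:
        raise ValueError("empty tree has no maximum")
    best = root.value
    stack = [root]
    while stack:
        node = stack.pop()
        best = max(best, node.value)
        # Push whichever children exist; traversal order does not matter here.
        stack.extend(c for c in (node.left, node.right) if c is not None)
    return best


tree = Node(10,
            Node(12, Node(5), Node(1)),
            Node(-1, Node(1), Node(-2)))
print(max_value(tree))  # 12
```

This runs in O(n) time, which matches the lower bound argued above.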
How to find a loop in a binary tree? I am looking for a solution other than marking the visited nodes as visited or doing address hashing. Any ideas?
Suppose you have a binary tree but you don't trust it and think it might be a graph. In the general case you will have to remember the visited nodes. It is more or less the same algorithm as constructing a minimum spanning tree from a graph, which means space and time complexity will be an issue.
Another approach would be to consider the data you save in the tree. Suppose you store numbers or hashes, so that you can compare them.
Pseudocode would test for these conditions:
1. Every node must have at most 2 children and 1 parent (at most 3 connections). More than 3 connections => not a binary tree.
2. The parent must not be a child.
3. If a node has two children, the left child has a smaller value than the parent and the right child has a bigger value. Given this, if a leaf or inner node has as a child some node from a higher level (such as its parent's parent), you can detect the loop from the values. If a child is a right node, its value must be bigger than its parent's; but if that child forms a loop, it comes from either the left part or the right part of the parent.
3.a. If it comes from the left part, its value is smaller than its sibling's, so => not a binary tree. The idea is much the same for the other part.
Testing aside, in what form is the tree that you want to test? Remember that every node has a pointer to its parent, and this pointer points to a single parent. So depending on the format your tree is in, you can take advantage of this.
As mentioned already: A tree does not (by definition) contain cycles (loops).
To test whether your directed graph contains cycles (references to nodes already added to the tree), you can iterate through the tree, add each node to a visited list (or its hash, if you prefer), and check for each new node whether it is already in the list.
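The visited-list check can be sketched as below, assuming nodes with `left` and `right` fields and using Python's built-in `id()` as the "hash" of a node. `Node` and `is_proper_tree` are hypothetical names. Note that this flags any node reachable by two different routes, so it rejects shared subtrees as well as true cycles.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right


def is_proper_tree(root):
    """Return False if any node is reached twice while walking child links."""
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node is None:
            continue
        if id(node) in seen:
            return False  # second route to the same node: cycle or shared child
        seen.add(id(node))
        stack.append(node.left)
        stack.append(node.right)
    return True


a, b, c = Node(1), Node(2), Node(3)
a.left, a.right = b, c
print(is_proper_tree(a))  # True
c.right = a               # introduce a loop back to the root
print(is_proper_tree(a))  # False
```

The cost is O(n) time and O(n) extra space for the `seen` set, which is the space overhead the question hoped to avoid.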
Plenty of algorithms for cycle-detection in graphs are just a google-search away.
Problem: I have a binary tree, all leaves are numbered (from left to right, starting from 0) and no connection exists between them.
I want an algorithm that, given two indices (of 2 distinct leaves), visits the tree starting from the greater leaf (the one with the higher index) and gets to the lower one.
The internal nodes of the tree do not contain any useful information.
I should choose the path based only on the leaves' indices. The path starts from a leaf and terminates on a leaf, and of course I can access a leaf if I know its index (through an array of pointers).
The tree is static, no insertion or deletion of nodes is allowed.
I have developed an algorithm to do it but it really sucks... any ideas?
One option would be to find the least common ancestor of the two nodes, along with the sequence of nodes you should take from each node to get to that ancestor. Here's a sketch of the algorithm:
Starting from each node, walk back up to that node's parent until you reach the root. Count the number of nodes on the path from each node to the root. Let the height of the first node be h1 and the height of the second node be h2.
Let h = min(h1, h2). This is the height of the higher of the two nodes.
Starting from each node, keep following the node's parent pointer until both nodes are at height h. Record the nodes you followed during this step. At this point, both nodes are at the same height.
Until you find a common node, keep marching upwards from each node to its parent. Eventually you will hit their common ancestor. At this point, follow the path from the first node up to this ancestor, then down the path from the ancestor down to the second node.
In the worst case, this takes O(h) time and O(h) space, where h is the height of the tree. For a balanced binary tree this is O(lg n) time and space, which is quite good.
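The steps above can be sketched in Python roughly as follows. The sketch assumes every node has a `parent` pointer (None at the root) and that both nodes belong to the same tree; `Node`, `depth` and `path_between` are hypothetical names.

```python
class Node:
    def __init__(self, name, parent=None):
        self.name, self.parent = name, parent


def depth(node):
    """Number of parent links between node and the root."""
    d = 0
    while node.parent is not None:
        node, d = node.parent, d + 1
    return d


def path_between(a, b):
    """Nodes visited going from a up to the common ancestor, then down to b."""
    up_a, up_b = [a], [b]
    da, db = depth(a), depth(b)
    # Lift the deeper node until both are at the same depth.
    while da > db:
        a, da = a.parent, da - 1
        up_a.append(a)
    while db > da:
        b, db = b.parent, db - 1
        up_b.append(b)
    # March upward in lockstep until the two walks meet.
    while a is not b:
        a, b = a.parent, b.parent
        up_a.append(a)
        up_b.append(b)
    # The common ancestor ends both lists; keep it once, then reverse b's walk.
    return up_a + up_b[-2::-1]


root = Node("root")
l, r = Node("l", root), Node("r", root)
l1, l2, r1 = Node("l1", l), Node("l2", l), Node("r1", r)
print([n.name for n in path_between(r1, l1)])  # ['r1', 'r', 'root', 'l', 'l1']
```

For the leaf-index version of the problem, `a` and `b` would come from the array of leaf pointers, with the higher-indexed leaf passed first.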
If you're interested in a Much More Hardcore version of this algorithm, consider looking into Tarjan's Least Common Ancestors algorithm, which with linear preprocessing time, can be used to find the least common ancestor much more rapidly than this.
Hope this helps!
Distance between any two nodes can be calculated with the help of lowest common ancestor:
Dist(n1, n2) = Dist(root, n1) + Dist(root, n2) - 2*Dist(root, lca)
where lca is lowest common ancestor.
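As a small illustration of the formula, here is a sketch that assumes nodes with a `parent` pointer and measures Dist in edges. The names `Node`, `depth`, `lca` and `dist` are hypothetical, and the simple climb-to-equal-depth method stands in for whichever LCA algorithm you prefer.

```python
class Node:
    def __init__(self, name, parent=None):
        self.name, self.parent = name, parent


def depth(node):
    """Dist(root, node): number of edges from the root down to node."""
    d = 0
    while node.parent is not None:
        node, d = node.parent, d + 1
    return d


def lca(a, b):
    """Lowest common ancestor found by equalising depths, then climbing together."""
    da, db = depth(a), depth(b)
    while da > db:
        a, da = a.parent, da - 1
    while db > da:
        b, db = b.parent, db - 1
    while a is not b:
        a, b = a.parent, b.parent
    return a


def dist(a, b):
    # Dist(n1, n2) = Dist(root, n1) + Dist(root, n2) - 2 * Dist(root, lca)
    return depth(a) + depth(b) - 2 * depth(lca(a, b))


root = Node("root")
l, r = Node("l", root), Node("r", root)
l1, l2, r1 = Node("l1", l), Node("l2", l), Node("r1", r)
print(dist(l1, r1))  # 4
print(dist(l1, l2))  # 2
```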
see this for more help about this algorithm and see this video for learning how to calculate lca.