This was asked in an exam. Can you please help me find a solution?
Design an algorithm to find the number of ancestors of a given node (of a tree or a graph) using:
O(m) memory
Unlimited memory size
Assuming no cycles in graph (in which case the ancestors make no sense) DFS-loop can be used to calculate ancestors of any node k,just mark counted nodes in each iteration and donot count visited nodes twice.
for i in graph visited[i] = false // for DFS
for i in graph counted[i] = false // for ancestors
int globalcount = 0; // count the no of ancestors
for i in graph DFS(i,k) //DFS-loop
def bool DFS(u,k) { //K is the node whos ancestors want to find
if(!visited[u]) {
visited[u] = true // prevent re-entering
totalret = false // whether there is path from u to k
for each edge(u,v) {
retval = DFS(v)
if(retval&&!counted[u]&&u!=k) { //there is path from u to k & u is not counted
globalcount++
counted[u] = true
totalret = true
}
}
if(u == k) return true
else return totalret
}
return counted[u] // if visited already and whether ancestor(k)?
}
print globalcount // total ancestor(k)
space complexity : O(V) V : no of vertices
time complexity : O(E) E : no of edges in graph
The algorithm will be dependent on the design of the tree. The simplest example is for nodes containing their parent in which case it reduces to
int ancestors = 0;
while( node = node->parent) ancestors++;
Constraints does not pose an issue in any reasonable implementation.
If the node does not contain a parent it depends on the structure of the tree.
In the most complex case for an unordered tree it entails a full tree search, counting ancestors.
For a search tree all you need is to do a search.
Related
How can I minimize the number of vertices of the directed graph by removing circuits? Is there any algorithms that can be adapted here?
There already is a question about removing the cycles in graphs, but I am particularly asking about MINIMIZING THE NUMBER OF VERTICES by removing the cycles in graphs
Supposing the solution for your problem is to simply turn cycles into a single node, sure, you can do that easily.
When you execute Breadth-First Search (BSF) or Depth-First Search (DFS), you will find cycles (i.e. if you mark the path you step into, once you can reach an already marked node, you have found a cycle). Hence, you can easily find cycles by storing the predecessor of each node you visit, that is, if you are in node u and you go to node v, you can store p[v] = u, so if you find some node w already visited in the adjacency list of v, you can walk back, parent by parent, until you find w and you have all nodes from that cycles.
I cannot guarantee any property of completeness from this algorithm, so if you can freely preprocess your graph, you can run DFSs on it until the graph is unchanged by it, otherwise run it a certain number n of times that you find is efficient.
void FindCycles(vector<Node> nodes){
int p[nodes.size()];
bool mark[nodes.size()]; //set all to false
stack<int> s;
s.push(nodes[0].id);
while(s.size()){
int u = s.pop();
mark[u] = true;
for(int v : nodes[u].adjs){
p[v] = u;
if(mark[v]) {
//found a cycle, call some method to reduce the graph
cout<<v<<" belongs to the cycle"<<endl;
while(u != v){
cout<<u<<" belongs to the cycle"<<endl;
u = p[u];
}
break;
}
else{
s.push(v);
}
}
}
}
Consider a directed graph with AND nodes and OR nodes. The AND nodes are activated only when all the in edges into it are activated. The OR nodes are activated if at least one of the in edges into it is activated. How to design an efficient algorithm to decide if all nodes can be activated? I have thought of some naive algorithm but it takes O(n^3) time. I am also assuming that vertices with no in edges into them are activated initially.I believe n^3 cannot be an efficient algorithm and there's some method that I am missing. Tagging domains where the problem might have a solution.
Maintain a set A of already activated nodes, a queue Q of nodes, and counters C of in-edges for each node.
Start by counting in-edges:
for each n in nodes {
for each n2 adjacent to n {
C[n2] += 1
}
}
Then initialize Q with the nodes with no in-edges:
for each n in nodes {
if C[n] == 0 {
add n to Q
}
}
Now repeat this process, until the queue is empty:
take q from Q
for each n adjacent to q {
if n is in A { continue }
if n is OR {
add n to A
add n to Q
} else { // n must be AND
C[n] -= 1
if C[n] is 0 {
add n to A
add n to Q
}
}
}
[This is a variant of topological sort that copes with the differences between OR and AND nodes].
When this process terminates, the set A contains all activated nodes.
The runtime is O(V+E), where V is the number of nodes in the graph, and E the number of edges.
You could preprocess the graph to compute the in-degree of each node.
Add all nodes with in-degree 0 to a stack, and prepare an array A containing the activation count for each node (initially equal to 0).
Then do the following pseudo-code
visited = set(stack)
while stack:
node = stack.pop()
for dest in node.neighbours():
A[dest] += 1
if ((Type[dest]==AND and A[dest]==indegree[dest]) or
(Type[dest]==OR and A[dest]>0)):
if node not in visited:
visited.add(node)
stack.append(dest)
This will visit each edge and each node at most once, so will have linear complexity.
When you finish the process, visited contains the set of activated nodes.
It is possible in O(n). Here is a possible algorithm.
n the total number of nodes
s the sum of the nodes that have been activated
a array to indicate if node n has been activated
c array to count the number of incoming edges for node n
Iterate through the nodes, if they have no incoming edges call your propagation function with it, e.g. propagate(i);.
If s == n all nodes have been activated.
Pseudo code for the propagate function:
function propagate(idx) {
if (a[idx]) // is node activated already
return; // return because node was already propagated
a[idx] = true; // activate
s++; // increase the number of activated nodes
for (var j = 0; j < outEdges[idx].length; j++) { // iterate through the outgoing edges
var idx2 = outEdges[idx][j]; // the node the edge is pointing to
if (isOrNode[idx2]) {
propagate(idx2);
} else { // AND node
c[idx2]++; // increase the count of incoming activated edges
if (inEdges[idx2].length == c[idx2]) // all incoming edges have been activated
propagate(idx2);
}
}
}
Can Breadth first Search be used for finding topological sorting of vertices and strongly connected components in a graph?
If yes how to do that? and If not why not?
we generally use Depth first search in these problems but What will be the problem if I try to implement using BFS?
Will code like this work?
def top_bfs(start_node):
queue = [start_node]
stack = []
while not queue.empty():
node = queue.dequeue()
if not node.visited:
node.visited = True
stack.push(node)
for c in node.children:
queue.enqueue(c)
stack.reverse()
return stack
Yes, you can do topological sorting using BFS. Actually I remembered once my teacher told me that if the problem can be solved by BFS, never choose to solve it by DFS. Because the logic for BFS is simpler than DFS, most of the time you will always want a straightforward solution to a problem.
You need to start with nodes of which the indegree is 0, meaning no other nodes direct to them. Be sure to add these nodes to your result first.You can use a HashMap to map every node with its indegree, and a queue which is very commonly seen in BFS to assist your traversal. When you poll a node from the queue, the indegree of its neighbors need to be decreased by 1, this is like delete the node from the graph and delete the edge between the node and its neighbors. Every time you come across nodes with 0 indegree, offer them to the queue for checking their neighbors later and add them to the result.
public ArrayList<DirectedGraphNode> topSort(ArrayList<DirectedGraphNode> graph) {
ArrayList<DirectedGraphNode> result = new ArrayList<>();
if (graph == null || graph.size() == 0) {
return result;
}
Map<DirectedGraphNode, Integer> indegree = new HashMap<DirectedGraphNode, Integer>();
Queue<DirectedGraphNode> queue = new LinkedList<DirectedGraphNode>();
//mapping node to its indegree to the HashMap, however these nodes
//have to be directed to by one other node, nodes whose indegree == 0
//would not be mapped.
for (DirectedGraphNode DAGNode : graph){
for (DirectedGraphNode nei : DAGNode.neighbors){
if(indegree.containsKey(nei)){
indegree.put(nei, indegree.get(nei) + 1);
} else {
indegree.put(nei, 1);
}
}
}
//find all nodes with indegree == 0. They should be at starting positon in the result
for (DirectedGraphNode GraphNode : graph) {
if (!indegree.containsKey(GraphNode)){
queue.offer(GraphNode);
result.add(GraphNode);
}
}
//everytime we poll out a node from the queue, it means we delete it from the
//graph, we will minus its neighbors indegree by one, this is the same meaning
//as we delete the edge from the node to its neighbors.
while (!queue.isEmpty()) {
DirectedGraphNode temp = queue.poll();
for (DirectedGraphNode neighbor : temp.neighbors){
indegree.put(neighbor, indegree.get(neighbor) - 1);
if (indegree.get(neighbor) == 0){
result.add(neighbor);
queue.offer(neighbor);
}
}
}
return result;
}
The fact that they have similar names doesn't make them similar methods.
DFS is typically implemented with LIFO (a stack if you will) - last in first out.
BFS typically implemented with FIFO (a queue if you will) - first in first out.
You can walk a graph in any way you want, and eventually come out with a topological order of its nodes. But if you want to do it efficiently, then DFS is the best option, as the topological order of the nodes essentially reflects their depth in the graph (well, "dependency-depth" to be more accurate).
So generally the code for topologically sorting using DFS (depth first search) is much more straight forward, you run it and it backtracks since its recursive assigning numbers as it calls back to previous stack frames. BFS is less straight forward but still easy to understand.
First, you must calculate the in-degree of all the vertices on the graph, this is because you must start at a vertex that has an in-degree of 0.
int[] indegree = int[adjList.length];
for(int i = 0; i < adjList.length; i++){
for(Edge e = adjList[i]; e != null; e = e.next){
indegree[e.vertexNum]++;
}
}
So the code above iterates through the vertex array, then it iterates through a single vertex's edges(in this case its stored using linked list), then it increments the vertex that the edge is pointing to in the indegree array. So at the end of the outer loop you will have traversed each vertex's neighbors and calculated each vertex's in-degree.
Second, you now must use BFS to actually topologically sort this graph. So this first snippet of code will only enqueue the vertices in the graph that have an in-degree of 0.
Queue<Integer> q = new Queue<>();
for(int i = 0; i < indegree.length; i++){
if(indegree[i] == 0){
q.enqueue(i);
}
}
Now, after enqueueing only vertices with in-degree of 0, you start the loop to assign topological numbers.
while(!q.isEmpty()){
int vertex = q.dequeue();
System.out.print(vertex);
for(Edge e = adjList[vertex]; e != null; e = e.next){
if(--indegree[e.vnum] == 0){
q.enqueue(e.vnum);
}
}
So the print statement prints out the vertex number that corresponds to the vertex. So depending on the requirements of your program, you can change the code where the print statement is to something that stores the vertex numbers or the names or something along those lines. Other than that, I hope this helped answer the first question.
Second Question
As for the second part of the question, it's pretty simple.
1.Create boolean array filled with false values, this will represent if the vertices have been visited or not.
2.Create for loop iterating over the adjList array, inside this loop you will call bfs, after calling bfs you will iterate over the boolean array you created, checking if any value is false, if it is then the graph is not strongly connected and you can return "graph is not strongly connected" and end the program. At the end of each iteration of the outer for-loop (but after the inner for-loop) don't forget to reset your boolean array to all false values again.
3.At this point the outer for loop is done and you can return true, it is your job to implement to bfs it should take in an integer and the visited boolean array you created as parameters.
The standard algorithm for deleting all nodes in a binary tree uses a postorder traversal over the nodes along these lines:
if (root is not null) {
recursively delete left subtree
recursively delete right subtree
delete root
}
This algorithm uses O(h) auxiliary storage space, where h is the height of the tree, because of the space required to store the stack frames during the recursive calls. However, it runs in time O(n), because every node is visited exactly once.
Is there an algorithm to delete all the nodes in a binary tree using only O(1) auxiliary storage space without sacrificing runtime?
It is indeed possible to delete all the nodes in a binary tree using O(n) and O(1) auxiliary storage space by using an algorithm based on tree rotations.
Given a binary tree with the following shape:
u
/ \
v C
/ \
A B
A right rotation of this tree pulls the node v above the node u and results in the following tree:
v
/ \
A u
/ \
B C
Note that a tree rotation can be done in O(1) time and space by simply changing the root of the tree to be v, setting u's left child to be v's former right child, then setting v's right child to be u.
Tree rotations are useful in this context because a right rotation will always decrease the height of the left subtree of the tree by one. This is useful because of a clever observation: it is extremely easy to delete the root of the tree if it has no left subchild. In particular, if the tree is shaped like this:
v
\
A
Then we can delete all the nodes in the tree by deleting the node v, then deleting all the nodes in its subtree A. This leads to a very simple algorithm for deleting all the nodes in the tree:
while (root is not null) {
if (root has a left child) {
perform a right rotation
} else {
delete the root, and make the root's right child the new root.
}
}
This algorithm clearly uses only O(1) storage space, because it needs at most a constant number of pointers to do a rotation or to change the root and the space for these pointers can be reused across all iterations of the loop.
Moreover, it can be shown that this algorithm also runs in O(n) time. Intuitively, it's possible to see this by looking at how many times a given edge can be rotated. First, notice that whenever a right rotation is performed, an edge that goes from a node to its left child is converted into a right edge that runs from the former child back to its parent. Next, notice that once we perform a rotation that moves node u to be the right child of node v, we will never touch node u again until we have deleted node v and all of v's left subtree. As a result, we can bound the number of total rotations that will ever be done by noting that every edge in the tree will be rotated with its parent at most once. Consequently, there are at most O(n) rotations done, each of which takes O(1) time, and exactly n deletions done. This means that the algorithm runs in time O(n) and uses only O(1) space.
In case it helps, I have a C++ implementation of this algorithm, along with a much more in-depth analysis of the algorithm's behavior. It also includes formal proofs of correctness for all of the steps of the algorithm.
Hope this helps!
Let me start with a serious joke: If you set the root of a BST to null, you effectively delete all the nodes in the tree (the garbage collector will make the space available). While the wording is Java specific, the idea holds for other programming languages. I mention this just in case you were at a job interview or taking an exam.
Otherwise, all you have to do is use a modified version of the DSW algorithm. Basically turn the tree into a backbone and then delete as you would a linked list. Space O(1) and time O(n). You should find talks of DSW in any textbook or online.
Basically DSW is used to balance a BST. But for your case, once you get the backbone, instead of balancing, you delete like you would a linked list.
Algorithm 1, O(n) time and O(1) space:
Delete node immediately unless it has both children. Otherwise get to the leftmost node reversing 'left' links to ensure all nodes are reachable - the leftmost node becomes new root:
void delete_tree(Node *node) {
Node *p, *left, *right;
for (p = node; p; ) {
left = p->left;
right = p->right;
if (left && right) {
Node *prev_p = nullptr;
do {
p->left = prev_p;
prev_p = p;
p = left;
} while ((left = p->left) != nullptr);
p->left = p->right;
p->right = prev_p; //need it on the right to avoid loop
} else {
delete p;
p = (left) ? left : right;
}
}
}
Algorithm 2, O(n) time and O(1) space: Traverse nodes depth-first, replacing child links with links to parent. Each node is deleted on the way up:
void delete_tree(Node *node) {
Node *p, *left, *right;
Node *upper = nullptr;
for (p = node; p; ) {
left = p->left;
right = p->right;
if (left && left != upper) {
p->left = upper;
upper = p;
p = left;
} else if (right && right != upper) {
p->right = upper;
upper = p;
p = right;
} else {
delete p;
p = upper;
if (p)
upper = (p->left) ? p->left : p->right;
}
}
}
I'm surprised by all the answers above that require complicated operations.
Removing nodes from a BST with O(1) additional storage is possible by simply replacing all recursive calls with a loop that searches for the node and also keeps track the current node's parent. Using recursion is only simpler because the recursive calls automatically store all ancestors of the searched node in a stack. However, it's not necessary to store all ancestors. It's only necessary to store the searched node and its parent, so the searched node can be unlinked. Storing all ancestors is simply a waste of space.
Solution in Python 3 is below. Don't be thrown off by the seemingly recursive call to delete --- the maximum recursion depth here is 2 since the second call to delete is guaranteed to result in the delete base case (root node containing the searched value).
class Tree(object):
def __init__(self, x):
self.value = x
self.left = None
self.right = None
def remove_rightmost(parent, parent_left, child):
while child.right is not None:
parent = child
parent_left = False
child = child.right
if parent_left:
parent.left = child.left
else:
parent.right = child.left
return child.value
def delete(t, q):
if t is None:
return None
if t.value == q:
if t.left is None:
return t.right
else:
rightmost_value = remove_rightmost(t, True, t.left)
t.value = rightmost_value
return t
rv = t
while t is not None and t.value != q:
parent = t
if q < t.value:
t = t.left
parent_left = True
else:
t = t.right
parent_left = False
if t is None:
return rv
if parent_left:
parent.left = delete(t, q)
else:
parent.right = delete(t, q)
return rv
def deleteFromBST(t, queries):
for q in queries:
t = delete(t, q)
return t
UPDATE
I worked out an algorithm that I think runs in O(n*k) running time. Below is the pseudo-code:
routine heaviestKPath( T, k )
// create 2D matrix with n rows and k columns with each element = -∞
// we make it size k+1 because the 0th column must be all 0s for a later
// function to work properly and simplicity in our algorithm
matrix = new array[ T.getVertexCount() ][ k + 1 ] (-∞);
// set all elements in the first column of this matrix = 0
matrix[ n ][ 0 ] = 0;
// fill our matrix by traversing the tree
traverseToFillMatrix( T.root, k );
// consider a path that would arc over a node
globalMaxWeight = -∞;
findArcs( T.root, k );
return globalMaxWeight
end routine
// node = the current node; k = the path length; node.lc = node’s left child;
// node.rc = node’s right child; node.idx = node’s index (row) in the matrix;
// node.lc.wt/node.rc.wt = weight of the edge to left/right child;
routine traverseToFillMatrix( node, k )
if (node == null) return;
traverseToFillMatrix(node.lc, k ); // recurse left
traverseToFillMatrix(node.rc, k ); // recurse right
// in the case that a left/right child doesn’t exist, or both,
// let’s assume the code is smart enough to handle these cases
matrix[ node.idx ][ 1 ] = max( node.lc.wt, node.rc.wt );
for i = 2 to k {
// max returns the heavier of the 2 paths
matrix[node.idx][i] = max( matrix[node.lc.idx][i-1] + node.lc.wt,
matrix[node.rc.idx][i-1] + node.rc.wt);
}
end routine
// node = the current node, k = the path length
routine findArcs( node, k )
if (node == null) return;
nodeMax = matrix[node.idx][k];
longPath = path[node.idx][k];
i = 1;
j = k-1;
while ( i+j == k AND i < k ) {
left = node.lc.wt + matrix[node.lc.idx][i-1];
right = node.rc.wt + matrix[node.rc.idx][j-1];
if ( left + right > nodeMax ) {
nodeMax = left + right;
}
i++; j--;
}
// if this node’s max weight is larger than the global max weight, update
if ( globalMaxWeight < nodeMax ) {
globalMaxWeight = nodeMax;
}
findArcs( node.lc, k ); // recurse left
findArcs( node.rc, k ); // recurse right
end routine
Let me know what you think. Feedback is welcome.
I think have come up with two naive algorithms that find the heaviest length-constrained path in a weighted Binary Tree. Firstly, the description of the algorithm is as follows: given an n-vertex Binary Tree with weighted edges and some value k, find the heaviest path of length k.
For both algorithms, I'll need a reference to all vertices so I'll just do a simple traversal of the Tree to have a reference to all vertices, with each vertex having a reference to its left, right, and parent nodes in the tree.
Algorithm 1
For this algorithm, I'm basically planning on running DFS from each node in the Tree, with consideration to the fixed path length. In addition, since the path I'm looking for has the potential of going from left subtree to root to right subtree, I will have to consider 3 choices at each node. But this will result in a O(n*3^k) algorithm and I don't like that.
Algorithm 2
I'm essentially thinking about using a modified version of Dijkstra's Algorithm in order to consider a fixed path length. Since I'm looking for heaviest and Dijkstra's Algorithm finds the lightest, I'm planning on negating all edge weights before starting the traversal. Actually... this doesn't make sense since I'd have to run Dijkstra's on each node and that doesn't seem very efficient much better than the above algorithm.
So I guess my main questions are several. Firstly, do the algorithms I've described above solve the problem at hand? I'm not totally certain the Dijkstra's version will work as Dijkstra's is meant for positive edge values.
Now, I am sure there exist more clever/efficient algorithms for this... what is a better algorithm? I've read about "Using spine decompositions to efficiently solve the length-constrained heaviest path problem for trees" but that is really complicated and I don't understand it at all. Are there other algorithms that tackle this problem, maybe not as efficiently as spine decomposition but easier to understand?
You could use a DFS downwards from each node that stops after k edges to search for paths, but notice that this will do 2^k work at each node for a total of O(n*2^k) work, since the number of paths doubles at each level you go down from the starting node.
As DasBoot says in a comment, there is no advantage to using Dijkstra's algorithm here since it's cleverness amounts to choosing the shortest (or longest) way to get between 2 points when multiple routes are possible. With a tree there is always exactly 1 way.
I have a dynamic programming algorithm in mind that will require O(nk) time. Here are some hints:
If you choose some leaf vertex to be the root r and direct all other vertices downwards, away from the root, notice that every path in this directed tree has a highest node -- that is, a unique node that is nearest to r.
You can calculate the heaviest length-k path overall by going through each node v and calculating the heaviest length-k path whose highest node is v, finally taking the maximum over all nodes.
A length-k path whose highest node is v must have a length-i path descending towards one child and a length-(k-i) path descending towards the other.
That should be enough to get you thinking in the right direction; let me know if you need further help.
Here's my solution. Feedback is welcome.
Lets treat the binary tree as a directed graph, with edges going from parent to children. Lets define two concepts for each vertex v:
a) an arc: which is a directed path, that is, it starts from vertex v, and all vertices in the path are children of the starting vertex v.
b) a child-path: which is a directed or non-directed path containing v, that is, it could start anywhere, end anywhere, and go from child of v to v, and then, say to its other child. The set of arcs is a subset of the set of child-paths.
We also define a function HeaviestArc(v,j), which gives, for a vertex j, the heaviest arc, on the left or right side, of length j, starting at v. We also define LeftHeaviest(v,j), and RightHeaviest(v,j) as the heaviest left and right arcs of length j respectively.
Given this, we can define the following recurrences for each vertex v, based on its children:
LeftHeaviest(v,j) = weight(LeftEdge(v)) + HeaviestArc(LeftChild(v)),j-1);
RightHeaviest(v,j) = weight(RightEdge(v)) + HeaviestArc(RightChild(v)),j-1);
HeaviestArc(v,j) = max(LeftHeaviest(v,j),RightHeaviest(v,j));
Here j here goes from 1 to k, and HeaviestArc(v,0)=LeftHeaviest(v,0),RightHeaviest(v,0)=0 for all. For leaf nodes, HeaviestArc(v,0) = 0, and HeaviestArc(v,j)=-inf for all other j (I need to think about corner cases more thoroughly).
And then HeaviestChildPath(v), the heaviest child-path containing v, can be calculated as:
HeaviestChildPath(v) = max{ for j = 0 to k LeftHeaviest(j) + RightHeaviest(k-j)}
The heaviest path should be the heaviest of all child paths.
The estimated runtime of the algorithm should be order O(kn).
def traverse(node, running_weight, level):
if level == 0:
if max_weight < running_weight:
max_weight = running_weight
return
traverse(node->left,running_weight+node.weight,level-1)
traverse(node->right,running_weight+node.weight,level-1)
traverse(node->parent,running_weight+node.weight,level-1)
max_weight = 0
for node in tree:
traverse(node,0,N)