Is it possible to use Breadth first search logic to do a topological sort of a DAG?
The solution in Cormen makes use of Depth first search but wouldn't be easier to use BFS?
Reason:
BFS visits all the nodes in a particular depth before visiting nodes with the next depth value. It naturally means that the parents will be listed before the children if we do a BFS. Isn't this exactly what we need for a topological sort?
A mere BFS is only sufficient for a tree (or forest of trees), because in (forest of) trees, in-degrees are at most 1.
Now, look at this case:
B → C → D
↗
A
A BFS where queue is initialized to A B (whose in-degrees are zero) will return A B D C, which is not topologically sorted. That's why you have to maintain in-degrees count, and only pick nodes whose count has dropped to zero. (*)
BTW, this is the flaw of your 'reason' : BFS only guarantee one parent has been visited before, not all of them.
Edit: (*) In other words you push back adjacent nodes whose in-degree is zero (in the exemple, after processing A, D would be skipped). So, you're still using a queue and you've just added a filtering step to the general algorithm. That being said, continuing to call it a BFS is questionable.
It is possible, even wikipedia describes an algorithm based on BFS.
Basically, you use a queue in which you insert all nodes with no incoming edges. Then, when you extract a node, you remove all of its outgoing edges and insert the nodes reachable from it that have no other incoming edges.
In a BFS all of the edges you actually walk will end up in the correct direction. But all the edges you don't walk (those between nodes at the same depth, or those from deeper nodes back up to earlier nodes) will end up going the wrong way if you lay out the graph in BFS order.
Yes, you really need DFS to do it.
Yes, you can do topological sorting using BFS. Actually I remembered once my teacher told me that if the problem can be solved by BFS, never choose to solve it by DFS. Because the logic for BFS is simpler than DFS, most of the time you will always want a straightforward solution to a problem.
As YvesgereY and IVlad has mentioned, you need to start with nodes of which the indegree is 0, meaning no other nodes direct to them. Be sure to add these nodes to your result first.You can use a HashMap to map every node with its indegree, and a queue which is very commonly seen in BFS to assist your traversal. When you poll a node from the queue, the indegree of its neighbors need to be decreased by 1, this is like delete the node from the graph and delete the edge between the node and its neighbors. Every time you come across nodes with 0 indegree, offer them to the queue for checking their neighbors later and add them to the result.
public ArrayList<DirectedGraphNode> topSort(ArrayList<DirectedGraphNode> graph) {
ArrayList<DirectedGraphNode> result = new ArrayList<>();
if (graph == null || graph.size() == 0) {
return result;
}
Map<DirectedGraphNode, Integer> indegree = new HashMap<DirectedGraphNode, Integer>();
Queue<DirectedGraphNode> queue = new LinkedList<DirectedGraphNode>();
//mapping node to its indegree to the HashMap, however these nodes
//have to be directed to by one other node, nodes whose indegree == 0
//would not be mapped.
for (DirectedGraphNode DAGNode : graph){
for (DirectedGraphNode nei : DAGNode.neighbors){
if(indegree.containsKey(nei)){
indegree.put(nei, indegree.get(nei) + 1);
} else {
indegree.put(nei, 1);
}
}
}
//find all nodes with indegree == 0. They should be at starting positon in the result
for (DirectedGraphNode GraphNode : graph) {
if (!indegree.containsKey(GraphNode)){
queue.offer(GraphNode);
result.add(GraphNode);
}
}
//everytime we poll out a node from the queue, it means we delete it from the
//graph, we will minus its neighbors indegree by one, this is the same meaning
//as we delete the edge from the node to its neighbors.
while (!queue.isEmpty()) {
DirectedGraphNode temp = queue.poll();
for (DirectedGraphNode neighbor : temp.neighbors){
indegree.put(neighbor, indegree.get(neighbor) - 1);
if (indegree.get(neighbor) == 0){
result.add(neighbor);
queue.offer(neighbor);
}
}
}
return result;
}
Related
Consider a weighted directed graph, including V vertices and E edges. I am looking for an algorithm that finds the shortest cycle that passes through only S certain node (must pass through all nodes in S), not the other nodes. The cycle starts and ends from node w in set S.
Is it possible to delete the nodes in the set of V - S and also delete their corresponding connected edges, and then apply an algorithm (for finding the shortest cycle) to this graph, including only S nodes and their corresponding edges?
I emphasize that we only consider the nodes in set S, not the other nodes.
I am not sure if the below link is relevant to my question. The link asks for the shortest cycle that must pass through the blue nodes, but the cycle may pass through the black ones (I am not sure about this).
Finding shortest circuit in a graph that visits X nodes at least once
Yes, the way your problem is stated, the consider approach is correct.
A graph where you remove all vertices that don't belong to a set S is called an induced subgraph. Every path/cycle in the original graph that only uses vertices from S can be found in the induced subgraph, too. Therefore, finding the shortest cycle in the induced subgraph is equivalent to finding the cycle in the original graph.
If your problem requires to find the shortest cycle that uses all nodes in S, then you're solving the travelling salesman problem, which is known to be NP-hard, which means there is no known (and likely no existing) polynomial algorithm. That said, it is a well studied problem, you can choose from both exact algorithms (if the set is small enough) and heuristics/approximations for larger scale.
The first step is to detect the cycles that are present in your graph, if any
This can be done by modifying a depth first search ( DFS ) as follows:
- run DFS
- IF a node is reached for the second time
- IF path exists from node reached again to current DFS node
- the path is a cycle
Now you can filter the cycles detected for your criteria ( visit nodes in S, shortest, etc )
Here is the C++ code for a DFS that detects and records cycles
std::vector<std::vector<vertex_t>>
cGraph::dfs_cycle_finder(const std::string &start)
{
std::vector<std::vector<vertex_t>> ret;
// track visited vertices
std::vector<bool> visited(vVertex.size(), false);
// vertices waiting to be processed
std::stack<vertex_t> wait;
// start at the beginning
wait.push(vVertex[index(start)]);
// continue until no more vertices need processing
while (!wait.empty())
{
vertex_t v = wait.top();
wait.pop();
int vi = index(v);
if (!visited[vi])
{
visited[vi] = true;
for (vertex_t w : adjacentOut(v))
{
if (!visited[index(w)])
{
wait.push(w);
}
else
{
// previously visited node, check for ancestor
auto cycle = path( w, v );
if( cycle.size() > 0 ) {
// found a cycle
cycle.push_back( w );
ret.push_back(cycle);
}
}
}
}
}
return ret;
}
The complete application for this is at https://github.com/JamesBremner/graphCycler
Example output:
node a linked to b
node b linked to c
node c linked to d
node d linked to a
cycle: a b c d a
Can Breadth first Search be used for finding topological sorting of vertices and strongly connected components in a graph?
If yes how to do that? and If not why not?
we generally use Depth first search in these problems but What will be the problem if I try to implement using BFS?
Will code like this work?
def top_bfs(start_node):
queue = [start_node]
stack = []
while not queue.empty():
node = queue.dequeue()
if not node.visited:
node.visited = True
stack.push(node)
for c in node.children:
queue.enqueue(c)
stack.reverse()
return stack
Yes, you can do topological sorting using BFS. Actually I remembered once my teacher told me that if the problem can be solved by BFS, never choose to solve it by DFS. Because the logic for BFS is simpler than DFS, most of the time you will always want a straightforward solution to a problem.
You need to start with nodes of which the indegree is 0, meaning no other nodes direct to them. Be sure to add these nodes to your result first.You can use a HashMap to map every node with its indegree, and a queue which is very commonly seen in BFS to assist your traversal. When you poll a node from the queue, the indegree of its neighbors need to be decreased by 1, this is like delete the node from the graph and delete the edge between the node and its neighbors. Every time you come across nodes with 0 indegree, offer them to the queue for checking their neighbors later and add them to the result.
public ArrayList<DirectedGraphNode> topSort(ArrayList<DirectedGraphNode> graph) {
ArrayList<DirectedGraphNode> result = new ArrayList<>();
if (graph == null || graph.size() == 0) {
return result;
}
Map<DirectedGraphNode, Integer> indegree = new HashMap<DirectedGraphNode, Integer>();
Queue<DirectedGraphNode> queue = new LinkedList<DirectedGraphNode>();
//mapping node to its indegree to the HashMap, however these nodes
//have to be directed to by one other node, nodes whose indegree == 0
//would not be mapped.
for (DirectedGraphNode DAGNode : graph){
for (DirectedGraphNode nei : DAGNode.neighbors){
if(indegree.containsKey(nei)){
indegree.put(nei, indegree.get(nei) + 1);
} else {
indegree.put(nei, 1);
}
}
}
//find all nodes with indegree == 0. They should be at starting positon in the result
for (DirectedGraphNode GraphNode : graph) {
if (!indegree.containsKey(GraphNode)){
queue.offer(GraphNode);
result.add(GraphNode);
}
}
//everytime we poll out a node from the queue, it means we delete it from the
//graph, we will minus its neighbors indegree by one, this is the same meaning
//as we delete the edge from the node to its neighbors.
while (!queue.isEmpty()) {
DirectedGraphNode temp = queue.poll();
for (DirectedGraphNode neighbor : temp.neighbors){
indegree.put(neighbor, indegree.get(neighbor) - 1);
if (indegree.get(neighbor) == 0){
result.add(neighbor);
queue.offer(neighbor);
}
}
}
return result;
}
The fact that they have similar names doesn't make them similar methods.
DFS is typically implemented with LIFO (a stack if you will) - last in first out.
BFS typically implemented with FIFO (a queue if you will) - first in first out.
You can walk a graph in any way you want, and eventually come out with a topological order of its nodes. But if you want to do it efficiently, then DFS is the best option, as the topological order of the nodes essentially reflects their depth in the graph (well, "dependency-depth" to be more accurate).
So generally the code for topologically sorting using DFS (depth first search) is much more straight forward, you run it and it backtracks since its recursive assigning numbers as it calls back to previous stack frames. BFS is less straight forward but still easy to understand.
First, you must calculate the in-degree of all the vertices on the graph, this is because you must start at a vertex that has an in-degree of 0.
int[] indegree = int[adjList.length];
for(int i = 0; i < adjList.length; i++){
for(Edge e = adjList[i]; e != null; e = e.next){
indegree[e.vertexNum]++;
}
}
So the code above iterates through the vertex array, then it iterates through a single vertex's edges(in this case its stored using linked list), then it increments the vertex that the edge is pointing to in the indegree array. So at the end of the outer loop you will have traversed each vertex's neighbors and calculated each vertex's in-degree.
Second, you now must use BFS to actually topologically sort this graph. So this first snippet of code will only enqueue the vertices in the graph that have an in-degree of 0.
Queue<Integer> q = new Queue<>();
for(int i = 0; i < indegree.length; i++){
if(indegree[i] == 0){
q.enqueue(i);
}
}
Now, after enqueueing only vertices with in-degree of 0, you start the loop to assign topological numbers.
while(!q.isEmpty()){
int vertex = q.dequeue();
System.out.print(vertex);
for(Edge e = adjList[vertex]; e != null; e = e.next){
if(--indegree[e.vnum] == 0){
q.enqueue(e.vnum);
}
}
So the print statement prints out the vertex number that corresponds to the vertex. So depending on the requirements of your program, you can change the code where the print statement is to something that stores the vertex numbers or the names or something along those lines. Other than that, I hope this helped answer the first question.
Second Question
As for the second part of the question, it's pretty simple.
1.Create boolean array filled with false values, this will represent if the vertices have been visited or not.
2.Create for loop iterating over the adjList array, inside this loop you will call bfs, after calling bfs you will iterate over the boolean array you created, checking if any value is false, if it is then the graph is not strongly connected and you can return "graph is not strongly connected" and end the program. At the end of each iteration of the outer for-loop (but after the inner for-loop) don't forget to reset your boolean array to all false values again.
3.At this point the outer for loop is done and you can return true, it is your job to implement to bfs it should take in an integer and the visited boolean array you created as parameters.
I had a test today (Data Structures course), and one of the questions was the following:
Given an undirected, non-weighted graph G=(V,E), you need to write an algorithm that for a given node s, returns the shortest path from s to all the nodes v' in the complement graph.
A Complement Graph G'=(E',V') contains an edge between any to nodes in G that don't share an edge, and only those.
The algorithm needs to run in O(V+E) (of the original graph).
I asked 50 different students, and not even one of them solved it correctly.
any Ideas?
Thanks a lot,
Barak.
The course staff have published the official answers to the test.
The answer is:
"The algorithm is based on a BFS with a few adaptations.
For each node in the graph we will add 2 fields - next and prev. Using these two fields we can maintain two Doubly-Linked lists of nodes: L1,L2.
At the beginning of every iteration of the algorithm, L1 has all the while nodes in the graph, and L2 is empty.
The BFS code (without the initialization) is:
At the ending of the loop at lines 3-5, L1 contains all the white nodes that aren't adjacent to u in G, or in other words, all the white nodes that are adjacent to u in the complement graph.
Therefore the runtime of the algorithm equals to the runtime of the original BFS on the complement graph.
The time is O(V+E) because lines 4-5 are executed at most 2E times, and lines 7-9 are executed at most V times (Every node can get out of L1 only once)."
Note: this is the original solution translated from Hebrew.
I Hope you find it helpful, and thank you all for helping me out,
Barak.
I would like to propose a different approach.
Initialization:-
Create a list of undiscovered edges. Let's call it undiscovered and initialize it with all nodes.
Then we will run a modified version of BFS
Create a Queue(Q) and add start node to it
Main algo
while undiscovered.size()>0 && Queue not Empty
curr_node = DEQUEUE(Queue)
create a list of all edges in the complement graph(Lets call it
complement_edges). This can be created by looping through all the
nodes in undiscovered and checking whether it is connected to
curr_node.
Then loop through each node in complement_edges perform 3
operation
Update distance if optimal
remove this node from undiscovered
ENQUEUE(Queue, this node)
Some things to note here,
If the initial graph is sparse, then the undiscovered will become empty very fast.
During implementation, use hashing to store edges in graph, this will make step 2 fast.
Heres the sample code:-
HashSet<Integer> adjList[]; // graph stored as adjancency list
public int[] calc_distance(int start){
HashSet<Integer> undiscovered = new HashSet<>();
for(int i=1;i<=N;i++){
undiscovered.add(i);
}
int[] dist = new int[N+1];
Arrays.fill(dist, Integer.MAX_VALUE/4);
Queue<Integer> q = new LinkedList<>();
q.add(start);
dist[start] = 0;
while(!q.isEmpty() && undiscovered.size()>0){
int curr = q.poll();
LinkedList<Integer> complement_edges = new LinkedList<>();
for(int child : undiscovered){
if(!adjList[curr].contains(child)){
// curr and child is connected in complement
complement_edges.add(child);
}
}
for(int child : complement_edges){
if(dist[child]>(dist[curr]+1)){
dist[child] = dist[curr]+1;
}
// remove complement_edges from undiscovered
undiscovered.remove(child);
q.add(child);
}
}
return dist;
}
}
Just wanted to know how efficient is the below algorithm to find the lowest common ancestor of two nodes in a binary search tree.
Node getLowestCommonAncestor (Node root, Node a, Node b) {
Find the in-order traversal of Node root.
Find temp1 = the in-order successor of Node a.
Find temp2 = the in-order successor of Node b.
return min (temp1, temp2);
}
Searching for the lowest common ancestor in a binary search tree is simpler than that: observe that the LCA is the node where the searches for item A and item B diverge for the first time.
Start from the root, and search for A and B at the same time. As long as both searches take you in the same direction, continue the search. Once you arrive at the node such that searching for A and B take you to different subtrees, you know that the current node is the LCA.
A node at the bottom of a large binary search trees can have an in-order successor close to it, for instance if it is the left child of a node, its in-order successor is its parent.
Two nodes descending from different children of the root will have the root as their least common ancestor, no matter where they are, so I believe that your algorithm gets this case wrong.
This is a discussion of efficient LCA algorithms (given time to build a preparatory data structure) at http://en.wikipedia.org/wiki/Lowest_common_ancestor, with pointers to code.
An inefficient but simple way of finding the LCA is as follows: in the tree keep pointers from children to parents and a note of the depth of each node. Given two nodes, move up from the deepest one until the depth if the same. If you are pointing at the other node, it is the LCA. Otherwise move up one step from each node and check again, and so on, until you meet at the LCA.
Finding LCA of BST is straight:
Find the node for which node1 and node2 are present on different
sides. But if the node1 is an ancestor of node2 than also we will
have to return node1. Below code implements this algo.
TreeNode<K> commonAncestor(TreeNode t, K k1, K k2){
if(t==null) return null;
while(true)
{
int c1=t.k.compareTo(k1);
int c2=t.k.compareTo(k2);
if(c1*c2<=0)
{
return t;
}
else
{
if(c1<0)
{
t = t.right;
}
else
{
t = t.left;
}
}
}
}
I have written an algorithm to determine "whether an undirected graph is a tree"
Assumptions : graph G is represented as adjacency list, where we already know the number of vertices which is n
Is_graph_a_tree(G,1,n) /* using BFS */
{
-->Q={1} //is a Queue
-->An array M[1:n], such that for all i, M[i]=0 /* to mark visited vertices*/
-->M[1]=1
-->edgecount=0 // to determine the number of edges visited
-->While( (Q is not empty) and (edgecount<=n-1) )
{
-->i=dequeue(Q)
-->for each edge (i,j) and M[j] =0 and edgecount<=n-1
{
-->M[j]=1
-->Q=Q U {j}
-->edgecount++
}
}
If(edgecount != n-1)
--> print “G is not a tree”
Else
{
-->If there exists i such that M[i]==0
Print “ G is not a tree”
Else
Print “G is tree”
}
}
Is it right??
Is the time complexity of this algorithm Big0h(n)??
I think the counting of edges is not correct. You should also count edges (i,j) for witch M[j]=1 but j is not the parent of i (so you would also need to keep the parent of each node).
Maybe is better to count the edges at the end, by summing the sizes of the adjacency lists and dividing by 2.
You want to do a Depth First Search. An undirected graph has only back edges and tree edges. So you can just copy the DFS algorithm and if you find a back edge then it's not a tree.