Using BFS for topological sort - algorithm

Can breadth-first search be used to find a topological ordering of the vertices, and the strongly connected components, of a graph?
If yes, how? If not, why not?
We generally use depth-first search for these problems, but what would go wrong if I tried to implement them with BFS?
Will code like this work?
from collections import deque

def top_bfs(start_node):
    queue = deque([start_node])
    stack = []
    while queue:
        node = queue.popleft()
        if not node.visited:
            node.visited = True
            stack.append(node)
            for c in node.children:
                queue.append(c)
    stack.reverse()
    return stack

Yes, you can do topological sorting using BFS. A teacher of mine once advised that if a problem can be solved with BFS, prefer it over DFS: the logic of BFS is usually simpler, and most of the time you want the more straightforward solution.
Start with the nodes whose indegree is 0, meaning no other node points to them, and add these to the result first. You can use a HashMap to map every node to its indegree, plus the queue that BFS normally uses to drive the traversal. When you poll a node from the queue, decrease the indegree of each of its neighbors by 1; this is equivalent to deleting the node from the graph along with its outgoing edges. Every time a neighbor's indegree drops to 0, add it to the result and offer it to the queue so that its own neighbors are processed later.
// Needs java.util.*; DirectedGraphNode is expected to expose a `neighbors`
// collection holding the nodes it points to.
public ArrayList<DirectedGraphNode> topSort(ArrayList<DirectedGraphNode> graph) {
    ArrayList<DirectedGraphNode> result = new ArrayList<>();
    if (graph == null || graph.size() == 0) {
        return result;
    }
    Map<DirectedGraphNode, Integer> indegree = new HashMap<DirectedGraphNode, Integer>();
    Queue<DirectedGraphNode> queue = new LinkedList<DirectedGraphNode>();
    // Map each node to its indegree. Only nodes with at least one incoming
    // edge get an entry; nodes whose indegree == 0 are never mapped.
    for (DirectedGraphNode DAGNode : graph) {
        for (DirectedGraphNode nei : DAGNode.neighbors) {
            if (indegree.containsKey(nei)) {
                indegree.put(nei, indegree.get(nei) + 1);
            } else {
                indegree.put(nei, 1);
            }
        }
    }
    // Find all nodes with indegree == 0. They go at the start of the result.
    for (DirectedGraphNode GraphNode : graph) {
        if (!indegree.containsKey(GraphNode)) {
            queue.offer(GraphNode);
            result.add(GraphNode);
        }
    }
    // Polling a node from the queue means deleting it from the graph.
    // Decreasing each neighbor's indegree by one is the same as deleting
    // the edge from the node to that neighbor.
    while (!queue.isEmpty()) {
        DirectedGraphNode temp = queue.poll();
        for (DirectedGraphNode neighbor : temp.neighbors) {
            indegree.put(neighbor, indegree.get(neighbor) - 1);
            if (indegree.get(neighbor) == 0) {
                result.add(neighbor);
                queue.offer(neighbor);
            }
        }
    }
    return result;
}
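The Java answer above does not say what happens when the input contains a cycle. A minimal Python sketch of the same in-degree idea, with that check added, might look like the following. The function name and the dict-of-lists graph representation are assumptions made for illustration, not part of the original answer.

```python
from collections import deque

def kahn_topological_sort(graph):
    """graph: dict mapping each node to a list of its successors (assumed format)."""
    # Count incoming edges, including nodes that only appear as successors.
    indegree = {node: 0 for node in graph}
    for successors in graph.values():
        for nxt in successors:
            indegree[nxt] = indegree.get(nxt, 0) + 1

    # Start from every node that nothing points to.
    queue = deque(node for node, deg in indegree.items() if deg == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in graph.get(node, []):
            indegree[nxt] -= 1          # "delete" the edge node -> nxt
            if indegree[nxt] == 0:      # all predecessors already emitted
                queue.append(nxt)

    # Nodes on a cycle never reach in-degree 0, so they are missing from order.
    if len(order) != len(indegree):
        raise ValueError("graph contains a cycle; no topological order exists")
    return order
```

If the returned list is shorter than the number of nodes, some nodes never reached in-degree 0, which can only happen when they sit on or behind a cycle.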

The fact that they have similar names doesn't make them similar methods.
DFS is typically implemented with LIFO (a stack if you will) - last in first out.
BFS is typically implemented with FIFO (a queue if you will) - first in first out.
You can walk a graph in any way you want, and eventually come out with a topological order of its nodes. But if you want to do it efficiently, then DFS is the best option, as the topological order of the nodes essentially reflects their depth in the graph (well, "dependency-depth" to be more accurate).
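For comparison, here is a minimal sketch of the DFS approach this answer alludes to: each node is appended once all of its successors are finished, and the reversed finishing order is a topological order. It assumes the input is a DAG given as a dict of successor lists; the names are illustrative only.

```python
def dfs_topological_sort(graph):
    """graph: dict mapping each node to a list of its successors; assumed acyclic."""
    visited = set()
    postorder = []

    def visit(node):
        visited.add(node)
        for nxt in graph.get(node, []):
            if nxt not in visited:
                visit(nxt)
        # Every node reachable from `node` has already been appended.
        postorder.append(node)

    for node in graph:
        if node not in visited:
            visit(node)
    # Reversing the finishing order puts each node before its successors.
    return postorder[::-1]
```

Both this and the in-degree approach touch every vertex and edge once, so the choice between them is mostly about which bookkeeping you find clearer.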

Generally, the code for topological sorting with DFS (depth-first search) is more straightforward: it is recursive, so it assigns numbers as it backtracks to the previous stack frames. BFS is less straightforward but still easy to understand.
First, you must calculate the in-degree of every vertex in the graph, because you must start at vertices that have an in-degree of 0.
int[] indegree = new int[adjList.length];
for (int i = 0; i < adjList.length; i++) {
    // adjList[i] is the head of vertex i's linked list of outgoing edges
    for (Edge e = adjList[i]; e != null; e = e.next) {
        indegree[e.vertexNum]++;
    }
}
The code above iterates through the vertex array, then through each vertex's edges (stored here as a linked list), and increments the indegree entry of the vertex each edge points to. At the end of the outer loop you will have traversed each vertex's neighbors and calculated each vertex's in-degree.
Second, you use BFS to actually topologically sort the graph. This first snippet enqueues only the vertices that have an in-degree of 0.
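// Queue is assumed to be a simple FIFO class with enqueue/dequeue
// (java.util.Queue is an interface and cannot be instantiated like this).
Queue<Integer> q = new Queue<>();
for (int i = 0; i < indegree.length; i++) {
    if (indegree[i] == 0) {
        q.enqueue(i);
    }
}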
Now, after enqueueing only vertices with in-degree of 0, you start the loop to assign topological numbers.
while (!q.isEmpty()) {
    int vertex = q.dequeue();
    System.out.print(vertex);
    for (Edge e = adjList[vertex]; e != null; e = e.next) {
        // removing this edge may leave its target with no remaining predecessors
        if (--indegree[e.vertexNum] == 0) {
            q.enqueue(e.vertexNum);
        }
    }
}
The print statement prints the vertex numbers in topological order. Depending on the requirements of your program, you can replace it with code that stores the vertex numbers, the names, or something along those lines. I hope this answers the first question.
Second Question
As for the second part of the question, checking whether the graph is strongly connected is pretty simple.
1. Create a boolean array filled with false values; it records which vertices have been visited.
2. Create a for loop iterating over the adjList array. Inside this loop, call bfs from the current vertex, then iterate over the boolean array: if any value is still false, the graph is not strongly connected and you can return false. At the end of each iteration of the outer for loop (after the inner loop), don't forget to reset the boolean array to all false values.
3. Once the outer for loop is done, you can return true. It is your job to implement bfs; it should take an integer (the start vertex) and the visited boolean array as parameters.
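The three steps above can be sketched in Python as follows. This is only an illustration under assumed conventions: vertices are numbered 0..n-1, `adj[u]` lists the successors of `u`, and both function names are made up. Note that it runs one BFS per vertex, so it costs O(V(V+E)) overall, and it only answers whether the whole graph is strongly connected; it does not list the individual components.

```python
from collections import deque

def bfs_reaches_all(adj, start):
    """Return True if every vertex is reachable from `start`."""
    visited = [False] * len(adj)   # fresh array per call, so no manual reset is needed
    visited[start] = True
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if not visited[v]:
                visited[v] = True
                queue.append(v)
    return all(visited)

def is_strongly_connected(adj):
    # Strongly connected means every vertex can reach every other vertex.
    return all(bfs_reaches_all(adj, s) for s in range(len(adj)))
```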

Related

How to minimize the number of vertices of a graph by substituting cycles?

How can I minimize the number of vertices of a directed graph by removing circuits? Are there any algorithms that can be adapted here?
There is already a question about removing cycles in graphs, but I am specifically asking about minimizing the number of vertices by removing the cycles.
Supposing the solution to your problem is simply to turn each cycle into a single node, sure, you can do that easily.
When you execute breadth-first search (BFS) or depth-first search (DFS), you will find cycles (i.e. if you mark the path you step along, then once you reach an already marked node, you have found a cycle). Hence, you can find cycles by storing the predecessor of each node you visit: if you are in node u and you go to node v, store p[v] = u. If you then find some already visited node w in the adjacency list of v, you can walk back, parent by parent, until you reach w, and you have all the nodes of that cycle.
I cannot guarantee any completeness property for this algorithm. If you can freely preprocess your graph, you can run DFS on it repeatedly until the graph is no longer changed; otherwise, run it a fixed number n of times that you find efficient.
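// Needs <iostream>, <stack>, <vector> and `using namespace std;`.
// Node is expected to expose an integer `id` and an `adjs` list of successor ids.
void FindCycles(const vector<Node>& nodes){
    vector<int> p(nodes.size(), -1);        // predecessor of each node, -1 = none
    vector<bool> mark(nodes.size(), false); // visited flags, all false initially
    stack<int> s;
    s.push(nodes[0].id);
    while(!s.empty()){
        int u = s.top(); // std::stack::pop() returns void, so read the top first
        s.pop();
        mark[u] = true;
        for(int v : nodes[u].adjs){
            if(mark[v]) {
                //found a cycle candidate, call some method to reduce the graph
                cout<<v<<" belongs to the cycle"<<endl;
                //walk back parent by parent; stop at the root if v is not an ancestor of u
                for(int w = u; w != v && w != -1; w = p[w]){
                    cout<<w<<" belongs to the cycle"<<endl;
                }
                break;
            }
            else{
                p[v] = u; //record the predecessor only when v is first discovered
                s.push(v);
            }
        }
    }
}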

DFS after removing some edges

I have a graph with one source vertex and a list of edges; in each iteration, one edge from the list is removed from the graph.
For each vertex I have to print the iteration in which it loses its connection to the source vertex, i.e. after which there is no path between the vertex and the source.
My idea is to run DFS from the source vertex in each iteration and increment a counter for every vertex that is still connected to the source, i.e. for which a path to the source still exists.
I'm sure there is a better idea than running DFS from the source vertex in every iteration, but I don't know how to solve the problem in a better, faster way.
Since you have the whole edge list in advance, you can process it backwards, connecting the graph instead of disconnecting it.
In pseudo-code:
GIVEN:
edges = list of edges
outputMap = new empty map from vertex to iteration number
S = source vertex
//first remove all the edges in the list
for (int i=0;i<edges.size();i++) {
    removeEdge(edges[i]);
}
//find vertices that are never disconnected
//use DFS or BFS
foreach vertex reachable from S
{
    outputMap[vertex] = -1;
}
//walk through the edges backward, reconnecting
//the graph
for (int i=edges.size()-1; i>=0; i--)
{
    Vertex v1 = edges[i].v1;
    Vertex v2 = edges[i].v2;
    Vertex newlyConnected = null;
    //this is for an undirected graph
    //for a directed graph, you only test one way
    //is a new vertex being connected to the source?
    if (outputMap.containsKey(v1) && !outputMap.containsKey(v2))
        newlyConnected = v2;
    else if (outputMap.containsKey(v2) && !outputMap.containsKey(v1))
        newlyConnected = v1;
    if (newlyConnected != null)
    {
        //BFS or DFS again
        foreach vertex reachable from newlyConnected
        {
            //edges[i] is removed in iteration i+1 (counting from 1), and that
            //removal is what cuts these vertices off from the source
            outputMap[vertex] = i+1;
        }
    }
    addEdge(v1,v2);
}
//generate output
foreach entry in outputMap
{
    if (entry.value >=0)
    {
        print("vertex "+entry.key+" disconnects in iteration "+entry.value);
    }
}
This algorithm achieves linear time, since each vertex is only involved in a single BFS or DFS, before it gets connected to the source.
It helps to reverse time, so that we're thinking about adding edges one by one and determining when connectivity to the source is achieved. Your idea of performing a traversal after each step is a good one. To get the total cost down to linear, you need the following optimization and an amortized analysis. The optimization is that you save the set of visited vertices from traversal to traversal and treat the set as one "supervertex", deleting intra-set edges as they are traversed. The cost of each traversal is proportional to the number of edges thus deleted, hence the amortized linear running time.
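A minimal Python sketch of the reverse-time idea for an undirected graph is shown below. The function name, the split into `permanent_edges` (never removed) and `removed_edges` (in removal order), and the convention of reporting 1-based iteration numbers with -1 for "never disconnected" are all assumptions made for illustration. Because the BFS never re-enters a vertex that already has a label, each vertex is expanded once, which is the amortized argument made above.

```python
from collections import defaultdict, deque

def disconnect_iterations(permanent_edges, removed_edges, source):
    """Map each vertex to the 1-based iteration in which it loses the source (-1 = never)."""
    adj = defaultdict(set)
    for u, v in permanent_edges:
        adj[u].add(v)
        adj[v].add(u)

    result = {}

    def absorb(start, label):
        # BFS restricted to vertices that are not yet connected to the source.
        result[start] = label
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in result:
                    result[v] = label
                    queue.append(v)

    absorb(source, -1)  # vertices that survive every removal
    # Re-add the removed edges from last to first.
    for i in range(len(removed_edges) - 1, -1, -1):
        u, v = removed_edges[i]
        adj[u].add(v)
        adj[v].add(u)
        if u in result and v not in result:
            absorb(v, i + 1)
        elif v in result and u not in result:
            absorb(u, i + 1)
    return result
```

Vertices that are not connected to the source even before any removal never receive a label and are simply absent from the returned map.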

Shortest path in a complement graph algorithm

I had a test today (Data Structures course), and one of the questions was the following:
Given an undirected, unweighted graph G=(V,E), write an algorithm that, for a given node s, returns the shortest paths from s to all the nodes v' in the complement graph.
The complement graph G'=(V',E') contains an edge between any two nodes that don't share an edge in G, and only those.
The algorithm needs to run in O(V+E) (of the original graph).
I asked 50 different students, and not even one of them solved it correctly.
Any ideas?
Thanks a lot,
Barak.
The course staff have published the official answers to the test.
The answer is:
"The algorithm is based on a BFS with a few adaptations.
For each node in the graph we will add 2 fields - next and prev. Using these two fields we can maintain two Doubly-Linked lists of nodes: L1,L2.
At the beginning of every iteration of the algorithm, L1 has all the white nodes in the graph, and L2 is empty.
The BFS code (without the initialization) is:
At the ending of the loop at lines 3-5, L1 contains all the white nodes that aren't adjacent to u in G, or in other words, all the white nodes that are adjacent to u in the complement graph.
Therefore the algorithm behaves like the original BFS run on the complement graph.
The time is O(V+E) because lines 4-5 are executed at most 2E times, and lines 7-9 are executed at most V times (Every node can get out of L1 only once)."
Note: this is the original solution translated from Hebrew.
I hope you find it helpful, and thank you all for helping me out,
Barak.
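The listing that the official answer refers to is not reproduced here, so the following is not that code. It is a hedged Python sketch of the same idea, using a set of still-white nodes in place of the linked lists L1 and L2; the function name and the list-of-sets input format are assumptions. Each white node examined for `u` is either adjacent to `u` in G (charged to that edge, at most twice per edge) or leaves the white set for good (at most once per node), which gives the O(V+E) bound with hash-set lookups.

```python
from collections import deque

def complement_bfs(adj, s):
    """adj[u] is the set of neighbours of u in G; returns BFS distances in the complement of G.

    Unreachable nodes keep the value None.
    """
    n = len(adj)
    dist = [None] * n
    dist[s] = 0
    unvisited = set(range(n)) - {s}   # plays the role of L1 (white nodes)
    queue = deque([s])
    while queue and unvisited:
        u = queue.popleft()
        stay = set()                  # plays the role of L2
        for v in unvisited:
            if v in adj[u]:
                stay.add(v)           # adjacent in G, so not adjacent in the complement
            else:
                dist[v] = dist[u] + 1 # complement edge u-v: v is discovered now
                queue.append(v)
        unvisited = stay
    return dist
```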
I would like to propose a different approach.
Initialization:
Create a list of undiscovered nodes. Let's call it undiscovered and initialize it with all nodes.
Create a queue (Q) and add the start node to it.
Then run a modified version of BFS.
Main algorithm:
while undiscovered.size() > 0 && Q is not empty
    curr_node = DEQUEUE(Q)
    Create a list of all neighbours of curr_node in the complement graph (let's call it complement_edges). This can be built by looping through all the nodes in undiscovered and checking whether each one is connected to curr_node in the original graph.
    Then loop through each node in complement_edges and perform 3 operations:
        update its distance if the new one is better
        remove this node from undiscovered
        ENQUEUE(Q, this node)
Some things to note here:
If the initial graph is sparse, undiscovered becomes empty very fast.
During implementation, use hashing to store the edges of the graph; this makes the connectivity check that builds complement_edges fast.
Here's the sample code:
// Needs java.util.*; these members are assumed to live inside an enclosing class.
HashSet<Integer> adjList[]; // graph stored as adjacency list, nodes numbered 1..N
int N;                      // number of nodes (assumed to be set elsewhere)
public int[] calc_distance(int start){
    HashSet<Integer> undiscovered = new HashSet<>();
    for(int i=1;i<=N;i++){
        undiscovered.add(i);
    }
    undiscovered.remove(start); // the start node is already discovered
    int[] dist = new int[N+1];
    Arrays.fill(dist, Integer.MAX_VALUE/4);
    Queue<Integer> q = new LinkedList<>();
    q.add(start);
    dist[start] = 0;
    while(!q.isEmpty() && undiscovered.size()>0){
        int curr = q.poll();
        LinkedList<Integer> complement_edges = new LinkedList<>();
        for(int child : undiscovered){
            if(!adjList[curr].contains(child)){
                // curr and child are connected in the complement
                complement_edges.add(child);
            }
        }
        for(int child : complement_edges){
            if(dist[child]>(dist[curr]+1)){
                dist[child] = dist[curr]+1;
            }
            // remove the newly discovered node from undiscovered
            undiscovered.remove(child);
            q.add(child);
        }
    }
    return dist;
}

Finding a cycle in a directed graph using BFS or DFS

I tried looking around the Internet but I'm a little stuck at the moment with regards to modifying the BFS or DFS algorithm in order to be able to find a cycle in a directed graph. If the graph were not directed, the DFS algorithm would solve this using back edges, but this method fails when looking at directed graphs.
Can anyone point me in the right direction?
Thanks for your time.
Keep track of the vertices currently on the recursion stack of the DFS traversal. If you reach a vertex that is already on the recursion stack, then there is a cycle in the graph.
Create an array recStack[] and mark every vertex in it while its DFS call is active. If you encounter a vertex that is still marked in recStack[], a cycle exists, and you can print it by passing that vertex again to a modified DFS function for printing.
bool isGraphCyclic(int v, bool visited[], bool *recStack)
{
    if(visited[v] == false)
    {
        // Mark the current node as visited and part of recursion stack
        visited[v] = true;
        recStack[v] = true;
        // Recur for all the vertices adjacent to this vertex
        list<int>::iterator i;
        for(i = adj[v].begin(); i != adj[v].end(); ++i)
        {
            if ( !visited[*i] && isGraphCyclic(*i, visited, recStack) )
                return true;
            else if (recStack[*i])
                return true;
        }
    }
    recStack[v] = false; // remove the vertex from recursion stack
    return false;
}
DFS algorithm classifies graph edges into three categories *:
Forward edges
Cross edges
Back edges
If your graph has a back edge, it has a cycle. When you run DFS and see a back edge, examine the portion of the current path from the vertex the back edge leads to up to the current node; this gives you the set of nodes on the cycle to which the back edge belongs.
* Sometimes, tree edges are treated as a separate category from forward edges, which is insignificant for the purposes of this discussion.
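One common way to recognise back edges is the three-colour DFS: a node is gray while its DFS call is active, so an edge into a gray node is a back edge. The Python sketch below illustrates this and returns the nodes of the first cycle it finds; the function name and the dict-of-lists graph format are assumptions, not part of the answers above.

```python
WHITE, GRAY, BLACK = 0, 1, 2

def find_cycle(graph):
    """graph: dict mapping each node to a list of successors. Returns a list of
    nodes forming a cycle, or None if the graph is acyclic."""
    color = {node: WHITE for node in graph}
    path = []  # nodes on the current DFS path (exactly the gray nodes)

    def visit(u):
        color[u] = GRAY
        path.append(u)
        for v in graph.get(u, []):
            state = color.get(v, WHITE)
            if state == GRAY:
                # Back edge u -> v: the cycle is the path segment from v to u.
                return path[path.index(v):]
            if state == WHITE:
                cycle = visit(v)
                if cycle:
                    return cycle
            # BLACK means a forward or cross edge, which closes no cycle.
        path.pop()
        color[u] = BLACK
        return None

    for node in graph:
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None
```

The recursion depth grows with the longest path, so very deep graphs would need an explicit stack instead.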

Topological search and Breadth first search

Is it possible to use Breadth first search logic to do a topological sort of a DAG?
The solution in Cormen makes use of depth-first search, but wouldn't it be easier to use BFS?
Reason:
BFS visits all the nodes in a particular depth before visiting nodes with the next depth value. It naturally means that the parents will be listed before the children if we do a BFS. Isn't this exactly what we need for a topological sort?
A mere BFS is only sufficient for a tree (or forest of trees), because in (forest of) trees, in-degrees are at most 1.
Now, look at this case:
B → C → D
      ↗
    A
A BFS whose queue is initialized to A B (the nodes whose in-degree is zero) will return A B D C, which is not topologically sorted. That's why you have to maintain an in-degree count and only pick nodes whose count has dropped to zero. (*)
BTW, this is the flaw in your 'reason': BFS only guarantees that one parent has been visited before a node, not all of them.
Edit: (*) In other words, you only push back adjacent nodes whose in-degree has dropped to zero (in the example, after processing A, D would be skipped). So you're still using a queue and you've just added a filtering step to the general algorithm. That being said, continuing to call it a BFS is questionable.
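The counterexample can be checked with a few lines of Python. This sketch encodes the graph above (edges A→D, B→C, C→D), runs an unfiltered BFS seeded with the zero in-degree nodes, and tests the result against every edge; the helper names are made up for illustration.

```python
from collections import deque

# The example graph: B -> C -> D and A -> D.
graph = {"A": ["D"], "B": ["C"], "C": ["D"], "D": []}

def plain_bfs_order(graph, sources):
    """Ordinary BFS from several sources, with no in-degree filtering."""
    seen = set(sources)
    queue = deque(sources)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in graph[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return order

def is_topological(order, graph):
    """True if every edge u -> v has u placed before v."""
    pos = {node: i for i, node in enumerate(order)}
    return all(pos[u] < pos[v] for u in graph for v in graph[u])

order = plain_bfs_order(graph, ["A", "B"])
print(order)                         # ['A', 'B', 'D', 'C']
print(is_topological(order, graph))  # False: D comes before its parent C
```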
It is possible; even Wikipedia describes an algorithm based on BFS.
Basically, you use a queue in which you insert all nodes with no incoming edges. Then, when you extract a node, you remove all of its outgoing edges and insert the nodes reachable from it that have no other incoming edges.
In a BFS all of the edges you actually walk will end up in the correct direction. But all the edges you don't walk (those between nodes at the same depth, or those from deeper nodes back up to earlier nodes) will end up going the wrong way if you lay out the graph in BFS order.
Yes, you really need DFS to do it.
Yes, you can do topological sorting using BFS. A teacher of mine once advised that if a problem can be solved with BFS, prefer it over DFS: the logic of BFS is usually simpler, and most of the time you want the more straightforward solution.
As YvesgereY and IVlad have mentioned, you need to start with the nodes whose indegree is 0, meaning no other node points to them, and add these to the result first. You can use a HashMap to map every node to its indegree, plus the queue that BFS normally uses to drive the traversal. When you poll a node from the queue, decrease the indegree of each of its neighbors by 1; this is equivalent to deleting the node from the graph along with its outgoing edges. Every time a neighbor's indegree drops to 0, add it to the result and offer it to the queue so that its own neighbors are processed later.

Resources