I have a graph with one source vertex and a list of the edges, where in each iteration one edge from the list is going to be removed from the graph.
For each vertex i have to print the number of iterations after it lost its connection to the source vertex- there will be no path between the vertex and the source.
My idea is to run DFS algorithm from the source vertex in each iteration and increment the value of the vertexes, which have the connection with the source vertex- there is a path between the vertex and the source vertex.
I'm sure there is a better idea than run the dfs algorithm from the source vertex in each iteration. But I don't know how to resolve the problem in better, faster way.
Since you have the whole edge list in advance, you can process it backwards, connecting the graph instead of disconnecting it.
In pseudo-code:
GIVEN:
edges = list of edges
outputMap = new empty map from vertex to iteration number
S = source vertex
//first remove all the edges in the list
for (int i=0;i<edges.size();i++) {
removeEdge(edges[i]);
}
//find vertices that are never disconnected
//use DFS or BFS
foreach vertex reachable from S
{
outputMap[vertex] = -1;
}
//walk through the edges backward, reconnecting
//the graph
for (int i=edges.size()-1; i>=0; i--)
{
Vertex v1 = edges[i].v1;
Vertex v2 = edges[i].v2;
Vertex newlyConnected = null;
//this is for an undirected graph
//for a directed graph, you only test one way
//is a new vertex being connected to the source?
if (outputMap.containsKey(v1) && !outputMap.containsKey(v2))
newlyConnected = v2;
else if (outputMap.containsKey(v2) && !outputMap.containsKey(v1))
newlyConnected = v1;
if (newlyConnected != null)
{
//BFS or DFS again
foreach vertex reachable from newlyConnected
{
//It's easy to calculate the desired remove iteration number
//from our add iteration number
outputMap[vertex] = edges.size()-i;
}
}
addEdge(v1,v2);
}
//generate output
foreach entry in outputMap
{
if (entry.value >=0)
{
print("vertex "+entry.key+" disconnects in iteration "+entry.value);
}
}
This algorithm achieves linear time, since each vertex is only involved in a single BFS or DFS, before it gets connected to the source.
It helps to reverse time, so that we're thinking about adding edges one by one and determining when connectivity to the source is achieved. Your idea of performing a traversal after each step is a good one. To get the total cost down to linear, you need the following optimization and an amortized analysis. The optimization is that you save the set of visited vertices from traversal to traversal and treat the set as one "supervertex", deleting intra-set edges as they are traversed. The cost of each traversal is proportional to the number of edges thus deleted, hence the amortized linear running time.
Related
I want to use topological sorting using Depth First Search (DFS) for the given problem (the directed graph attached below).
click here to see the image.
Could you please help by writing the appropriate code for the given problem using any programming language?
To recap: an ordering of the vertices of a graph is a topological sorting if all edges in the graph lead from higher-numbered vertices to lower-numbered vertices (or the other way around).
This means, that if, having a vertex V, you'd be able to somehow visit all vertices accessible from V and give them numbers lower than the number you give to V, then all edges going out of V will lead to vertices numbered lower than V, so they will conform to the definition of a topological sorting.
It is now our goal to achieve the above for all vertices.
Recall that a DFS could be useful here, because a DFS visits all vertices accessible from a given vertex.
We must run the DFS for all vertices, but since the DFS is a recursive algorithm, it will run itself for all vertices accessible from other vertices for which the DFS is run. Thus, we must run the DFS for all vertices, numbering the vertices in the order in which we exit from the DFS function, because at the end of the DFS function we know that all vertices accessible from a given vertex have been visited.
Here's some sample code in C++:
#include <vector> // we'll need vector for an adjacency list graph representation
const maxN = 1'000'000 // maximum number of vertices
vector<int> graph[maxN]; // the adjacency list representation of the graph
bool visited[maxN]; // filled with the value false by default
int postOrder[maxN]; // array with the so-called post-order ordering of the vertices
int counter = 1;
void DFS(int v) {
visited[v] = true; // mark vertex as visited
for (int neighbour : graph[v]) { // for each neighbour of v
if (!visited[neighbour]) { // if the neighbour wasn't already visited
DFS(neighbour); // run the DFS for that neighbour
}
}
// Finally, after all the vertices accessible from v have been numbered,
// give v the next number, and increase the counter.
postOrder[v] = counter;
counter++;
}
void sortTopologically() {
// For each unvisited vertex, run the DFS starting in that vertex.
for (int v = 0; v < maxN; v++) {
if (!visited[v]) {
DFS(v);
}
}
}
This is called a post-order ordering and is a very common way to topologically sort a directed graph. You can then easily check if the topological sorting exists by verifying that all edges lead from a higher-numbered vertex to a lower-numbered vertex.
Good luck with your university course!
Can Breadth first Search be used for finding topological sorting of vertices and strongly connected components in a graph?
If yes how to do that? and If not why not?
we generally use Depth first search in these problems but What will be the problem if I try to implement using BFS?
Will code like this work?
def top_bfs(start_node):
queue = [start_node]
stack = []
while not queue.empty():
node = queue.dequeue()
if not node.visited:
node.visited = True
stack.push(node)
for c in node.children:
queue.enqueue(c)
stack.reverse()
return stack
Yes, you can do topological sorting using BFS. Actually I remembered once my teacher told me that if the problem can be solved by BFS, never choose to solve it by DFS. Because the logic for BFS is simpler than DFS, most of the time you will always want a straightforward solution to a problem.
You need to start with nodes of which the indegree is 0, meaning no other nodes direct to them. Be sure to add these nodes to your result first.You can use a HashMap to map every node with its indegree, and a queue which is very commonly seen in BFS to assist your traversal. When you poll a node from the queue, the indegree of its neighbors need to be decreased by 1, this is like delete the node from the graph and delete the edge between the node and its neighbors. Every time you come across nodes with 0 indegree, offer them to the queue for checking their neighbors later and add them to the result.
public ArrayList<DirectedGraphNode> topSort(ArrayList<DirectedGraphNode> graph) {
ArrayList<DirectedGraphNode> result = new ArrayList<>();
if (graph == null || graph.size() == 0) {
return result;
}
Map<DirectedGraphNode, Integer> indegree = new HashMap<DirectedGraphNode, Integer>();
Queue<DirectedGraphNode> queue = new LinkedList<DirectedGraphNode>();
//mapping node to its indegree to the HashMap, however these nodes
//have to be directed to by one other node, nodes whose indegree == 0
//would not be mapped.
for (DirectedGraphNode DAGNode : graph){
for (DirectedGraphNode nei : DAGNode.neighbors){
if(indegree.containsKey(nei)){
indegree.put(nei, indegree.get(nei) + 1);
} else {
indegree.put(nei, 1);
}
}
}
//find all nodes with indegree == 0. They should be at starting positon in the result
for (DirectedGraphNode GraphNode : graph) {
if (!indegree.containsKey(GraphNode)){
queue.offer(GraphNode);
result.add(GraphNode);
}
}
//everytime we poll out a node from the queue, it means we delete it from the
//graph, we will minus its neighbors indegree by one, this is the same meaning
//as we delete the edge from the node to its neighbors.
while (!queue.isEmpty()) {
DirectedGraphNode temp = queue.poll();
for (DirectedGraphNode neighbor : temp.neighbors){
indegree.put(neighbor, indegree.get(neighbor) - 1);
if (indegree.get(neighbor) == 0){
result.add(neighbor);
queue.offer(neighbor);
}
}
}
return result;
}
The fact that they have similar names doesn't make them similar methods.
DFS is typically implemented with LIFO (a stack if you will) - last in first out.
BFS typically implemented with FIFO (a queue if you will) - first in first out.
You can walk a graph in any way you want, and eventually come out with a topological order of its nodes. But if you want to do it efficiently, then DFS is the best option, as the topological order of the nodes essentially reflects their depth in the graph (well, "dependency-depth" to be more accurate).
So generally the code for topologically sorting using DFS (depth first search) is much more straight forward, you run it and it backtracks since its recursive assigning numbers as it calls back to previous stack frames. BFS is less straight forward but still easy to understand.
First, you must calculate the in-degree of all the vertices on the graph, this is because you must start at a vertex that has an in-degree of 0.
int[] indegree = int[adjList.length];
for(int i = 0; i < adjList.length; i++){
for(Edge e = adjList[i]; e != null; e = e.next){
indegree[e.vertexNum]++;
}
}
So the code above iterates through the vertex array, then it iterates through a single vertex's edges(in this case its stored using linked list), then it increments the vertex that the edge is pointing to in the indegree array. So at the end of the outer loop you will have traversed each vertex's neighbors and calculated each vertex's in-degree.
Second, you now must use BFS to actually topologically sort this graph. So this first snippet of code will only enqueue the vertices in the graph that have an in-degree of 0.
Queue<Integer> q = new Queue<>();
for(int i = 0; i < indegree.length; i++){
if(indegree[i] == 0){
q.enqueue(i);
}
}
Now, after enqueueing only vertices with in-degree of 0, you start the loop to assign topological numbers.
while(!q.isEmpty()){
int vertex = q.dequeue();
System.out.print(vertex);
for(Edge e = adjList[vertex]; e != null; e = e.next){
if(--indegree[e.vnum] == 0){
q.enqueue(e.vnum);
}
}
So the print statement prints out the vertex number that corresponds to the vertex. So depending on the requirements of your program, you can change the code where the print statement is to something that stores the vertex numbers or the names or something along those lines. Other than that, I hope this helped answer the first question.
Second Question
As for the second part of the question, it's pretty simple.
1.Create boolean array filled with false values, this will represent if the vertices have been visited or not.
2.Create for loop iterating over the adjList array, inside this loop you will call bfs, after calling bfs you will iterate over the boolean array you created, checking if any value is false, if it is then the graph is not strongly connected and you can return "graph is not strongly connected" and end the program. At the end of each iteration of the outer for-loop (but after the inner for-loop) don't forget to reset your boolean array to all false values again.
3.At this point the outer for loop is done and you can return true, it is your job to implement to bfs it should take in an integer and the visited boolean array you created as parameters.
I had a test today (Data Structures course), and one of the questions was the following:
Given an undirected, non-weighted graph G=(V,E), you need to write an algorithm that for a given node s, returns the shortest path from s to all the nodes v' in the complement graph.
A Complement Graph G'=(E',V') contains an edge between any to nodes in G that don't share an edge, and only those.
The algorithm needs to run in O(V+E) (of the original graph).
I asked 50 different students, and not even one of them solved it correctly.
any Ideas?
Thanks a lot,
Barak.
The course staff have published the official answers to the test.
The answer is:
"The algorithm is based on a BFS with a few adaptations.
For each node in the graph we will add 2 fields - next and prev. Using these two fields we can maintain two Doubly-Linked lists of nodes: L1,L2.
At the beginning of every iteration of the algorithm, L1 has all the while nodes in the graph, and L2 is empty.
The BFS code (without the initialization) is:
At the ending of the loop at lines 3-5, L1 contains all the white nodes that aren't adjacent to u in G, or in other words, all the white nodes that are adjacent to u in the complement graph.
Therefore the runtime of the algorithm equals to the runtime of the original BFS on the complement graph.
The time is O(V+E) because lines 4-5 are executed at most 2E times, and lines 7-9 are executed at most V times (Every node can get out of L1 only once)."
Note: this is the original solution translated from Hebrew.
I Hope you find it helpful, and thank you all for helping me out,
Barak.
I would like to propose a different approach.
Initialization:-
Create a list of undiscovered edges. Let's call it undiscovered and initialize it with all nodes.
Then we will run a modified version of BFS
Create a Queue(Q) and add start node to it
Main algo
while undiscovered.size()>0 && Queue not Empty
curr_node = DEQUEUE(Queue)
create a list of all edges in the complement graph(Lets call it
complement_edges). This can be created by looping through all the
nodes in undiscovered and checking whether it is connected to
curr_node.
Then loop through each node in complement_edges perform 3
operation
Update distance if optimal
remove this node from undiscovered
ENQUEUE(Queue, this node)
Some things to note here,
If the initial graph is sparse, then the undiscovered will become empty very fast.
During implementation, use hashing to store edges in graph, this will make step 2 fast.
Heres the sample code:-
HashSet<Integer> adjList[]; // graph stored as adjancency list
public int[] calc_distance(int start){
HashSet<Integer> undiscovered = new HashSet<>();
for(int i=1;i<=N;i++){
undiscovered.add(i);
}
int[] dist = new int[N+1];
Arrays.fill(dist, Integer.MAX_VALUE/4);
Queue<Integer> q = new LinkedList<>();
q.add(start);
dist[start] = 0;
while(!q.isEmpty() && undiscovered.size()>0){
int curr = q.poll();
LinkedList<Integer> complement_edges = new LinkedList<>();
for(int child : undiscovered){
if(!adjList[curr].contains(child)){
// curr and child is connected in complement
complement_edges.add(child);
}
}
for(int child : complement_edges){
if(dist[child]>(dist[curr]+1)){
dist[child] = dist[curr]+1;
}
// remove complement_edges from undiscovered
undiscovered.remove(child);
q.add(child);
}
}
return dist;
}
}
I tried looking around the Internet but I'm a little stuck at the moment with regards to modifying the BFS or DFS algorithm in order to be able to find a cycle in a directed graph. If the graph were not directed, the DFS algorithm would solve this using back edges, but this method fails when looking at directed graphs.
Can anyone point me in the right direction?
Thanks for your time.
Keep track of vertices currently in recursion stack of function for DFS traversal. If you reach a vertex that is already in the recursion stack, then there is a cycle in the tree.
Create an array recStack[] and add every vertex visited in it. if you encounter a vertex that is already visited, there exists a cycle and you can print it by passing that vertex again to a modified DFS function for printing
bool isGraphCyclic(int v, bool visited[], bool *recStack)
{
if(visited[v] == false)
{
// Mark the current node as visited and part of recursion stack
visited[v] = true;
recStack[v] = true;
// Recur for all the vertices adjacent to this vertex
list<int>::iterator i;
for(i = adj[v].begin(); i != adj[v].end(); ++i)
{
if ( !visited[*i] && isGraphCyclic(*i, visited, recStack) )
return true;
else if (recStack[*i])
return true;
}
}
recStack[v] = false; // remove the vertex from recursion stack
return false;
}
DFS algorithm classifies graph edges into three categories *:
Forward edges
Cross edges
Back edges
If your graph has a back edge, it has a cycle. When you run a DFS algorithm and see a backedge, examine the portion of the path from the vertex to which the back edge leads to the current node will give you a set of nodes from the cycle to which the back edge belongs.
* Sometimes, tree edges are treated as a separate category from forward edges, which is insignificant for the purposes of this discussion.
I have a graph with huge number of nodes with one start node ( all edges are outward ) and one end node ( all edges towards it ). It is an unidirectional and unweighted graph.How to optimize the search in this kind of graph for finding out if path exists between two nodes ? I know BFS provides a solution. Is there anyway to optimize the search ( like adding some additional information ) as I will be doing frequent search on the graph?
EDIT : To add more information about the graph, the graph has one start node with multiple out-edges and one end node with multiple in-edges. In between, there are millions of nodes connected. It is an unweighted DAG. And there are no heuristics involved. Just check isConnected(node a,node b).
Considering your graph is acyclic here is a way to do it : -
Do DFS on graph start with source vertex(only outgoiong edges)
For each edge (u,v) in the graph connected[u][v] = true
Try to store the previous node in DFS stack in a array & for each vertex v visited check the previous nodes in the stack and do
connected[u][v] = true where u is a previous node.
If graph is not acyclic then first calculate SCC's using Kosaraju or Tarjan and then reduce the graph to acyclic and do connected[u][v] = true for each pair in a SCC
pseudo code for modified DFS routine:-
bool connected[n][n] = {false};
bool visited[n] = {false};
int stack[n];
for each source vertex v do :
DFS(v,stack,0);
void DFS(int u,int stack[n],int depth) {
if(!visited[v]) {
visited[v] = true;
for(int i=0;i<depth;i++) {
connected[stack[i]][v] = true;
}
stack[depth] = u;
for each edge(u,v) {
connected[u][v] = true;
DFS(v,stack,depth+1);
}
}
}
Space Complexity : O(V^2)
Time Complexity : O(V^2)
Note:-
If your number of queries are less then try to use DFS for them individually and cache the results as this will be more time consuming then that.