Shortest path in a complement graph algorithm

I had a test today (Data Structures course), and one of the questions was the following:
Given an undirected, unweighted graph G=(V,E), you need to write an algorithm that, for a given node s, returns the shortest paths from s to every node v' in the complement graph.
The complement graph G'=(V,E') contains an edge between any two nodes of G that do not share an edge in G, and only those edges.
The algorithm needs to run in O(V+E) (of the original graph).
I asked 50 different students, and not even one of them solved it correctly.
Any ideas?
Thanks a lot,
Barak.

The course staff have published the official answers to the test.
The answer is:
"The algorithm is based on a BFS with a few adaptations.
For each node in the graph we will add 2 fields - next and prev. Using these two fields we can maintain two doubly-linked lists of nodes: L1, L2.
At the beginning of every iteration of the algorithm, L1 has all the white nodes in the graph, and L2 is empty.
The BFS code (without the initialization) is:
At the end of the loop at lines 3-5, L1 contains all the white nodes that aren't adjacent to u in G, or in other words, all the white nodes that are adjacent to u in the complement graph.
Therefore the runtime of the algorithm equals the runtime of the original BFS on the complement graph.
The time is O(V+E) because lines 4-5 are executed at most 2E times, and lines 7-9 are executed at most V times (every node can leave L1 only once)."
Note: this is the original solution translated from Hebrew.
I hope you find it helpful, and thank you all for helping me out,
Barak.
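For concreteness, here is a minimal Java sketch of the modified BFS described in that solution. Since the pseudocode it refers to (lines 3-5 and 7-9) is not reproduced above, this is only a reconstruction of the idea: a LinkedHashSet of white (undiscovered) nodes stands in for the next/prev doubly-linked list L1, and the set-aside neighbours play the role of L2. All names here are illustrative.

import java.util.*;

class ComplementBFS {
    // dist[v] = shortest distance from s to v in the complement of the graph 'adj';
    // -1 means v is unreachable from s in the complement.
    static int[] complementBfs(List<Set<Integer>> adj, int s) {
        int n = adj.size();
        int[] dist = new int[n];
        Arrays.fill(dist, -1);
        // 'white' plays the role of the doubly-linked list L1 of undiscovered nodes
        LinkedHashSet<Integer> white = new LinkedHashSet<>();
        for (int v = 0; v < n; v++) if (v != s) white.add(v);
        Deque<Integer> queue = new ArrayDeque<>();
        dist[s] = 0;
        queue.add(s);
        while (!queue.isEmpty() && !white.isEmpty()) {
            int u = queue.poll();
            // set aside u's neighbours in G (this is the L2 list); costs O(deg(u))
            List<Integer> skipped = new ArrayList<>();
            for (int v : adj.get(u)) {
                if (white.remove(v)) skipped.add(v);
            }
            // every node still in 'white' is adjacent to u in the complement: discover it
            for (int w : white) {
                dist[w] = dist[u] + 1;
                queue.add(w);
            }
            // the set-aside nodes become the new L1; every discovered node left L1 for good
            white = new LinkedHashSet<>(skipped);
        }
        return dist;
    }
}

Each node is discovered and enqueued at most once, and every edge of G is inspected at most twice, which matches the O(V+E) bound claimed in the solution.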

I would like to propose a different approach.
Initialization:
Create a set of undiscovered nodes. Let's call it undiscovered and initialize it with all the nodes.
Then we will run a modified version of BFS.
Create a queue Q and add the start node to it.
Main algorithm:
while undiscovered.size() > 0 && Q is not empty
    curr_node = DEQUEUE(Q)
    Build a list of curr_node's neighbours in the complement graph (let's call it
    complement_edges). This can be done by looping through all the nodes in
    undiscovered and keeping the ones that are not connected to curr_node.
    Then loop through each node in complement_edges and perform 3 operations:
        update its distance if the new one is shorter
        remove this node from undiscovered
        ENQUEUE(Q, this node)
Some things to note here:
If the initial graph is sparse, then undiscovered will become empty very quickly.
During implementation, use hashing to store the edges of the graph; this makes the adjacency check in step 2 fast.
Here's the sample code:
import java.util.*;

// Wrapper class and the field N (number of nodes, 1-indexed) are added here so the snippet compiles.
class ComplementShortestPaths {
    int N;
    HashSet<Integer> adjList[]; // graph stored as an adjacency list of hash sets

    public int[] calc_distance(int start) {
        HashSet<Integer> undiscovered = new HashSet<>();
        for (int i = 1; i <= N; i++) {
            undiscovered.add(i);
        }
        int[] dist = new int[N + 1];
        Arrays.fill(dist, Integer.MAX_VALUE / 4);
        Queue<Integer> q = new LinkedList<>();
        q.add(start);
        dist[start] = 0;
        while (!q.isEmpty() && undiscovered.size() > 0) {
            int curr = q.poll();
            LinkedList<Integer> complement_edges = new LinkedList<>();
            for (int child : undiscovered) {
                if (!adjList[curr].contains(child)) {
                    // curr and child are connected in the complement
                    complement_edges.add(child);
                }
            }
            for (int child : complement_edges) {
                if (dist[child] > dist[curr] + 1) {
                    dist[child] = dist[curr] + 1;
                }
                // remove the newly discovered node from undiscovered
                undiscovered.remove(child);
                q.add(child);
            }
        }
        return dist;
    }
}

Related

What is an Efficient Algorithm to find Graph Centroid?

A graph centroid is a vertex such that each connected component remaining after its removal has size less than or equal to N/2, where N is the size of the connected component containing this vertex?! [Needs correction?!]
Here's a problem at CodeForces that asks to find whether each vertex is a centroid, but after removing and replacing exactly one edge at a time.
Problem Statement
I need help to refine this PseudoCode / Algorithm.
Loop over all vertices:
    Loop over all edges:
        Move this edge into every empty edge position between two unconnected nodes
        Count the size of each connected component (*1).
        If all sizes are less than or equal to N/2,
            then return true
The problem is that this algorithm will run in at least O(N*M^2). It's not acceptable.
I looked up the answers, but I couldn't come up with the high level abstraction of the algorithm used by others. Could you please help me understand how these solutions work?
Solutions' Link
(*1) DFS Loop
I will try to describe a not-so-complex algorithm for solving this problem in linear time; for future reference see my code (it has some comments).
The main idea is that you can root the tree T at an arbitrary vertex and traverse it; for each vertex V you can do this:
Cut the subtree of V from T.
Find the heaviest vertex H whose subtree has size <= N/2 (H can be in either T or the subtree of V).
Move the subtree of H to become a child of V.
Re-root T at V and check whether the heaviest subtree has size <= N/2.
The previous algorithm can be implemented carefully to get linear time complexity; the issue is that it has a lot of cases to handle.
A better idea is to find the centroid C of T and root T at vertex C.
Having vertex C as the root of T is useful because it guarantees that every subtree hanging off C has size <= N/2.
When traversing the tree we no longer need to look for the heaviest vertex down the tree, only up: every time we visit a child W, we can pass down the heaviest subtree size (among those <= N/2) that would be available if we re-rooted T at W.
Try to understand what I explained, and let me know if something is not clear.
Well, the Centroid of a tree can be determined in O(N) space and time complexity.
Construct a matrix representing the tree, with the row indexes representing the N nodes and the elements in the i-th row representing the nodes to which the i-th node is connected (i.e. an adjacency list). You can use any other representation as well.
Maintain 2 linear arrays of size N, with index i representing the depth of the i-th node (depth) and the parent of the i-th node (parent) respectively.
Also maintain 2 more linear arrays, the first one containing the BFS traversal sequence of the tree (queue), and the other one (leftOver) containing the value of [N - number of nodes in the subtree rooted at that node]. In other words, the i-th index contains the number of nodes that are left in the whole tree when the i-th node is removed from the tree along with all its children.
Now, perform a BFS traversal taking any arbitrary node as root and fill the arrays 'parent' and 'depth'. This takes O(N) time. Also, record the traversal sequence in the array 'queue'.
Starting from the leaf nodes, add the number of nodes present in the subtree rooted at each node to the value at the parent's index in the array 'leftOver'. This also takes O(N) time, since you can use the already prepared 'queue' array and travel through it backwards.
Finally, traverse through the array 'leftOver' and modify each value to [N-1 - initial value]. The 'leftOver' array is now prepared. Cost: another O(N).
Your work is almost done. Now, iterate over this 'leftOver' array and find the index whose value is closest to floor(N/2). However, this value must not exceed floor(N/2) at any cost.
This index is the index of the Centroid of the Tree. Overall time complexity: O(N).
Java Code:
import java.util.ArrayList;
import java.util.Iterator;
import java.util.Scanner;

class Find_Centroid
{
    static final int MAXN=100_005;
    static ArrayList<Integer>[] graph;
    static int[] depth,parent;      // Step 2
    static int N;
    static Scanner io=new Scanner(System.in);

    public static void main(String[] args)
    {
        int i;
        N=io.nextInt();             // Number of nodes in the Tree
        graph=new ArrayList[N];
        for(i=0;i<graph.length;++i)
            graph[i]=new ArrayList<>();   // Initialisation
        for(i=1;i<N;++i)
        {
            int a=io.nextInt()-1,b=io.nextInt()-1;   // Assuming 1-based indexing
            graph[a].add(b); graph[b].add(a);        // Step 1
        }
        int centroid = findCentroid(new java.util.Random().nextInt(N));
        // Arbitrary indeed... ;)
        System.out.println("Centroid: "+(centroid+1));   // '+1' for output in 1-based index
    }

    static int[] queue=new int[MAXN],leftOver;   // Step 3

    static int findCentroid(int r)
    {
        leftOver=new int[N];
        int i,target=N/2,ach=-1;
        bfs(r);                     // Step 4
        for(i=N-1;i>=0;--i)
            if(queue[i]!=r)
                leftOver[parent[queue[i]]] += leftOver[queue[i]] +1;   // Step 5
        for(i=0;i<N;++i)
            leftOver[i] = N-1 -leftOver[i];   // Step 6
        for(i=0;i<N;++i)
            if(leftOver[i]<=target && leftOver[i]>ach)
            {
                // Closest to target(=N/2) but does not exceed it.
                r=i; ach=leftOver[i];
            }
        // Step 7
        return r;
    }

    static void bfs(int root)   // Iterative
    {
        parent=new int[N]; depth=new int[N];
        int st=0,end=0;
        parent[root]=-1; depth[root]=1;
        // Parent of root is obviously undefined. Hence -1.
        // Assuming depth of root = 1
        queue[end++]=root;
        while(st<end)
        {
            int node = queue[st++], h = depth[node]+1;
            Iterator<Integer> itr=graph[node].iterator();
            while(itr.hasNext())
            {
                int ch=itr.next();
                if(depth[ch]>0)   // 'ch' is parent of 'node'
                    continue;
                depth[ch]=h; parent[ch]=node;
                queue[end++]=ch;  // Recording the Traversal sequence
            }
        }
    }
}
Now, for the problem, http://codeforces.com/contest/709/problem/E, iterate over each node i, consider it as the root, keep descending down the child which has > N/2 nodes, and try to arrive at a node which has just less than N/2 nodes (closest to N/2 nodes) under it. If removing this node along with all its children makes 'i' the centroid, print '1'; otherwise print '0'. This process can be carried out efficiently, as the 'leftOver' array is already there for you.
Actually, you are detaching the disturbing node (the node which is preventing i from being the centroid) along with its children and attaching it to the i-th node itself. The subtree is guaranteed to have at most N/2 nodes (as checked earlier) and so won't cause a problem now.
Happy Coding..... :)

DFS after removing some edges

I have a graph with one source vertex and a list of edges, where in each iteration one edge from the list is removed from the graph.
For each vertex I have to print the number of the iteration after which it lost its connection to the source vertex, i.e. after which there is no longer a path between the vertex and the source.
My idea is to run the DFS algorithm from the source vertex in each iteration and increment a counter for each vertex that still has a connection to the source vertex, i.e. for which there is still a path between the vertex and the source.
I'm sure there is a better idea than running the DFS algorithm from the source vertex in each iteration, but I don't know how to solve the problem in a better, faster way.
Since you have the whole edge list in advance, you can process it backwards, connecting the graph instead of disconnecting it.
In pseudo-code:
GIVEN:
    edges = list of edges
    outputMap = new empty map from vertex to iteration number
    S = source vertex

// first remove all the edges in the list
for (int i = 0; i < edges.size(); i++) {
    removeEdge(edges[i]);
}

// find vertices that are never disconnected
// use DFS or BFS
foreach vertex reachable from S
{
    outputMap[vertex] = -1;
}

// walk through the edges backward, reconnecting
// the graph
for (int i = edges.size()-1; i >= 0; i--)
{
    Vertex v1 = edges[i].v1;
    Vertex v2 = edges[i].v2;
    Vertex newlyConnected = null;

    // this is for an undirected graph
    // for a directed graph, you only test one way
    // is a new vertex being connected to the source?
    if (outputMap.containsKey(v1) && !outputMap.containsKey(v2))
        newlyConnected = v2;
    else if (outputMap.containsKey(v2) && !outputMap.containsKey(v1))
        newlyConnected = v1;

    if (newlyConnected != null)
    {
        // BFS or DFS again
        foreach vertex reachable from newlyConnected
        {
            // It's easy to calculate the desired remove iteration number
            // from our add iteration number
            outputMap[vertex] = edges.size()-i;
        }
    }
    addEdge(v1,v2);
}

// generate output
foreach entry in outputMap
{
    if (entry.value >= 0)
    {
        print("vertex "+entry.key+" disconnects in iteration "+entry.value);
    }
}
This algorithm achieves linear time, since each vertex is only involved in a single BFS or DFS, before it gets connected to the source.
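For reference, here is a minimal Java sketch of the pseudocode above, assuming an undirected graph over vertices 0..n-1 given as an edge list plus the ordered removal list. The names are illustrative, and the stored iteration number simply mirrors the edges.size()-i convention used in the pseudocode; adjust it to whatever numbering your output requires.

import java.util.*;

class DisconnectIterations {

    static Map<Integer, Integer> solve(int n, List<int[]> graphEdges,
                                       List<int[]> removals, int source) {
        List<Set<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new HashSet<>());
        for (int[] e : graphEdges) { adj.get(e[0]).add(e[1]); adj.get(e[1]).add(e[0]); }

        // first remove all the edges in the removal list
        for (int[] e : removals) { adj.get(e[0]).remove(e[1]); adj.get(e[1]).remove(e[0]); }

        // vertices still reachable from the source are never disconnected: label them -1
        Map<Integer, Integer> outputMap = new HashMap<>();
        markReachable(adj, source, outputMap, -1);

        // walk through the removed edges backwards, reconnecting the graph
        for (int i = removals.size() - 1; i >= 0; i--) {
            int v1 = removals.get(i)[0], v2 = removals.get(i)[1];
            Integer newlyConnected = null;
            if (outputMap.containsKey(v1) && !outputMap.containsKey(v2)) newlyConnected = v2;
            else if (outputMap.containsKey(v2) && !outputMap.containsKey(v1)) newlyConnected = v1;
            adj.get(v1).add(v2);
            adj.get(v2).add(v1);
            if (newlyConnected != null) {
                // label the whole component that just got attached to the source side
                markReachable(adj, newlyConnected, outputMap, removals.size() - i);
            }
        }
        return outputMap;   // vertices mapped to -1 are never disconnected
    }

    // BFS that labels every vertex reachable from 'start' that has not been labelled yet
    static void markReachable(List<Set<Integer>> adj, int start,
                              Map<Integer, Integer> outputMap, int label) {
        Deque<Integer> queue = new ArrayDeque<>();
        outputMap.put(start, label);
        queue.add(start);
        while (!queue.isEmpty()) {
            int v = queue.poll();
            for (int w : adj.get(v)) {
                if (!outputMap.containsKey(w)) {
                    outputMap.put(w, label);
                    queue.add(w);
                }
            }
        }
    }
}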
It helps to reverse time, so that we're thinking about adding edges one by one and determining when connectivity to the source is achieved. Your idea of performing a traversal after each step is a good one. To get the total cost down to linear, you need the following optimization and an amortized analysis. The optimization is that you save the set of visited vertices from traversal to traversal and treat the set as one "supervertex", deleting intra-set edges as they are traversed. The cost of each traversal is proportional to the number of edges thus deleted, hence the amortized linear running time.

Using BFS for topological sort

Can Breadth first Search be used for finding topological sorting of vertices and strongly connected components in a graph?
If yes, how to do that? And if not, why not?
We generally use depth-first search for these problems, but what would be the problem if I tried to implement it using BFS?
Will code like this work?
def top_bfs(start_node):
    queue = [start_node]
    stack = []
    while not queue.empty():
        node = queue.dequeue()
        if not node.visited:
            node.visited = True
            stack.push(node)
            for c in node.children:
                queue.enqueue(c)
    stack.reverse()
    return stack
Yes, you can do topological sorting using BFS. Actually, I remember my teacher once told me that if the problem can be solved by BFS, never choose to solve it by DFS. The logic of BFS is simpler than that of DFS, and most of the time you want a straightforward solution to a problem.
You need to start with the nodes whose indegree is 0, meaning that no other nodes point to them. Be sure to add these nodes to your result first. You can use a HashMap to map every node to its indegree, and a queue, which is commonly seen in BFS, to assist the traversal. When you poll a node from the queue, the indegrees of its neighbors need to be decreased by 1; this is like deleting the node from the graph along with the edges between it and its neighbors. Every time you come across a node with indegree 0, offer it to the queue for checking its neighbors later and add it to the result.
// requires java.util imports: ArrayList, HashMap, LinkedList, Map, Queue
public ArrayList<DirectedGraphNode> topSort(ArrayList<DirectedGraphNode> graph) {
    ArrayList<DirectedGraphNode> result = new ArrayList<>();
    if (graph == null || graph.size() == 0) {
        return result;
    }
    Map<DirectedGraphNode, Integer> indegree = new HashMap<DirectedGraphNode, Integer>();
    Queue<DirectedGraphNode> queue = new LinkedList<DirectedGraphNode>();

    // map each node to its indegree in the HashMap; note that only nodes that are
    // pointed to by at least one other node get mapped, so nodes whose indegree == 0
    // will not appear in the map.
    for (DirectedGraphNode DAGNode : graph) {
        for (DirectedGraphNode nei : DAGNode.neighbors) {
            if (indegree.containsKey(nei)) {
                indegree.put(nei, indegree.get(nei) + 1);
            } else {
                indegree.put(nei, 1);
            }
        }
    }

    // find all nodes with indegree == 0. They should be at the starting positions in the result.
    for (DirectedGraphNode GraphNode : graph) {
        if (!indegree.containsKey(GraphNode)) {
            queue.offer(GraphNode);
            result.add(GraphNode);
        }
    }

    // every time we poll a node from the queue, we effectively delete it from the
    // graph, so we decrease its neighbors' indegree by one; this has the same meaning
    // as deleting the edges from the node to its neighbors.
    while (!queue.isEmpty()) {
        DirectedGraphNode temp = queue.poll();
        for (DirectedGraphNode neighbor : temp.neighbors) {
            indegree.put(neighbor, indegree.get(neighbor) - 1);
            if (indegree.get(neighbor) == 0) {
                result.add(neighbor);
                queue.offer(neighbor);
            }
        }
    }
    return result;
}
The fact that they have similar names doesn't make them similar methods.
DFS is typically implemented with LIFO (a stack if you will) - last in first out.
BFS is typically implemented with FIFO (a queue if you will) - first in first out.
You can walk a graph in any way you want, and eventually come out with a topological order of its nodes. But if you want to do it efficiently, then DFS is the best option, as the topological order of the nodes essentially reflects their depth in the graph (well, "dependency-depth" to be more accurate).
So generally the code for topological sorting using DFS (depth-first search) is much more straightforward: you run it, and since it is recursive it backtracks, assigning numbers as it returns to previous stack frames. BFS is less straightforward but still easy to understand.
First, you must calculate the in-degree of all the vertices in the graph, because you must start at a vertex that has an in-degree of 0.
int[] indegree = new int[adjList.length];
for (int i = 0; i < adjList.length; i++) {
    for (Edge e = adjList[i]; e != null; e = e.next) {
        indegree[e.vertexNum]++;
    }
}
The code above iterates through the vertex array, then it iterates through a single vertex's edges (stored here as a linked list), and then it increments, in the indegree array, the entry of the vertex that the edge points to. So at the end of the outer loop you will have traversed each vertex's neighbors and calculated each vertex's in-degree.
Second, you now must use BFS to actually topologically sort this graph. The next snippet of code only enqueues the vertices in the graph that have an in-degree of 0.
Queue<Integer> q = new ArrayDeque<>();
for (int i = 0; i < indegree.length; i++) {
    if (indegree[i] == 0) {
        q.offer(i);
    }
}
Now, after enqueueing only vertices with in-degree of 0, you start the loop to assign topological numbers.
while (!q.isEmpty()) {
    int vertex = q.poll();
    System.out.print(vertex);
    for (Edge e = adjList[vertex]; e != null; e = e.next) {
        if (--indegree[e.vertexNum] == 0) {
            q.offer(e.vertexNum);
        }
    }
}
The print statement outputs the vertex numbers in topological order. Depending on the requirements of your program, you can replace it with code that stores the vertex numbers or names, or something along those lines. Other than that, I hope this helped answer the first question.
Second Question
As for the second part of the question, it's pretty simple.
1. Create a boolean array filled with false values; this will represent whether each vertex has been visited or not.
2. Create a for loop iterating over the adjList array. Inside this loop you will call bfs; after calling bfs you will iterate over the boolean array you created, checking whether any value is false. If so, the graph is not strongly connected, and you can return "graph is not strongly connected" and end the program. At the end of each iteration of the outer for-loop (but after the inner for-loop), don't forget to reset your boolean array to all false values again.
3. At this point the outer for loop is done and you can return true. It is your job to implement the bfs; it should take an integer and the visited boolean array you created as parameters. A sketch of this check is given below.
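Here is a minimal sketch of the check described in steps 1-3 above, assuming the graph is given as an adjacency list of vertex indices; the class and method names are illustrative. Note that it checks whether the whole graph is strongly connected (it does not compute the individual strongly connected components), and it runs in O(V*(V+E)) time.

import java.util.*;

class StrongConnectivityCheck {
    static boolean isStronglyConnected(List<List<Integer>> adjList) {
        int n = adjList.size();
        for (int start = 0; start < n; start++) {
            boolean[] visited = new boolean[n];     // reset to all false for every start vertex
            bfs(adjList, start, visited);
            for (boolean reached : visited) {
                if (!reached) return false;         // some vertex is unreachable from 'start'
            }
        }
        return true;                                // every vertex reaches every other vertex
    }

    static void bfs(List<List<Integer>> adjList, int start, boolean[] visited) {
        Queue<Integer> q = new ArrayDeque<>();
        visited[start] = true;
        q.add(start);
        while (!q.isEmpty()) {
            int v = q.poll();
            for (int w : adjList.get(v)) {
                if (!visited[w]) {
                    visited[w] = true;
                    q.add(w);
                }
            }
        }
    }
}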

Counting incoming edges in a directed acyclic graph

Given a directed acyclic graph (DAG), is there an algorithm that runs in linear time and counts the in-degree of each vertex (and the sources of those edges), given that we know a root node (a node from which you can reach every other vertex)?
Let's assume your nodes are numbered from 1 to n. There's a simple solution: Create an array D of size n, with values initialized to 0. Then walk through all edges (v, w) and increment D[w] by one. In the end D[w] will be the in-degree of vertex w.
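A small sketch of that counting step, assuming the edges are given as an array of pairs {v, w} (this representation is an assumption, not part of the original answer):

// edges[i] = {v, w} represents a directed edge v -> w; vertices are numbered 1..n
static int[] countInDegrees(int n, int[][] edges) {
    int[] D = new int[n + 1];   // D[w] will hold the in-degree of w; Java zero-initializes the array
    for (int[] e : edges) {
        D[e[1]]++;              // the edge (v, w) adds one incoming edge at w
    }
    return D;
}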
This may be the answer to your problem:
// requires java.util imports: ArrayList, List
public List<Integer> findIndegree(int n, List<List<Integer>> edges)
{
    List<Integer> result = new ArrayList<>();
    int[] in_deg = new int[n];
    for (int u = 0; u < n; u++)
    {
        for (int x : edges.get(u))
            in_deg[x]++;
    }
    for (int d : in_deg)
        result.add(d);
    return result;
}
Here the array in_deg stores the in-degree of each vertex, and the method returns these counts as a list.

Topological search and Breadth first search

Is it possible to use Breadth first search logic to do a topological sort of a DAG?
The solution in Cormen makes use of depth-first search, but wouldn't it be easier to use BFS?
Reason:
BFS visits all the nodes in a particular depth before visiting nodes with the next depth value. It naturally means that the parents will be listed before the children if we do a BFS. Isn't this exactly what we need for a topological sort?
A mere BFS is only sufficient for a tree (or a forest of trees), because in trees the in-degrees are at most 1.
Now, look at this case:
B → C → D
        ↗
      A
A BFS where the queue is initialized to A B (whose in-degrees are zero) will return A B D C, which is not topologically sorted. That's why you have to maintain an in-degree count, and only pick nodes whose count has dropped to zero. (*)
BTW, this is the flaw in your 'reason': BFS only guarantees that one parent has been visited before, not all of them.
Edit: (*) In other words, you push back adjacent nodes whose in-degree has dropped to zero (in the example, after processing A, D would be skipped). So, you're still using a queue and you've just added a filtering step to the general algorithm. That being said, continuing to call it a BFS is questionable.
It is possible; even Wikipedia describes an algorithm based on BFS.
Basically, you use a queue in which you insert all nodes with no incoming edges. Then, when you extract a node, you remove all of its outgoing edges and insert the nodes reachable from it that have no other incoming edges.
In a BFS all of the edges you actually walk will end up in the correct direction. But all the edges you don't walk (those between nodes at the same depth, or those from deeper nodes back up to earlier nodes) will end up going the wrong way if you lay out the graph in BFS order.
Yes, you really need DFS to do it.
