How to reduce a strongly connected component to one vertex?

From https://algs4.cs.princeton.edu/42digraph/
Reachable vertex in a digraph. Design a linear-time algorithm to determine whether a digraph has a vertex that is reachable from
every other vertex.
The Kosaraju-Sharir algorithm gives us the strongly connected components. Java code for that can be seen here. After reducing each SCC to a single vertex, a vertex that has outdegree zero is reachable from every other vertex, provided it is the only vertex with outdegree zero.
Problem is, everyone seems to be talking about reducing a SCC without providing details. What is an efficient algorithm to do so?

Following is a Java solution to my own question. For the graph representation, it uses edu.princeton.cs:algs4:1.0.3 from https://github.com/kevin-wayne/algs4. There appear to be general algorithms for graph contraction, as outlined in this paper; however, for my purposes, the following is sufficient.
/**
* 43. <b>Reachable vertex.</b>
* <p>
* DAG: Design a linear-time algorithm to determine whether a DAG has a vertex that is reachable from every other
* vertex, and if so, find one.
* Digraph: Design a linear-time algorithm to determine whether a digraph has a vertex that is reachable from every
* other vertex, and if so, find one.
* <p>
* Answer:
* DAG: Consider an edge (u, v) ∈ E. Since the graph is acyclic, u is not reachable from v.
* Thus u cannot be the solution to the problem. From this it follows that only a vertex of
* outdegree zero can be a solution. Furthermore, there has to be exactly one vertex with outdegree zero,
* or the problem has no solution. This is because if there were multiple vertices with outdegree zero,
* they wouldn't be reachable from each other.
* <p>
* Digraph: Reduce the graph to its kernel DAG, then find a vertex of outdegree zero.
*/
public class Scc {
private final Digraph g;
private final Stack<Integer> s = new Stack<>();
private final boolean marked[];
private final Digraph r;
private final int[] scc;
private final Digraph kernelDag;
public Scc(Digraph g) {
this.g = g;
this.r = g.reverse();
marked = new boolean[g.V()];
scc = new int[g.V()];
Arrays.fill(scc, -1);
for (int v = 0; v < r.V(); v++) {
if (!marked[v]) visit(v);
}
int i = 0;
while (!s.isEmpty()) {
int v = s.pop();
if (scc[v] == -1) visit(v, i++);
}
Set<Integer> vPrime = new HashSet<>();
Set<Map.Entry<Integer, Integer>> ePrime = new HashSet<>();
for (int v = 0; v < scc.length; v++) {
vPrime.add(scc[v]);
for (int w : g.adj(v)) {
// no self-loops, no parallel edges
if (scc[v] != scc[w]) {
ePrime.add(new SimpleImmutableEntry<>(scc[v], scc[w]));
}
}
}
kernelDag = new Digraph(vPrime.size());
for (Map.Entry<Integer, Integer> e : ePrime) kernelDag.addEdge(e.getKey(), e.getValue());
}
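// Returns the kernel-DAG vertex (SCC id) reachable from all others, or -1.
public int reachableFromAllOther() {
int sink = -1;
for (int v = 0; v < kernelDag.V(); v++) {
if (kernelDag.outdegree(v) == 0) {
// two vertices of outdegree zero cannot reach each other
if (sink != -1) return -1;
sink = v;
}
}
return sink;
}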
// reverse postorder
private void visit(int v) {
marked[v] = true;
for (int w : r.adj(v)) {
if (!marked[w]) visit(w);
}
s.push(v);
}
private void visit(int v, int i) {
scc[v] = i;
for (int w : g.adj(v)) {
if (scc[w] == -1) visit(w, i);
}
}
}
Running it on the graph below produces the strongly-connected components as shown. Vertex 0 in the reduced DAG is reachable from every other vertex.
What I couldn't find anywhere is the kind of detail that I presented above. Comments like "well, this is easy, you do that, then you do something else" are thrown around without concrete details.
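For comparison, here is a minimal Python sketch of the same pipeline: Kosaraju-Sharir, then an outdegree check on the kernel DAG. The function names and the edge-list input format are assumptions made for this illustration and are not part of algs4. The sketch does not build the kernel DAG explicitly; it only records which components have an outgoing edge, which is all the outdegree test needs.

```python
def strongly_connected_components(n, edges):
    """Kosaraju-Sharir. Returns (comp, count) where comp[v] is the SCC id of v."""
    adj = [[] for _ in range(n)]
    radj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        radj[v].append(u)

    # Pass 1: postorder of a DFS on the reversed graph (iterative, so long
    # paths do not hit Python's recursion limit).
    seen = [False] * n
    order = []
    for s in range(n):
        if seen[s]:
            continue
        seen[s] = True
        stack = [(s, iter(radj[s]))]
        while stack:
            v, successors = stack[-1]
            for w in successors:
                if not seen[w]:
                    seen[w] = True
                    stack.append((w, iter(radj[w])))
                    break
            else:
                # all successors handled: v is finished
                order.append(v)
                stack.pop()

    # Pass 2: search the original graph in reverse postorder; every search
    # collects exactly one strongly connected component.
    comp = [-1] * n
    count = 0
    for s in reversed(order):
        if comp[s] != -1:
            continue
        comp[s] = count
        stack = [s]
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if comp[w] == -1:
                    comp[w] = count
                    stack.append(w)
        count += 1
    return comp, count


def vertex_reachable_from_all(n, edges):
    """Return a vertex reachable from every other vertex, or -1 if none exists."""
    comp, count = strongly_connected_components(n, edges)
    # A component has outdegree > 0 in the kernel DAG iff some edge leaves it.
    has_out = [False] * count
    for u, v in edges:
        if comp[u] != comp[v]:
            has_out[comp[u]] = True
    sinks = [c for c in range(count) if not has_out[c]]
    if len(sinks) != 1:
        return -1
    return comp.index(sinks[0])  # any vertex of the unique sink component
```

Both passes and the final scan touch each vertex and edge a constant number of times, so the whole check stays linear in |V| + |E|.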

Suppose you already have a method to compute SCCs and the usual graph, vertex and edge methods. Then it's just creating a new graph, adding a vertex representative for each SCC and then adding edge representatives.
For the edges you need to be able to map an original vertex (the edge destination) to its representative in the new graph. You can model that in the first pass using a Map<Vertex, SCC> which maps vertices to their SCCs and a Map<SCC, Vertex> which maps SCCs to their representative vertices in the new graph. Or you directly have a Map<Vertex, Vertex> mapping original vertices to their representatives.
Here is a Java solution:
public static Graph graphToSccGraph(Graph graph) {
Collection<SCC> sccs = SccComputation.computeSccs(graph);
Graph sccGraph = new Graph();
Map<Vertex, SCC> vertexToScc = new HashMap<>();
Map<SCC, Vertex> sccToRep = new HashMap<>();
// Add a representative for each SCC (O(|V|))
for (SCC scc : sccs) {
Vertex rep = new Vertex();
sccGraph.addVertex(rep);
sccToRep.put(scc, rep);
for (Vertex vertex : scc.getVertices()) {
vertexToScc.put(vertex, scc);
}
}
// Add edge representatives (O(|E|))
for (Vertex vertex : graph.getVertices()) {
Vertex sourceRep = sccToRep.get(vertexToScc.get(vertex));
for (Edge edge : vertex.getOutgoingEdges()) {
Vertex destRep = sccToRep.get(vertexToScc.get(edge.getDestination()));
Edge edgeRep = new Edge(sourceRep, destRep);
if (!sccGraph.contains(edgeRep)) {
sccGraph.addEdge(edgeRep);
}
}
}
return sccGraph;
}
Time complexity is linear in the size of the graph (number of vertices and edges), so optimal. That is Theta(|V| + |E|), assuming that the map lookups and sccGraph.contains take expected constant time.
Usually people use a Union-Find (see Wikipedia) data-structure to make this even simpler and get rid of the Maps.
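The second variant described above, a single map from each original vertex to its representative, can be sketched in a few lines of Python. The dict-based graph format and the name condense are assumptions for this illustration; the rep map is assumed to come from whatever SCC algorithm you already have.

```python
def condense(graph, rep):
    """Contract every SCC of `graph` to its representative vertex.

    graph: dict mapping each vertex to an iterable of its successors.
    rep:   dict mapping each vertex to the representative of its SCC
           (assumed to be computed beforehand by any SCC algorithm).
    Returns a dict mapping each representative to a set of successor
    representatives.
    """
    condensed = {r: set() for r in set(rep.values())}
    for v, successors in graph.items():
        for w in successors:
            if rep[v] != rep[w]:               # drop edges inside one SCC
                condensed[rep[v]].add(rep[w])  # the set removes parallel edges
    return condensed
```

Storing the successors in a set plays the role of the contains check in the Java code: duplicates are dropped in expected constant time per edge.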

Related

How to increase efficiency of Prim's algorithm used in finding minimum spanning tree from adjacency matrix of an undirected graph?

I have implemented an undirected graph using an adjacency matrix. Now I want to find the edges of the minimum spanning tree using Prim's algorithm (with a priority queue). I did that using the classic method, which gives correct results but is highly inefficient on larger data sets (more vertices and more connections between them).
This is the implementation of Prim's algorithm using a priority queue that I based my code on. (This code is from the GeeksforGeeks site; the code I wrote was inspired by it.)
void Graph::primMST()
{
// Create a priority queue to store vertices that
// are being added to the MST. This is weird syntax in C++.
// Refer below link for details of this syntax
// http://geeksquiz.com/implement-min-heap-using-stl/
priority_queue< iPair, vector <iPair> , greater<iPair> > pq;
int src = 0; // Taking vertex 0 as source
// Create a vector for keys and initialize all
// keys as infinite (INF)
vector<int> key(V, INF);
// To store parent array which in turn store MST
vector<int> parent(V, -1);
// To keep track of vertices included in MST
vector<bool> inMST(V, false);
// Insert source itself in priority queue and initialize
// its key as 0.
pq.push(make_pair(0, src));
key[src] = 0;
/* Looping till priority queue becomes empty */
while (!pq.empty())
{
// The first vertex in pair is the minimum key
// vertex, extract it from priority queue.
// vertex label is stored in second of pair (it
// has to be done this way to keep the vertices
// sorted by key (key must be the first item
// in pair)
int u = pq.top().second;
pq.pop();
//Different key values for same vertex may exist in the priority queue.
//The one with the least key value is always processed first.
//Therefore, ignore the rest.
if(inMST[u] == true){
continue;
}
inMST[u] = true; // Include vertex in MST
// 'i' is used to get all adjacent vertices of a vertex
list< pair<int, int> >::iterator i;
for (i = adj[u].begin(); i != adj[u].end(); ++i)
{
// Get vertex label and weight of current adjacent
// of u.
int v = (*i).first;
int weight = (*i).second;
// If v is not in MST and weight of (u,v) is smaller
// than current key of v
if (inMST[v] == false && key[v] > weight)
{
// Updating key of v
key[v] = weight;
pq.push(make_pair(key[v], v));
parent[v] = u;
}
}
}
// Print edges of MST using parent array
for (int i = 1; i < V; ++i)
printf("%d - %d\n", parent[i], i);
}
Thanks in advance.
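One option worth considering for an adjacency matrix: finding the neighbours of a vertex already costs a scan over a full matrix row, so the priority queue cannot bring the total below about V^2 steps. The array-based variant of Prim's algorithm drops the queue and picks the next vertex with a linear scan, for O(V^2) overall. The sketch below is a minimal illustration under stated assumptions: the function name is hypothetical, and a missing edge is encoded as math.inf.

```python
import math


def prim_dense(weights):
    """Prim's algorithm on an adjacency matrix in O(V^2) time.

    weights[u][v] is the weight of edge u-v, or math.inf if there is no
    edge (this encoding is an assumption of the sketch).
    Returns parent, where parent[v] is the MST neighbour of v on the side
    of vertex 0, and -1 for vertex 0 or for vertices not connected to it.
    """
    n = len(weights)
    key = [math.inf] * n      # cheapest known edge into each vertex
    parent = [-1] * n
    in_mst = [False] * n
    if n:
        key[0] = 0
    for _ in range(n):
        # A linear scan replaces the priority queue.
        u = -1
        for v in range(n):
            if not in_mst[v] and (u == -1 or key[v] < key[u]):
                u = v
        if key[u] == math.inf:
            break             # the remaining vertices are unreachable
        in_mst[u] = True
        for v in range(n):
            if not in_mst[v] and weights[u][v] < key[v]:
                key[v] = weights[u][v]
                parent[v] = u
    return parent
```

For sparse graphs stored as adjacency lists, the priority-queue version shown in the question remains the better fit.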

Detected Cycle in directed graph if the vertex is found in recursive stack-why?

I have read an article from here about how to detect a cycle in a directed graph. The basic concept of this algorithm is that if a node is found in the recursion stack, then there is a cycle, but I don't understand why. What is the logic here?
#include<iostream>
#include <list>
#include <limits.h>
using namespace std;
class Graph
{
int V; // No. of vertices
list<int> *adj; // Pointer to an array containing adjacency lists
bool isCyclicUtil(int v, bool visited[], bool *rs);
public:
Graph(int V); // Constructor
void addEdge(int v, int w); // to add an edge to graph
bool isCyclic(); // returns true if there is a cycle in this graph
};
Graph::Graph(int V)
{
this->V = V;
adj = new list<int>[V];
}
void Graph::addEdge(int v, int w)
{
adj[v].push_back(w); // Add w to v’s list.
}
bool Graph::isCyclicUtil(int v, bool visited[], bool *recStack)
{
if(visited[v] == false)
{
// Mark the current node as visited and part of recursion stack
visited[v] = true;
recStack[v] = true;
// Recur for all the vertices adjacent to this vertex
list<int>::iterator i;
for(i = adj[v].begin(); i != adj[v].end(); ++i)
{
if ( !visited[*i] && isCyclicUtil(*i, visited, recStack) )
return true;
else if (recStack[*i])
return true;
}
}
recStack[v] = false; // remove the vertex from recursion stack
return false;
}
bool Graph::isCyclic()
{
// Mark all the vertices as not visited and not part of recursion
// stack
bool *visited = new bool[V];
bool *recStack = new bool[V];
for(int i = 0; i < V; i++)
{
visited[i] = false;
recStack[i] = false;
}
for(int i = 0; i < V; i++)
if (isCyclicUtil(i, visited, recStack))
return true;
return false;
}
int main()
{
// Create a graph given in the above diagram
Graph g(4);
g.addEdge(0, 1);
g.addEdge(0, 2);
g.addEdge(1, 2);
g.addEdge(2, 0);
g.addEdge(2, 3);
g.addEdge(3, 3);
if(g.isCyclic())
cout << "Graph contains cycle";
else
cout << "Graph doesn't contain cycle";
return 0;
}
From a brief look, the code snippet is an implementation of depth-first search, which is a basic search technique for directed graphs. Note that isCyclic starts a search from every vertex, and isCyclicUtil skips vertices that were already visited, so the test also covers graphs in which not every vertex is reachable from the first one.
That being said, the technique works by choosing one node at will and starting a recursive search there. If the search follows an edge to a node that is currently in the recursion stack, there must be a cycle: every node in the stack is an ancestor of the current node, so there is a path from that node down to the current node, and the edge just found leads back to it.
In the current implementation, recStack is not actually the stack; it just indicates whether a specific node is currently in the stack, and no sequence information is stored. The actual cycle is contained implicitly in the call stack: it is part of the sequence of nodes for which the calls of isCyclicUtil have not yet returned. If the actual cycle has to be extracted, the implementation must be changed.
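As a rough illustration of such a change, the Python sketch below keeps an explicit ordered copy of the recursion stack next to the boolean flags, so the cycle can be read off when an edge into the stack is found. The function name and the list-of-lists graph format are assumptions for this example, not a translation of the C++ code.

```python
def find_cycle(adj):
    """Return one directed cycle as a list of vertices, or None.

    adj: list of lists, adj[v] holds the successors of v.
    """
    n = len(adj)
    visited = [False] * n
    on_stack = [False] * n   # same role as recStack in the C++ code
    path = []                # ordered copy of the recursion stack

    def dfs(v):
        visited[v] = True
        on_stack[v] = True
        path.append(v)
        for w in adj[v]:
            if on_stack[w]:
                # w is an ancestor of v: the stack from w to v plus the
                # edge v -> w closes a cycle
                return path[path.index(w):] + [w]
            if not visited[w]:
                cycle = dfs(w)
                if cycle:
                    return cycle
        on_stack[v] = False
        path.pop()
        return None

    for s in range(n):
        if not visited[s]:
            cycle = dfs(s)
            if cycle:
                return cycle
    return None


# The graph from main() in the question: prints [0, 1, 2, 0]
print(find_cycle([[1, 2], [2], [0, 3], [3]]))
```

The recursion is kept for readability; on very deep graphs an iterative version would be needed to stay within Python's recursion limit.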
So essentially, what this is saying is that if a node leads back to itself, there is a cycle. This makes sense if you think about it!
Say we start at node1.
{node1 -> node2}
{node2 -> node3}
{node3 -> node4
node3 -> node1}
{node4 -> end}
{node1 -> node2}
{node2 -> node3}.....
This is a small graph that contains a cycle. As you can see, we traverse the graph, going from each node to the next. In some cases we reach an end, but even if we reach the end, our code wants to go back to the other branch off of node3 so that it can check its next node. This node then leads back to node1.
This will happen forever if we let it, because the path starting at node1 leads back to itself. We are recursively putting each node we visit on the stack, and if we reach an end, we remove all of the nodes from the stack AFTER the branch. In our case, we would be removing node4 from the stack every time we hit the end, but the rest of the nodes would stay on the stack because of the branch off of node3.
Hope this helps!

Count number of cycles in directed graph using DFS

I want to count total number of directed cycles available in a directed graph (Only count is required).
You can assume graph is given as adjacency matrix.
I know DFS but could not make a working algorithm for this problem.
Please provide some pseudo code using DFS.
This algorithm based on DFS seems to work, but I don't have a proof.
This algorithm is modified from the dfs for topological sorting
(https://en.wikipedia.org/wiki/Topological_sorting#Depth-first_search).
class Solution {
vector<Edge> edges;
// graph[vertex_id] -> vector of index of outgoing edges from #vertex_id.
vector<vector<int>> graph;
vector<bool> mark;
vector<bool> pmark;
int cycles;
void dfs(int node) {
if (pmark[node]) {
return;
}
if (mark[node]) {
cycles++;
return;
}
mark[node] = true;
// Try all outgoing edges.
for (int edge_index : graph[node]) {
dfs(edges[edge_index].to);
}
pmark[node] = true;
mark[node] = false;
}
int CountCycles() {
// Build graph.
// ...
cycles = 0;
mark = vector<bool>(graph.size(), false);
pmark = vector<bool>(graph.size(), false);
for (int i = 0; i < (int) graph.size(); i++) {
dfs(i);
}
return cycles;
}
};
Let us color the nodes with three colors. A node that is yet to be discovered is white. A node that has been discovered but whose DFS call has not finished yet (some of its descendants may still be undiscovered) is grey. Otherwise its color is black. While doing DFS, if we find an edge from the current grey node to another grey node, the graph has a cycle. Counting the number of times this happens gives the number of such back edges. Note that this equals the number of cycles only when every cycle is closed by its own back edge; in general one back edge can close several cycles, so the count is a lower bound on the number of simple cycles.
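The coloring scheme can be sketched in Python as follows. The function name and the list-of-lists input are assumptions for this illustration. The function counts edges that lead to a grey vertex, which is the quantity the description above refers to.

```python
WHITE, GREY, BLACK = 0, 1, 2


def count_back_edges(adj):
    """Count edges that point to a grey vertex during DFS.

    adj: list of lists, adj[v] holds the successors of v.
    """
    color = [WHITE] * len(adj)
    back_edges = 0

    def dfs(v):
        nonlocal back_edges
        color[v] = GREY                # discovered, call still running
        for w in adj[v]:
            if color[w] == GREY:
                back_edges += 1        # edge into the current DFS path
            elif color[w] == WHITE:
                dfs(w)
        color[v] = BLACK               # call finished

    for s in range(len(adj)):
        if color[s] == WHITE:
            dfs(s)
    return back_edges
```

A result of zero means the graph is acyclic. A positive result should be read with care: for the graph with edges 0→1, 0→2, 1→2, 2→0, 2→3, 3→3 the function returns 2, while the graph has three simple cycles (0→1→2→0, 0→2→0 and 3→3), because the back edge 2→0 closes two of them.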

Dijkstra algorithm optimization regarding priority queue

I use the code below in a simulation. Because I call the dijkstra method over and over, performance is crucial for me. I use a PriorityQueue to keep the nodes of the graph in ascending order of their distance to the source. The PriorityQueue lets me retrieve the node with the smallest distance in O(log n) time. However, to keep the nodes in order after recalculating a node's distance, I need to first remove the node and then add it again. I suppose there may be a better way. I would appreciate any feedback. Thanks in advance to the whole community.
public HashMap<INode, Double> getSingleSourceShortestDistance(INode sourceNode) {
HashMap<INode, Double> distance = new HashMap<>();
PriorityQueue<INode> pq;
// The nodes are stored in a priority queue in which all nodes are sorted
// according to their estimated distances.
INode u = null;
INode v = null;
double alt;
Set<INode> nodeset = nodes.keySet();
Iterator<INode> iter = nodeset.iterator();
//Mark all nodes with infinity
while (iter.hasNext()) {
INode node = iter.next();
distance.put(node, Double.POSITIVE_INFINITY);
previous.put(node, null);
}
iter = null;
// Mark the distance[source] as 0
distance.put(sourceNode, 0d);
pq = new PriorityQueue<>(this.network.getNodeCount(), new NodeComparator(distance));
pq.addAll(nodeset);
// Loop while the queue is not empty
while (!pq.isEmpty()) {
// Fetch the node with the smallest estimated distance.
u = pq.peek();
/**
* break the loop if the distance is greater than the max net size.
* That shows that the nodes in the queue can not be reached from
* the source node.
*/
if ((Double.isInfinite(distance.get(u).doubleValue()))) {
break;
}
// Remove the node with the smallest estimated distance.
pq.remove(u);
// Iterate over all nodes (v) which are neighbors of node u
iter = nodes.get(u).keySet().iterator();
while (iter.hasNext()) {
v = (INode) iter.next();
alt = distance.get(u) + nodes.get(u).get(v).getDistance();
if (alt < distance.get(v)) {
distance.put(v, alt);
//To reorder the queue node v is first removed and then inserted.
pq.remove(v);
pq.add(v);
}
}
}
return distance;
}
protected static class NodeComparator<INode> implements Comparator<INode> {
private Map<INode, Number> distances;
protected NodeComparator(Map<INode, Number> distances) {
this.distances = distances;
}
@Override
public int compare(INode node1, INode node2) {
return ((Double) distances.get(node1)).compareTo((Double) distances.get(node2));
}
}
You could use a Heap with increase_key and decrease_key implemented, so you could update the node distance without removing and adding it again.
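If no heap with decrease_key is at hand, a different and commonly used technique is lazy deletion: push a fresh entry whenever a distance improves and skip outdated entries when they are popped. This avoids remove(Object), which takes linear time in java.util.PriorityQueue. The Python sketch below illustrates the idea with heapq; the dict-of-dicts graph format and the function name are assumptions made for the example.

```python
import heapq
import math


def dijkstra(graph, source):
    """Single-source shortest distances using lazy deletion.

    graph: dict mapping node -> dict of neighbour -> non-negative length.
           Every node must appear as a key (assumption of this sketch).
    """
    dist = {node: math.inf for node in graph}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale entry: u was already settled with a smaller distance
        for v, length in graph[u].items():
            alt = d + length
            if alt < dist[v]:
                dist[v] = alt
                # push a new entry instead of removing and re-adding v
                heapq.heappush(heap, (alt, v))
    return dist
```

The heap can hold up to one entry per edge, so the running time is O(E log E), which is the same order as O(E log V). Ties on distance are broken by comparing the nodes themselves, so the nodes must be comparable (strings or integers work) or a counter must be added to each tuple.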

Diameter of a rooted k-ary tree

I'm trying to find a linear-time algorithm using recursion to solve the diameter problem for a rooted k-ary tree implemented with adjacency lists. The diameter of a tree is the maximum distance between any pair of leaves. If I choose a root r (that is, a node whose degree is > 1), it can be shown that the diameter is either the maximum distance between two leaves in the same subtree or the maximum distance between two leaves joined by a path that goes through r. My pseudocode for this problem:
Tree-Diameter(T,r)
    if degree[r] = 1 then
        height[r] = 0
        return 0
    for each v in Adj[r] do
        for i = 1 to degree[r] - 1 do
            d_i = Tree-Diameter(T,v)
    height[r] = max_{v in Adj[r]} (height[v])
    return max(d_i, max_{v in V} (height[v]) + secmax_{v in V} (height[v], 0) + 1)
To get linear time, I compute the diameter AND the height of each subtree at the same time. Then I take the maximum of the diameters of the subtrees and the sum of the two biggest heights of the tree + 1 (the secmax function chooses between height[v] and 0 because some subtree can have only one child; in this case, the second biggest height is 0). Does this algorithm work, and if not, what are the problems? I tried to generalize an algorithm that solves the same problem for a binary tree, but I don't know if it's a good generalization.
Any help is appreciated! Thanks in advance!
For any tree, you can find the diameter as follows:
Select an arbitrary node A and run BFS from it to find the node farthest from A. Name this node S.
Now run BFS starting from S and find the node farthest from S. Name it D.
The path between S and D is a diameter of your tree. This algorithm is O(n) and traverses the tree just twice. The proof is a little tricky but not hard (try it yourself, or if you think it is not true, I'll write it later). Be careful: I'm talking about trees, not general graphs (a tree is connected and has no cycles).
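The two BFS passes can be sketched in Python like this. The names farthest and tree_diameter and the undirected dict-based adjacency format are assumptions for this example; the tree is assumed to be non-empty.

```python
from collections import deque


def farthest(adj, start):
    """BFS from start; return (a farthest vertex, its distance in edges)."""
    dist = {start: 0}
    queue = deque([start])
    last = start
    while queue:
        v = queue.popleft()
        # BFS dequeues vertices in nondecreasing distance, so the last
        # vertex dequeued is one of the farthest.
        last = v
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return last, dist[last]


def tree_diameter(adj):
    """adj: dict mapping each vertex to its neighbours in an undirected tree.

    Returns (S, D, length of the path between S and D in edges).
    """
    a = next(iter(adj))           # arbitrary start vertex A
    s, _ = farthest(adj, a)       # first pass: find S
    d, length = farthest(adj, s)  # second pass: find D
    return s, d, length
```

For the tree with edges 0-1, 1-2, 1-3 and 3-4, the first pass from vertex 0 ends at vertex 4, and the second pass from 4 reports a diameter of 3 edges.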
This is a python implementation of what I believe you are interested in. Here, a tree is represented as a list of child trees.
def process(tree):
    max_child_height=0
    secmax_child_height=0
    max_child_diameter=0
    for child in tree:
        child_height,child_diameter=process(child)
        if child_height>max_child_height:
            secmax_child_height=max_child_height
            max_child_height=child_height
        elif child_height>secmax_child_height:
            secmax_child_height=child_height
        if child_diameter>max_child_diameter:
            max_child_diameter=child_diameter
    height=max_child_height+1
    if len(tree)>1:
        diameter=max(max_child_diameter,max_child_height+secmax_child_height)
    else:
        diameter=max_child_diameter
    return height,diameter
def diameter(tree):
    height,diameter=process(tree)
    return diameter
This is the recursive solution with Java.
import java.util.ArrayList;
import java.util.List;
public class DiameterOrNAryTree {
public int diameter(Node root) {
Result result = new Result();
getDepth(root, result);
return result.max;
}
private int getDepth(Node node, Result result) {
if (node == null) return 0;
int h1 = 0, h2 = 0;
for (Node c : node.children) {
int d = getDepth(c, result);
if (d > h1) {
h2 = h1;
h1 = d;
} else if (d > h2) h2 = d;
}
result.max = Math.max(result.max, h1 + h2);
return h1 + 1;
}
class Result {
int max;
Result() {
max = 0;
}
}
class Node {
public int val;
public List<Node> children;
public Node() {
children = new ArrayList<Node>();
}
public Node(int _val) {
val = _val;
children = new ArrayList<Node>();
}
public Node(int _val, ArrayList<Node> _children) {
val = _val;
children = _children;
}
}
}
