Dijkstra's algorithm with a priority queue

When writing Dijkstra's algorithm with a priority queue, why do we not check whether a node has already been visited?
while (!pq.empty())
{
    int u = pq.top().second;
    pq.pop();
    // Process all vertices adjacent to u.
    for (auto x : adj[u])
    {
        int v = x.first;
        int weight = x.second;
        // Relax the edge (u, v): if going through u is shorter,
        // record the better distance and push the new candidate.
        if (dist[v] > dist[u] + weight)
        {
            dist[v] = dist[u] + weight;
            pq.push(make_pair(dist[v], v));
        }
    }
}

It does check the previous value of the node via dist[v], which stores the current best known distance from the source to node v. If a new path to v is found that is shorter than the previous best, v is reinserted into the priority queue, because it may now provide shorter paths to other nodes. If the new distance to v is longer than the previous one, it is left alone. This is why there is no else branch in the implementation.
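For reference, here is a minimal sketch (my own, not from the question) of the common variant that does make the check explicit: when a vertex is popped with a key larger than its current best distance, that queue entry is stale and can be skipped. The result is the same; it just avoids re-scanning the adjacency lists of already-settled vertices.

#include <bits/stdc++.h>
using namespace std;

// adj[u] holds (v, weight) pairs; returns shortest distances from src.
vector<long long> dijkstra(int n, const vector<vector<pair<int,int>>>& adj, int src) {
    const long long INF = LLONG_MAX / 4;
    vector<long long> dist(n, INF);
    // Min-heap keyed by tentative distance.
    priority_queue<pair<long long,int>, vector<pair<long long,int>>, greater<>> pq;
    dist[src] = 0;
    pq.push({0, src});
    while (!pq.empty()) {
        auto [d, u] = pq.top();
        pq.pop();
        if (d > dist[u]) continue;   // stale entry: u was already settled with a smaller key
        for (auto [v, w] : adj[u]) {
            if (dist[v] > d + w) {
                dist[v] = d + w;
                pq.push({dist[v], v});
            }
        }
    }
    return dist;
}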


Check if a node is present in Path from A to B in tree

There will be many queries. Each query (A, B, K) asks whether a node with value K lies on the path from A to B. The solution is expected not to exceed O(n + q log q), where n is the node count and q the query count.
I have a solution in mind; I am posting it below. I want to know what other approaches there are.
My approach:
Find the LCA (lowest common ancestor) of A and B. Check if K is an ancestor of A or B. If so, check whether the LCA is an ancestor of K; if it is, output yes. To check whether one vertex is an ancestor of another, we can test whether a vertex lies in the subtree of the other. (This can be done in O(1) if we preprocess each node's in/out visiting order in a DFS: https://www.geeksforgeeks.org/printing-pre-and-post-visited-times-in-dfs-of-a-graph/ )
But the complexity increases if all queries have the same K value: we would then need to check every node with value K whose in/out times bracket A or B. To optimize that, we can sort the nodes with value K by their DFS in/out times.
Any thoughts?
There are the following cases in which R lies on the path between U and V:
R is the lowest common ancestor of U and V.
R is on the path between LCA(U, V) and U.
R is on the path between LCA(U, V) and V.
// Function that returns true if R
// lies on the path between U
// and V in the given tree
bool isPresent(int U, int V, int R)
{
    // LCA between U and V
    int LCA = lowestCommonAncestor(U, V);
    // LCA between U and R
    int LCA_1 = lowestCommonAncestor(U, R);
    // LCA between V and R
    int LCA_2 = lowestCommonAncestor(V, R);
    if (LCA == R || (LCA_1 == LCA && LCA_2 == R) ||
        (LCA_2 == LCA && LCA_1 == R)) {
        return true;
    }
    return false;
}
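Both the approach in the question and the answer above rely on fast ancestor tests. As a sketch of the O(1) in/out-time test (my own code; the names tin, tout, and dfs are not from the original posts): a vertex a is an ancestor of b exactly when a's DFS entry/exit interval contains b's.

#include <bits/stdc++.h>
using namespace std;

vector<vector<int>> adj;   // tree adjacency list
vector<int> tin, tout;     // DFS entry/exit times
int timer_ = 0;

void dfs(int u, int parent) {
    tin[u] = timer_++;
    for (int v : adj[u])
        if (v != parent)
            dfs(v, u);
    tout[u] = timer_++;
}

// True if a is an ancestor of b (a == b counts as an ancestor here).
bool isAncestor(int a, int b) {
    return tin[a] <= tin[b] && tout[b] <= tout[a];
}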

Why don't we take the minimum of the low times of the child and parent nodes when finding articulation points using a time variable?

One approach to finding articulation points is to maintain the discovery time of each node. In a disc[] array we keep the discovery time of each vertex, and in another array low[] we keep the lowest discovery time reachable from the subtree rooted at that vertex.
What we have done is: recursively call the function for each adjacent node that is not yet visited, and if it is already visited, take the minimum of low[u] and disc[v], where u is the parent of v.
Why is this not min(low[u], low[v])?
Here is an implementation of the above algorithm.
// A Java program to find articulation points in an undirected graph
import java.io.*;
import java.util.*;
import java.util.LinkedList;

// This class represents an undirected graph using adjacency list
// representation
class Graph
{
    private int V;   // No. of vertices

    // Array of lists for adjacency list representation
    private LinkedList<Integer> adj[];
    int time = 0;
    static final int NIL = -1;

    // Constructor
    Graph(int v)
    {
        V = v;
        adj = new LinkedList[v];
        for (int i = 0; i < v; ++i)
            adj[i] = new LinkedList();
    }

    // Function to add an edge into the graph
    void addEdge(int v, int w)
    {
        adj[v].add(w);   // Add w to v's list.
        adj[w].add(v);   // Add v to w's list.
    }

    // A recursive function that finds articulation points using DFS
    // u --> the vertex to be visited next
    // visited[] --> keeps track of visited vertices
    // disc[] --> stores discovery times of visited vertices
    // parent[] --> stores parent vertices in the DFS tree
    // ap[] --> stores articulation points
    void APUtil(int u, boolean visited[], int disc[],
                int low[], int parent[], boolean ap[])
    {
        // Count of children in the DFS tree
        int children = 0;

        // Mark the current node as visited
        visited[u] = true;

        // Initialize discovery time and low value
        disc[u] = low[u] = ++time;

        // Go through all vertices adjacent to this one
        Iterator<Integer> i = adj[u].iterator();
        while (i.hasNext())
        {
            int v = i.next();   // v is the current adjacent of u

            // If v is not visited yet, then make it a child of u
            // in the DFS tree and recur for it
            if (!visited[v])
            {
                children++;
                parent[v] = u;
                APUtil(v, visited, disc, low, parent, ap);

                // Check if the subtree rooted at v has a connection to
                // one of the ancestors of u
                low[u] = Math.min(low[u], low[v]);

                // u is an articulation point in the following cases:
                // (1) u is the root of the DFS tree and has two or more children.
                if (parent[u] == NIL && children > 1)
                    ap[u] = true;

                // (2) u is not the root, and the low value of one of its
                // children is at least the discovery value of u.
                if (parent[u] != NIL && low[v] >= disc[u])
                    ap[u] = true;
            }
            // Update the low value of u for parent function calls.
            else if (v != parent[u])
                low[u] = Math.min(low[u], disc[v]);
        }
    }

    // The function to do the DFS traversal. It uses the recursive function APUtil()
    void AP()
    {
        // Mark all the vertices as not visited
        boolean visited[] = new boolean[V];
        int disc[] = new int[V];
        int low[] = new int[V];
        int parent[] = new int[V];
        boolean ap[] = new boolean[V];   // To store articulation points

        // Initialize the parent, visited, and ap (articulation point) arrays
        for (int i = 0; i < V; i++)
        {
            parent[i] = NIL;
            visited[i] = false;
            ap[i] = false;
        }

        // Call the recursive helper function to find articulation
        // points in the DFS tree rooted at vertex 'i'
        for (int i = 0; i < V; i++)
            if (visited[i] == false)
                APUtil(i, visited, disc, low, parent, ap);

        // Now ap[] contains articulation points; print them
        for (int i = 0; i < V; i++)
            if (ap[i] == true)
                System.out.print(i + " ");
    }

    // Driver method
    public static void main(String args[])
    {
        // Create the example graphs (the diagrams referenced in the
        // original article are not reproduced here)
        System.out.println("Articulation points in first graph ");
        Graph g1 = new Graph(5);
        g1.addEdge(1, 0);
        g1.addEdge(0, 2);
        g1.addEdge(2, 1);
        g1.addEdge(0, 3);
        g1.addEdge(3, 4);
        g1.AP();
        System.out.println();

        System.out.println("Articulation points in Second graph");
        Graph g2 = new Graph(4);
        g2.addEdge(0, 1);
        g2.addEdge(1, 2);
        g2.addEdge(2, 3);
        g2.AP();
        System.out.println();

        System.out.println("Articulation points in Third graph ");
        Graph g3 = new Graph(7);
        g3.addEdge(0, 1);
        g3.addEdge(1, 2);
        g3.addEdge(2, 0);
        g3.addEdge(1, 3);
        g3.addEdge(1, 4);
        g3.addEdge(1, 6);
        g3.addEdge(3, 5);
        g3.addEdge(4, 5);
        g3.AP();
    }
}
This is specific to finding articulation points. When we iterate through a child V of node U, U will be an articulation point if low(V) >= disc(U). But if U belongs to another cycle that was processed before handling the edge U->V, then disc(U) > low(U), because U can reach back up to its ancestor. In this case, if we use
low[u] = min(low[u], low[v])
then low(V) = low(U) when handling V. Back in the post-processing of U, the condition disc[U] <= low[V] no longer holds (because low[V] = low[U] < disc[U]). Then U is no longer reported as an articulation point, which is flawed.
Note that this does not apply to finding bridges or to Tarjan's SCC algorithm, as neither cares about the case where U is the root of a cycle (disc[U] == low[V]).
Personally, I prefer using disc[v] all the time, as it's consistent, and the definition of low is also much clearer.
good article: https://codeforces.com/blog/entry/71146
Let's say U is the parent and V is its neighbor or child in the DFS spanning tree. A simple scenario in which it will fail is when there is a back edge from U to one of its ancestors and a back edge from V to U. Consider the graph below (the original image is not reproduced here).
Credits: https://codeforces.com/blog/entry/71146
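Since the image is missing, here is a minimal concrete reconstruction of the failure scenario (my own example, in the spirit of the linked blog): two triangles sharing vertex 2, with edges 0-1, 1-2, 2-0 and 2-3, 3-4, 4-2. Vertex 2 is the only articulation point, and the low[v] variant misses it.

#include <bits/stdc++.h>
using namespace std;

vector<vector<int>> adj;
vector<int> disc, low;
int timer_ = 0;
bool useLowOfV;        // true: low[u] = min(low[u], low[v]) on back edges
set<int> aps;

void dfs(int u, int parent) {
    disc[u] = low[u] = ++timer_;
    int children = 0;
    for (int v : adj[u]) {
        if (!disc[v]) {
            children++;
            dfs(v, u);
            low[u] = min(low[u], low[v]);
            if (parent != -1 && low[v] >= disc[u]) aps.insert(u);
        } else if (v != parent) {
            // the disputed line: disc[v] is correct, low[v] can be too small
            low[u] = min(low[u], useLowOfV ? low[v] : disc[v]);
        }
    }
    if (parent == -1 && children > 1) aps.insert(u);
}

int main() {
    int n = 5;
    vector<pair<int,int>> edges = {{0,1},{1,2},{2,0},{2,3},{3,4},{4,2}};
    for (bool variant : {false, true}) {
        adj.assign(n, {});
        for (auto [a, b] : edges) { adj[a].push_back(b); adj[b].push_back(a); }
        disc.assign(n, 0); low.assign(n, 0); timer_ = 0; aps.clear();
        useLowOfV = variant;
        dfs(0, -1);
        cout << (variant ? "low[v] variant: " : "disc[v] variant: ");
        for (int u : aps) cout << u << ' ';
        cout << '\n';   // the disc[v] variant reports 2; the low[v] variant reports nothing
    }
}

The back edge 2->0 is processed before the DFS descends into 3, so low[2] drops to 1; with the low[v] rule, the back edge 4->2 then propagates that 1 into the subtree of 3, and the test low[3] >= disc[2] fails.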
I think you have slightly misinterpreted the visited-vertex case.
In a DFS of an undirected graph, whenever we find an edge from vertex u to vertex v such that v is already visited, it is a back edge and v is an ancestor of u (unless, of course, v is u's parent).
Now, in the given code, the purpose of low[u] is to store the lowest discovery time of any vertex reachable from the vertices in the subtree rooted at u (including u itself).
So when we discover an edge u->v such that v is an ancestor of u (and not its parent), this vertex v may well be the vertex with the lowest discovery time reachable from the subtree rooted at u. So we update low[u] as the minimum of its current value and the discovery time of the ancestor vertex v.
Now, why not low[v]? Simply because if the vertex x being tested as an articulation point lies below v, then low[x] = disc[v] is already enough to prove it is not one; and if x lies above v, there will be a step low[x] = min(low[x], low[v]) along the tree edges anyway.
First things first:
disc[]: answers a simple question: when was a particular vertex "discovered" in the depth-first search? It assigns a number to each vertex in the order it is found in the DFS. (It is never changed.)
low[x]: answers another simple question: what is the lowest-level vertex that x can climb to? (It is subject to change in every iteration.)
back edge: an edge connecting a vertex to an already visited vertex that is not its immediate parent.
Now, in the given piece of code, the following section:
else if (v != parent[u])
    low[u] = Math.min(low[u], disc[v]);
refers to the scenario where a back edge exists. On encountering a back edge, we update the low value of the parent with the lowest-level vertex it can climb to (if that vertex's discovery time is less than the parent's current low value).
It is difficult to find an example where changing this piece of code to the following
else if (v != parent[u])
    low[u] = Math.min(low[u], low[v]);
would break the algorithm (though, as other answers here show, such cases do exist for articulation points). That being said, these two pieces of code have semantically very different meanings.
Math.min(low[u], low[v]) says that if the child can reach a low vertex, so can its immediate parent, while Math.min(low[u], disc[v]) keeps the invariant that the low value of a vertex is the lowest-level vertex it can actually climb to.

Difference between Prim and Dijkstra graph algorithm

I'm reading graph algorithms from the Cormen book. Below is pseudocode from that book.
Prim's algorithm for MST:
MST-PRIM (G, w, r)
    for each u in G.V
        u.key = infinity
        u.p = NIL
    r.key = 0
    Q = G.V
    while Q neq null
        u = EXTRACT-MIN(Q)
        for each v in G.Adj[u]
            if (v in Q) and (w(u,v) < v.key)
                v.p = u
                v.key = w(u,v)
Dijkstra's algorithm to find single-source shortest paths:
INITIALIZE-SINGLE-SOURCE (G, s)
    for each vertex v in G.V
        v.d = infinity
        v.par = NIL
    s.d = 0

DIJKSTRA (G, w, s)
    INITIALIZE-SINGLE-SOURCE(G, s)
    S = NULL
    Q = G.V
    while Q neq null
        u = EXTRACT-MIN(Q)
        S = S U {u}
        for each vertex v in G.Adj[u]
            RELAX(u, v, w)
My question is: why are we checking whether the vertex belongs to Q (v in Q), i.e. that the vertex doesn't already belong to the tree, whereas in Dijkstra's algorithm we are not checking for that?
Any reason why?
The algorithms called Prim and Dijkstra solve different problems in the first place. Prim finds a minimum spanning tree of an undirected graph, while Dijkstra solves the single-source shortest path problem for directed graphs with nonnegative edge weights.
In both algorithms, the queue Q contains all vertices that are not 'done' yet, i.e. white and gray according to the common terminology (see here).
In Dijkstra's algorithm, a black vertex cannot be relaxed, because if it could, that would mean its distance was not correct beforehand (contradicting the property of black nodes). So there is no difference whether you check v in Q or not.
In Prim's algorithm, it is possible to find an edge of small weight that leads to an already black vertex. That's why, if we do not check v in Q, the value in vertex v can indeed change. Normally it does not matter, because we never read the min-weight value of black vertices. However, your pseudocode is using a MinHeap data structure. In that case each modification of a vertex value must be accompanied by a DECREASE-KEY, and clearly it is not valid to call DECREASE-KEY for black vertices, because they are not in the heap. That's why I suppose the author decided to check for v in Q explicitly.
Generally speaking, the code for Dijkstra's and Prim's algorithms is usually exactly the same, except for a minor difference:
Prim's algorithm checks w(u, v) for being less than D(v) in RELAX.
Dijkstra's algorithm checks D(u) + w(u, v) for being less than D(v) in RELAX.
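Spelled out as compilable C++ (my own paraphrase of the two RELAX variants, not from Cormen; D and key stand in for v.d and v.key):

#include <vector>
#include <algorithm>
using namespace std;

// Dijkstra relaxes by the total distance from the source:
void relaxDijkstra(vector<int>& D, int u, int v, int w) {
    D[v] = min(D[v], D[u] + w);
}

// Prim relaxes by the weight of the single candidate edge
// (and, per the pseudocode above, only while v is still in Q):
void relaxPrim(vector<int>& key, int v, int w) {
    key[v] = min(key[v], w);
}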
Take a look at my personal implementations of both Dijkstra and Prim, written in C++.
They are very similar, and I modified Dijkstra into Prim.
Dijkstra:
const int INF = INT_MAX / 4;
struct node { int v, w; };
bool operator<(node l, node r) { if (l.w == r.w) return l.v > r.v; return l.w > r.w; }

vector<int> Dijkstra(int max_v, int start_v, vector<vector<node>>& adj_list) {
    vector<int> min_dist(max_v + 1, INF);
    priority_queue<node> q;
    q.push({ start_v, 0 });
    min_dist[start_v] = 0;
    while (q.size()) {
        node n = q.top(); q.pop();
        for (auto adj : adj_list[n.v]) {
            if (min_dist[adj.v] > n.w + adj.w) {
                min_dist[adj.v] = n.w + adj.w;
                q.push({ adj.v, adj.w + n.w });
            }
        }
    }
    return min_dist;
}
Prim:
struct node { int v, w; };
bool operator<(node l, node r) { return l.w > r.w; }

int MST_Prim(int max_v, int start_v, vector<vector<node>>& adj_list) {
    vector<int> visit(max_v + 1, 0);
    priority_queue<node> q;
    q.push({ start_v, 0 });
    int sum = 0;
    while (q.size()) {
        node n = q.top(); q.pop();
        if (visit[n.v]) continue;   // skip stale entries for vertices already in the tree
        visit[n.v] = 1;
        sum += n.w;
        for (auto adj : adj_list[n.v]) {
            q.push({ adj.v, adj.w });
        }
    }
    return sum;
}
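As a hypothetical usage example (the graph and expected values are my own, not part of the answer), both functions can run on the same adjacency list. In a single file you would keep only one node struct and comparator; the tie-breaking version from Dijkstra works for both.

int main() {
    int n = 4;
    vector<vector<node>> adj(n + 1);
    auto add = [&](int a, int b, int w) {
        adj[a].push_back({ b, w });
        adj[b].push_back({ a, w });
    };
    add(1, 2, 1); add(2, 3, 2); add(1, 3, 4); add(3, 4, 3);
    vector<int> d = Dijkstra(n, 1, adj);   // d[4] == 6, via 1->2->3->4
    int mst = MST_Prim(n, 1, adj);         // == 6: edges 1-2, 2-3, 3-4
    cout << d[4] << ' ' << mst << '\n';
}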

Finding "Best Roots" in a Directed Tree Graph?

(This is derived from a recently completed programming contest)
You are given G, a connected graph with N nodes and N-1 edges.
(Notice that this implies G forms a tree.)
Each edge of G is directed. (not necessarily upward to any root)
For each vertex v of G it is possible to invert zero or more edges such that there is a directed path from every other vertex w to v. Let the minimum possible number of edge inversions to achieve this be f(v).
By what linear or log-linear algorithm can we determine the subset of vertices with the minimal overall f(v) (including the value of f(v) for those vertices)?
For example consider the 4 vertex graph with these edges:
A<--B
C<--B
D<--B
The value of f(A) = 2, f(B) = 3, f(C) = 2 and f(D) = 2...
...so the desired output is {A, C, D} and 2.
(Note we only need to calculate f(v) for the vertices that have minimal f(v), not all of them.)
Code:
For posterity, here is the code of the solution:
// Note: the original post did not include headers or the rep/chmin helpers;
// common competitive-programming definitions are assumed below.
#include <bits/stdc++.h>
using namespace std;

#define REP2(i, n) for (int i = 0; i < (int)(n); ++i)
#define REP3(i, a, b) for (int i = (a); i <= (int)(b); ++i)
#define GET_REP(_1, _2, _3, NAME, ...) NAME
#define rep(...) GET_REP(__VA_ARGS__, REP3, REP2, dummy)(__VA_ARGS__)
template <class T> void chmin(T& a, T b) { if (b < a) a = b; }

int main()
{
    struct Edge
    {
        bool fwd;    // true if the edge was read as src -> dest from the input
        int dest;
    };
    int n;
    cin >> n;
    vector<vector<Edge>> V(n + 1);
    rep(i, n - 1)
    {
        int src, dest;
        scanf("%d %d", &src, &dest);
        V[src].push_back(Edge{true, dest});
        V[dest].push_back(Edge{false, src});
    }
    vector<int> F(n + 1, -1);
    vector<bool> done(n + 1, false);
    vector<int> todo;
    todo.push_back(1);
    done[1] = true;
    F[1] = 0;
    // First pass: compute f(1) by counting edges that point the wrong way
    // while walking outward from vertex 1. (Which of fwd/!fwd counts as
    // "wrong" depends on the contest's input convention for edge direction.)
    while (!todo.empty())
    {
        int next = todo.back();
        todo.pop_back();
        for (Edge e : V[next])
        {
            if (done[e.dest])
                continue;
            if (!e.fwd)
                F[1]++;
            done[e.dest] = true;
            todo.push_back(e.dest);
        }
    }
    // Second pass: reroot. Moving the root across one edge changes f by
    // exactly +1 or -1, depending on that edge's direction.
    todo.push_back(1);
    while (!todo.empty())
    {
        int next = todo.back();
        todo.pop_back();
        for (Edge e : V[next])
        {
            if (F[e.dest] != -1)
                continue;
            if (e.fwd)
                F[e.dest] = F[next] + 1;
            else
                F[e.dest] = F[next] - 1;
            todo.push_back(e.dest);
        }
    }
    int minf = INT_MAX;
    rep(i, 1, n)
        chmin(minf, F[i]);
    cout << minf << endl;
    rep(i, 1, n)
        if (F[i] == minf)
            cout << i << " ";
    cout << endl;
}
I think that the following algorithm works correctly, and it certainly works in linear time.
The motivation for this algorithm is the following. Let's suppose that you already know the value of f(v) for some single node v. Now, consider any node u adjacent to v. If we want to compute the value of f(u), we can reuse some of the information from f(v) in order to compute it. Note that in order to get from any node w in the graph to u, one of two cases must happen:
That path passes through the edge connecting u and v. In that case, the way that we get from w to u is to go from w to v, then to follow the edge from v to u.
That path does not pass through the edge connecting u and v. In that case, the way that we get from w to u is the exact same way that we got from w to v, except that we stop as soon as we get to u.
The reason that this observation is important is that it means that if we know the number of edges we'd flip to get from any node to v, we can easily modify it to get the set of edges that we'd flip to get from any node to u. Specifically, it's going to be the same set of edges as before, except that we want to direct the edge connecting u and v so that it connects v to u rather than the other way around.
If the edge from u to v is initially directed (u, v), then we have to flip all the normal edges we flipped to get every node pointing at v, plus one more edge to get v pointed back at u. Thus f(u) = f(v) + 1. Otherwise, if the edge is originally directed (v, u), then the set of edges that we'd flip would be the same as before (pointing everything at v), except that we wouldn't flip the edge (v, u). Thus f(u) = f(v) - 1.
Consequently, once we know the value of f for a single node v, we can compute it for each adjacent node u as follows:
f(u) = f(v) + 1 if (u, v) is an edge.
f(u) = f(v) - 1 otherwise
This means that we can compute f(v) for all nodes v as follows:
Compute f(v) for some initial node v, chosen arbitrarily.
Do a DFS starting from v. When reaching a node u, compute its f score using the above logic.
All that's left to do is to compute f(v) for some initial node. To do this, we can run a DFS from v outward. Every time we see an edge pointed the wrong way, we have to flip it. Thus the initial value of f(v) is given by the number of wrong-pointing edges we find during the initial DFS.
We thus can compute the f score for each node in O(n) time by doing an initial DFS to compute f(v) for the initial node, then a secondary DFS to compute f(u) for each other node u. You can then for-loop over each of the n f-scores to find the minimum score, then do one more loop to find all values with that f-score. Each of these steps takes O(n) time, so the overall algorithm takes O(n) time as well.
Hope this helps! This was an awesome problem!
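As a quick sanity check (my own trace, not part of the original answer), applying the ±1 rule to the example from the question (edges B->A, B->C, B->D):

f(A) = 2                  // initial DFS from A: edges B->C and B->D must be flipped
f(B) = f(A) + 1 = 3       // edge B->A pointed at A; rerooting at B means flipping it too
f(C) = f(B) - 1 = 2       // edge B->C already points from B toward the new root C
f(D) = f(B) - 1 = 2

which matches the expected output {A, C, D} with a minimum of 2.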

How to find maximum induced subgraph H of G such that each vertex in H has degree ≥ k

Here is an exercise on graphs.
Given an undirected graph G with n vertices and m edges, and an integer k, give an O(m + n) algorithm that finds the maximum induced subgraph H of G such that each vertex in H has degree ≥ k, or prove that no such graph exists.
An induced subgraph F = (U, R) of a graph G = (V, E) is a subset U of the vertices V of G, together with all edges R of G such that both endpoints of each edge are in U.
My initial idea is like this:
First, the exercise effectively asks us to take all vertices S whose degree is at least k, then remove from S the vertices that don't have any edges connecting them to the others. The refined S is then H, in which all vertices have degree >= k, and the edges between them form R.
In addition, it asks for O(m + n), so I think I need a BFS or DFS. Then I get stuck.
In BFS, I can learn the degree of a vertex. But once I get the degree of a vertex v, I don't know the other vertices connected to it, except for its parent. And if the parent doesn't have degree >= k, I can't eliminate v, as it may still be connected to others.
Any hints?
Edit:
Following the answer from Michael J. Barber, I implemented it and updated the code here:
Could anyone have a look at the key method, public Graph kCore(Graph g, int k)? Am I doing it right? Is it O(m+n)?
class EdgeNode {
    EdgeNode next;
    int y;
}

public class Graph {
    public EdgeNode[] edges;
    public int numVertices;
    public boolean directed;

    public Graph(int _numVertices, boolean _directed) {
        numVertices = _numVertices;
        directed = _directed;
        edges = new EdgeNode[numVertices];
    }

    public void insertEdge(int x, int y) {
        insertEdge(x, y, directed);
    }

    public void insertEdge(int x, int y, boolean _directed) {
        EdgeNode edge = new EdgeNode();
        edge.y = y;
        edge.next = edges[x];
        edges[x] = edge;
        if (!_directed)
            insertEdge(y, x, true);
    }

    public Graph kCore(Graph g, int k) {
        int[] degree = new int[g.numVertices];
        boolean[] deleted = new boolean[g.numVertices];
        updateAllDegree(g, degree); // get the degree of every vertex
        for (int i = 0; i < g.numVertices; i++) {
            if (!deleted[i] && degree[i] < k) {
                deleteVertex(i, deleted, degree, k, g);
            }
        }
        // Construct the k-core subgraph. The vertex set keeps the original
        // indices (so deleted vertices simply end up with no incident edges),
        // which avoids renumbering the surviving vertices.
        Graph h = new Graph(g.numVertices, false);
        for (int i = 0; i < g.numVertices; i++) {
            if (!deleted[i]) {
                EdgeNode p = g.edges[i];
                while (p != null) {
                    if (!deleted[p.y])
                        h.insertEdge(i, p.y, true); // insert each surviving edge as directed;
                                                    // the reverse copy p.y -> i is added when
                                                    // the loop reaches vertex p.y
                    p = p.next;
                }
            }
        }
        return h;
    }

    private void deleteVertex(int i, boolean[] deleted, int[] degree, int k, Graph g) {
        deleted[i] = true;
        EdgeNode p = g.edges[i];
        while (p != null) {
            if (!deleted[p.y]) {
                degree[p.y]--; // removing i costs each surviving neighbor one degree
                if (degree[p.y] < k)
                    deleteVertex(p.y, deleted, degree, k, g);
            }
            p = p.next;
        }
    }

    private void updateAllDegree(Graph g, int[] degree) {
        for (int i = 0; i < g.numVertices; i++) {
            EdgeNode p = g.edges[i];
            while (p != null) {
                degree[i] += 1;
                p = p.next;
            }
        }
    }
}
A maximal induced subgraph where the vertices have minimum degree k is called a k-core. You can find the k-cores just by repeatedly removing any vertices with degree less than k.
In practice, you first evaluate the degrees of all the vertices, which is O(m). You then go through the vertices looking for vertices with degree less than k. When you find such a vertex, cut it from the graph and update the degrees of the neighbors, also deleting any neighbors whose degrees drop below k. You need to look at each vertex at least once (so doable in O(n)) and update degrees at most once for each edge (so doable in O(m)), giving a total asymptotic bound of O(m+n).
The remaining connected components are the k-cores. Find the biggest one by evaluating their sizes.
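A minimal sketch of the peeling procedure described in this answer (my own code, not from the original post, using an explicit queue instead of recursion):

#include <bits/stdc++.h>
using namespace std;

// Returns a mask of the vertices that survive in the k-core of an
// undirected graph given as an adjacency list.
vector<bool> kCore(int n, const vector<vector<int>>& adj, int k) {
    vector<int> degree(n);
    for (int u = 0; u < n; ++u) degree[u] = adj[u].size();   // O(m) total
    vector<bool> alive(n, true);
    queue<int> q;
    for (int u = 0; u < n; ++u)
        if (degree[u] < k) { alive[u] = false; q.push(u); }
    while (!q.empty()) {                 // each edge causes at most one decrement: O(m)
        int u = q.front(); q.pop();
        for (int v : adj[u])
            if (alive[v] && --degree[v] < k) { alive[v] = false; q.push(v); }
    }
    return alive;
}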
