About Tarjan's algorithm for finding SCCs

I'm a senior student learning algorithms for the informatics olympiad, and this is my first question on Stack Overflow.
In Tarjan's DFS, when computing lowlink(u):
low[u] = min(low[u], low[v])   (v has not been visited yet)
or
low[u] = min(low[u], dfn[v])   (v is still on the stack)
My question is: is it still OK to replace dfn[v] with low[v] in the second case? I believe this is incorrect, but I have failed to find a counter-example. Could anyone help explain this?
thx:)

It's correct, actually.
The proof of correctness depends on two properties of low. The first is that, for all v, there exists w reachable from v such that dfn[w] <= low[v] <= dfn[v]. The second is that, when determining whether v is a root, we have for all w reachable from v that low[v] <= dfn[w].
We can prove inductively that the first property still holds by the fact that, if there's a path from u to v and a path from v to w, then there's a path from u to w. As for the second, let low be the original array and low' be yours. It's not hard to show that, for all v, at all times, low'[v] <= low[v], so at the critical moment for v, for all w reachable from v, it holds that low'[v] <= low[v] <= dfn[w].
I imagine that the algorithm is presented the way it is to avoid the need to consider intermediate values of low.
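To make the two variants concrete, here is a minimal recursive Python sketch of Tarjan's algorithm (the adjacency-list setup and any names beyond dfn/low are my own choices, not from the question); the commented-out line is the low[v] variant the question asks about, which by the argument above computes the same SCCs.

import sys
sys.setrecursionlimit(1 << 20)

def tarjan_scc(adj):
    n = len(adj)
    dfn = [0] * n            # discovery index, 0 = not yet visited
    low = [0] * n
    onstack = [False] * n
    stack, sccs, timer = [], [], [0]

    def dfs(u):
        timer[0] += 1
        dfn[u] = low[u] = timer[0]
        stack.append(u)
        onstack[u] = True
        for v in adj[u]:
            if dfn[v] == 0:                       # v not visited yet
                dfs(v)
                low[u] = min(low[u], low[v])
            elif onstack[v]:                      # v is still on the stack
                low[u] = min(low[u], dfn[v])
                # The variant from the question would use low[v] instead:
                # low[u] = min(low[u], low[v])
        if low[u] == dfn[u]:                      # u is the root of an SCC
            comp = []
            while True:
                w = stack.pop()
                onstack[w] = False
                comp.append(w)
                if w == u:
                    break
            sccs.append(comp)

    for u in range(n):
        if dfn[u] == 0:
            dfs(u)
    return sccs

print(tarjan_scc([[1], [2], [0, 3], []]))   # [[3], [2, 1, 0]]: the SCCs {3} and {0, 1, 2}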

Related

Dijkstra's Algorithm pseudocode "U" symbol

I'm currently trying to follow pseudocode for Dijkstra's Algorithm, but I'm having difficulty understanding what one of the lines means.
DijkstrasAlgorithm(G, w, s)
    InitializeSingleSource(G, s)
    S = 0
    Q = G.V
    while Q != 0
        u = ExtractMin(Q)
        S = S ∪ {u}
        for each vertex v ∈ G.Adj[u]
            Relax(u, v, w)
This part right here "S = S∪{u}" is what's confusing me. I'm not sure what S is supposed to be equal to. Could someone explain? Thanks!
That’s the set union operator. S here is the set of all nodes for which the shortest path has been computed, and this line means “add the node u to that set.”
Mechanically, S ∪ {u} is the set consisting of everything already in S, plus the node u. That’s why S = S ∪ {u} means to add u to S.
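If it helps to see the operation in running code, here is a tiny Python illustration (my own example, not from the pseudocode):

S = {"a", "b"}        # the nodes already settled
u = "c"
S = S | {u}           # S = S ∪ {u}: everything already in S, plus u
print(S)              # {'a', 'b', 'c'} (same effect as S.add(u))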
(As a note, I think the pseudocode has a typo in where S was declared. You probably meant to initialize it to the empty set ∅ rather than the number 0.)
Dijkstra’s algorithm is a pretty tough one to understand purely from pseudocode. I recommend checking out a tutorial somewhere so that you have a high-level intuition for what’s going on. It’s much easier to understand this pseudocode by mapping the contents onto your conceptual understanding.

Dijkstra with negative edges: I don't understand the examples; they work according to the CLRS-style pseudocode

EDIT 2: It seems this isn't from CLRS (I assumed it was because it followed the same format as the CLRS code given to us in this algorithms and data structures course).
Still, in this course we were given this code as being "Dijkstra's Algorithm".
I read Why doesn't Dijkstra's algorithm work for negative weight edges? and Negative weights using Dijkstra's Algorithm (second one is specific to the OP's algorithm I think).
Looking at the pseudocode from CLRS ("Intro to Algorithms"), I don't understand why Dijkstra wouldn't work on those examples of graphs with negative edges.
In the pseudocode (below), we insert nodes back onto the heap if the new distance to them is shorter than the previous one, so it seems to me that the distances would eventually be updated to the correct values.
For example:
The claim here is that (A,C) will be set to 1 and never updated to the correct distance -2.
But the pseudocode from CLRS says that we first put C and B on the Heap with distances 1 and 2 respectively; then we pop C, see no outgoing edges; then we pop B, look at the edge (B,C), see that Dist[C] > Dist[B] + w(B,C), update Dist[C] to -2, put C back on the heap, see no outgoing edges and we're done.
So it worked fine.
Same for the example in the first answer to this question: Negative weights using Dijkstra's Algorithm
The author of the answer claims that the distance to C will not be updated to -200, but according to this pseudocode that's not true, since we would put B back on the heap and then compute the correct shortest distance to C.
(pseudocode from CLRS)
Dijkstra(G(V, E, ω), s ∈ V)
    for v in V do
        dist[v] ← ∞
        prev[v] ← nil
    end for
    dist[s] ← 0
    H ← {(s, 0)}
    while H ≠ ∅ do
        v ← DeleteMin(H)
        for (v, w) ∈ E do
            if dist[w] > dist[v] + ω(v, w) then
                dist[w] ← dist[v] + ω(v, w)
                prev[w] ← v
                Insert((w, dist[w]), H)
            end if
        end for
    end while
EDIT: I understand that we assume that once a node is popped off the heap, the shortest distance has been found; but still, it seems (according to this pseudocode) that we do put nodes back on the heap if the distance is shorter than previously computed, so when the algorithm finishes running we should get the correct shortest distances regardless.
That implementation is technically not Dijkstra's algorithm, which is described by Dijkstra here (could not find any better link): the set A he talks about is the set of nodes for which the minimum path is known. So once you add a node to this set, it's fixed: you know the minimum path to it, and it no longer participates in the rest of the algorithm. The paper also talks about transferring nodes, so they cannot be in two sets at once.
This is in line with Wikipedia's pseudocode:
function Dijkstra(Graph, source):

    create vertex set Q

    for each vertex v in Graph:           // Initialization
        dist[v] ← INFINITY                // Unknown distance from source to v
        prev[v] ← UNDEFINED               // Previous node in optimal path from source
        add v to Q                        // All nodes initially in Q (unvisited nodes)

    dist[source] ← 0                      // Distance from source to source

    while Q is not empty:
        u ← vertex in Q with min dist[u]  // Node with the least distance will be selected first
        remove u from Q

        for each neighbor v of u:         // where v is still in Q.
            alt ← dist[u] + length(u, v)
            if alt < dist[v]:             // A shorter path to v has been found
                dist[v] ← alt
                prev[v] ← u

    return dist[], prev[]
And its heap pseudocode as well.
However, note that Wikipedia also states, at the time of this answer:
Instead of filling the priority queue with all nodes in the initialization phase, it is also possible to initialize it to contain only source; then, inside the if alt < dist[v] block, the node must be inserted if not already in the queue (instead of performing a decrease_priority operation).[3]:198
Doing this would still lead to reinserting a node in some cases with negative valued edges, such as the example graph given in the accepted answer to the second linked question.
So it seems that some authors conflate the two. In that case, they should clearly state either that this implementation works with negative edges or that it's not a proper Dijkstra implementation.
I guess the original paper might be interpreted as a bit vague. Nowhere in it does Dijkstra make any mention of negative or positive edges, nor does he make it clear beyond any alternative interpretation that a node cannot be updated once in the A set. I don't know if he himself further clarified things in any subsequent works or speeches, or if the rest is just a matter of interpretation by others.
So from my point of view, you could argue that it's also a valid Dijkstra's.
As to why you might implement it this way: because it will likely be no slower in practice if we only have positive edges, and because it is quicker to write without having to perform additional checks or not-so-standard heap operations.
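For what it's worth, here is a hedged Python sketch of the reinsertion variant being discussed (my own code, not from CLRS or the course); the weight w(B, C) = -4 in the example is my inference from the trace in the question, where Dist[C] ends up at -2.

import heapq

def dijkstra_reinsert(adj, s):
    # With non-negative weights this is ordinary Dijkstra with lazy deletion.
    # With negative edges it keeps reinserting improved nodes, so it behaves
    # more like Bellman-Ford and can be very slow (or loop on negative cycles).
    dist = {v: float("inf") for v in adj}
    dist[s] = 0
    heap = [(0, s)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:                     # stale heap entry, skip it
            continue
        for w, weight in adj[v]:
            if dist[w] > dist[v] + weight:
                dist[w] = dist[v] + weight
                heapq.heappush(heap, (dist[w], w))
    return dist

adj = {"A": [("C", 1), ("B", 2)], "B": [("C", -4)], "C": []}
print(dijkstra_reinsert(adj, "A"))          # {'A': 0, 'B': 2, 'C': -2}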

Why does Dijkstra's algorithm relax edges to vertices already in the shortest-path tree?

In the DIJKSTRA pseudocode on page 658, chapter 24, of CLRS (third edition), in the inner loop, while relaxing edges out of the newly added vertex, why is relaxation also allowed on vertices that have already been dequeued from the queue and added to the shortest-path tree?
while (Q not empty) {
    u = ExtractMin(Q);
    add u to S (the shortest path tree);
    for each vertex v adjacent to u
        relax(u, v, w);
}
Why doesn't the inner loop check whether the vertex is already part of the shortest-path tree, like this:
while (Q not empty) {
    u = ExtractMin(Q);
    add u to S (the shortest path tree);
    for each vertex v adjacent to u
        if v is in Q
            then relax(u, v, w);
}
Which is the correct approach?
The first thing relax does is to check
if v.d > u.d + w(u,v)
If v is already in the shortest-path tree, then (with non-negative edge weights) the check will always fail and relax will not proceed. An if v is in Q check would be redundant.
However, if if v is in Q is a significantly faster operation than if v.d > u.d + w(u,v) in a concrete implementation of the algorithm, including it may be a useful optimization.
Both approaches are functionally correct. However, your version is less efficient than the CLRS version.
You don't want to do if v is in Q because that's an O(log n) operation, whereas if v.d > u.d + w(u, v) is O(1). At the beginning of the algorithm, Q contains all the vertices in the graph, so for, say, a very large sparsely-connected graph, your version would end up being much worse than CLRS's.
Your question, however, is not entirely without merit. The explanation for Dijkstra's algorithm in CLRS is a bit confusing, which is what actually brought me to this discussion thread. Looking at the pseudo-code on page 658:
DIJKSTRA(G, w, s)
1  INITIALIZE-SINGLE-SOURCE(G, s)
2  S = ∅
3  Q = G.V
4  while Q not empty
5      u = EXTRACT-MIN(Q)
6      add u to S
7      for each vertex v in G.Adj[u]
8          RELAX(u, v, w)
one wonders what the point of maintaining S is at all. If we do away with it entirely by removing lines 2 and 6, the algorithm still works, and after it completes you can print the shortest path by following the predecessor pointers (already stored in each vertex) backwards through the graph (using PRINT-PATH(G, s, v) on page 601, as described on page 647). S seems to be used here more as an explanatory tool, to illustrate the fact that Dijkstra's is a greedy algorithm; in an actual implementation it seems to me it would not be needed.
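To illustrate that point, here is a minimal Python version (my own sketch, not CLRS's code) that drops S entirely and relies on the relax test plus the predecessor pointers:

import heapq

def dijkstra(adj, s):
    dist = {v: float("inf") for v in adj}
    prev = {v: None for v in adj}
    dist[s] = 0
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:                  # stale entry: u was already settled
            continue
        for v, w in adj[u]:              # relax every adjacent edge
            if dist[v] > dist[u] + w:    # never succeeds once v is settled (w >= 0)
                dist[v] = dist[u] + w
                prev[v] = u
                heapq.heappush(heap, (dist[v], v))
    return dist, prev

def path_to(prev, v):                    # follow predecessor pointers backwards
    out = []
    while v is not None:
        out.append(v)
        v = prev[v]
    return out[::-1]

adj = {"s": [("a", 1), ("b", 4)], "a": [("b", 2)], "b": []}
dist, prev = dijkstra(adj, "s")
print(dist["b"], path_to(prev, "b"))     # 3 ['s', 'a', 'b']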

Modify this algorithm for Nearest Neighbour Search (NNS) to perform Approximate-NNS

From the slides of a course, I found these:
Given a set P in R^D and a query point q, its NN is the point p_0 in P such that:
dist(p_0, q) <= dist(p, q), for every p in P.
Similarly, with an approximation factor 1 > ε > 0, the ε-NN is p_0, such that:
dist(p_0, q) <= (1+ε) * dist(p, q), for every p in P.
(I wonder why ε can't reach 1).
We build a KD-tree and then we search for the NN with the algorithm from the slides (the usual KD-tree search that prunes a subtree when the splitting hyperplane is farther away than the current best distance), which is correct as far as I can tell and as far as my testing goes.
How should I modify the above algorithm, in order to perform Approximate Nearest Neighbour Search (ANNS)?
My thought is to multiply the current best (at the point where it is updated at a leaf) by ε and leave the rest of the algorithm as is. I am not sure, however, whether this is correct. Can someone explain?
PS - I understand how search for NN works.
Note that I asked in the Computer Science site, but I got nothing!
The one modification needed is to replace current best distance with current best distance/(1+ε). This prunes the nodes that cannot contain a point violating the new inequality.
The reason that this works is that (assuming that cut-coor(q) is on the left side) the test
cut-coor(q) + current best distance > node's cut-value
is checking whether the hyperplane separating left-child and right-child is closer to q than the current best distance. That is a necessary condition for a point in right-child to be closer than the current best to the query point q, because the line segment joining q to any point in right-child passes through that hyperplane. By replacing d(p_0, q) = current best distance with current best distance/(1+ε), we are instead checking whether any point p on the right side could satisfy
d(p, q) < d(p_0, q)/(1+ε),
which is equivalent to
(1+ε) d(p, q) < d(p_0, q),
which is a witness to the violation of the approximate nearest neighbor guarantee.
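As a concrete (and hedged) illustration, here is a small Python sketch of a KD-tree NN search with the (1+ε) pruning change; the tree layout, the build routine, and all names are my own assumptions rather than the slides' code, and only the division by (1+ε) is the modification being discussed:

import math

class Node:
    def __init__(self, point, axis, left=None, right=None):
        self.point, self.axis, self.left, self.right = point, axis, left, right

def build(points, depth=0):
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return Node(points[mid], axis,
                build(points[:mid], depth + 1),
                build(points[mid + 1:], depth + 1))

def ann_search(root, q, eps=0.0):
    best = [None, math.inf]                     # best point found so far and its distance

    def visit(node):
        if node is None:
            return
        d = math.dist(q, node.point)
        if d < best[1]:
            best[0], best[1] = node.point, d
        axis = node.axis
        near, far = ((node.left, node.right) if q[axis] <= node.point[axis]
                     else (node.right, node.left))
        visit(near)
        # Exact search tests |cut-coor(q) - cut-value| < current best distance.
        # The approximate version shrinks that radius to (current best)/(1 + eps),
        # pruning subtrees that cannot hold a point violating the (1+eps) guarantee.
        if abs(q[axis] - node.point[axis]) < best[1] / (1.0 + eps):
            visit(far)

    visit(root)
    return best[0], best[1]

pts = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build(pts)
print(ann_search(tree, (9, 2), eps=0.0))        # exact NN: ((8, 1), ~1.414)
print(ann_search(tree, (9, 2), eps=0.5))        # a (1+0.5)-approximate NN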

Path finding algorithm on graph considering both nodes and edges

I have an undirected graph. For now, assume that the graph is complete. Each node has a certain value associated with it. All edges have a positive weight.
I want to find a path between any 2 given nodes such that the sum of the values associated with the path nodes is maximum while at the same time the path length is within a given threshold value.
The solution should be "global", meaning that the path obtained should be optimal among all possible paths. I tried a linear programming approach but am not able to formulate it correctly.
Any suggestions or a different method of solving would be of great help.
Thanks!
If you are looking for an algorithm on a general graph, your problem is NP-complete. Assume the path length threshold is n-1 and each vertex has value 1; if you could solve your problem, you could tell whether the given graph has a Hamiltonian path. In fact, if your maximum-value path has value n, then you have a Hamiltonian path. I think you can use something like the Held-Karp relaxation to find a good solution.
This might not be perfect, but if the threshold value (T) is small enough, there's a simple algorithm that runs in O(n^3 T^2). It's a small modification of Floyd-Warshall.
d = int array with size n x n x (T + 1)
initialize all d[i][j][k] to -infty
for i in nodes:
    d[i][i][0] = value[i]
for e:(u, v) in edges:
    d[u][v][w(e)] = value[u] + value[v]
for t in 1 .. T:
    for k in nodes:
        for t' in 1 .. t-1:
            for i in nodes:
                for j in nodes:
                    d[i][j][t] = max(d[i][j][t],
                                     d[i][k][t'] + d[k][j][t-t'] - value[k])
The result is the pair (i, j) with the maximum d[i][j][t] for all t in 0..T
EDIT: this assumes that the paths are allowed to be not simple, they can contain cycles.
EDIT2: This also assumes that if a node appears more than once in a path, it will be counted more than once. This is apparently not what OP wanted!
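Here is a direct Python transcription of that DP, in case it helps (integer edge weights are assumed, and the graph encoding and function names are my own); as the edits say, walks may revisit vertices and revisited vertices are counted again:

def max_value_walk(n, edges, value, T):
    NEG = float("-inf")
    # d[i][j][t] = best total node value over walks from i to j of total edge weight t
    d = [[[NEG] * (T + 1) for _ in range(n)] for _ in range(n)]
    for i in range(n):
        d[i][i][0] = value[i]
    for u, v, w in edges:                        # undirected: seed both directions
        if w <= T:
            d[u][v][w] = max(d[u][v][w], value[u] + value[v])
            d[v][u][w] = max(d[v][u][w], value[u] + value[v])
    for t in range(1, T + 1):
        for k in range(n):
            for tp in range(1, t):
                for i in range(n):
                    for j in range(n):
                        if d[i][k][tp] != NEG and d[k][j][t - tp] != NEG:
                            cand = d[i][k][tp] + d[k][j][t - tp] - value[k]
                            if cand > d[i][j][t]:
                                d[i][j][t] = cand
    # best value of a walk between two given nodes within the length threshold
    def best(s, g):
        return max(d[s][g][t] for t in range(T + 1))
    return best

# Toy usage: values [5, 1, 7], edges 0-1 (weight 1) and 1-2 (weight 2), threshold 3.
best = max_value_walk(3, [(0, 1, 1), (1, 2, 2)], [5, 1, 7], T=3)
print(best(0, 2))                                # 13: the walk 0-1-2 has length 3 and value 5+1+7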
Integer program (this may be a good idea or maybe not):
For each vertex v, let x_v be 1 if vertex v is visited and 0 otherwise. For each arc a, let y_a be the number of times arc a is used. Let s be the source and t be the destination. The objective is
maximize Σ_v value(v) · x_v.
The constraints are
Σ_a value(a) · y_a ≤ threshold
∀v: Σ_{a with head v} y_a − Σ_{a with tail v} y_a = −1 if v = s; 1 if v = t; 0 otherwise (conserve flow)
∀v ≠ s: x_v ≤ Σ_{a with head v} y_a (must enter a vertex to visit it)
∀v: x_v ≤ 1 (visit each vertex at most once)
∀v ∉ {s, t}, ∀ cuts S that separate vertex v from {s, t}: x_v ≤ Σ_{a : tail(a) ∉ S ∧ head(a) ∈ S} y_a (benefit only from vertices not on isolated loops).
To solve, do branch and bound with the relaxation values. Unfortunately, the last group of constraints are exponential in number, so when you're solving the relaxed dual, you'll need to generate columns. Typically for connectivity problems, this means using a min-cut algorithm repeatedly to find a cut worth enforcing. Good luck!
If you just add the weight of a node to the weights of its outgoing edges, you can forget about the node weights. Then you can use any of the standard algorithms for the shortest path problem.
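A quick sketch of that transformation (my own encoding), for a plain minimum-cost path where both nodes and edges carry costs; note this addresses the shortest-path variant, not the maximisation-under-threshold problem from the question:

import heapq

def fold_node_weights(adj, node_cost):
    # adj: {u: [(v, edge_cost), ...]} -> new edge cost = edge_cost + node_cost[u]
    return {u: [(v, w + node_cost[u]) for v, w in nbrs] for u, nbrs in adj.items()}

def cheapest_path_cost(adj, node_cost, s, t):
    folded = fold_node_weights(adj, node_cost)
    dist = {v: float("inf") for v in adj}
    dist[s] = 0
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in folded[u]:
            if dist[v] > d + w:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist[t] + node_cost[t]     # folding covered every node on the path except t

adj = {"s": [("a", 1)], "a": [("t", 1)], "t": []}
print(cheapest_path_cost(adj, {"s": 2, "a": 3, "t": 4}, "s", "t"))   # 11 = (1+1) edges + (2+3+4) nodes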
