I'm currently trying to follow pseudocode for Dijkstra's Algorithm, but I'm having difficulty understanding what one of the lines means.
DijkstrasAlgorithm(G, w, s)
    InitializeSingleSource(G, s)
    S = 0
    Q = G.V
    while Q != 0
        u = ExtractMin(Q)
        S = S∪{u}
        for each vertex v ∈ G.Adj[u]
            Relax(u, v, w)
This part right here "S = S∪{u}" is what's confusing me. I'm not sure what S is supposed to be equal to. Could someone explain? Thanks!
That’s the set union operator. S here is the set of all nodes for which the shortest path has been computed, and this line means “add the node u to that set.”
Mechanically, S ∪ {u} is the set consisting of everything already in S, plus the node u. That’s why S = S ∪ {u} means to add u to S.
(As a note, I think the pseudocode has a typo where S is initialized. It was probably meant to be initialized to the empty set ∅ rather than the number 0.)
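To make the role of S concrete, here is a minimal Python sketch of the same loop. It is only an illustration under a few assumptions that are not in the pseudocode: the graph is a dict `adj` mapping each vertex to a list of `(neighbor, weight)` pairs, weights are non-negative, and the queue starts with just the source and skips stale entries instead of starting with every vertex.

```python
import heapq
from math import inf

def dijkstra(adj, s):
    """adj maps each vertex to a list of (neighbor, weight) pairs; weights are non-negative."""
    d = {v: inf for v in adj}      # InitializeSingleSource
    d[s] = 0
    S = set()                      # S = ∅, the vertices whose distance is final
    Q = [(0, s)]                   # priority queue keyed by tentative distance
    while Q:
        du, u = heapq.heappop(Q)   # ExtractMin
        if u in S:
            continue               # stale queue entry; u was settled earlier
        S = S | {u}                # S = S ∪ {u}
        for v, w in adj[u]:
            if d[v] > du + w:      # Relax(u, v, w)
                d[v] = du + w
                heapq.heappush(Q, (d[v], v))
    return d, S

adj = {"a": [("b", 1), ("c", 4)], "b": [("c", 2)], "c": []}
d, S = dijkstra(adj, "a")
```

In everyday Python you would write `S.add(u)`; the union form is kept here only to mirror the line from the question.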
Dijkstra’s algorithm is a pretty tough one to understand purely from pseudocode. I recommend checking out a tutorial somewhere so that you have a high-level intuition for what’s going on. It’s much easier to understand this pseudocode by mapping the contents onto your conceptual understanding.
In the DIJKSTRA pseudo-code in chapter 24, page 658 of CLRS, Third Edition, the inner loop relaxes every edge leaving the newly added vertex. Why is relaxing allowed on edges that lead to vertices already dequeued from the queue and added to the shortest path tree?
while(Q not empty){
    u = extractMin from Q;
    Add u to the shortest path tree;
    for each vertex v adjacent to u
        relax(u,v,w)
}
Why is the inner loop not checking if the vertex is already part of Shortest path tree like,
while(Q not empty){
    u = extractMin from Q;
    Add u to the shortest path tree;
    for each vertex v adjacent to u
        if v is in Q
            then relax(u,v,w)
}
Which is the correct approach?
The first thing relax does is to check
if v.d > u.d + w(u,v)
If v is already on the shortest path tree, v.d is already final and no larger than u.d, so with non-negative edge weights the check always fails and relax does nothing. An "if v is in Q" check would therefore be redundant.
However, if "if v is in Q" is a significantly faster operation than "if v.d > u.d + w(u,v)" in a concrete implementation of the algorithm, including it may be a useful optimization.
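A small sketch of that check, using a hypothetical snapshot of the algorithm's state (the dicts `d`, `pred` and `w` and their values are made up for illustration). It shows relax doing nothing when the target vertex is already settled.

```python
def relax(u, v, w, d, pred):
    """Return True if the edge (u, v) improved v's tentative distance."""
    if d[v] > d[u] + w[(u, v)]:
        d[v] = d[u] + w[(u, v)]
        pred[v] = u
        return True
    return False

# Hypothetical state partway through a run: s and a are settled, b was just extracted.
d = {"s": 0, "a": 2, "b": 5}
pred = {"s": None, "a": "s", "b": "a"}
w = {("b", "a"): 1, ("b", "s"): 4}

# Settled vertices satisfy d[v] <= d[b], and weights are non-negative,
# so d[b] + w(b, v) can never be smaller than d[v].
changed = [relax("b", v, w, d, pred) for v in ("a", "s")]
```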
Both approaches are functionally correct. However, your version is less optimal than the CLRS version.
You don't want to do "if v is in Q" because that's an O(log n) operation, whereas "if v.d > u.d + w(u, v)" is O(1). At the beginning of the algorithm, Q contains all the vertices in the graph. So for, say, a very large, sparsely connected graph, your version would end up being much worse than the CLRS version.
Your question, however, is not entirely without merit. The explanation for Dijkstra's algorithm in CLRS is a bit confusing, which is what actually brought me to this discussion thread. Looking at the pseudo-code on page 658:
DIJKSTRA(G, w, s)
1  INITIALIZE-SINGLE-SOURCE(G, s)
2  S = ∅
3  Q = G.V
4  while Q not empty
5      u = EXTRACT-MIN(Q)
6      add u to S
7      for each vertex v in G.Adj[u]
8          RELAX(u, v, w)
one wonders what is the point of maintaining S at all? If we do away with it entirely by removing lines 2 and 6, the algorithm still works, and after it's complete you can print the shortest path by following the predecessor pointers (already stored in each vertex) backwards through the graph (using PRINT-PATH(G, s, v) on page 601, as described on page 647). S seems to be used more as an explanation tool here, to illustrate the fact that Dijkstra is a greedy algorithm, but in an actual graph implementation, seems to me it would not be needed.
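As a rough illustration of why the predecessor pointers are enough, here is a sketch of path reconstruction. The function name `path_to` and the dict `pred` (standing in for each vertex's predecessor attribute after the algorithm finishes) are my own naming, not CLRS's.

```python
def path_to(pred, s, v):
    """Follow predecessor pointers from v back to s; return None if v is unreachable."""
    path = []
    while v is not None:
        path.append(v)
        if v == s:
            return path[::-1]      # collected backwards, so reverse it
        v = pred.get(v)
    return None

# Hypothetical predecessor pointers left behind by a finished run from "s".
pred = {"s": None, "a": "s", "b": "a", "c": None}
route = path_to(pred, "s", "b")
```

Nothing here consults S, which fits the view that S mainly serves the correctness argument.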
I'm a senior student learning algorithms for the informatics olympiad, and this is my first question on Stack Overflow.
In Tarjan's DFS, when computing lowlink(u):
    low[u] = min(low[u], low[v])  (v isn't visited)
or
    low[u] = min(low[u], dfn[v])  (v is still in the stack)
My question is: is it still OK to replace dfn[v] with low[v] in the second case? I know this is incorrect, but I failed to find a counter-example. Could anyone help explain this?
thx:)
It's correct, actually.
The proof of correctness depends on two properties of low. The first is that, for all v, there exists w reachable from v such that dfn[w] <= low[v] <= dfn[v]. The second is that, when determining whether v is a root, we have for all w reachable from v that low[v] <= dfn[w].
We can prove inductively that the first property still holds by the fact that, if there's a path from u to v and a path from v to w, then there's a path from u to w. As for the second, let low be the original array and low' be yours. It's not hard to show that, for all v, at all times, low'[v] <= low[v], so at the critical moment for v, for all w reachable from v, it holds that low'[v] <= low[v] <= dfn[w].
I imagine that the algorithm is presented the way it is to avoid the need to consider intermediate values of low.
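If you want to try this yourself, here is a sketch of Tarjan's algorithm with a switch between the two update rules. The flag name `use_low_for_stack_edges` and the dict-of-lists graph format are assumptions of mine, and the recursive form is only suitable for small graphs.

```python
def tarjan_scc(adj, use_low_for_stack_edges=False):
    """Return the SCCs of a directed graph given as {vertex: [successors]}."""
    dfn, low = {}, {}
    stack, on_stack = [], set()
    sccs = []
    counter = 0

    def dfs(u):
        nonlocal counter
        dfn[u] = low[u] = counter
        counter += 1
        stack.append(u)
        on_stack.add(u)
        for v in adj[u]:
            if v not in dfn:                 # v isn't visited
                dfs(v)
                low[u] = min(low[u], low[v])
            elif v in on_stack:              # v is still in the stack
                other = low[v] if use_low_for_stack_edges else dfn[v]
                low[u] = min(low[u], other)
        if low[u] == dfn[u]:                 # u is the root of an SCC
            comp = set()
            while True:
                x = stack.pop()
                on_stack.discard(x)
                comp.add(x)
                if x == u:
                    break
            sccs.append(frozenset(comp))

    for u in adj:
        if u not in dfn:
            dfs(u)
    return set(sccs)

graph = {0: [1], 1: [2], 2: [0, 3], 3: [4], 4: [3, 5], 5: []}
standard = tarjan_scc(graph)
variant = tarjan_scc(graph, use_low_for_stack_edges=True)
```

Agreement on one example proves nothing, of course; it is only a quick way to experiment with the claim above.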
I have an undirected graph. For now, assume that the graph is complete. Each node has a certain value associated with it. All edges have a positive weight.
I want to find a path between any 2 given nodes such that the sum of the values associated with the path nodes is maximum while at the same time the path length is within a given threshold value.
The solution should be "global", meaning that the path obtained should be optimal among all possible paths. I tried a linear programming approach but am not able to formulate it correctly.
Any suggestions or a different method of solving would be of great help.
Thanks!
If you are looking for an algorithm for general graphs, your problem is NP-complete. Assume every edge has weight 1, the path-length threshold is n-1, and each vertex has value 1. A solution to your problem then tells you whether the given graph has a Hamiltonian path: if the maximum-value path has value n, it visits every vertex, so it is a Hamiltonian path. I think you can use something like the Held-Karp relaxation to find a good solution.
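For very small graphs you can still get the global optimum by exhaustive search. The sketch below is my own illustration, not a standard routine: `adj` is assumed to map each vertex to a dict of `{neighbor: positive edge weight}`, and the running time is exponential.

```python
def best_simple_path(adj, value, s, t, threshold):
    """Exhaustively search simple s-t paths within the length threshold.

    Returns (best total value, path), or (None, None) if no path qualifies.
    """
    best = (None, None)

    def dfs(u, length, total, path, seen):
        nonlocal best
        if u == t:
            if best[0] is None or total > best[0]:
                best = (total, list(path))
            return
        for v, w in adj[u].items():
            if v not in seen and length + w <= threshold:
                seen.add(v)
                path.append(v)
                dfs(v, length + w, total + value[v], path, seen)
                path.pop()
                seen.remove(v)

    dfs(s, 0, value[s], [s], {s})
    return best

# Unit weights and unit values, as in the Hamiltonian-path argument above.
adj = {0: {1: 1, 3: 1}, 1: {0: 1, 2: 1}, 2: {1: 1, 3: 1}, 3: {0: 1, 2: 1}}
value = {v: 1 for v in adj}
full = best_simple_path(adj, value, 0, 3, threshold=3)
tight = best_simple_path(adj, value, 0, 3, threshold=2)
```

With threshold n-1 = 3 the best value equals n = 4, which is exactly the case where a Hamiltonian path from 0 to 3 exists.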
This might not be perfect, but if the threshold value (T) is small enough, there's a simple algorithm that runs in O(n^3 T^2). It's a small modification of Floyd-Warshall.
d = int array with size n x n x (T + 1)
initialize all d[i][j][k] to -infty
for i in nodes:
    d[i][i][0] = value[i]
for e:(u, v) in edges:
    d[u][v][w(e)] = value[u] + value[v]
for t in 1 .. T
    for k in nodes:
        for t' in 1..t-1:
            for i in nodes:
                for j in nodes:
                    d[i][j][t] = max(d[i][j][t],
                                     d[i][k][t'] + d[k][j][t-t'] - value[k])
The result is the pair (i, j) with the maximum d[i][j][t] for all t in 0..T
EDIT: this assumes that the paths are allowed to be not simple, they can contain cycles.
EDIT2: This also assumes that if a node appears more than once in a path, it will be counted more than once. This is apparently not what OP wanted!
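A direct Python transcription of the pseudocode above, as a sketch only. It assumes nodes are numbered 0..n-1, edge weights are positive integers, edges are undirected triples `(u, v, w)`, and, per the two EDIT notes, it scores walks in which a repeated node is counted every time it appears.

```python
from math import inf

def best_walk_values(n, value, edges, T):
    """d[i][j][t] = max total node value over walks from i to j of length exactly t."""
    d = [[[-inf] * (T + 1) for _ in range(n)] for _ in range(n)]
    for i in range(n):
        d[i][i][0] = value[i]
    for u, v, w in edges:
        if w <= T:
            both = value[u] + value[v]
            d[u][v][w] = max(d[u][v][w], both)   # undirected: fill both directions
            d[v][u][w] = max(d[v][u][w], both)
    for t in range(1, T + 1):
        for k in range(n):
            for t1 in range(1, t):
                for i in range(n):
                    for j in range(n):
                        # Join a walk i..k of length t1 with a walk k..j of
                        # length t - t1; k would otherwise be counted twice.
                        cand = d[i][k][t1] + d[k][j][t - t1] - value[k]
                        if cand > d[i][j][t]:
                            d[i][j][t] = cand
    return d

# Path graph 0 - 1 - 2 with unit weights.
d = best_walk_values(3, [1, 2, 3], [(0, 1, 1), (1, 2, 1)], T=2)
```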
Integer program (this may be a good idea or maybe not):
For each vertex v, let x_v be 1 if vertex v is visited and 0 otherwise. For each arc a, let y_a be the number of times arc a is used. Let s be the source and t be the destination. The objective is
maximize ∑_v value(v) x_v.
The constraints are
∑_a weight(a) y_a ≤ threshold
∀v: ∑_{a with head v} y_a - ∑_{a with tail v} y_a = -1 if v = s; 1 if v = t; 0 otherwise (conserve flow)
∀v ≠ s: x_v ≤ ∑_{a with head v} y_a (must enter a vertex to visit it)
∀v: x_v ≤ 1 (visit each vertex at most once)
∀v ∉ {s, t}, ∀ cuts S that separate vertex v from {s, t}: x_v ≤ ∑_{a with tail(a) ∉ S ∧ head(a) ∈ S} y_a (benefit only from vertices not on isolated loops).
To solve, do branch and bound with the relaxation values. Unfortunately, the last group of constraints are exponential in number, so when you're solving the relaxed dual, you'll need to generate columns. Typically for connectivity problems, this means using a min-cut algorithm repeatedly to find a cut worth enforcing. Good luck!
If you just add the weight of a node to the weights of its outgoing edges, you can forget about the node weights. Then you can use any of the standard algorithms for the shortest path problem.
I am trying to conceive a solution for problems like in the following example:
A != B
B != C
D != B
C != B
E != D
E != A
How many variables are true and how many are false? As far as I have found out, I should try to use breadth-first search, but my problem is where to start, and the fact that the graph will be a directed one (I am connecting xi to !xj where the equality relation exists). Can someone point me in the right direction?
It's a graph 2-coloring problem. Vertices: A, B, C, … There is an edge (u, v) in this undirected graph if and only if the constraint u != v is given.
2-coloring is one of the applications of the breadth-first search. See: http://en.wikipedia.org/wiki/Breadth-first_search#Testing_bipartiteness
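Here is a sketch of that BFS 2-coloring applied to the constraints from the question. The function name `two_color` and the way the result is reported (one pair of vertex sets per connected component) are my own choices.

```python
from collections import deque

def two_color(vertices, constraints):
    """Return a list of (side0, side1) vertex sets, one pair per connected component,
    or None if the constraints contain an odd cycle and cannot be satisfied."""
    adj = {v: set() for v in vertices}
    for u, v in constraints:
        adj[u].add(v)
        adj[v].add(u)
    color = {}
    components = []
    for start in vertices:
        if start in color:
            continue
        color[start] = 0
        sides = ({start}, set())
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]   # neighbors get the opposite value
                    sides[color[v]].add(v)
                    queue.append(v)
                elif color[v] == color[u]:
                    return None               # u != v cannot hold
        components.append(sides)
    return components

constraints = [("A", "B"), ("B", "C"), ("D", "B"), ("C", "B"), ("E", "D"), ("E", "A")]
components = two_color("ABCDE", constraints)
max_true = sum(max(len(a), len(b)) for a, b in components)
min_true = sum(min(len(a), len(b)) for a, b in components)
```

For this input there is a single component with sides {A, C, D} and {B, E}, so either three variables are true and two false, or the other way round.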
I don't think you need search at all here. Consider your constraints as a graph connecting vertices xi and xj iff there is a constraint xi = !xj. Take a connected component of the graph (i.e., one where a path exists connecting every pair of vertices). Assuming your constraints are consistent (i.e., don't simultaneously specify xi = xj and xi = !xj) then you can pick any vertex xi in the component and immediately work out whether any connected vertex xj is equal to xi or !xi. It's then straightforward to work out the assignments you need to maximise or minimise the number of true variables.
One of the assignments in my algorithms class is to design an exhaustive search algorithm to solve the clique problem. That is, given a graph of size n, the algorithm is supposed to determine if there is a complete sub-graph of size k. I think I've gotten the answer, but I can't help but think it could be improved. Here's what I have:
Version 1
input: A graph represented by an array A[0,...n-1], the size k of the subgraph to find.
output: True if a subgraph exists, False otherwise
Algorithm (in python-like pseudocode):
def clique(A, k):
    P = A x A x A //Cartesian product
    for tuple in P:
        if connected(tuple):
            return true
    return false
def connected(tuple):
    unconnected = tuple
    for vertex in tuple:
        for test_vertex in unconnected:
            if vertex is linked to test_vertex:
                remove test_vertex from unconnected
    if unconnected is empty:
        return true
    else:
        return false
Version 2
input: An adjacency matrix of size n by n, and k the size of the subgraph to find
output: All complete subgraphs in A of size k.
Algorithm (this time in functional/Python pseudocode):
//Base case: return all vertices in a list since each
//one is a 1-clique
def clique(A, 1):
    S = new list
    for i in range(0 to n-1):
        add i to S
    return S
//Get a tuple representing all the cliques where
//k = k - 1, then find any cliques for k
def clique(A,k):
    C = clique(A, k-1)
    S = new list
    for tuple in C:
        for i in range(0 to n-1):
            //make sure the ith vertex is linked to each
            //vertex in tuple
            for j in tuple:
                if A[i,j] != 1:
                    break
                //This means that vertex i makes a clique
                if j is the last element:
                    newtuple = (i | tuple) //make a new tuple with i added
                    add newtuple to S
    //Return the list of k-cliques
    return S
Does anybody have any thoughts, comments, or suggestions? This includes bugs I might have missed as well as ways to make this more readable (I'm not used to using much pseudocode).
Version 3
Fortunately, I talked to my professor before submitting the assignment. When I showed him the pseudo-code I had written, he smiled and told me that I did way too much work. For one, I didn't have to submit pseudo-code; I just had to demonstrate that I understood the problem. And two, he was wanting the brute force solution. So what I turned in looked something like this:
input: A graph G = (V,E), the size of the clique to find k
output: True if a clique does exist, false otherwise
Algorithm:
Find the Cartesian product V^k.
For each tuple in the result, test whether each vertex is connected to every other. If all are connected, return true and exit.
Return false and exit.
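For reference, a runnable sketch of those three steps in Python. The function name and the edge-list input format are assumptions for illustration; note that tuples with a repeated vertex have to be skipped, since the Cartesian product contains them.

```python
from itertools import product

def has_clique_bruteforce(vertices, edges, k):
    """Test every k-tuple in V^k; return True as soon as one forms a clique."""
    edge_set = {frozenset(e) for e in edges}
    for tup in product(vertices, repeat=k):      # the Cartesian product V^k
        if len(set(tup)) < k:
            continue                             # repeated vertex, not a k-clique
        if all(frozenset((tup[i], tup[j])) in edge_set
               for i in range(k) for j in range(i + 1, k)):
            return True
    return False

# A triangle 0-1-2 with an extra vertex 3 attached to 2.
V = [0, 1, 2, 3]
E = [(0, 1), (1, 2), (0, 2), (2, 3)]
found3 = has_clique_bruteforce(V, E, 3)
found4 = has_clique_bruteforce(V, E, 4)
```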
UPDATE: Added second version. I think this is getting better although I haven't added any fancy dynamic programming (that I know of).
UPDATE 2: Added some more commenting and documentation to make version 2 more readable. This will probably be the version I turn in today. Thanks for everyone's help! I wish I could accept more than one answer, but I accepted the answer by the person that's helped me out the most. I'll let you guys know what my professor thinks.
Some comments:
You only need to consider n-choose-k combinations of vertices, not all k-tuples (n^k of them).
connected(tuple) doesn't look right. Don't you need to reset unconnected inside the loop?
As the others have suggested, there are better ways of brute-forcing this. Consider the following recursive relation: A (k+1)-subgraph is a clique if the first k vertices form a clique and vertex (k+1) is adjacent to each of the first k vertices. You can apply this in two directions:
Start with a 1-clique and gradually expand the clique until you get the desired size. For example, if m is the largest vertex in the current clique, try to add vertex {m+1, m+2, ..., n-1} to get a clique that is one vertex larger. (This is similar to a depth-first tree traversal, where the children of a tree node are the vertices larger than the largest vertex in the current clique.)
Start with a subgraph of the desired size and check if it is a clique, using the recursive relation. Set up a memoization table to store results along the way.
(implementation suggestion) Use an adjacency matrix (0-1) to represent edges in the graph.
(initial pruning) Throw away all vertices with degree less than k-1, since every vertex of a k-clique has at least k-1 neighbors.
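A sketch combining three of the comments above: n-choose-k combinations instead of all k-tuples, a 0-1 adjacency matrix, and degree-based pruning. A vertex of a k-clique needs at least k-1 neighbors, so that is the cutoff used here. The function name `has_clique` is hypothetical.

```python
from itertools import combinations

def has_clique(adj, k):
    """adj is an n x n 0/1 adjacency matrix (symmetric, zero diagonal)."""
    n = len(adj)
    # Initial pruning: keep only vertices with at least k-1 neighbors.
    candidates = [v for v in range(n) if sum(adj[v]) >= k - 1]
    for subset in combinations(candidates, k):
        # A subset is a clique when every pair inside it is adjacent.
        if all(adj[u][v] for u, v in combinations(subset, 2)):
            return True
    return False

# A triangle 0-1-2 with an extra vertex 3 attached to 2.
adj = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
```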
I once implemented an algorithm to find all maximal cliques in a graph, which is a similar problem to yours. The way I did it was based on this paper: http://portal.acm.org/citation.cfm?doid=362342.362367 - it describes a backtracking solution which I found very useful as a guide, although I changed quite a lot from that paper. You'd need a subscription to access it, but I presume your university has one available.
One thing about that paper though is I really think they should have named the "not set" the "already considered set" because it's just too confusing otherwise.
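For readers without access, here is a sketch of a backtracking scheme of this kind. It assumes the linked paper is the Bron-Kerbosch algorithm (I believe it is, but I have not checked the link), and it is the plain version without pivoting. The `excluded` set plays the role of the paper's "not" set, i.e. the vertices already considered.

```python
def maximal_cliques(adj):
    """Yield every maximal clique of an undirected graph given as {vertex: set(neighbors)}."""
    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            yield set(clique)                 # nothing can extend it: maximal
            return
        for v in list(candidates):
            # Extend with v; only common neighbors of v remain eligible.
            yield from expand(clique | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}     # v is done...
            excluded = excluded | {v}         # ...so move it to "already considered"

    yield from expand(set(), set(adj), set())

# A triangle 0-1-2 with an extra vertex 3 attached to 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
found = {frozenset(c) for c in maximal_cliques(adj)}
```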
The algorithm "for each k-tuple of vertices, if it is a clique, then return true" works for sure. However, it's brute force, which is probably not what an algorithms course is searching for. Instead, consider the following:
Every vertex is a 1-clique.
For every 1-clique, every vertex that connects to the vertex in the 1-clique contributes to a 2-clique.
For every 2-clique, every vertex that connects to each vertex in the 2-clique contributes to a 3-clique.
...
For every (k-1)-clique, every vertex that connects to each vertex in the (k-1) clique contributes to a k-clique.
This idea might lead to a better approach.
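The level-by-level idea can be sketched as follows, assuming a 0-1 adjacency matrix and k >= 1. To avoid generating the same clique in several orders, each clique is only extended with vertices larger than its current maximum; that detail is my addition.

```python
def k_cliques(adj, k):
    """Return all k-cliques of a 0/1 adjacency matrix as increasing tuples (k >= 1)."""
    n = len(adj)
    cliques = [(v,) for v in range(n)]           # every vertex is a 1-clique
    for _ in range(k - 1):
        bigger = []
        for clique in cliques:
            # Only try vertices above the current maximum, so each clique
            # is produced exactly once.
            for v in range(clique[-1] + 1, n):
                if all(adj[v][u] for u in clique):
                    bigger.append(clique + (v,))
        cliques = bigger
    return cliques

# A triangle 0-1-2 with an extra vertex 3 attached to 2.
adj = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
```

A k-clique exists exactly when `k_cliques(adj, k)` is non-empty.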
It's amazing what typing things down as a question will show you about what you've just written. This line:
P = A x A x A //Cartesian product
should be this:
P = A^k //Cartesian product
What do you mean by A^k? Are you taking a matrix product? If so, is A the adjacency matrix (you said it was an array of n+1 elements)?
In setbuilder notation, it would look something like this:
P = {(x1, x2, ... xk) | x1 ∈ A and x2 ∈ A ... and xk ∈ A}
It's basically just a Cartesian product of A taken k times. On paper, I wrote it down as k being a superscript of A (I just now figured out how to do that using markdown).
Plus, A is just an array of each individual vertex without regard for adjacency.