I have a directed graph, that looks like this:
I want to find the cheapest path from Start to End where the orange dotted lines are all required for the path to be valid.
The natural shortest path would be: Start -> A -> B -> End with the resultant cost = 5, but we have not met all required edge visits.
The path I want to find (via a general solution) is Start -> A -> B -> C -> D -> B -> End where the cost = 7 and we have met all required edge visits.
Does anyone have any thoughts on how to require such edge traversals?
Let R be the set of required edges and F = |R|. Let G be the input graph, t (resp. s) the starting (resp. ending) point of the requested path.
Preprocessing: A bunch of Dijkstra's algorithm runs...
The first step is to create another graph. This graph will have exactly F+2 vertices:
One for each edge in R
One for the starting point t of the path you want to compute
One for the ending point s of the path you want to compute
To create this graph, you will have to do the following:
Remove every edge in R from G.
For each edge E = (b,e) in R:
Compute the shortest path from t to b and the shortest path from e to s. If they exist, add an edge from t to E and an edge from E to s in the "new graph", each weighted by the length of the corresponding shortest path.
For each edge E' = (b', e') in R \ {E}:
Compute the shortest path from e to b'. If it exists, add an edge from E to E' in the new graph, weighted by the length of that shortest path.
Attach each computed path as a payload to the corresponding edge of the new graph.
The complexity to build this graph is O((F+2)².(E+V).log(V)) where E (resp. V) is the number of edges (resp. vertices) in the original graph.
Exhaustive search for the best possible path
From this point, we have to find the shortest Hamiltonian Path in the newly created graph. Unfortunately, this task is a hard problem. We have no better way than exploring every possible path. But that doesn't mean we can't do it cleverly.
We will perform the search using backtracking. We can achieve this by maintaining two sets:
The list of currently explored vertices: K (K for Known)
The list of currently unknown vertices: U (U for Unknown)
Before digging in the algorithm definition, here are the main ideas. We cannot do anything else than exploring the whole space of possible paths in the new graph. At each step, we have to make a decision: which edge do we take next? This leads to a sequence of decisions until we cannot move anymore or we reached s. But now we need to go back and cancel decisions to see if we can do better by changing a direction. To cancel decisions we proceed like this:
Every time we are stuck (or found a path), we cancel the last decision we made
Each time we take a decision at some point, we keep track of which decision, so when we get back to this point, we know not to take that very same decision and explore the others that are available.
We can be stuck because:
We found a path.
We cannot move further (there is no edge we can explore or the only one we could take increases the current partial path too much -- its length becomes higher than the length of the current best path found).
The final algorithm can be summed up in this fashion: (I give an iterative implementation, one can find a recursive implementation a tad easier and clearer)
Let K ← [], L[0..R+1] ← [] and U ← V (where V is the set of every vertex in the working graph minus the starting and ending vertices t and s). Finally let l ← i ← 0 and best_path_length ← ∞ and best_path ← []
While (i ≥ 0):
While U ≠ []
c ← U.popFront() (we take the head of U)
L[i].pushBack(c)
If i == R+1 AND weight(cur_path.back(), s) + l < best_path_length:
best_path_length ← weight(cur_path.back(), s) + l
best_path ← cur_path
If there is an edge e between K.tail() and c, and weight(e) + l < best_path_length: (if K is empty, then replace K.tail() with t in the previous statement)
K.pushBack(c)
i ← i+1
l ← weight(e) + l
cur_path.pushBack(c)
Concatenate L[i] at the end of U
L[i] ← []
i ← i-1
cur_path.popBack()
At the end of the while loop (while (i ≥ 0)), best_path will hold the best path (in the new graph). From there you just have to get the edges' payload to rebuild the path in the original graph.
Trying to understand graphs and having a really hard time with this. I know how to find the shortest path, but not sure how you can find the shortest cycle and still make it in O(n+m) time?
the shortest cycle that contains e
BFS is perfect for that. A cycle will be the goal. Time complexity is the same.
You want something like this (edited from Wikipedia):
Cycle-With-Breadth-First-Search(Graph g, Edge e):
    remove e from E
    root is b where e = (a,b)
    create empty set S
    create empty queue Q
    add root to S
    root.parent = a
    Q.enqueue(root)
    while Q is not empty:
        current = Q.dequeue()
        if current = a:
            return current   (follow the parent pointers from a back to b to read off the cycle)
        for each node n that is adjacent to current:
            if n is not in S:
                add n to S
                n.parent = current
                Q.enqueue(n)
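A runnable version of the same idea might look like the sketch below. It assumes an undirected graph stored as a dict of neighbour sets; the function name `shortest_cycle_through` and the example graphs are made up for illustration.

```python
from collections import deque

def shortest_cycle_through(adj, a, b):
    """Shortest cycle containing the undirected edge (a, b), or None.

    adj maps each node to the set of its neighbours. The result is the list of
    nodes from a back to b; the edge (b, a) closes the cycle.
    """
    parent = {b: None}
    queue = deque([b])
    while queue:
        current = queue.popleft()
        if current == a:
            break
        for n in adj[current]:
            if {current, n} == {a, b}:
                continue  # the edge e itself is treated as removed
            if n not in parent:
                parent[n] = current
                queue.append(n)
    if a not in parent:
        return None  # without e, a cannot be reached from b: no cycle uses e
    path, node = [], a
    while node is not None:
        path.append(node)
        node = parent[node]
    return path

square = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}
cycle = shortest_cycle_through(square, 1, 2)
```

BFS visits every node and edge at most once, so the running time stays O(n+m).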
For more info about cycles and BFS read this link
https://stackoverflow.com/a/4464388/6782134
Given two vertices u and v in G = (V,E) and a positive integer k, describe an algorithm to decide whether there exist k edge-disjoint paths from u to v. If the answer to the decision problem is yes, describe how to compute a set of k edge-disjoint paths.
Solution: Run max flow from u to v (giving every edge in the graph G a capacity of 1, so that one edge can be part of only one path from u to v) and get the value of the flow. If the value of the flow is at least k, then the answer to the decision problem is yes.
Now, for finding all such paths, I find the min cut by doing BFS from u, which gives me a partition of the vertices into two sets, one on each side of the min cut.
Do I then need to do a DFS from u to v again, looking for all the paths that use only the vertices in the two partition sets that I got from the min cut?
Or is there a cleaner way to get all the k edge-disjoint paths?
Once you have the flow you can extract the edge disjoint paths by following the flow.
The start node will have a flow of k leaving u along k edges.
For each of these edges you can keep moving in the direction of outgoing flow to extract the path until you reach v. All you need to do is to mark the edges you have already used to avoid duplicating edges.
Repeat for each of the k units of flow leaving u to extract all k paths.
Pseudocode
repeat k times:
    set x to start node
    set path to [x]
    while x is not equal to end node:
        find an edge from x which has flow>0, let y be the vertex at the far end
        decrease flow from x->y by 1 unit
        append y to path
        set x equal to y
    print path
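The pseudocode above translates fairly directly into Python. The sketch below assumes the max flow has already been computed and is given as a dict mapping each directed edge `(x, y)` to its units of flow; that representation, the function name `extract_paths` and the example flow are assumptions for illustration.

```python
def extract_paths(flow, source, sink, k):
    """Peel k edge-disjoint source-to-sink paths off a unit-capacity flow.

    flow maps (x, y) to the flow on edge x->y. It must be a valid flow of
    value at least k, otherwise the search for an outgoing edge fails.
    """
    remaining = dict(flow)  # work on a copy so the caller's flow is untouched
    paths = []
    for _ in range(k):
        x = source
        path = [x]
        while x != sink:
            # Any outgoing edge that still carries flow will do.
            y = next(v for (u, v), f in remaining.items() if u == x and f > 0)
            remaining[(x, y)] -= 1  # mark this unit of flow as used
            path.append(y)
            x = y
        paths.append(path)
    return paths

flow = {("u", "a"): 1, ("a", "v"): 1, ("u", "b"): 1, ("b", "v"): 1}
paths = extract_paths(flow, "u", "v", 2)
```

Scanning the whole dict for an outgoing edge keeps the sketch short; an adjacency list of edges with positive flow would avoid the repeated scan.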
A slightly more theoretical question, but here it is nonetheless.
Setting
Let:
UCYLE = { ⟨G⟩ : G is an undirected graph that contains a simple cycle }.
My Solution
We show UCYLE is in L by constructing an algorithm M that decides UCYLE using logarithmic space.
M = "On input ⟨G⟩ where G = (V,E)
For each v_i in V, for each v_j in Neighbor(v_i), store the current v_i and v_j
Traverse the edge (v_i,v_j) and then follow all possible paths through G using DFS.
If we encounter v_k in Neighbor(v_i) / {v_j} so that there is an edge (v_i,v_k) in E, then ACCEPT. Else REJECT."
First we claim M decides UCYLE. If there exists a cycle in $G$, then it must start and end on some vertex $v_i$; step one of $M$ tries all such $v_i$'s and therefore must find the desired vertex. Next, suppose the cycle starts at $v_i$; then there must exist a starting edge $(v_i,v_j)$ so that if we follow the cycle, we come back to $v_i$ through a different edge $(v_k,v_i)$, so we accept in step three. Since the graph is undirected, we can always come back to $v_i$ through $(v_i,v_j)$, but $M$ does not accept this case. By construction, neither does $M$ accept if we come upon some $v_k \in Neighbor(v_i) \setminus \{v_j\}$ but there is no edge from $v_k$ to $v_i$.
Now we show M runs in logarithmic space. First, if the vertices are labeled $1,\ldots,n$ where $|V| = n$, then it requires $\log(n)$ bits to specify each $v_i$. Next, note that in $M$ we only need to keep track of the current $v_i$ and $v_j$, so $M$ uses $2\log(n) = O(\log n)$ space, which puts UCYLE in L.
My Problem
My problem is how do you perform DFS on the graph in $\log(n)$ space. For example, in the worst case where each vertex has degree $n$, you'd have to keep a counter of which vertex you took on a particular path, which would require $n \log(n)$ space.
The state you maintain as you search is four vertices: (v_i, v_j, prev, current).
The next state is: (v_i, v_j, current, v) where v is the next neighbour of current after prev (wrapping back to the first if prev is the numerically last neighbour of current).
You stop when current is a neighbour of v_i, and report a cycle if it is not v_j.
In pseudo-code, something like this:
for v_i in vertices
    for v_j in neighbours(v_i)
        current, prev = v_j, v_i
        repeat
            idx = neighbours(current).index(prev)
            idx = (idx + 1) % len(neighbours(current))
            current, prev = neighbours(current)[idx], current
        until current adjacent to v_i
        if current != v_j
            return FOUND_A_CYCLE
return NO_CYCLES_EXIST
Intuitively, this is saying: for each point in a maze, and for each corridor from that point, follow the left-hand wall. If, when you can see the start point again, it is not through the original corridor, then you've found a cycle.
While it's easy to see that this algorithm uses O(log n) space, there's some proof necessary to show that this algorithm terminates.
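Here is a runnable Python sketch of the wall-following walk. Note that it uses a slightly different stopping rule from the pseudocode above, which is my own assumption: the walk continues until it steps back onto v_i, and a cycle is reported if the step back did not come from v_j. The search state is still only (v_i, v_j, prev, current); the adjacency-list dict and the name `has_cycle` are illustrative, and Python's own lists are of course not a log-space machine.

```python
def has_cycle(adj):
    """Decide whether a simple undirected graph contains a cycle.

    adj maps each vertex to a list of its neighbours in a fixed order.
    """
    for v_i in adj:
        for v_j in adj[v_i]:
            prev, current = v_i, v_j
            # Follow the "wall": always leave through the neighbour that comes
            # right after the one we arrived from, wrapping around at the end.
            while current != v_i:
                nbrs = adj[current]
                idx = (nbrs.index(prev) + 1) % len(nbrs)
                prev, current = current, nbrs[idx]
            # In a forest the walk can only come back over the edge it left by.
            if prev != v_j:
                return True
    return False

path_graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
triangle = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
```

The inner loop terminates because the step (prev, current) -> (current, next) is a permutation of the directed edges, so the walk must eventually return to the edge it started from, and just before that it stands on v_i.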
I've known the algorithm to find the diameter of a tree mentioned here for quite some time:
Select a random node A
Run BFS from this node to find the node furthest from A. Name this node S.
Now run BFS starting from S, find the furthermost node from S, name it D.
Path between S and D is diameter of the tree.
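For reference, the four steps can be sketched in Python like this. The adjacency-list dict, the function names `farthest` and `tree_diameter`, and the example tree are assumptions for illustration; edges are unweighted, so BFS distance is the number of edges.

```python
from collections import deque

def farthest(adj, start):
    """Return (node, distance) for a node furthest from start in a tree."""
    dist = {start: 0}
    queue = deque([start])
    last = start
    while queue:
        u = queue.popleft()
        last = u  # BFS dequeues in non-decreasing distance, so the last one is furthest
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return last, dist[last]

def tree_diameter(adj, any_node):
    s, _ = farthest(adj, any_node)   # step 2: furthest node from the arbitrary start
    d, length = farthest(adj, s)     # step 3: furthest node from S
    return s, d, length              # step 4: the S..D path is a diameter

# Hypothetical tree with edges 1-2, 2-3, 3-4, 2-5.
tree = {1: [2], 2: [1, 3, 5], 3: [2, 4], 4: [3], 5: [2]}
s, d, length = tree_diameter(tree, 2)
```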
But why does it work?
I would accept both Ivan's and coproc's answer if I can. These are 2 very different approaches that both answer my question.
say S = [A - B - C - D - ... X - Y - Z] is the diameter of the tree.
consider each node in S, say #, start from it and go "away" from the diameter, there won't be a longer chain than min(length(#, A), length(#, Z)).
So a DFS from any node in the tree will end at A or Z, i.e. one end of the diameter; a second DFS from that end will of course lead you to the other end of the diameter.
refer to this
Suppose you've completed steps 1 and 2 and have found S, and that there is no diameter in the tree that includes S. Pick a diameter PQ of the tree. You basically have to check the possible cases and in all of them, find that either PS or SQ is at least as long as PQ - which would be a contradiction.
In order to systematically check all cases, you can assume that the tree is rooted at A. Then the shortest path between any two vertices U and V is calculated in the following way - let W be the lowest common ancestor of U and V. Then the length of UV is equal to the sum of the distances between U and W and between V and W - and, in a rooted tree, these distances are just differences in the levels of the nodes (and S has a maximum level in this tree).
Then analyze all possible positions S could take with respect to the subtree rooted at W (lowest common ancestor of P and Q) and the vertices P and Q. For example, the first case is simple - S is not in the subtree rooted at W. Then, we can trivially improve the path by selecting the one of P and Q that is more distant to the root, and connecting it to the S. The rest of the cases are similar.
This algorithm works for any acyclic graph (a tree being a special acyclic graph in that it has a root).
A proof can be constructed by choosing any two additional points S2 and D2 and showing that their distance d(S2,D2) ≤ d(S,D). From the algorithm we know
by step 2: d(A,S)≥d(A,D), d(A,S)≥d(A,S2), d(A,S)≥d(A,D2) and
by step 3: d(S,D)≥d(S,A), d(S,D)≥d(S,S2), d(S,D)≥d(S,D2).
By distinguishing at most 5 cases (e.g. the paths SD and S2D2 have no edge in common, the paths SD and S2D2 have edges in common and A is connected to the edges running to S, etc. see image below) one can decompose the above distances into sub-paths and rewrite the inequalities based on the sub-paths. The conclusion follows from simple algebra. The details are left to the reader as an exercise. :-)
A few lemmas/facts before we get started with proof.
T is a tree so there is exactly 1 path between any 2 pair of vertices.
If S--D is the diameter then a BFS with source as S (or D) will end up giving D (or S) the largest distance. (By definition of diameter)
Also lets define |XY| to be the length of the path X--Y.
Define |XX| = 0.
Let A be the random node selected by the algorithm.
After Step 2, let the furthest node found be P.
If P is either S or D then using Lemma 2 we are done. So we must show that P has to be either S or D.
Claim : If S--D is the diameter, then P is either S or D.
Proof: I am going to prove the above by proving the Contrapositive. The proof is for a tree with a unique diameter but it should work with minor changes (mostly the equalities) for non-unique diameters too.
If P is neither S nor D then S--D is not the diameter.
Assume P is neither S nor D.
Case 1: The Path A--P intersects S--D
Let the point of intersection be K. We know that BFS marked P as the farthest node from A and from Lemma 1.
|AP| > |AS|
|AK| + |KP| > |AK| + |KS|
Therefore we get |KP| > |KS|.
Similarly |KP| > |KD|.
Now we consider the path SP
|SP| = |SK| + |KP|
|SP| > |SK| + |KD|
|SP| > |SD|
So SP is longer than the diameter which means SD is NOT the diameter.
Case 2: The Path A--P does NOT intersect S--D
Now we know BFS marked P as the farthest node. So we have
|AP| > |AD|
|AP| > |AS|
We can write |AD| = |AK| + |KD| where K is one of the vertices in the diameter (including S and D). Similarly |AS| = |AK| + |KS|.
Without loss of generality assume |AD|>=|AS|
|AK| + |KD| >= |AK| + |KS|
|KD| >= |KS|
Now consider the path PD
|PD| = |AP| + |AD|
|PD| = |AP| + |AK| + |KD|
|PD| > |AP| + |KD| (|AK| > 0 since A cannot be on the diameter)
|PD| > |KD| + |KD| (|AP| > |KD|)
|PD| > |SK| + |KD| (|KD| >= |KS|)
|PD| > |SD|
So SD is not the diameter and hence the claim.
Let the set s represent the nodes along the diameter of the tree, with A and Z being the end nodes, so that the distance from A to Z is the diameter. For any node n that is a member of s, the longest possible path from n will end in either A or Z. Now if you pick a random node v in the tree, it either is a member of the set, or it has a path to a node n in this set. Since the longest path from n ends at either A or Z, and the path from v to n cannot be longer than either the path from n to A or the path from n to Z (if it were, then v would have to be a member of the set), running BFS from any node v will first find either A or Z, and the subsequent call will find the complementary end point. Not a math girl, just throwing out thoughts.
I have an undirected graph. For now, assume that the graph is complete. Each node has a certain value associated with it. All edges have a positive weight.
I want to find a path between any 2 given nodes such that the sum of the values associated with the path nodes is maximum while at the same time the path length is within a given threshold value.
The solution should be "global", meaning that the path obtained should be optimal among all possible paths. I tried a linear programming approach but am not able to formulate it correctly.
Any suggestions or a different method of solving would be of great help.
Thanks!
If you are looking for an algorithm for general graphs, your problem is NP-complete. Assume the path length threshold is n-1 and each vertex has value 1. If you could solve your problem, you could decide whether the given graph has a Hamiltonian path: if your value-maximizing path has value n, then it is a Hamiltonian path. I think you can use something like the Held-Karp relaxation to find a good solution.
This might not be perfect, but if the threshold value (T) is small enough, there's a simple algorithm that runs in O(n^3 T^2). It's a small modification of Floyd-Warshall.
d = int array with size n x n x (T + 1)
initialize all d[i][j][k] to -infty
for i in nodes:
    d[i][i][0] = value[i]
for e:(u, v) in edges:
    d[u][v][w(e)] = value[u] + value[v]
for t in 1 .. T
    for k in nodes:
        for t' in 1..t-1:
            for i in nodes:
                for j in nodes:
                    d[i][j][t] = max(d[i][j][t],
                                     d[i][k][t'] + d[k][j][t-t'] - value[k])
The result is the pair (i, j) with the maximum d[i][j][t] for all t in 0..T
EDIT: this assumes that the paths are allowed to be not simple, they can contain cycles.
EDIT2: This also assumes that if a node appears more than once in a path, it will be counted more than once. This is apparently not what OP wanted!
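A runnable sketch of this dynamic program is given below, under the same caveats as the two edits above (walks may repeat nodes, and a repeated node is counted every time). It additionally assumes integer edge weights, nodes numbered 0..n-1 and an undirected graph; the function names and the example data are made up for illustration.

```python
import math

def best_value_walks(values, edges, T):
    """d[i][j][t] = best total node value of a walk from i to j of length exactly t.

    values[i] is the value of node i; edges is a list of undirected (u, v, w)
    triples with positive integer weights w.
    """
    n = len(values)
    neg = -math.inf
    d = [[[neg] * (T + 1) for _ in range(n)] for _ in range(n)]
    for i in range(n):
        d[i][i][0] = values[i]
    for u, v, w in edges:
        if w <= T:
            both = values[u] + values[v]
            d[u][v][w] = max(d[u][v][w], both)
            d[v][u][w] = max(d[v][u][w], both)
    for t in range(1, T + 1):
        for k in range(n):
            for t1 in range(1, t):
                for i in range(n):
                    left = d[i][k][t1]
                    if left == neg:
                        continue  # no walk from i to k of length t1
                    for j in range(n):
                        # values[k] is counted in both halves, so subtract it once
                        cand = left + d[k][j][t - t1] - values[k]
                        if cand > d[i][j][t]:
                            d[i][j][t] = cand
    return d

def best_between(d, i, j):
    """Best value over all walk lengths 0..T between two given nodes."""
    return max(d[i][j])

values = [1, 5, 2]
edges = [(0, 1, 2), (1, 2, 2), (0, 2, 1)]
table = best_value_walks(values, edges, 4)
```

With this data, `best_between(table, 0, 2)` picks the walk 0 -> 1 -> 2 of length 4, which collects all three node values.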
Integer program (this may be a good idea or maybe not):
For each vertex v, let x_v be 1 if vertex v is visited and 0 otherwise. For each arc a, let y_a be the number of times arc a is used. Let s be the source and t be the destination. The objective is
maximize ∑_v value(v) x_v.
The constraints are
∑_a weight(a) y_a ≤ threshold
∀v, ∑_{a has head v} y_a - ∑_{a has tail v} y_a = -1 if v = s; 1 if v = t; 0 otherwise (conserve flow)
∀v ≠ s, x_v ≤ ∑_{a has head v} y_a (must enter a vertex to visit)
∀v, x_v ≤ 1 (visit each vertex at most once)
∀v ∉ {s, t}, ∀ cuts S that separate vertex v from {s, t}, x_v ≤ ∑_{a : tail(a) ∉ S ∧ head(a) ∈ S} y_a (benefit only from vertices not on isolated loops).
To solve, do branch and bound with the relaxation values. Unfortunately, the last group of constraints are exponential in number, so when you're solving the relaxed dual, you'll need to generate columns. Typically for connectivity problems, this means using a min-cut algorithm repeatedly to find a cut worth enforcing. Good luck!
If you just add the weight of a node to the weights of its outgoing edges, you can forget about the node weights. Then you can use any of the standard algorithms for the shortest path problem.