I want to know the shortest-path distance between all pairs of nodes (e.g. via Dijkstra; specifically, I'm using networkx).
Then, when an edge is added to the graph, I want to update the distances without recomputing them from scratch.
How can this be done?
Thanks
It is possible without recomputing all shortest paths, but it is still fairly costly: O(n^2) per added edge.
So let's assume you have a distance matrix M of size n*n where each entry M_{i,j} contains the distance from node i to node j. M is assumed to be precalculated by some algorithm.
Now if a new edge e_{i,j} is added to the graph between node i and node j with cost w_{i,j}, you check if w_{i,j} < M_{i,j}. If not, then nothing has to be changed. But if it holds, then the shortest paths in the graph might improve.
Then for each node pair k, l you check if the path going through the new edge is shorter than the previously calculated one. This can be done by evaluating the following.
M_{k,l} > min (M_{k,i} + w_{i,j} + M_{j,l} , M_{k,j} + w_{j,i} + M_{i,l})
If this holds then you can replace M_{k,l} by min (M_{k,i} + w_{i,j} + M_{j,l} , M_{k,j} + w_{j,i} + M_{i,l})
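The formula above considers the new edge in both directions, so it applies to undirected (bi-directional) graphs; for a directed edge from i to j, keep only the first term, M_{k,i} + w_{i,j} + M_{j,l}.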
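As a minimal sketch of the update rule above, assuming M is stored as a plain list of lists (with float("inf") for unreachable pairs), that the new edge is undirected and that all weights are non-negative. The function name add_edge_update is made up for illustration and is not part of networkx.

```python
def add_edge_update(M, i, j, w):
    """Update the all-pairs distance matrix M in place after adding
    an undirected edge {i, j} with weight w. Runs in O(n^2)."""
    n = len(M)
    if w >= M[i][j]:
        return M  # the new edge cannot shorten any path
    old = [row[:] for row in M]  # read only pre-update distances
    for k in range(n):
        for l in range(n):
            # best path k -> l that crosses the new edge in either direction
            via = min(old[k][i] + w + old[j][l],
                      old[k][j] + w + old[i][l])
            if via < old[k][l]:
                M[k][l] = via
    return M


# Path graph 0 - 1 - 2 - 3 with unit weights, then add the edge {0, 3}.
M = [[0, 1, 2, 3],
     [1, 0, 1, 2],
     [2, 1, 0, 1],
     [3, 2, 1, 0]]
add_edge_update(M, 0, 3, 1)
```

The copy of the matrix costs another O(n^2) memory; it is there only to make it obvious that each entry is computed from the distances as they were before the edge was added.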
Edit 1
I firmly believe that Ω(n^2) is also a lower bound for this problem. Assume you have two disconnected areas of the graph, each containing n/2 vertices. If you then add a new edge connecting these areas, you have to update n/2 * n/2 shortest paths, which takes Ω(n^2) time.
Edit 2
A second idea is to exploit the above equation, but to first run through the graph in order to find all the vertices whose distances have to be updated. The sketch of the idea is as follows:
Start a Dijkstra from node i. Whenever you reach a vertex k, check whether M_{k, i} + w_{i, j} < M_{k, j}. If yes, add k to the set U of vertices that have to be updated. If not, you can stop exploring further paths following k, since no vertex "beyond" k will use e_{i, j} for its shortest path.
Then do the same for node j. And then perform the update of M for all pairs of vertices in U according to the ideas above.
Related
I'm solving this problem where we have a graph, and are trying to get from node 1 to node N. The edge weights are the "cost" and each edge also has a "flow" value. For any path from node 1 to node N, the total cost would be the sum of all the costs of the edges on the path, and the flow would be the minimum flow value among the edges. We want to maximize the ratio of flow/cost.
I had the idea to use Dijkstra to find the smallest cost path from 1 to N, and when I tried finding the path this way I realized I wasn't accounting for flow. I want to perform modified Dijkstra where I take into account the flow of each edge when calculating the best path, but I'm not sure how to do this.
Should I manipulate the edge costs by subtracting or adding extra flow, or would this not work because we are looking at the ratio?
I also tried finding every path through BFS, but there is a time constraint and I am unable to do that as well.
Could anyone give me some tips on how to solve this problem?
Edit:
An example is having 3 nodes, 1, 2, and 3. 1 and 2 have an edge cost of 2, and a flow of 4. 2 and 3 have an edge cost of 5, and a flow of 3. In this example, there is only one path from 1 to N. Its flow is min(3,4)=3 and its cost is 2+5=7. So the ratio would be 3/7. But in most cases we will have several possible paths.
Follow Dijkstra's algorithm and maintain for each node v a distance label D[v] (as usual), and additionally a flow label F[v]. The goal is to maximize the ratio F[v] / D[v]. The vertex u the algorithm should select next is the one which maximizes this ratio.
Then, during the relaxation of any incident edge e=(u,v), perform the following computation to see if the ratio of a new possible path from the starting vertex to v that uses u as an intermediate vertex is better than any previously found path.
// relaxing edge e = (u,v)
newDistance = D[u] + cost(e)
newFlow = min{ F[u], flow(e) }
if ( (newFlow / newDistance) > (F[v] / D[v]) )
    v.parent = u
    F[v] = newFlow
    D[v] = newDistance
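Because the ratio is not additive along a path, it is not obvious that a greedy relaxation on the ratio always finds the optimum. A different and commonly used technique, sketched here under assumptions, is to fix a minimum flow f, ignore every edge whose flow is below f, and run an ordinary Dijkstra on cost; doing this for every distinct flow value and keeping the best f/cost gives the answer. The sketch assumes undirected edges given as (u, v, cost, flow) tuples, positive costs and nodes numbered 1..n; the function name is made up for illustration.

```python
import heapq


def best_flow_cost_ratio(n, edges, source, target):
    """Return the maximum flow/cost ratio over all source-target paths,
    or None if the target is unreachable."""
    adj = {v: [] for v in range(1, n + 1)}
    for u, v, cost, flow in edges:
        adj[u].append((v, cost, flow))
        adj[v].append((u, cost, flow))

    def min_cost(min_flow):
        # Plain Dijkstra on cost, using only edges with flow >= min_flow.
        dist = {source: 0}
        heap = [(0, source)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist.get(u, float("inf")):
                continue  # stale heap entry
            if u == target:
                return d
            for v, cost, flow in adj[u]:
                if flow < min_flow:
                    continue
                nd = d + cost
                if nd < dist.get(v, float("inf")):
                    dist[v] = nd
                    heapq.heappush(heap, (nd, v))
        return None

    best = None
    for f in sorted({flow for _, _, _, flow in edges}):
        c = min_cost(f)
        if c is not None and c > 0:
            ratio = f / c
            if best is None or ratio > best:
                best = ratio
    return best


# The example from the question: a single path with ratio 3/7.
ratio = best_flow_cost_ratio(3, [(1, 2, 2, 4), (2, 3, 5, 3)], 1, 3)
```

This runs one Dijkstra per distinct flow value, so it is slower than a single pass, but each run is a standard shortest-path computation.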
I'm not completely sure, but you can just use your flow/cost ratio as a node weight. That's fine for Dijkstra's algorithm.
By the way, I did coursework on the same topic: I had length and speed costs, and using the length/speed ratio worked for me. You can see all the sources on GitHub.
Let's say I have a directed graph G(V,E) with edge costs (real numbers) ∈ (0,1). For a given i, I need to find all pairs of vertices (i,j) starting from i that "match". Two vertices (i,j) match if there is a directed path from i to j of length exactly k (k is a given number that is relatively small and can be considered constant) with cost >= C (C is a given number). The cost of a path is calculated as the product of the costs of its edges. For example, if a path from i to j of length 2 consists of edges e1 and e2, then CostOfPath = cost(e1)*cost(e2).
This has to be done in O(E+V*k). My idea is to modify the DFS algorithm, updating the distances from the given starting vertex i until the paths reach length k. If they don't, then we can't have a match. However, I am having a hard time finding what exactly I can modify in the DFS. Any ideas?
When you need to consider paths with a fixed number of edges in it, dynamic programming often comes to help (while other approaches often fail).
Let dp[v][j] denote the maximal cost of a path from the (fixed) vertex i to vertex v that has exactly j edges.
As starting values, you can set the values for j==1: dp[v][1] is the cost of the edge from i to v (or 0 if no such edge exists). Simpler still, you can start from j==0 instead: dp[i][0] is 1, while dp[v][0] is 0 for every v!=i.
Now, if you have the values for some j, it is easy to calculate the values for j+1:
dp[v][j+1] = max( dp[v'][j] * cost((v', v)) ), where the maximum is taken over all edges (v', v) entering v
This is very similar to the Bellman-Ford algorithm, except that the latter does not need to track the number of edges and can therefore use a one-dimensional array.
This gives you the solution in O((E+V)*k). Not exactly what you have requested, but I doubt that there exists solution in O(E+V*k).
(In the solution above, I assume that the constant C is positive, and so a zero cost path is equivalent to the path being absolutely absent. If you need, you can specifically account for the C==0 case.)
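A minimal sketch of this dynamic program, assuming vertices are numbered 0..n-1, edges are given as (u, v, cost) tuples and C is positive. Since only the layer for j is needed to compute the layer for j+1, the sketch keeps two one-dimensional arrays instead of the full dp[v][j] table; the function name is made up for illustration.

```python
def matching_vertices(n, edges, start, k, C):
    """Return the vertices j for which some path start -> j with exactly
    k edges has a cost product >= C. Runs in O((E + V) * k)."""
    dp = [0.0] * n      # dp[v] = best product over paths with the current edge count
    dp[start] = 1.0     # the empty path: zero edges, product 1
    for _ in range(k):
        nxt = [0.0] * n
        for u, v, cost in edges:
            if dp[u] > 0.0:  # 0 means "no such path"
                candidate = dp[u] * cost
                if candidate > nxt[v]:
                    nxt[v] = candidate
        dp = nxt
    return [v for v in range(n) if dp[v] > 0.0 and dp[v] >= C]


edges = [(0, 1, 0.5), (1, 2, 0.5), (0, 2, 0.9), (2, 3, 0.9)]
matches = matching_vertices(4, edges, start=0, k=2, C=0.3)
```

In this example the only two-edge paths from 0 are 0->1->2 (product 0.25) and 0->2->3 (product about 0.81), so only vertex 3 matches for C = 0.3.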
I need help solving this question:
You have an undirected graph G=(V,E). You want to write an algorithm that orients each
of the edges so that, in the resulting directed graph, the number of incoming edges of every node is greater than zero.
For every edge {u,v} ∈ E you should choose one direction, (u,v) or (v,u).
When the answer is positive, the algorithm must return the orientation of the edges that fulfills the requirement.
As was pointed out, this problem clearly does not always have a solution. Since every vertex must have at least one incoming edge, if we have E < V, the problem is impossible.
So, let's assume that we have E >= V. Here is one way to approach the algorithm: First, count the number of edges attached to each vertex in O(E) time. Note, my solution assumes appropriate storage like an adjacency list. Can you see why an adjacency matrix would have a worse complexity?
Second, we will make a binary minheap of the vertices, according to their corresponding edge count, in O(V). Some intuition: if we have a vertex with only one edge, we must convert that into an incoming edge! When we assign the direction of that edge, we need to update the edge count of the vertex on the other side. If that other vertex goes from 2 edges to 1, we now are forced to assign the direction of its one edge left. Visually:
1 - 2 - 1
Arbitrarily choose a 1 edge count to make directed
1 - 2 -> 0
2 just lost an edge, so update to 1!
1 - 1 -> 0
Since it only has 1 edge now, convert its edge to be incoming!
0 -> 0 -> 0
Obviously this graph doesn't work since V > E, but you get the idea.
So, V times, we extract the minimum from the heap and fix in O(logV). Each time, decrement the edge count of the neighbor. Assuming the adjacency list, we can find a neighbor (first element) and update the count in O(1), and we can fix the heap again in O(logV). Overall, this step takes O(V logV).
Note, if all of our remaining vertices have more than one possible edge, this approach will arbitrarily select one of the vertices with a smallest edge count and arbitrarily select one of its edges. I will let you think about why this works (or try to provide a counter-example if you think it doesn't!) Finally, if E > V, then when we finish, there may be extra unnecessary edges left over. In O(E) time, we can arbitrarily assign directions to those edges.
Overall, we are looking at V + V*logV + E aka O(E + VlogV). Hope this helps!
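For comparison, here is a sketch of a different technique from the heap-based procedure above, based on spanning trees. It relies on the observation that E >= V is necessary but not sufficient: every connected component needs at least as many edges as vertices, i.e. it must contain a cycle. For each component, the sketch finds one edge outside a BFS spanning tree, re-roots the tree at one endpoint of that edge, orients tree edges from parent to child, and points the extra edge at the root. It assumes a simple graph (no self-loops or parallel edges) with vertices 0..n-1; the function name is made up for illustration.

```python
from collections import deque


def orient_edges(n, edges):
    """Return one directed (tail, head) pair per input edge such that every
    vertex has in-degree >= 1, or None if that is impossible."""
    adj = [[] for _ in range(n)]
    for idx, (u, v) in enumerate(edges):
        adj[u].append((v, idx))
        adj[v].append((u, idx))

    def bfs(root, skip=None):
        # Spanning tree of root's component, never using edge id `skip`.
        reached, tree, queue = {root}, {}, deque([root])
        while queue:
            u = queue.popleft()
            for v, idx in adj[u]:
                if idx != skip and v not in reached:
                    reached.add(v)
                    tree[idx] = (u, v)  # parent -> child
                    queue.append(v)
        return reached, tree

    directed = list(edges)  # leftover edges keep an arbitrary direction
    done = set()
    for start in range(n):
        if start in done:
            continue
        reached, tree = bfs(start)
        done |= reached
        # An edge outside the spanning tree closes a cycle in this component.
        extra = next(((u, v, idx) for u in reached
                      for v, idx in adj[u] if idx not in tree), None)
        if extra is None:
            return None  # the component is a tree: no valid orientation
        root, other, extra_idx = extra
        _, tree = bfs(root, skip=extra_idx)
        for idx, pair in tree.items():
            directed[idx] = pair             # every non-root vertex gets an incoming edge
        directed[extra_idx] = (other, root)  # the root gets its incoming edge
    return directed


triangle = orient_edges(3, [(0, 1), (1, 2), (2, 0)])
```

Each component is traversed a constant number of times, so the sketch runs in O(V + E).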
I have an algorithm to determine, in a DAG, the number of paths from each vertex to a specific vertex t (which has out-degree equal to 0). Now I choose another specific vertex s with in-degree 0. I have to develop another algorithm to determine, for each edge (u, v), the number of paths from s to t that run through (u, v), in O(|V|+|E|).
I have tried to modify BFS (since I think it is impossible to reach the solution with a DFS), but it doesn't work when more than one path runs through an edge. Could you suggest something or give me a hint about where to focus my work to get to the solution?
By the way, the problem is related to topological sort.
Thanks so much in advance! :)
You already have an answer from your previous question that finds the number of paths from all vertices to a target node t.
So, specifically, using this algorithm, you have #paths from v to t.
Using the same algorithm, you can also find #paths from s to u: run it on the graph with all edges reversed, with s as the target.
The total number of paths from s to t that uses (u,v) is exactly #paths(s,u) * #paths(v,t)
Explanation:
Number of paths from s to u is given from correctness of algorithm. You have exactly one choice to go forward to v, thus number of paths from s to v is also the same number. Now, you can continue from v to t using each of the #paths(v,t), giving you total of #paths(s,u)*#paths(v,t)
Complexity:
The algorithm requires computing path counts twice (from s to every vertex and from every vertex to t), each in O(V+E), so the complexity of this algorithm is also O(V+E).
Attachment: algorithm to find #paths from all vertices to a target node t:
Topologically sort the vertices; let the ordered vertices be v1,v2,...,vn (below, t denotes the position of the target in this order)
create a new array of size t, let it be arr
init: arr[t] = 1
for i from t-1 to 1 (descending, inclusive):
    arr[i] = 0
    for each edge (v_i,v_j) such that i < j <= t:
        arr[i] += arr[j]
Proof and analysis in the original question (linked).
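Putting the pieces together, a minimal sketch under the assumption that vertices are numbered 0..n-1 and the DAG is given as a list of (u, v) edges. It uses Kahn's algorithm for the topological order (any topological sort would do), counts paths from s in forward order and paths to t in reverse order, and multiplies the two counts per edge. The function name is made up for illustration.

```python
from collections import deque


def paths_through_edges(n, edges, s, t):
    """For every edge (u, v) of a DAG, return the number of s -> t paths
    that use it, as a dict {(u, v): count}. Runs in O(V + E)."""
    adj = [[] for _ in range(n)]
    indeg = [0] * n
    for u, v in edges:
        adj[u].append(v)
        indeg[v] += 1

    # Kahn's algorithm: repeatedly remove vertices with in-degree 0.
    order = []
    queue = deque(v for v in range(n) if indeg[v] == 0)
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)

    from_s = [0] * n  # from_s[v] = number of paths s -> v
    from_s[s] = 1
    for u in order:
        for v in adj[u]:
            from_s[v] += from_s[u]

    to_t = [0] * n    # to_t[v] = number of paths v -> t
    to_t[t] = 1
    for u in reversed(order):
        for v in adj[u]:
            to_t[u] += to_t[v]

    return {(u, v): from_s[u] * to_t[v] for u, v in edges}


counts = paths_through_edges(4, [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)], 0, 3)
```

In this example there are three paths from 0 to 3 (0-1-3, 0-1-2-3 and 0-2-3); the edges (0, 1) and (2, 3) each lie on two of them.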
Consider a directed graph with n nodes and m edges. Each edge is weighted. There is a start node s and an end node e. We want to find the path from s to e that has the maximum number of nodes such that:
the total distance is less than some constant d
starting from s, each node in the path is closer than the previous one to the node e. (as in, when you traverse the path you are getting closer to your destination e. in terms of the edge weight of the remaining path.)
We can assume there are no cycles in the graph. There are no negative weights. Does an efficient algorithm already exist for this problem? Is there a name for this problem?
Whatever you end up doing, do a BFS/DFS starting from s first to see if e can even be reached; this only takes you O(n+m) so it won't add to the complexity of the problem (since you need to look at all vertices and edges anyway). Also, delete all edges with weight 0 before you do anything else since those never fulfill your second criterion.
EDIT: I figured out an algorithm; it's polynomial, depending on the size of your graphs it may still not be sufficiently efficient though. See the edit further down.
Now for some complexity. The first thing to think about here is an upper bound on how many paths we can actually have, so depending on the choice of d and the weights of the edges, we also have an upper bound on the complexity of any potential algorithm.
How many edges can there be in a DAG? The answer is n(n-1)/2, which is a tight bound: take n vertices, order them from 1 to n; for two vertices i and j, add an edge i->j to the graph iff i<j. This sums to a total of n(n-1)/2, since this way, for every pair of vertices, there is exactly one directed edge between them, meaning we have as many edges in the graph as we would have in a complete undirected graph over n vertices.
How many paths can there be from one vertex to another in the graph described above? The answer is 2^{n-2}. Proof by induction:
Take the graph over 2 vertices as described above; there is 1 = 2^0 = 2^{2-2} path from vertex 1 to vertex 2: (1->2).
Induction step: assuming there are 2^{n-2} paths from the vertex with number 1 of an n vertex graph as described above to the vertex with number n, increment the number of each vertex and add a new vertex 1 along with the required n edges. It has its own edge to the vertex now labeled n+1. Additionally, for every i in [2;n] it has 2^{n-i} paths to that vertex through vertex i (all the paths vertex i has to the vertex n+1, each "prefixed" with the edge 1->i). This gives us 1 + Σ_{i=2}^{n} 2^{n-i} = 1 + Σ_{k=0}^{n-2} 2^k = 1 + (2^{n-1} - 1) = 2^{n-1} = 2^{(n+1)-2}.
So we see that there are DAGs that have 2^{n-2} distinct paths between some pairs of their vertices; this is a bit of a bleak outlook, since depending on weights and your choice of d, you may have to consider them all. This in itself doesn't mean we can't choose some form of optimum (which is what you're looking for) efficiently though.
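The count can be checked numerically with a small sketch. It builds nothing explicitly: in the complete DAG on vertices 1..n (edge i->j iff i<j), the number of paths from i to n is the sum of the counts of all later vertices. The function name is made up for illustration.

```python
def count_paths_complete_dag(n):
    """Number of paths from vertex 1 to vertex n in the DAG on 1..n
    that has an edge i -> j for every pair i < j."""
    paths = [0] * (n + 1)  # paths[i] = number of paths i -> n (index 0 unused)
    paths[n] = 1
    for i in range(n - 1, 0, -1):
        paths[i] = sum(paths[j] for j in range(i + 1, n + 1))
    return paths[1]


# Compare against the closed form 2^(n-2) for a few small n.
agrees = all(count_paths_complete_dag(n) == 2 ** (n - 2) for n in range(2, 12))
```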
EDIT: Ok so here goes what I would do:
Delete all edges with weight 0 (and smaller, but you ruled that out), since they can never fulfill your second criterion.
Do a topological sort of the graph; in the following, let's only consider the part of the topological sorting of the graph from s to e, let's call that the integer interval [s;e]. Delete everything from the graph that isn't strictly in that interval, meaning all vertices outside of it along with the incident edges. During the topSort, you'll also be able to see whether there is a path from s to e, so you'll know whether there are any paths s-...->e. Complexity of this part is O(n+m).
Now the actual algorithm:
Traverse the vertices of [s;e] in the order imposed by the topological sorting.
For every vertex v, store a two-dimensional array of information; let's call it pre_v[][], since it's going to store information about the predecessors of a node on the paths leading towards it.
In pre_v[i][j], store how long the total path of length i (counted in vertices) is as a sum of the edge weights, if j is the predecessor of the current vertex on that path. For example, pre_{s+1}[1][s] would have the weight of the edge s->s+1 in it, while all other entries in pre_{s+1} would be 0/undefined.
When calculating the array for a new vertex v, all we have to do is check its incoming edges and iterate over the arrays for the start vertices of those edges. For example, let's say vertex v has an incoming edge from vertex w, having weight c. Consider what the entry pre_v[i][w] should be. We have an edge w->v, so we need to set pre_v[i][w] to min(pre_w[i-1][k] for all k, but ignore entries with 0) + c (notice the subscript of the array!); we effectively take the cost of a path of length i - 1 that leads to w, and add the cost of the edge w->v. Why the minimum? The vertex w can have many predecessors for paths of length i - 1; however, we want to stay below a cost limit, which greedy minimization at each vertex will do for us. We will need to do this for all i in [1;v-s].
While calculating the array for a vertex, do not set entries that would give you a path with a cost of d or more; since all edges have positive weights, we can only get more costly paths with each edge, so just ignore those.
Once you have reached e and finished calculating pre_e, you're done with this part of the algorithm.
Iterate over pre_e, starting with pre_e[e-s]; since we have no cycles, all paths are simple paths and therefore the longest path from s to e can have e-s edges. Find the largest i such that pre_e[i] has a non-zero (meaning it is defined) entry; if none exists, there is no path fitting your criteria. You can reconstruct any existing path using the arrays of the other vertices.
Now that gives you a space complexity of O(n^3) and a time complexity of O(n²m) - the arrays have O(n²) entries, we have to iterate over O(m) arrays, one array for each edge - but I think it's very obvious where the wasteful use of data structures here can be optimized using hashing structures and other things than arrays. Or you could just use a one-dimensional array and only store the current minimum instead of recomputing it every time (you'll have to encapsulate the sum of edge weights of the path together with the predecessor vertex though since you need to know the predecessor to reconstruct the path), which would change the size of the arrays from n² to n since you now only need one entry per number-of-nodes-on-path-to-vertex, bringing down the space complexity of the algorithm to O(n²) and the time complexity to O(nm). You can also try and do some form of topological sort that gets rid of the vertices from which you can't reach e, because those can be safely ignored as well.
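A minimal sketch of the reduced variant described at the end, which keeps only the current minimum per edge count together with its predecessor. It assumes vertices are numbered 0..n-1, edges are (u, v, w) tuples of a DAG, d is positive, and "less than d" is meant strictly. It uses Kahn's algorithm for the topological order and a dictionary per vertex instead of arrays; the function name is made up for illustration.

```python
from collections import deque


def max_nodes_path(n, edges, s, e, d):
    """Return a path from s to e with the most vertices among those whose
    total weight is < d, or None if no such path exists."""
    adj = [[] for _ in range(n)]
    indeg = [0] * n
    for u, v, w in edges:
        if w > 0:  # zero-weight edges never bring you strictly closer to e
            adj[u].append((v, w))
            indeg[v] += 1

    order = []
    queue = deque(v for v in range(n) if indeg[v] == 0)
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, _ in adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)

    # best[v][i] = (minimum cost of an s -> v path with i edges, predecessor)
    best = [dict() for _ in range(n)]
    best[s][0] = (0, None)
    for u in order:
        for i, (cost, _) in list(best[u].items()):
            for v, w in adj[u]:
                new_cost = cost + w
                if new_cost < d and (i + 1 not in best[v]
                                     or new_cost < best[v][i + 1][0]):
                    best[v][i + 1] = (new_cost, u)

    if not best[e]:
        return None
    i, v, path = max(best[e]), e, [e]
    while i > 0:  # walk the predecessor entries back to s
        v = best[v][i][1]
        path.append(v)
        i -= 1
    return path[::-1]


edges = [(0, 3, 1), (0, 1, 2), (1, 2, 2), (2, 3, 2), (1, 3, 1)]
longest = max_nodes_path(4, edges, s=0, e=3, d=7)
```

With d = 7 the path 0-1-2-3 (cost 6) is allowed; lowering d to 6 rules it out and the best remaining path is 0-1-3 (cost 3).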