Finding the shortest path in a graph between 2 nodes that goes through a subset of nodes - performance

I'm trying to find out an efficient way of finding the shortest path between 2 nodes in a graph with positive edge costs that goes trough a subset of nodes.
More formally:
Given a graph G = (U, V) where U is the set of all nodes in the graph and V is the set of all edges in the graph, a subset of U called U' and a cost function say:
f : UxU -> R+
f(x, y) = cost to travel from node x to node y if there exists an edge
between node x and node y or 0 otherwise,
I have to find the shortest path between a source node and a target node that goes trough all the nodes in U'.
The order in which I visit the nodes in U' doesn't matter and I am allowed to visit a node more than once.
My original idea was to make use of Roy-Floyd algorithm to generate the cost matrix.
Then, for each permutation of the nodes in U' I would be computing the cost between the source and the target like this: f(source_node, P1) + f(P1, P2) + ... + f(Pk, target) saving the configuration for the lowest cost and then reconstructing the path.
The complexity for this approach is O(n3 + k!) O(n3 + k*k!), where n is the number of nodes in the graph and k the number of nodes in the subset U', which is off limits since I'll have to deal with graphs with maximum n = 2000 nodes out of which maximum n - 2 nodes will be part of the U' subset.

This is a generalization of travelling salesman. If U' == U, you get exactly TSP.
You can use the O(n^2 * 2^n) TSP algorithm, where the exponential factor for full scale TSP (2^n) will reduce to k = |U'|, so you'd get O(n^2 * 2^k).
This has the DP solution to TSP.
http://www.lsi.upc.edu/~mjserna/docencia/algofib/P07/dynprog.pdf

Combine the source node s and target node t into a single node z, and define the node set U'' := U' union {z}, with the distances to and from z defined by f(z,x) := f(s,x) and f(x,z) := f(x,t) for all x in U \ {s, t}. Compute shortest paths between all nodes of U'' and let f'(x,y) be the shortest distances, or infinity when there is no appropriate path. Voila, you now have a travelling salesman problem on the complete directed graph with vertices U'' and edge weights f'.

Related

Pairs of vertices in graph having specific distance

Given a tree with N vertices and a positive number K. Find the number of distinct pairs of the vertices which have a distance of exactly K between them. Note that pairs (v, u) and (u, v) are considered to be the same pair (1 ≤ N ≤ 50000, 1 ≤ K ≤ 500).
I am not able to find an optimum solution for this. I can do BFS from each vertex and count the no of vertices that is reachable from that and having distance less than or equal to K. But then in worst case the complexity will be order of 2. Is there any faster way around??
You can achieve that in more simple way.
Run DFS on tree and for each vertex calculate the distance from the root - save those in array (access by o(1)).
For each pair of vertex in your graph:
Find their LCA (Lowest common ancestor there are algorithm to do that in 0(1)).
Assume u and v are 2 arbitrary vertices and w is their LCA -> subtract the distance from the w to the root from u to the root - now you have the distance between u and w. Do the same for v -> with o(1) you got the distance for (v,w) and (u,w) -> sum them together and you get the (v,u) distance - now all you have to do is compare to K.
Final complexity is o(n^2)
Improving upon the other answer's approach, we can make some key observations.
To calculate distances between two nodes you need their LCA(Lowest Common Ancestor) and depths as the other answer is trying to do. The formula used here is:
Dist(u, v) = depth[u] + depth[v] - 2 * depth[ lca(u, v) ]
Depth[x] denotes distance of x from root, precomputed using DFS once starting from root node.
Now here comes the key observation, you already have distance value K, assume that dist(u, v) = K using this assumption calculate(predict?) depth of LCA. By putting K in above formula we get:
depth[ lca(u, v) ] = (depth[u] + depth[v] - K) / 2
Now that you have depth of LCA you know that distance between u and lca is depth[u] - depth[ lca(u, v) ] and between v and lca is depth[v] - depth[ lca(u, v) ], let this be X and Y respectively.
Now we know that LCA is the lowest common ancestor thus, the Xth parent of u and Yth parent of v should be the LCA, so now if Xth parent of u and Yth parent of v is indeed the same node then we can say that our pre-assumption about distances between the nodes was true and the distance between the two nodes is K.
You can caluculate the Xth and Yth ancestor of the nodes in O(logN) complexity using Binary Lifting Algorithm with a preprocessing of O(NLogN) time, this preprocessing can be included directly in your DFS when a node is visited for the first time.
Corner Cases:
Calcuated depth of LCA should not be a fraction or negative.
If depth of any node u or v matches the calculated depth of the node then that node is the ancestor of the other node.
Consider this tree:
Assuming K = 4, we get that depth[lca] = 1 using the formula above, and if we get the Xth and Yth ancestor of u and v we will get the same node 1, which should validate our assumption but this is not true since the distance between u and v is actually 2 as visible in the picture above. This is because LCA in this case is actually 2, to handle this case calcuate X-1th and Y-1th ancestor of u and v too, respectively and check if they are different.
Final Complexity: O(NlogN)

Finding paths of fixed cost in weighted undirected graph?

I have the following problem and I'm not quite sure how to solve it:
Given a graph G = (V;E) in which every edge e has a positive integer cost c_e and a starting vertex s\in V . Design an O(V + E) algorithm that marks all vertices reachable from s using a path (not necessarily a simple path) with the total cost of that path being multiples of 5.
How can I keep track of the total amount of cost of the path that I've already visited? I've been studying about BFS in undirected weighted graphs and made some attempts on using it here, but most of the BFS references focus on finding the shortest path (and not something like keep it multiple of 5).
What do you think about the next algorithm?
Let's consider new directed graph based on the source graph. For every vertex v from the source graph create 5 new vertexes v[0], v[1], ..., v[4] in the new graph corresponding to the modules from the division by 5. Then, if vertexes v and u were connected in the source graph by the edge with the weight w, add edge between v[i] and u[(i + w) % 5], u[j] and v[(j + w) % 5] in the new graph, where i = 0..4, j = 0..4. Then run BFS from the v[0], where v is the starting vertex in the source graph.
Consider vertexes with the index 0 like v[0]. Each of them corresponding to the path of the length multiple of 5 to the vertex v in the source graph. All of such vertexes marked after BFS as reachable from the starting vertex form the answer. Total complexity is linear.

Algorithm for DAG with weighted vertices

Let G = (V,E) be a DAG (directed acyclic graph). Each vertex v has a weight w(v) assigned to it. Given a vertex u, let f(u) = max{w(v) where v can reach u}. So, in other words, f(u) is the maximum weight among the weights of all vertices that can reach u. The goal is to write a linear time algorithm that computes f(u) for every vertex u in G.
For example, let this be the input graph:
Then the algorithm should compute the following:
f(A) = 5
f(B) = 12
f(C) = 15
f(D) = 12
f(E) = 12
Accomplishing this in O(n(n+m)) time is straightforward, but how can this be done in O(n+m) time?
Do a topological sort, then go through the nodes in this order and assign each node the max of the weight of itself and the f of the immediate predecessors (i.e. nodes that have an edge leading to the current node).

Maximum weighted path between two vertices in a directed acyclic Graph

Love some guidance on this problem:
G is a directed acyclic graph. You want to move from vertex c to vertex z. Some edges reduce your profit and some increase your profit. How do you get from c to z while maximizing your profit. What is the time complexity?
Thanks!
The problem has an optimal substructure. To find the longest path from vertex c to vertex z, we first need to find the longest path from c to all the predecessors of z. Each problem of these is another smaller subproblem (longest path from c to a specific predecessor).
Lets denote the predecessors of z as u1,u2,...,uk and dist[z] to be the longest path from c to z then dist[z]=max(dist[ui]+w(ui,z))..
Here is an illustration with 3 predecessors omitting the edge set weights:
So to find the longest path to z we first need to find the longest path to its predecessors and take the maximum over (their values plus their edges weights to z).
This requires whenever we visit a vertex u, all of u's predecessors must have been analyzed and computed.
So the question is: for any vertex u, how to make sure that once we set dist[u], dist[u] will never be changed later on? Put it in another way: how to make sure that we have considered all paths from c to u before considering any edge originating at u?
Since the graph is acyclic, we can guarantee this condition by finding a topological sort over the graph. topological sort is like a chain of vertices where all edges point left to right. So if we are at vertex vi then we have considered all paths leading to vi and have the final value of dist[vi].
The time complexity: topological sort takes O(V+E). In the worst case where z is a leaf and all other vertices point to it, we will visit all the graph edges which gives O(V+E).
Let f(u) be the maximum profit you can get going from c to u in your DAG. Then you want to compute f(z). This can be easily computed in linear time using dynamic programming/topological sorting.
Initialize f(u) = -infinity for every u other than c, and f(c) = 0. Then, proceed computing the values of f in some topological order of your DAG. Thus, as the order is topological, for every incoming edge of the node being computed, the other endpoints are calculated, so just pick the maximum possible value for this node, i.e. f(u) = max(f(v) + cost(v, u)) for each incoming edge (v, u).
Its better to use Topological Sorting instead of Bellman Ford since its DAG.
Source:- http://www.utdallas.edu/~sizheng/CS4349.d/l-notes.d/L17.pdf
EDIT:-
G is a DAG with negative edges.
Some edges reduce your profit and some increase your profit
Edges - increase profit - positive value
Edges - decrease profit -
negative value
After TS, for each vertex U in TS order - relax each outgoing edge.
dist[] = {-INF, -INF, ….}
dist[c] = 0 // source
for every vertex u in topological order
if (u == z) break; // dest vertex
for every adjacent vertex v of u
if (dist[v] < (dist[u] + weight(u, v))) // < for longest path = max profit
dist[v] = dist[u] + weight(u, v)
ans = dist[z];

Find two paths in a graph that are in distance of at least D(constant)

Instance of the problem:
Undirected and unweighted graph G=(V,E).
two source nodes a and b, two destination nodes c and d and a constant D(complete positive number).(we can assume that lambda(c,d),lambda(a,b)>D, when lambda(x,y) is the shortest path between x and y in G).
we have two peoples standing on the nodes a and b.
Definition:scheduler set-
A scheduler set is a set of orders such that in each step only one of the peoples make a move from his node v to one of v neighbors, when the starting position of them is in the nodes a,b and the ending position is in the nodes c,d.A "scheduler set" is missing-disorders if in each step the distance between the two peoples is > D.
I need to find an algorithm that decides whether there is a "missing-disorders scheduler set" or not.
any suggestions?
One simple solution would be to first solve all-pairs shortest paths using n breadth-first searches from every node in O(n * (n + m)).
Then create the graph of valid node pairs (x,y) with lambda(x, y) > D, with edges indicating the possible moves. There is an edge {(v,w), (x,y)} if v = x and there is an edge {w, y} in the original graph or if w = y and there is an edge {v, x} in the original graph. This new graph has O(n^2) nodes and O(nm) edges.
Now you just need to check whether (c, d) is reachable from (a, b) in the new graph. This can be achieved using DFS or BFS.
The total runtime be O(n * (n + m)).

Resources