Finding cheapest path on a graph, cost determined by max-weight of used nodes - algorithm

I have a graph G with a starting node S and an ending node E. What's special about this graph is that instead of edges having costs, the nodes have costs. I want to find the path (a set of nodes, W) between S and E such that max(W) is minimized. (In reality, I am not interested in W, just max(W).) Equivalently, if I remove all nodes with cost larger than k, what's the smallest k so that S and E are still connected?
I have one idea, but want to know if it is correct and optimal. Here's my current pseudocode:
L := Priority Queue of nodes (minimum on top)
L.add(S, S.weight)
while (!L.empty) {
    X = L.poll()
    return X.weight if (X == E)
    mark X visited
    foreach (unvisited neighbour N of X, N not in L) {
        N.weight = max(N.weight, X.weight)
        L.add(N, N.weight)
    }
}
I believe it is worst case O(n log n) where n is the number of nodes.
Here are some details for my specific problem (percolation), but I am also interested in algorithms for this problem in general. Node weights are uniformly distributed at random between 0 and a given max value. My nodes are Poisson distributed on the R²-plane, and an edge between two nodes exists if the distance between them is less than a given constant. There are potentially very many nodes, so they are generated on the fly (hidden in the foreach in the pseudocode). My starting node is at (0,0) and the ending node is any node at a distance larger than R from (0,0).
EDIT: The weights on the nodes are floating point numbers.
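For reference, here is a minimal runnable sketch of the priority-queue idea in Python. The names `adj`, `weight` and `is_goal` are assumptions made for illustration (the real neighbour lookup would generate nodes on the fly). Unlike the pseudocode, it does not overwrite node weights and it allows a node that is already queued to be re-queued with a better value, discarding stale entries when they are popped.

```python
import heapq

def min_bottleneck(adj, weight, start, is_goal):
    """Smallest possible maximum node weight on a path from start to a goal node.

    adj     : dict mapping node -> iterable of neighbours (assumed representation)
    weight  : dict mapping node -> float
    is_goal : predicate that decides whether a node counts as an end node
    Returns None if no goal node is reachable.
    """
    best = {start: weight[start]}
    heap = [(weight[start], start)]
    visited = set()
    while heap:
        cost, node = heapq.heappop(heap)
        if node in visited:
            continue  # stale heap entry, a better one was processed earlier
        visited.add(node)
        if is_goal(node):
            return cost
        for nxt in adj[node]:
            if nxt in visited:
                continue
            cand = max(cost, weight[nxt])  # bottleneck of the path through node
            if cand < best.get(nxt, float("inf")):
                best[nxt] = cand
                heapq.heappush(heap, (cand, nxt))
    return None
```

With m edges this does O(m) pushes, so the bound is O(m log n) rather than O(n log n) unless the graph is sparse.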

Starting from an empty graph, you can insert vertices (and their edges to existing neighbours) one at a time in increasing weight order, using a fast union/find data structure to maintain the set of connected components. This is just like the Kruskal algorithm for building minimum spanning trees, but instead of adding edges one at a time, for each vertex v that you process, you would combine the components of all of v's neighbours.
You also keep track of which two components contain the start and end vertices. (Initially comp(S) = S and comp(E) = E; before each union operation, the two input components X and Y can be checked to see whether either one is comp(S) or comp(E), and the latter updated accordingly in O(1) time.) As soon as these two components become a single component (i.e. comp(S) = comp(E)), you stop. The vertex just added is the maximum weight vertex on the path between S and E that minimises the maximum weight of any vertex.
[EDIT: Added time complexity info]
If the graph contains n vertices and m edges, it will take O(n log n) time to sort the vertices by weight. There will be at most m union operations (since every edge could be used to combine two components). If a simple disjoint set data structure is used, all of these union operations could be done in O(m + n log n) time, and this would become the overall time complexity; if path compression is also used, this drops to O(m A(n)), where A(n) is the incredibly slowly growing inverse Ackermann function, but the overall time complexity remains unchanged from before because the initial sorting dominates.
Assuming integer weights, Pham Trung's binary search approach will take O((n + m) log maxW) time, where maxW is the heaviest vertex in the graph. On sparse graphs (where m = O(n)), this becomes O(n log maxW), while mine becomes O(n log n), so here his algorithm will beat mine if log(maxW) << log(n) (i.e. if all weights are very small). If his algorithm is called on a graph with large weights but only a small number of distinct weights, then one possible optimisation would be to sort the weights in O(n log n) time and then replace them all with their ranks in the sorted order.
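A minimal Python sketch of the vertex-insertion idea described above, assuming the whole graph fits in memory as a dict `adj` of neighbour lists and a dict `weight` of node weights (both names are illustrative). Instead of tracking comp(S) and comp(E) explicitly, it simply compares the two roots after each insertion.

```python
def min_bottleneck_union_find(adj, weight, start, end):
    """Insert vertices in increasing weight order until start and end connect."""
    parent = {}  # only inserted vertices appear here

    def find(x):
        # Find the root with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for v in sorted(weight, key=weight.get):
        parent[v] = v
        for u in adj[v]:
            if u in parent:  # neighbour already inserted: merge components
                parent[find(u)] = find(v)
        if start in parent and end in parent and find(start) == find(end):
            return weight[v]  # v is the heaviest vertex on the best path
    return None  # start and end are never connected
```

Union by rank is left out to keep the sketch short; adding it would give the bounds quoted above.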

This problem can be solved by using binary search.
Assume that the solution is x. Starting from the start node, we use BFS or DFS to explore the graph, visiting only those nodes which have weight <= x. If Start and End turn out to be connected, x is a feasible answer. We can find the smallest feasible x by applying binary search.
Pseudo code
int min = min_value_of_all_node;
int max = max_value_of_all_node;
int result = max;
while(min<= max){
int mid = (min + max)>>1;
if(BFS(mid)){//Using Breadth first search to discover the graph.
result = min(mid, result);
max = mid - 1;
}else{
min = mid + 1;
}
}
print result;
Note: we only need to try weights that actually exist in the graph, so the binary search needs only O(log n) iterations, where n is the number of distinct weights.
If the weights are float, just use the following approach:
List<Double> listWeight ;//Sorted list of weights
int min = 0;
int max = listWeight.size() - 1;
int result = max;
while(min<= max){
int mid = (min + max)>>1;
if(BFS(listWeight.get(mid))){//Using Breadth first search to discover the graph.
result = min(mid, result);
max = mid - 1;
}else{
min = mid + 1;
}
}
print listWeight.get(result);
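As a concrete version of the float variant, here is a small Python sketch. The graph representation (`adj` as a dict of neighbour lists, `weight` as a dict of node weights) is an assumption for illustration; the binary search runs over the sorted distinct weights exactly as described.

```python
from collections import deque

def connected_under(adj, weight, start, end, limit):
    """BFS that only visits nodes whose weight is <= limit."""
    if weight[start] > limit or weight[end] > limit:
        return False
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == end:
            return True
        for nxt in adj[node]:
            if nxt not in seen and weight[nxt] <= limit:
                seen.add(nxt)
                queue.append(nxt)
    return False

def min_bottleneck_binary_search(adj, weight, start, end):
    candidates = sorted(set(weight.values()))  # only weights present in the graph
    lo, hi = 0, len(candidates) - 1
    result = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if connected_under(adj, weight, start, end, candidates[mid]):
            result = candidates[mid]  # feasible, try something smaller
            hi = mid - 1
        else:
            lo = mid + 1
    return result  # None if start and end are not connected at all
```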

Related

All pairs shortest path in a connected graph with N nodes and N-1 edges

I have a graph with N nodes (2 <= N <= 50000) and N is even. The values of the nodes are always a number between 1 and N/2. It's guaranteed that there is only one path between any pair of nodes and the weight of the edges is always one. How can I sum the distances between all nodes with equal value?
This is an example:
The number inside the square is the value of the node and the small number below it its the identification of the node.
In this example the sum is distance(1,6) + distance(2,5) + distance(3,4) = 5
Floyd-Warshall or a simple BFS is too expensive for this case.
I've seen that on DAGs it is possible to get the shortest path with a topological sort. Is that a good approach in this case?
I'm assuming here that you have a disjoint partition of your node set into pairs, represented by numbers from 1 to N/2. I'm also assuming that by "there is only one path between any pair of nodes" you really mean any pair and not just those of the same color.
In that case, first realize that your graph is a tree. So root it arbitrarily, and traverse it in depth-first order to compute the depth of all nodes. Note that for two nodes x and y, if their lowest common ancestor is l, then
distance(x, y) = distance(x, l) + distance(y, l)
= depth(x) - depth(l) + depth(y) - depth(l)
= depth(x) + depth(y) - 2*depth(l)
You can use Tarjan's off-line LCA algorithm to compute the LCA of all your pairs in (almost) linear time and compute the distances. You don't even need to store the LCAs in this case.
Runtime: O(n * α(n)) with a standard disjoint set union (union by rank plus path compression), O(n) with the improvements from "A Linear-Time Algorithm for a Special Case of Disjoint Set Union", Gabow & Tarjan, 1983
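A rough Python sketch of this approach, under the assumption that every value occurs on exactly two nodes. The function name and the input format (`edges` as pairs, `value` as a dict from node to value) are made up for illustration. The DFS is iterative to avoid recursion limits at N = 50000, and the disjoint set uses path halving without union by rank, so it does not reach the bounds quoted above.

```python
from collections import defaultdict

def sum_pair_distances(edges, value):
    """Sum of tree distances between the two nodes of each value (Tarjan's offline LCA)."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    by_value = defaultdict(list)
    for node, val in value.items():
        by_value[val].append(node)
    partner = {}
    for a, b in by_value.values():  # assumes exactly two nodes per value
        partner[a], partner[b] = b, a

    dsu = {v: v for v in value}

    def find(x):
        while dsu[x] != x:
            dsu[x] = dsu[dsu[x]]
            x = dsu[x]
        return x

    root = next(iter(value))  # arbitrary root
    depth = {root: 0}
    ancestor = {root: root}
    done = set()
    total = 0
    stack = [(root, None, iter(adj[root]))]
    while stack:
        u, par, it = stack[-1]
        child = next(it, None)
        if child is not None:
            if child != par:
                depth[child] = depth[u] + 1
                ancestor[child] = child
                stack.append((child, u, iter(adj[child])))
            continue
        stack.pop()  # all children of u are finished
        done.add(u)
        p = partner[u]
        if p in done:
            lca = ancestor[find(p)]
            total += depth[u] + depth[p] - 2 * depth[lca]
        if par is not None:
            dsu[find(u)] = find(par)  # merge u's subtree into its parent
            ancestor[find(par)] = par
    return total
```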

Algorithm for finding the path that minimizes the maximum weight between two nodes

I would like to travel by car from city X to city Y. My car has a small tank, and gas stations exist only at intersections of the roads (the intersections are nodes and the roads are edges). Therefore, I would like to take a path such that the maximum distance that I drive between two gas stations is minimized. What efficient algorithm can I use to find that path? Brute force is one bad solution. I am wondering if there exists a more efficient algorithm.
Here is a simple solution:
Sort the edges by their weights.
Start adding them one by one(from the lightest to the heaviest) until X and Y become connected.
To check if they are connected, you can use a union-find data structure.
The time complexity is O(E log E).
A proof of correctness:
1) The correct answer is not larger than the one returned by this solution. This is the case because the solution is constructive: once X and Y are in the same component, we can explicitly write down the path between them. It cannot contain heavier edges because they haven't been added yet.
2) The correct answer is not smaller than the one returned by this solution. Let's assume that there is a path between X and Y that consists of edges which have weight strictly less than the returned answer. But this is not possible, as all lighter edges were processed before (we iterate over them in sorted order) and X and Y were still in different components. Thus, there was no such path between them.
1) and 2) imply the correctness of this algorithm.
This solution works for undirected graphs.
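A short Python sketch of the undirected solution, assuming nodes are numbered 0..num_nodes-1 and edges are given as (weight, u, v) tuples (an input format chosen here for illustration):

```python
def min_max_edge(num_nodes, edges, x, y):
    """Smallest possible heaviest edge on a path between x and y, or None."""
    if x == y:
        return 0  # no edges needed
    parent = list(range(num_nodes))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    for w, u, v in sorted(edges):  # lightest edge first
        parent[find(u)] = find(v)
        if find(x) == find(y):
            return w  # this edge is the one that connected x and y
    return None  # x and y are in different components
```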
Here is an algorithm which solves the problem for the directed case (it works for undirected graphs, too):
Let's sort the edges by their weights.
Let's binary search over the weight of the heaviest edge in the path(it is determined by an index of the edge in the sorted list of all edges).
For a fixed answer candidate i, we can do the following:
Add all edges with indices up to i in the sorted list(that is, all edges which are not heavier than the current one).
Run DFS or BFS to check that there is a path from X to Y.
Adjust left and right borders in the binary search depending on the existence of such path.
The time complexity is O((E + V) * log E)(we run DFS/BFS log E times and each of them is done in O(E + V) time).
Here is a pseudo code:
if (X == Y)
    return 0 // We don't need any edges.
if (Y is not reachable from X using all edges)
    return -1 // No solution.
edges = a list of edges sorted by their weight in increasing order
low = -1 // definitely too small (no edges)
high = edges.length - 1 // definitely big enough (all edges)
while (high - low > 1)
    mid = low + (high - low) / 2
    g = empty graph
    for i = 0...mid
        g.add(edges[i])
    if (g.hasPath(X, Y)) // Checks that there is a path using DFS or BFS
        high = mid
    else
        low = mid
return edges[high]
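The same binary search as a runnable Python sketch. The input format (directed (weight, u, v) tuples, nodes numbered from 0) is an assumption, it returns the weight of the bottleneck edge rather than the edge itself, and it searches over the number of edges used instead of the last index.

```python
from collections import deque

def min_max_edge_directed(num_nodes, edges, x, y):
    """Weight of the heaviest edge on the best directed path x -> y, or None."""
    if x == y:
        return 0
    edges = sorted(edges)

    def reachable(count):
        # BFS using only the `count` lightest edges.
        adj = [[] for _ in range(num_nodes)]
        for _, u, v in edges[:count]:
            adj[u].append(v)
        seen = [False] * num_nodes
        seen[x] = True
        queue = deque([x])
        while queue:
            u = queue.popleft()
            if u == y:
                return True
            for v in adj[u]:
                if not seen[v]:
                    seen[v] = True
                    queue.append(v)
        return False

    if not reachable(len(edges)):
        return None  # no path even with all edges
    lo, hi = 0, len(edges)  # invariant: reachable(lo) is False, reachable(hi) is True
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if reachable(mid):
            hi = mid
        else:
            lo = mid
    return edges[hi - 1][0]
```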

Longest path in ordered graph

Let G = (V, E) be a directed graph with nodes v_1, v_2,..., v_n. We say that G is an ordered graph if it has the following properties.
Each edge goes from a node with lower index to a node with a higher index. That is, every directed edge has the form (v_i, v_j) with i < j.
Each node except v_n has at least one edge leaving it. That is, for every node v_i, there is at least one edge of the form (v_i, v_j).
Give an efficient algorithm that takes an ordered graph G and returns the length of the longest path that begins at v_1 and ends at v_n.
If you want to see the nice latex version: here
My attempt:
Dynamic programming. Opt(i) = max {Opt(j)} + 1, for all j such that j is reachable from i.
Is there perhaps a better way to do this? I think even with memoization my algorithm will still be exponential. (this is just from an old midterm review I found online)
Your approach is right, you will have to do
Opt(i) = max {Opt(j)} + 1 for all j such that j is reachable from i
However, this is exponential only if you run it without memoization. With memoization, you will have the memoized optimal value for every node j, j > i, when you are on node i.
For the worst case complexity, let us assume that every two nodes that could be connected are connected. This means, v_1 is connected with (v_2, v_3, ... v_n); v_i is connected with (v_(i+1), v_(i+2), ... v_n).
Number of Vertices (V) = n
Hence, number of edges (E) = n*(n-1)/2 = O(V^2)
Let us focus our attention on a vertex v_k. To compute Opt(k), we have to go through the already derived optimal values of at most (n-k) nodes.
Up to (k-1) earlier vertices can reach v_k directly, but with memoization Opt(k) is computed only once; every later request is a constant-time table lookup.
Hence the worst case time complexity is sigma(n-k) from k=1 to k=n, which equals n*(n-1)/2 and therefore gives O(n^2).
In general every edge is examined once, so the running time is O(V + E); in this dense worst case that is O(V^2).
Thanks to the first property, this problem can be solved in O(V^2), or even better in O(E), where V is the number of vertices and E is the number of edges. Indeed, it uses a dynamic programming approach which is quite similar to the one you give. Let opt[i] be the length of the longest path from v_1 to v_i. Then
opt[i] = max(opt[j]) + 1 where j < i, there is an edge from v_j to v_i, and v_j is reachable from v_1,
using this equation, it can be solved in O(V^2).
Even better, we can solve this in another order.
int LongestPath() {
for (int v = 1; v <= V; ++v) opt[v] = -1;
opt[1] = 0;
for (int v = 1; v <= V; ++v) {
if (opt[v] >= 0) {
/* Each edge can be visited at most once,
thus the runtime time is bounded by |E|.
*/
for_each( v' can be reached from v)
opt[v'] = max(opt[v]+1, opt[v']);
}
}
return opt[V];
}
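A runnable Python sketch of the same forward DP, assuming nodes are numbered 1..n and the edges are given as (i, j) pairs with i < j (input format chosen for illustration):

```python
def longest_path(n, edges):
    """Number of edges on the longest path from node 1 to node n, or -1 if none exists."""
    succ = [[] for _ in range(n + 1)]
    for i, j in edges:
        succ[i].append(j)
    opt = [-1] * (n + 1)  # -1 marks "not reachable from node 1"
    opt[1] = 0
    for v in range(1, n + 1):  # index order is a topological order
        if opt[v] < 0:
            continue
        for w in succ[v]:  # each edge is looked at exactly once
            opt[w] = max(opt[w], opt[v] + 1)
    return opt[n]
```

The check `opt[v] < 0` matters: a node such as v_3 in the test below has an outgoing edge but cannot be reached from v_1, so it must not contribute.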

Connecting a Set of Vertices into an optimally weighted graph

This is essentially the problem of connecting n destinations with the minimal amount of road possible.
The input is a set of vertices (a,b, ... , n)
The weight of an edge between two vertices is easily calculated (for example, the Cartesian distance between the two vertices)
I would like an algorithm that, given a set of vertices in Euclidean space, returns a set of edges that would constitute a connected graph and whose total weight of edges is as small as it could be.
In graph language, this is the Minimum Spanning Tree of a Connected Graph.
With brute force I would have:
Define all possible edges between all vertices - say you have n vertices, then you have n(n-1)/2 edges in the complete graph
A possible edge can be on or off (2 states)
Go through all possible edge on/off combinations: 2^(n(n-1)/2)!
Ignore all those that would not connect the graph
From the remaining combinations, find the one whose sum of edge weights is the smallest of all
I understand this is an NP-Hard problem. However, realistically for my application, I will have a maximum of 11 vertices. I would like to be able to solve this on a typical modern smart phone, or at the very least on a small server size.
As a second variation, I would like to obtain the same goal, with the restriction that each vertex is connected to a maximum of one other vertex. Essentially obtaining a single trace, starting from any point, and finishing at any other point, as long as the graph is connected. There is no need to go back to where you started. In graph language, this is the Open Euclidean Traveling Salesman Problem.
Some pseudocode algorithms would be very helpful.
OK, for the first problem you have to build a Minimum Spanning Tree. There are several algorithms to do so, such as Prim's and Kruskal's. But also take a look in the first link at the treatment of complete graphs, which is your case.
The second problem is a little more complicated. It becomes an Open Traveling Salesman Problem (oTSP). When reading the previous link, it may help to focus on the Euclidean and asymmetric variants.
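For a complete graph on a handful of points, a simple array-based version of Prim's algorithm is enough. A minimal Python sketch, assuming the vertices are given as coordinate tuples (the function name and input format are illustrative):

```python
import math

def euclidean_mst(points):
    """Prim's algorithm on the complete Euclidean graph, O(n^2) time.

    points: list of coordinate tuples. Returns a list of (i, j) index pairs.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known connection of each point to the tree
    link = [-1] * n        # tree vertex that realises best[i]
    best[0] = 0.0
    tree_edges = []
    for _ in range(n):
        # Pick the point outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        if link[u] >= 0:
            tree_edges.append((link[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    link[v] = u
    return tree_edges
```

For 11 vertices this is a few hundred distance computations, so it runs comfortably on a phone.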
Regards
Maybe you could try a greedy algorithm:
1. Create a list sortedList that stores each pair of nodes i and j and is sorted by the weight w(i,j).
2. Create a HashSet connectedNodes that is empty at the beginning
3. while (connectedNodes.size() < n)
       element := first element of sortedList
       if (connectedNodes.isEmpty())
           connectedNodes.put(element.nodeI);
           connectedNodes.put(element.nodeJ);
           delete element from sortedList
       else
           for(element in sortedList) //start again with the first
               if(connectedNodes.get(element.nodeI) || connectedNodes.get(element.nodeJ))
                   if(!(connectedNodes.get(element.nodeI) && connectedNodes.get(element.nodeJ)))
                       //so it does not include already both nodes
                       connectedNodes.put(element.nodeI);
                       connectedNodes.put(element.nodeJ);
                       delete element from sortedList
                       break;
               else
                   continue;
So I explain step 3 a little bit:
You keep adding nodes until all nodes are connected to one another. The graph is sure to be connected, because you only add a node if it has a connection to another one already in the connectedNodes list.
So this algorithm is greedy: it always takes the shortest edge that attaches a new node to the connected set (because sortedList is sorted by the weight of the edge). For the first problem this is the same selection rule as Prim's algorithm, so the selected edges form a minimum spanning tree.
You don't get duplicates in connectedNodes, because it is a HashSet, which also makes the lookups faster.
All in all, the runtime should be O(n^2 log n) for the sorting at the beginning (there are about n^2/2 pairs), and below that it is around O(n^3), because in the worst case you run through the whole list of size n^2 in every step, and you do that n times because you add one node in each step.
But it is more likely that you find an element much faster than O(n^2); I think in most cases it is O(n).
You can solve the travelling salesman problem and the Hamiltonian path problem with the optimap tsp solver from gebweb or a linear program solver. But the first question seems to ask for a minimum spanning tree, so maybe the question tag is wrong?
For the first problem, there is an O(n^2 * 2^n) time algorithm. Basically, you can use dynamic programming to reduce the search space. Let's say the set of all vertices is V, so the state space consists of all subsets of V, and the objective function f(S) is the minimum sum of weights of the edges connecting vertices in S. For each state S, you may enumerate over all edges (u, v) where u is in S and v is in V - S, and update f(S + {v}). After checking all possible states, the optimal answer is then f(V).
Below is the sample code to illustrate the idea, but it is implemented in a backward approach.
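const int n = 11;
const int INF = 1000000000;  // larger than any possible total weight
int weight[n][n];            // weight[i][j]: cost of the edge between i and j, filled in elsewhere
int f[1 << n];               // f[state]: minimum total edge weight connecting the vertices in state
for (int state = 0; state < (1 << n); ++state)
{
    // Base case: the empty set and single vertices need no edges.
    if ((state & (state - 1)) == 0) { f[state] = 0; continue; }
    int res = INF;
    for (int i = 0; i < n; ++i)
    {
        if ((state & (1 << i)) == 0) continue;
        for (int j = 0; j < n; ++j)
        {
            if (j == i || (state & (1 << j)) == 0) continue;
            // Attach j as a leaf to i; the remaining vertices are already connected optimally.
            if (res > f[state - (1 << j)] + weight[i][j])
            {
                res = f[state - (1 << j)] + weight[i][j];
            }
        }
    }
    f[state] = res;
}
printf("%d\n", f[(1 << n) - 1]);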
For the second problem, sorry I don't quite understand it. Maybe you should provide some examples?

Shortest path with a fixed number of edges

Find the shortest path through a graph in efficient time, with the additional constraint that the path must contain exactly n nodes.
We have a directed, weighted graph. It may, or may not contain a loop. We can easily find the shortest path using Dijkstra's algorithm, but Dijkstra's makes no guarantee about the number of edges.
The best we could come up with was to keep a list of the best n paths to a node, but this uses a huge amount of memory over vanilla Dijkstra's.
It is a simple dynamic programming algorithm.
Let us assume that we want to go from vertex x to vertex y.
Make a table D[.,.], where D[v,k] is the cost of the shortest path of length k from the starting vertex x to the vertex v.
Initially D[x,1] = 0. Set D[v,1] = infinity for all v != x.
For k=2 to n:
D[v,k] = min_u D[u,k-1] + wt(u,v), where we assume that wt(u,v) is infinite for missing edges.
P[v,k] = the u that gave us the above minimum.
The length of the shortest path will then be stored in D[y,n].
If we have a graph with fewer edges (sparse graph), we can do this efficiently by only searching over the u that v is connected to. This can be done optimally with an array of adjacency lists.
To recover the shortest path:
Path = empty list
v = y
For k = n downto 2:
    Path.append(v)
    v = P[v,k]
Path.append(x)
Path.reverse()
The last node is y. The node before that is P[y,n]. We can keep following backwards, and we will eventually arrive at P[v,2] = x for some v.
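The table-based DP and the path recovery can be sketched in Python as follows. The graph format (a dict of dicts, `graph[u][v]` = weight of edge u -> v) and the function name are assumptions for illustration; `n` counts nodes on the path, as in the description, and is assumed to be at least 1.

```python
import math

def shortest_path_exact_nodes(graph, x, y, n):
    """Cheapest path from x to y that visits exactly n nodes (n - 1 edges).

    Returns (cost, path), or (math.inf, None) if no such path exists.
    """
    nodes = set(graph) | {v for nbrs in graph.values() for v in nbrs} | {x, y}
    # D[k][v]: cost of the cheapest path with k nodes from x to v
    D = [{v: math.inf for v in nodes} for _ in range(n + 1)]
    P = [dict() for _ in range(n + 1)]  # P[k][v]: predecessor of v on that path
    D[1][x] = 0
    for k in range(2, n + 1):
        for u in nodes:
            if D[k - 1][u] == math.inf:
                continue
            for v, w in graph.get(u, {}).items():  # only real edges (adjacency lists)
                if D[k - 1][u] + w < D[k][v]:
                    D[k][v] = D[k - 1][u] + w
                    P[k][v] = u
    if D[n][y] == math.inf:
        return math.inf, None
    path = [y]
    v = y
    for k in range(n, 1, -1):  # follow predecessors back to x
        v = P[k][v]
        path.append(v)
    path.reverse()
    return D[n][y], path
```

Only rows k-1 and k of D are needed for the cost; the full tables are kept here so the path can be recovered.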
The alternative that comes to my mind is a depth first search (as opposed to Dijkstra's breadth first search), modified as follows:
stop "depth"-ing if the required vertex count is exceeded
record the shortest found (thus far) path having the correct number of nodes.
Run time may be abysmal, but it should come up with the correct result while using a very reasonable amount of memory.
Interesting problem. Did you discuss using a heuristic graph search (such as A*), adding a penalty for going over or under the node count? This may or may not be admissible, but if it did work, it may be more efficient than keeping a list of all the potential paths.
In fact, you may be able to use backtracking to limit the amount of memory being used for the Dijkstra variation you discussed.
A rough idea of an algorithm:
Let A be the start node, and let S be a set of nodes (plus a path). The invariant is that at the end of step n, S will contain all nodes that are exactly n steps from A, and the paths will be the shortest paths of that length. When n is 0, that set is {A (empty path)}. Given such a set at step n - 1, you get to step n by starting with an empty set S1 and
for each (node X, path P) in S
    for each edge E from X to Y,
        If Y is not in S1, add (Y, P + Y) to S1
        If (Y, P1) is in S1, set the path to the shorter of P1 and P + Y
There are only n steps, and each step should take less than max(N, E), which makes the entire algorithm O(n^3) for a dense graph and O(n^2) for a sparse graph.
This algorithm was derived by looking at Dijkstra's, although it is a different algorithm.
Let's say we want the shortest distance from node i to node j using exactly t steps.
A simple DP solution would be
A[t][i][j] = min over k { A[1][i][k] + A[t-1][k][j] }
where k varies from 0 to n-1.
A[1][i][j] = r[i][j]; p[1][i][j]=j;
for(t=2; t<=n; t++)
for(i=0; i<n; i++) for(j=0; j<n; j++)
{
A[t][i][j]=BG; p[t][i][j]=-1;
for(k=0; k<n; k++) if(A[1][i][k]<BG && A[t-1][k][j]<BG)
if(A[1][i][k]+A[t-1][k][j] < A[t][i][j])
{
A[t][i][j] = A[1][i][k]+A[t-1][k][j];
p[t][i][j] = k;
}
}
trace back the path
void output(int a, int b, int t)
{
while(t)
{
cout<<a<<" ";
a = p[t][a][b];
t--;
}
cout<<b<<endl;
}
