Fastest way traversing a graph with several unknown weight edges - algorithm

Given a undirected and positive weighted graph G, some edges of G have unknown weight. For instance,
where edge(B, C) has unknown weight.
Traversing from A to B costs you 7.
We are allowed to derive the unknown weight e = weight(B,C) by traversing from B to C or vise versa and costs you e, which becomes a known weight in the end. And walk from A to C through B costs you e+7 in total.
My question is, how fast can we get all the unknown weight when given a starting point? That is, traverse all the unknown weight edges from a starting point(e.g A) with as small costs as possible.
The case that the number of unknown weight is 1 is simple. You can first find out the shortest path from the starting point to the vertices of the desired edge and traverse the unknown weight edge. However, I don't know how to solve it when the number of unknown weight edges grows larger than 1.
Can anyone figure out how to do this?

Can't offer a complete solution, but it looks related to the travelling salesman problem where the unknown edges are the nodes to be visited.
But I think you can't find an optimal solution a priori. Consider this example
a-b = 1
b-c = ?
b-d = ?
a-d = 10
if b-c has low weight (say 1) the shortest path starting from a is a-b-c-b-d which traverses b-c twice. If b-c has high weight (say 100) the shortest path becomes a-d-b-c, preferring the cheaper connection a-d over traversing b-c twice.

You can create a second graph, G', which will be the same - but without the "unkown edges"1. Then, you can use all to all shortest path algorithm, and use the data from the algorithm to fill in the blanks.
Floyd Warshall algorithm offers an O(n3) all to all shortest path.
(1) Formally: G'=(V,E',w') where:
E' = { e | e is in E and w(e) != ? }
w'(e) = w(e) if w(e) != ?

Related

How to update MST from the old MST if one edge is deleted

I am studying algorithms, and I have seen an exercise like this
I can overcome this problem with exponential time but. I don't know how to prove this linear time O(E+V)
I will appreciate any help.
Let G be the graph where the minimum spanning tree T is embedded; let A and B be the two trees remaining after (u,v) is removed from T.
Premise P: Select minimum weight edge (x,y) from G - (u,v) that reconnects A and B. Then T' = A + B + (x,y) is a MST of G - (u,v).
Proof of P: It's obvious that T' is a tree. Suppose it were not minimum. Then there would be a MST - call it M - of smaller weight. And either M contains (x,y), or it doesn't.
If M contains (x,y), then it must have the form A' + B' + (x,y) where A' and B' are minimum weight trees that span the same vertices as A and B. These can't have weight smaller than A and B, otherwise T would not have been an MST. So M is not smaller than T' after all, a contradiction; M can't exist.
If M does not contain (x,y), then there is some other path P from x to y in M. One or more edges of P pass from a vertex in A to another in B. Call such an edge c. Now, c has weight at least that of (x,y), else we would have picked it instead of (x,y) to form T'. Note P+(x,y) is a cycle. Consequently, M - c + (x,y) is also a spanning tree. If c were of greater weight than (x,y) then this new tree would have smaller weight than M. This contradicts the assumption that M is a MST. Again M can't exist.
Since in either case, M can't exist, T' must be a MST. QED
Algorithm
Traverse A and color all its vertices Red. Similarly label B's vertices Blue. Now traverse the edge list of G - (u,v) to find a minimum weight edge connecting a Red vertex with a Blue. The new MST is this edge plus A and B.
When you remove one of the edges then the MST breaks into two parts, lets call them a and b, so what you can do is iterate over all vertices from the part a and look for all adjacent edges, if any of the edges forms a link between the part a and part b you have found the new MST.
Pseudocode :
for(all vertices in part a){
u = current vertex;
for(all adjacent edges of u){
v = adjacent vertex of u for the current edge
if(u and v belong to different part of the MST) found new MST;
}
}
Complexity is O(V + E)
Note : You can keep a simple array to check if vertex is in part a of the MST or part b.
Also note that in order to get the O(V + E) complexity, you need to have an adjacency list representation of the graph.
Let's say you have graph G' after removing the edge. G' consists have two connected components.
Let each node in the graph have a componentID. Set the componentID for all the nodes based on which component they belong to. This can be done with a simple BFS for example on G'. This is an O(V) operation as G' only has V nodes and V-2 edges.
Once all the nodes have been flagged, iterate over all unused edges and find the one with the least weight that connects the two components (componentIDs of the two nodes will be different). This is an O(E) operation.
Thus the total runtime is O(V+E).

Maximum weighted path between two vertices in a directed acyclic Graph

Love some guidance on this problem:
G is a directed acyclic graph. You want to move from vertex c to vertex z. Some edges reduce your profit and some increase your profit. How do you get from c to z while maximizing your profit. What is the time complexity?
Thanks!
The problem has an optimal substructure. To find the longest path from vertex c to vertex z, we first need to find the longest path from c to all the predecessors of z. Each problem of these is another smaller subproblem (longest path from c to a specific predecessor).
Lets denote the predecessors of z as u1,u2,...,uk and dist[z] to be the longest path from c to z then dist[z]=max(dist[ui]+w(ui,z))..
Here is an illustration with 3 predecessors omitting the edge set weights:
So to find the longest path to z we first need to find the longest path to its predecessors and take the maximum over (their values plus their edges weights to z).
This requires whenever we visit a vertex u, all of u's predecessors must have been analyzed and computed.
So the question is: for any vertex u, how to make sure that once we set dist[u], dist[u] will never be changed later on? Put it in another way: how to make sure that we have considered all paths from c to u before considering any edge originating at u?
Since the graph is acyclic, we can guarantee this condition by finding a topological sort over the graph. topological sort is like a chain of vertices where all edges point left to right. So if we are at vertex vi then we have considered all paths leading to vi and have the final value of dist[vi].
The time complexity: topological sort takes O(V+E). In the worst case where z is a leaf and all other vertices point to it, we will visit all the graph edges which gives O(V+E).
Let f(u) be the maximum profit you can get going from c to u in your DAG. Then you want to compute f(z). This can be easily computed in linear time using dynamic programming/topological sorting.
Initialize f(u) = -infinity for every u other than c, and f(c) = 0. Then, proceed computing the values of f in some topological order of your DAG. Thus, as the order is topological, for every incoming edge of the node being computed, the other endpoints are calculated, so just pick the maximum possible value for this node, i.e. f(u) = max(f(v) + cost(v, u)) for each incoming edge (v, u).
Its better to use Topological Sorting instead of Bellman Ford since its DAG.
Source:- http://www.utdallas.edu/~sizheng/CS4349.d/l-notes.d/L17.pdf
EDIT:-
G is a DAG with negative edges.
Some edges reduce your profit and some increase your profit
Edges - increase profit - positive value
Edges - decrease profit -
negative value
After TS, for each vertex U in TS order - relax each outgoing edge.
dist[] = {-INF, -INF, ….}
dist[c] = 0 // source
for every vertex u in topological order
if (u == z) break; // dest vertex
for every adjacent vertex v of u
if (dist[v] < (dist[u] + weight(u, v))) // < for longest path = max profit
dist[v] = dist[u] + weight(u, v)
ans = dist[z];

Shortest Path Algorithm with Fuel Constraint and Variable Refueling

Suppose you have an undirected weighted graph. You want to find the shortest path from the source to the target node while starting with some initial "fuel". The weight of each edge is equal to the amount of "fuel" that you lose going across the edge. At each node, you can have a predetermined amount of fuel added to your fuel count - this value can be 0. A node can be visited more than once, but the fuel will only be added the first time you arrive at the node. **All nodes can have different amounts of fuel to provide.
This problem could be related to a train travelling from town A to town B. Even though the two are directly connected by a simple track, there is a shortage of coal, so the train does not have enough fuel to make the trip. Instead, it must make the much shorter trip from town A to town C which is known to have enough fuel to cover the trip back to A and then onward to B. So, the shortest path would be the distance from A to C plus the distance from C to A plus the distance from A to B. We assume that fuel cost and distance is equivalent.
I have seen an example where the nodes always fill the "tank" up to its maximum capacity, but I haven't seen an algorithm that handles different amounts of refueling at different nodes. What is an efficient algorithm to tackle this problem?
Unfortunately this problem is NP-hard. Given an instance of traveling salesman path from s to t with decision threshold d (Is there an st-path visiting all vertices of length at most d?), make an instance of this problem as follows. Add a new destination vertex connected to t by a very long edge. Give starting fuel d. Set the length of the new edge and the fuel at each vertex other than the destination so that (1) the total fuel at all vertices is equal to the length of the new edge (2) it is not possible to use the new edge without collecting all of the fuel. It is possible to reach the destination if and only if there is a short traveling salesman path.
Accordingly, algorithms for this problem will resemble those for TSP. Preprocess by constructing a complete graph on the source, target, and vertices with nonzero fuel. The length of each edge is equal to the distance.
If there are sufficiently few special vertices, then exponential-time (O(2^n poly(n))) dynamic programming is possible. For each pair consisting of a subset of vertices (in order of nondecreasing size) and a vertex in that subset, determine the cheapest way to visit all of the subset and end at the specified vertex. Do this efficiently by using the precomputed results for the subset minus the vertex and each possible last waypoint. There's an optimization that prunes the subsolutions that are worse than a known solution, which may help if it's not necessary to use very many waypoints.
Otherwise, the play may be integer programming. Here's one formulation, quite probably improvable. Let x(i, e) be a variable that is 1 if directed edge e is taken as the ith step (counting from the zeroth) else 0. Let f(v) be the fuel available at vertex v. Let y(i) be a variable that is the fuel in hand after i steps. Assume that the total number of steps is T.
minimize sum_i sum_{edges e} cost(e) x(i, e)
subject to
for each i, for each vertex v,
sum_{edges e with head v} x(i, e) - sum_{edges e with tail v} x(i + 1, e) =
-1 if i = 0 and v is the source
1 if i + 1 = T and v is the target
0 otherwise
for each vertex v, sum_i sum_{edges e with head v} x(i, e) <= 1
for each vertex v, sum_i sum_{edges e with tail v} x(i, e) <= 1
y(0) <= initial fuel
for each i,
y(i) >= sum_{edges e} cost(e) x(i, e)
for each i, for each vertex v,
y(i + 1) <= y(i) + sum_{edges e} (-cost(e) + f(head of e)) x(i, e)
for each i, y(i) >= 0
for each edge e, x(e) in {0, 1}
There is no efficient algorithm for this problem. If you take an existing graph G of size n you can give each edge a weight of 1, each node a deposit of 5, and then add a new node that you are trying to travel to connected to each node with a weight of 4 * (n -1). Now the existence of a path from the source to the target node in this graph is equivalent to the existence of a Hamiltonian path in G, which is a known np-complete problem. (See http://en.wikipedia.org/wiki/Hamiltonian_path for details.)
That said, you can do better than a naive recursive solution for most graphs. First do a breadth first search from the target node so that every node's distance to the target is known. Now you can borrow the main idea of Dijkstra's A* search. Do a search of all paths from the source, using a priority queue to always try to grow a path whose current distance + the minimum to the target is at a minimum. And to reduce work you probably also want to discard all paths that have returned to a node that they have previously visited, except with lower fuel. (This will avoid silly paths that travel around loops back and forth as fuel runs out.)
Assuming the fuel consumption as positive weight and the option to add the fuel as negative weight and additionally treating the initial fuel available as another negative weighted edge, you can use Bellman Ford to solve it as single source shortest path.
As per this answer, elsewhere, undirected graphs can be addressed by replacing each edge with two in both directions. The only constraint I'm not sure about is the part where you can only refuel once. This may be naturally addressed, by the the algorithm, but I'm not sure.

Finding a New Minimum Spanning Tree After a New Edge Was Added to The Graph

Let G = (V, E) be a weighted, connected and undirected graph and let T be a minimum spanning tree. Let e be any edge not in E (and has a weight W(e)).
Prove or disprove:
T U {e} is an edge set that contains a minimum spanning tree of G' = (V, E U {e}).
Well, it sounds true to me, so I decided to prove it but I just get stuck every time...
For example, if e is the new edge with minimum weight, who can promise us that the edges in T weren't chosen in a bad way that would prevent us from obtaining a new minimum weight without the 'help' of other edges in E - T ?
I would appreciate any help,
Thanks in advance.
Let [a(1), a(2), ..., a(n-1)] be a sequence of edges selected from E to construct MST of G by Kruskal's algorithm (in the order they were selected - weight(a(i)) <= weight(a(i + 1))).
Let's now consider how Kruskal's Algorithm behaves being given as input E' = E U {e}.
Let i = min{i: weight(e) < weight(a(i))}. Firstly algorithm decides to choose edges [a(1), ..., a(i - 1)] (e hasn't been processed yet, so it behaves the same). Then it need to decide on e - if e is dropped, solution for E' will be the same as for E. So let's suppose that first i edges selected by algorithm are [a(1), ..., a(i - 1), e] - I will call this new sequence a'. Algorithm continues - as long as its following selections (for j > i) satisfy a'(j) = a(j - 1) we are cool. There are two scenarios that break such great streak (let's say streak breaks at index k + 1):
1) Algorithm selects some edge e' that is not in T, and weight(e') < weight(a(k+1)). By now a' sequence is:
[a(1), ..., a(i-1), e, a(i), a(i+1), ..., a(k-1), a(k), e']
But if it was possible to append e' to this list it would be also possible to append it to [a(1), ..., a(k-1), a(k)]. But Kruskal's algorithm didn't do it when looking for MST for G. That leads to contradiction.
2) Algorithm politely selected:
[a(1), ..., a(i-1), e, a(i), a(i+1), ..., a(k-1), a(k)]
but decided to drop edge a(k+1). But if e was not present in the list algorithm would decide to append a(k+1). That means that in graph (V, {a(1), ..., a(k)}) edge a(k+1) would connect the same components as edge e. And that means that after considering by algorithm edge a(k + 1) in case of both G and G' the division into connected components (determined by set of selected edges) is the same. So after processing a(k+1) algorithm will proceed in the same way in both cases.
When ever a edge is add to a graph without adding a node , then that edge creates a cycle in minimum spanning tree of graph, cycle length may vary from 2 to n where n= no of nodes in graph.
T = Minimum spanning tree of G
Now to find the MST for (T + added edge) , we have to just remove one edge from that cycle .. so remove that edge which has maximum weight.
So T' always comes from T U {e}.
And if you are thinking that this doesn't prove that new MST will be an edge set of T U {e} then analyse Kruskal algorithim for for new graph. i.e. if e is of minimum weight it must have been selected for MST acc to Kruskal algorithim and same here if it is minimum it can not be removed from cycle.

Maximum matching in a bipartite graph

Use the following heuristic algorithm:
M = NULL
while E != NULL do {
if ((∃u vertex) and (gr(u) == 1)) then
e ← the incident edge with u
else
e ← an incident edge with a vertex with the most incident edges
M ← M ∪ {e}
E ← E - (all the incident edges with e)
}
return M //return the matching
Where: M,E - edges ; gr(u) - the grade of u (the number of incident edges with u) ;
What we were asked:
a) Prove that this algorithm returns the maximum matching for a tree.
b) Prove that if there is a perfect matching M0 then the algorithm returns it, for any bipartite graph.
c) Prove that |M| ≥ (v(G)/2), for any bipartite graph.
//G is the graph, v(G) is the matching number, size of the maximum matching.
I'm almost sure this algorithm is similar to some classic algorithm that I'm failing to find, or the solution could be completely based on theorems and properties of bipartite graphs.
Can you please give me a starting point.. What am I missing ?
I think a) is easy.. I'm still trying to find the right proof, I think it may be completely based on properties of trees and bipartite graphs.
For b) and c) .. I don't have any idea yet.
This is very similar to the greedy matching algorithm. See the wikipedia article for more information.
As for the questions...
a) Show that the matching you get is maximal (there are no larger matchings containing it). What does this imply on a tree?
b) Show that if M0 is a valid matching that can be found in M ∪ E in a given step, that it can be found in M ∪ E in the next. By induction, the statement holds.
c) Consider a maximum matching M1. Why must at least one of the vertices adjacent to any given edge in M1 appear as an endpoint for some edge in the matching the algorithm outputs? What does this tell you about a lower bound for its size?

Resources