What algorithm should I use to get all possible paths in a directed weighted graph, with positive weights? - algorithm

I have a directed weighted graph with positive weights, which looks something like this:
What I am trying to do is:
Find all possible paths between two nodes.
Arrange the paths in ascending order of path length (as given by the edge weights), showing at least the top 5.
Do this efficiently, so that the program does not take long even for graphs with many nodes.
E.g. say my initial node is d and my final node is c.
So the output should be something like
d to c = 11
d to e to c = 17
d to b to c = 25
d to b to a to c = 31
d to b to a to f to c = 38
How can I achieve this?

The best approach would be Dijkstra’s shortest path algorithm, which finds a shortest path in O(E + V log V) time when the priority queue is a Fibonacci heap.
Take this basic approach to help you find the shortest path possible:
Look at all nodes directly adjacent to the starting node. The values carried by the edges connecting the start and these adjacent nodes are the shortest distances to each respective node.
Record these distances on the node - overwriting infinity - and also cross off the nodes, meaning that their shortest path has been found.
Select one of the nodes whose shortest path has been calculated; we’ll call this our pivot. Look at the nodes adjacent to it (we’ll call these our destination nodes) and the distances separating them.
For every destination node:
If the value in the pivot plus the value of the connecting edge totals less than the destination node’s value, then update its value, as a new shorter path has been found.
If all routes to this destination node have been explored, it can be crossed off.
Repeat from the pivot-selection step until all nodes have been crossed off. We now have a graph where the value held in any node is the shortest distance to it from the start node.
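The steps above can be sketched in Python roughly as follows. This is a minimal sketch, not the answerer's code: it uses the common binary-heap variant (O((V + E) log V)), always picks the not-yet-crossed-off node with the smallest recorded distance as the pivot, and assumes the graph is given as a dict of dicts. The function name and the small example graph are made up for illustration.

```python
import heapq

def dijkstra(graph, start):
    """Return shortest distances from start to every reachable node.

    graph: dict mapping node -> {neighbor: positive edge weight}
    (this representation is an assumption of the sketch).
    """
    dist = {start: 0}
    crossed_off = set()          # nodes whose shortest distance is final
    heap = [(0, start)]          # (distance so far, node)
    while heap:
        d, pivot = heapq.heappop(heap)
        if pivot in crossed_off:
            continue             # stale heap entry, a shorter one was used
        crossed_off.add(pivot)
        for dest, weight in graph.get(pivot, {}).items():
            candidate = d + weight
            if candidate < dist.get(dest, float("inf")):
                dist[dest] = candidate      # shorter path found
                heapq.heappush(heap, (candidate, dest))
    return dist

# Hypothetical example graph, only for illustration.
example = {"s": {"a": 4, "b": 1}, "b": {"a": 2, "t": 7}, "a": {"t": 3}}
print(dijkstra(example, "s"))
```

Note that this only yields the single shortest distance per node; on its own it does not enumerate the second, third, etc. best paths.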

Find all possible paths between two nodes
You could use brute force here, but you may get a very large number of paths, and it will really take years for bigger graphs (>100 nodes, depending on a lot of factors).
Arrange the paths in ascending order, based on their path length (as given by the edge weights), say top 5 at least.
Simply sort them and take the first 5. (You could store each path as a list of edges together with an integer/double for the length of the path.)
Use an optimal way to do so, so that even in cases of larger number of nodes, the program won't take much time computing.
Even finding all possible paths between two nodes is NP-Hard (Source, it's for undirected graphs, but is still valid). You will have to use heuristics.
What do you mean with a larger number of nodes? Do you mean 100 or 100 million? It depends on your context.
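If only the few shortest paths are needed, one possible alternative to generating every path and sorting is a best-first search over partial paths. The sketch below is an assumption-laden illustration, not something taken from the answers above: it only considers simple paths (no repeated nodes), relies on all weights being positive so that paths come off the heap in non-decreasing length, and can still take exponential time in the worst case. The edge weights in the example graph are hypothetical, chosen only so that the result reproduces the sample output from the question (the original picture of the graph is not available here).

```python
import heapq
from itertools import islice

def paths_by_length(graph, source, target):
    """Lazily yield (length, path) for simple paths, shortest first.

    graph: dict mapping node -> {neighbor: positive weight} (assumed format).
    """
    heap = [(0, (source,))]              # (cost so far, partial path)
    while heap:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == target:
            yield cost, list(path)
            continue                     # do not extend beyond the target
        for nxt, weight in graph.get(node, {}).items():
            if nxt not in path:          # keep the path simple
                heapq.heappush(heap, (cost + weight, path + (nxt,)))

# Hypothetical weights, picked to match the question's expected output.
graph = {
    "d": {"c": 11, "e": 9, "b": 7},
    "e": {"c": 8},
    "b": {"c": 18, "a": 5},
    "a": {"c": 19, "f": 4},
    "f": {"c": 22},
}

for length, path in islice(paths_by_length(graph, "d", "c"), 5):
    print(" to ".join(path), "=", length)
```

Because the generator is lazy, asking for the top 5 stops the search as soon as the fifth path has been produced.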

Related

Good algorithm for finding shortest path for specific vertices

I'm solving the problem described below and can't think of a better algorithm than trying every permutation of every vertex of every group with every.
I'm given a graph of vertices, along with a list of groups of specific vertices, the goal is to find the shortest path from a specific starting vertex to a specific ending vertex, and the path must pass through at least one vertex from each specified group of vertices.
There are also vertices in the graph that are not part of any given group.
Re-visiting vertices and edges is possible.
The graph data is specified as follows:
Vertex list - each vertex is identified by a sequence number (0 to the number of vertices - 1)
Edge list - list of vertex pairs (by vertex number)
Vertex group list - list of lists of vertex numbers
A specific starting and ending vertex.
I would be grateful for any ideas for a better solution, thank you.
Summary:
We can use bitmasks to efficiently check which groups we have visited so far, and combine this with a traditional BFS/ Dijkstra's shortest-path algorithm.
If we assume E edges, V vertices, and K vertex-groups that have to be included, the below algorithm has a time complexity of O((V + E) * 2^K) and a space complexity of O(V * 2^K). The exponential 2^K term means it will only work for a relatively small K, say up to 10 or 20.
Details:
First, are the edges weighted?
If yes then a "shortest path" algorithm will usually be a variation of Dijkstra's algorithm, in which we keep a (min) priority queue of the shortest paths. We only visit a node once it's at the top of the queue, meaning that this must be the shortest path to this node. Any other shorter path to this node would already have been added to the priority queue and would come before the current iteration. (Note: this doesn't work for negative paths).
If no, meaning all edges have the same weight, then there is no need to maintain a priority queue with the shortest edges. We can instead just run a regular Breadth-first search (BFS), in which we maintain a deque with all nodes at the current depth. At each step we iterate over all nodes at the current depth (popping them from the left of the deque), and for each node we add all its not-yet-visited neighbors to the right side of the deque, forming the next level.
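For the unweighted case, the level-by-level BFS described above might look like the following sketch. The adjacency-list format (a dict of lists) and the function name are assumptions made for this illustration.

```python
from collections import deque

def bfs_distances(adj, start):
    """Return the number of edges on a shortest path from start to each node.

    adj: dict mapping node -> list of neighbors (assumed format).
    """
    dist = {start: 0}
    queue = deque([start])
    depth = 0
    while queue:
        depth += 1
        # Process exactly the nodes that are at the current depth.
        for _ in range(len(queue)):
            node = queue.popleft()
            for nxt in adj.get(node, []):
                if nxt not in dist:      # not yet visited
                    dist[nxt] = depth
                    queue.append(nxt)    # belongs to the next level
    return dist
```

Each node enters the deque at most once, which is where the O(V + E) bound comes from.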
The below algorithm works for both BFS and Dijkstra's, but for simplicity's sake for the rest of the answer I'll pretend that the edges have positive weights and we will use Dijkstra's. What is important to take away though is that for either algorithm we will only "visit" or "explore" a node for a path that must be the shortest path to that node. This property is essential for the algorithm to be efficient, since we know that we will at most visit each of the V nodes and E edges only one time, giving us a time complexity of O(V + E). If we use Dijkstra's we have to multiply this with log(V) for the priority queue usage (this also applies to the time complexity mentioned in the summary).
Our Problem
In our case we have the additional complexity that we have K vertex-groups, for each of which our shortest path has to contain at least one of the nodes in it. This is a big problem, since it destroys our ability to simply go along with the "shortest current path".
See for example this simple graph. Notation: -- means an edge, start is the start node, and end is the end node. A vertex with value 0 does not have a vertex-group, and a vertex with value >= 1 belongs to the vertex-group of that index.
end -- 0 -- 2 -- start -- 1 -- 2
It is clear that the optimal path will first move right to the node in group 1, and then move left until the end. But this is impossible to do for the BFS and Dijkstra's algorithm we introduced above! After we move from the start to the right to capture the node in group 1, we would never ever move back left to the start, since we have already been there with a shorter path.
The Trick
In the above example, if the right-hand side had looked like start -- 0 -- 0, where 0 means the vertex does not belong to a group, then it would be of no use to go there and back to the start.
The decisive reason of why it makes sense to go there and come back, although the path will get longer, is that it includes a group that we have not seen before.
How can we keep track of whether or not a group has been included at the current position? The most efficient solution is a bit mask. So if, for example, we have already visited nodes of groups 2 and 4, then the bitmask would have bits set at positions 2 and 4, and it would have the value 2^2 + 2^4 == 4 + 16 == 20 (here ^ denotes exponentiation).
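As a small illustration of the bitmask arithmetic, the following sketch builds the mask for a set of visited groups and tests membership. The helper names are made up for this example; note that in Python the power of two is written with a shift (1 << g), since ^ is the XOR operator there.

```python
def mask_of(groups):
    """Build a bitmask with one bit set per visited group index."""
    mask = 0
    for g in groups:
        mask |= 1 << g          # set bit g
    return mask

def has_group(mask, g):
    """Check whether group g is already recorded in the mask."""
    return mask & (1 << g) != 0

visited = mask_of([2, 4])
print(visited)                   # 4 + 16
print(has_group(visited, 4), has_group(visited, 3))
```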
In the regular Dijkstra's we would just keep a one-dimensional array of size V to keep track of what the shortest path to each vertex is, initialized to a very high MAX value. array[start] begins with value 0.
We can modify this method to instead have a two-dimensional array of dimensions [2 ^ K][V], where K is the number of groups. Every value is initialized to MAX, only array[mask_value_of_start][start] begins with 0.
The value we store at array[mask][node] means Given the already visited groups with bit-mask value of mask, what is the length of the shortest path to reach this node?
Suddenly, Dijkstra's resurrected
Once we have this structure, we can suddenly use Dijkstra's again (it's the same for BFS). We simply change the rules a bit:
In regular Dijkstra's we never re-visit a node
--> in our modification we differentiate by mask and never re-visit a node if it's already been visited for that particular mask.
In regular Dijkstra's, when exploring a node, we look at all neighbors and only add them to the priority queue if we managed to decrease the shortest path to them.
--> in our modification we look at all neighbors, and update the mask we use to check for this neighbor like: neighbor_mask = mask | (1 << neighbor_group_id) (if the neighbor belongs to no group, the mask stays unchanged). We only add a {neighbor_mask, neighbor} pair to the priority queue if, for that particular array[neighbor_mask][neighbor], we managed to decrease the minimal path length.
In regular Dijkstra's we only visit unexplored nodes with the current shortest path to it, guaranteeing it to be the shortest path to this node
--> In our modification we only visit nodes that for their respective mask values are not explored yet. We also only visit the current shortest path among all masks, meaning that for any given mask it must be the shortest path.
In regular Dijkstra's we can return once we visit the end node, since we are sure we got the shortest path to it.
--> In our modification we can return once we visit the end node for the full mask, meaning the mask containing all groups, since it must be the shortest path for the full mask. This is the answer to our problem.
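The modified rules can be put together in a short sketch. It is only an illustration under several assumptions: groups are numbered from 0 (so group g uses bit g), each vertex belongs to at most one group, the graph is an adjacency list of (neighbor, weight) pairs, and a dict keyed by (mask, node) stands in for the two-dimensional array. All names are hypothetical.

```python
import heapq

def shortest_path_through_groups(adj, group_of, num_groups, start, end):
    """Length of the shortest walk from start to end touching every group.

    adj: dict node -> list of (neighbor, positive weight)
    group_of: dict node -> 0-based group index (nodes without a group omitted)
    Returns None if no such walk exists.
    """
    full_mask = (1 << num_groups) - 1

    def bit(node):
        g = group_of.get(node)
        return 0 if g is None else 1 << g

    start_mask = bit(start)
    best = {(start_mask, start): 0}          # plays the role of array[mask][node]
    heap = [(0, start_mask, start)]
    while heap:
        d, mask, node = heapq.heappop(heap)
        if d > best.get((mask, node), float("inf")):
            continue                          # stale entry for this (mask, node)
        if mask == full_mask and node == end:
            return d                          # first full-mask visit of end
        for nxt, weight in adj.get(node, []):
            nxt_mask = mask | bit(nxt)
            nd = d + weight
            if nd < best.get((nxt_mask, nxt), float("inf")):
                best[(nxt_mask, nxt)] = nd
                heapq.heappush(heap, (nd, nxt_mask, nxt))
    return None

# The line graph from the example: end -- 0 -- 2 -- start -- 1 -- 2,
# with nodes numbered 0..5 from left to right and unit edge weights.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
adj = {v: [] for v in range(6)}
for a, b in edges:
    adj[a].append((b, 1))
    adj[b].append((a, 1))
# Group "1" of the text is index 0 here, group "2" is index 1.
group_of = {2: 1, 4: 0, 5: 1}
print(shortest_path_through_groups(adj, group_of, 2, start=3, end=0))
```

In this example the search steps right to the group-1 node, returns through the start with a different mask, and then walks left to the end, which is exactly the move that plain Dijkstra's would refuse to make.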
If this is too slow...
That's it! Because time and space complexity are exponentially dependent on the number of groups K, this will only work for very small K (of course depending on the number of nodes and edges).
If this is too slow for your requirements, then there might be a more sophisticated algorithm for this that someone smarter can come up with; it will probably involve dynamic programming.
It is very possible that this is still too slow, in which case you will probably want to switch to some heuristic, that sacrifices accuracy in order to gain more speed.

simple path in Graph with special nodes with max amount of special nodes

I have this problem:
Given a graph G = (V,E) that has a subset of nodes, called R, that are "special" nodes. The number of special nodes can vary from case to case. The graph can be directed or undirected, does not have weights, and can contain cycles.
Now, I need an algorithm that can find a path from a node s to a node t that passes through the maximum number of "special" nodes in R.
I'm aware that this problem is NP-hard, and is easily reducible from Hamiltonian path, but I've been looking for different ways of solving it without having to brute-force all paths.
First attempt
First I tried doing some preprocessing of the graph, where every edge that goes to a "normal" node, gets a weight of 2, and every edge to a node in R gets a weight of 0.
Then I would just run dijkstra on the graph.
A counterexample to this, however, could look like this:
In this graph, Dijkstra would pick path [s,4,t] even though path [s,1,2,3,t] is an actual simple path with the maximum number of red nodes.
Second Attempt
My second attempt was a bit more convoluted. In this attempt I would run a BFS from the s-node and each R-node in the graph. I would then create a new reachability graph that could model which R-nodes are connected to each other.
This approach would run into major issues in any graph that has cycles or is not directed, as connections between R-nodes that did not exist in the original graph would be included in the new graph.
So if anyone has any suggestions for smart preprocessing steps that I could take, I would be very happy.
Your first method seems good, for example:
weight of all edges to some node v in R = 1
weight of rest of edges = 0
Then run Dijkstra with a cutoff = max special nodes

Shortest path passing through some defined nodes

In a directed graph, find the shortest path from s to t such that the path passes through a certain subset of V, let's call them death nodes. The algorithm is given a number n; while traversing from s to t, the path cannot pass through more than n death nodes. What is the best way to find the shortest path here? I am thinking of Dijkstra's, but how do we make sure we are not passing through more than n death nodes? Please help me tweak Dijkstra's to include this condition.
Small n
If n is small you can make n copies of your graph, call them levels 1 to n.
You start at s in level 1. If you are at a normal node, the edges take you to nodes within the same level. If you are at a death node, the edges take you to nodes within the next level. If you are at a death node on level n, the edges are simply omitted.
Also connect the t nodes at all levels to a new single destination T (with zero weight).
Then compute the shortest path from s to T.
The problem with this approach is that the graph size goes up by a factor of n, so it is only appropriate for small n.
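One way to sketch the layered idea without physically copying the graph is to run Dijkstra's over (node, count) states, where count is the number of death nodes on the path so far. The sketch below is an illustration under stated assumptions: counts range from 0 to n so that paths with exactly n death nodes are still allowed, s and t are counted if they are death nodes themselves, and the graph is a dict of (neighbor, weight) lists. Function and parameter names are hypothetical.

```python
import heapq

def shortest_path_limited(adj, death, s, t, n):
    """Shortest s-t distance using at most n death nodes, or None.

    adj: dict node -> list of (neighbor, positive weight)
    death: set of death nodes
    """
    k0 = 1 if s in death else 0
    if k0 > n:
        return None
    best = {(s, k0): 0}                  # (node, death nodes used) -> distance
    heap = [(0, s, k0)]
    while heap:
        d, node, k = heapq.heappop(heap)
        if d > best[(node, k)]:
            continue                     # stale heap entry
        if node == t:
            return d                     # smallest distance over all levels
        for nxt, weight in adj.get(node, []):
            nk = k + (1 if nxt in death else 0)
            if nk > n:
                continue                 # would exceed the allowed budget
            nd = d + weight
            if nd < best.get((nxt, nk), float("inf")):
                best[(nxt, nk)] = nd
                heapq.heappush(heap, (nd, nxt, nk))
    return None
```

The state space has up to V * (n + 1) entries, which mirrors the factor-of-n growth mentioned above.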
Large n
An alternative approach is to increase the weight for each edge leaving a death node by a variable x.
As you increase the variable x, the shortest path will use fewer and fewer death nodes. Adjust the value for x (e.g. with bisection) until the shortest path uses at most n death nodes.
This should take around O(log n) evaluations of the shortest path.
I'd add the number of dead nodes encountered on the way as a new (sparse) dimension to the computed distance -- basically you'd have up to n best distances per node.
Implementing your own BFS would be similar: You'll need to treat "seen with x dead nodes" different from "seen with y dead nodes" for each node, unless the total distance and number of dead nodes on the way are both smaller.
p.s.: If you get stuck with this approach, please post code so far O:)

Use O(n^2) time to fix a mistake in bipartite matching

This is a problem from Algorithm Design book.
Given a bipartite graph G=(V,E) where V=(A,B) such that |A|=|B|=n.
We manage to perfectly match n-2 nodes in A to n-2 nodes in B. However, for the remaining two nodes in A we map them both to a certain node in B (not one of the n-2 nodes in B that are already matched to.)
Given the information from the "matching" above, how to use O(n^2) time to decide whether a perfect matching between A and B actually exists? A hint is fine. Thank you.
Let's have u and v be the two nodes in A that match to the same node x in B. Pick one of those two nodes - call it u - and remove the edge to x from the matching. You are now left with a graph where you have a matching between n - 1 of the nodes from A and n - 1 of the nodes from B. The question now is whether you can extend this matching to make it even bigger.
There's a really nice way to do this using Berge's theorem, which says that a matching in a graph is maximum if and only if there is no alternating path between two unmatched nodes. (An alternating path is one that alternates between using edges not included in the matching and edges included in the matching.) After the removal, u is the only unmatched node in A and exactly one node of B is unmatched; call it y (x itself is still matched to v). You can find a path like this by starting from the node u and trying to find a path to y by doing a modified breadth-first search, where when you go from A to B you only follow unmatched edges and when you go from B back to A you only follow matched edges. If an alternating path exists from u to y, then you'll be sure to find it this way, and if no such path exists, then you can be certain of that as well.
If you do find an alternating path from u to y, you can "flip" it to increase the size of the matching by one. Specifically, take all the edges in the path that aren't in the matching and add them in, and take all the edges that were in the matching and delete them. The result is still a valid matching that has one more edge in it than what you started with (if you don't see why this is, play around with some examples and see what you find, or look at the proof of Berge's theorem).
Overall, this approach will require time O(m + n), where m is the number of edges in the graph and n is the number of nodes. The number of edges m is at most O(n^2) in a bipartite graph, so this matches your time bound (and, in fact, is actually a bit tighter!)
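A single alternating-path search with the flip step could be sketched as below. This is a minimal illustration, assuming the graph is given as adjacency lists from A to B and the current matching as two dicts (A to B and B to A); the function name and data layout are not from the original answer.

```python
from collections import deque

def augment(adj, match_a, match_b, u):
    """Try to grow the matching by one, starting at the unmatched A-node u.

    adj: dict A-node -> list of B-nodes
    match_a / match_b: current matching as A->B and B->A dicts (modified
    in place when an alternating path to an unmatched B-node is found).
    Returns True if the matching was enlarged.
    """
    parent = {}                    # B-node -> A-node it was reached from
    queue = deque([u])
    while queue:
        a = queue.popleft()
        for b in adj.get(a, []):
            if b in parent:
                continue           # already reached (includes a's own match)
            parent[b] = a          # A -> B along an edge not in the matching
            if b not in match_b:
                # Unmatched B-node reached: flip the path back to u.
                while True:
                    a = parent[b]
                    previous_b = match_a.get(a)
                    match_a[a] = b
                    match_b[b] = a
                    if a == u:
                        return True
                    b = previous_b
            # B -> A only along the matched edge.
            queue.append(match_b[b])
    return False

# Tiny example: u and v were both mapped to x; the edge u-x has been removed.
adj = {"u": ["x"], "v": ["x", "y"], "w": ["z"]}
match_a = {"v": "x", "w": "z"}
match_b = {"x": "v", "z": "w"}
print(augment(adj, match_a, match_b, "u"), match_a)
```

Every B-node is recorded in parent at most once and every edge is scanned at most once, so the search stays within O(m + n).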
Transform this problem to the max flow min cut problem by adding a source s which is connected to A by unit capacity edges and a sink t to which B is connected by unit capacity edges.
As templatetypedef said in their answer, we already have a flow of size n-1 on this network.
The problem is now to determine whether the size of the flow can be increased to n. This can be achieved by running one round of the Edmonds-Karp heuristic, which takes O(E)=O(n^2) time (i.e. find the shortest path in the residual graph of the flow of size n-1 above and look for the bottleneck edge).

Shortest path in absence of the given edge

Suppose we are given a weighted graph G(V,E).
The graph contains N vertices (numbered from 0 to N-1) and M bidirectional edges.
Each edge (vi,vj) has a positive distance d (i.e. the distance between the two vertices vi and vj is d).
There is at most one edge between any two vertices, and there are no self-loops (i.e. no edge connects a vertex to itself).
Also we are given S, the source vertex, and D, the destination vertex.
Let Q be the number of queries; each query contains one edge e(x,y).
For each query, we have to find the shortest path from the source S to the destination D, assuming that edge (x,y) is absent from the original graph.
If no path exists from S to D, then we have to print No.
Constraints are high: 0<=(N,Q,M)<=25000
How can this problem be solved efficiently?
So far I have implemented the simple Dijkstra algorithm.
For each query, I set the weight of (x,y) to infinity and run Dijkstra's shortest path again.
But this approach will be very slow, as the overall complexity will be Q * (time complexity of Dijkstra's shortest path).
Example:
N=6,M=9
S=0 ,D=5
(u,v,cost(u,v))
0 2 4
3 5 8
3 4 1
2 3 1
0 1 1
4 5 1
2 4 5
1 2 1
1 3 3
Total Queries =6
Query edge=(0,1) Answer=7
Query edge=(0,2) Answer=5
Query edge=(1,3) Answer=5
Query edge=(2,3) Answer=6
Query edge=(4,5) Answer=11
Query edge=(3,4) Answer=8
First, compute the shortest path tree from source node to destination.
Second, loop over all the queries and cut the shortest path at the edge specified by the query; this defines a min-cut problem, where you have the distance between the source node and the frontier of one partition and the frontier of the another and the destination; you can compute this problem very easily, at most O(|E|).
Thus, this algorithm requires O(Q|E| + |V|log|V|), asymptotically faster than the naïve solution when |V|log|V| > |E|.
This solution reuses Dijkstra's computation, but still processes each query individually, so perhaps there is room for improvement by reusing the work done in a previous query in successive queries, by observing the shape of the cut induced by the edge.
For each query the graph changes only very slightly, so you can reuse a lot of your computation.
I suggest the following approach:
Compute the shortest path from S to all other nodes (Dijkstras Algorithm does that for you already). This will give you a shortest path tree T.
For each query, take this tree, pruned by the edge (x,y) from the query. This might be the original tree (if (x,y) was nowhere on the tree) or a smaller tree T'.
If D is in T', you can take the original shortest path.
Otherwise start Dijkstra, but use the labels you already have from T' (these paths are already shortest) as permanent labels.
If you run Dijkstra in step 2, you can reuse the pruned-off part of tree T in the following way: Every time you want to mark a node permanent (which is one of the nodes not in T') you may attach the entire subtree of this node (from the original tree T) to your new shortest path tree and label all its nodes permanent.
This way you reuse as much information as possible from the first shortest path run.
In your example this would mean:
Compute shortest path tree:
0->1->2->3->4->5
(in this case a very simple one)
Now assume we get query (1,2).
We prune edge (1,2) leaving us with
0->1
From there we start Dijkstra, getting 2 and 3 as the next permanently marked nodes.
We connect 0 to 2 and 1 to 3 in the new shortest path tree (edge (1,2) is gone, so 2 is now reached directly from 0) and attach the old subtree from 3:
2<-0->1->3->4->5
So we got the shortest path with just running one additional step of Dijkstras Algorithm.
The correctness of the algorithm follows from all paths in tree T being at most as long as in the new Graph from the Query (where every shortest path can only be longer). Therefore we can reuse every path from the tree that is still feasible (i.e. where no edge was removed).
If performance matters a lot, you can improve on the Dijkstra performance through a lot of implementation tricks. A good entry point for this might be the DIMACS Shortest Path Implementation Challenge.
One simple optimization: first run Dijkstra on complete graph (with no edges removed).
Then, for each query - check if the requested edge belongs to that shortest path. If not - removing this edge won't make any difference.
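This last optimization is easy to sketch: run Dijkstra's once, remember the edges of the S-to-D shortest path, and only recompute when a query removes one of them. The code below is a minimal sketch under the problem's assumptions (undirected graph, positive weights, at most one edge per vertex pair); the function names are made up, and the worst case is still one full Dijkstra run per query.

```python
import heapq

def dijkstra(adj, src, banned=None):
    """Distances and parents from src, ignoring the undirected edge `banned`."""
    dist = {src: 0}
    parent = {src: None}
    heap = [(0, src)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue                                  # stale heap entry
        for u, w in adj[v]:
            if banned is not None and frozenset((v, u)) == banned:
                continue                              # edge removed by query
            nd = d + w
            if nd < dist.get(u, float("inf")):
                dist[u] = nd
                parent[u] = v
                heapq.heappush(heap, (nd, u))
    return dist, parent

def answer_queries(n, edges, s, d, queries):
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    dist, parent = dijkstra(adj, s)
    if d not in dist:
        return ["No"] * len(queries)                  # D unreachable anyway
    # Collect the edges of one shortest S-D path.
    path_edges = set()
    v = d
    while parent[v] is not None:
        path_edges.add(frozenset((v, parent[v])))
        v = parent[v]
    answers = []
    for x, y in queries:
        edge = frozenset((x, y))
        if edge not in path_edges:
            answers.append(dist[d])                   # path is unaffected
        else:
            new_dist, _ = dijkstra(adj, s, banned=edge)
            answers.append(new_dist.get(d, "No"))
    return answers

# The example from the question.
edges = [(0, 2, 4), (3, 5, 8), (3, 4, 1), (2, 3, 1), (0, 1, 1),
         (4, 5, 1), (2, 4, 5), (1, 2, 1), (1, 3, 3)]
queries = [(0, 1), (0, 2), (1, 3), (2, 3), (4, 5), (3, 4)]
print(answer_queries(6, edges, 0, 5, queries))
```

For the example data this reproduces the answers listed in the question; queries (0,2) and (1,3) are answered without a second Dijkstra run because those edges are not on the original shortest path.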
