Good algorithm for finding shortest path for specific vertices

I'm solving the problem described below and can't think of a better algorithm than trying every permutation of every vertex of every group with every.
I'm given a graph of vertices, along with a list of groups of specific vertices. The goal is to find the shortest path from a specific starting vertex to a specific ending vertex, and the path must pass through at least one vertex from each specified group of vertices.
There are also vertices in the graph that are not part of any given group.
Re-visiting vertices and edges is possible.
The graph data is specified as follows:
Vertex list - each vertex is identified by a sequence number (0 to the number of vertices - 1)
Edge list - list of vertex pairs (by vertex number)
Vertex group list - list of lists of vertex numbers
A specific starting and ending vertex.
I would be grateful for any ideas for a better solution, thank you.

Summary:
We can use bitmasks to efficiently check which groups we have visited so far, and combine this with a traditional BFS/ Dijkstra's shortest-path algorithm.
If we assume E edges, V vertices, and K vertex-groups that have to be included, the below algorithm has a time complexity of O((V + E) * 2^K) and a space complexity of O(V * 2^K). The exponential 2^K term means it will only work for a relatively small K, say up to 10 or 20.
Details:
First, are the edges weighted?
If yes, then a "shortest path" algorithm will usually be a variation of Dijkstra's algorithm, in which we keep a (min) priority queue of the shortest paths. We only visit a node once it's at the top of the queue, meaning that this must be the shortest path to this node. Any other shorter path to this node would already have been added to the priority queue and would come before the current iteration. (Note: this doesn't work for negative edge weights.)
If no, meaning all edges have the same weight, then there is no need to maintain a priority queue with the shortest edges. We can instead just run a regular breadth-first search (BFS), in which we maintain a deque with all nodes at the current depth. At each step we iterate over all nodes at the current depth (popping them from the left of the deque), and for each node we add all its not-yet-visited neighbors to the right side of the deque, forming the next level.
The below algorithm works for both BFS and Dijkstra's, but for simplicity's sake for the rest of the answer I'll pretend that the edges have positive weights and we will use Dijkstra's. What is important to take away though is that for either algorithm we will only "visit" or "explore" a node for a path that must be the shortest path to that node. This property is essential for the algorithm to be efficient, since we know that we will at most visit each of the V nodes and E edges only one time, giving us a time complexity of O(V + E). If we use Dijkstra's we have to multiply this with log(V) for the priority queue usage (this also applies to the time complexity mentioned in the summary).
Our Problem
In our case we have the additional complexity that we have K vertex-groups, and our shortest path has to contain at least one of the nodes from each of them. This is a big problem, since it destroys our ability to simply go along with the "shortest current path".
See for example this simple graph. Notation: -- means an edge, start is the start node, and end is the end node. A vertex with value 0 does not have a vertex-group, and a vertex with value >= 1 belongs to the vertex-group of that index.
end -- 0 -- 2 -- start -- 1 -- 2
It is clear that the optimal path will first move right to the node in group 1, and then move left until the end. But this is impossible to do for the BFS and Dijkstra's algorithm we introduced above! After we move from the start to the right to capture the node in group 1, we would never ever move back left to the start, since we have already been there with a shorter path.
The Trick
In the above example, if the right-hand side had looked like start -- 0 -- 0, where 0 means the vertex does not belong to a group, then it would be of no use to go there and back to the start.
The decisive reason of why it makes sense to go there and come back, although the path will get longer, is that it includes a group that we have not seen before.
How can we keep track of whether or not at a current position a group is included or not? The most efficient solution is a bit mask. So if we for example have already visited a node of group 2 and 4, then the bitmask would have a bit set at the position 2 and 4, and it would have the value of 2 ^ 2 + 2 ^ 4 == 4 + 16 == 20
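As a small illustration of the mask arithmetic above, here is a sketch in Python. The helper names add_group and has_group are made up for this example. Note that the "2 ^ 2" in the text means exponentiation; in Python, ^ is XOR, so the code uses shifts instead.

```python
def add_group(mask, group_id):
    """Return mask with the bit for group_id set (hypothetical helper)."""
    return mask | (1 << group_id)


def has_group(mask, group_id):
    """True if the bit for group_id is set in mask (hypothetical helper)."""
    return (mask >> group_id) & 1 == 1


mask = 0                    # no group visited yet
mask = add_group(mask, 2)   # visited a node of group 2
mask = add_group(mask, 4)   # visited a node of group 4
print(mask)                 # 20, i.e. (1 << 2) + (1 << 4)
```

Setting a bit that is already set leaves the mask unchanged, so visiting a second node of the same group costs nothing extra.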
In the regular Dijkstra's we would just keep a one-dimensional array of size V to keep track of what the shortest path to each vertex is, initialized to a very high MAX value. array[start] begins with value 0.
We can modify this method to instead have a two-dimensional array of dimensions [2 ^ K][V], where K is the number of groups. Every value is initialized to MAX, only array[mask_value_of_start][start] begins with 0.
The value we store at array[mask][node] means Given the already visited groups with bit-mask value of mask, what is the length of the shortest path to reach this node?
Suddenly, Dijkstra's resurrected
Once we have this structure, we can suddenly use Dijkstra's again (it's the same for BFS). We simply change the rules a bit:
In regular Dijkstra's we never re-visit a node
--> in our modification we differentiate by mask and never re-visit a node if it's already been visited for that particular mask.
In regular Dijkstra's, when exploring a node, we look at all neighbors and only add them to the priority queue if we managed to decrease the shortest path to them.
--> in our modification we look at all neighbors, and update the mask we use to check for this neighbor like: neighbor_mask = mask | (1 << neighbor_group_id). We only add a {neighbor_mask, neighbor} pair to the priority queue, if for that particular array[neighbor_mask][neighbor] we managed to decrease the minimal path length.
In regular Dijkstra's we only visit unexplored nodes with the current shortest path to it, guaranteeing it to be the shortest path to this node
--> In our modification we only visit nodes that for their respective mask values are not explored yet. We also only visit the current shortest path among all masks, meaning that for any given mask it must be the shortest path.
In regular Dijkstra's we can return once we visit the end node, since we are sure we got the shortest path to it.
--> In our modification we can return once we visit the end node for the full mask, meaning the mask containing all groups, since it must be the shortest path for the full mask. This is the answer to our problem.
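The four modified rules above can be sketched roughly as follows. This is only a minimal sketch under assumptions that are not in the question: edges are given as undirected (u, v, weight) triples with positive weights, groups are numbered from 0 in the order they appear in the group list, and the function name is invented for the example.

```python
import heapq


def shortest_path_through_groups(num_vertices, edges, groups, start, end):
    """Length of the shortest walk from start to end that touches every group.

    edges:  list of (u, v, weight), assumed undirected with positive weights.
    groups: list of lists of vertex numbers.
    Returns None if no such walk exists.
    """
    adj = [[] for _ in range(num_vertices)]
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))

    # vertex_mask[v] has one bit set for every group that v belongs to
    vertex_mask = [0] * num_vertices
    for group_id, members in enumerate(groups):
        for v in members:
            vertex_mask[v] |= 1 << group_id

    full = (1 << len(groups)) - 1
    inf = float("inf")
    # dist[mask][node]: shortest known walk to node having seen the groups in mask
    dist = [[inf] * num_vertices for _ in range(full + 1)]
    start_mask = vertex_mask[start]
    dist[start_mask][start] = 0
    heap = [(0, start_mask, start)]

    while heap:
        d, mask, node = heapq.heappop(heap)
        if d > dist[mask][node]:
            continue  # stale queue entry, this (mask, node) was settled already
        if mask == full and node == end:
            return d
        for neighbor, w in adj[node]:
            neighbor_mask = mask | vertex_mask[neighbor]
            nd = d + w
            if nd < dist[neighbor_mask][neighbor]:
                dist[neighbor_mask][neighbor] = nd
                heapq.heappush(heap, (nd, neighbor_mask, neighbor))
    return None
```

On the example graph end -- 0 -- 2 -- start -- 1 -- 2, numbering the vertices 0 to 5 from left to right with unit weights, the call shortest_path_through_groups(6, [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 4, 1), (4, 5, 1)], [[4], [2, 5]], 3, 0) returns 5: one step right to pick up the first group, then five vertices back to the left, passing the second group on the way.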
If this is too slow...
That's it! Because time and space complexity are exponentially dependent on the number of groups K, this will only work for very small K (of course depending on the number of nodes and edges).
If this is too slow for your requirements then there might be a more sophisticated algorithm for this that someone smarter can come up with, it will probably involve dynamic programming.
It is very possible that this is still too slow, in which case you will probably want to switch to some heuristic, that sacrifices accuracy in order to gain more speed.

Related

What is the algorithm that finds the minimal highest cost of all edges?

I'm trying to solve a problem where I need to find the minimal cost per step to get from a start to a goal node. I think this algorithm exists, but I can not find the name of this algorithm. In the case I am working on there are only positive edges and there could be cycles.
It is not dijkstra's, because I am not looking for the total minimum cost, but for a cost that represents the minimal highest cost of all the steps.
In the following example this algorithm would thus output 3 as 3 is the highest minimal cost the algorithm can find a path for.
And is thus not the minimal cost, as that would be 4.
*The start node is gray and the goal node is green.
I think such an algorithm exists, I have tried searching on google, but so far could not find the name of this algorithm.
This can be solved with a simple modification on dijkstra.
Dijkstra works by always picking the minimum cost path. We know that as long as path costs never decrease as we move in the graph (this is true in your case), we'll always find the optimal answer by iterating in order from lowest to highest path cost. We just have to modify the cost function to be the maximum across each path and run dijkstra.
Here's a Python implementation:
import heapq

def min_cost_path(start, end, neighbors):
    # neighbors(node) is assumed to yield (weight, neighbor) pairs
    # for the edges leaving node
    min_heap = [(0, start)]  # queue entries are (cost, node)
    visited = set()
    while min_heap:
        # get the lowest-cost path thus far;
        # cost is the lowest possible maximum edge weight on a path to node
        cost, node = heapq.heappop(min_heap)
        # optimal path to this node has already been found, ignore this
        if node in visited:
            continue
        if node == end:
            return cost
        # this node has been visited
        visited.add(node)
        # add candidate (cost, node) entries to the queue
        for weight, neighbor in neighbors(node):
            if neighbor not in visited:
                heapq.heappush(min_heap, (max(weight, cost), neighbor))
    return -1  # not possible
I have only heard of this problem for undirected graphs. For directed ones (like your example) I do not know what it's called, yet it's not hard to think of an efficient way to solve it:
We can just binary search for the answer. The initial search space is
[0, maxWeightInWholeGraph]
During each iteration of the binary search we pick some middle value m and check whether there exists a path from the start node to the goal node using only edges with weight <= m.
This can be done with a simple BFS that only traverses the allowed edges.
We then halve the search space, choosing the left part if we found a start-goal path and the right part otherwise.
We continue the binary search until we converge to the answer.
Complexity of this approach: O( (|V| + |E|) * log2(maxWeightInWholeGraph) )
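A minimal sketch of this binary search, assuming non-negative integer edge weights and a graph stored as a dict mapping each node to a list of (neighbor, weight) pairs. The function names and the input format are assumptions made for the example.

```python
from collections import deque


def reachable(adj, start, goal, limit):
    """BFS from start that only follows edges with weight <= limit."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for neighbor, weight in adj.get(node, []):
            if weight <= limit and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


def min_bottleneck(adj, start, goal):
    """Smallest m such that a start-goal path exists using only edges <= m.

    Returns None if goal cannot be reached at all.
    """
    weights = [w for nbrs in adj.values() for _, w in nbrs]
    if not weights:
        return 0 if start == goal else None
    lo, hi = 0, max(weights)
    if not reachable(adj, start, goal, hi):
        return None
    while lo < hi:
        mid = (lo + hi) // 2
        if reachable(adj, start, goal, mid):
            hi = mid        # a path exists, try a smaller limit
        else:
            lo = mid + 1    # no path, the limit must grow
    return lo
```

For non-integer weights, one option would be to binary search over the sorted list of distinct edge weights instead of over the numeric range.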

Algorithm: Minimal path alternating colors

Let G be a directed weighted graph with nodes colored black or white, and all weights non-negative. No other information is specified--no start or terminal vertex.
I need to find a path (not necessarily simple) of minimal weight which alternates colors at least n times. My first thought is to run Kosaraju's algorithm to get the component graph, then find a minimal path between the components. Then you could select nodes with in-degree equal to zero since those will have at least as many color alternations as paths which start at components with in-degree positive. However, that also means that you may have an unnecessarily long path.
I've thought about maybe trying to modify the graph somehow, by perhaps making copies of the graph that black-to-white edges or white-to-black edges point into, or copying or deleting edges, but nothing that I'm brain-storming seems to work.
The comments mention using Dijkstra's algorithm, and in fact there is a way to make this work. If we create a new "root" vertex in the graph and connect it to every other vertex with a zero-weight directed edge, we can run a modified Dijkstra's algorithm from the root outwards, terminating when a given path's inversion count reaches n. It is important to note that we must allow revisiting each vertex in the implementation, so the key of each vertex in our priority queue will not be merely node_id, but a tuple (node_id, inversion_count), representing that vertex on its ith visit. In doing so, we implicitly make n copies of each vertex, one per potential visit. Visually, we are effectively making n copies of our graph, and translating the edges between each (black_vertex, white_vertex) pair to connect between the ith and (i+1)th inversion graphs. We run the algorithm until we reach a path with n inversions. Alternatively, we can connect each vertex on the nth inversion graph to a "sink" vertex, and run any conventional path finding algorithm on this graph, unmodified. This will run in O(n(E + Vlog(nV))) time. You could optimize this quite heavily, and also consider using A* instead, with smallest_inversion_weight * (n - inversion_count) as a heuristic.
Furthermore, another idea hit me regarding using knowledge of the inversion requirement to speed up the search, but I was unable to find a way to implement it without exceeding O(V^2) time. The idea is that you can use an addition-chain (like binary exponentiation) to decompose the shortest n-inversion path into two smaller paths, and rinse and repeat in a divide and conquer fashion. The issue is you would need to construct tables for the shortest i-inversion path between any two vertices, which would be O(V^2) entries per i, and O(V^2logn) overall. To construct each table, for every entry in the preceding table you'd need to append V other paths, so it'd be O(V^3logn) time overall. Maybe someone else will see a way to merge these two ideas into a O((logn)(E + Vlog(Vlogn))) time algorithm or something.
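A rough sketch of the (node_id, inversion_count) idea, assuming vertices are numbered from 0, colors are given as a list, and edges are directed (u, v, weight) triples. Instead of adding an explicit root vertex, the sketch seeds the queue with every vertex at distance 0, which has the same effect as zero-weight edges from a root. The function name and input format are invented for the example.

```python
import heapq


def min_weight_alternating(colors, edges, n):
    """Minimum weight of a walk whose vertex colors alternate at least n times.

    colors: list with one color per vertex, e.g. "B" or "W".
    edges:  list of directed (u, v, weight) with non-negative weights.
    Returns None if no walk reaches n alternations.
    """
    adj = [[] for _ in colors]
    for u, v, w in edges:
        adj[u].append((v, w))

    inf = float("inf")
    # dist[k][v]: cheapest walk ending at v with exactly k alternations so far
    dist = [[inf] * len(colors) for _ in range(n + 1)]
    heap = []
    for v in range(len(colors)):
        dist[0][v] = 0
        heap.append((0, 0, v))  # (weight, inversion_count, node_id)
    heapq.heapify(heap)

    while heap:
        d, k, u = heapq.heappop(heap)
        if d > dist[k][u]:
            continue  # stale queue entry
        if k == n:
            return d  # first state popped on the nth inversion graph
        for v, w in adj[u]:
            nk = k + (colors[u] != colors[v])  # moves up one copy on a color change
            if d + w < dist[nk][v]:
                dist[nk][v] = d + w
                heapq.heappush(heap, (d + w, nk, v))
    return None
```

Returning as soon as any state with n inversions is popped plays the role of the "sink" vertex described above.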

Shortest path passing through some defined nodes

In a directed graph, find the shortest path from s to t such that the path passes through a certain subset of V, let's call them death nodes. The algorithm is given a number n: while traversing from s to t, the path cannot pass through more than n death nodes. What is the best way to find the shortest path here? I am thinking Dijkstra's, but how can we make sure we are not passing through more than n death nodes? Please help me tweak Dijkstra's to include this condition.
Small n
If n is small you can make n copies of your graph, call them levels 1 to n.
You start at s in level 1. If you are at a normal node, the edges take you to nodes within the same level. If you are at a death node, the edges take you to nodes within the next level. If you are at a death node on level n, the edges are simply omitted.
Also connect the t nodes at all levels to a new single destination T (with zero weight).
Then compute the shortest path from s to T.
The problem with this approach is that the graph size goes up by a factor of n, so it is only appropriate for small n.
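The level idea can also be run without building the copies explicitly, by using (node, death nodes used so far) as the Dijkstra state. The sketch below is one possible reading, with assumptions of its own: a death node is counted when the path is on it (including s and t if they are death nodes), the graph is a dict mapping each node to a list of (neighbor, weight) pairs with non-negative weights, and the function name is invented.

```python
import heapq


def shortest_path_limited(adj, death, s, t, n):
    """Shortest s-t distance over walks that visit at most n death nodes.

    adj:   dict node -> list of (neighbor, weight), directed.
    death: set of death nodes.
    Returns None if no allowed path exists.
    """
    inf = float("inf")
    start_used = 1 if s in death else 0
    if start_used > n:
        return None
    best = {(s, start_used): 0}   # (node, level) -> best known distance
    heap = [(0, s, start_used)]

    while heap:
        d, u, used = heapq.heappop(heap)
        if d > best.get((u, used), inf):
            continue  # stale queue entry
        if u == t:
            return d  # same effect as the zero-weight edges into T
        for v, w in adj.get(u, []):
            new_used = used + (1 if v in death else 0)
            if new_used > n:
                continue  # this edge is omitted on the last level
            if d + w < best.get((v, new_used), inf):
                best[(v, new_used)] = d + w
                heapq.heappush(heap, (d + w, v, new_used))
    return None
```

Only the states that are actually reached get stored, so the factor-of-n blow-up is an upper bound rather than a fixed cost.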
Large n
An alternative approach is to increase the weight for each edge leaving a death node by a variable x.
As you increase the variable x, the shortest path will use fewer and fewer death nodes. Adjust the value for x (e.g. with bisection) until the shortest path uses no more than n death nodes.
This should take around O(logn) evaluations of the shortest path.
I'd add the number of dead nodes encountered on the way as a new (sparse) dimension to the computed distance -- basically you'd have up to n best distances per node.
Implementing your own BFS would be similar: You'll need to treat "seen with x dead nodes" different from "seen with y dead nodes" for each node, unless the total distance and number of dead nodes on the way are both smaller.
p.s.: If you get stuck with this approach, please post code so far O:)

least cost path, destination unknown

Question
How would one go about finding a least cost path when the destination is unknown, but the number of edges traversed is a fixed value? Is there a specific name for this problem, or for an algorithm to solve it?
Note that maybe the term "walk" is more appropriate than "path", I'm not sure.
Explanation
Say you have a weighted graph, and you start at vertex V1. The goal is to find a path of length N (where N is the number of edges traversed, can cross the same edge multiple times, can revisit vertices) that has the smallest cost. This process would need to be repeated for all possible starting vertices.
As an additional heuristic, consider a turn-based game where there are rooms connected by corridors. Each corridor has a cost associated with it, and your final score is lowered by an amount equal to each cost 'paid'. It takes 1 turn to traverse a corridor, and the game lasts 10 turns. You can stay in a room (self-loop), but staying put has a cost associated with it too. If you know the cost of all corridors (and for staying put in each room; i.e., you know the weighted graph), what is the optimal (highest-scoring) path to take for a 10-turn (or N-turn) game? You can revisit rooms and corridors.
Possible Approach (likely to fail)
I was originally thinking of using Dijkstra's algorithm to find least cost path between all pairs of vertices, and then for each starting vertex subset the LCP's of length N. However, I realized that this might not give the LCP of length N for a given starting vertex. For example, Dijkstra's LCP between V1 and V2 might have length < N, and Dijkstra's might have excluded an unnecessary but low-cost edge, which, if included, would have made the path length equal N.
It's an interesting fact that if A is the adjacency matrix and you compute A^k using addition and min in place of the usual multiply and sum used in normal matrix multiplication, then A^k[i,j] is the length of the shortest path from node i to node j with exactly k edges. Now the trick is to use repeated squaring so that A^k needs only log k matrix multiply ops.
If you need the path in addition to the minimum length, you must track where the result of each min operation came from.
For your purposes, you want the location of the min of each row of the result matrix and corresponding path.
This is a good algorithm if the graph is dense. If it's sparse, then doing one breadth-first search per node to depth k will be faster.
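A minimal sketch of the min-plus matrix power with repeated squaring, assuming the adjacency matrix is a list of lists with float("inf") where there is no edge and k >= 1. The function names are invented for the example, and path reconstruction is left out.

```python
INF = float("inf")


def min_plus(a, b):
    """Matrix 'product' with + in place of * and min in place of sum."""
    n = len(a)
    return [
        [min(a[i][m] + b[m][j] for m in range(n)) for j in range(n)]
        for i in range(n)
    ]


def min_plus_power(a, k):
    """A^k in the min-plus sense, for k >= 1, using O(log k) products."""
    result = None
    base = a
    while k > 0:
        if k & 1:
            result = base if result is None else min_plus(result, base)
        base = min_plus(base, base)
        k >>= 1
    return result


# Edges: 0 -> 1 (cost 1), 1 -> 2 (cost 2), 2 -> 0 (cost 3), 2 -> 2 (cost 1)
A = [
    [INF, 1, INF],
    [INF, INF, 2],
    [3, INF, 1],
]
A3 = min_plus_power(A, 3)
print(min(A3[0]))  # 4: the cheapest 3-edge walk from vertex 0 is 0 -> 1 -> 2 -> 2
```

Taking the minimum of row i answers the question for start vertex i, since the destination is left open.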

algorithm to find the total number of ways to reach the last layer from the initial one of a directed graph

I want an algorithm to find the total number of ways to reach the last layer from the initial one of a directed graph whose last and first layers contain only one node. Please suggest which algorithm I should use.
If there is cycle on the path from the first node to the last one, the number of paths is infinitely large.
Otherwise, there are no cycles, so the part of the graph we are interested in is acyclic (there can be a cycle somewhere in the graph, but if it does not lie on a path between the first and last node, it does not matter). That's why we can use dynamic programming to count the number of paths:
The base case: f(start node) = 1.
Transitions: f(node) = sum of f(predecessor) over all predecessors of the node, i.e. all nodes with an edge into it. We can compute this value correctly because there is no cycle.
The answer is f(last node).
We can either use topological sort explicitly or write a recursive function with memoization(in both cases, the time and space complexity is linear).
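A short sketch of the memoized variant, assuming the graph is a dict mapping each node to a list of its successors and that no cycle is reachable from the first node (otherwise the recursion would not terminate). It counts in the mirrored direction, from each node forward to the last node, which gives the same total. The function name is invented for the example.

```python
from functools import lru_cache


def count_paths(adj, first, last):
    """Number of distinct paths from first to last in an acyclic directed graph.

    adj: dict node -> list of successor nodes.
    """
    @lru_cache(maxsize=None)
    def ways(node):
        if node == last:
            return 1  # base case: exactly one (empty) path from last to itself
        # every path from node starts with one of its outgoing edges
        return sum(ways(nxt) for nxt in adj.get(node, ()))

    return ways(first)


layers = {0: [1, 2], 1: [3, 4], 2: [4], 3: [5], 4: [5]}
print(count_paths(layers, 0, 5))  # 3
```

For very deep graphs the recursion depth may become a problem in Python; the explicit topological-sort version avoids that.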

Resources