Consider the following graph:
Nodes 1 to 6 are connected by transition edges that have a direction and a volume property (red numbers). I'm looking for the right algorithm to find paths with a high volume. In the above example the output should be:
Path: [4,5,6] with a minimal volume of 17
Path: [1,2,3] with a minimal volume of 15
I've looked at the Floyd–Warshall algorithm, but I'm not sure it's the right approach.
Any resources, comments or ideas would be appreciated.
Finding a beaten graph:
In the comments, you clarify that you are looking for "beaten" paths. I assume this means that you want to contrast the paths with the average; for instance, looking for paths which can support a weight of at least e*w, where e > 0 and w is the average edge weight. (You could use any number of contrast functions here, but the function you choose does not affect the algorithm.)
Then the algorithm to find all paths that meet this condition is incredibly simple and only takes O(m) time:
Loop over all edges to find the average weight. (Takes O(m) time.)
Calculate the threshold based on the average. (Takes O(1) time.)
Remove all edges which do not support the threshold weight. (Takes O(m) time.)
Any path in the resulting graph will be a member of the "widest path collection."
Example:
Suppose that e=1.5. That is, you require that a beaten path support at least 1.5x the average edge weight. Then, in the graph you provided, you loop over all the edges to find their average weight, and multiply this by e:
((20+4)+15+3+(2+20)+(1+1+17))/9 = 9.2
9.2*1.5 = 13.8
Then you loop over all edges, removing any that have weight less than 13.8. Any remaining paths in the graph are "beaten" paths.
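The three steps above can be sketched in a few lines of Python. This is only an illustration: the function name beaten_edges and the (source, target, volume) tuple layout are assumptions, and the edge endpoints in the sample list are guessed, since only the nine volumes appear in the example.

```python
def beaten_edges(edges, e):
    """Keep the edges whose volume is at least e times the average volume.

    edges: list of (source, target, volume) tuples (hypothetical layout).
    e: contrast factor, e > 0.
    Returns (kept_edges, threshold).
    """
    if not edges:
        return [], 0.0
    average = sum(volume for _, _, volume in edges) / len(edges)  # O(m)
    threshold = e * average                                        # O(1)
    kept = [edge for edge in edges if edge[2] >= threshold]        # O(m)
    return kept, threshold


# Guessed endpoints; only the volumes are taken from the example.
example_edges = [
    (1, 2, 20), (2, 1, 4), (2, 3, 15), (3, 4, 3), (4, 5, 20),
    (5, 4, 2), (5, 6, 17), (6, 1, 1), (3, 6, 1),
]
kept_edges, threshold = beaten_edges(example_edges, 1.5)
```

With these guessed endpoints, the surviving edges form exactly the two paths named in the question, [1,2,3] and [4,5,6].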
Enumerating all beaten paths:
If you then want to find the set of beaten paths with maximal length (that is, paths that are not "parts" of longer paths), the modified graph must be a DAG (because a cycle can be repeated infinitely many times). If it is a DAG, you can find the set of all maximal paths by:
In your modified graph, select the set of all source nodes (no incoming edges).
From each of these source nodes, perform a DFS (allowing repeated visits to the same node).
Every time you get to a sink node (no outgoing edges), write down the path that you took to get here.
This will take up to O(IncompleteGamma[n,1]) time (super exponential), depending on your graph. That is, it is not very feasible.
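A minimal sketch of this enumeration, assuming the modified graph is given as a list of (u, v) pairs and really is acyclic (the code does not check for cycles, and the name maximal_paths is made up for illustration):

```python
from collections import defaultdict


def maximal_paths(edges):
    """List every source-to-sink path of a DAG given as (u, v) pairs."""
    successors = defaultdict(list)
    nodes, has_incoming = set(), set()
    for u, v in edges:
        successors[u].append(v)
        nodes.update((u, v))
        has_incoming.add(v)

    paths = []

    def dfs(node, path):
        path.append(node)
        if not successors[node]:      # sink: write down the path taken
            paths.append(list(path))
        for nxt in successors[node]:  # no visited set: nodes may be reached again
            dfs(nxt, path)
        path.pop()

    # Sources are the nodes without incoming edges.
    for source in sorted(nodes - has_incoming):
        dfs(source, [])
    return paths
```

Because no visited set is kept, a node shared by many paths is expanded once per path, which is where the blow-up in running time comes from.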
Finding the widest paths:
A much simpler task is to find the widest paths between every pair of nodes. To do this:
Start from the modified graph.
Run Floyd-Warshall's, using pathWeight(i,j,k+1) = max[pathWeight(i,j,k), min[pathWeight(i,k+1,k), pathWeight(k+1,j,k)]] (that is, instead of adding the weights of two paths, you take the minimum volume they can support).
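The recurrence above could look roughly like this in Python. The sketch assumes nodes are labelled 0..n-1 and that all volumes are positive, so that 0 can stand for "no path"; the name widest_paths is hypothetical.

```python
def widest_paths(n, edges):
    """All-pairs widest-path (bottleneck) capacities.

    n: number of nodes, labelled 0..n-1 (an assumption of this sketch).
    edges: list of (u, v, volume) directed edges with positive volumes.
    Returns width[i][j], the largest minimum volume over all i->j paths,
    or 0 when j cannot be reached from i.
    """
    width = [[0] * n for _ in range(n)]
    for u, v, volume in edges:
        width[u][v] = max(width[u][v], volume)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # Going through k supports only the smaller of the two halves.
                through_k = min(width[i][k], width[k][j])
                if through_k > width[i][j]:
                    width[i][j] = through_k
    return width
```

For example, with edges (0, 1, 20), (1, 2, 15) and (0, 2, 3), the widest route from 0 to 2 goes through 1 and supports a volume of 15 rather than 3.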
Related
Let G be a directed weighted graph with nodes colored black or white, and all weights non-negative. No other information is specified--no start or terminal vertex.
I need to find a path (not necessarily simple) of minimal weight which alternates colors at least n times. My first thought is to run Kosaraju's algorithm to get the component graph, then find a minimal path between the components. Then you could select nodes with in-degree equal to zero since those will have at least as many color alternations as paths which start at components with in-degree positive. However, that also means that you may have an unnecessarily long path.
I've thought about maybe trying to modify the graph somehow, by perhaps making copies of the graph that black-to-white edges or white-to-black edges point into, or copying or deleting edges, but nothing that I'm brain-storming seems to work.
The comments mention using Dijkstra's algorithm, and in fact there is a way to make this work. If we create a new "root" vertex in the graph and add a zero-weight directed edge from it to every other vertex, we can run a modified Dijkstra's algorithm from the root outwards, terminating when a path's inversion count reaches n. It is important to note that we must allow revisiting each vertex in the implementation, so the key of each vertex in our priority queue will not be merely node_id, but a tuple (node_id, inversion_count), representing that vertex after inversion_count inversions. In doing so, we implicitly make one copy of each vertex per inversion count. Visually, we are effectively stacking copies of our graph, and redirecting every edge between a black vertex and a white vertex so that it leads from the ith inversion graph to the (i+1)th. We run the algorithm until we reach a path with n inversions. Alternatively, we can connect each vertex on the nth inversion graph to a "sink" vertex, and run any conventional path finding algorithm on this graph, unmodified. This will run in O(n(E + Vlog(nV))) time. You could optimize this quite heavily, and also consider using A* instead, with smallest_inversion_weight * (n - inversion_count) as a heuristic.
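A rough sketch of the (node_id, inversion_count) search described above, using a binary heap. Instead of adding an explicit root vertex, every vertex starts in the queue at cost 0, which should have the same effect. The function name, the colors dictionary and the edge tuple layout are assumptions made for illustration.

```python
import heapq


def min_weight_alternating(colors, edges, n):
    """Least total weight of a walk whose color changes at least n times.

    colors: dict node -> 'B' or 'W'.
    edges: list of (u, v, weight) with non-negative weights.
    Returns None if no such walk exists.
    """
    adjacency = {node: [] for node in colors}
    for u, v, weight in edges:
        adjacency[u].append((v, weight))

    # Starting every node at cost 0 plays the role of the extra root vertex.
    best = {(node, 0): 0 for node in colors}
    heap = [(0, 0, node) for node in colors]
    heapq.heapify(heap)
    while heap:
        cost, count, node = heapq.heappop(heap)
        if cost > best.get((node, count), float("inf")):
            continue                  # stale queue entry
        if count >= n:
            return cost               # first settled state with n inversions
        for nxt, weight in adjacency[node]:
            nxt_count = count + (colors[nxt] != colors[node])
            new_cost = cost + weight
            if new_cost < best.get((nxt, nxt_count), float("inf")):
                best[(nxt, nxt_count)] = new_cost
                heapq.heappush(heap, (new_cost, nxt_count, nxt))
    return None
```

The search stops as soon as a state with n inversions is popped, so states with more than n inversions are never created.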
Furthermore, another idea hit me regarding using knowledge of the inversion requirement to speed up the search, but I was unable to find a way to implement it without exceeding O(V^2) time. The idea is that you can use an addition chain (like binary exponentiation) to decompose the shortest n-inversion path into two smaller paths, and rinse and repeat in a divide-and-conquer fashion. The issue is that you would need to construct tables for the shortest i-inversion path between any two vertices, which would be O(V^2) entries per i, and O(V^2logn) overall. To construct each table, for every entry in the preceding table you'd need to append V other paths, so it'd be O(V^3logn) time overall. Maybe someone else will see a way to merge these two ideas into an O((logn)(E + Vlog(Vlogn))) time algorithm or something.
Question
How would one go about finding a least cost path when the destination is unknown, but the number of edges traversed is a fixed value? Is there a specific name for this problem, or for an algorithm to solve it?
Note that maybe the term "walk" is more appropriate than "path", I'm not sure.
Explanation
Say you have a weighted graph, and you start at vertex V1. The goal is to find a path of length N (where N is the number of edges traversed, can cross the same edge multiple times, can revisit vertices) that has the smallest cost. This process would need to be repeated for all possible starting vertices.
As an additional heuristic, consider a turn-based game where there are rooms connected by corridors. Each corridor has a cost associated with it, and your final score is lowered by an amount equal to each cost 'paid'. It takes 1 turn to traverse a corridor, and the game lasts 10 turns. You can stay in a room (self-loop), but staying put has a cost associated with it too. If you know the cost of all corridors (and for staying put in each room; i.e., you know the weighted graph), what is the optimal (highest-scoring) path to take for a 10-turn (or N-turn) game? You can revisit rooms and corridors.
Possible Approach (likely to fail)
I was originally thinking of using Dijkstra's algorithm to find least cost path between all pairs of vertices, and then for each starting vertex subset the LCP's of length N. However, I realized that this might not give the LCP of length N for a given starting vertex. For example, Dijkstra's LCP between V1 and V2 might have length < N, and Dijkstra's might have excluded an unnecessary but low-cost edge, which, if included, would have made the path length equal N.
It's an interesting fact that if A is the adjacency matrix and you compute A^k using addition and min in place of the usual multiplication and addition of normal matrix multiplication, then A^k[i,j] is the length of the shortest path from node i to node j with exactly k edges. Now the trick is to use repeated squaring so that A^k needs only O(log k) matrix multiplications.
If you need the path in addition to the minimum length, you must track where the result of each min operation came from.
For your purposes, you want the location of the min of each row of the result matrix and corresponding path.
This is a good algorithm if the graph is dense. If it's sparse, then doing one breadth-first search per node to depth k will be faster.
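The min-plus matrix power with repeated squaring could be sketched as follows. It assumes adj[i][j] holds the edge cost, or infinity when there is no edge from i to j; the names min_plus and cheapest_walks are made up, and path tracking is left out.

```python
INF = float("inf")


def min_plus(a, b):
    """Min-plus product: result[i][j] = min over m of a[i][m] + b[m][j]."""
    n = len(a)
    return [[min(a[i][m] + b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def cheapest_walks(adj, k):
    """Cost of the cheapest walk with exactly k edges between each pair."""
    n = len(adj)
    # Identity of the min-plus semiring: 0 on the diagonal, INF elsewhere.
    result = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    power = adj
    while k:
        if k & 1:
            result = min_plus(result, power)
        power = min_plus(power, power)  # repeated squaring
        k >>= 1
    return result
```

For a fixed start vertex i, min(cheapest_walks(adj, k)[i]) is then the cost of the best k-edge walk to any destination.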
What is the fastest way (algorithm) to calculate the number of Hamilton paths in an extremely dense undirected simple graph (approximately 99.99% edges are connected)?
I was thinking of the following way :
First, calculate the number of Hamilton paths in the complete graph.
Remove one edge at a time, but I am not able to figure out how many paths would be lost on removing an edge. Also, how can double counting be prevented while removing the edges?
I came across a similar question on Math.SE but that was about Hamilton cycles and not paths, I hope that changes the question significantly. Also the answers were not quite clear, hence this post.
I don't think you can calculate the number of Hamilton paths without
actually generating the paths or considering each path individually
when counting. For special graphs -- like the complete graph -- this
is certainly possible but not in general.
You could generate all Hamilton paths in the complete graph and check
for each one if it uses a subset of the edges in your graph. Of course
you can speed things up by already pruning certain branches while
generating the Hamilton paths in the complete graph.
Since your graph is very large, this approach is certainly not
feasible. However, you can calculate the number of all paths in the
complete graph that contain one of the missing edges and then subtract
this number.
I don't think this is trivial. Some thoughts on it: Let's consider the
simplest case that only one edge is missing. We can describe a path
with a sequence of edges or nodes. Let's say your graph has n
nodes. There are n-1 possible positions of the missing edge in a
hamilton path through the complete graph. The edge may be traversed in
two directions and the nodes not adjacent to the edge can be traversed
in (n-2)! different orders. Hence we can subtract
2 * (n-1) * (n-2)! = 2 * (n-1)!
from the total number of hamilton paths through the complete graph to
obtain the desired result.
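For small n, the single-edge formula can be checked by brute force. The sketch below counts node sequences, so every undirected path is counted once per direction, which appears to be the convention the formula uses (n! paths in the complete graph). The function name count_hamilton_paths is hypothetical, and the approach is only usable for tiny graphs.

```python
from itertools import permutations
from math import factorial


def count_hamilton_paths(n, missing):
    """Count Hamilton paths (as node sequences) in K_n minus some edges.

    n: number of nodes, labelled 0..n-1.
    missing: iterable of undirected edges (u, v) removed from K_n.
    """
    removed = {frozenset(edge) for edge in missing}
    count = 0
    for order in permutations(range(n)):
        # A sequence is valid if no consecutive pair is a removed edge.
        if all(frozenset(pair) not in removed
               for pair in zip(order, order[1:])):
            count += 1
    return count


n = 5
with_one_missing = count_hamilton_paths(n, [(0, 1)])
predicted = factorial(n) - 2 * factorial(n - 1)
```

Here with_one_missing and predicted should agree for any single removed edge.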
If exactly two edges are missing we cannot just subtract twice the
number because we are counting several paths twice, namely the paths
containing both edges. So we have to calculate this number and add it
again. But now it becomes complicated: It is important how the edges
are related. If they are adjacent, the number is smaller than it would
be otherwise. So in general you cannot just calculate the number of
Hamilton paths containing k of the missing edges but it is
important which edges you are considering and whether they are
adjacent or not.
But let's say you can calculate the number of paths through a certain
selection of edges (all permutations, directions of traversal and
positions in the paths). Let's further assume that k edges are
missing. You can calculate the number of paths including at least one
of the edges like this:
Calculate the number of paths through any of the k edges individually
and sum them up.
For each pair of edges you have counted the paths traversing the pair
twice, so subtract these paths again (consider each pair
individually).
Now consider the paths containing three of the edges. They have been
counted three times and subtracted three times (3 different pairs), so
you have to add them back once.
The paths containing four edges then have to be subtracted once, and
so on with alternating signs (this is the inclusion-exclusion
principle).
But again: You have to consider each combination of edges
individually. It is even possible that a certain set of edges is
incompatible because a certain node occurs three times. Also take into
account the directions in which the edges are traversed.
So there is no simple formula but if the number of missing edges is
really small, you can count the paths.
I have a weighted graph G and a pair of nodes s and t. I want to find, of all the paths from s to t with the fewest number of edges, the one that has the lowest total cost. I'm not sure how to do this. Here are my thoughts:
I am thinking of finding the shortest path, and if there is more than one such path, comparing the number of steps of these paths.
I think I can find the number of steps by setting the weights of all edges to 1 and calculate the distance.
A reasonable first guess for a place to start here is Dijkstra's algorithm, which can solve each individual piece of this problem (minimize number of edges, or minimize total length). The challenge is getting it to do both at the same time.
Normally, when talking about shortest paths, we think of paths as having a single cost. However, you could imagine assigning paths two different costs: one cost based purely on the number of edges, and one cost based purely on the weights of those edges. You could then represent the cost of a path as a pair (length, weight), where length is the number of edges in the path and weight is the total weight of all of those edges.
Imagine running Dijkstra's algorithm on a graph with the following modifications. First, instead of tracking a candidate distance to each node in the graph, you track a pair of candidate distances to each node: a candidate length and a candidate weight. Second, whenever you need to fetch the lowest-cost node, pick the node that has the shortest length (not weight). If there's a tie between multiple nodes with the same length, break the tie by choosing the one with the lowest weight. (If you've heard about lexicographical orderings, you could consider this as taking the node whose (length, weight) is lexicographically first.) Finally, whenever you update a distance by extending a path by one edge, update both the candidate length and the candidate weight to that node. You can show that this process will compute the best path to each node, where "best" means "of all the paths with the minimum number of edges, the one with the lowest cost."
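A minimal sketch of this variant with a binary heap, relying on the fact that Python compares tuples lexicographically. The adjacency-dict layout and the function name are assumptions, and edge weights are taken to be non-negative.

```python
import heapq


def fewest_edges_then_cheapest(graph, s, t):
    """Return (edge_count, total_weight) of the best s-t path, or None.

    graph: dict node -> list of (neighbor, weight), weights non-negative.
    "Best" means fewest edges first, lowest total weight second.
    """
    unreached = (float("inf"), float("inf"))
    best = {s: (0, 0)}
    heap = [(0, 0, s)]                # (length, weight, node)
    while heap:
        length, weight, node = heapq.heappop(heap)
        if (length, weight) > best.get(node, unreached):
            continue                  # stale queue entry
        if node == t:
            return length, weight
        for nxt, w in graph.get(node, []):
            # Extending by one edge updates both components of the pair.
            candidate = (length + 1, weight + w)
            if candidate < best.get(nxt, unreached):
                best[nxt] = candidate
                heapq.heappush(heap, (*candidate, nxt))
    return None
```

The priority queue orders entries by length first and weight second, which is exactly the tie-breaking rule described above.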
You could alternatively implement the above technique by modifying all the costs of the edges in the graph. Suppose that the graph has n nodes and that the maximum-cost edge in the graph has cost U. Then do the following: add nU+1 to all the costs in the graph, then run Dijkstra's algorithm on the result. The net effect of this is that the shortest path in this new graph will be the one that minimizes the number of edges used. Why? Well, every edge adds nU+1 to the cost of the path, and nU+1 is greater than the total original cost of any path without repeated nodes (at most n-1 edges of cost at most U each), so if one path is cheaper than another, it either uses fewer edges, or it uses the same number of edges but has cheaper weights. In fact, you can prove that this approach is essentially identical to the one above using pairs of weights - it's a good exercise!
Overall, both of these approaches will run in the same time as a normal Dijkstra's algorithm (O(m + n log n) with a Fibonacci heap, O(m log n) with another type of heap), which is pretty cool!
One node to another would be a shortest-path-algorithm (e.g. Dijkstra).
It depends on your input whether you use a heuristic function to determine the total distance to the goal-node.
If you consider heuristics, you might want to choose A*-search instead. Here you just have to accumulate the weights to each node and add the heuristic value according to it.
If you want to get all paths from any node to any other node, you might consider Kruskal’s or Prim’s algorithm.
Both do basically the same, incl. pruning.
Given a weighted directed multigraph, I have to find the shortest path from a starting vertex u to a vertex v. Apart from a weight, each edge also has a time. The path connecting u and v cannot take more than a given maximum time. The trouble is that when using Dijkstra, there is a chance that the shortest path takes more time than the limit.
My approach is to find all valid paths between u and v and then minimize the weight. But the approach is not practical due to its high complexity.
Any ideas?
If weights are small enough
In this case, what you can do is, for each node, store all possible sums of weights that you can get on a path to that node. Now you can run Dijkstra on this new graph and try to minimize time over nodes which are pairs (node, weight_sum).
If times are small enough
You can do the same as in previous example, but over pairs (node, time).
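A sketch of the (node, time) variant, assuming times are non-negative integers and weights are non-negative. Here Dijkstra minimizes weight over (node, time) states and never creates a state that exceeds the limit. The function name and the edge tuple layout are made up for illustration.

```python
import heapq


def cheapest_within_time(edges, u, v, max_time):
    """Minimum weight of a u-v path whose total time is at most max_time.

    edges: list of (source, target, weight, time); parallel edges allowed,
    weights non-negative, times non-negative integers.
    Returns None if no path fits the limit.
    """
    adjacency = {}
    for a, b, weight, time in edges:
        adjacency.setdefault(a, []).append((b, weight, time))

    best = {(u, 0): 0}
    heap = [(0, 0, u)]                # (weight so far, time so far, node)
    while heap:
        weight, time, node = heapq.heappop(heap)
        if weight > best.get((node, time), float("inf")):
            continue                  # stale queue entry
        if node == v:
            return weight             # first settled state at v is the cheapest
        for nxt, w, dt in adjacency.get(node, []):
            new_time = time + dt
            if new_time > max_time:
                continue              # this extension would break the limit
            if weight + w < best.get((nxt, new_time), float("inf")):
                best[(nxt, new_time)] = weight + w
                heapq.heappush(heap, (weight + w, new_time, nxt))
    return None
```

The number of states is at most (number of nodes) * (max_time + 1), which is why this only works when times are small enough.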
General problem
I'm afraid that in general all you can do is try all possible paths and try to speed this up with pruning.