Number of Hamilton paths in an extremely dense undirected simple graph - algorithm

What is the fastest way (algorithm) to calculate the number of Hamilton paths in an extremely dense undirected simple graph (approximately 99.99% of all possible edges are present)?
I was thinking of the following approach:
First, calculate the number of Hamilton paths in the complete graph.
Then remove one edge at a time. However, I am not able to figure out how many paths are lost when an edge is removed, or how to prevent double counting while removing the edges.
I came across a similar question on Math.SE but that was about Hamilton cycles and not paths, I hope that changes the question significantly. Also the answers were not quite clear, hence this post.

I don't think you can calculate the number of Hamilton paths without
actually generating the paths or considering each path individually
when counting. For special graphs -- like the complete graph -- this
is certainly possible but not in general.
You could generate all Hamilton paths in the complete graph and check
for each one if it uses a subset of the edges in your graph. Of course
you can speed things up by already pruning certain branches while
generating the Hamilton paths in the complete graph.
Since your graph is very large, this approach is certainly not
feasible. However, you can calculate the number of all paths in the
complete graph that contain one of the missing edges and then subtract
this number.
I don't think this is trivial. Some thoughts on it: Let's consider the
simplest case that only one edge is missing. We can describe a path
as a sequence of nodes, so each undirected path is counted once per
direction and the complete graph on n nodes has n! Hamilton paths.
There are n-1 possible positions of the missing edge in a
hamilton path through the complete graph. The edge may be traversed in
two directions and the n-2 nodes that are not endpoints of the edge
can be arranged in (n-2)! different orders. Hence we can subtract
2 * (n-1) * (n-2)! = 2 * (n-1)!
from the total number of hamilton paths through the complete graph to
obtain the desired result (halve it if a path and its reverse should
count as one).
If exactly two edges are missing we cannot just subtract twice the
number because we are counting several paths twice, namely the paths
containing both edges. So we have to calculate this number and add it
again. But now it becomes complicated: It is important how the edges
are related. If they are adjacent, the number is smaller than it would
be otherwise. So in general you cannot just calculate the number of
hamilton paths containing k of the missing edges; it matters
which edges you are considering and whether they are
adjacent or not.
But let's say you can calculate the number of paths through a certain
selection of edges (all permutations, directions of traversal and
positions in the paths). Let's further assume that k edges are
missing. You can calculate the number of paths including at least one
of the edges like this:
Calculate the number of paths through any of the k edges individually
and sum them up.
For each pair of edges you have counted the paths traversing the pair
twice, so subtract these paths again (consider each pair
individually).
Now consider the paths containing three of the edges. They have been
counted three times (once per edge) and subtracted three times (3
different pairs), so their net count is zero and you have to add them
back once (consider each triple individually).
The paths containing four edges then have to be subtracted once, those
containing five added once, and so on with alternating signs (this is
the inclusion-exclusion principle).
But again: You have to consider each combination of edges
individually. It is even possible that a certain set of edges is
incompatible, because a certain node occurs in three of the edges or
because the edges close a cycle. Also take into
account the directions in which the edges are traversed.
So there is no simple formula but if the number of missing edges is
really small, you can count the paths.
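The counting scheme above can be sketched in Python, under some assumptions that are not in the original post: nodes are numbered 0..n-1, the missing edges are distinct, n is at least 2, and the names `count_hamilton_paths` and `_sequences_through` are made up for illustration. The sketch relies on one observation: a compatible set of edges (every node in at most two of them, no cycle) splits into chains, each chain can be glued into a single block that may be traversed in two directions, and the blocks together with the untouched nodes can then be ordered freely.

```python
from itertools import combinations
from math import factorial


def _sequences_through(n, edges):
    """Number of node orderings of K_n in which every edge in `edges` is used."""
    parent = list(range(n))  # union-find, used to detect cycles and count chains
    degree = [0] * n

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        ru, rv = find(u), find(v)
        # Incompatible set: a node in three edges, or the edges close a cycle.
        if degree[u] > 2 or degree[v] > 2 or ru == rv:
            return 0
        parent[ru] = rv

    chains = len({find(v) for v in range(n) if degree[v] > 0})
    # Each edge merges two blocks, so n - len(edges) blocks remain;
    # every chain can be laid out in two directions.
    return factorial(n - len(edges)) * 2 ** chains


def count_hamilton_paths(n, missing):
    """Undirected Hamilton paths of K_n minus the (distinct) edges in `missing`."""
    total = 0
    for size in range(len(missing) + 1):
        for subset in combinations(missing, size):
            # Inclusion-exclusion: alternate the sign with the subset size.
            total += (-1) ** size * _sequences_through(n, subset)
    return total // 2  # every undirected path was counted in both directions
```

For example, `count_hamilton_paths(4, [(0, 1), (2, 3)])` gives 4, the number of Hamilton paths in the 4-cycle that remains. The loop visits all 2^k subsets of the k missing edges, so this is only practical when k is small, as the answer says.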

Related

Algorithm: Minimal path alternating colors

Let G be a directed weighted graph with nodes colored black or white, and all weights non-negative. No other information is specified--no start or terminal vertex.
I need to find a path (not necessarily simple) of minimal weight which alternates colors at least n times. My first thought is to run Kosaraju's algorithm to get the component graph, then find a minimal path between the components. Then you could select nodes with in-degree equal to zero since those will have at least as many color alternations as paths which start at components with in-degree positive. However, that also means that you may have an unnecessarily long path.
I've thought about maybe trying to modify the graph somehow, by perhaps making copies of the graph that black-to-white edges or white-to-black edges point into, or copying or deleting edges, but nothing that I'm brain-storming seems to work.
The comments mention using Dijkstra's algorithm, and in fact there is a way to make this work. If we create a new "root" vertex in the graph, and connect it to every other vertex with a zero-weight directed edge, we can run a modified Dijkstra's algorithm from the root outwards, terminating as soon as a path's inversion count reaches n. It is important to note that we must allow revisiting each vertex in the implementation, so the key of each vertex in our priority queue will not be merely node_id, but a tuple (node_id, inversion_count), representing that vertex as reached with that many inversions so far. In doing so, we implicitly make n+1 copies of each vertex, one per inversion count from 0 to n. Visually, we are effectively making n+1 copies of our graph, and redirecting every edge between a black vertex and a white vertex so that it leads from the ith to the (i+1)th inversion graph. We run the algorithm until we reach a path with n inversions. Alternatively, we can connect each vertex on the nth inversion graph to a "sink" vertex, and run any conventional path finding algorithm on this graph, unmodified. This will run in O(n(E + Vlog(nV))) time. You could optimize this quite heavily, and also consider using A* instead, with smallest_inversion_weight * (n - inversion_count) as a heuristic.
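A minimal sketch of this layered search in Python, assuming the graph is given as an adjacency dict `adj` mapping each node to a list of `(neighbor, weight)` pairs and the colors as a dict `color`; both representations and the function name `min_alternating_path` are made up for illustration. Instead of adding an explicit root vertex, the queue is seeded with every vertex at distance 0, which has the same effect as zero-weight edges from the root.

```python
import heapq


def min_alternating_path(adj, color, n):
    """Minimal weight of a walk whose colors alternate at least n times, or None."""
    # Seeding every vertex at distance 0 plays the role of the virtual root.
    # Node ids must be mutually comparable, because ties in the heap fall
    # through to comparing them.
    heap = [(0, node, 0) for node in color]
    heapq.heapify(heap)
    done = set()  # settled (node, inversion_count) states

    while heap:
        dist, u, inversions = heapq.heappop(heap)
        if inversions == n:
            return dist  # first state popped on layer n is optimal
        if (u, inversions) in done:
            continue
        done.add((u, inversions))
        for v, weight in adj.get(u, []):
            # Moving between differently colored vertices climbs one layer.
            next_inversions = inversions + (color[u] != color[v])
            if (v, next_inversions) not in done:
                heapq.heappush(heap, (dist + weight, v, next_inversions))
    return None
```

Because the state is the pair (node, inversion count), a vertex can be visited again on a higher layer, which is what allows non-simple paths.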
Furthermore, another idea hit me regarding using knowledge of the inversion requirement to speed up the search, but I was unable to find a way to implement it without exceeding O(V^2) time. The idea is that you can use an addition chain (like binary exponentiation) to decompose the shortest n-inversion path into two smaller paths, and rinse and repeat in a divide and conquer fashion. The issue is you would need to construct tables for the shortest i-inversion path between any two vertices, which would be O(V^2) entries per i, and O(V^2logn) overall. To construct each table, for every entry in the preceding table you'd need to append V other paths, so it'd be O(V^3logn) time overall. Maybe someone else will see a way to merge these two ideas into a O((logn)(E + Vlog(Vlogn))) time algorithm or something.

Maximal number of vertex pairs in undirected not weighted graph

Given an undirected, unweighted graph with any type of connectivity, i.e. it can consist of one or several components, with or without isolated nodes; each node can have zero or more connections, and cycles are allowed (but no loops from a node to itself).
I need to find the maximal number of vertex pairs, assuming that each vertex can be used only once. For example, if the graph has nodes 1, 2, 3 and node 3 is connected to nodes 1 and 2, the answer is one (1-3 or 2-3).
I am thinking about the following approach:
Remove all single nodes.
Find the edge connecting a node with the minimal number of edges to a node with the maximal number of edges (if there are several, take any of them), count it, and remove this pair of nodes from the graph.
Repeat step 2 while graph has connected nodes.
My questions are:
Does it provide the maximal number of pairs in every case? I am worried about some extremes, like cycles connected by one or several paths, etc.
Is there any faster and correct algorithm?
I can use java or python, but pseudocode or just algo description is perfectly fine.
Your approach is not guaranteed to provide the maximum number of vertex pairs even in the case of a cycle-free graph. For example, in the following graph your approach is going to select the edge (B,C). After that unfortunate choice, there are no more vertex pairs to choose from, and therefore you'll end up with a solution of size 1. Clearly, the optimal solution contains two vertex pairs, and hence your approach is not optimal.
The problem you're trying to solve is the Maximum Matching Problem (not to be confused with the Maximal Matching Problem which is trivial to solve):
Find the largest subset of edges S such that no vertex is incident to more than one edge in S.
The Blossom Algorithm solves this problem in O(EV^2).
The way the algorithm works is not straightforward and it introduces nontrivial notions (like a contracted matching, forest expansions and blossoms) to establish the optimal matching. If you just want to use the algorithm without fully understanding its intricacies you can find ready-to-use implementations of it online (such as this Python implementation).
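For small graphs, the definition of a maximum matching can also be checked directly. The following is a minimal sketch of an exhaustive search over subsets of free nodes; it is not the Blossom algorithm, it takes exponential time, and it is only meant as a reference to test other solutions against. It assumes nodes are numbered 0..num_nodes-1, and the name `maximum_matching_size` is made up for illustration.

```python
from functools import lru_cache


def maximum_matching_size(num_nodes, edges):
    """Size of a maximum matching, by exhaustive search (small graphs only)."""
    # adj[u] is a bitmask of the neighbors of u.
    adj = [0] * num_nodes
    for u, v in edges:
        adj[u] |= 1 << v
        adj[v] |= 1 << u

    @lru_cache(maxsize=None)
    def best(free):
        """Largest matching that uses only the nodes in the bitmask `free`."""
        if free == 0:
            return 0
        u = (free & -free).bit_length() - 1  # lowest-numbered free node
        rest = free & ~(1 << u)
        result = best(rest)  # option 1: leave u unmatched
        candidates = adj[u] & rest
        while candidates:  # option 2: match u with each free neighbor in turn
            low = candidates & -candidates
            result = max(result, 1 + best(rest & ~low))
            candidates ^= low
        return result

    return best((1 << num_nodes) - 1)
```

With the example from the question renumbered to nodes 0, 1, 2 (node 2 connected to 0 and 1), `maximum_matching_size(3, [(2, 0), (2, 1)])` returns 1.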

Find the lowest-cost shortest path from one node to another?

I have a weighted graph G and a pair of nodes s and t. I want to find, of all the paths from s to t with the fewest number of edges, the one that has the lowest total cost. I'm not sure how to do this. Here are my thoughts:
I am thinking of finding the shortest path, and if there is more than one such path, I should compare the number of steps of these paths.
I think I can find the number of steps by setting the weights of all edges to 1 and calculate the distance.
A reasonable first guess for a place to start here is Dijkstra's algorithm, which can solve each individual piece of this problem (minimize number of edges, or minimize total length). The challenge is getting it to do both at the same time.
Normally, when talking about shortest paths, we think of paths as having a single cost. However, you could imagine assigning paths two different costs: one cost based purely on the number of edges, and one cost based purely on the weights of those edges. You could then represent the cost of a path as a pair (length, weight), where length is the number of edges in the path and weight is the total weight of all of those edges.
Imagine running Dijkstra's algorithm on a graph with the following modifications. First, instead of tracking a candidate distance to each node in the graph, you track a pair of candidate distances to each node: a candidate length and a candidate weight. Second, whenever you need to fetch the lowest-cost node, pick the node that has the shortest length (not weight). If there's a tie between multiple nodes with the same length, break the tie by choosing the one with the lowest weight. (If you've heard about lexicographical orderings, you could consider this as taking the node whose (length, weight) pair is lexicographically first.) Finally, whenever you update a distance by extending a path by one edge, update both the candidate length and the candidate weight to that node. You can show that this process will compute the best path to each node, where "best" means "of all the paths with the minimum number of edges, the one with the lowest cost."
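A minimal sketch of this pair-based variant in Python, assuming the graph is an adjacency dict `adj` mapping each node to a list of `(neighbor, weight)` pairs; that representation and the name `fewest_edges_then_cheapest` are made up for illustration. Python tuples already compare lexicographically, so the priority queue can hold `(length, weight, node)` entries directly.

```python
import heapq


def fewest_edges_then_cheapest(adj, s, t):
    """Return (edge_count, total_weight) of the best s-t path, or None."""
    best = {s: (0, 0)}  # node -> smallest (length, weight) pair seen so far
    # Node ids must be mutually comparable: ties fall through to comparing them.
    heap = [(0, 0, s)]

    while heap:
        length, weight, u = heapq.heappop(heap)
        if (length, weight) != best[u]:
            continue  # stale entry; a better pair was found for u later
        if u == t:
            return length, weight
        for v, w in adj.get(u, []):
            candidate = (length + 1, weight + w)
            # Tuple comparison: fewer edges first, then lower total weight.
            if v not in best or candidate < best[v]:
                best[v] = candidate
                heapq.heappush(heap, (candidate[0], candidate[1], v))
    return None
```

For instance, with edges s-a (10), a-t (10), s-b (0), b-c (0), c-t (0), the function returns `(2, 20)`: the three-edge route is cheaper by weight but loses on edge count.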
You could alternatively implement the above technique by modifying all the costs of the edges in the graph. Suppose that the edge costs are nonnegative, that the maximum-cost edge in the graph has cost U, and that the graph has n nodes. Then do the following: Add C = (n-1)U + 1 to all the costs in the graph, then run Dijkstra's algorithm on the result. The net effect of this is that the shortest path in this new graph will be the one that minimizes the number of edges used. Why? Well, every edge adds C to the cost of the path, and C is greater than the total original cost of any simple path in the graph (such a path has at most n-1 edges, each costing at most U), so if one path is cheaper than another, it either uses at least one fewer edge, or it uses the same number of edges but has cheaper weights. A smaller offset such as U+1 would not suffice: with U = 10, a two-edge path of original cost 20 would cost 42, while a three-edge path of original cost 0 would cost only 33. In fact, you can prove that this approach is essentially identical to the one above using pairs of weights - it's a good exercise!
Overall, both of these approaches will run in the same time as a normal Dijkstra's algorithm (O(m + n log n) with a Fibonacci heap, O(m log n) with another type of heap), which is pretty cool!
One node to another would be a shortest-path-algorithm (e.g. Dijkstra).
It depends on your input whether you use a heuristic function to determine the total distance to the goal-node.
If you consider heuristics, you might want to choose A*-search instead. Here you just have to accumulate the weights to each node and add the heuristic value according to it.
If you want to connect every node to every other node, you might consider Kruskal's or Prim's algorithm, but note that these build a minimum spanning tree, whose paths are not in general shortest paths.
Both do basically the same, incl. pruning.

How to find widest paths collection on a directed weighted graph

Consider the following graph:
Nodes 1 to 6 are connected by transition edges that have a direction and a volume property (red numbers). I'm looking for the right algorithm to find paths with a high volume. In the above example the output should be:
Path: [4,5,6] with a minimal volume of 17
Path: [1,2,3] with a minimal volume of 15
I've looked at Floyd–Warshall algorithm but I'm not sure it's the right approach.
Any resources, comments or ideas would be appreciated.
Finding a beaten graph:
In the comments, you clarify that you are looking for "beaten" paths. I assume this means that you are trying to contrast the paths with the average; for instance, looking for paths which can support weight at least e*w, where e > 0 is a contrast factor and w is the average edge weight. (You could have any number of contrast functions here, but the function you choose does not affect the algorithm.)
Then the algorithm to find all paths that meet this condition is incredibly simple and only takes O(m) time:
Loop over all edges to find the average weight. (Takes O(m) time.)
Calculate the threshold based on the average. (Takes O(1) time.)
Remove all edges which do not support the threshold weight. (Takes O(m) time.)
Any path in the resulting graph will be a member of the "widest path collection."
Example:
Consider that e=1.5. That is, you require that a beaten path support at least 1.5x the average edge weight. Then in the graph you provided, you will loop over all the edges to find their average weight, and multiply this by e:
((20+4)+15+3+(2+20)+(1+1+17))/9 = 9.2
9.2*1.5 = 13.8
Then you loop over all edges, removing any that have weight less than 13.8. Any remaining paths in the graph are "beaten" paths.
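The filtering step can be sketched in a few lines of Python. The edge list format `(source, target, volume)` and the name `beaten_edges` are assumptions made for illustration. In the usage note below, only the nine volumes are taken from the example; the endpoints are hypothetical, since the figure is not reproduced here.

```python
def beaten_edges(edges, e):
    """Keep the edges whose volume is at least e times the average volume.

    `edges` is a list of (source, target, volume) triples.
    """
    average = sum(volume for _, _, volume in edges) / len(edges)  # O(m)
    threshold = e * average                                       # O(1)
    # O(m): drop every edge that cannot support the threshold volume.
    return [(u, v, volume) for u, v, volume in edges if volume >= threshold]
```

Calling it with the nine volumes 20, 4, 15, 3, 2, 20, 1, 1, 17 and e = 1.5 keeps exactly the four edges with volumes 20, 15, 20 and 17, whatever their endpoints are.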
Enumerating all beaten paths:
If you then want to find the set of beaten paths with maximal length (that is, paths that are not "parts" of longer paths), the modified graph must be a DAG (because a cycle can be repeated infinitely many times). If it is a DAG, you can find the set of all maximal paths by:
In your modified graph, select the set of all source nodes (no incoming edges).
From each of these source nodes, perform a DFS (allowing repeated visits to the same node).
Every time you get to a sink node (no outgoing edges), write down the path that you took to get here.
This will take up to O(IncompleteGamma[n,1]) time (super exponential), depending on your graph. That is, it is not very feasible.
Finding the widest paths:
An actually much simpler task is to find the widest paths between every pair of nodes. To do this:
Start from the modified graph.
Run Floyd-Warshall's, using pathWeight(i,j,k+1) = max[pathWeight(i,j,k), min[pathWeight(i,k+1,k), pathWeight(k+1,j,k)]] (that is, instead of adding the weights of two paths, you take the minimum volume they can support).
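A minimal sketch of this max-min variant of Floyd-Warshall in Python, assuming positive volumes, an edge list of `(source, target, volume)` triples, and a made-up function name `widest_paths`. A width of 0 is used to mean "no path".

```python
def widest_paths(nodes, edges):
    """Return a dict mapping (i, j) to the width of the widest path from i to j."""
    width = {(i, j): 0 for i in nodes for j in nodes}  # 0 means unreachable
    for u, v, volume in edges:
        width[u, v] = max(width[u, v], volume)  # keep the widest parallel edge

    for k in nodes:  # allow k as an additional intermediate node
        for i in nodes:
            for j in nodes:
                # A path through k is only as wide as its narrower half.
                through_k = min(width[i, k], width[k, j])
                if through_k > width[i, j]:
                    width[i, j] = through_k
    return width
```

For example, with edges 1->2 (20), 2->3 (15) and 1->3 (4), the widest path from 1 to 3 goes through 2 and has width 15.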

Finding maximum number k such that for all combinations of k pairs, we have k different elements in each combination

We are given N pairs. Each pair contains two numbers. We have to find the maximum number K such that if we take any combination of J (1<=J<=K) pairs from the given N pairs, we have at least J different numbers in those selected J pairs. The same pair may occur more than once.
For example, consider the pairs
(1,2)
(1,2)
(1,2)
(7,8)
(9,10)
For this case K = 2, because for K > 2, if we select three pairs of (1,2), we have only two different numbers i.e 1 and 2.
Checking for each possible combination starting from one will take a very large amount of time. What would be an efficient algorithm for solving the problem?
Create a graph with one vertex for each number and one edge for each pair.
If this graph is a chain or a tree, the number of "numbers" is equal to the number of "pairs" plus one. After removing any number of edges from this graph, we never get fewer vertices than edges.
Now add a single cycle to this chain/tree. There is an equal number of vertices and edges. After removing any number of edges from this graph, again we never get fewer vertices than edges.
Now add any number of disconnected components, none of which contains more than one cycle. Once again, we never get fewer vertices than edges after removing any number of edges.
Now add a second cycle to any of the disconnected components. After removing all other components, we at last have more edges than vertices (more pairs than numbers).
All this leads to the conclusion that K+1 is exactly the number of edges in the smallest possible subgraph, consisting of two cycles and, possibly, a chain, connecting these cycles.
Algorithm:
For each connected component, find the shortest cycle going through every node with Floyd-Warshall algorithm.
Then for each non-overlapping pair of cycles (in single component), use Dijkstra’s algorithm, starting from any node with at least 3 edges in one cycle, to find shortest path to other cycle; and compute a sum of lengths of both cycles and a shortest path, connecting them. For each overlapping pair of cycles, just compute the number of their edges.
Now find the minimum length of all these subgraphs. And subtract 1.
The above algorithm computes K if there is at least one double-cycle component in the graph. If there are no such components, K = N.
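When implementing the algorithm above, it may help to have a slow reference to compare against on small inputs. The following sketch checks the problem statement directly by trying every combination; it is the exponential brute force that the question wants to avoid, not the cycle-based algorithm, and the name `max_k_bruteforce` is made up for illustration.

```python
from itertools import combinations


def max_k_bruteforce(pairs):
    """Largest K such that any J <= K of the pairs contain at least J numbers."""
    for j in range(1, len(pairs) + 1):
        for combo in combinations(pairs, j):
            distinct = {number for pair in combo for number in pair}
            if len(distinct) < j:
                return j - 1  # some choice of j pairs fails, so K is j - 1
    return len(pairs)  # no combination fails, so K = N
```

On the example from the question, `max_k_bruteforce([(1, 2), (1, 2), (1, 2), (7, 8), (9, 10)])` returns 2, and on a tree such as `[(1, 2), (2, 3), (3, 4)]` it returns 3, which is N.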
Seems related to MinCut/MaxFlow. Here is a try to reduce it to MinCut/MaxFlow:
- Produce one vertex for each number
- Produce one vertex for each pair
- Produce an edge from number i to a pair if the number is present in the pair, weight 1
- Produce a source node and connect it to all numbers, weight 1 for each connection
- Produce a sink node and connect all pairs to it, weight 1 for each connection
Running MaxFlow on this should give you the number K, since any set of three pairs which only contains two numbers in total will be "blocked" by the constraints on the edges through the numbers.
I am not sure whether this is the fastest solution. There might also be a matroid hidden in there somewhere, I think. In that case there is a greedy approach. But I cannot find a proof for the matroid properties of the sets you are constructing.
I made some progress on it, but not yet an efficient solution. However it may point the way.
Make a graph whose points are pairs, and connect any pair of points if they share a number. Then for any subgraph, the number of numbers in it is the number of vertices minus the number of edges. Therefore your problem is the same as locating the smallest subgraph (if any) that has more edges than vertices.
A minimal subgraph that has the same number of edges and vertices is a cycle. Therefore the graphs we're looking for are either 2 cycles that share one or more vertices, or else 2 cycles which are connected by a path. There are no other minimal types possible.
You can locate and enumerate cycles fairly easily with a breadth-first search. There may be a lot of them, but this is doable. Armed with that you can look for subgraphs of these subtypes. (Enumerate minimal cycles, look for either pairs that share points, or which are connected.) But that isn't guaranteed to be polynomial. I suspect it will be something where on average it is pretty good, but the worst case is very bad. However that may be more efficient than what you're doing now.
I keep on thinking that some kind of breadth-first search can find these in polynomial time, but I keep failing to see exactly how to do it.
This is equivalent to finding the chord that chords the smallest cycle in the graph. A very naive algorithm would be:
Check if removal of an edge results in a cycle containing the vertices corresponding to the edge. If yes, then note down the length of the smallest cycle.
