Induced subgraph; existence of path between two nodes - algorithm

Sorry for the wall of text, it's as concise as I could make it!
I've got one very large directed graph, G, and a subset of vertices, S, from within G. What I want to do is find the subgraph of G induced by S, with the additional requirement that if some path exists in G from a vertex p to a vertex q of S, then an edge from p to q exists in the induced subgraph. This is key; it's a little more complicated (I think) than the usual induced subgraph problem.
The most rudimentary way I can think of to solve the problem is the following (I realize it's probably not the most efficient, let me know if you have other suggestions that aren't too complicated to implement): For every pair of vertices p, q within S, test for the existence of a path between them in G. If such a path exists, insert an edge between p and q in the induced subgraph. For my purposes, a quadratic number of path tests isn't that bad.
So, I suppose I have two questions:
1) What is the fastest way to determine whether or not a path EXISTS between two vertices? I don't need to know the path, just whether or not it exists. Furthermore, if there is some preprocessing I can do to the whole graph to make this calculation faster, what might it be, since I have to perform this calculation between each pair of vertices?
2) Is there a faster way than the one I suggested to find the type of induced subgraph I described?
Thanks so much for the help!

The problem of finding whether a path exists between every pair of vertices is called the transitive closure problem, and it's as hard as matrix multiplication in the general case. I would first run a strongly connected components algorithm on your graph to compress each cycle into a single node and form a directed acyclic graph. If you are lucky, you'll have some big cycles and that will make the subsequent transitive closure problem easy. Then I'd run the Floyd-Warshall all-pairs shortest paths algorithm on that graph to compute the transitive closure, because it's incredibly simple to code. Maybe one of the sub-cubic, matrix-multiplication-based algorithms will be faster, but I doubt it will be that much faster, because the constant factor of Floyd-Warshall is so low.
Here is a fast algorithm for strongly connected components.
And this contains a proof of the equivalence of matrix multiplication and transitive closure.
I am not sure if there is any good way to get around computing the transitive closure to solve your original problem. I suspect not, but on the other hand, sometimes clever people come up with something great.
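For comparison with the pairwise test in the question, here is a minimal sketch of the subgraph construction itself. It does not use Floyd-Warshall or the component compression suggested above; it simply runs one breadth-first search from each vertex of S, which costs O(|S| * (V + E)). The function name and the dict-of-successors graph representation are assumptions made for the illustration.

```python
from collections import deque

def reachability_subgraph(graph, subset):
    """Return the set of edges (p, q), p != q, with p and q in `subset`
    and q reachable from p in `graph`.

    `graph` maps each vertex to an iterable of its successors (assumed format).
    """
    subset = set(subset)
    edges = set()
    for source in subset:
        seen = {source}            # the source itself is never reported as a self-loop
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in graph.get(u, ()):
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
                    if v in subset:
                        edges.add((source, v))
    return edges
```

If S is small compared to G, this may well be enough; the closure-based approach pays off when S is most of the graph.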

Related

Minimum number of non-intersecting simple cycles in unweighted directed graph

I decided to try to implement some assignment problem algorithms. I already did some, but I got stuck on the problem described below:
To put it simply, I need to cover all the vertices of the graph with the minimum number of non-intersecting simple cycles.
But I don't understand how. Does anyone have any ideas? I would be especially glad to see an explanation.
This problem is NP-hard via a reduction from the Hamiltonian cycle problem. More specifically, if a graph has a Hamiltonian cycle, then you can cover all the vertices with a single simple cycle, namely the Hamiltonian cycle, and otherwise the graph requires multiple cycles to cover its nodes (if it can even be done at all).
As a result, unless P = NP, there are no polynomial-time algorithms for this problem. You can still solve it using either heuristic searches or brute force, but those approaches won’t necessarily be fast on all inputs.

How is Matrix Chain Multiplication a special case of shortest path in DAG?

Can the DP algorithm for Matrix Chain Multiplication be modeled as a shortest path in a DAG? I read somewhere that every DP problem is a walk on an implicit DAG, but I am unable to visualize those problems in which a transition leads to more than one state (or sub-state).
One more example where I fail to visualize the same is UVA 10003. A DP solution of the above is discussed here: Cutting a stick such that cost is minimized.
Imagine that there is a directed edge between two states if we can go from the first state to the second one (of course, a state can consist of several parameters). There are no cycles in this graph, so it is a DAG. So visualizing the DAG itself is not hard (you can just write down all states and the edges between them). But the problem cannot necessarily be modeled as a shortest path search. For example, in a problem about cutting a rope, the value for a state is a sum of the values for two other states, so it is not even a path. Anyway, it might be impractical to visualize a solution if the number of parameters is very big. And there is no need to do any visualization to solve a problem and prove the correctness of your solution.
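To make the "sum of two other states" point concrete, here is a minimal sketch of the usual matrix chain recurrence. The function name and the `dims` convention (matrix i has shape dims[i] x dims[i+1]) are assumptions for the illustration. Note that each state (i, j) combines two sub-states, (i, k) and (k+1, j), which is why the computation is a tree over the DAG of states rather than a single path.

```python
from functools import lru_cache

def matrix_chain_cost(dims):
    """Minimum number of scalar multiplications to multiply the chain.

    Matrix i has shape dims[i] x dims[i + 1]; at least one matrix is assumed.
    """
    n = len(dims) - 1  # number of matrices

    @lru_cache(maxsize=None)
    def cost(i, j):
        if i == j:
            return 0  # a single matrix needs no multiplication
        # Split at k: the value depends on TWO sub-states, not one predecessor.
        return min(
            cost(i, k) + cost(k + 1, j) + dims[i] * dims[k + 1] * dims[j + 1]
            for k in range(i, j)
        )

    return cost(0, n - 1)
```

For example, with shapes 10x30, 30x5 and 5x60, multiplying the first two matrices first costs 1500 + 3000 = 4500, against 27000 for the other order.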

Hamiltonian Cycle algorithm

I was looking for some Hamiltonian cycle algorithms, but I can't find any implementations, not even a single piece of pseudo-code! I don't even need to output the cycle, just check if the graph has one. The input is a graph with V vertices and E edges. Also, I would like to have an algorithm to check if a graph has a Hamiltonian path. I don't need to output the path, just check if it has one. Both should be in polynomial time.
The problem is one of the NP-Complete problems.
A brute force algorithm is just creating all permutations of the vertices and checking if one of them is a feasible solution.
Checking the feasibility:
Let the current permutation be v_1, v_2, ..., v_n. If for each i < n there is an edge v_i -> v_(i+1) in the graph, and also an edge v_n -> v_1, then the solution is feasible.
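The brute force check described above could be sketched as follows. The function name and the input format (a vertex list plus directed edge pairs) are assumptions; for an undirected graph, pass each edge in both directions. The first vertex is kept fixed, since every cycle passes through it, which avoids testing rotations of the same cycle.

```python
from itertools import permutations

def has_hamiltonian_cycle(vertices, edges):
    """Brute force: try every vertex order and test the feasibility condition.

    `edges` is an iterable of directed (u, v) pairs (assumed format).
    """
    vertices = list(vertices)
    if not vertices:
        return False
    edge_set = set(edges)
    first, rest = vertices[0], vertices[1:]
    for perm in permutations(rest):
        order = (first, *perm)
        n = len(order)
        # Feasible if every consecutive pair, including last -> first, is an edge.
        if all((order[i], order[(i + 1) % n]) in edge_set for i in range(n)):
            return True
    return False
```

This takes O(n!) time in the worst case, so it is only usable for very small graphs.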
An alternative is creating a graph G' = (V, E', w) where the new edge set is E' = V x V (all possible edges) and the weight function is:
w(u,v) = 1 if there is an edge (u,v) in the original graph,
w(u,v) = infinity otherwise.
Now you've got yourself a Traveling Salesman problem: the original graph has a Hamiltonian cycle exactly when the cheapest tour in G' has total weight n. You can solve it with dynamic programming in O(n^2 * 2^n).
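Here is a rough sketch of that dynamic program, specialized to the yes/no question so that it tracks reachability instead of tour weights. It is a bitmask DP over subsets in the spirit of the Held-Karp TSP algorithm, not a full TSP solver. The function name and the convention that vertices are numbered 0..n-1 are assumptions.

```python
def has_hamiltonian_cycle_dp(n, edges):
    """Subset DP in O(n^2 * 2^n) time; vertices are 0..n-1, edges are directed pairs."""
    if n == 0:
        return False
    adj = [[False] * n for _ in range(n)]
    for u, v in edges:
        adj[u][v] = True
    if n == 1:
        return adj[0][0]  # a single vertex needs a self-loop

    # reach[mask] is a bitset: bit v is set if some path starts at vertex 0,
    # visits exactly the vertices in `mask`, and ends at v.
    reach = [0] * (1 << n)
    reach[1] = 1
    for mask in range(1, 1 << n):
        ends = reach[mask]
        if not (mask & 1) or not ends:
            continue
        for v in range(n):
            if not (ends >> v) & 1:
                continue
            for w in range(n):
                if adj[v][w] and not (mask >> w) & 1:
                    reach[mask | (1 << w)] |= 1 << w

    full = (1 << n) - 1
    # Close the cycle: some full-length path must end at a vertex with an edge back to 0.
    return any((reach[full] >> v) & 1 and adj[v][0] for v in range(n))
```

Dropping the final "edge back to 0" test and trying every start vertex gives a Hamiltonian path check with the same idea. Memory is O(2^n), so this is practical only up to roughly 20 to 25 vertices.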
Unless P = NP, Hamiltonicity cannot be decided for general graphs in polynomial time.
An online HCP heuristic exists at http://fhcp.edu.au/slhweb/ where you can upload a graph and test it, but if you want your own function you will either need to write it yourself, or splice in someone else's function. Andrew Chalaturnyk wrote a very good algorithm.

Approximation algorithm for TSP variant, fixed start and end anywhere but starting point + multiple visits at each vertex ALLOWED

NOTE: Due to the fact that the trip does not end at the same place it started and also the fact that every point can be visited more than once as long as I still visit all of them, this is not really a TSP variant, but I put it due to lack of a better definition of the problem.
So..
Suppose I am going on a hiking trip with n points of interest. These points are all connected by hiking trails. I have a map showing all trails with their distances, giving me a directed graph.
My problem is how to approximate a tour that starts at a point A, visits all n points of interest, and ends anywhere but the point where I started. I want the tour to be as short as possible.
Due to the nature of hiking, I figured this would sadly not be a symmetric problem (or can I convert my asymmetric graph to a symmetric one?), since going from high to low altitude is obviously easier than the other way around.
Also I believe it has to be an algorithm that works for non-metric graphs, where the triangle inequality is not satisfied, since going from a to b to c might be faster than taking a really long and weird road that goes from a to c directly. I did consider whether the triangle inequality still holds, since there are no restrictions regarding how many times I visit each point, as long as I visit all of them, meaning I would always choose the shorter of two distinct paths from a to c and thus never take the long and weird road.
I believe my problem is easier than TSP, so those algorithms do not fit this problem. I thought about using a minimum spanning tree, but I have a hard time convincing myself that they can be applied to a non-metric asymmetric directed graph.
What I really want are some pointers as to how I can come up with an approximation algorithm that will find a near optimal tour through all n points.
To reduce your problem to asymmetric TSP, introduce a new node u and make arcs of length L from u to A and from all nodes but A to u, where L is very large (large enough that no optimal solution revisits u). Delete u from the tour to obtain a path from A to some other node via all others. Unfortunately this reduction preserves the objective only additively, which makes the approximation guarantees worse by a constant factor.
The target of the reduction Evgeny pointed out is non-metric symmetric TSP, so that reduction is not useful to you, because the approximations known all require metric instances. Assuming that the collection of trails forms a planar graph (or is close to it), there is a constant-factor approximation due to Gharan and Saberi, which may unfortunately be rather difficult to implement, and may not give reasonable results in practice. Frieze, Galbiati, and Maffioli give a simple log-factor approximation for general graphs.
If there are a reasonable number of trails, branch and bound might be able to give you an optimal solution. Both G&S and branch and bound require solving the Held-Karp linear program for ATSP, which may be useful in itself for evaluating other approaches. For many symmetric TSP instances that arise in practice, it gives a lower bound on the cost of an optimal solution within 10% of the true value.
You can simplify this problem to a normal TSP problem with n+1 vertices. To do this, take node 'A' and all the points of interest and compute a shortest path between each pair of these points. You can use the all-pairs shortest path algorithm on the original graph. Or, if n is significantly smaller than the original graph size, use a single-source shortest path algorithm from each of these n+1 vertices. You can also set the length of all the paths ending at 'A' to some constant larger than any other path, which allows the trip to end anywhere (this may be needed only for TSP algorithms that find a round-trip path).
As a result, you get a complete graph, which is metric, but still asymmetric. All you need now is to solve a normal TSP problem on this graph. If you want to convert this asymmetric graph to a symmetric one, Wikipedia explains how to do it.
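The shortest-path preprocessing step could look roughly like this. It is a sketch using Floyd-Warshall on the whole trail graph and then keeping only the pairs of interest; the function name and the (u, v, length) arc format are assumptions, and trail lengths are assumed non-negative.

```python
import math

def shortest_path_closure(nodes, arcs, points):
    """Return {(p, q): shortest distance from p to q} for distinct p, q in `points`.

    `arcs` is an iterable of directed (u, v, length) triples (assumed format).
    """
    nodes = list(nodes)
    dist = {u: {v: (0 if u == v else math.inf) for v in nodes} for u in nodes}
    for u, v, length in arcs:
        dist[u][v] = min(dist[u][v], length)  # keep the shortest parallel trail

    # Floyd-Warshall: allow each node k in turn as an intermediate stop.
    for k in nodes:
        for i in nodes:
            d_ik = dist[i][k]
            if d_ik == math.inf:
                continue
            for j in nodes:
                if d_ik + dist[k][j] < dist[i][j]:
                    dist[i][j] = d_ik + dist[k][j]

    return {(p, q): dist[p][q] for p in points for q in points if p != q}
```

The resulting distances satisfy the triangle inequality by construction, but (p, q) and (q, p) can still differ, so the instance stays asymmetric.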

Is this minimum spanning tree algorithm correct?

The minimum spanning tree problem is to take a connected weighted graph and find the subset of its edges with the lowest total weight while keeping the graph connected (and as a consequence resulting in an acyclic graph).
The algorithm I am considering is:
Find all cycles.
Remove the largest edge from each cycle.
The impetus for this version is an environment that is restricted to "rule satisfaction" without any iterative constructs. It might also be applicable to insanely parallel hardware (i.e. a system where you expect to have several times more degrees of parallelism than cycles).
Edits:
The above is done in a stateless manner (all edges that are not the largest edge in any cycle are selected/kept/ignored, all others are removed).
What happens if two cycles overlap? Which one has its longest edge removed first? Does it matter if the longest edge of each is shared between the two cycles or not?
For example:
V = { a, b, c, d }
E = { (a,b,1), (b,c,2), (c,a,4), (b,d,9), (d,a,3) }
There's an a -> b -> c -> a cycle, and an a -> b -> d -> a cycle.
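One way to try the stateless rule on this example without enumerating cycles: assuming the edges are undirected and all weights are distinct, an edge is the largest edge of some cycle exactly when its endpoints are still connected using only strictly lighter edges. The sketch below uses that test. It is only an experiment for small inputs; it uses ordinary loops, so it says nothing about the rule-based environment, and the function name is made up for the illustration.

```python
def edges_kept_by_cycle_rule(vertices, edges):
    """Keep every edge that is not the heaviest edge of any cycle.

    Assumes undirected (u, v, weight) edges with distinct weights.
    """
    kept = []
    for u, v, w in edges:
        # Adjacency restricted to edges strictly lighter than the current one.
        lighter = {x: [] for x in vertices}
        for a, b, wt in edges:
            if wt < w:
                lighter[a].append(b)
                lighter[b].append(a)
        # Depth-first search from u over the lighter edges only.
        seen = {u}
        stack = [u]
        while stack:
            x = stack.pop()
            for y in lighter[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        if v not in seen:  # no lighter detour, so (u, v) is never a cycle maximum
            kept.append((u, v, w))
    return kept
```

On the graph above this keeps (a,b,1), (b,c,2) and (d,a,3) and drops (c,a,4) and (b,d,9), regardless of the order in which the overlapping cycles are considered.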
@shrughes.blogspot.com:
I don't know about removing all but two - I've been sketching out various runs of the algorithm and assuming that parallel runs may remove an edge more than once I can't find a situation where I'm left without a spanning tree. Whether or not it's minimal I don't know.
For this to work, you'd have to detail how you would want to find all cycles, apparently without any iterative constructs, because that is a non-trivial task. I'm not sure that's possible. If you really want to find an MST algorithm that doesn't use iterative constructs, take a look at Prim's or Kruskal's algorithm and see if you could modify those to suit your needs.
Also, is recursion barred in this theoretical architecture? If so, it might actually be impossible to find an MST on a graph, because you'd have no means whatsoever of inspecting every vertex/edge on the graph.
I dunno if it works, but no matter what, your algorithm is not even worth implementing. Finding all cycles will be the freaking huge bottleneck that will kill it. Also, doing that without iterations is impossible. Why don't you implement some standard algorithm, let's say Prim's?
Your algorithm isn't quite clearly defined. If you have a complete graph, your algorithm would seem to entail, in the first step, removing all but the two minimum elements. Also, listing all the cycles in a graph can take exponential time.
Elaboration:
In a graph with n nodes and an edge between every pair of nodes, there are, if I have my math right, n!/(2k(n-k)!) cycles of size k, if you're counting a cycle as some subgraph of k nodes and k edges with each node having degree 2.
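A quick way to see how fast that count grows is to evaluate the formula directly. The sketch below just computes n!/(2k(n-k)!) for k >= 3 and sums it over all cycle sizes; the function names are made up for the illustration.

```python
from math import factorial

def cycles_of_size(n, k):
    """Number of k-vertex cycles in the complete graph on n nodes (3 <= k <= n)."""
    return factorial(n) // (2 * k * factorial(n - k))

def total_cycles(n):
    """Total number of cycles in the complete graph on n nodes."""
    return sum(cycles_of_size(n, k) for k in range(3, n + 1))
```

For n = 4 this gives 4 triangles plus 3 four-cycles, 7 cycles in total, and for n = 5 it already gives 37.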
@Tynan The system can be described (somewhat oversimplified) as a system of rules describing categorizations. "Things are in category A if they are in B but not in C", "Nodes connected to nodes in Z are also in Z", "Every category in M is connected to a node N and has 'child' categories, also in M, for every node connected to N". It's slightly more complicated than this. (I have shown that by creating unstable rules you can model a Turing machine, but that's beside the point.) It can't explicitly define iteration or recursion but can operate on recursive data with rules like the 2nd and 3rd ones.
@Marcin, Assume that there are an unlimited number of processors. It is trivial to show that the program can be run in O(n^2) for n being the length of the longest cycle. With better data structures, this can be reduced to O(n*O(set lookup function)). I can envision hardware (quantum computers?) that can evaluate all cycles in constant time, giving an O(1) solution to the MST problem.
The Reverse-delete algorithm seems to provide a partial proof of correctness (that the proposed algorithm will not produce a non-minimal spanning tree). This is derived by arguing that my algorithm will remove every edge that the Reverse-delete algorithm will. However, I'm not sure how to show that my algorithm won't delete more than that algorithm.
Hhmm....
OK, this is an attempt to finish the proof of correctness. By analogy to the Reverse-delete algorithm, we know that enough edges will be removed. What remains is to show that there will not be too many edges removed.
Removing too many edges can be described as removing all the edges between the two sides of a binary partition of the graph's nodes. However, only edges in a cycle are ever removed; therefore, for all edges between the partitions to be removed, there needs to be a return path to complete the cycle. If we only consider edges between the partitions, then the algorithm can at most remove the larger of each pair of edges, so it can never remove the smallest bridging edge. Therefore, for any arbitrary binary partitioning, the algorithm can't sever all links between the sides.
What remains is to show that this extends to >2 way partitions.
