Minimum number of non-intersecting simple cycles in unweighted directed graph - algorithm

I decided to try implementing some assignment problem algorithms. I have already done a few, but I got stuck on the problem described below:
To put it simply, given an unweighted directed graph, I need to cover all of its vertices with the minimum number of non-intersecting simple cycles.
But I don't understand how to do it. Does anyone have any ideas? I would be especially glad to see an explanation.

This problem is NP-hard via a reduction from the Hamiltonian cycle problem. More specifically, if a graph has a Hamiltonian cycle, then you can cover all the vertices with a single simple cycle, namely the Hamiltonian cycle, and otherwise the graph requires multiple cycles to cover its nodes (if it can even be done at all).
As a result, unless P = NP, there are no polynomial-time algorithms for this problem. You can still solve it using either heuristic searches or brute force, but those approaches won’t necessarily be fast on all inputs.
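For very small graphs, the brute-force option mentioned above can be sketched as follows. This is only an illustration under some assumptions that are not in the question: vertices are numbered 0..n-1, edges are given as (u, v) pairs, and there are no self-loops. It relies on the fact that a cover by vertex-disjoint cycles is the same thing as a permutation that sends every vertex to one of its out-neighbours. The names `count_cycles` and `min_cycle_cover` are made up for this sketch.

```python
from itertools import permutations


def count_cycles(succ):
    """Count the cycles of a permutation given as a tuple: v -> succ[v]."""
    seen = [False] * len(succ)
    cycles = 0
    for start in range(len(succ)):
        if not seen[start]:
            cycles += 1
            v = start
            while not seen[v]:
                seen[v] = True
                v = succ[v]
    return cycles


def min_cycle_cover(n, edges):
    """Return (number_of_cycles, successor_tuple), or None if no cover exists.

    Tries all n! successor assignments, so it is only usable for tiny n.
    """
    edge_set = set(edges)
    best = None
    for succ in permutations(range(n)):
        # Every vertex must leave along a real edge; because succ is a
        # permutation, every vertex is also entered exactly once.
        if all((v, succ[v]) in edge_set for v in range(n)):
            k = count_cycles(succ)
            if best is None or k < best[0]:
                best = (k, succ)
    return best
```

The running time is factorial in the number of vertices, which is consistent with the hardness result: this is a correctness baseline for testing heuristics, not a practical solver.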

Related

Is there an efficient algorithm for finding or approximating the shortest walk of a graph which must visit some subset of vertices of the graph?

The title is a mouthful, but put simply, I have a large, undirected, incomplete graph, and I need to visit some subset of vertices in the (approximately) shortest time possible. Note that this isn't TSP, as I don't need to visit all vertices.
The naive approach would be to simply brute-force the solution by trying every possible walk which includes the required vertices, using A*, for example, to calculate the walks between required vertices. However, this is O(n!) where n is the number of required vertices. This is unfeasible for me, as n > 40 in my average case, and n ≈ 80 in my worst case.
Is there a more efficient algorithm for this, perhaps one that approximates the solution?
This question is similar to the question here, but differs in the fact that my graph is larger than the one in the linked question. There are several other similar questions, but none that I've seen exactly solve my specific problem.
If you allow visiting the same nodes several times, find the shortest path between each pair of mandatory vertices. Then you solve the TSP between the mandatory vertices, using the above shortest path costs. If you disallow multiple visits, the problem is much worse.
I am afraid you cannot escape the TSP.
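A rough sketch of that two-step idea, for the case where revisiting nodes is allowed: run Dijkstra from each mandatory vertex, then order the mandatory vertices with a simple nearest-neighbour heuristic. The heuristic is my own choice for illustration and gives no optimality guarantee. The graph format (`{u: {v: weight}}`), the function names, and the assumption that the graph is connected are also mine, not from the question.

```python
import heapq


def dijkstra(graph, source):
    """Shortest distances from source in a weighted graph {u: {v: w}}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def nearest_neighbour_order(graph, required, start):
    """Visit all required vertices greedily; returns (order, total_cost).

    Assumes every required vertex is reachable from every other one.
    """
    dist = {r: dijkstra(graph, r) for r in required}
    order, total = [start], 0
    remaining = set(required) - {start}
    while remaining:
        here = order[-1]
        nxt = min(remaining, key=lambda r: dist[here][r])
        total += dist[here][nxt]
        order.append(nxt)
        remaining.remove(nxt)
    return order, total
```

With 40 to 80 mandatory vertices this runs quickly, and the resulting order can be used as a starting point for a stronger TSP heuristic.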

Finding fully connected components?

I'm not sure if I'm using the right term here, but for fully connected components I mean there's an (undirected) edge between every pair of vertices in a component, and no additional vertices can be included without breaking this property.
There are a number of algorithms for finding strongly connected components in a graph (for example Tarjan's algorithm). Is there an algorithm for finding such "fully connected components"?
What you are looking for is a list of all the maximal cliques of the graph. It's also called the clique problem. No known polynomial time solution exists for a generic undirected graph.
Most versions of the clique problem are hard. The clique decision problem is NP-complete (one of Karp's 21 NP-complete problems). The problem of finding the maximum clique is both fixed-parameter intractable and hard to approximate. And, listing all maximal cliques may require exponential time as there exist graphs with exponentially many maximal cliques. Therefore, much of the theory about the clique problem is devoted to identifying special types of graph that admit more efficient algorithms, or to establishing the computational difficulty of the general problem in various models of computation.
-https://en.wikipedia.org/wiki/Clique_problem
I was also looking at the same question.
https://en.wikipedia.org/wiki/Bron-Kerbosch_algorithm This turns out to be an algorithm for listing all maximal cliques; however, it's not fast. If your graph is sparse, you may want to use the vertex ordering version of the algorithm:
For sparse graphs, tighter bounds are possible. In particular the vertex-ordering version of the Bron–Kerbosch algorithm can be made to run in time O(dn3^(d/3)), where d is the degeneracy of the graph, a measure of its sparseness. There exist d-degenerate graphs for which the total number of maximal cliques is (n − d)3^(d/3), so this bound is close to tight.
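For reference, here is a minimal sketch of Bron–Kerbosch. Note that this is the pivoting variant, not the vertex-ordering (degeneracy) variant quoted above, so the sparse-graph bound does not apply to it as written. The adjacency format (`{vertex: set_of_neighbours}`) and the name `maximal_cliques` are assumptions made for the example.

```python
def maximal_cliques(adj):
    """Yield every maximal clique of an undirected graph {v: set(neighbours)}."""

    def expand(r, p, x):
        # r: current clique, p: candidates to extend it, x: already explored
        if not p and not x:
            yield set(r)
            return
        # Pivot on the vertex with the most neighbours in p to prune branches.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p.remove(v)
            x.add(v)

    yield from expand(set(), set(adj), set())
```

For example, on the graph with edges 1-2, 1-3, 2-3 and 3-4 it yields the cliques {1, 2, 3} and {3, 4}, which are exactly the "fully connected components" described in the question.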

Shortest path to connect n points

I have n points and I need to connect all of them while minimizing the total distance. The image above represents an algorithm that connects each node to its nearest one, but the final output might be really off.
I've been searching a lot. I know some pathfinding algos, but I'm unaware of one that solves exactly this case. I found a question on Math Stackexchange, but the answer does not provide any algorithm - https://math.stackexchange.com/a/581844/156584.
Is there any algorithm that solves exactly this problem? Otherwise I can bruteforce it.
Edit: Some clarification regarding the result I'm expecting: each node can be connected to 2 other nodes, creating a continuous path (like taking a pen and without ever lifting it, connect the nodes minimizing the final distance). I don't want to create a cycle (that being the travelling salesman problem).
PS: this question can also be translated to "complete graph with n vertices, and wanting to choose the set of edges such that the graph is connected, but the sum of the edge weights is minimized"
This problem is known as the shortest Hamiltonian path problem and it is NP-hard. So if the number of points is small, you can use backtracking or dynamic programming to find an optimal solution. If the number of points is large, you can use heuristics and/or approximations to obtain a relatively good answer (it is not always possible to find the best one in this case, though).
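As an illustration of the dynamic programming option for a small number of points, here is a sketch in the style of the Held–Karp bitmask DP, adapted to an open path with free endpoints. It assumes Euclidean distances and points given as coordinate tuples; the function name is made up. It uses O(2^n · n) memory and O(2^n · n^2) time, so it is only realistic up to roughly 20 points, and it returns only the length, not the path itself.

```python
from math import dist, inf


def shortest_hamiltonian_path_length(points):
    """Length of the shortest path that visits every point exactly once."""
    n = len(points)
    if n <= 1:
        return 0.0
    d = [[dist(p, q) for q in points] for p in points]
    # best[mask][j]: shortest path visiting exactly the points in mask
    # and ending at point j.
    best = [[inf] * n for _ in range(1 << n)]
    for j in range(n):
        best[1 << j][j] = 0.0
    for mask in range(1, 1 << n):
        for j in range(n):
            cur = best[mask][j]
            if cur == inf:
                continue
            for k in range(n):
                if mask & (1 << k):
                    continue  # k is already on the path
                new_mask = mask | (1 << k)
                if cur + d[j][k] < best[new_mask][k]:
                    best[new_mask][k] = cur + d[j][k]
    return min(best[(1 << n) - 1])
```

Recovering the actual order of points would need an extra parent table alongside `best`.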

Minimal cost cyclic path in a graph - A variant of TSP

For example, we have a graph consisting of vertices (cities) and edges (roads), and each edge (road) has a particular cost. Find the minimal cost to visit all cities AT LEAST ONCE. The cost is the sum of the costs of the edges traversed.
The part "AT LEAST ONCE" caught me. In a TSP we can visit a node only once, according to Wiki. Consider the graph,
A-B 11
A-C 5
B-C 2
B-E 4
C-E 3
C-D 20
D-E 100
In a TSP, the cyclic path would be A-B-E-D-C-A with cost 140, or A-C-D-E-B-A with cost 140. Whereas from my problem description we can visit each vertex AT LEAST ONCE, so we can have a cyclic path A-C-D-C-E-B-A with cost 63, which is far less than the TSP cost. This is where I had a problem. Is there a specific algorithm for this? I'm pretty sure TSP won't work well here.
Pointers or pseudo code will be very helpful.
For each pair of nodes, you can apply the shortest path algorithm and calculate the shortest distance. This will be the new cost matrix for each pair.
Now it is reduced to Travelling Salesman Problem.
Then you can apply TSP solving technique.
Given that you are allowing a vertex to be visited multiple times, this effectively turns your incomplete graph into a complete graph (all vertices connected), which is what TSP requires. Solving your problem in the general case is exactly the same as solving the metric TSP. The good news is that this is a heavily researched topic. The bad news is that you aren't able to sidestep the TSP - since your problem is identical to a form of the TSP.
As pointed out by others, you complete the graph by computing the shortest cost between each pair of vertices and adding those edges where missing. You also need to replace any existing direct edge for which you've found a lower indirect path cost so that you have a Metric TSP. You can store with the new synthetic edges their actual paths (through intermediate vertices) so you can recover those for your final answer, or you can recompute those paths as needed upon receiving the result of the TSP.
Now you can solve this as a TSP. However, solving TSP optimally is too expensive in the general case, so you'll likely want to use an approximate solution algorithm. A variety of these (e.g. Christofides algorithm, Lin–Kernighan heuristic) are available which make differing tradeoffs between guaranteed levels of optimality and performance of the algorithm.
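To make the reduction concrete, here is a small sketch run on the example graph from the question. It completes the graph with Floyd–Warshall and then solves the TSP on the completed graph by brute force, which is only viable because there are five vertices; for real inputs you would substitute one of the approximate methods mentioned above. The function names are made up for this example.

```python
from itertools import permutations


def metric_closure(nodes, edges):
    """All-pairs shortest path costs (Floyd-Warshall) on an undirected graph."""
    inf = float("inf")
    d = {u: {v: (0 if u == v else inf) for v in nodes} for u in nodes}
    for u, v, w in edges:
        d[u][v] = d[v][u] = min(d[u][v], w)
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


def brute_force_tour(nodes, d):
    """Cheapest cycle through all nodes under cost table d; tiny inputs only."""
    first, rest = nodes[0], nodes[1:]
    best_cost, best_order = float("inf"), None
    for perm in permutations(rest):
        order = (first,) + perm
        cost = sum(d[a][b] for a, b in zip(order, order[1:] + (first,)))
        if cost < best_cost:
            best_cost, best_order = cost, order
    return best_cost, best_order


nodes = ["A", "B", "C", "D", "E"]
edges = [("A", "B", 11), ("A", "C", 5), ("B", "C", 2), ("B", "E", 4),
         ("C", "E", 3), ("C", "D", 20), ("D", "E", 100)]
d = metric_closure(nodes, edges)
cost, order = brute_force_tour(nodes, d)
```

On this graph the closure replaces the direct A-B edge (11) with the cheaper route through C (7), and the best tour on the completed graph costs 59, which is below the 63 found by hand in the question. One walk achieving it is A-C-D-C-B-E-C-A.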
If you actually don't care about completing the cycle, and just want a minimum path that visits all vertices, starting and ending at any vertex, this is a somewhat different problem. For this, read my answer here: https://stackoverflow.com/a/33601043/5237297

Hamiltonian paths & social graph algorithm

I have a random undirected social graph.
I want to find a Hamiltonian path if possible. Or if not possible (or not possible to know if possible in polynomial time) a series of paths. In this "series of paths" (where all N nodes are used exactly once), I want to minimize the number of paths and maximize the average length of the paths. (So no trivial solution of N paths of a single node).
I have generated an adjacency matrix for the nodes and edges already.
Any suggestions? Pointers in the right direction? I realize this will require heuristics because of the NP-complete (?) nature of the problem, and I am OK with a "good enough" answer. Also I would like to do this in Java.
Thanks!
If I'm interpreting your question correctly, what you're asking for is still NP-hard, since the best solution to the "multiple paths" problem would be a Hamiltonian path, and determining whether one exists is known to be NP-hard. Moreover, even if you're guaranteed that a Hamiltonian path doesn't exist, solving this problem could still be NP-hard, since I could give you a graph with a single disconnected node floating in space, for which the best solution is a trivial path containing that node and a Hamiltonian path in the remaining graph. As a result, unless P = NP, there isn't going to be a polynomial-time algorithm for your problem.
Hope this helps, and sorry for the negative result!
Angluin and Valiant gave a near linear-time heuristic that works almost always in a sufficiently dense Erdős–Rényi random graph. It's described by Wilf, on page 121. Probably your random graph is not Erdős–Rényi, but the heuristic might work anyway (when it "fails", it still gives you a (hopefully) long path; greedily take this path and run A-V again).
Use a genetic algorithm (without crossover), where each individual is a permutation of the nodes. This gives you "series of paths" at each generation, evolving to a minimal number of paths (1) and a maximal avg. length (N).
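A very reduced sketch of that idea, in Python rather than Java for brevity: instead of a full population it keeps a single individual and accepts a mutated copy when it is no worse, which is a simplification of my own. A permutation is cut into paths wherever two consecutive nodes are not adjacent. Since every node is used exactly once, minimizing the number of paths is the same as maximizing the average path length. The adjacency format (`{node: set_of_neighbours}`), the segment-reversal mutation, and the function names are all assumptions.

```python
import random


def split_into_paths(perm, adj):
    """Cut a node permutation wherever consecutive nodes are not adjacent."""
    paths = [[perm[0]]]
    for prev, cur in zip(perm, perm[1:]):
        if cur in adj[prev]:
            paths[-1].append(cur)
        else:
            paths.append([cur])
    return paths


def mutation_search(adj, generations=2000, seed=0):
    """Mutation-only search for a permutation that splits into few paths.

    Assumes the graph has at least two nodes.
    """
    rng = random.Random(seed)
    best = list(adj)
    rng.shuffle(best)
    best_count = len(split_into_paths(best, adj))
    for _ in range(generations):
        if best_count == 1:
            break  # a single path covering all nodes is a Hamiltonian path
        child = best[:]
        i, j = sorted(rng.sample(range(len(child)), 2))
        child[i:j + 1] = reversed(child[i:j + 1])  # segment-reversal mutation
        count = len(split_into_paths(child, adj))
        if count <= best_count:
            best, best_count = child, count
    return split_into_paths(best, adj)
```

Whatever it returns is always a valid series of paths, so it can be stopped at any point for a "good enough" answer.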
As you have realized, there is no exact solution in polynomial time. You can try some random search methods though. My recommendation: start with a genetic algorithm, then try out tabu search.

Resources