I have a variation of the NP-hard minimum clique cover problem. My problem reduces to that problem, so is also NP-hard.
We have n sets of vertices. Each set has between 1 and k vertices. Let's assume that n is at most 210 and k is at most 10.
There are edges between some but not all pairs of vertices that belong to different sets, and no edges between vertices in the same set. I don't yet know how likely any given pair of vertices in different sets is to have an edge between them, but we can assume many but not all of these potential edges exist.
We need to find the minimum collection of cliques such that each of our n sets of vertices has a vertex in at least one clique. Assuming this is effectively impossible given the NP-hard nature of the problem and the size of n and k, can we find a reasonable approximation?
For clarity, I expect that exactly one vertex from each set will be in the solution, and that we'll have either the minimum number of cliques that covers that set, or some approximation of it.
My best idea so far is to pick n vertices that form a relatively dense graph, and then apply existing techniques for minimum clique covers/approximations to that. I could pick the vertices by greedily choosing the highest degree in succession, where for latter picks I exclude edges to vertices that have been eliminated because some other vertex in the same set has been chosen.
Related
I am trying to write an algorithm that computes the edge clique cover number (the smallest number of cliques that cover all edges) of an input graph (undirected and no self-loops). My idea would be to
Calculate all maximal cliques with the Bron-Kerbosch algorithm, and
Try if any 1,2,3,... of them would cover all edges until I find the
minimum number
Would that work and does anyone know a better method; is there a standard algorithm? To my surprise, I couldn't find any such algorithm. I know that the problem is NP-hard, so I don't expect a fast solution.
I would gather maximal cliques as you do now (or perhaps using a different algorithm, as suggested by CaptainTrunky), but then use branch and bound. This won't guarantee a speedup, but will often produce a large speedup on "easy" instances.
In particular:
Instead of trying all subsets of maximal cliques in increasing subset size order, pick an edge uv and branch on it. This means:
For each maximal clique C containing uv:
Make a tentative new partial solution that contains all cliques in the current solution
Add C to this partial solution
Make a new subproblem containing the current subproblem's graph, but with all vertices in C collapsed into a single vertex
Recurse to solve this smaller subproblem.
Keep track of the best complete solution so far. This is your upper bound (UB). You do not need to continue processing any subproblem that has already reached this upper bound but still has edges present; a better solution already exists!
It's best to pick an edge to branch on that is covered by as few cliques as possible. When choosing in what order to try those cliques, try whichever you think is likely to be the best (probably the largest one) first.
And here is an idea for a lower bound to improve the pruning level:
If a subgraph G' contains an independent set of size s, then you will need at least s cliques to cover G' (since no clique can cover two or more vertices in an independent set). Computing the largest possible IS is NP-hard and thus impractical here, but you could get a cheap bound by using the 2-approximation for Vertex Cover: Just keep choosing an edge and throwing out both vertices until no edges are left; if you threw out k edges, then what remains is an IS that is within k of optimal.
You can add the size of this IS to the total number of cliques in your solution so far; if that is larger than the current UB, you can abort this subproblem, since we know that fleshing it out further cannot produce a better solution than one we have already seen.
I was working on the similar problem 2 years ago and I've never seen any standard existing approaches to it. I did the following:
Compute all maximal cliques.
MACE was way better than
Bron-Kerbosch in my case.
Build a constraint-satisfaction problem for determining a minimum number of cliques required to cover the graph. You could use SAT, Minizinc, MIP tools to do so. Which one to pick? It depends on your skills, time resources, environment and dozens of other parameters. If we are talking about proof-of-concept, I would stick with Minizinc.
A bit details for the second part. Define a set of Boolean variables with respect to each edge, if it's value == True, then it's covered, otherwise, it's not. Add constraints that allow you covering sets of edges only with respect to each clique. Finally, add variables corresponding to each clique, if it's == True, then it's used already, otherwise, it's not. Finally, require all edges to be covered AND a number of used cliques is minimal.
In Singapore, this year's (2016) NOI (National Olympiad in Informatics) included the following problem "ROCKCLIMBING" (I was unable to solve it during the contest.) :
Abridged Problem Statement
Given a DAG with N <= 500 vertices, find the maximum number of vertices in a subset of the original vertices such that there is no path from 1 vertex in the set to another vertex in the same set, directly or indirectly.
Solution
The solution was to use transitive closure algorithm, and then to form a bipartite graph by duplicating each vertex i to form i' such that if vertex j can be reached from vertex i directly or indirectly in the original graph, then there is a directed edge from i to j' in the new graph.
However, during the solution presentation, the presenters did not explain how or why N - MCBM (MCBM being the Maximum Cardinality Bipartite Matching) of the new bipartite graph is also the maximum size of the set of vertices that cannot reach each other directly or indirectly in the original DAG.
I looked up other problems related to DAGs and bipartite graphs, such as the Minimum Path Cover problem on DAGs, but I could not find anything that explains this.
Does anyone know a way in which to prove this equality?
The problem statement can be found here: ROCKCLIMBING
Thank you in advance.
There are two things going on here:
A set is independent if and only if its complement is a vertex cover (see wikipedia). This means that the size of a max independent set is equal to the size of a minimum vertex cover.
Konig's theorem proves that
In any bipartite graph, the number of edges in a maximum matching equals the number of vertices in a minimum vertex cover.
Therefore to find the size of the max independent set we first compute the size MCBM of the max matching, and then compute its complement which equals N-MCBM.
An alternative viewpoint is as follows:
If we use A<B to mean we can climb from A to B, we have defined a partially ordered set
There is a result called Dilworth's theorem that says the maximum number of incomparable elements is equal to the minimum number of chains
The proof shows how to construct the minimum number of chains by constructing a maximum matching in your bipartite graph.
I just started reading graph theory and was reading about graph coloring. This problem popped in my mind:
We have to color our undirected graph(not completely) with only 1 color so that number of colored nodes are maximized. We need to find this maximum number. I was able to formulate an approach for non cyclic graphs :
My approach : First we divide graph into isolated components and do this for each component. We make a dfs tree and make 2 dp arrays while traversing it so that root comes last :
dp[0][u]=sum(dp[1][visited children])
dp[1][u]=sum(dp[0][visited children])
ans=max(dp[1][root],dp[0][root])
dp[0][i] , dp[1][i] are initialized to 0,1 respectively.
Here 0 signifies uncolored and 1 signifies colored.
But this does not work for cyclic graphs as I have assumed that no visited children are connected.
Can someone guide me in the right direction on how to solve this problem for cyclic graphs(not by brute force)? Is it possible to modify my approach or do we need to come up with a different approach? Would a greedy approach like coloring a nodes with least edges work?
This problem is NP-Hard as well, and is known as maximum independent set problem.
A set S<=V is said to be Independent Set in a graph if for each two vertices u,v in S, there is no edge (u,v).
The maximum size of S (which is the number you are seeking) is called the independence number of the graph, and unfortunately finding it is NP-Hard.
So, unless P=NP, your algorithm fails for general purposes graphs.
Proving it is fairly simple, given a graph G=(V,E), create the complementary graph G'=(V,E') where (u,v) is in E' if and only if (u,v) is NOT in E.
Now, given a graph G, there is a clique of size k if and only if there is an independent set of size k in G', using the same vertices (since if (u,v) are two vertices the independent set, there is no edge (u,v) in E', and by definition there is an edge in E. Repeat for all vertices in the independent set, and you got a clique in G).
Since clique problem is NP-Hard, this makes this one such as well.
I have a list of GPS points...but what am I asking for could also work for any X,Y coordinates.
In my use-case, I want to assign points in sets. Each point can belong in only one set and each set has a condition that distance between any of two points in the set is not greater than some constant...that means, all points of the set fit a circle of a specific diameter.
For a list of points, I want to find best (or at least some) arrangement in which there is minimal number of sets.
There will be sets with just a single point because other points around are already in different sets or simply because there are no points around (distance between them is greater than in the condition of the set)...what I want to avoid is inefficient set assignment where e.g. instead of finding ideal 2 sets, each having 30 points, I find 5 sets, one with 1 point, second with 40 points, etc...
All I'am capable of is a brute-force solution, compute all distances, build all posible set arrangements, sort them by number of sets and pick one with the least number of sets.
Is there a better approach?
The Problem here is NP-complete. What you try to solve is the max-clique problem combined with a set cover problem.
Your problem can be represented as a Graph G=(V,E), where the vertices are your coordinates and the edges the connections in distances. This graph can be made in O(n^2) time. Then you filter out all edges with a distance greater then your constant giving the graph G'.
With the the remaining graph G' you want to find all cliques (effectively solving max-clique). A clique is a fully connected set of vertices. Name this list of cliques S.
Now finding a minimal set of elements of S that cover all vertices V is the set cover problem.
Both the set cover problem and the max clique are NP complete. And therefore finding an optimal solution would take exponential time. You could look at approximation algorithms for these two problems.
For a given undirected graph with N nodes and its adjacency matrix, suppose that there is at least one set of n nodes in which every member node is a one-hop neighbor of the others in the set. (n << N)
In this situation, what is an algorithm to find such a set of nodes?
For example, this is a sample adjacency matrix (link: GitHub Gist) with 42 nodes. And it is known that there is a set in which 3 nodes are one-hop neighbors of each other. Then, what is the composition of the set?
My final goal is to do this procedure with about 450 nodes to find a set containing 9 one-hop neighbor nodes. Thus, I am looking for a solution that is scalable and efficient.
A set of nodes in a graph that are neighbours of each other is called a clique, and finding the largest clique in a graph, or indeed whether any clique of a specified size even exists in the graph, is a classic NP-hard problem.
Some NP-hard problems, called fixed-parameter tractable (FPT) problems, can be solved efficiently even for large n, provided that some problem parameter (besides the size) is small: here the obvious parameter is the size of the largest clique. Unfortunately Maximum Clique is not FPT with respect to this parameter, and the best known algorithms are exponential in n. Several are mentioned on the Wikipedia page. Of course, since you already know a clique of the given size exists, you might get lucky and find it quickly with a heuristic.
Note the difference between maximum (largest possible) and maximal (can't be grown) cliques.