I am trying to solve a variant of set cover.
For example, let U = { (1,2,3), (4,3), (5,3), (1,2), (4) }.
I want to find the minimum number of sets that cover all the elements in U, with the additional constraint that the chosen sets must be pairwise disjoint (no element appears in more than one of them).
In this case, a subset {(5,3), (1,2), (4)} is a solution,
but the subset {(1,2,3), (4,3), (5,3)} is not a solution.
Has anyone already studied this before? What is it called in the literature? Any algorithm suggested?
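To make the question concrete, here is a minimal brute-force sketch of the problem as stated. As far as I know, covering every element exactly once is what is usually called the exact cover problem, so that is a useful search term. The function name `min_disjoint_cover` is made up for this illustration, and the search is exponential, so it is only meant for tiny inputs.

```python
from itertools import combinations

def min_disjoint_cover(sets):
    """Return a smallest collection of pairwise disjoint input sets whose
    union equals the union of all input sets, or None if none exists."""
    sets = [frozenset(s) for s in sets]
    universe = frozenset().union(*sets)
    for k in range(1, len(sets) + 1):
        for combo in combinations(sets, k):
            # If the sizes add up to |universe| and the union is the universe,
            # then no element can have been covered twice.
            total = sum(len(s) for s in combo)
            if total == len(universe) and frozenset().union(*combo) == universe:
                return [set(s) for s in combo]
    return None

U = [(1, 2, 3), (4, 3), (5, 3), (1, 2), (4,)]
solution = min_disjoint_cover(U)
```

On the example from the question this finds the three sets (5,3), (1,2) and (4), and it reports no solution when only (1,2,3), (4,3) and (5,3) are available.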
I am trying to write an algorithm that computes the edge clique cover number (the smallest number of cliques that cover all edges) of an input graph (undirected and no self-loops). My idea would be to
1. Calculate all maximal cliques with the Bron-Kerbosch algorithm, and
2. Try whether any 1, 2, 3, ... of them cover all edges, until I find the minimum number.
Would that work and does anyone know a better method; is there a standard algorithm? To my surprise, I couldn't find any such algorithm. I know that the problem is NP-hard, so I don't expect a fast solution.
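The two-step idea above can be sketched as follows. This is only an illustration under my own assumptions: the graph is given as a list of edges, Bron-Kerbosch is written without pivoting, and the names `maximal_cliques` and `edge_clique_cover_number` are invented here. Restricting the search to maximal cliques is safe because any clique in a cover can be grown to a maximal one without uncovering an edge.

```python
from itertools import combinations

def maximal_cliques(adj):
    """Enumerate all maximal cliques with basic Bron-Kerbosch (no pivoting)."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques

def edge_clique_cover_number(edges):
    """Smallest number of cliques covering every edge (exponential search)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    all_edges = {frozenset(e) for e in edges}
    if not all_edges:
        return 0
    cliques = maximal_cliques(adj)
    # For each maximal clique, the set of edges it covers.
    covered = [{frozenset(pair) for pair in combinations(c, 2)} for c in cliques]
    for k in range(1, len(cliques) + 1):
        for combo in combinations(covered, k):
            if set().union(*combo) >= all_edges:
                return k

triangle = [(1, 2), (2, 3), (1, 3)]
four_cycle = [(1, 2), (2, 3), (3, 4), (4, 1)]
```

A triangle needs one clique, while a 4-cycle needs four, because its only cliques with edges are the edges themselves.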
I would gather maximal cliques as you do now (or perhaps using a different algorithm, as suggested by CaptainTrunky), but then use branch and bound. This won't guarantee a speedup, but will often produce a large speedup on "easy" instances.
In particular:
Instead of trying all subsets of maximal cliques in increasing subset size order, pick an edge uv and branch on it. This means:
For each maximal clique C containing uv:
Make a tentative new partial solution that contains all cliques in the current solution
Add C to this partial solution
Make a new subproblem containing the current subproblem's graph, but with all vertices in C collapsed into a single vertex
Recurse to solve this smaller subproblem.
Keep track of the best complete solution so far. This is your upper bound (UB). You do not need to continue processing any subproblem that has already reached this upper bound but still has edges present; a better solution already exists!
It's best to pick an edge to branch on that is covered by as few cliques as possible. When choosing in what order to try those cliques, try whichever you think is likely to be the best (probably the largest one) first.
And here is an idea for a lower bound to improve the pruning level:
If a subgraph G' contains an independent set of size s, then you will need at least s cliques to cover G' (since no clique can cover two or more vertices in an independent set). Computing the largest possible IS is NP-hard and thus impractical here, but you could get a cheap bound by using the 2-approximation for Vertex Cover: Just keep choosing an edge and throwing out both vertices until no edges are left; if you threw out k edges, then what remains is an IS that is within k of optimal.
You can add the size of this IS to the total number of cliques in your solution so far; if that is larger than the current UB, you can abort this subproblem, since we know that fleshing it out further cannot produce a better solution than one we have already seen.
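A rough sketch of this lower bound, assuming the subproblem is given as a list of its remaining edges. The function name `clique_cover_lower_bound` is hypothetical. One assumption of mine beyond the text: only vertices that still have an incident edge are counted, since an isolated vertex needs no clique in an edge cover.

```python
def clique_cover_lower_bound(edges):
    """Lower bound on the number of cliques needed to cover `edges`.

    Greedily pick an edge whose endpoints are both still present and throw
    both endpoints out. The surviving vertices are pairwise non-adjacent,
    and each still needs its own clique because each has an incident edge.
    """
    remaining = {v for e in edges for v in e}  # vertices with an incident edge
    for u, v in edges:
        if u in remaining and v in remaining:
            remaining -= {u, v}
    return len(remaining)

star = [(0, 1), (0, 2), (0, 3), (0, 4)]
bound = clique_cover_lower_bound(star)
```

In the branch and bound, a subproblem with `k` cliques chosen so far could then be abandoned whenever `k + clique_cover_lower_bound(remaining_edges)` is no smaller than the current UB. The bound depends on the order in which edges are scanned, so it is cheap but not tight: for the star above it gives 3, while the true answer is 4.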
I was working on a similar problem 2 years ago, and I have never seen any standard existing approach to it. I did the following:
1. Compute all maximal cliques. MACE was way better than Bron-Kerbosch in my case.
2. Build a constraint-satisfaction problem for determining the minimum number of cliques required to cover the graph. You could use SAT, Minizinc, or MIP tools to do so. Which one to pick? It depends on your skills, time resources, environment and dozens of other parameters. If we are talking about a proof of concept, I would stick with Minizinc.
Some details on the second part. Define a Boolean variable for each edge: if its value is True, the edge is covered; otherwise it is not. Add constraints so that an edge can only be covered through a selected clique that contains it. Then add a Boolean variable for each clique: if it is True, the clique is used; otherwise it is not. Finally, require that all edges are covered and minimize the number of used cliques.
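Written as an integer program, the model could look roughly like this. It is a sketch, not the exact model I used: the symbols are notation introduced here, with E the edge set, \mathcal{K} the list of maximal cliques, and x_C the Boolean variable of clique C. The per-edge variables are folded into the covering constraints, since every edge must end up covered anyway.

```latex
\begin{aligned}
\text{minimize}\quad & \sum_{C \in \mathcal{K}} x_C \\
\text{subject to}\quad & \sum_{C \in \mathcal{K} \,:\, e \subseteq C} x_C \ge 1 && \text{for every edge } e \in E, \\
& x_C \in \{0, 1\} && \text{for every } C \in \mathcal{K}.
\end{aligned}
```

Each constraint says that at least one selected clique contains both endpoints of the edge e.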
I have a set S={a,c,d,e,f,j,m,q,s,t} with a constraint C={am,cm,de,df,dm,ds,ef,em,eq,es,et,fj,fm,fs,jm,js}. xy in C means that x and y cannot be in the same subset. I would like an algorithm to split set S into subsets Sj such that:
1. The number of subsets Sj is minimized.
2. The differences between the subset sizes are as large as possible.
For example in this case, both {{q,a,c,d,j,t},{m,s},{f},{e}} and {{a,c,e,j},{m,s,q,t},{d},{f}} are satisfying 1, but the first is optimal.
Coming from a computer science background, I wonder whether Mathematicians have devised an algorithm for this problem.
As I understand it, your task can be rewritten as: find the largest independent set of vertices S' in the graph G=(S, C), then repeat the step for the graph G'=G\S'.
It is well known (as @tobias_k also pointed out in his comment) that finding a largest independent set of a graph is NP-hard (it is equivalent to the famous clique problem on the complement graph).
I think this is a very hard problem, and here is why: to find the minimum number of subsets, you must determine the chromatic number of the graph. Exact solutions to that problem generally rely on exhaustive search.
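For an instance as small as the one in the question, an exhaustive search is perfectly feasible. Below is a minimal backtracking sketch that only handles requirement 1 (the minimum number of subsets) and ignores requirement 2. The function name `min_conflict_free_partition` is invented for this example, and conflicts are assumed to be given as two-character strings as in the question.

```python
def min_conflict_free_partition(items, conflicts):
    """Split items into as few groups as possible so that no group contains
    a conflicting pair (this is graph coloring on the conflict graph)."""
    items = sorted(items)
    bad = {x: set() for x in items}
    for x, y in conflicts:
        bad[x].add(y)
        bad[y].add(x)

    def place(i, groups, k):
        if i == len(items):
            return [list(g) for g in groups]
        x = items[i]
        for g in groups:
            if not bad[x] & set(g):  # x conflicts with nothing in g
                g.append(x)
                result = place(i + 1, groups, k)
                if result is not None:
                    return result
                g.pop()
        if len(groups) < k:  # open a new group if the budget allows
            groups.append([x])
            result = place(i + 1, groups, k)
            if result is not None:
                return result
            groups.pop()
        return None

    # Try 1, 2, 3, ... groups until a valid partition exists.
    for k in range(1, len(items) + 1):
        result = place(0, [], k)
        if result is not None:
            return result
    return []

S = "acdefjmqst"
C = "am cm de df dm ds ef em eq es et fj fm fs jm js".split()
parts = min_conflict_free_partition(S, C)
```

For this input the search needs four groups, which matches the examples in the question: d, e, f and m conflict pairwise, so fewer than four is impossible.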
I have a list of GPS points...but what am I asking for could also work for any X,Y coordinates.
In my use-case, I want to assign points in sets. Each point can belong in only one set and each set has a condition that distance between any of two points in the set is not greater than some constant...that means, all points of the set fit a circle of a specific diameter.
For a list of points, I want to find best (or at least some) arrangement in which there is minimal number of sets.
There will be sets with just a single point because other points around are already in different sets or simply because there are no points around (distance between them is greater than in the condition of the set)...what I want to avoid is inefficient set assignment where e.g. instead of finding ideal 2 sets, each having 30 points, I find 5 sets, one with 1 point, second with 40 points, etc...
All I am capable of is a brute-force solution: compute all distances, build all possible set arrangements, sort them by number of sets, and pick one with the fewest sets.
Is there a better approach?
The Problem here is NP-complete. What you try to solve is the max-clique problem combined with a set cover problem.
Your problem can be represented as a graph G=(V,E), where the vertices are your coordinates and the edges carry the pairwise distances. This graph can be built in O(n^2) time. Then you filter out all edges with a distance greater than your constant, giving the graph G'.
With the remaining graph G' you want to find all cliques (effectively solving max-clique). A clique is a set of vertices in which every pair is connected by an edge. Name this list of cliques S.
Now finding a minimal set of elements of S that cover all vertices V is the set cover problem.
Both the set cover problem and the max-clique problem are NP-hard, so no polynomial-time algorithm for an optimal solution is known, and exact methods take exponential time in the worst case. You could look at approximation algorithms for these two problems.
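If an approximate answer is good enough, a simple greedy heuristic avoids enumerating cliques altogether. The sketch below is not the clique plus set cover pipeline described above. It just puts each point into the first existing group where it is within the allowed distance of every member, and it gives no optimality guarantee. It assumes planar X,Y coordinates; for GPS points you would swap `math.dist` for a great-circle distance. The function name `greedy_point_groups` is made up.

```python
from math import dist

def greedy_point_groups(points, max_dist):
    """Greedily assign each point to the first group in which it is within
    max_dist of every member; open a new group if no such group exists."""
    groups = []
    for p in points:
        for g in groups:
            if all(dist(p, q) <= max_dist for q in g):
                g.append(p)
                break
        else:
            groups.append([p])
    return groups

points = [(0, 0), (1, 0), (10, 0), (10, 1)]
groups = greedy_point_groups(points, max_dist=2)
```

The result depends on the order of the input points, so trying a few orderings and keeping the arrangement with the fewest groups is a cheap improvement.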
I am trying to solve set cover problem in a way that vertex cover is solved
Input: we have a base set X and collection C of subsets of X, so that each element in C is a subset of X
Output: the size of the smallest sub-collection F of C such that the union of all sets in F is X
I know how to solve this, but I am looking for a heuristic that lets me stop descending the search tree earlier. For example, right now I remove each element from C in turn, make a recursive call, and check for a stopping point like this: if (bestsofar <= F.length+1) stop
But I know there should be a better heuristic. For example, in vertex cover I can check: if K+1 > best, stop, where K is the number of vertices added to the result so far. A better check is: if K + (number of edges)/maxdeg >= best, stop, which prunes much more.
I want the same thing for set-cover .
does anyone have any idea?
From a theoretical perspective, what your heuristic for vertex cover is doing is constructing a feasible solution to the dual of the relaxed linear program for vertex cover. The same can be done for set cover. If for whatever reason you don't want to use the simplex method to find the optimal dual solution, then there are a variety of approximations available. You could use K plus the number of items divided by maximum number of items in a set, which generalizes your heuristic for vertex cover. You also could use a greedy algorithm to find a packing, by which I mean the following. For vertex cover, this would be a set of edges with no endpoints in common (i.e., a matching). Every cover contains at least one endpoint of each of the edges in the packing. For set cover, this would be a collection of items such that no set contains more than one item of the collection.
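Both simple bounds can be sketched in a few lines. The names `ratio_bound` and `packing_bound` are made up for this illustration, `uncovered` is the set of items not yet covered at the current search node, and every uncovered item is assumed to belong to at least one set.

```python
from math import ceil

def ratio_bound(uncovered, sets):
    """Items still uncovered divided by the most that one set can cover."""
    if not uncovered:
        return 0
    largest = max(len(uncovered & s) for s in sets)
    return ceil(len(uncovered) / largest)

def packing_bound(uncovered, sets):
    """Greedy packing: pick items so that no set contains two picked items.
    Each picked item needs its own set, so the count is a lower bound."""
    packed = 0
    blocked = set()
    for item in sorted(uncovered):
        if item not in blocked:
            packed += 1
            for s in sets:
                if item in s:
                    blocked |= s  # everything sharing a set with item
    return packed

X = {1, 2, 3, 4}
C = [{1, 2}, {3}, {4}]
```

With K sets chosen so far, the search can stop at a node as soon as K plus either bound is at least the best solution found. On the tiny instance above the ratio bound gives 2 while the packing bound gives 3, which is the true optimum, so taking the maximum of the two is reasonable.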
I have a problem, described below. Do you have a good solution, or is this just another form of a "classic" or already solved problem?
The problem is:
There are some groups of numbers, e.g.
A(1 8 9)
B(1 4 5)
C(2 4 6)
D(3 4 7)
E(2 10 11)
F(3 12 13)
There are six groups, A-F, and the numbers are "1,2,3,4,5,6,7,8,9,10,11,12,13".
Now find a smallest set of numbers such that every group has at least one of its numbers in the set. For example, we can find the set "1 4 2 13 12", where A has "1", B has "1,4", C has "2,4", D has "4", E has "2", and F has "12,13".
But the set "1 2 4" is not acceptable, because F does not have any number in the set.
The best set is "1,2,3": every group has a number in the set, and the size of the set is optimal, with only three numbers. THIS is what we want. If there are many best sets, finding any one is OK. Thanks.
This is equivalent to the set cover problem. In this case, each of your sets A, B, ..., F are the elements of the set cover problem, and each of the numbers 1, 2, ..., 13 are the sets. For example, in this mapping 1 becomes {A, B}, and 11 becomes the set {E}.
Set cover is NP-hard. The integer linear programming formulation on the linked Wikipedia page is probably as good as you'll get for exact solutions; the greedy algorithm there is a decent approximation for large problems.
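For the six groups in the question, both approaches are easy to try. This sketch uses a plain exhaustive search in place of an ILP solver, which is only viable for small inputs. The function names `greedy_hitting_set` and `exact_hitting_set` are invented for the example, and every group is assumed to be non-empty.

```python
from itertools import combinations

groups = {
    "A": {1, 8, 9}, "B": {1, 4, 5}, "C": {2, 4, 6},
    "D": {3, 4, 7}, "E": {2, 10, 11}, "F": {3, 12, 13},
}

def greedy_hitting_set(groups):
    """Repeatedly pick the number that appears in the most unhit groups."""
    numbers = sorted(set().union(*groups.values()))
    remaining = dict(groups)
    chosen = []
    while remaining:
        best = max(numbers, key=lambda n: sum(n in g for g in remaining.values()))
        chosen.append(best)
        remaining = {k: g for k, g in remaining.items() if best not in g}
    return chosen

def exact_hitting_set(groups):
    """Try all number subsets in increasing size; return the first that hits
    every group."""
    numbers = sorted(set().union(*groups.values()))
    for k in range(len(numbers) + 1):
        for combo in combinations(numbers, k):
            if all(g & set(combo) for g in groups.values()):
                return list(combo)

greedy = greedy_hitting_set(groups)
exact = exact_hitting_set(groups)
```

Here the greedy choice starts with 4, because it appears in three groups, and ends up with four numbers, while the exhaustive search finds the optimal three. That gap is the usual price of the greedy approximation.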
This problem is NP-hard via a reduction from the NP-hard vertex cover problem (given a graph, can you find a set of k nodes such that every edge in the graph has at least one endpoint among the chosen nodes?)
The reduction is as follows. Number all of the nodes in the graph 1, 2, 3, ..., n in any order that you'd like. Then, for each edge in the graph, construct the set containing just two numbers - the edge's endpoints. If there is a k-node vertex cover in the original graph, then there is a set of k numbers you can pick (namely, the nodes in the vertex cover) such that you have one number chosen from each set. This can be computed in polynomial-time.
To see why the reduction works, note that if there is a set of size k you can pick such that each set in the construction has at least one element picked, then the vertices corresponding to those numbers form a k-element vertex cover in the original graph.
This reduction can be done in polynomial-time, so we have a polynomial-time reduction from the NP-hard vertex-cover problem to your problem. Thus this problem is NP-hard. So, unless P = NP, there is no polynomial-time algorithm for this problem.
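The reduction itself is tiny, which a short sketch makes visible. The function names are made up for this illustration, and the brute-force solver is only there to check small cases.

```python
from itertools import combinations

def vertex_cover_to_groups(edges):
    """The reduction: every edge becomes a two-element group of its endpoints."""
    return [frozenset(e) for e in edges]

def min_hitting_set_size(groups):
    """Brute force: smallest number of picks that hits every group."""
    universe = sorted(set().union(*groups))
    for k in range(len(universe) + 1):
        for combo in combinations(universe, k):
            chosen = set(combo)
            if all(g & chosen for g in groups):
                return k

triangle = [(1, 2), (2, 3), (1, 3)]
size = min_hitting_set_size(vertex_cover_to_groups(triangle))
```

For the triangle the answer is 2, which is also the size of its smallest vertex cover, as the argument above predicts.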
Hope this helps!