SAT/CNF optimization - algorithm

Problem
I'm looking at a special case of the SAT optimization problem. For those not familiar with SAT and related topics, here's the related Wikipedia article.
TRUE=(a OR b OR c OR d) AND (a OR f) AND ...
There are no NOTs and it's in conjunctive normal form, so finding some satisfying assignment is easy. However, I'm trying to minimize the number of variables assigned true while keeping the whole statement true. I couldn't find a way to solve that problem.
Possible solutions
I came up with the following ways to solve it:
Convert to a directed graph and search for a minimum spanning tree that spans only a subset of the vertices. There's Edmonds' algorithm, but that gives an MST for the complete graph instead of a subset of the vertices.
Maybe there's a version of Edmonds' algorithm that solves the problem for a subset of the vertices?
Maybe there's a way to construct a graph out of the original problem that's solvable with other algorithms?
Use a SAT solver, a LIP solver or exhaustive search. I'm not interested in those solutions as I'm trying to use this problem as lecture material.
Question
Do you have any ideas/comments? Can you come up with other approaches that might work?

This problem is NP-Hard as well.
One can show an easy reduction from Hitting Set:
Hitting Set problem: Given sets S1,S2,...,Sn and a number k, choose a set S of size k such that for every Si there is an element s in S with s in Si. [Alternative definition: the intersection between each Si and S is not empty.]
Reduction:
For an instance (S1,...,Sn,k) of Hitting Set, construct the instance of your problem: (S'1 AND S'2 AND ... AND S'n, k), where S'i is the OR of all elements in Si. The elements of the sets become the variables of the formula.
proof:
Hitting Set -> This problem: If there is a solution S to the hitting set instance, then by assigning true to all of S's elements the formula is satisfied with k true variables, since for every S'i there is some variable v which is in S and Si and thus also in S'i.
This problem -> Hitting Set: build S from all elements whose assignment is true [same idea as Hitting Set -> This problem].
Since you are looking at the optimization version of this problem, it is also NP-Hard, and if you are looking for an exact solution you should try an exponential algorithm.
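As a rough illustration of what such an exponential algorithm could look like, here is a minimal sketch that tries candidate sets of true variables in order of increasing size. The function name `min_true_assignment` and the representation of a formula as a list of sets are assumptions made for this example only; the running time is exponential in the number of variables.

```python
from itertools import combinations


def min_true_assignment(clauses):
    """Return a smallest set of variables that, set to True, satisfies every clause.

    `clauses` is a list of sets of variable names, i.e. a CNF formula without NOTs.
    Returns None if some clause is empty (such a formula cannot be satisfied).
    """
    variables = sorted(set().union(*clauses)) if clauses else []
    # Try candidate sets in order of increasing size; the first hit is minimal.
    for size in range(len(variables) + 1):
        for candidate in combinations(variables, size):
            chosen = set(candidate)
            # A clause is satisfied when it shares at least one variable with `chosen`.
            if all(chosen & clause for clause in clauses):
                return chosen
    return None


# (a OR b OR c OR d) AND (a OR f) AND (b OR f)
formula = [{"a", "b", "c", "d"}, {"a", "f"}, {"b", "f"}]
print(min_true_assignment(formula))
```

The same function solves the optimization version of Hitting Set directly, since each clause is just one of the sets Si.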

Related

Number of elements required to occur at least once in each set of a set

I have a list L of lists l[i] of elements e. I am looking for an algorithm that finds a minimum set S_min of elements such that at least one member of S_min occurs in each l.
I am not only curious to find a simple algorithm that does this for me, but also to learn what problems of this sort are actually called. I am sure there is something out there.
I have implemented brute force algorithms that start with adding all those elements to S_min which occur in sets of len(l[i])=1. The rest is simple trial and error.
The problem you describe is the vertex cover problem in hypergraphs (also known as the hitting set problem), an optimization problem which is NP-hard in the general case but admits approximation algorithms for suitably bounded instances.
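One simple heuristic for this problem, sketched below, is a greedy rule: repeatedly pick the element that occurs in the most lists not yet hit. This is only an illustration; the name `greedy_hitting_set` is made up for the example, elements are assumed to be sortable (used only for deterministic tie-breaking), and the result is not guaranteed to be a minimum set.

```python
def greedy_hitting_set(lists):
    """Greedy heuristic: repeatedly pick the element that hits the most unhit lists."""
    remaining = [set(l) for l in lists]
    if any(not s for s in remaining):
        raise ValueError("an empty list can never be hit")
    chosen = set()
    while remaining:
        # Count how many still-unhit lists each element occurs in.
        counts = {}
        for s in remaining:
            for e in s:
                counts[e] = counts.get(e, 0) + 1
        # Ties are broken by element order so the result is deterministic.
        best = max(sorted(counts), key=counts.get)
        chosen.add(best)
        # Drop every list that the new element hits.
        remaining = [s for s in remaining if best not in s]
    return chosen


L = [[1, 2], [2, 3], [3, 4], [5]]
print(greedy_hitting_set(L))
```

Note that an element of a one-element list (such as 5 here) always ends up in the result, which matches the preprocessing step described in the question.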

Reducing Graph Reachability to SAT (CNF)

So I came across this problem in my textbook. I was wondering how to develop a reduction from the Graph Reachability problem to SAT (CNF) problem. (i.e. formula is satisfiable iff there exists a path in graph G from start to end node)
1) I can't wrap my head around how to go from something that can be solved in polynomial time (Graph Reachability) to something that is NP (SAT).
2) I can't seem to find a way to formulate these nodes/edges of Graph into actual clauses in CNF that correspond to reachability.
I tried to think about algorithms like Floyd-Warshall that determine if a path exists from start to end node but I can't seem to formulate that idea into actual CNF clauses. Help would be much appreciated!
It probably wouldn't be too hard to come up with the kind of answer you're expecting, but here's the real answer instead:
"Reducing" a problem X to problem Y means transforming any instance of X to an instance of Y such that the answer to Y provides the answer to X. Usually, we require a P-time reduction, i.e., the transformation of the problem and the extraction of the answer must both happen in polynomial time.
Graph Reachability is easily solved in linear time, which is certainly polynomial time, so the reduction from Graph Reachability to SAT is very simple:
Given a graph reachability problem, solve it in linear time;
If the desired path exists, write out any satisfiable SAT instance, like (A). Otherwise, write out any unsatisfiable SAT instance, like (A)&(~A).
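A minimal sketch of this two-step reduction follows. The graph is assumed to be an adjacency dictionary, and the CNF output is written as lists of integer literals (variable 1 plays the role of A, and -1 is its negation); both representations, and the function names, are choices made for this example.

```python
from collections import deque


def reachable(adj, start, end):
    """Breadth-first search; runs in time linear in the size of the graph."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == end:
            return True
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def reachability_to_cnf(adj, start, end):
    """Map a reachability instance to a CNF formula that is satisfiable iff a path exists."""
    if reachable(adj, start, end):
        return [[1]]        # (A): satisfiable
    return [[1], [-1]]      # (A) & (~A): unsatisfiable


graph = {"s": ["a"], "a": ["t"], "x": ["s"]}
print(reachability_to_cnf(graph, "s", "t"))
print(reachability_to_cnf(graph, "t", "s"))
```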
We did something similar to your task a few years ago. Our approach was based exactly on Floyd-Warshall (F.-W.) algorithm.
Intuitively, you would like to do something like this:
Generate all possible paths using F.-W. for each pair of nodes
Generate a clause representing each path. It could be described as "if a path is selected, then the following nodes must be selected"
Generate a clause that unites all paths into a single CNF. Most likely it would be "exactly_one" clause.
A bit more formally:
1. Assign a binary literal to each node in the graph. The literal has value True iff the node belongs to a path between two nodes.
2. Run F.-W. for a pair of nodes.
3. Turn the resulting path into a clause:
   nodes <- get_nodes_from_path(path)
   node_lits <- logical_and([n.literal for n in nodes])
4. Get a new literal for the path: path_lit <- get_new_literal()
5. Tie it to the path: path_clause <- if_then_else(node_lits, path_lit)
6. Go to step 2 until all pairs are enumerated.
Finally, you could do the following:
all_paths <- exactly_one(all_path_clauses)
all_paths <- True
The SAT solver would be forced to select one of the paths, and this would lead to selecting the corresponding nodes.
With respect to your first question: Since you're only devising a way to reduce a problem in P into a problem in NP (and not the other way around), this isn't actually a problem. You can turn any Graph Reachability problem into a SAT problem, but that doesn't mean you can turn any SAT problem into a Graph Reachability problem.

Is finding a subset with exact cut with other given subsets NP-hard?

I am trying to figure out whether the following problem is NP-hard:
Given G_1,..,G_n subsets of {1..m}
c_1,..,c_n non-negative integers in {0..m}
Find T subset of {1..m}
S.T. for all i=1..n, T intersects G_i in exactly c_i elements
I tried to find reductions to NP problems such as coloring, or P problems such as matching, but in both cases I could only think of an exponential conversion algorithm (i.e. one that takes into account all subsets of G_i of size c_i), which doesn't help me much :(
A side note: would any restrictions on the parameters make this problem much easier?
For instance, would m<< n make a computational difference (we would still be looking for an algorithm that is polynomial in m)?
What if we knew that some of the constants c_i were zero?
Would appreciate any insight :)
This is an NP-Complete problem, and is a generalization of the Hitting Set Problem. Proof of NP-Completeness follows.
The problem is in NP (trivial: given a solution T, it is easy to check the intersection with each Gi and verify that its size is ci).
It is also NP-Complete, assuming you are looking for a minimal such T, by reduction from the Exact Hitting Set Problem:
Where the decision problem of exact hitting set is:
Given a universe of elements U, subsets S1,...,Sk, and a number
m - is there a subset S of U of size at most m that contains
exactly one element from each Si?
Given an instance of the exact hitting set problem (S1,S2,...,Sk,d), reduce it to this problem with Gi=Si and ci=1. If the minimal solution of this problem is of size d, there is also a solution to exact hitting set of size d (and vice versa).
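The NP-membership argument rests on checking a proposed T quickly. A minimal sketch of such a check is shown here; the function name `is_exact_cut` and the list-of-sets input format are assumptions for illustration.

```python
def is_exact_cut(T, groups, counts):
    """Check that T meets each group G_i in exactly c_i elements.

    Runs in time polynomial in the input size, which is what membership in NP needs.
    """
    T = set(T)
    return all(len(T & set(G)) == c for G, c in zip(groups, counts))


groups = [{1, 2, 3}, {3, 4}, {5}]
counts = [2, 1, 0]
print(is_exact_cut({1, 2, 4}, groups, counts))
print(is_exact_cut({1, 2, 3, 4}, groups, counts))
```

With every c_i set to 1, the same check verifies a solution of the exact hitting set instance used in the reduction.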

solving the Longest-Path-Length. Is my solution correct?

This is the question [From CLRS]:
Define the optimization problem LONGEST-PATH-LENGTH as the relation that
associates each instance of an undirected graph and two vertices with the number
of edges in a longest simple path between the two vertices. Define the decision
problem LONGEST-PATH = {⟨G, u, v, k⟩ : G=(V,E) is an undirected
graph, u,v contained in V, k >= 0 is an integer, and there exists a simple path
from u to v in G consisting of at least k edges}. Show that the optimization problem
LONGEST-PATH-LENGTH can be solved in polynomial time if and only if
LONGEST-PATH is contained in P.
My solution:
Given an algorithm A that can solve G(u,v) in polytime, we run A on G(u,v). If it returns "YES" and k' such that k' is the length of the longest path in G(u,v), all we have to do is check whether
k <= k'
If so, the problem is solved. If we receive "NO" or k > k', then there exists no solution.
So it takes polytime to run A plus constant time for the comparison, and finding the longest path length takes polytime. Also, this is only possible since G(u,v) runs in polytime (in P), thus G(u,v,k) also runs in polytime (in P); therefore, since longest path can be reduced to longest-path-length, longest-path-length is in P.
We can solve it the opposite way: run G(u,v,k') for k'=0 to n, each time checking whether k==k'; if so, we have solved it.
run time analysis for this:
n*polytime + n*(constant comparison) = polytime
Can someone tell me if my answer is reasonable? If not, please tell me where I've gone wrong.
Also, can you give me some advice on how to study algorithms, and what approach I should take to solve an algorithm question (or a graph question)?
Please and thank you.
Your answer is reasonable, but I would try to shore it up a little bit formally (present the two directions separately and clearly, be more precise about what polynomial time means, that kind of thing).
The only thing that I would like to point out is that in your second reduction (showing that the decision problem solves the optimization problem) the "for k=0 to N" loop is not general. Polynomial time is measured relative to the length of the input, so in problems where N is a general number (such as a weight) instead of a count of items from the input (as in this case), you need to use a binary search to be sure.
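The binary-search idea can be sketched as follows. The decision procedure is passed in as a function `has_path_at_least(k)`, which is assumed to answer the LONGEST-PATH question for a fixed G, u, v; the brute-force `make_oracle` below is only an exponential-time stand-in so that the example runs, not a polynomial algorithm. The search relies on the answer being monotone in k: a path with at least k edges also has at least k-1 edges.

```python
def longest_path_length(has_path_at_least, upper):
    """Largest k in [0, upper] for which the decision oracle says yes, or None."""
    if not has_path_at_least(0):
        return None  # u and v are not connected at all
    lo, hi = 0, upper
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if has_path_at_least(mid):
            lo = mid          # a path with at least mid edges exists
        else:
            hi = mid - 1      # no such path, so the optimum is smaller
    return lo


def make_oracle(adj, u, v):
    """Brute-force stand-in for the decision problem (exponential time)."""
    def longest_from(node, visited):
        if node == v:
            return 0
        best = None
        for nxt in adj[node]:
            if nxt not in visited:
                sub = longest_from(nxt, visited | {nxt})
                if sub is not None and (best is None or sub + 1 > best):
                    best = sub + 1
        return best

    true_length = longest_from(u, {u})
    return lambda k: true_length is not None and true_length >= k


adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}
oracle = make_oracle(adj, 0, 1)
print(longest_path_length(oracle, len(adj) - 1))
```

The oracle is called O(log upper) times, so the overall running time stays polynomial in the input length even when the upper bound is a number written in binary.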

Optimization problem - vector mapping

A and B are sets of N-dimensional vectors (N=10), |B|>=|A| (|A|=10^2, |B|=10^5). The similarity measure sim(a,b) is the dot product (required). The task is the following: for each vector a in A, find a vector b in B such that the sum of similarities ss over all pairs is maximal.
My first attempt was greedy algorithm:
1. find the pair with the highest similarity and remove that pair from A,B
2. repeat (1) until A is empty
But such a greedy algorithm is suboptimal in this case:
a_1=[1, 0]
a_2=[.5, .4]
b_1=[1, 1]
b_2=[.9, 0]
sim(a_1,b_1)=1
sim(a_1,b_2)=.9
sim(a_2,b_1)=.9
sim(a_2, b_2)=.45
The algorithm returns [a_1,b_1] and [a_2,b_2], ss=1.45, but the optimal solution yields ss=1.8.
Is there an efficient algorithm to solve this problem? Thanks
This is essentially a matching problem in a weighted bipartite graph. Just assume that the weight function f is the dot product (|ab|).
I don't think the special structure of your weight function will simplify the problem a lot, so you're pretty much down to finding a maximum-weight matching.
You can find some basic algorithms for this problem in this wikipedia article. Although at first glance they don't seem viable for your data (V = 10^5, E = 10^7), I would still research them: some of them might allow you to take advantage of your 'lame' set of vertices, with one part orders of magnitude smaller than the other.
This article also seems relevant, although it doesn't list any algorithms.
Not exactly a solution, but hope it helps.
I second Nikita here, it is an assignment (or matching) problem. I'm not sure this is computationally feasible for your problem, but you could use the Hungarian algorithm, also known as Munkres' assignment algorithm, where the cost of assignment (i,j) is the negative of the dot product of ai and bj. Unless you happen to know how the elements of A and B are formed, I think this is the most efficient known algorithm for your problem.
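To make the difference between the greedy rule and an optimal assignment concrete, here is a small sketch run on the example from the question. It assumes each b may be used at most once, as the question's greedy procedure implies. `best_assignment` is a brute-force search over all injective maps from A into B, usable only for tiny inputs; it is not the Hungarian algorithm, which a real solution would use instead (for instance through a library routine such as SciPy's `linear_sum_assignment`). All function names here are made up for the example.

```python
from itertools import permutations


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def greedy_assignment(A, B):
    """Repeatedly take the most similar remaining pair (the question's first attempt)."""
    free_a, free_b = set(range(len(A))), set(range(len(B)))
    total, pairs = 0.0, []
    while free_a:
        i, j = max(((i, j) for i in sorted(free_a) for j in sorted(free_b)),
                   key=lambda p: dot(A[p[0]], B[p[1]]))
        total += dot(A[i], B[j])
        pairs.append((i, j))
        free_a.remove(i)
        free_b.remove(j)
    return total, pairs


def best_assignment(A, B):
    """Exhaustive optimum over every injective map from A into B (tiny inputs only)."""
    best_total, best_choice = None, None
    for choice in permutations(range(len(B)), len(A)):
        total = sum(dot(a, B[j]) for a, j in zip(A, choice))
        if best_total is None or total > best_total:
            best_total, best_choice = total, choice
    return best_total, list(enumerate(best_choice))


A = [[1, 0], [.5, .4]]
B = [[1, 1], [.9, 0]]
print(greedy_assignment(A, B))
print(best_assignment(A, B))
```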

Resources