I have a logic problem: I've found one way to express what I want, but for now it is over-constrained and I don't know how to relax it.
The Context:
Let's assume we have a directed graph which can contain cycles. The edges are unweighted. So something really simple, as follows (don't worry about what is inside each node; it is not important):
One important thing: w1 never changes its name.
The Problem:
I'm formalizing a logic problem with a graph. To say that there is an edge between node wi and node wj, I have a propositional variable (which can be true or false) named Rij. So for N nodes, I have N^2 propositional variables.
These graphs have isomorphisms, and I would like to formalize the constraint
that if there is no edge between i and j, then there is no edge between i and k (for k > j).
In other words, I want to add constraints that break the symmetry of the graphs.
I tried something naïve, as follows:
forall i, forall j (>i), forall k (>j), ( not(Rij) --> not(Rik) )
This constraint works quite often; unfortunately, it is over-constraining. Sometimes there is no way to label the nodes so that these constraints are respected.
Example of the problem:
With this example, it is impossible to name A, B, C such that if there is no edge between wi and wj, then there is no edge between wi and wk (with k > j).
Sum-up of the problem
Is there a way to adapt these logical constraints (or to formulate totally new ones) so that we can always rename A, B, C, ... in one and only one way, no matter the original graph? My problem: how do I formulate these constraints?
Thanks in advance for your help !
Best Regards.
I have a DFA (Q, Σ, δ, q0, F) with some “don't care transitions.” These transitions model symbols which are known not to appear in the input in some situations. If any such transition is taken, it doesn't matter whether the resulting string is accepted or not.
Is there an algorithm to compute an equivalent DFA with a minimal amount of states? Normal DFA minimisation algorithms cannot be used as they don't know about “don't care” transitions and there doesn't seem to be an obvious way to extend the algorithms.
I think this problem is NP-hard (more on that in a bit). This is what I'd try.
(Optional) Preprocess the input via the usual minimization algorithm with accept/reject/don't care as separate outcomes. (Since don't care is not equivalent to accept or reject, we get the Myhill–Nerode equivalence relation back, allowing a variant of the usual algorithm.)
Generate a conflict graph as follows. Start with all edges between accepting and rejecting states. Take the closure where we iteratively add edges q1—q2 such that there exists a symbol s for which the edge δ(q1, s)—δ(q2, s) is already present.
Color this graph with as few colors as possible. (Or approximate.) Lots and lots of coloring algorithms out there. PartialCol is a good starting point.
Merge each color class into a single node. This potentially makes the new transition function multi-valued, but we can choose arbitrarily.
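As a rough sketch of the conflict-graph closure and an (approximate) greedy coloring, the following might work; the toy DFA, its 'A'/'R'/'?' outcome labels, and the one-letter alphabet are all invented for illustration:

```python
from itertools import combinations

# Hypothetical toy DFA: outcome of each state is 'A' (accept),
# 'R' (reject), or '?' (don't care).
outcome = {0: 'A', 1: 'R', 2: 'A', 3: '?'}
delta = {(0, 'a'): 1, (1, 'a'): 0, (2, 'a'): 3, (3, 'a'): 3}
states = list(outcome)
alphabet = ['a']

# Start with all edges between accepting and rejecting states,
# then take the closure described above.
conflict = {(p, q) for p, q in combinations(states, 2)
            if {outcome[p], outcome[q]} == {'A', 'R'}}
changed = True
while changed:
    changed = False
    for p, q in combinations(states, 2):
        if (p, q) in conflict:
            continue
        for s in alphabet:
            tp, tq = delta[(p, s)], delta[(q, s)]
            if (min(tp, tq), max(tp, tq)) in conflict:
                conflict.add((p, q))
                changed = True
                break

# Greedy coloring of the conflict graph (an approximation, as noted).
color = {}
for v in states:
    taken = {color[u] for u in states if u in color
             and (min(u, v), max(u, v)) in conflict}
    color[v] = next(c for c in range(len(states)) if c not in taken)
print(color)
```

Each color class would then be merged into a single state, picking one successor arbitrarily where the merged transition function becomes multi-valued.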
With access to an alphabet of arbitrary size, it seems easy enough to make this reduction to coloring run the other way, proving NP-hardness. The open question for me is whether having a fixed-size alphabet constrains the conflict graph in such a way as to make the resulting coloring instance easier somehow. Alas, I don't have time to research this.
I believe a slight variation of the normal Moore algorithm works. Here's a statement of the algorithm.
Let S be the set of states.
Let P be the set of all unordered pairs drawn from S.
Let M be a subset of P of "marked" pairs.
Initially set M to all pairs where one state is accepting and the other isn't.
Let T(x, c) be the transition function from state x on character c.
Do
    For each pair z = <a, b> in P - M
        For each character c in the alphabet
            If <T(a, c), T(b, c)> is in M
                Add z to M and continue with the next z
Until no new additions to M
The final set P - M is a pairwise description of an equivalence relation on states. From it you can create a minimum DFA by merging states and transitions of the original.
I believe don't care transitions can be handled by never marking (adding to M) pairs based on them. That is, we change one line:
If T(a, c) != DC and T(b, c) != DC and <T(a, c), T(b, c)> is in M
(Actually in an implementation, no real algorithm change is needed if DC is a reserved value of type state that's not a state in the original FA.)
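Putting it together, here is a minimal sketch of the marking loop with that one-line change applied, using None as the reserved DC value; the toy DFA itself is invented for illustration:

```python
from itertools import combinations

DC = None  # sentinel: a "don't care" transition, not a real state

# Hypothetical DFA: states 0..2, alphabet {'a'};
# state 2's transition on 'a' is a don't-care.
accepting = {0}
T = {(0, 'a'): 1, (1, 'a'): 0, (2, 'a'): DC}
states = [0, 1, 2]
alphabet = ['a']

# Initially mark every pair where exactly one state is accepting.
M = {frozenset(p) for p in combinations(states, 2)
     if (p[0] in accepting) != (p[1] in accepting)}

changed = True
while changed:
    changed = False
    for a, b in combinations(states, 2):
        z = frozenset((a, b))
        if z in M:
            continue
        for c in alphabet:
            ta, tb = T[(a, c)], T[(b, c)]
            # The changed line: never mark based on a don't-care transition.
            if ta != DC and tb != DC and frozenset((ta, tb)) in M:
                M.add(z)
                changed = True
                break

# P - M: the surviving pairs describe which states may be merged.
equivalent = [p for p in map(frozenset, combinations(states, 2)) if p not in M]
print(equivalent)
```

On this toy input, the pair {1, 2} survives because state 2's only outgoing transition is a don't-care, so the two states can be merged.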
I don't have time to think about a formal proof right now, but this makes intuitive sense to me. We skip splitting equivalence classes of states based on transitions that we know will never occur.
The thing I still need to prove to myself is whether the set P - M is still a pairwise description of an equivalence relation. I.e., can we end up with <a,b> and <b,c> but not <a,c>? If so, is there a fixup?
I asked about minimum-cost maximum flow several weeks ago. Kraskevich's answer was brilliant and solved my problem. I've implemented it and it works fine (available only in French, sorry). Additionally, the algorithm can handle the assignment of i (i > 1) projects to each student.
Now I'm trying something more difficult. I'd like to add constraints on choices. In the case where one wants to assign i (i > 1) projects to each student, I'd like to be able to specify which projects are compatible with each other.
In the case where some projects are not compatible, I'd like the algorithm to return the global optimum, i.e. assign i projects to each student, maximizing global happiness and respecting the compatibility constraints.
Chaining the original method i times (and checking the constraints at each step) will not help, as it would only return a local optimum.
Any ideas about the correct graph to work with?
Unfortunately, it is not solvable in polynomial time (unless P = NP or there are additional constraints).
Here is a polynomial-time reduction from the maximum independent set problem (which is known to be NP-hard) to this one:
Given a graph G and a number k, do the following:
Create a project for each vertex in the graph G, and say that two projects are incompatible iff there is an edge between the corresponding vertices in G.
Create one student who likes each project equally (we can assume that the happiness each project gives him is equal to 1).
Find the maximum happiness using an algorithm that solves the problem stated in your question. Let's call it h.
A set of projects can be picked iff they all are compatible, which means that the picked vertices of G form an independent set(due to the way we constructed the graph).
Thus, h is equal to the size of the maximum independent set.
Return h >= k.
What does it mean in practice? It means that it is not reasonable to look for a polynomial time solution to this problem. There are several things that can be done:
If the input is small, you can use exhaustive search.
If it is not, you can use heuristics and/or approximations to find a relatively good solution (not necessarily the optimal one, though).
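For the small-input case, exhaustive search is straightforward. This sketch, with invented happiness scores and incompatibilities, also shows why chaining greedy choices can get stuck: taking the single best project p1 first leaves no compatible partner, while the exhaustive search finds the compatible pair p2, p3:

```python
from itertools import combinations

# Hypothetical toy instance: one student must receive i = 2 projects.
happiness = {'p1': 3, 'p2': 2, 'p3': 2}   # score the student gives each project
incompatible = {frozenset(('p1', 'p2')),  # p1 clashes with both other projects
                frozenset(('p1', 'p3'))}

def best_pair():
    """Exhaustively try every pair of projects, keeping only compatible ones."""
    best, best_score = None, -1
    for pair in combinations(happiness, 2):
        if frozenset(pair) in incompatible:
            continue
        score = sum(happiness[p] for p in pair)
        if score > best_score:
            best, best_score = pair, score
    return best, best_score

print(best_pair())
```

Greedy chaining would pick p1 (score 3) and then find no compatible second project, a local optimum with total 3 instead of the global optimum 4.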
If you can stomach the library dependency, integer programming will be quicker and easier than anything you can implement yourself. All you have to do is formulate the original problem as an integer program and then add your ad hoc constraints at the end.
I'm looking for leads on algorithms to deduce the timeline/chronology of a series of novels. I've split the texts into days and created a database of relationships between them, e.g.: X is a month before Y, Y and Z are consecutive, date of Z is known, X is on a Tuesday, etc. There is uncertainty ('month' really only means roughly 30 days) and also contradictions. I can mark some relationships as more reliable than others to help resolve ambiguity and contradictions.
What kind of algorithms exist to deduce a best-fit chronology from this kind of data, assigning a highest-probability date to each day? At least time is one-dimensional, but dealing with a complex relationship graph with inconsistencies seems non-trivial. I have a CS background, so I can code something up, but some idea of the names of applicable algorithms would be helpful. I guess what I have is a graph with days as nodes and relationships as edges.
A simple, crude first approximation to your problem would be to store information like "A happened before B" in a directed graph with edges like "A -> B". Test the graph to see whether it is a Directed Acyclic Graph (DAG). If it is, the information is consistent in the sense that there is a consistent chronology of what happened before what else. You can get a sample linear chronology by printing a "topological sort" (topsort) of the DAG. If events C and D happened simultaneously or there is no information to say which came before the other, they might appear in the topsort as ABCD or ABDC. You can even get the topsort algorithm to print all possibilities (so both ABCD and ABDC) for further analysis using more detailed information.
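A minimal sketch of this first step in Python, using the standard-library graphlib (the event names and edges are invented):

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical before-relations: ('A', 'B') means A happened before B.
edges = [('A', 'B'), ('B', 'C'), ('B', 'D')]

# TopologicalSorter expects a mapping from node to its predecessors.
graph = {}
for a, b in edges:
    graph.setdefault(b, set()).add(a)

try:
    order = list(TopologicalSorter(graph).static_order())
    print("consistent chronology:", order)
except CycleError as e:
    # The graph is not a DAG: e.args[1] holds one offending cycle.
    print("contradiction in cycle:", e.args[1])
```

Here C and D are unordered relative to each other, so either relative order is a valid linearization.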
If the graph you obtain is not a DAG, you can use an algorithm like Tarjan's algorithm to quickly identify "strongly connected components", which are areas of the graph which contain chronological contradictions in the form of cycles. You could then analyze them more closely to determine which less reliable edges might be removed to resolve contradictions. Another way to identify edges to remove to eliminate cycles is to search for "minimum feedback arc sets". That's NP-hard in general but if your strongly connected components are small the search could be feasible.
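A compact recursive version of Tarjan's algorithm for this purpose might look like the following; the graph, with its deliberate A→B→C→A contradiction, is invented:

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm; any SCC with more than one node contains a cycle."""
    index, low, on_stack, stack, sccs = {}, {}, set(), [], []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:          # v is the root of an SCC
            scc = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                scc.add(w)
                if w == v:
                    break
            sccs.append(scc)

    for v in graph:
        if v not in index:
            visit(v)
    return sccs

# Hypothetical contradiction: A before B, B before C, C before A.
g = {'A': ['B'], 'B': ['C'], 'C': ['A'], 'D': []}
cycles = [s for s in strongly_connected_components(g) if len(s) > 1]
print(cycles)
```

Each reported component is a cluster of mutually contradictory "before" claims to examine for a weak edge to drop.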
Constraint programming is what you need. In propagation-based CP, you alternate between (a) making a decision at the current choice point in the search tree and (b) propagating the consequences of that decision as far as you can. Notionally you do this by maintaining a domain D of possible values for each problem variable x such that D(x) is the set of values for x which have not yet been ruled out along the current search path. In your problem, you might be able to reduce it to a large set of Boolean variables, x_ij, where x_ij is true iff event i precedes event j. Initially D(x) = {true, false} for all variables. A decision is simply reducing the domain of an undecided variable (for a Boolean variable this means reducing its domain to a single value, true or false, which is the same as an assignment). If at any point along a search path D(x) becomes empty for any x, you have reached a dead-end and have to backtrack.
If you're smart, you will try to learn from each failure and also retreat as far back up the search tree as required to avoid redundant search (this is called backjumping -- for example, if you identify that the dead-end you reached at level 7 was caused by the choice you made at level 3, there's no point in backtracking just to level 6 because no solution exists in this subtree given the choice you made at level 3!).
Now, given you have different degrees of confidence in your data, you actually have an optimisation problem. That is, you're not just looking for a solution that satisfies all the constraints that must be true, but one which also best satisfies the other "soft" constraints according to the degree of trust you have in them. What you need to do here is decide on an objective function assigning a score to a given set of satisfied/violated partial constraints. You then want to prune your search whenever you find the current search path cannot improve on the best previously found solution.
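A brute-force miniature of this objective-function idea (real CP propagation and pruning omitted); the events, the hard constraint, and the trust weights are all invented:

```python
from itertools import permutations

# Hard constraints must hold; each satisfied soft constraint
# contributes its trust weight to the objective.
events = ['X', 'Y', 'Z']
hard = [('X', 'Y')]                            # X must precede Y
soft = [(('Y', 'Z'), 2.0), (('Z', 'X'), 1.0)]  # (precedence, trust weight)

def score(order):
    pos = {e: i for i, e in enumerate(order)}
    if any(pos[a] >= pos[b] for a, b in hard):
        return None                            # hard constraint violated
    return sum(w for (a, b), w in soft if pos[a] < pos[b])

best = max((o for o in permutations(events) if score(o) is not None),
           key=score)
print(best, score(best))
```

The two soft constraints conflict (they would form a cycle with the hard one), so the search keeps the higher-trust claim and drops the other.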
If you do decide to go for the Boolean approach, you could profitably look into SAT solvers, which tear through these kinds of problems. But the first place I'd look is at MiniZinc, a CP language which maps on to a whole variety of state of the art constraint solvers.
Best of luck!
Problem
I'm looking at a special subset of SAT optimization problem. For those not familiar with SAT and related topics, here's the related Wikipedia article.
TRUE=(a OR b OR c OR d) AND (a OR f) AND ...
There are no NOTs and it's in conjunctive normal form, so it is easily satisfiable. However, I'm trying to minimize the number of variables set to true while making the whole statement true. I couldn't find a way to solve that problem.
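To pin down the objective, here is a brute-force sketch (the kind of exhaustive search ruled out as a solution below, but useful for stating the problem precisely); the formula is the example from above:

```python
from itertools import combinations

# The example formula: (a OR b OR c OR d) AND (a OR f), as sets of variables.
clauses = [{'a', 'b', 'c', 'd'}, {'a', 'f'}]
variables = set().union(*clauses)

def minimum_true_set(clauses, variables):
    """Smallest set of variables that, set to true, satisfies every clause."""
    for size in range(len(variables) + 1):
        for true_vars in combinations(sorted(variables), size):
            # A clause is satisfied iff it contains at least one true variable.
            if all(clause & set(true_vars) for clause in clauses):
                return set(true_vars)

print(minimum_true_set(clauses, variables))
```

For this formula, setting only a to true satisfies both clauses, so the minimum is one true variable.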
Possible solutions
I came up with the following ways to solve it:
Convert it to a directed graph and search for a minimum spanning tree that spans only a subset of the vertices. There's Edmonds's algorithm, but that gives an MST for the complete graph instead of a subset of the vertices.
Maybe there's a version of Edmond's algorithm that solves the problem for a subset of the vertices?
Maybe there's a way to construct a graph out of the original problem that's solvable with other algorithms?
Use a SAT solver, an ILP solver, or exhaustive search. I'm not interested in those solutions, as I'm trying to use this problem as lecture material.
Question
Do you have any ideas/comments? Can you come up with other approaches that might work?
This problem is NP-Hard as well.
One can show an easy reduction from Hitting Set:
Hitting Set problem: Given sets S1, S2, ..., Sn and a number k, choose a set S of size k such that for every Si there is an element s in S with s in Si. [Alternative definition: the intersection between each Si and S is not empty.]
Reduction:
For an instance (S1, ..., Sn, k) of Hitting Set, construct the following instance of your problem: (S'1 AND S'2 AND ... AND S'n, k), where S'i is the OR of all elements in Si. These elements of S'i are the variables of the formula.
proof:
Hitting Set -> this problem: If there is a hitting set S, then by assigning true to all of S's elements, the formula is satisfied with k true variables, since for every S'i there is some variable v which is in both S and Si, and thus also in S'i.
This problem -> Hitting Set: build S from all elements whose assignment is true [same idea as above].
Since you are looking at the optimization version, it is also NP-hard, and if you need an exact solution, you should try an exponential algorithm.
This is the question [From CLRS]:
Define the optimization problem LONGEST-PATH-LENGTH as the relation that
associates each instance of an undirected graph and two vertices with the number
of edges in a longest simple path between the two vertices. Define the decision
problem LONGEST-PATH = {<G, u, v, k> : G = (V, E) is an undirected
graph, u, v contained in V, k >= 0 is an integer, and there exists a simple path
from u to v in G consisting of at least k edges}. Show that the optimization problem
LONGEST-PATH-LENGTH can be solved in polynomial time if and only if
LONGEST-PATH is contained in P.
My solution:
Given an algorithm A that solves LONGEST-PATH-LENGTH(G, u, v) in polynomial time, we run A on (G, u, v) to get k', the length of the longest simple path between u and v. Now all we have to do is compare:
k <= k'
If so, the answer to LONGEST-PATH is "YES"; if k > k', the answer is "NO".
So it takes polynomial time to run A plus constant time for the comparison. Thus, if LONGEST-PATH-LENGTH can be solved in polynomial time, then LONGEST-PATH is in P.
We can solve it the opposite way as well: run LONGEST-PATH(G, u, v, k') for k' = 0 to n, each time checking whether the answer is "YES"; the largest such k' is the longest path length.
Run-time analysis for this:
n * polytime + n * (constant comparison) = polytime
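The second direction can be sketched as follows, with a brute-force decision procedure standing in for the assumed polynomial-time oracle (the graph is invented):

```python
def longest_path_decision(G, u, v, k):
    """Decision oracle: is there a simple path from u to v with >= k edges?"""
    def dfs(node, visited, length):
        if node == v and length >= k:
            return True
        return any(dfs(w, visited | {w}, length + 1)
                   for w in G[node] if w not in visited)
    return dfs(u, {u}, 0)

def longest_path_length(G, u, v):
    """Optimization via the oracle: the largest k with a 'YES' answer."""
    best = -1                       # -1 means there is no u-v path at all
    for k in range(len(G)):         # a simple path has at most n - 1 edges
        if longest_path_decision(G, u, v, k):
            best = k
    return best

# Hypothetical undirected graph as adjacency sets.
G = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
print(longest_path_length(G, 1, 4))
```

Here the longest simple path from 1 to 4 is 1-2-3-4, so the loop's last "YES" comes at k = 3.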
Can someone tell me if my answer is reasonable? If not, please tell me where I've gone wrong.
Also, can you give me some advice on how to study algorithms, and what approach I should take to solving an algorithms question (or a graph question)?
Please and thank you.
Your answer is reasonable but I would try to shore it up a little bit formally (format the cases separately in a clear manner, be more precise about what polynomial time means, that kind of stuff...)
The only thing I would like to point out is that in your second reduction (showing that the decision problem solves the optimization problem), the "for k' = 0 to n" solution is not general. Polynomial time is measured relative to the length of the input, so in problems where n is an arbitrary number (such as a weight or something) instead of a count of items from the input (as it is in this case), you need to use binary search over k to be sure.