Why is the term 'reduce' used in the context of NP complexity? - complexity-theory

Why is the term 'reduce' used when B is at least as hard?
In the context of NP complexity, we say that A is reducible to B in polynomial time, written A ≤ B, where A is a known hard problem and we try to show that it can be reduced to B, a problem whose hardness is unknown.
Suppose we prove this successfully; that means B is at least as hard as A. Then what exactly is being reduced? This does not seem to be in line with the meaning of 'reduce' when B is a problem that is harder and less general.

Reduce comes from the Latin "reducere", composed of "re" (back) and "ducere" (to lead). In this context, it literally means "bring back, convert", since the problem of deciding whether an element x is in A is converted to the problem of deciding whether a suitably transformed input f(x) is in B.
Let me observe that the notion of reducibility is used in many different contexts apart from (NP) complexity. In particular, it originated in computability theory.

Related

The problem complexity of the Maximum Coverage problem with set size constraint (P or NP)

The classic Maximum Coverage (MC) problem is an NP-hard optimization problem. Consider d elements U = {e1, e2, ... ed} and c sets T1, T2 ... Tc. Each set contains some elements in U. The problem aims to find at most b sets, such that the cardinality of the union of these sets is maximized.
For example, T1={e1, e3}, T2={e1, e2, e3} and T3={e3, e4}. When b=2, the optimal solution picks T2 and T3.
I am considering a variation of the classic MC problem, which imposes a set size constraint. Consider 1 < k <= d, and suppose the size of every set is bounded by k. Call this problem k-MC. Is the problem still NP-hard?
My conjecture is that k-MC is still NP-hard, but I am struggling to come up with a polynomial reduction from a proven NP-hard problem, like MC.
If, for an arbitrary instance of Maximum Coverage, I could find a polynomial reduction to my problem for all k > 1, I could conclude that my problem is also NP-hard.
Here is what I got so far:
When k=d, the problem is trivially equivalent to the classic Maximum Coverage.
When k=d-1, we look at the given MC instance and see if there exists a set of size d. If there is, simply pick it. Otherwise, the instance reduces to the k-MC problem with k=d-1.
When k is less than d-1, I resort to dynamic programming to complete the reduction. However, this yields a non-polynomial-time reduction, which defeats the purpose of reducing from an NP-hard problem.
If anyone could give me some pointers on how I should tackle this problem, or even just make an educated guess on the problem complexity of k-MC (P or NP), I'd really appreciate it.
2-MC is easy -- interpret the sets of size 2 as a graph and run your favorite matching algorithm for non-bipartite graphs. Once you exceed the matching cardinality, you're stuck picking singletons.
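For concreteness, here is a rough Python sketch of that idea (not part of the original answer; names are mine, and it leans on networkx for the non-bipartite matching; the b > |matching| case follows the "stuck picking singletons" remark above):

import networkx as nx

def solve_2mc(sets, b):
    """Maximum number of elements coverable by picking b sets of size <= 2."""
    G = nx.Graph()
    for s in sets:
        s = list(s)
        if len(s) == 2:
            G.add_edge(s[0], s[1])      # size-2 sets become edges
        elif len(s) == 1:
            G.add_node(s[0])            # singleton sets become isolated nodes
    # with unit weights, a maximum-weight matching is a maximum-cardinality one
    matching = nx.max_weight_matching(G)
    m = len(matching)
    if b <= m:
        return 2 * b                    # b disjoint edges cover 2b elements
    covered = {v for e in matching for v in e}
    # beyond the matching, each extra set adds at most one new element
    # (otherwise the matching would not be maximum), and every still-uncovered
    # element that appears in some set can be picked up by one extra set
    uncovered = set(G.nodes()) - covered
    return 2 * m + min(b - m, len(uncovered))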
3-MC is hard. You can encode an instance of 3-partition as 3-MC by taking the sets to be the triples that sum to the target, then decide if it's solvable by checking whether full coverage is achievable with b = n/3.
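A small sketch of that encoding (my own naming; the numbers are identified by index so equal values stay distinct elements):

from itertools import combinations

def three_partition_to_3mc(numbers):
    """Encode a 3-partition instance as a 3-MC instance (universe, sets, b)."""
    m = len(numbers) // 3
    target = sum(numbers) // m                 # every triple must hit this sum
    universe = list(range(len(numbers)))       # elements are indices into numbers
    sets = [frozenset(t) for t in combinations(universe, 3)
            if numbers[t[0]] + numbers[t[1]] + numbers[t[2]] == target]
    return universe, sets, m                   # solvable iff some m of these
                                               # sets cover all indices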

Minimize a DFA with don't care transitions

I have a DFA (Q, Σ, δ, q0, F) with some “don't care transitions.” These transitions model symbols which are known not to appear in the input in some situations. If any such transition is taken, it doesn't matter whether the resulting string is accepted or not.
Is there an algorithm to compute an equivalent DFA with a minimal amount of states? Normal DFA minimisation algorithms cannot be used as they don't know about “don't care” transitions and there doesn't seem to be an obvious way to extend the algorithms.
I think this problem is NP-hard (more on that in a bit). This is what I'd try.
(Optional) Preprocess the input via the usual minimization algorithm with accept/reject/don't care as separate outcomes. (Since don't care is not equivalent to accept or reject, we get the Myhill–Nerode equivalence relation back, allowing a variant of the usual algorithm.)
Generate a conflict graph as follows. Start with all edges between accepting and rejecting states. Take the closure where we iteratively add an edge q1—q2 whenever there exists a symbol s such that δ(q1, s)—δ(q2, s) is already an edge.
Color this graph with as few colors as possible. (Or approximate.) Lots and lots of coloring algorithms out there. PartialCol is a good starting point.
Merge each color class into a single node. This potentially makes the new transition function multi-valued, but we can choose arbitrarily.
With access to an alphabet of arbitrary size, it seems easy enough to make this reduction to coloring run the other way, proving NP-hardness. The open question for me is whether having a fixed-size alphabet constrains the conflict graph in such a way as to make the resulting coloring instance easier somehow. Alas, I don't have time to research this.
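If it helps, here is a rough rendering of the conflict-graph and coloring steps in Python (my own naming, not from the answer above; delta[q][s] is the successor state, or None for a don't-care transition, and accepting/rejecting are sets of states with those outcomes):

import networkx as nx

def conflict_graph(states, alphabet, delta, accepting, rejecting):
    G = nx.Graph()
    G.add_nodes_from(states)
    for a in accepting:                  # accepting vs. rejecting states can
        for r in rejecting:              # never be merged
            G.add_edge(a, r)
    changed = True
    while changed:                       # closure step described above
        changed = False
        for q1 in states:
            for q2 in states:
                if q1 == q2 or G.has_edge(q1, q2):
                    continue
                for s in alphabet:
                    t1, t2 = delta[q1].get(s), delta[q2].get(s)
                    # don't-care successors impose no constraint
                    if t1 is not None and t2 is not None and G.has_edge(t1, t2):
                        G.add_edge(q1, q2)
                        changed = True
                        break
    return G

# coloring step: greedy DSATUR here; PartialCol or an exact solver would do
# better.  States sharing a color are merged into one state afterwards.
# coloring = nx.coloring.greedy_color(G, strategy="saturation_largest_first")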
I believe a slight variation of the normal Moore algorithm works. Here's a statement of the algorithm.
Let S be the set of states.
Let P be the set of all unordered pairs drawn from S.
Let M be a subset of P of "marked" pairs.
Initially set M to all pairs where one state is accepting and the other isn't.
Let T(x, c) be the transition function from state x on character c.
Do
    For each pair z = <a, b> in P - M
        For each character c in the alphabet
            If <T(a, c), T(b, c)> is in M
                Add z to M and continue with the next z
Until no new additions to M
The final set P - M is a pairwise description of an equivalence relation on states. From it you can create a minimum DFA by merging states and transitions of the original.
I believe don't care transitions can be handled by never marking (adding to M) pairs based on them. That is, we change one line:
If T(a, c) != DC and T(b, c) != DC and <T(a, c), T(b, c)> is in M
(Actually in an implementation, no real algorithm change is needed if DC is a reserved value of type state that's not a state in the original FA.)
I don't have time to think about a formal proof right now, but this makes intuitive sense to me. We skip splitting equivalence classes of states based on transitions that we know will never occur.
The thing I still need to prove to myself is whether the set P - M is still a pairwise description of an equivalence relation. I.e., can we end up with <a,b> and <b,c> but not <a,c>? If so, is there a fixup?
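For what it's worth, here is a small Python rendering of the modified marking loop (my own naming, not from the answer; DC is the reserved don't-care value mentioned above, T[a][c] is the transition table, and accepting is the set of accepting states):

from itertools import combinations

DC = object()                            # reserved "don't care" target state

def unmarked_pairs(states, alphabet, T, accepting):
    P = {frozenset(p) for p in combinations(states, 2)}
    M = {p for p in P if len(p & accepting) == 1}    # accepting vs. non-accepting
    changed = True
    while changed:
        changed = False
        for p in P - M:
            a, b = tuple(p)
            for c in alphabet:
                ta, tb = T[a][c], T[b][c]
                if ta is DC or tb is DC or ta == tb:
                    continue             # never mark based on a don't-care transition
                if frozenset((ta, tb)) in M:
                    M.add(p)
                    changed = True
                    break
    return P - M                         # candidate pairs for merging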

Sorting algorithm for expensive comparison

Given is an array of n distinct objects (not integers), where n is between 5 and 15. I have a comparison function cmp(a, b) which is true if a < b and false otherwise, but it's very expensive to call. I'm looking for a sorting algorithm with the following properties:
It calls cmp(a, b) as few times as possible (subject to constraints below). Calls to cmp(a, b) can't be parallelized or replaced. The cost is unavoidable, i.e. think of each call to cmp(a, b) as costing money.
Aborting the algorithm should give good-enough results (best-fit sort of the array). Ideally the algorithm should attempt to produce a coarse order of the whole array, as opposed to partially sorting one subset at a time. This may imply that the overall number of calls is not as small as theoretically possible to sort the entire array.
cmp(a, b) implies not cmp(b, a) => No items in the array are equal => Stability is not required. This is always true, unless...
In rare cases cmp(a, b) violates transitivity. For now I'll ignore this, but ultimately I would like this to be handled as well. Transitivity could be violated in short chains, i.e. x < y < z < x, but not in longer chains. In this case the final order of x y z doesn't matter.
Only the number of calls to cmp() needs to be optimized; algorithm complexity, space, speed and other factors are irrelevant.
Back story
Someone asked where this odd problem arose. Well, despite my shallow attempt at formalism, the problem is actually not formal at all. A while back a friend of mine found a web page on the internets that allowed him to put some stuff in a list and make comparisons on that list in order to get it sorted. He since lost that web page, and asked me to help him out. Sure, I said, and smashed my keyboard arriving at this implementation. You are welcome to peruse the source code to see how I pretended to solve the problem above. Since I was quite inebriated when all this happened, I decided to outsource the real thinking to Stack Overflow.
Your best bet to start with would be Ch. 5 of Knuth's TAOCP Vol. III; it is about optimal sorting (i.e. with a minimal number of comparisons). OTOH, since the number of objects you are sorting is very small, I doubt there will be any noticeable difference between an optimal algorithm and, say, bubble sort. So perhaps you will need to focus on making the comparisons cheaper. Strange problem though... would you mind giving details? Where does it arise?
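In case it helps, here is a simple sketch (my own, not the merge-insertion algorithm from Knuth) that keeps the worst-case number of cmp() calls reasonably close to the information-theoretic bound for n up to 15; it does not address the abort-anytime requirement:

def binary_insertion_sort(items, cmp):
    """Sort with few comparisons; cmp(a, b) should be True iff a < b."""
    calls = 0
    def less(a, b):
        nonlocal calls
        calls += 1                       # every call here costs money
        return cmp(a, b)
    result = []
    for x in items:
        lo, hi = 0, len(result)
        while lo < hi:                   # binary search for the insertion point
            mid = (lo + hi) // 2
            if less(x, result[mid]):
                hi = mid
            else:
                lo = mid + 1
        result.insert(lo, x)
    return result, calls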

Using nondeterminism to detect cliques?

I am trying to understand non-determinism with the clique-problem.
In computer science, the clique problem refers to any of the problems related to finding particular complete subgraphs ("cliques") in a graph, i.e., sets of elements where each pair of elements is connected.
Say I have a graph with nodes A, B, C, D, E, F and I want to decide if a clique of size 4 exists.
My understanding of non-determinism is to make a guess by taking four nodes (B, C, D, F) and check if a connection exists between all 4 nodes. If it exists, I conclude that a clique exists, and if it doesn't, I conclude that a clique does not exist.
What I am not sure of however is how this helps solve the problem as I just might have made the wrong choice.
I guess I am trying to understand the application of non-determinism in general.
Nondeterministic choices are different from random or arbitrary choices. When using nondeterminism, if any possible choice that can be made will lead to the algorithm outputting YES, then one of those choices will be selected. If no choice exists that does this, then an arbitrary choice will be made.
If this seems like cheating, in a sense it is. It's unknown how to implement nondeterminism efficiently using a deterministic computer, a randomized algorithm, or parallel computers that have lots of processors but which can only do a small amount of work on each core. These are the P = NP, BPP = NP, and NC = NP questions, respectively. Accordingly, nondeterminism is primarily a theoretical approach to problem solving.
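As a concrete illustration (my own code, using the node names from the question and an edge set I made up): the verifier below is the cheap, polynomial part; a nondeterministic machine "guesses" the right four nodes, whereas a deterministic simulation has to try every guess.

from itertools import combinations

def is_clique(guess, edges):
    # polynomial-time verifier: every pair in the guessed set must be connected
    return all(frozenset((u, v)) in edges for u, v in combinations(guess, 2))

def has_clique_of_size(nodes, edges, k):
    # deterministic simulation of the nondeterministic guess: try them all
    return any(is_clique(g, edges) for g in combinations(nodes, k))

nodes = "ABCDEF"
edges = {frozenset(e) for e in [("B", "C"), ("B", "D"), ("B", "F"),
                                ("C", "D"), ("C", "F"), ("D", "F")]}
print(has_clique_of_size(nodes, edges, 4))   # True: B, C, D, F form a clique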
Hope this helps!

A very complex problem in reduction notion

I have studied a lot about reduction, but I have a problem with it.
I took this from CLRS:
" ... by “reducing” solving problem A to solving problem B, we use the “easiness” of B to prove the “easiness” of A."
And I took this from "Computational Complexity" by Christos H. Papadimitriou:
" ... problem A is at least as hard as problem B if B reduces to A."
I got confused by these two notions:
When we use easiness, we say that problem X reduces to problem Y, and if we have a polynomial-time algorithm for Y and the reduction is done in polynomial time, then problem X is solvable in polynomial time and X is easier than Y or at least is not harder than Y.
But when we use hardness, we say problem X reduces to problem Y and Y is easier than X or at least is not harder than X.
I am really confused; please help me.
Special thanks.
I think you might have missed that the first quote says "reduce A to B", and the second quote says "reduce B to A".
If X reduces to Y, meaning that Y can be used to solve X, then X is no harder than Y. That's because polynomial-complexity reduction is considered "free", so by reducing X to Y we've found a way to solve X using whatever solutions there are to Y.
So, in the first quote, if A reduces to B and B is easy, that means A is easy (strictly speaking, it's no harder).
The second quote uses the logical contrapositive: if B reduces to A and B is hard, then A must be hard (strictly speaking, it's no easier). Proof: If A was easy, then B would be easy (as above but A and B are reversed). B is not easy, therefore A is not easy.
Your statement, "we say problem X reduces to problem Y and Y is easier than X or at least is not harder than X" is false. It is possible for X to reduce to Y (that is, we can use Y to solve X), even though Y in fact is harder than X. So we could reduce addition (X) to a special case of some NP-hard problem (Y), by defining a scheme to construct in polynomial time an instance of the NP-hard problem whose solution is the sum of our two input numbers. It doesn't mean addition is NP-hard, just that we've made things unnecessarily difficult for ourselves. It's unwise to use that reduction in order to perform addition, since there are better ways to do addition. Well, better assuming P!=NP, that is.
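To make that last point concrete, here is a toy reduction of my own (not from the texts quoted above): the easy question "does a + b equal c?" mapped, for nonnegative integers, to SUBSET-SUM, which is NP-hard. The reduction is perfectly legal; it just tells us nothing about the hardness of addition.

from itertools import combinations

def addition_to_subset_sum(a, b, c):
    # Doubling and adding 1 forces any subset hitting the (even) target to use
    # both numbers: singletons are odd, and the empty set sums to 0.
    return [2 * a + 1, 2 * b + 1], 2 * c + 2

def subset_sum(numbers, target):
    # brute-force oracle standing in for a SUBSET-SUM solver
    return any(sum(s) == target
               for r in range(len(numbers) + 1)
               for s in combinations(numbers, r))

# a + b == c iff the constructed SUBSET-SUM instance is a yes-instance
print(subset_sum(*addition_to_subset_sum(3, 4, 7)))   # True
print(subset_sum(*addition_to_subset_sum(3, 4, 8)))   # False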
Think of reduction as reducing the proof that the problem is in a certain class, rather than reducing the problem itself. The relation is more related to logic than to complexity.
The theory is simply this.
You have an algorithm to solve problem A, and you know it runs in polynomial time.
If it is possible to convert problem B into a notation that A's algorithm can solve, and then convert the result back into the notation for problem B, all in polynomial time, then solving problem B also takes polynomial time: the total is the conversion time plus the time to run A's algorithm on an instance whose size is polynomially bounded, and composing polynomials still gives a polynomial. Hence B is no harder.
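Schematically (my own phrasing of the point above), the whole procedure is just a composition of three polynomial-time pieces:

def solve_B(instance_of_B, encode, solve_A, decode):
    # encode: B-instance -> A-instance, decode: A-answer -> B-answer.
    # If encode, solve_A and decode each run in polynomial time, so does this.
    return decode(solve_A(encode(instance_of_B)))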
