How do we know that an NFA has the minimum number of states?

Is there some kind of proof for this? How can we know that a given NFA has the minimum number of states?

In contrast to DFA minimization, where efficient methods exist not only to determine the size of, but to actually compute, the smallest DFA (in number of states) recognizing a given regular language, no general method is known for determining the size of a smallest NFA. Moreover, unless P = PSPACE, no polynomial-time algorithm exists to compute a minimal NFA for a language, because the following decision problem is PSPACE-complete:
Given a DFA M that accepts the regular language L, and an integer k, is there an NFA with ≤ k states accepting L?
(Jiang & Ravikumar 1993).
There is, however, a simple theorem from Glaister and Shallit that can be used to determine lower bounds on the number of states of a minimal NFA:
Let L ⊆ Σ* be a regular language and suppose that there exist n pairs P = { (xi, wi) | 1 ≤ i ≤ n } such that:
xi wi ∈ L for 1 ≤ i ≤ n
xj wi ∉ L for 1 ≤ j, i ≤ n and j ≠ i
Then any NFA accepting L has at least n states.
See: Ian Glaister and Jeffrey Shallit (1996). "A lower bound technique for the size of nondeterministic finite automata". Information Processing Letters 59 (2), pp. 75–77. DOI:10.1016/0020-0190(96)00095-6.
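The Glaister–Shallit conditions are mechanical enough to check by program for a finite candidate set of pairs. Here is a small sketch (the function name and the example language are my own); as an example it uses L = { a^m : m divisible by n } with the n pairs (a^i, a^(n-i)), which certifies that any NFA for L needs at least n states:

```python
def is_fooling_set(pairs, in_lang):
    """Check the Glaister-Shallit conditions for a list of (x, w) pairs
    against a membership predicate in_lang: x_i w_i must be in L, and
    x_j w_i must not be in L whenever j != i."""
    for i, (xi, wi) in enumerate(pairs):
        if not in_lang(xi + wi):
            return False
        for j, (xj, _) in enumerate(pairs):
            if j != i and in_lang(xj + wi):
                return False
    return True

# Example: L = { a^m : m divisible by n }.
# x_i w_i = a^n is in L; x_j w_i = a^(j + n - i) is not, for j != i,
# since j + n - i is then not divisible by n.
n = 5
pairs = [("a" * i, "a" * (n - i)) for i in range(n)]
assert is_fooling_set(pairs, lambda s: len(s) % n == 0)
```

Since the five pairs satisfy both conditions, the theorem says every NFA for this language has at least 5 states (and indeed the obvious cycle of length 5 achieves that).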


Roots of a polynomial mod a prime

I'm looking for a speedy algorithm to find the roots of a univariate polynomial in a prime finite field.
That is, if f = a_0 + a_1 x + a_2 x^2 + ... + a_n x^n (n > 0), then an algorithm that finds all r < p satisfying f(r) ≡ 0 (mod p), for a given prime p.
I found Chien's search algorithm (https://en.wikipedia.org/wiki/Chien_search), but I can't imagine it being fast for primes larger than 20 bits. Does anyone have experience with Chien's search algorithm, or know a faster way? Is there a sympy module for this?
This is pretty well studied, as mcdowella's comment indicates. Here is how the Cantor-Zassenhaus random algorithm works for the case where you want to find the roots of a polynomial, instead of the more general factorization.
Note that in the ring of polynomials with coefficients mod p, the product x(x-1)(x-2)...(x-p+1) has all possible roots, and equals x^p-x by Fermat's Little Theorem and unique factorization in this ring.
Set g = GCD(f, x^p - x). Using Euclid's algorithm to compute the GCD of two polynomials is fast in general, taking a number of division steps at most the maximum degree, and it does not require you to factor the polynomials. g has the same roots as f in the field, and no repeated factors.
Because of the special form of x^p-x, with only two nonzero terms, the first step of Euclid's algorithm can be done by repeated squaring, in about 2 log_2 (p) steps involving only polynomials of degree no more than twice the degree of f, with coefficients mod p. We may compute x mod f, x^2 mod f, x^4 mod f, etc, then multiply together the terms corresponding to nonzero places in the binary expansion of p to compute x^p mod f, and finally subtract x.
Repeatedly do the following: choose a random d in Z/p. Compute the GCD of g with r_d = (x+d)^((p-1)/2) - 1, which we can again compute rapidly by Euclid's algorithm, using repeated squaring for the first step. If the degree of this GCD is strictly between 0 and the degree of g, we have found a nontrivial factor of g, and we can recurse until we have found the linear factors, and hence the roots, of g and thus of f.
How often does this work? r_d has as roots the numbers that are d less than a nonzero square mod p. Consider two distinct roots of g, a and b, so that (x-a) and (x-b) are factors of g. If a+d is a nonzero square and b+d is not, then (x-a) is a common factor of g and r_d while (x-b) is not, which means GCD(g, r_d) is a nontrivial factor of g. Similarly, if b+d is a nonzero square while a+d is not, then (x-b) is a common factor of g and r_d while (x-a) is not. By number theory, one case or the other happens for close to half of the possible choices of d, which means that on average it takes a constant number of choices of d before we find a nontrivial factor of g, in fact one separating (x-a) from (x-b).
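To make the steps above concrete, here is a minimal pure-Python sketch of this root-finding procedure (dense coefficient lists, schoolbook arithmetic; the function and helper names are my own, and it assumes p is an odd prime):

```python
import random

def _trim(a):
    """Drop trailing zero coefficients."""
    while a and a[-1] == 0:
        a = a[:-1]
    return a

def _divmod_poly(a, b, p):
    """Quotient and remainder of a / b, coefficients mod p, low-to-high."""
    a = _trim([c % p for c in a])
    b = _trim([c % p for c in b])
    inv = pow(b[-1], p - 2, p)              # inverse of leading coefficient
    q = [0] * max(0, len(a) - len(b) + 1)
    while len(a) >= len(b):
        k = len(a) - len(b)
        c = a[-1] * inv % p
        q[k] = c
        a = _trim([(a[i] - c * b[i - k]) % p if i >= k else a[i]
                   for i in range(len(a))])
    return q, a

def _gcd_poly(a, b, p):
    """Monic GCD via Euclid's algorithm."""
    a, b = _trim([c % p for c in a]), _trim([c % p for c in b])
    while b:
        a, b = b, _divmod_poly(a, b, p)[1]
    inv = pow(a[-1], p - 2, p)
    return [c * inv % p for c in a]

def _mulmod(a, b, f, p):
    res = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                res[i + j] = (res[i + j] + ai * bj) % p
    return _divmod_poly(res, f, p)[1]

def _powmod(base, e, f, p):
    """base^e mod f by repeated squaring."""
    result, base = [1], _divmod_poly(base, f, p)[1]
    while e:
        if e & 1:
            result = _mulmod(result, base, f, p)
        base = _mulmod(base, base, f, p)
        e >>= 1
    return result

def roots_mod_p(f, p):
    """All roots of f in Z/p, for an odd prime p and deg f >= 1.
    f is a coefficient list, low-to-high, e.g. x^2 + 1 -> [1, 0, 1]."""
    xp = _powmod([0, 1], p, f, p)           # x^p mod f
    xp_minus_x = xp + [0] * max(0, 2 - len(xp))
    xp_minus_x[1] = (xp_minus_x[1] - 1) % p
    g = _gcd_poly(f, xp_minus_x, p)         # product of (x - r) over roots r
    roots, stack = [], [g]
    while stack:
        g = stack.pop()
        if len(g) <= 1:                     # constant: no roots
            continue
        if len(g) == 2:                     # monic linear x + c0: root -c0
            roots.append(-g[0] % p)
            continue
        while True:                         # random Cantor-Zassenhaus split
            d = random.randrange(p)
            r = list(_powmod([d, 1], (p - 1) // 2, g, p)) or [0]
            r[0] = (r[0] - 1) % p           # r = (x+d)^((p-1)/2) - 1 mod g
            r = _trim(r)
            if not r:
                continue
            h = _gcd_poly(g, r, p)
            if 0 < len(h) - 1 < len(g) - 1:  # nontrivial factor found
                stack += [h, _divmod_poly(g, h, p)[0]]
                break
    return sorted(roots)
```

For f = x^2 + 1 over Z/5 this returns [2, 3], and for x^3 - 1 over Z/7 it returns [1, 2, 4]. For large p or high degrees, a library implementation (e.g. sympy's factorization with the modulus= keyword, or NTL/FLINT) will be much faster than this schoolbook arithmetic.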
Your answers are good, but I think I found a wonderful method to find roots modulo any number, based on lattices. Let r ≤ R be a root of f mod p. We must find another polynomial h(x) such that h's coefficients are not large and r is a root of h over the integers; the lattice method finds this polynomial. First we create a basis of polynomials for the lattice, and then, with the LLL algorithm, we find a "shortest vector" that has the root r without the modulus p. In effect, we eliminate the modulus this way.
For more explanation, see D. Coppersmith, "Finding small solutions to small degree polynomials", in Cryptography and Lattices.

How can I reduce this kind of bin-packing problem? (It might be called something like MinBreak-BinFill.)

I'm working with a special variant of the bin packing problem.
At the moment I use a naïve algorithm, so I'd like to know what this variant is called in order to do some initial research. Or does anyone know how to reduce this problem to something known?
The problem:
There are items I and bins B in specific quantity and size.
|I| ∈ ℕ, |B| ∈ ℕ
s : (I ∪ B) → ℕ
The sum of all item sizes is at least the sum of all bin sizes.
∑ _{i∈I} s(i) ≥ ∑ _{b∈B} s(b)
Each bin has to be filled completely with items or parts of items. s(b,i) is the size of the part of i that is placed in b, or 0 if none is.
∀ b ∈ B, i ∈ I: s(b,i) ∈ ℕ ∪ {0}
∀ i ∈ I: ∑ _{b∈B} s(b,i) ≤ s(i)
∀ b ∈ B: ∑ _{i∈I} s(b,i) ≥ s(b)
The goal is to minimize the number of item-parts needed to fill all bins.
numparts = |{ (b,i) ∈ B×I | s(b,i)>0 }|
find min(numparts)
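For reference, a naïve baseline (my own sketch, not a known named algorithm): fill the bins in order and split an item only when a bin boundary is hit. Each part either exhausts the current item or fills the current bin, so this uses at most |I| + |B| − 1 parts:

```python
def next_fit_split(items, bins):
    """Fill bins left to right with item parts, splitting an item only
    when the current bin becomes full.  items/bins are lists of sizes.
    Returns triples (bin_index, item_index, part_size)."""
    parts, it = [], 0
    remaining = items[0] if items else 0     # unplaced size of current item
    for b, cap in enumerate(bins):
        need = cap
        while need > 0:
            take = min(need, remaining)
            if take > 0:
                parts.append((b, it, take))
                need -= take
                remaining -= take
            if remaining == 0:               # current item is used up
                it += 1
                if it == len(items):
                    if need > 0:
                        raise ValueError("items cannot fill all bins")
                    break
                remaining = items[it]
    return parts
```

For example, next_fit_split([5, 3, 4], [6, 6]) fills both bins exactly using 4 parts (= |I| + |B| − 1). An optimal solution may of course use fewer parts by choosing which items go together, which is exactly the hard part of the problem.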
Finally!
It is called bin packing with size-preserving fragmentation (BP-SPF).
There is this paper: Omer Yehezkely, "Approximation Schemes for Packing with Item Fragmentation" (November 2006).

Find the set S which maximizes the minimum distance between points in S ∪ A

I would like to find a set S of given cardinality k maximizing the minimum distance between pairs of points in S ∪ A, for a given set A. Is there a simple algorithm to find the solution of this max-min problem?
Given a universe X ⊆ R^d and A ⊆ X,
find argmax_{S ⊆ X, |S| = k} min_{s ∈ S, s' ∈ S ∪ A, s' ≠ s} distance(s, s')
Thanks!
For a given k, with X and A as input, brute force is obviously in P (as C(|X|, k) is polynomial in |X|).
If k is also an input, then it may depend on 'distance':
If 'distance' is arbitrary, then your problem is equivalent to finding a fixed-size clique in a graph (which is NP-complete):
NP-hardness:
Take an instance of the clique problem, that is, a graph (G, E) and an integer k.
Add to this graph a vertex 'a' connected to every other vertex; let (G', E') be the modified graph.
((G', E'), k+1) is then an instance equivalent to your original instance of the clique problem.
Create a map Phi from G' to R^d (you can map G' onto N anyway ...) and define 'distance' such that distance(Phi(c), Phi(d)) = 1 if (c,d) ∈ E', and 0 otherwise.
X = Phi(G'), A = Phi({a}), k+1 gives you an instance of your problem.
Notice that by construction s ≠ Phi(a) <=> distance(s, Phi(a)) = 1 = max distance(.,.), i.e. min_{s ∈ S, s' ∈ S ∪ A, s' ≠ s} distance(s,s') = min_{s ∈ S, s' ∈ S, s' ≠ s} distance(s,s') if |S| = k ≥ 2.
Solve this instance (X, A, k+1) of your problem: this gives you a set S of cardinality k+1 such that min(distance(s,s') | s, s' ∈ S, s ≠ s') is maximal.
Check whether there are s, s' ∈ S, s ≠ s', with distance(s,s') = 0 (this can be done in O(k^2) time, as |S| = k+1):
if so, then there is no set of cardinality k+1 such that for all s, s', distance(s,s') = 1, i.e. there is no subgraph of G' which is a (k+1)-clique;
if not, then map S back into G'; by definition of the 'distance' there is an edge between any two vertices of Phi^-1(S): it is a (k+1)-clique.
This solves, via a polynomial reduction, a problem equivalent to the clique problem (known to be NP-hard).
NP-easiness:
Let X, A, k be an instance of your problem.
For any subset S of X, min_{s ∈ S, s' ∈ S ∪ A, s' ≠ s} distance(s,s') can only take values in {distance(x,y) | x, y ∈ X}, which has polynomial cardinality.
To find a set that maximizes this distance, we just test every possible distance for a valid set, in decreasing order.
To test a distance d, we first reduce X to X', containing only the points at distance ≥ d from A \ {the point itself}.
Then we create a graph (X', E)¤ where (s,s') ∈ E iff distance(s,s') ≥ d.
Test this graph for a k-clique (this is NP-easy); by construction there is one iff its vertices S form a set with min_{s ∈ S, s' ∈ S ∪ A, s' ≠ s} distance(s,s') ≥ d.
If 'distance' is Euclidean or has some other nice property, the problem might be in P; I don't know, but I wouldn't hope for too much if I were you.
¤ I assumed here that X (and thus X') is finite.
This is probably NP-hard. Greedily choosing the point furthest away from previous choices is a 2-approximation. There might be a complicated approximation scheme for low d based on the scheme for Euclidean TSP by Arora and Mitchell. For d = 10, forget it.
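The greedy 2-approximation mentioned here (farthest-point style, in the spirit of Gonzalez's clustering heuristic) is only a few lines. This sketch assumes a finite X and an arbitrary distance function; the names are my own:

```python
def greedy_dispersion(X, A, k, dist):
    """Greedily build S: repeatedly add the point of X whose minimum
    distance to the already-chosen points (S union A) is largest."""
    S = []
    for _ in range(k):
        best = max((x for x in X if x not in S),
                   key=lambda x: min((dist(x, y) for y in S + A),
                                     default=float("inf")))
        S.append(best)
    return S

# Points 0..10 on a line with A = {0}: the first pick is 10 (farthest
# from 0), the second is 5 (maximizing the min distance to {0, 10}).
S = greedy_dispersion(list(range(11)), [0], 2, lambda a, b: abs(a - b))
```

Each step is O(|X| · (|S| + |A|)) distance evaluations, so the whole heuristic is polynomial, in contrast to the exact max-min problem discussed above.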
The top two results for the search "sphere packing np complete" turned up references to a 1981 paper proving that optimal packing and covering in 2D is NP-complete. I do not have access to a research library, so I can't read the paper, but I expect that your problem can be rephrased as that one, in which case you have a proof that your problem is NP-complete.

String to string correction problem np-completeness proof

I have this assignment to prove that this problem:
Finite alphabet Σ, two strings x, y ∈ Σ*, and a positive integer K. Is there a way to derive the string y from the string x by a sequence of K or fewer operations of single-symbol deletion or adjacent-symbol interchange?
is NP-complete. I already figured out that I have to give a transformation from the decision version of the set covering problem, but I have no clue how to do this. Any help would be appreciated.
It looks like a modified Levenshtein distance. The problem can be solved with DP in quadratic time.
A transformation from minimum set cover (MSC) to this string correction problem is described in:
Robert A. Wagner, "On the complexity of the Extended String-to-String Correction Problem", Proceedings of the Seventh Annual ACM Symposium on Theory of Computing, 1975.
In short, the MSC problem: given finite sets x_1, ..., x_n and an integer L, does there exist a subset J of {1, ..., n} such that |J| ≤ L and
union_{j in J} x_j = union of all x_i?
Let w = union all x_i, let t = |w| and r = t^2, and choose symbols Q, R, S not in w.
Take strings:
A = Q^r R x_1 Q^r S^(r+1) ... Q^r R x_n Q^r S^(r+1)
B = R Q^r ... R Q^r w S^(r+1) ... S^(r+1) (each "..." is repeated n times)
and
k = (L+1)r - 1 + 2t(r+1)(n-1) + n(n-1)(r+1)^2/2 + (r·n + |x_1 ... x_n| - t)·W_d
[W_d is the weight of the delete operation; it can be 1.]
It is shown that the string-to-string correction instance (A, B, k) is satisfiable iff the source MSC instance is.
From the construction of the strings it is clear that the proof is not trivial :-) But it isn't too complex to follow.
The NP-hardness proof mentioned above only works for arbitrarily large alphabets.
For fixed alphabets, the problem is solvable in polynomial time; see
https://dblp.org/rec/bibtex/journals/tcs/Meister15

Find sum in array equal to zero

Given an array of integers, find a set of at least one integer which sums to 0.
For example, given [-1, 8, 6, 7, 2, 1, -2, -5], the algorithm may output [-1, 6, 2, -2, -5], because this is a subset of the input array that sums to 0.
The solution must run in polynomial time.
You'll have a hard time doing this in polynomial time, as the problem is known as the Subset sum problem, and is known to be NP-complete.
If you do find a polynomial solution, though, you'll have solved the "P = NP?" problem, which will make you quite rich.
The closest you get to a known polynomial solution is an approximation, such as the one listed on Wikipedia, which will try to get you an answer with a sum close to, but not necessarily equal to, 0.
This is the Subset Sum problem. It's NP-complete, but there is a pseudo-polynomial time algorithm for it; see the wiki.
The problem can be solved in polynomial time if the sum of the items in the set is polynomially related to the number of items. From the wiki:
The problem can be solved as follows using dynamic programming. Suppose the sequence is x1, ..., xn and we wish to determine if there is a nonempty subset which sums to 0. Let N be the sum of the negative values and P the sum of the positive values. Define the boolean-valued function Q(i,s) to be the value (true or false) of
"there is a nonempty subset of x1, ..., xi which sums to s".
Thus, the solution to the problem is the value of Q(n,0).
Clearly, Q(i,s) = false if s < N or s > P, so these values do not need to be stored or computed. Create an array to hold the values Q(i,s) for 1 ≤ i ≤ n and N ≤ s ≤ P.
The array can now be filled in using a simple recursion. Initially, for N ≤ s ≤ P, set
Q(1,s) := (x1 = s).
Then, for i = 2, …, n, set
Q(i,s) := Q(i − 1,s) or (xi = s) or Q(i − 1,s − xi) for N ≤ s ≤ P.
For each assignment, the values of Q on the right side are already known, either because they were stored in the table for the previous value of i or because Q(i − 1,s − xi) = false if s − xi < N or s − xi > P. Therefore, the total number of arithmetic operations is O(n(P − N)). For example, if all the values are O(n^k) for some k, then the time required is O(n^(k+2)).
This algorithm is easily modified to return the subset with sum 0 if there is one.
This solution does not count as polynomial time in complexity theory because P − N is not polynomial in the size of the problem, which is the number of bits used to represent it. This algorithm is polynomial in the values of N and P, which are exponential in their numbers of bits.
A more general problem asks for a subset summing to a specified value (not necessarily 0). It can be solved by a simple modification of the algorithm above. For the case that each xi is positive and bounded by the same constant, Pisinger found a linear time algorithm.[2]
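A compact Python version of the table-filling idea above (my own sketch): instead of the full boolean table Q(i,s), it keeps one witness subset per reachable sum, which also covers the "easily modified to return the subset" part:

```python
def zero_subset(xs):
    """Return a nonempty subset of xs summing to 0, or None.
    Pseudo-polynomial: at most P - N reachable sums are tracked,
    mirroring the O(n * (P - N)) dynamic program."""
    sums = {}                          # reachable sum -> one witness subset
    for x in xs:
        new = {x: [x]}                 # subsets whose last element is x
        for s, sub in sums.items():
            new.setdefault(s + x, sub + [x])
        for s, sub in new.items():     # merge, keeping earlier witnesses
            sums.setdefault(s, sub)
        if 0 in sums:
            return sums[0]
    return None
```

On the example input [-1, 8, 6, 7, 2, 1, -2, -5] it returns a zero-sum subset; on [1, 2, 4] it returns None, since no nonempty subset of positive numbers sums to 0.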
This is the well-known Subset Sum problem, which is NP-complete.
If you are interested in algorithms, then most probably you are a math enthusiast, so I advise you to look at the Subset Sum problem on MathWorld, where you can find this polynomial-time approximation algorithm:
initialize a list S to contain one element 0
for each i from 1 to N do
    let T be a list consisting of xi + y, for all y in S
    let U be the union of T and S
    sort U
    make S empty
    let y be the smallest element of U
    add y to S
    for each element z of U in increasing order do
        // trim the list by eliminating numbers close to one another
        if y < (1 − c/N)·z then
            set y = z and add z to S
if S contains a number between (1 − c)·s and s, output yes, otherwise no
