I have been trying to solve the problem below for the last two days. I can't think of any solution for it other than brute force. Any kind of hints or references will be appreciated. TIA.
"Given N distinct prime integers, i.e. p1, p2,..., pN, and an interval [L,R], calculate the number of integers in this interval that are divisible by at least one of the given primes."
N is very small (1<=N<=10) and L,R are very big (1<=L<=R<=10^10)
First note that it's easier to restrict the problem and ignore the lower bound (i.e. treat L=1). If we can count the numbers <= X divisible by at least one of the given primes for any X, we can also count them on an interval, by subtracting the count of numbers <= L-1 from the count of numbers <= R.
Given any number x, the count of numbers <= R divisible by x is floor(R / x).
Now, we can apply the inclusion-exclusion principle to get the result. First, I'll show the results by hand for 3 primes p1, p2 and p3, and then give the general result.
The count of numbers <= R divisible by p1, p2 or p3 is:
R / p1 + R / p2 + R / p3
- R / (p1p2) - R / (p1p3) - R / (p2p3)
+ R / (p1p2p3)
(Here / is assumed to be rounding-down integer division).
The general case is as follows:
sum((-1)^(|S|+1) * R / prod(S) for S a non-empty subset of {p1, p2, .., pN}).
Here S ranges over all non-empty subsets of your primes, prod(S) is the product of the primes in the subset, and the leading factor is +1 or -1 depending on whether the subset has odd or even size.
For your problem, N<=10, so there are at most 1023 non-empty subsets, which is a small number of things to iterate over.
Here's some example Python code:
from itertools import chain, combinations

def powerset(iterable):
    # All subsets of the input, from the empty set up to the full set.
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))

def prod(ns):
    r = 1
    for n in ns:
        r *= n
    return r

def divs(primes, N):
    # Count the integers in [1, N] divisible by at least one of the primes.
    r = 0
    for S in powerset(primes):
        if not S: continue
        sign = 1 if len(S) % 2 else -1
        r += sign * (N // prod(S))
    return r

def divs_in_range(primes, L, R):
    return divs(primes, R) - divs(primes, L-1)
Note, that the running time of this code is more-or-less only dependent on the number of primes, and not so much on the magnitudes of L and R.
Assuming n is the interval size and N is const.
For each prime p, there should be roughly (R-L) / p numbers in the interval divisible by the prime.
Finding the first number divisible by p in the interval: L' = L + (p - L % p) % p. (The outer % p keeps L' = L when L is itself divisible by p.)
Now if L' > R, there is none; otherwise there are 1 + floor((R-L') / p).
Example: 3, [10, 20]:
L' = 10 + (3 - 10 % 3) % 3 = 12.
Numbers divisible by 3 in the interval: 1 + floor((20 - 12) / 3) = 3
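The counting step for a single divisor can be sketched in Python as follows. The function name count_multiples is hypothetical, and the first multiple is computed as lo + (-lo) % p so that lo itself is counted when it is already divisible by p.

```python
def count_multiples(p, lo, hi):
    """Count the integers in [lo, hi] that are divisible by p."""
    first = lo + (-lo) % p  # smallest multiple of p that is >= lo
    if first > hi:
        return 0
    return 1 + (hi - first) // p


print(count_multiples(3, 10, 20))  # the example above: 12, 15, 18
```

The running time does not depend on the size of the interval, only on a constant number of arithmetic operations.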
Note: So far we haven't used the fact that p1..pN are primes.
Remaining problem seems to be: How to avoid counting a number divisible by multiple primes multiple times? Example: Assuming we have 3,5 and [10, 20], we need to avoid counting 15 twice...
Maybe we can just count divisibility by (p1*p2) etc. using the counting algorithm above, and reduce the total accordingly? If N is constant, this should still be constant time. Because p1...pN are prime, all their products must be different (as no number can have more than one prime factorization).
You are given N items in total and P groups into which you have to divide them.
The condition is that the product of the numbers of items held by the groups should be maximal.
Example: for N=10 and P=3 you can divide the 10 items as {3,4,3}, since 3x3x4=36 is the maximum possible product.
You will want to form P groups of roughly N / P elements. However, this will not always be possible, as N might not be divisible by P, as is the case for your example.
So form groups of floor(N / P) elements initially. For your example, you'd form:
floor(10 / 3) = 3
=> groups = {3, 3, 3}
Now, take the remainder of the division of N by P:
10 mod 3 = 1
This means you have to distribute 1 more item to your groups (you can have up to P - 1 items left to distribute in general):
for i = 0 up to (N mod P) - 1:
groups[i]++
=> groups = {4, 3, 3} for your example
Which is also a valid solution.
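The two steps above (floor division, then handing out the remainder) can be sketched in Python as follows. The function name split_into_groups is an assumption made for illustration.

```python
from math import prod


def split_into_groups(n, p):
    """Split n items into p groups whose sizes differ by at most one."""
    base, extra = divmod(n, p)  # base = floor(n / p), extra = n mod p
    # The first `extra` groups receive one additional item each.
    return [base + 1] * extra + [base] * (p - extra)


groups = split_into_groups(10, 3)
print(groups, prod(groups))
```

For N=10 and P=3 this yields the groups {4, 3, 3} with product 36, matching the example.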
For fun I worked out a proof of the fact that in an optimal solution either all numbers equal N/P or the numbers are some combination of floor(N/P) and ceiling(N/P). The proof is somewhat long, but proving optimality in a discrete context is seldom trivial. I would be interested if anybody can shorten the proof.
Lemma: For P = 2 the optimal way to divide N is into {N/2, N/2} if N is even and {floor(N/2), ceiling(N/2)} if N is odd.
This follows since the constraint that the two numbers sum to N means that the two numbers are of the form x, N-x.
The resulting product is (N-x)x = Nx - x^2. This is a parabola that opens down. Its max is at its vertex at x = N/2. If N is even this max is an integer. If N is odd, then x = N/2 is a fraction, but such parabolas are strictly unimodal, so the closer x gets to N/2 the larger the product. x = floor(N/2) (or ceiling, it doesn't matter by symmetry) is the closest an integer can get to N/2, hence {floor(N/2),ceiling(N/2)} is optimal for integers.
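General case: First of all, a global max exists since there are only finitely many integer partitions and a finite list of numbers always has a max. Suppose that {x_1, x_2, ..., x_P} is globally optimal. Claim: given any i, j we have
|x_i - x_j| <= 1
In other words: any two numbers in an optimal solution differ by at most 1. This follows immediately from the P = 2 lemma (applied to N = x_i + x_j): if two of the numbers differed by more than 1, replacing them with the floor and ceiling of half their sum would keep the total unchanged and strictly increase the product (assuming N >= P, so that all x_i are positive).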
From this claim it follows that there are at most two distinct numbers among the x_i. If there is only 1 number, that number is clearly N/P. If there are two numbers, they are of the form a and a+1. Let k = the number of x_i which equal a+1, hence P-k of the x_i = a. Hence
(P-k)a + k(a+1) = N, where k is an integer with 1 <= k < P
But simple algebra yields that a = (N-k)/P = N/P - k/P.
Hence -- a is an integer < N/P which differs from N/P by less than 1 (k/P < 1)
Thus a = floor(N/P) and a+1 = ceiling(N/P).
QED
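As a sanity check rather than a replacement for the proof, the claim can be tested by brute force for small N and P. The sketch below assumes positive group sizes (so N >= P), and all function names are hypothetical.

```python
from itertools import combinations_with_replacement
from math import prod


def best_product_bruteforce(n, p):
    """Maximum product over all ways to write n as a sum of p positive integers."""
    return max(prod(parts)
               for parts in combinations_with_replacement(range(1, n + 1), p)
               if sum(parts) == n)


def balanced_product(n, p):
    """Product obtained when every part is floor(n / p) or ceiling(n / p)."""
    base, extra = divmod(n, p)
    return (base + 1) ** extra * base ** (p - extra)


def claim_holds(max_n):
    """Compare both values for every 1 <= p <= n <= max_n."""
    return all(best_product_bruteforce(n, p) == balanced_product(n, p)
               for n in range(1, max_n + 1) for p in range(1, n + 1))


print(claim_holds(9))
```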
I am given a large number n and I need to find whether it can be represented as sum of K prime numbers.
Ex 9 can be represented as sum of 3 prime number as 2+2+5.
I am trying to use variation of subset sum but number is too large to generate all primes number till then.
The problem is from the current HackerRank contest. The restrictions are 1 <= n, K <= 10^12
For K = 1, the answer is obviously "Yes" iff N is prime.
For K = 2, according to the Goldbach conjecture, which is verified for N up to around 10^18, the answer is "Yes" iff N is even and N >= 4, or N is odd and N - 2 is prime.
The interesting case is K = 3. Obviously if N < 6, the answer is "No" because the smallest number expressible as the sum of three primes is 2 + 2 + 2 = 6.
If N >= 6, then either N - 2 or N - 3 is even and >= 4, so we can apply Goldbach's conjecture again.
So for K = 3, the answer is "Yes" simply iff N >= 6.
Via induction (hint: just use K - 3 times the prime 2), we can show that for K >= 3, the answer is "Yes" iff N >= 2*K, so only the cases K = 1 and K = 2 are non-trivial, and they require just a simple primality check, e.g. via Miller–Rabin in O(log^4 N).
EDIT: As a bonus, this proof also gives a constructive algorithm to output the partition. We use a number of 2's and maybe one 3 to get to K = 2. The tricky K = 2, N even case is not as hard as it looks: We know from computational verification of the Goldbach conjecture that for N >= 12, there is a Goldbach partition with a prime < 5200 or so. There are less than 700 such primes, so we can check them all in a reasonable amount of time.
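The case analysis above can be sketched in Python as follows. This is a minimal sketch under two assumptions: the Goldbach conjecture holds in the range of interest (as the answer itself assumes), and the fixed Miller-Rabin bases 2 through 37 are used, which is known to be deterministic for n below 2^64. The function names are hypothetical.

```python
def is_prime(n):
    """Miller-Rabin with fixed bases; deterministic for n < 2**64."""
    if n < 2:
        return False
    bases = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    for p in bases:
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:  # write n - 1 as d * 2**s with d odd
        d //= 2
        s += 1
    for a in bases:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True


def is_sum_of_k_primes(n, k):
    """Decide whether n is a sum of exactly k primes (assumes Goldbach)."""
    if n < 2 * k:  # every prime is at least 2
        return False
    if k == 1:
        return is_prime(n)
    if k == 2:
        # Even n >= 4: Goldbach. Odd n: one prime must be 2.
        return n % 2 == 0 or is_prime(n - 2)
    return True  # k >= 3 and n >= 2k


print(is_sum_of_k_primes(9, 3), is_sum_of_k_primes(11, 2))
```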
The concept you are looking for is called the prime partitions of a number. The formula to compute the number of prime partitions of a number is \kappa(n) = \frac{1}{n}\left(\mathrm{sopf}(n) + \sum_{j=1}^{n-1} \mathrm{sopf}(j) \cdot \kappa(n-j)\right); I gave that in LaTeX notation because I don't know how to do it in html. The sopf(n) function is the sum of the distinct prime factors of n, so sopf(42) = 12, since 42 = 2 * 3 * 7, but sopf(12) = 5, since 12 = 2 * 2 * 3 but each prime factor is counted once.
I discuss this formula at my blog.
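The recurrence for kappa can be evaluated directly, as in the following Python sketch. It counts all prime partitions of n (not only those with exactly K parts), assumes n >= 1, and uses a simple sieve for sopf; the function names are hypothetical.

```python
def sopf_table(limit):
    """sopf[j] = sum of the distinct prime factors of j, for 0 <= j <= limit."""
    sopf = [0] * (limit + 1)
    for p in range(2, limit + 1):
        if sopf[p] == 0:  # no smaller prime divides p, so p is prime
            for multiple in range(p, limit + 1, p):
                sopf[multiple] += p
    return sopf


def prime_partitions(n):
    """Number of ways to write n >= 1 as an unordered sum of primes."""
    sopf = sopf_table(n)
    kappa = [0] * (n + 1)
    for m in range(1, n + 1):
        total = sopf[m] + sum(sopf[j] * kappa[m - j] for j in range(1, m))
        kappa[m] = total // m  # the division is exact
    return kappa[n]


print(prime_partitions(10))  # 5+5, 7+3, 5+3+2, 3+3+2+2, 2+2+2+2+2
```

This takes O(n^2) arithmetic operations, so it is only practical for moderate n.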
Your input are n and K. There are many cases :
2K > n : impossible, since every prime is at least 2
2K = n : the K prime numbers are all 2
2K < n : 4 subcases :
a. n and K are odd
b. n is even, K is odd
c. n is odd, K is even
d. n and K are even
Case a: select an odd prime p < n. The problem reduces to the same problem with input n-p and K-1 instead of n and K respectively, and we fall in case d (both n-p and K-1 are even)
Case b: The problem reduces to the same problem with input n-2 and K-1 instead of n and K respectively, and we fall in case d
Case c: same as b, but we fall in case a instead of d
Case d: if n = 2K, then 2, 2, ..., 2 taken K times is your solution (ie your primes are 2, 2, ..., 2). Otherwise n can be written
n = (\sum_{i=1}^{i=K-2} 2 ) + p + q
where we add the prime 2 (K-2) times in the sum. Then the problem reduces to the same problem with input n-2(K-2) instead of n and 2 instead of K. But this is Goldbach. You can solve it in O(n sqrt(n)) like this : take p and q both equal to n/2. Increment p and decrease q by 1 at each step until they are both prime.
The question is: find the number of solutions to a1 x1 + a2 x2 + ... + an xn = k under the constraints 1) 0 < ai <= 15, 2) 0 < n <= 15, 3) xi >= 0. I was able to formulate a dynamic programming solution, but it runs too long for n > 10^10. Please guide me toward a more efficient solution.
The code
int dp[]=new int[16];
dp[0]=1;
BigInteger seen=new BigInteger("0");
while(true)
{
    for(int i=0;i<arr[0];i++)
    {
        if(dp[0]==0)
            break;
        dp[arr[i+1]]=(dp[arr[i+1]]+dp[0])%1000000007;
    }
    for(int i=1;i<15;i++)
        dp[i-1]=dp[i];
    seen=seen.add(new BigInteger("1"));
    if(seen.compareTo(n)==0)
        break;
}
System.out.println(dp[0]);
arr is the array containing the coefficients, and the answer should be given mod 1000000007 as the number of ways does not fit into an int.
Update for real problem:
The actual problem is much simpler. However, it's hard to be helpful without spoiling it entirely.
Stripping it down to the bare essentials, the problem is
Given k distinct positive integers L1, ... , Lk and a nonnegative integer n, how many different finite sequences (a1, ..., ar) are there such that 1. for all i (1 <= i <= r), ai is one of the Lj, and 2. a1 + ... + ar = n. (In other words, the number of compositions of n using only the given Lj.)
For convenience, you are also told that all the Lj are <= 15 (and hence k <= 15), and n <= 10^18. And, so that the entire computation can be carried out using 64-bit integers (the number of sequences grows exponentially with n, you wouldn't have enough memory to store the exact number for large n), you should only calculate the remainder of the sequence count modulo 1000000007.
To solve such a problem, start by looking at the simplest cases first. The very simplest cases are when only one L is given, then evidently there is one admissible sequence if n is a multiple of L and no admissible sequence if n mod L != 0. That doesn't help yet. So consider the next simplest cases, two L values given. Suppose those are 1 and 2.
0 has one composition, the empty sequence: N(0) = 1
1 has one composition, (1): N(1) = 1
2 has two compositions, (1,1); (2): N(2) = 2
3 has three compositions, (1,1,1);(1,2);(2,1): N(3) = 3
4 has five compositions, (1,1,1,1);(1,1,2);(1,2,1);(2,1,1);(2,2): N(4) = 5
5 has eight compositions, (1,1,1,1,1);(1,1,1,2);(1,1,2,1);(1,2,1,1);(2,1,1,1);(1,2,2);(2,1,2);(2,2,1): N(5) = 8
You may see it now, or need a few more terms, but you'll notice that you get the Fibonacci sequence (shifted by one), N(n) = F(n+1), thus the sequence N(n) satisfies the recurrence relation
N(n) = N(n-1) + N(n-2) (for n >= 2; we have not yet proved that, so far it's a hypothesis based on pattern-spotting). Now, can we see that without calculating many values? Of course: there are two types of admissible sequences, those ending with 1 and those ending with 2. Since that partitioning of the admissible sequences restricts only the last element, the number of admissible sequences summing to n and ending with 1 is N(n-1), and the number of admissible sequences summing to n and ending with 2 is N(n-2).
That reasoning immediately generalises, given L1 < L2 < ... < Lk, for all n >= Lk, we have
N(n) = N(n-L1) + N(n-L2) + ... + N(n-Lk)
with the obvious interpretation if we're only interested in N(n) % m.
Umm, that linear recurrence still leaves calculating N(n) as an O(n) task?
Yes, but researching a few of the mentioned keywords quickly leads to an algorithm needing only O(log n) steps ;)
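One well-known way to reach O(log n) steps for a linear recurrence is to raise its companion matrix to a power by repeated squaring. The answer does not name the technique it has in mind, so the following Python sketch is an assumption about what is meant; the names count_compositions and mat_mul are hypothetical.

```python
MOD = 1_000_000_007


def mat_mul(a, b, mod):
    """Multiply two square matrices modulo mod."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) % mod
             for j in range(size)] for i in range(size)]


def count_compositions(parts, n, mod=MOD):
    """N(n) modulo mod: ordered sums equal to n using only values from parts."""
    parts = sorted(set(parts))
    d = parts[-1]
    # Base values N(0), ..., N(d-1), treating N(t) as 0 for t < 0.
    base = [0] * d
    base[0] = 1
    for t in range(1, d):
        base[t] = sum(base[t - L] for L in parts if L <= t) % mod
    if n < d:
        return base[n] % mod
    # Companion matrix mapping (N(t), ..., N(t-d+1)) to (N(t+1), ..., N(t-d+2)).
    step = [[0] * d for _ in range(d)]
    for L in parts:
        step[0][L - 1] = 1
    for i in range(1, d):
        step[i][i - 1] = 1
    # Binary exponentiation: power = step ** (n - d + 1).
    power = [[int(i == j) for j in range(d)] for i in range(d)]
    e = n - (d - 1)
    while e:
        if e & 1:
            power = mat_mul(power, step, mod)
        step = mat_mul(step, step, mod)
        e >>= 1
    state = base[::-1]  # the vector (N(d-1), ..., N(0))
    return sum(power[0][j] * state[j] for j in range(d)) % mod


print(count_compositions([1, 2], 5))  # the eight compositions listed above
```

With all Lj <= 15 the matrix is at most 15 x 15, so even n = 10^18 needs only about 60 squarings.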
Algorithm for misinterpreted problem, no longer relevant, but may still be interesting:
The question looks a little SPOJish, so I won't give a complete algorithm (at least, not before I've googled around a bit to check if it's a contest question). I hope no restriction has been omitted in the description, such as that permutations of such representations should only contribute one to the count, that would considerably complicate the matter. So I count 1*3 + 2*4 = 11 and 2*4 + 1*3 = 11 as two different solutions.
Some notations first. For m-tuples of numbers, let < | > denote the canonical bilinear pairing, i.e.
<a|x> = a_1*x_1 + ... + a_m*x_m. For a positive integer B, let A_B = {1, 2, ..., B} be the set of positive integers not exceeding B. Let N denote the set of natural numbers, i.e. of nonnegative integers.
For 0 <= m, k and B > 0, let C(B,m,k) = card { (a,x) \in A_B^m × N^m : <a|x> = k }.
Your problem is then to find \sum_{m = 1}^{15} C(15,m,k) (modulo 1000000007).
For completeness, let us mention that C(B,0,k) = if k == 0 then 1 else 0, which can be helpful in theoretical considerations. For the case of a positive number of summands, we easily find the recursion formula
C(B,m+1,k) = \sum_{j = 0}^k C(B,1,j) * C(B,m,k-j)
By induction, C(B,m,_) is the convolution¹ of m factors C(B,1,_). Calculating the convolution of two known functions up to k is O(k^2), so if C(B,1,_) is known, that gives an O(n*k^2) algorithm to compute C(B,m,k), 1 <= m <= n. Okay for small k, but our galaxy won't live to see you calculating C(15,15,10^18) that way. So, can we do better? Well, if you're familiar with the Laplace-transformation, you'll know that an analogous transformation will convert the convolution product to a pointwise product, which is much easier to calculate. However, although the transformation is in this case easy to compute, the inverse is not. Any other idea? Why, yes, let's take a closer look at C(B,1,_).
C(B,1,k) = card { a \in A_B : (k/a) is an integer }
In other words, C(B,1,k) is the number of divisors of k not exceeding B. Let us denote that by d_B(k). It is immediately clear that 1 <= d_B(k) <= B. For B = 2, evidently d_2(k) = 1 if k is odd, 2 if k is even. d_3(k) = 3 if and only if k is divisible by 2 and by 3, hence iff k is a multiple of 6, d_3(k) = 2 if and only if one of 2, 3 divides k but not the other, that is, iff k % 6 \in {2,3,4} and finally, d_3(k) = 1 iff neither 2 nor 3 divides k, i.e. iff gcd(k,6) = 1, iff k % 6 \in {1,5}. So we've seen that d_2 is periodic with period 2, d_3 is periodic with period 6. Generally, like reasoning shows that d_B is periodic for all B, and the minimal positive period divides B!.
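The periodicity of d_B can be checked numerically with a short Python sketch. It relies on the observation that a divides k exactly when a divides k + P for any P that is a multiple of a, so lcm(1, ..., B) is a period of d_B. The function names are hypothetical.

```python
from functools import reduce
from math import gcd


def d(B, k):
    """Number of divisors of k among 1, ..., B (every a divides k = 0)."""
    return sum(1 for a in range(1, B + 1) if k % a == 0)


def lcm_upto(B):
    """Least common multiple of 1, ..., B."""
    return reduce(lambda x, y: x * y // gcd(x, y), range(1, B + 1), 1)


print([d(3, k) for k in range(6)])  # one full period of d_3
print(lcm_upto(15))
```

For B = 15 the period lcm(1, ..., 15) is several orders of magnitude smaller than 15!.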
Given any positive period P of C(B,1,_) = d_B, we can split the sum in the convolution (k = q*P+r, 0 <= r < P):
C(B,m+1, q*P+r) = \sum_{c = 0}^{q-1} (\sum_{j = 0}^{P-1} d_B(j)*C(B,m,(q-c)*P + (r-j)))
+ \sum_{j = 0}^r d_B(j)*C(B,m,r-j)
The functions C(B,m,_) are no longer periodic for m >= 2, but there are simple formulae to obtain C(B,m,q*P+r) from C(B,m,r). Thus, with C(B,1,_) = d_B and C(B,m,_) known up to P, calculating C(B,m+1,_) up to P is an O(P^2) task², getting the data necessary for calculating C(B,m+1,k) for arbitrarily large k, needs m such convolutions, hence that's O(m*P^2).
Then finding C(B,m,k) for 1 <= m <= n and arbitrarily large k is O(n^2*P^2) in time and O(n^2*P) in space.
For B = 15, we have 15! = 1.307674368 * 10^12, so using that for P isn't feasible. Fortunately, the smallest positive period of d_15 is much smaller, so you get something workable. From a rough estimate, I would still expect the calculation of C(15,15,k) to take time more appropriately measured in hours than seconds, but it's an improvement over O(k) which would take years (for k in the region of 10^18).
¹ The convolution used here is (f \ast g)(k) = \sum_{j = 0}^k f(j)*g(k-j).
² Assuming all arithmetic operations are O(1); if, as in the OP, only the residue modulo some M > 0 is desired, that holds if all intermediate calculations are done modulo M.
I already have prime factorization (for integers), but now I want to implement it for gaussian integers but how should I do it? thanks!
This turned out to be a bit verbose, but I hope it fully answers your question...
A Gaussian integer is a complex number of the form
G = a+bi
where i^2 = -1, and a and b are integers.
The Gaussian integers form a unique factorization domain. Some of them act as units (e.g. 1, -1, i, and -i), some as primes (e.g. 1 + i), and the rest composite, that can be decomposed as a product of primes and units that is unique, aside from the order of factors and the presence of a set of units whose product is 1.
The norm of such a number G is defined as an integer: norm(G) = a^2 + b^2.
It can be shown that the norm is a multiplicative property, that is:
norm(I*J) = norm(I)*norm(J)
So if you want to factor a Gaussian integer G, you could take advantage of the fact that any Gaussian integer I that divides G must satisfy the property that norm(I) divides norm(G), and you know how to find the factors of norm(G).
The primes of the Gaussian integers fall into three categories:
1 +/- i, with norm 2,
a +/- bi, with prime norm a^2+b^2 congruent to 1 mod 4,
a, where a is a prime congruent to 3 mod 4, with norm a^2
Now to turn this into an algorithm...if you want to factor a Gaussian integer G,
you can find its norm N, and then factor that into prime integers. Then
we work our way down this list, peeling off prime factors p of N that correspond
to prime Gaussian factors q of our original number G.
There are only three cases to consider, and two of them are trivial.
If p = 2, let q = (1+i). (Note that q = (1-i) would work equally well, since they only differ by a unit factor.)
If p = 3 mod 4, q = p. But the norm of q is p^2, so we can strike
another factor of p from the list of remaining factors of norm(G).
The p = 1 mod 4 case is the only one that's a little tricky to deal with.
It's equivalent to the problem of expressing p as the sum of two squares:
if p = a^2 + b^2, then a+bi and a-bi form a conjugate pair of Gaussian
primes with norm p, and one of them will be the factor we're looking for.
But with a little number theory, it turns out not to be too difficult.
Consider the integers mod p. Suppose we can find an integer k such
that k^2 = -1 mod p. Then k^2+1 = 0 mod p, which is equivalent to
saying that p divides k^2+1 in the integers (and therefore also the
Gaussian integers). In the Gaussian integers, k^2+1 factors into
(k+i)(k-i). p divides the product, but does not divide either
factor. Therefore, p has a nontrivial GCD with each of the
factors (k+i) and (k-i), and that GCD or its conjugate is the
factor we're looking for!
But how do we find such an integer k? Let n be some integer between
2 and p-1 inclusive. Calculate n^((p-1)/2) mod p -- this value will be either
1 or -1. If -1, then k = n^((p-1)/4) mod p, otherwise try a different n.
Nearly half the possible values of n will give us a square root of
-1 mod p, so it won't take many guesses to find a value of k that
works.
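The search for k can be sketched in Python using the built-in three-argument pow for modular exponentiation. The function name sqrt_minus_one is hypothetical, and the sketch assumes p is a prime congruent to 1 mod 4; it simply tries n = 2, 3, ... in order rather than guessing at random.

```python
def sqrt_minus_one(p):
    """Return k with k*k congruent to -1 mod p, for a prime p = 1 mod 4."""
    for n in range(2, p):
        # n^((p-1)/2) is -1 mod p exactly when n is not a square mod p.
        if pow(n, (p - 1) // 2, p) == p - 1:
            return pow(n, (p - 1) // 4, p)
    raise ValueError("no suitable n found; is p a prime congruent to 1 mod 4?")


for p in (5, 17, 53):
    k = sqrt_minus_one(p)
    print(p, k, (k * k + 1) % p)
```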
To find the Gaussian primes with norm p, just use Euclid's algorithm
(slightly modified to work with Gaussian integers) to compute the GCD
of (p, k+i). That gives one trial divisor. If it evenly divides the
Gaussian integer we're trying to factor (remainder = 0), we're done.
Otherwise, its conjugate is the desired factor.
Euclid's GCD algorithm for Gaussian integers is almost identical to
that for normal integers. Each iteration consists of a trial division
with remainder. If we're looking for gcd(a,b),
q = floor(a/b), remainder = a - q*b, and if the remainder is nonzero
we return gcd(b,remainder).
In the integers, if we get a fraction as the quotient, we round it toward zero.
In the Gaussian integers, if the real or imaginary parts of the quotient are fractions, they get rounded to the nearest integer. Besides that, the
algorithms are identical.
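Euclid's algorithm with rounded division can be sketched in Python as follows, representing a Gaussian integer a+bi as the tuple (a, b). The helper names are hypothetical, and the result is only determined up to a unit factor (1, -1, i or -i).

```python
def g_mul(a, b):
    """Product of two Gaussian integers given as (real, imag) tuples."""
    return (a[0] * b[0] - a[1] * b[1], a[0] * b[1] + a[1] * b[0])


def round_div(n, d):
    """Nearest integer to n / d for d > 0, using integer arithmetic only."""
    return (2 * n + d) // (2 * d)


def g_divmod(a, b):
    """Quotient (rounded to the nearest Gaussian integer) and remainder of a / b."""
    norm_b = b[0] * b[0] + b[1] * b[1]
    num = g_mul(a, (b[0], -b[1]))  # a / b = a * conjugate(b) / norm(b)
    q = (round_div(num[0], norm_b), round_div(num[1], norm_b))
    qb = g_mul(q, b)
    return q, (a[0] - qb[0], a[1] - qb[1])


def g_gcd(a, b):
    """Greatest common divisor of two Gaussian integers, up to a unit."""
    while b != (0, 0):
        _, r = g_divmod(a, b)
        a, b = b, r
    return a


print(g_gcd((5, 0), (2, 1)))
print(g_gcd((17, 0), (13, 1)))
```

Rounding to the nearest integer keeps the norm of the remainder at most half the norm of the divisor, which is what guarantees termination.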
So the algorithm for factoring a Gaussian integer G looks something like
this:
Calculate norm(G), then factor norm(G) into primes p1, p2 ... pn.
For each remaining factor p:
    if p = 2, q = (1 + i).
        strike p from the list of remaining primes.
    else if p mod 4 = 3, q = p, and strike 2 copies of p from the list of primes.
    else find k such that k^2 = -1 mod p, then u = gcd(p, k+i)
        if G/u has remainder 0, q = u
        else q = conjugate(u)
        strike p from the list of remaining primes.
    Add q to the list of Gaussian factors.
    Replace G with G/q.
endfor
At the end of this procedure, G is a unit with norm 1. But it's not necessarily
1 -- it could be -1, i, or -i, in which case add G to the list of factors,
to make the signs come out right when you multiply all the factors to
see if the product matches the original value of G.
Here's a worked example: factor G = 361 - 1767i over the Gaussian integers.
norm(G) = 3252610 = 2 * 5 * 17 * 19 * 19 * 53
Considering 2, we try q = (1+i), and find G/q = -703 - 1064i with remainder 0.
G <= G/q = -703 - 1064i
Considering 5, we see it is congruent to 1 mod 4. We need to find a good k.
Trying n=2, n^((p-1)/2) mod p = 2^2 mod p = 4. 4 is congruent to -1 mod 5. Success! k = 2^1 = 2. u = gcd(5, 2+i) which works out to be 2+i.
G/u = -494 - 285i, with remainder 0, so we find q = 2+i.
G <= G/q = -494 - 285i
Considering 17, it is also congruent to 1 mod 4, so we need to find another
k mod 17. Trying n=2, 2^8 = 1 mod 17, no good. Try n=3 instead.
3^8 = 16 mod 17 = -1 mod 17. Success! So k = 3^4 = 13 mod 17.
gcd(17, 13+i) = u = 4-i, G/u = -99 -96i with remainder -2. No good,
so try conjugate(u) = 4+i. G/u = -133 - 38i with remainder 0. Another factor!
G <= G/(4+i) = -133 - 38i
Considering 19, it is congruent to 3 mod 4. So our next factor is 19, and
we strike the second copy of 19 from the list.
G <= G/19 = -7 - 2i
Considering 53, it is congruent to 1 mod 4. Again with the k process...
Trying n=2, 2^26 = 52 mod 53 = -1 mod 53. Then k = 2^13 mod 53 = 30.
gcd(53,30+i) = u = -7 - 2i. That's identical to G, so the final
quotient G/(-7-2i) = 1, and there are no factors of -1, i, or -i to worry
about.
We have found factors (1+i)(2+i)(4+i)(19+0i)(-7-2i). And if you multiply
that out (left as an exercise for the reader...), lo and behold, the
product is 361-1767i, which is the number we started with.
Ain't number theory sweet?
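For readers who would rather not multiply the factors by hand, the product from the worked example can be checked with a few lines of Python. Gaussian integers are represented as (real, imag) tuples, and g_mul is a hypothetical helper name.

```python
from functools import reduce


def g_mul(a, b):
    """Product of two Gaussian integers given as (real, imag) tuples."""
    return (a[0] * b[0] - a[1] * b[1], a[0] * b[1] + a[1] * b[0])


# The factors found in the worked example: (1+i)(2+i)(4+i)(19)(-7-2i).
factors = [(1, 1), (2, 1), (4, 1), (19, 0), (-7, -2)]
product = reduce(g_mul, factors, (1, 0))
print(product)  # (361, -1767), i.e. 361 - 1767i
```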
Use floating point for the real and imaginary components if you want full single cell integer accuracy, and define gsub, gmul and a special division gdivr with rounded coefficients, not floored. That's because the Pollard rho factorization method needs gcd via Euclid's algorithm, with a slightly modified gmodulo:
gmodulo((x,y),(x',y'))=gsub((x,y),gmul((x',y'),gdivr((x,y),(x',y'))))
Pollard rho
def poly((a,b),(x,y))=gmodulo(gsub(gmul((a,b),(a,b)),(1,0)),(x,y))
input (x,y),(a,b) % (x,y) is the Gaussian number to be factorized
(c,d)<-(a,b)
do
  (a,b)=poly((a,b),(x,y))
  (c,d)=poly(poly((c,d),(x,y)),(x,y))
  (e,f)=ggcd((x,y),gsub((a,b),(c,d)))
  if (e,f)=(x,y) then return (x,y) % failure, try other (a,b)
until e^2+f^2>1
return (e,f)
A normal start value is a=1, b=0.
I have used this method programmed in Forth on my blog http://forthmath.blogspot.se
For safety, use rounded values in all calculations while using floating points for integers.