Choosing M of N packets so sum is minimal multiple of K - algorithm

I saw this problem on Codechef.
There are N packets, each containing some candies (e.g. the 1st contains 10, the 2nd contains 4, and so on).
We have to select exactly M packets (M <= N) such that the total number of candies is divisible by K.
If there is more than one solution, output the one with the lowest number of candies.
I thought it's similar to the Subset Sum problem, but that is NP-hard, so it would take exponential time.
I don't want the complete solution to this problem; an algorithm would be appreciated. I have been thinking about it for 2 days but can't work out the correct logic.
1 ≤ M ≤ N ≤ 50000, 1 ≤ K ≤ 20
Number of Candies in each packet [1,10^9]

Let packets be the array of packet sizes.
Partition k into sums of p = 1, 2, ..., m numbers >= 1 and < k (there are O(2^k) such partitions). For each partition, iterate over packets and add those numbers whose remainder modulo k is one of the partition's elements, then remove that element from the partition. Keep the minimum sum for the partition, and update a global minimum. Note that if m > p, you must also pick m - p numbers whose remainder is zero.
You might be thinking this is O(2^k * n) and too slow, but you don't actually have to iterate over the packets array for each partition if you keep num[r] = how many packets have packets[i] % k == r, in which case it becomes O(2^k + n). To handle the minimum sum requirement too, you can keep num[r] = the sorted list of the packets that have packets[i] % k == r, which lets you always pick the smallest numbers for a valid partition.

Have a look again at http://en.wikipedia.org/wiki/Subset_sum_problem#Pseudo-polynomial_time_dynamic_programming_solution and note that K is relatively small. Furthermore, although N can be large, all you care about in the sums that involve N is the answer mod K. So there is a dynamic programming solution lurking around here, where at each step you have K possible values mod K, and you keep track of which of these values are currently attainable.
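One way to read the hint above is as a table indexed by (number of packets chosen, sum mod K) that stores the smallest total reaching that state. The sketch below is only an illustration of that idea: the function name min_sum_multiple is made up, and the straightforward version costs O(N * M * K) time, so it is fine for small inputs but would need further pruning for the stated limits.

```python
import math

def min_sum_multiple(packets, m, k):
    """Smallest total of exactly m packets that is divisible by k, or None."""
    # dp[j][r] = minimum sum using exactly j packets with sum % k == r
    dp = [[math.inf] * k for _ in range(m + 1)]
    dp[0][0] = 0
    for c in packets:
        # Go through j downwards so each packet is used at most once.
        for j in range(m, 0, -1):
            for r in range(k):
                prev = dp[j - 1][r]
                if prev == math.inf:
                    continue
                nr = (r + c) % k
                if prev + c < dp[j][nr]:
                    dp[j][nr] = prev + c
    return None if dp[m][0] == math.inf else dp[m][0]
```

For example, with packets [10, 4, 6, 2], M = 2 and K = 5, the only pair with a sum divisible by 5 is 4 + 6.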

Related

The largest number of subsets for a set

Given a set of X numbers, each less than or equal to Y, which may contain repeated numbers:
Which algorithm gives the maximum number of subsets whose element sums are each greater than or equal to Y, where no element of one subset may appear in another subset and no subset may use the same element twice?
(Note: if two numbers in the initial set are equal, each counts as a distinct element.)
Subsets can group elements into pairs, trios, quartets, or any other size.
Using two nested for loops to search for the best combination worked for pairs, but since trios and larger groups are possible, cases like "1 1 1 1 1 7 8" would end up suboptimal.
You could implement a 'brute force' method and go through every possible partitioning and check if it satisfies your requirements. This would be quite simple, but horribly inefficient except for trivial cases.
Suppose you have N elements e_i in your set S, with 0 <= e_i <= Y. Choose numparts as the number of partitions you are going to try to create, each with element sum >= Y. If sum e_i >= Y, we can set numparts = 1 initially; otherwise the answer is obviously zero.
Then you can generate partitions by creating an array of N elements p_i where 0 <= p_i < numparts. There are no more than numparts^N such partitions. Now you have to try to find one in which, for all 0 <= j < numparts, sum {e_i : p_i = j} >= Y. If you find one, increment numparts and try again; if you don't, the answer is the largest numparts value for which you did find a qualifying partition.
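A minimal sketch of that exhaustive search, assuming Y > 0 and a very small N (it enumerates up to numparts^N assignments). The name max_groups is made up for illustration.

```python
from itertools import product

def max_groups(elements, y):
    """Largest number of disjoint groups whose sums are all >= y (brute force)."""
    n = len(elements)
    best = 0
    numparts = 1
    while numparts <= n:
        found = False
        # assign[i] is the group index p_i given to element i
        for assign in product(range(numparts), repeat=n):
            sums = [0] * numparts
            for e, p in zip(elements, assign):
                sums[p] += e
            if all(s >= y for s in sums):
                found = True
                break
        if not found:
            break  # if numparts groups are impossible, more groups are too
        best = numparts
        numparts += 1
    return best
```

Taking Y = 10 as an assumed threshold for the question's example "1 1 1 1 1 7 8", the total is 20, so at most two groups are possible, and {8, 1, 1} with {7, 1, 1, 1} achieves that.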
You could improve the efficiency of this approach significantly by avoiding lots of partitions that don't have a sum >= Y. There are 'only' 2^N distinct subsets of S so the number of subsets with sums >=Y must be less than or equal to 2^N. For each such subset S_k, you can try to find the maximum number of partitions of S - S_k each with sums >= Y which is just a recursion of the same problem. This would give you the absolute maximum result you're looking for, but would still be a computational nightmare for non-trivial problems.
A quick-but-suboptimal algorithm is simply to sort the array in ascending order, then partition according to the partition sum as you process the sorted elements sequentially. e.g.
Suppose s[i] are the elements in the sorted array:
partitionno = 0;
partitionsum = 0;
for (i = 0; i < N; i++) {
    partitionsum += s[i];
    if (partitionsum >= Y) {
        /* the current group has reached Y: close it and start a new one */
        partitionsum = 0;
        partitionno++;
    }
}
giving partitionno subsets, each having a sum of at least Y. Sorting takes O(N log N) time with a comparison sort, and the loop above is O(N), so you could use this for N in the millions or more.
This is strongly NP-hard, since it contains as a special case the restricted 3-partition problem: dividing a set into triplets that each have the same sum, where all numbers lie between a quarter and a half of that sum. That special case is known to be strongly NP-hard.
Therefore there is no known efficient (polynomial-time) algorithm to solve it exactly, and finding one would be a really big deal.

Given a number N and an array A. Check if N can be expressed as a product of one or more array elements

Given a number N (where N <= 10^18) and an array A (consisting of at most 20 elements), I have to tell whether it is possible to form N by multiplying some elements of the array. Note that I can use any element multiple times.
Example: N = 8 and A = {2, 3}. Here, 8 = 2 * 2 * 2. So the answer is YES. But if N = 15, then I can't form 15 as a product of one or more elements using them any number of times. So in this case the answer is NO.
How can I approach this problem?
Simple pseudocode (Python-like):
def can_form(num, A):
    # Only elements that divide num can ever appear in a product equal to num.
    A_divisors = {x for x in A if num % x == 0}
    if num in A_divisors:
        return True  # a single element already equals num
    candidates = set(A_divisors)
    seen = set(A_divisors)
    while candidates:
        new_candidates = set()
        for y in candidates:
            for x in A_divisors:
                p = x * y
                if p == num:
                    return True
                # keep only products that still divide num and are new
                if num % p == 0 and p not in seen:
                    new_candidates.add(p)
                    seen.add(p)
        candidates = new_candidates
    return False
Complexity: O(|A| * k * log k), with k being the number of divisors of N. The log k factor is the cost of adding to the set and checking membership; with a hash-based set these are O(1) on average and the factor can be dropped. I am also assuming the % and * operations are O(1).
Since you show no code or algorithm, I'll just give one idea. If you want more help, please show more of your own work on the problem.
Note that N can be at most 60 bits long. This is small enough that N could be decomposed into its prime factors pretty quickly. So first work up a good factoring algorithm for numbers of that size.
Your algorithm would factor N and each of the elements in your array A. If there is any prime factor of N that does not divide into any element of A then your answer is NO. This is the case in your example of N = 15.
Now you work with the prime factors and their exponents in N and in the elements of A. Now you want to find a subset (or, more properly, a sub-multiset) of A where the exponents for each prime add up to that in N. This greatly reduces the sizes of your numbers thus makes the problem easier.
That last part is not trivial. Work more on this problem and show us some of your work, then we can continue helping you.
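For what it's worth, here is one possible sketch of the exponent-vector idea, not necessarily what the answer had in mind. It uses plain trial division, which is fine for small inputs but far too slow for a worst-case 10^18, so a faster factoring routine would have to be swapped in. The names factorize and can_form_by_exponents are made up for illustration.

```python
from functools import lru_cache

def factorize(n):
    """Prime factorization by trial division (sketch only, slow for large n)."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def can_form_by_exponents(n, a):
    if n == 1:
        return 1 in a
    target_factors = factorize(n)
    primes = sorted(target_factors)
    target = tuple(target_factors[p] for p in primes)
    # Elements that do not divide n can never be used; 1 never changes a product.
    vectors = []
    for x in a:
        if x > 1 and n % x == 0:
            f = factorize(x)
            vectors.append(tuple(f.get(p, 0) for p in primes))

    @lru_cache(maxsize=None)
    def reach(rem):
        # rem = exponents still to be covered; all zero means n is formed
        if not any(rem):
            return True
        for v in vectors:
            if all(r >= e for r, e in zip(rem, v)):
                if reach(tuple(r - e for r, e in zip(rem, v))):
                    return True
        return False

    return reach(target)
```

The memoized search works on exponent tuples, whose entries are at most about 60 for N <= 10^18, rather than on the large products themselves.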
You can follow below approach:
1. Form 2 queues: Q2 and Q3.
2. Add 2 to Q2 and 3 to Q3.
3. Take the minimum of the heads of both queues, say h. Remove h from the corresponding queue. Check whether it is equal to the number N. If yes, return true. If it is greater than N, return false.
4. If it is less than N, then add 2*h to Q2 and 3*h to Q3. Repeat steps 3 to 4.
Please note that when the minimum h comes from Q3, you need not add 2*h to Q2, because that element has already been added to Q3 earlier (I will leave it to you to deduce why). Keep repeating this procedure until h is greater than N.
If you have more such numbers, you can form queues for them as well. I think this is an optimal solution in case you have more numbers to process.
Can you guess the time and space complexity of this?
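A small sketch of the two-queue procedure for the specific case A = {2, 3}, including the rule that 2*h is only pushed when h came from Q2. The function name can_form_2_3 is made up for illustration.

```python
from collections import deque

def can_form_2_3(n):
    """True if n is a product of one or more factors taken from {2, 3}."""
    q2, q3 = deque([2]), deque([3])
    while True:
        # Q2 only ever holds powers of 2 and Q3 only multiples of 3,
        # so the two heads are never equal.
        if q2[0] < q3[0]:
            h = q2.popleft()
            from_q2 = True
        else:
            h = q3.popleft()
            from_q2 = False
        if h == n:
            return True
        if h > n:
            return False
        if from_q2:
            q2.append(2 * h)  # skipped for Q3 values to avoid duplicates
        q3.append(3 * h)
```

Both queues stay sorted because values are popped in increasing order and each push multiplies by a fixed factor, so the heads always give the next smallest product.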

How many times variable m is updated

Given the following pseudo-code, the question is how many times on average is the variable m being updated.
A[1...n]: array with n random elements
m = A[1]
for i = 2 to n do
    if A[i] < m then m = A[i]
end for
One might answer that since all elements are random, then the variable will be updated on average on half the number of iterations of the for loop plus one for the initialization.
However, I suspect that there must be a better (and possibly the only correct) way to prove it using binomial distribution with p = 1/2. This way, the average number of updates on m would be
$M = 1 + \sum_{k=1}^{n-1} k \, C_{n,k} \, p^{k} (1-p)^{n-k}$
where $C_{n,k}$ is the binomial coefficient. I tried to evaluate this but got stuck after a few steps, since I do not know how to continue.
Could someone explain which of the two answers is correct and, if it is the second one, show me how to calculate M?
Thank you for your time
Assuming the elements of the array are distinct, the expected number of updates of m is the nth harmonic number, Hn, which is the sum of 1/k for k ranging from 1 to n.
The summation formula can also be represented by the recursion:
H1 = 1
Hn = Hn−1 + 1/n (n > 1)
It's easy to see that the recursion corresponds to the problem.
Consider all permutations of n−1 numbers, and assume that the expected number of assignments is Hn−1. Now, every permutation of n numbers consists of a permutation of n−1 numbers, with a new smallest number inserted in one of n possible insertion points: either at the beginning, or after one of the n−1 existing values. Since it is smaller than every number in the existing series, it will only be assigned to m in the case that it was inserted at the beginning. That has a probability of 1/n, and so the expected number of assignments of a permutation of n numbers is Hn−1 + 1/n.
Since the expected number of assignments for a vector of length one is obviously 1, which is H1, we have an inductive proof of the recursion.
Hn is asymptotically equal to ln n + γ, where γ is the Euler-Mascheroni constant, approximately 0.577. So it increases without limit, but quite slowly.
The values for which m is updated are called left-to-right minima (the mirror image of left-to-right maxima, which is the more common search term), and you'll probably find more information about them by searching for those terms.
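If you want to convince yourself of the harmonic-number result, a quick exact check for small n is to average the update count over every permutation and compare it with Hn. This is only a sanity-check sketch; the helper names are made up, and the initial assignment m = a[1] is counted as one update, as in the question.

```python
from fractions import Fraction
from itertools import permutations

def count_updates(a):
    """Number of assignments to m, counting the initial m = a[0]."""
    m = a[0]
    updates = 1
    for x in a[1:]:
        if x < m:
            m = x
            updates += 1
    return updates

def average_updates(n):
    """Exact average over all permutations of n distinct values."""
    perms = list(permutations(range(n)))
    return Fraction(sum(count_updates(p) for p in perms), len(perms))

def harmonic(n):
    return sum(Fraction(1, k) for k in range(1, n + 1))

for n in range(1, 7):
    print(n, average_updates(n), harmonic(n))
```

Exact fractions are used so that the two columns can be compared without rounding noise.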
I liked @rici's answer, so I decided to elaborate on its central argument a little to make it clearer to me.
Let H[k] be the expected number of assignments needed to compute the min m of an array of length k, as indicated in the algorithm under consideration. We know that
H[1] = 1.
Now assume we have an array of length n > 1. The min can be in the last position of the array or not. It is in the last position with probability 1/n. It is not with probability 1 - 1/n. In the first case the expected number of assignments is H[n-1] + 1. In the second, H[n-1].
If we multiply the expected number of assignments of each case by their probabilities and sum, we get
H[n] = (H[n-1] + 1)*1/n + H[n-1]*(1 - 1/n)
= H[n-1]*1/n + 1/n + H[n-1] - H[n-1]*1/n
= 1/n + H[n-1]
which shows the recursion.
Note that the argument requires the min to be either in the last position or in one of the first n-1 positions, not in both. Thus we are using the fact that all the elements of the array are different.

Count "cool" divisors of given number N

I'm trying to solve a pretty complex problem involving divisors and number theory.
For a given number m, we say that k is a cool divisor if k < m, k | m (k divides m evenly), and, for a given number n, k^n (k to the power of n) is not a divisor of m. Let s(x) be the number of cool divisors of x.
Now for given a and b we should find D = s(a) + s(a+1) + s(a+2) + s(a+3) + ... + s(a+b).
Limits for all values:
(1 <= a <= 10^6), (1 <= b <= 10^7), (2<=n<=10)
Example
Let's say a=32, b=1, n=3;
x = 32, n = 3 divisors of 32 are {1,2,4,8,16,32}. However only {4,8,16} fill the conditions so s(32) = 3
x = 33, n = 3 divisors of 33 are {1,3,11,33}. Only the numbers {3,11} fill the conditions so s(33)=2;
D = s(32) + s(33) = 3 + 2 = 5
What I have tried
We should answer all those questions for 100 test cases in 3 seconds time limit.
I have two ideas. The first one: I iterate over the interval [a, a+b], and for each value i in the range I count its cool divisors. Each count takes O(sqrt(i)) if computing the n-th power is considered O(1), so the total is O(B*sqrt(B)).
The second one, which I'm not sure will work or how fast it will be: first I do a precomputation with a for loop that iterates from 1 to N, where N = 10^7.
For each i in the range [2, N] I visit every number j in that range that has i as a divisor, and if i to the power of n is not a divisor of j, I record that j has one more cool divisor. With this I think the precomputation will be O(NlogN) and answering a query O(B).
Your first idea works but you can improve it.
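Instead of checking all numbers from 1 to sqrt(N) to see whether they are cool divisors, you can factorize N = p0^q0 * p1^q1 * p2^q2 * ... * pk^qk. The number of cool divisors is then (q0+1)(q1+1)...(qk+1) - (q0/n+1)(q1/n+1)...(qk/n+1) - 1, where q/n is integer division and the final -1 removes N itself, which the condition k < N rules out (for N = 1 the count is simply 0). For N = 32 and n = 3 this gives 6 - 2 - 1 = 3.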
So you can first preprocess and find out all the prime numbers using some existing algo like Sieve of Eratosthenes and for each number N between [a,a+b] you do a factorization. The complexity should be roughly O(BlogB).
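As a rough sketch of the factorization approach: count all divisors, subtract the divisors k with k^n | m, and subtract one more for k = m itself, which the condition k < m excludes. Plain trial division stands in for a sieve-based factorization here, and the names cool_divisor_count and brute_force_count are made up for illustration.

```python
def cool_divisor_count(m, n):
    """s(m): divisors k < m of m such that k**n does not divide m."""
    if m == 1:
        return 0
    total = 1      # number of divisors of m
    non_cool = 1   # number of divisors k with k**n dividing m
    x, p = m, 2
    while p * p <= x:
        if x % p == 0:
            q = 0
            while x % p == 0:
                x //= p
                q += 1
            total *= q + 1
            non_cool *= q // n + 1
        p += 1
    if x > 1:
        # leftover prime with exponent 1
        total *= 2
        non_cool *= 1 // n + 1
    return total - non_cool - 1  # the -1 drops k == m

def brute_force_count(m, n):
    """Direct check of the definition, for comparison on small m."""
    return sum(1 for k in range(1, m) if m % k == 0 and m % (k ** n) != 0)
```

On the question's example this gives s(32) = 3 and s(33) = 2 for n = 3, matching the direct count.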
Your second idea works as well.
For each number i in [2,a+b], you can just go through the multiples of i in [a,a+b] and see whether i is a cool divisor of those multiples. The complexity should be O(BlogB) as well. One trick to speed this up is that you don't need divide/mod operations every time to check whether i is a cool divisor. You can compute the first number m in [a, a+b] such that i^n | m; this is m = ceiling(a/(i^n)) * (i^n). Then you know that i^n | m+p*i does not hold for p in [1, i^(n-1) - 1] and holds again for p = i^(n-1). Basically, i fails to be a cool divisor exactly once every i^(n-1) multiples, and you do not need divide/mod to figure that out, which will speed the program up.
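A variation on this idea, swapping the walk over multiples for closed-form counts: for each i, the number of multiples of i in [a, a+b] and the number of multiples of i^n in the same range can both be obtained with integer division, so D is a single pass over i. This is a sketch under that substitution, not the answer's exact loop, and the name cool_divisor_sum is made up.

```python
def cool_divisor_sum(a, b, n):
    """D = s(a) + s(a+1) + ... + s(a+b), counted per candidate divisor i."""
    lo, hi = a, a + b
    total = 0
    for i in range(2, hi + 1):        # i = 1 is never cool, since 1**n divides everything
        mult = hi // i - (lo - 1) // i    # multiples of i in [lo, hi]
        if lo <= i <= hi:
            mult -= 1                     # m == i is excluded by k < m
        pw = i ** n
        if pw <= hi:
            mult -= hi // pw - (lo - 1) // pw  # multiples where i**n divides m
        total += mult
    return total
```

For a = 32, b = 1, n = 3 this reproduces D = 5 from the question's example.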

Finding a certain set of numbers from a list

I am working on this project where the user inputs a list of numbers. I put these numbers in an array. I need to find a set of numbers with a given length whose sum is divisible by 5.
For example, if the list is 9768014, and the length required is 6, then the output would be 987641.
What algorithm do I need to find that set of numbers?
You can solve this by dynamic programming. Let f(n,m,k) be the largest index between 1 and n of the number in a subset of indices {1,2,....,n} that gives a sum of k mod 5 that uses m numbers. (It's possible that f(n,m,k) = None). You can compute f(n+1,m,k) and f(n,m+1,k) if you know the values of f(N,M,k) for all N <= n + 1 and M < m and also for all N <= n and M < m + 1 and also for N=n,M=m, and all k = 0,1,2,3,4. If you ever find that f(n,m,0) has a solution where m is your desired number of numbers to use, then you're done. Also you don't have to compute f(N,M,k) for any M greater than your desired count of numbers to use. Total complexity is O(n*m) where n is the total count of numbers and m is the size of subset that you are trying to reach.
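Here is one possible sketch of such a DP. Instead of storing the largest index, it stores one concrete choice of indices per (count, sum mod 5) state, which is a simplification for readability rather than the exact formulation above. The name pick_subset is made up for illustration.

```python
def pick_subset(nums, size, mod=5):
    """Return `size` numbers from nums whose sum is divisible by mod, or None."""
    # best[(count, remainder)] = list of indices reaching that state
    best = {(0, 0): []}
    for idx, x in enumerate(nums):
        # Iterate over a snapshot so each number is used at most once.
        for (cnt, rem), chosen in list(best.items()):
            if cnt < size:
                key = (cnt + 1, (rem + x) % mod)
                if key not in best:
                    best[key] = chosen + [idx]
    chosen = best.get((size, 0))
    return None if chosen is None else [nums[i] for i in chosen]

print(pick_subset([9, 7, 6, 8, 0, 1, 4], 6))
```

For the digits of 9768014 and a required length of 6, the only valid choice is to leave out the 0, which matches the digits of the expected output 987641.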

Resources