Subset sum variant with modulo - algorithm

Given an array of integers A and integers N and M, I want to find all the subsets S of A where sum(S) mod M = N.
A can contain multiple integers of the same value.
In my case N will be in the range 0 <= N <= 31, M will be 32, and A will contain integers in the same range as N.
Is there any good/"fast" way to do this?
Thanks!

It is solvable in O(2^(n/2) * log2(2^(n/2))) = O(2^(n/2) * (n/2)), where n is the length of A; with your constraints this runs in C++ in less than a second.
All you need is:
1) compute all possible sums of the first n/2 elements of the array and put them in map<int, int> left, where left[sum] = how many times sum appears in the left part of the array
2) compute all possible sums of the remaining elements of the array and, for each sum S, check whether map left contains the value ((N - S) % M + M) % M (with the keys of left stored modulo M)
To find all possible sums you could use bitmasks:
    for (int mask = 0; mask < (1 << (n / 2)); mask++) {  // mask 0 is the empty subset
        int sum = 0;
        for (int i = 0; i < n / 2; i++)
            if (mask & (1 << i))
                sum += A[i];
        // use sum here, e.g. left[sum % M]++
    }
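The two steps above can be sketched in Python as follows. This is a minimal sketch that only counts the matching subsets rather than listing them; the names `half_sums` and `count_subsets` are made up for illustration. Because sums are reduced modulo M before being stored, the map never holds more than M keys.

```python
from collections import Counter

def half_sums(values, M):
    """Counter of (subset sum mod M) over all subsets of values, empty one included."""
    counts = Counter()
    for mask in range(1 << len(values)):
        s = 0
        for i, v in enumerate(values):
            if mask >> i & 1:
                s += v
        counts[s % M] += 1
    return counts

def count_subsets(A, M, N):
    """Number of non-empty subsets of A whose sum is congruent to N modulo M."""
    half = len(A) // 2
    left = half_sums(A[:half], M)
    right = half_sums(A[half:], M)
    # Pair every right-half residue r with the left-half residue that completes it to N.
    total = sum(cnt * left[(N - r) % M] for r, cnt in right.items())
    # The empty subset has sum 0, so drop it when N is congruent to 0.
    if N % M == 0:
        total -= 1
    return total

print(count_subsets([2, 6, 4, 3], 5, 0))  # 3
```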

If you'd just like to count them, we can solve it in O(|A| * M) with dynamic programming. Here's an example:
A = [2, 6, 4, 3]
M = 5
         0  1  2  3  4
S =      0  0  0  0  0   // The number of subsets with sum i (mod M)

// Iterate over A (through S each time)
a = 2    0  0  1  0  0
a = 6    0  1  1  1  0
a = 4    1  2  2  1  1
a = 3    3  3  3  3  3
Python code:
    A = [2, 6, 4, 3]
    M = 5
    S = [0 for i in range(0, M)]
    for a in A:
        STemp = [0 for i in range(0, M)]
        for (i, v) in enumerate(S):
            ii = (a + i) % M
            STemp[ii] = S[ii] + v
        STemp[a % M] = STemp[a % M] + 1
        S = STemp
    print(S) # [3, 3, 3, 3, 3]
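For small inputs the table can be cross-checked by trying every subset directly. The sketch below is only a sanity check, not part of the answer's method; `count_by_residue_bruteforce` is a hypothetical helper name, and it is exponential in the length of A.

```python
from itertools import combinations

def count_by_residue_bruteforce(A, M):
    """Count non-empty subsets of A by (sum mod M), trying every subset."""
    counts = [0] * M
    for size in range(1, len(A) + 1):
        # combinations works on positions, so equal values count as distinct elements.
        for subset in combinations(A, size):
            counts[sum(subset) % M] += 1
    return counts

print(count_by_residue_bruteforce([2, 6, 4, 3], 5))  # [3, 3, 3, 3, 3]
```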

Related

Count number of subsequences of A such that every element of the subsequence is divisible by its index (starts from 1)

B is a subsequence of A if and only if we can turn A to B by removing zero or more element(s).
A = [1,2,3,4]
B = [1,4] is a subsequence of A (just remove 2 and 3).
B = [4,1] is not a subsequence of A.
Count all subsequences B of A that satisfy this condition: B[i] % i = 0 for every i
Note that i starts from 1 not 0.
Example :
Input :
5
2 2 1 22 14
Output:
13
All of these 13 subsequences satisfy B[i]%i = 0 condition.
{2},{2,2},{2,22},{2,14},{2},{2,22},{2,14},{1},{1,22},{1,14},{22},{22,14},{14}
My attempt :
The only solution that I could come up with has O(n^2) complexity.
Assuming the maximum element in A is C, the following is an algorithm with time complexity O(n * sqrt(C)):
1) For every element x in A, find all divisors of x.
2) For every i from 1 to n, find every j such that A[j] is a multiple of i, using the result of step 1.
3) For every i from 1 to n and j such that A[j] is a multiple of i (using the result of step 2), find the number of B that has i elements and the last element is A[j] (dynamic programming).
    def find_factors(x):
        """Returns all factors of x"""
        for i in range(1, int(x ** 0.5) + 1):
            if x % i == 0:
                yield i
                if i != x // i:
                    yield x // i

    def solve(a):
        """Returns the answer for a"""
        n = len(a)
        # b[i] contains every j such that a[j] is a multiple of i+1.
        b = [[] for i in range(n)]
        for i, x in enumerate(a):
            for factor in find_factors(x):
                if factor <= n:
                    b[factor - 1].append(i)
        # There are dp[i][j] sub arrays of A of length (i+1) ending at b[i][j]
        dp = [[] for i in range(n)]
        dp[0] = [1] * n
        for i in range(1, n):
            k = x = 0
            for j in b[i]:
                while k < len(b[i - 1]) and b[i - 1][k] < j:
                    x += dp[i - 1][k]
                    k += 1
                dp[i].append(x)
        return sum(sum(dpi) for dpi in dp)
For every divisor d of A[i] with 1 <= d <= i+1, A[i] can be the d-th element of any valid subsequence of length d-1 counted so far (for d = 1 it starts a new subsequence), so dp[d-1] is added to dp[d].
JavaScript code:
    // Divisors of n that are <= max, in descending order
    function getDivisors(n, max){
        let m = 1;
        const left = [];
        const right = [];
        while (m*m <= n && m <= max){
            if (n % m == 0){
                left.push(m);
                const l = n / m;
                if (l != m && l <= max)
                    right.push(l);
            }
            m += 1;
        }
        return right.concat(left.reverse());
    }

    function f(A){
        const dp = [1, ...new Array(A.length).fill(0)];
        let result = 0;
        for (let i=0; i<A.length; i++){
            // descending d, so dp[d-1] has not yet been updated for A[i]
            for (const d of getDivisors(A[i], i+1)){
                result += dp[d-1];
                dp[d] += dp[d-1];
            }
        }
        return result;
    }

    var A = [2, 2, 1, 22, 14];
    console.log(JSON.stringify(A));
    console.log(f(A));
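For readers following the Python answer above, the same in-place update might look like this in Python. It is a straightforward port under the same assumptions (positive integers in A); the function names `divisors_desc` and `count_subsequences` are chosen here for illustration only.

```python
def divisors_desc(x, limit):
    """Divisors of x that are <= limit, in descending order."""
    small, large = [], []
    m = 1
    while m * m <= x and m <= limit:
        if x % m == 0:
            small.append(m)
            q = x // m
            if q != m and q <= limit:
                large.append(q)
        m += 1
    # large was collected from biggest to smallest; small needs reversing.
    return large + small[::-1]

def count_subsequences(A):
    """Count non-empty subsequences B of A with B[i] % i == 0 (i from 1)."""
    # dp[d] = number of valid subsequences of length d so far; dp[0] = 1 is the empty one.
    dp = [1] + [0] * len(A)
    result = 0
    for i, x in enumerate(A):
        # Descending d keeps dp[d-1] at its value from before x was processed.
        for d in divisors_desc(x, i + 1):
            result += dp[d - 1]
            dp[d] += dp[d - 1]
    return result

print(count_subsequences([2, 2, 1, 22, 14]))  # 13
```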
I believe that for the general case we can't provably find an algorithm with complexity less than O(n^2).
First, an intuitive explanation:
Let's indicate the elements of the array by a1, a2, a3, ..., a_n.
If the element a1 appears in a subarray, it must be element no. 1.
If the element a2 appears in a subarray, it can be element no. 1 or 2.
If the element a3 appears in a subarray, it can be element no. 1, 2 or 3.
...
If the element a_n appears in a subarray, it can be element no. 1, 2, 3, ..., n.
So to take all the possibilities into account, we have to perform the following tests:
Check if a1 is divisible by 1 (trivial, of course)
Check if a2 is divisible by 1 or 2
Check if a3 is divisible by 1, 2 or 3
...
Check if a_n is divisible by 1, 2, 3, ..., n
All in all we have to perform 1 + 2 + 3 + ... + n = n(n + 1) / 2 tests, which gives a complexity of O(n^2).
Note that the above is somewhat inaccurate, because not all the tests are strictly necessary. For example, if a_i is divisible by 2 and 3 then it must be divisible by 6. Nevertheless, I think this gives a good intuition.
Now for a more formal argument:
Define an array like so:
a1 = 1
a2 = 1 × 2
a3 = 1 × 2 × 3
...
a_n = 1 × 2 × 3 × ... × n
By the definition, every subarray is valid.
Now let (m, p) be such that m <= n and p <= n, and change a_m to a_m / p. We can now choose one of two paths:
If we restrict p to be prime, then each tuple (m, p) represents a mandatory test, because the corresponding change in the value of a_m changes the number of valid subarrays. But that requires prime factorization of each number between 1 and n. By the known methods, I don't think we can get here a complexity less than O(n^2).
If we omit the above restriction, then we clearly perform n(n - 1) / 2 tests, which gives a complexity of O(n^2).
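The construction can be tried out on a small case. The sketch below is an illustration only and proves nothing about lower bounds: it builds a_m = m! for n = 6, confirms by brute force that all 2^n - 1 non-empty subsequences are valid, then divides a_4 by the prime 3 and recounts. The helper name `count_valid_bruteforce` is hypothetical.

```python
from itertools import combinations
from math import factorial

def count_valid_bruteforce(a):
    """Count non-empty subsequences b of a with b[i] % i == 0 for i from 1."""
    total = 0
    for size in range(1, len(a) + 1):
        for b in combinations(a, size):
            if all(v % i == 0 for i, v in enumerate(b, start=1)):
                total += 1
    return total

n = 6
a = [factorial(m) for m in range(1, n + 1)]
print(count_valid_bruteforce(a), 2 ** n - 1)  # 63 63

# Change a_4 = 24 to 24 / 3 = 8: it can no longer sit in position 3.
a[3] //= 3
print(count_valid_bruteforce(a))  # 51
```

The count drops by 12: three ways to pick two earlier elements, times four ways to pick later ones.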

Good algorithm for a query related problem

Given N pairs, we have to find the count of pairs that contain an element K in their range, i.e.:
If Pair_i is defined as (X_i, Y_i), then Pair_i contains K in its range if X_i <= K <= Y_i.
We are given Q such queries to handle, with each query consisting of an integer K.
Input:
The first line contains two space-separated integers N and Q.
Next N lines follow where each line denotes a pair. Each line contains two space-separated integers.
Next Q lines follow where each line denotes an integer K
Output:
We are to output the count of pairs where X_i <= K <= Y_i for each query
Constraints:
1 <= N,Q <= 10^5
Time limit: 2 s
Example:
Input-
4 2
1 5
2 5
6 10
7 8
7
9
Output-
2
1
Explanation-
First query K=7 holds for (6,10) and (7,8).
Second query K=9 holds for (6,10) only.
Given below is my code in java with complexity O(NQ).
    import java.util.*;

    class Query
    {
        public static void main(String args[])
        {
            Scanner sc = new Scanner(System.in);
            int n,q;
            n = sc.nextInt();
            q = sc.nextInt();
            int x[] = new int[n];
            int y[] = new int[n];
            for(int i=0;i<n;i++)
            {
                x[i] = sc.nextInt();
                y[i] = sc.nextInt();
            }
            while(q-->0)
            {
                int k = sc.nextInt();
                int count = 0;
                for(int i = 0;i<n;i++)
                {
                    if(x[i] <= k && k <= y[i])
                        count++;
                }
                System.out.println(count);
            }
        }
    }
Can somebody provide me with an approach that has a better complexity, such as O(N + Q log N)? I thought of using segment trees and such, but I do not know if that would work for this problem or how to implement it here.
A complexity of O(N log N + Q log N) can be obtained by preprocessing the intervals before answering the queries themselves.
1st step: Preprocessing
The goal is to sort the interval limits A[i] (every X and Y value) and to determine, for each limit, the number of intervals covering it.
This is performed in the following way: for each input interval [X, Y], the corresponding limits X and Y are put in an array, and we count the number of openings and closures at each limit:
    open[X] ++
    close[Y] ++
The reason behind this is that each value from X onwards is "gaining" one interval, and each value after Y is "losing" one interval.
Then, after sorting, the number of intervals at a given limit is obtained iteratively (open[i] and close[i] are the counts at the i-th sorted limit):
After the limit: W[0] = open[0], W[i] = W[i-1] + open[i] - close[i]
On the limit: WL[0] = open[0], WL[i] = W[i-1] + open[i]
This is better illustrated by an example. For the input intervals [1, 5], [2, 5], [6, 10], [7, 8], the open[] and close[] values are given by:
open[]     1   1   0   1   1   0   0
close[]    0   0   2   0   0   1   1
        ---|---|---|---|---|---|---|---
           1   2   5   6   7   8   10
And the W[] and WL[] values are provided by
WL[]       1   2   2   1   2   2   1
W[]        1   2   0   1   2   1   0
        ---|---|---|---|---|---|---|---
           1   2   5   6   7   8   10
2nd step: queries
For each query K, we first have to determine the corresponding interval [A[i], A[i+1]]. As the A[i] are sorted, this can be done in O(log N). Then:
If K is smaller than the first limit or larger than the last one: m(K) = 0
If K is in the open interval (A[i], A[i+1]), i.e. not equal to any limit, then m(K) = W[A[i]]
If K is equal to a limit A[i], then m(K) = WL[A[i]]
In the previous example:
K = 7 -> m(7) = WL[7] = 2
K = 9 -> m(9) = W[8] = 1
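A minimal Python sketch of both steps follows, assuming closed intervals with X <= Y. The function names `preprocess` and `query` are illustrative, and W and WL are indexed by position in the sorted list of limits rather than by the limit value. One small deviation from the formulas above: closures are also subtracted at the first limit, so a degenerate interval [X, X] sitting there is handled.

```python
from bisect import bisect_right
from collections import Counter

def preprocess(intervals):
    """Return (limits, W, WL) for closed intervals [X, Y], as described above."""
    opens = Counter(x for x, _ in intervals)
    closes = Counter(y for _, y in intervals)
    limits = sorted(set(opens) | set(closes))
    W, WL = [], []
    active = 0  # plays the role of W[i-1]
    for p in limits:
        on_limit = active + opens[p]    # WL[i] = W[i-1] + open[i]
        active = on_limit - closes[p]   # W[i]  = W[i-1] + open[i] - close[i]
        WL.append(on_limit)
        W.append(active)
    return limits, W, WL

def query(limits, W, WL, k):
    """Number of intervals containing k, found by binary search over the limits."""
    i = bisect_right(limits, k) - 1     # index of the last limit <= k
    if i < 0:
        return 0                        # k lies before every interval
    return WL[i] if limits[i] == k else W[i]

pairs = [(1, 5), (2, 5), (6, 10), (7, 8)]
limits, W, WL = preprocess(pairs)
print(WL)  # [1, 2, 2, 1, 2, 2, 1]
print(W)   # [1, 2, 0, 1, 2, 1, 0]
print([query(limits, W, WL, k) for k in (7, 9)])  # [2, 1]
```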

General formula for a recurrence relation?

I was solving a coding question, and found out the following relation to find the number of possible arrangements:
one[1] = two[1] = three[1] = 1
one[i] = two[i-1] + three[i-1]
two[i] = one[i-1] + three[i-1]
three[i] = one[i-1] + two[i-1] + three[i-1]
I could have easily used a for loop to find out the values of the individual arrays till n, but the value of n is of the order 10^9, and I won't be able to iterate from 1 to such a huge number.
For every value of n, I need to output the value of (one[n] + two[n] + three[n]) mod (10^9 + 7) in O(1) time.
Some results:
For n = 1, result = 3
For n = 2, result = 7
For n = 3, result = 17
For n = 4, result = 41
I was not able to find a general formula for n after spending hours on it. Can someone help me out?
Edit:
n = 1, result(1) = 3
n = 2, result(2) = 7
n = 3, result(3) = result(2)*2 + result(1) = 17
n = 4, result(4) = result(3)*2 + result(2) = 41
So, result(n) = result(n-1)*2 + result(n-2) OR
T(n) = 2T(n-1) + T(n-2)
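The two-term recurrence from the edit can be checked against the original three sequences for small n. This is only a numerical sanity check, not a closed form; the function names are made up for the sketch.

```python
def result_direct(n):
    """one[n] + two[n] + three[n], iterating the three recurrences from the question."""
    one = two = three = 1
    for _ in range(n - 1):
        one, two, three = two + three, one + three, one + two + three
    return one + two + three

def result_two_term(n):
    """Same value from T(n) = 2*T(n-1) + T(n-2) with T(1) = 3, T(2) = 7."""
    prev, cur = 3, 7
    if n == 1:
        return prev
    for _ in range(n - 2):
        prev, cur = cur, 2 * cur + prev
    return cur

print([result_direct(n) for n in range(1, 5)])  # [3, 7, 17, 41]
print(all(result_direct(n) == result_two_term(n) for n in range(1, 30)))  # True
```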
You can use a matrix to represent the recurrence relation. (I've renamed one, two, three to a, b, c).
    (a[n+1])   ( 0 1 1 ) (a[n])
    (b[n+1]) = ( 1 0 1 ) (b[n])
    (c[n+1])   ( 1 1 1 ) (c[n])
With this representation, it's feasible to compute values for large n by matrix exponentiation (modulo your large number), using exponentiation by squaring. That'll give you the result in O(log n) time.
    (a[n])   ( 0 1 1 )^(n-1) (1)
    (b[n]) = ( 1 0 1 )       (1)
    (c[n])   ( 1 1 1 )       (1)
Here's some Python that implements this all from scratch:
    # compute a*b mod K where a and b are square matrices of the same size
    def mmul(a, b, K):
        n = len(a)
        return [
            [sum(a[i][k] * b[k][j] for k in range(n)) % K for j in range(n)]
            for i in range(n)]

    # compute a^n mod K where a is a square matrix
    def mpow(a, n, K):
        if n == 0: return [[int(i == j) for i in range(len(a))] for j in range(len(a))]
        if n % 2: return mmul(mpow(a, n-1, K), a, K)
        a2 = mpow(a, n//2, K)
        return mmul(a2, a2, K)

    M = [[0, 1, 1], [1, 0, 1], [1, 1, 1]]

    def f(n):
        K = 10**9+7
        return sum(sum(a) for a in mpow(M, n-1, K)) % K

    print(f(1), f(2), f(3), f(4))
    print(f(10 ** 9))
Output:
3 7 17 41
999999966
It runs effectively instantly, even for the n=10**9 case.

Sum of products of elements of all subarrays of length k

An array of length n is given. Find the sum of the products of the elements of all sub-arrays of a given length.
Explanation
Array A = [2, 3, 4] of length 3.
Sub-array of length 2 = [2,3], [3,4], [2,4]
Product of elements in [2, 3] = 6
Product of elements in [3, 4] = 12
Product of elements in [2, 4] = 8
Sum for subarray of length 2 = 6+12+8 = 26
Similarly, for length 3, Sum = 24
As the products can get large for longer sub-arrays, calculate the sums modulo 1000000007.
What is an efficient way of finding these sums for sub-arrays of all possible lengths, i.e. 1, 2, 3, ..., n, where n is the length of the array?
There is a rather simple way:
Construct the product of the terms (1 + A[i] * x):
P = (1 + A[0] * x) * (1 + A[1] * x) * (1 + A[2] * x) * ... * (1 + A[n-1] * x)
If we expand the brackets, we get the polynomial
P = 1 + B[1] * x + B[2] * x^2 + ... + B[n] * x^n
The k-th coefficient, B[k], is equal to the sum of products of the sets of size k. For example, B[n] = A[0]*A[1]*A[2]*...*A[n-1], B[2] = A[0]*A[1] + A[0]*A[2] + ... + A[n-2]*A[n-1], and so on.
So to find the sum of products over all possible sets, we evaluate the polynomial P at x = 1 and subtract 1 to remove the constant 0th term. If we don't want to take single-element sets into consideration, we also subtract B[1] = sum of A[i].
Example:
(1+2)(1+3)(1+4) = 60
60 - 1 = 59
59 - (2 + 3 + 4) = 50 = 24 + 26 - as your example shows
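If the individual sums for each length are needed rather than their total, the coefficients B[k] can be obtained by multiplying the factors out one at a time. A minimal sketch, assuming the modulus 1000000007 from the question; `sums_of_products` is a name invented here:

```python
MOD = 1_000_000_007

def sums_of_products(A):
    """Return B where B[k] is the sum of products over all k-element sets of A (mod MOD)."""
    B = [1]  # the constant polynomial 1
    for a in A:
        # Multiply the current polynomial by (1 + a*x).
        nxt = B + [0]
        for k in range(len(B)):
            nxt[k + 1] = (nxt[k + 1] + a * B[k]) % MOD
        B = nxt
    return B

B = sums_of_products([2, 3, 4])
print(B)           # [1, 9, 26, 24]
print(sum(B) - 1)  # 59
```

Each factor costs O(n) work, so all n coefficients take O(n^2) time in total.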
We first create a recursive relation. Let f(n, k) be the sum of all products of sub-arrays of length k taken from the first n elements of an array a. The base cases are simple:
f(n, 0) = 1 for all n
f(0, k) = 0 for all k > 0
The first rule might seem a little counter-intuitive, but 1 is the identity element of multiplication, so the empty product is 1.
Now we find a recursive relation for f(n+1, k). We want the sum of the products of all subarrays of size k. There are two types of subarrays here: the ones including a[n+1] and the ones not including a[n+1]. The sum of the ones not including a[n+1] is exactly f(n, k). The ones including a[n+1] are exactly all subarrays of length k-1 with a[n+1] added, so their summed product is a[n+1] * f(n, k-1).
This completes our recurrence relation:
f(n, k) = 1                                 if k = 0
        = 0                                 if n = 0 and k > 0
        = f(n-1, k) + a[n] * f(n-1, k-1)    otherwise
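The recurrence can be transcribed almost literally with memoization. This is a sketch for small inputs only (deep recursion would hit Python's recursion limit for large n); `make_f` is a hypothetical helper, and the 0-indexed a[n-1] stands in for the 1-indexed a[n] of the formula. The case k = 0 is tested first so that f(0, 0) = 1.

```python
from functools import lru_cache

MOD = 1_000_000_007

def make_f(a):
    """Build a memoized f(n, k) for a fixed array a."""
    @lru_cache(maxsize=None)
    def f(n, k):
        if k == 0:
            return 1   # empty product
        if n == 0:
            return 0   # k > 0 elements requested from an empty prefix
        return (f(n - 1, k) + a[n - 1] * f(n - 1, k - 1)) % MOD
    return f

f = make_f((2, 3, 4))
print([f(3, k) for k in range(4)])  # [1, 9, 26, 24]
```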
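You can use a neat trick to use very limited memory for your dynamic programming: because f(n, k) only depends on f(n-1, k) and f(n-1, k-1), a single array updated from high k to low k is enough:
    long[] compute(int[] a) {
        final long MOD = 1000000007L;
        int N = a.length;
        long[] f = new long[N + 1];   // f[k] = sum of products of k elements
        f[0] = 1;
        for (int n = 1; n <= N; n++) {
            // descending k, so f[k-1] still holds the value for n-1
            for (int k = n; k >= 1; k--) {
                f[k] = (f[k] + a[n - 1] % MOD * f[k - 1]) % MOD;
            }
        }
        return f;
    }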

Number of sub sequences of a given array that are divisible by n

I have a sequence of digits, say 1,2,4,0, and I have to find the number of subsequences that, read as numbers, are divisible by 6.
So we will have 0,12,24,120,240, which means the answer will be 5.
The problem is that the algorithm I devised has O(2^n) time complexity; it basically iterates through all the possibilities, which is naive.
Is there some way to decrease the complexity?
Edit1: multiple copies of a digit are allowed; for example, the input can be 1,2,1,4,3.
Edit2: digits must stay in order; in the above example, 42, 420, etc. are not allowed.
Code: this code, however, is not able to take 120 into account.
    #include <stdio.h>
    #include <string.h>
    #define m 1000000007
    int main(void) {
        int t;
        scanf("%d",&t);
        while(t--)
        {
            char arr[100000];
            int r=0,count=0,i,j,k;
            scanf("%s",&arr);
            int a[100000];
            for(i=0;i<strlen(arr);i++)
            {
                a[i]=arr[i]-'0';
            }
            for(i=0;i<strlen(arr);i++)
            {
                for(j=i;j<strlen(arr);j++)
                {
                    if(a[i]==0)
                    {
                        count++;
                        goto label;
                    }
                    r=a[i]%6;
                    for(k=j+1;k<strlen(arr);k++)
                    {
                        r=(r*10 + a[k])%6;
                        if(r==0)
                            count++;
                    }
                }
                label:;
                r=0;
            }
            printf("%d\n",count);
        }
        return 0;
    }
You can use dynamic programming.
As usual, when we decide to solve a problem using dynamic programming, we start by turning some input values into parameters, and maybe adding some other parameters.
The obvious candidate for a parameter is the length of the sequence.
Let our sequence be a[1], a[2], ..., a[N].
So, we search for the value f(n) (for n from 0 to N) which is the number of subsequences of a[1], a[2], ..., a[n] which, when read as numbers, are divisible by D=6.
Computing f(n) when we know f(n-1) does not look obvious yet, so we dig into details.
On closer look, the problem we now face is that adding a digit to the end of a number can turn a number divisible by D into a number not divisible by D, and vice versa.
Still, we know exactly how the remainder changes when we add a digit to the end of a number.
If we have a sequence p[1], p[2], ..., p[k] and know r, the remainder of the number p[1] p[2] ... p[k] modulo D, and then add p[k+1] to the sequence, the remainder s of the new number p[1] p[2] ... p[k] p[k+1] modulo D is easy to compute: s = (r * 10 + p[k+1]) mod D.
To take that into account, we can make the remainder modulo D our new parameter.
So, we now search for f(n,r) (for n from 0 to N and r from 0 to D-1) which is the number of subsequences of a[1], a[2], ..., a[n] which, when read as numbers, have the remainder r modulo D.
Now, knowing f(n,0), f(n,1), ..., f(n,D-1), we want to compute f(n+1,0), f(n+1,1), ..., f(n+1,D-1).
For each possible subsequence of a[1], a[2], ..., a[n], when we consider element number n+1, we either add a[n+1] to it, or omit a[n+1] and leave the subsequence unchanged.
This is easier to express by forward dynamic programming rather than a formula:
    let f (n + 1, *) = 0
    for r = 0, 1, ..., D - 1:
        add f (n, r) to f (n + 1, (r * 10 + a[n + 1]) mod D)   // add a[n + 1]
        add f (n, r) to f (n + 1, r)                           // omit a[n + 1]
The resulting f (n + 1, s) (which, depending on s, is a sum of one or more terms) is the number of subsequences of a[1], a[2], ..., a[n], a[n+1] which yield the remainder s modulo D.
The whole solution follows:
    let f (0, *) = 0
    let f (0, 0) = 1   // there is one empty sequence, and its remainder is 0
    for n = 0, 1, ..., N - 1:
        let f (n + 1, *) = 0
        for r = 0, 1, ..., D - 1:
            add f (n, r) to f (n + 1, (r * 10 + a[n + 1]) mod D)   // add a[n + 1]
            add f (n, r) to f (n + 1, r)                           // omit a[n + 1]
    answer = f (N, 0) - 1
We subtract one from the answer since an empty subsequence is not considered a number.
The time and memory requirements are O (N * D).
We can lower the memory to O (D) when we note that, at each given moment, we only need to store f (n, *) and f (n + 1, *), so the storage for f can be 2 * D instead of (N + 1) * D.
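A minimal Python sketch of the O(D)-memory variant follows, assuming the input is a list of decimal digits. The function name `count_divisible_subsequences` is invented for the sketch. Like the pseudocode, it still counts numbers with leading zeroes.

```python
def count_divisible_subsequences(digits, D=6):
    """Count non-empty subsequences of digits that, read as a number, are divisible by D."""
    f = [0] * D
    f[0] = 1  # the empty subsequence, with remainder 0
    for d in digits:
        g = f[:]  # omit d: every existing subsequence carries over unchanged
        for r in range(D):
            g[(r * 10 + d) % D] += f[r]  # append d to subsequences with remainder r
        f = g
    return f[0] - 1  # discard the empty subsequence

print(count_divisible_subsequences([1, 2, 4, 0]))  # 5
```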
An illustration with your example sequence:
-------------------------------
  a[n]           1   2   4   0
f(n,r)   n   0   1   2   3   4
     r
-------------------------------
     0       1   1   2   3   6
     1       0   1   1   1   1
     2       0   0   1   2   4
     3       0   0   0   0   0
     4       0   0   0   2   5
     5       0   0   0   0   0
-------------------------------
Exercise: how to get rid of numbers with leading zeroes with this solution?
Will we need another parameter?
