Sum of 2^Pi mod 1000000007 for all i where Pi is sum of numbers in ith subset of a set X - algorithm

I am stuck on a problem in which I have to print sum of 2Pi mod 1000000007 for all i where Pi is sum of numbers in ith subset of a set X.
Length of set can be upto 100000.
Value of element in the range [0,1012].
Here's the link of the Problem.
Problem Statement
I could not find any approach other than Brute-Force which gives verdict TLE.
Here's the Problem Statement
Violet, Klaus and Sunny Baudelaire were given a task by Count Olaf to keep them busy while he acquires their fortune.He has given them N numbers and asked each of them to do the following:
He asked Sunny to make all possible subsets of the set of numbers.
Then he asked Klaus to find out the sum of the number in each subsets thus formed.
Finally he asked Violet to tell him the sum of 2Pi for all i where Pi is sum of numbers in ith subset.
Since Count Olaf will be bored while listening to such a long number he's asked to give him the answer modulus 1000000007.
Can you help Baudelaires out of this predicament ?
Input Format
First line of input contains a single number N indicating the size of the set. The following line will containing N numbers that makes the set.
Output Format
Print the answer in a single line.
Input Constraints
1 ≤ N ≤ 105
0 ≤ a[i] ≤ 1012
Here's My Solution:
import itertools # importing module
#initializing the sum variable which will store final answer
t=0
#Input, number of elements in array
n=input()
#Input the array
arr=map(int,raw_input().split())
#Traverse all the possible combinations and update the sum variable 't'.
for i in xrange(len(arr)+1):
for val in itertools.combinations(arr,i):
x=sum(val)
t=(t+2**x)%1000000007
#Print final answer
print t
Here's the working algorithm which passes all testcases in time limit but I don't get the logic behind it.
from sys import stdin
mod = 10**9 + 7
n = int(stdin.readline())
ans = 1
a = map(int,stdin.readline().split())
for i in a:
j = pow(2,i,mod)
ans = (ans*(j+1))%mod
print ans
#Moderators,admins etc.
Before putting this question on hold or marking off-Topic or closed....
Please comment the reason so that I can know the reason and if possible reword it or ask on any other StackExchange Site.
I first posted it on codegolf.stackexchange.com and people(moderators) there have suggested me to post it here as it comes under algorithm category.
You can read about it here.
Programming Puzzles and Code golf
Thank You

You're given a set of numbers X, and are asked to compute sum(2^sum(x for x in A) for A a subset of X).
Let X be the set {x[0], x[1], ..., x[n]}, and S[i] be the sum of the powers of 2 of the sums of the subsets of x[0]...x[i]. That is, S[i] = sum(2^sum(x for x in A) for A a subset of x[0]...x[i])).
Subsets of x[0]...x[i+1] are either subsets of x[0]...x[i], or subsets of x[0]...x[i] with x[i+1] added.
Then:
S[i+1] = sum(2^sum(x for x in A) for A a subset of x[0]...x[i+1])
= sum(2^sum(x for x in A) for A a subset of x[0]...x[i])
+ sum(2^sum(x for x in A+{x[i+1]}) for A a subset of x[0]...x[i])
= sum(2^sum(x for x in A) for A a subset of x[0]...x[i])
+ sum(2^x[i+1] * 2^sum(x for x in A) for A a subset of x[0]...x[i])
= S[i]
+ 2^x[i+1] * S[i]
This gives us a linear-time method for computing the result:
A = [3, 3, 6, 1, 2]
m = 10**9 + 7
r = 1
for x in A:
r = (r * (1 + pow(2, x, m))) % m
print r

Related

Finding the combination of N numbers that their sum give number R

How it would be possible to find/guess the combination of N numbers such as 5 or 7 or whatever that gives a final number R?
For example, determine N = 5 and R = 15
The one possible result/guess that the 5 numbers in which their summation give 15 would be {1,2,3,4,5}
To get n floating point numbers that total a target r:
Fill an array of size n with random numbers in (0,1).
Find the total of the array. Call this T.
Multiply every element in the array by r/T.
-edit (thanks #ruakh) -
To account for floating point error, total the array again, calculate the delta from target, and update the array to correct for it however you like. This will be a very small number. It's probably fine to update the first element of the array, but if you want you could spread it across the full array.
This can be solved by backtracking, this is really slow, because it looks for all combinations, but if you only want one combination use K-1 0' and N, this sum = N
n = 15
k = 2
array = [0] * k
def comb (len, sum):
if len == k - 1:
array[len] = sum
print(array)
return
for i in range(0, n + 1):
if sum - i >= 0:
array[len] = i
comb(len + 1, sum - i)
comb(0, n)
Python code
import random
N = 5
R = 7
result = random.sample(range(0, int(R)), N-1)
result.append(R - sum(result))

Finding distinct pairs {x, y} that satisfies the equation 1/x + 1/y = 1/n with x, y, and n being whole numbers

The task is to find the amount of distinct pairs of {x, y} that fits the equation 1/x + 1/y = 1/n, with n being the input given by the user. Different ordering of x and y does not count as a new pair.
For example, the value n = 2 will mean 1/n = 1/2. 1/2 can be formed with two pairs of {x, y}, whcih are 6 and 3 and 4 and 4.
The value n = 3 will mean 1/n = 1/3. 1/3 can be formed with two pairs of {x, y}, which are 4 and 12 and 6 and 6.
The mathematical equation of 1/x + 1/y = 1/n can be converted to y = nx/(x-n) where if y and x in said converted equation are whole, they count as a pair of {x, y}. Using said converted formula, I will iterate n times starting from x = n + 1 and adding x by 1 per iteration to find whether nx % (x - n) == 0; if it yields true, the x and y are a new distinct pair.
I found the answer to limit my iteration by n times by manually computing the answers and finding the number of repetitions 'pattern'. x also starts with n+1 because otherwise, division by zero will happen or y will result in a negative number. The modulo operator is to indicate that the y attained is whole.
Questions:
Is there a mathematical explanation behind why the iteration is limited to n times? I found out that the limit of iteration is n times by doing manual computation and finding the pattern: that I only need to iterate n times to find the amount of distinct pairs.
Is there another way to find the amount of distinct pairs {x, y} other than my method above, which is by finding the VALUES of distinct pairs itself and then summing the amount of distinct pair? Is there a quick mathematical formula I'm not aware of?
For reference, my code can be seen here: https://gist.github.com/TakeNoteIAmHere/596eaa2ccf5815fe9bbc20172dce7a63
Assuming that x,y,n > 0 we have
Observation 1: both, x and y must be greater than n
Observation 2: since (x,y) and (y,x) do not count as distinct, we can assume that x <= y.
Observation 3: x = y = 2n is always a solution and if x > 2n then y < x (thus no new solution)
This means the possible values for x are from n+1 up to 2n.
A little algebra convers the equation
1/x + 1/y = n
into
(x-n)*(y-n) = n*n
Since we want a solution in integers, we seek integers f, g so that
f*g = n*n
and then the solution for x and y is
x = f+n, y = g+n
I think the easiest way to proceed is to factorise n, ie write
n = (P[1]^k[1]) * .. *(P[m]^k[m])
where the Ps are distinct primes, the ks positive integers and ^ denotes exponentiation.
Then the possibilities for f and g are
f = P[1]^a[1]) * .. *(P[m]^a[m])
g = P[1]^b[1]) * .. *(P[m]^b[m])
where the as and bs satisfy, for each i=1..m
0<=a[i]<=2*k[i]
b[i] = 2*k[i] - a[i]
If we just wanted to count the number of solutions, we would just need to count the number of fs, ie the number of distinct sequences a[]. But this is just
Nall = (2*k[1]+1)*... (2*[k[m]+1)
However we want to count the solution (f,g) and (g,f) as being the same. There is only one case where f = g (because the factorisation into primes is unique, we can only have f=g if the a[] equal the b[]) and so the number we seek is
1 + (Nall-1)/2

Dividing N items in p groups

You are given N total number of item, P group in which you have to divide the N items.
Condition is the product of number of item held by each group should be max.
example N=10 and P=3 you can divide the 10 item in {3,4,3} since 3x3x4=36 max possible product.
You will want to form P groups of roughly N / P elements. However, this will not always be possible, as N might not be divisible by P, as is the case for your example.
So form groups of floor(N / P) elements initially. For your example, you'd form:
floor(10 / 3) = 3
=> groups = {3, 3, 3}
Now, take the remainder of the division of N by P:
10 mod 3 = 1
This means you have to distribute 1 more item to your groups (you can have up to P - 1 items left to distribute in general):
for i = 0 up to (N mod P) - 1:
groups[i]++
=> groups = {4, 3, 3} for your example
Which is also a valid solution.
For fun I worked out a proof of the fact that it in an optimal solution either all numbers = N/P or the numbers are some combination of floor(N/P) and ceiling(N/P). The proof is somewhat long, but proving optimality in a discrete context is seldom trivial. I would be interested if anybody can shorten the proof.
Lemma: For P = 2 the optimal way to divide N is into {N/2, N/2} if N is even and {floor(N/2), ceiling(N/2)} if N is odd.
This follows since the constraint that the two numbers sum to N means that the two numbers are of the form x, N-x.
The resulting product is (N-x)x = Nx - x^2. This is a parabola that opens down. Its max is at its vertex at x = N/2. If N is even this max is an integer. If N is odd, then x = N/2 is a fraction, but such parabolas are strictly unimodal, so the closer x gets to N/2 the larger the product. x = floor(N/2) (or ceiling, it doesn't matter by symmetry) is the closest an integer can get to N/2, hence {floor(N/2),ceiling(N/2)} is optimal for integers.
General case: First of all, a global max exists since there are only finitely many integer partitions and a finite list of numbers always has a max. Suppose that {x_1, x_2, ..., x_P} is globally optimal. Claim: given and i,j we have
|x_i - x_ j| <= 1
In other words: any two numbers in an optimal solution differ by at most 1. This follows immediately from the P = 2 lemma (applied to N = x_i + x_ j).
From this claim it follows that there are at most two distinct numbers among the x_i. If there is only 1 number, that number is clearly N/P. If there are two numbers, they are of the form a and a+1. Let k = the number of x_i which equal a+1, hence P-k of the x_i = a. Hence
(P-k)a + k(a+1) = N, where k is an integer with 1 <= k < P
But simple algebra yields that a = (N-k)/P = N/P - k/P.
Hence -- a is an integer < N/P which differs from N/P by less than 1 (k/P < 1)
Thus a = floor(N/P) and a+1 = ceiling(N/P).
QED

arrangement with constraint on the sum

I'm looking to construct an algorithm which gives the arrangements with repetition of n sequences of a given step S (which can be a positive real number), under the constraint that the sum of all combinations is k, with k a positive integer.
My problem is thus to find the solutions to the equation:
x 1 + x 2 + ⋯ + x n = k
where
0 ≤ x i ≤ b i
and S (the step) a real number with finite decimal.
For instance, if 0≤xi≤50, and S=2.5 then xi = {0, 2.5 , 5,..., 47.5, 50}.
The point here is to look only through the combinations having a sum=k because if n is big it is not possible to generate all the arrangements, so I would like to bypass this to generate only the combinations that match the constraint.
I was thinking to start with n=2 for instance, and find all linear combinations that match the constraint.
ex: if xi = {0, 2.5 , 5,..., 47.5, 50} and k=100, then we only have one combination={50,50}
For n=3, we have the combination for n=2 times 3, i.e. {50,50,0},{50,0,50} and {0,50,50} plus the combinations {50,47.5,2.5} * 3! etc...
If xi = {0, 2.5 , 5,..., 37.5, 40} and k=100, then we have 0 combinations for n=2 because 2*40<100, and we have {40,40,20} times 3 for n=3... (if I'm not mistaken)
I'm a bit lost as I can't seem to find a proper way to start the algorithm, knowing that I should have the step S and b as inputs.
Do you have any suggestions?
Thanks
You can transform your problem into an integer problem by dividing everything by S: We want to find all integer sequences y1, ..., yn with:
(1) 0 ≤ yi ≤ ⌊b / S⌋
(2) y1 + ... + yn = k / S
We can see that there is no solution if k is not a multiple of S. Once we have reduced the problem, I would suggest using a pseudopolynomial dynamic programming algorithm to solve the subset sum problem and then reconstruct the solution from it. Let f(i, j) be the number of ways to make sum j with i elements. We have the following recurrence:
f(0,0) = 1
f(0,j) = 0 forall j > 0
f(i,j) = sum_{m = 0}^{min(floor(b / S), j)} f(i - 1, j - m)
We can solve f in O(n * k / S) time by filling it row by row. Now we want to reconstruct the solution. I'm using Python-style pseudocode to illustrate the concept:
def reconstruct(i, j):
if f(i,j) == 0:
return
if i == 0:
yield []
return
for m := 0 to min(floor(b / S), j):
for rest in reconstruct(i - 1, j - m):
yield [m] + rest
result = reconstruct(n, k / S)
result will be a list of all possible combinations.
What you are describing sounds like a special case of the subset sum problem. Once you put it in those terms, you'll find that Pisinger apparently has a linear time algorithm for solving a more general version of your problem, since your weights are bounded. If you're interested in designing your own algorithm, you might start by reading Pisinger's thesis to get some ideas.
Since you are looking for all possible solutions and not just a single solution, the dynamic programming approach is probably your best bet.

Find pairs in an array such that a%b = k , where k is a given integer

Here is an interesting programming puzzle I came across . Given an array of positive integers, and a number K. We need to find pairs(a,b) from the array such that a % b = K.
I have a naive O(n^2) solution to this where we can check for all pairs such that a%b=k. Works but inefficient. We can certainly do better than this can't we ? Any efficient algorithms for the same? Oh and it's NOT homework.
Sort your array and binary search or keep a hash table with the count of each value in your array.
For a number x, we can find the largest y such that x mod y = K as y = x - K. Binary search for this y or look it up in your hash and increment your count accordingly.
Now, this isn't necessarily the only value that will work. For example, 8 mod 6 = 8 mod 3 = 2. We have:
x mod y = K => x = q*y + K =>
=> x = q(x - K) + K =>
=> x = 1(x - K) + K =>
=> x = 2(x - K)/2 + K =>
=> ...
This means you will have to test all divisors of y as well. You can find the divisors in O(sqrt y), giving you a total complexity of O(n log n sqrt(max_value)) if using binary search and O(n sqrt(max_value)) with a hash table (recommended especially if your numbers aren't very large).
Treat the problem as having two separate arrays as input: one for the a numbers and a % b = K and one for the b numbers. I am going to assume that everything is >= 0.
First of all, you can discard any b <= K.
Now think of every number in b as generating a sequence K, K + b, K + 2b, K + 3b... You can record this using a pair of numbers (pos, b), where pos is incremented by b at each stage. Start with pos = 0.
Hold these sequences in a priority queue, so you can find the smallest pos value at any given time. Sort the array of a numbers - in fact you could do this ahead of time and discard any duplicates.
For each a number
While the smallest pos in the priority queue is <= a
Add the smallest multiple of b to it to make it >= a
If it is == a, you have a match
Update the stored value of pos for that sequence, re-ordering the priority queue
At worst, you end up comparing every number with every other number, which is the same as the simple solution, but with priority queue and sorting overhead. However, large values of b may remain unexamined in the priority queue while several a numbers pass through, in which case this does better - and if there are a lot of numbers to process and they are all different, some of them must be large.
This answer mentions the main points of an algorithm (called DL because it uses “divisor lists” ) and gives details via a program, called amodb.py.
Let B be the input array, containing N positive integers. Without much loss of generality, suppose B[i] > K for all i and that B is in ascending order. (Note that x%B[i] < K if B[i] < K; and where B[i] = K, one can report pairs (B[i], B[j]) for all j>i. If B is not sorted initially, charge a cost of O(N log N) to sort it.)
In algorithm DL and program amodb.py, A is an array with K pre-subtracted from the input array elements. Ie, A[i] = B[i] - K. Note that if a%b == K, then for some j we have a = b*j + K or a-K = b*j. That is, a%b == K iff a-K is a multiple of b. Moreover, if a-K = b*j and p is any factor of b, then p is a factor of a-K.
Let the prime numbers from 2 to 97 be called “small factors”. When N numbers are uniformly randomly selected from some interval [X,Y], on the order of N/ln(Y) of the numbers will have no small factors; a similar number will have a greatest small factor of 2; and declining proportions will have successively larger greatest small factors. For example, on the average about N/97 will be divisible by 97, about N/89-N/(89*97) by 89 but not 97, etc. Generally, when members of B are random, lists of members with certain greatest small factors or with no small factors are sub-O(N/ln(Y)) in length.
Given a list Bd containing members of B divisible by largest small factor p, DL tests each element of Bd against elements of list Ad, those elements of A divisible by p. But given a list Bp for elements of B without small factors, DL tests each of Bp's elements against all elements of A. Example: If N=25, p=13, Bd=[18967, 23231], and Ad=[12779, 162383], then DL tests if any of 12779%18967, 162383%18967, 12779%23231, 162383%23231 are zero. Note that it is possible to cut the number of tests in half in this example (and many others) by noticing 12779<18967, but amodb.py does not include that optimization.
DL makes J different lists for J different factors; in one version of amodb.py, J=25 and the factor set is primes less than 100. A larger value of J would increase the O(N*J) time to initialize divisor lists, but would slightly decrease the O(N*len(Bp)) time to process list Bp against elements of A. See results below. Time to process other lists is O((N/logY)*(N/logY)*J), which is in sharp contrast to the O(n*sqrt(Y)) complexity for a previous answer's method.
Shown next is output from two program runs. In each set, the first Found line is from a naïve O(N*N) test, and the second is from DL. (Note, both DL and the naïve method would run faster if too-small A values were progressively removed.) The time ratio in the last line of the first test shows a disappointingly low speedup ratio of 3.9 for DL vs naïve method. For that run, factors included only the 25 primes less than 100. For the second run, with better speedup of ~ 4.4, factors included numbers 2 through 13 and primes up to 100.
$ python amodb.py
N: 10000 K: 59685 X: 100000 Y: 1000000
Found 208 matches in 21.854 seconds
Found 208 matches in 5.598 seconds
21.854 / 5.598 = 3.904
$ python amodb.py
N: 10000 K: 97881 X: 100000 Y: 1000000
Found 207 matches in 21.234 seconds
Found 207 matches in 4.851 seconds
21.234 / 4.851 = 4.377
Program amodb.py:
import random, time
factors = [2,3,4,5,6,7,8,9,10,11,12,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71,73,79,83,89,97]
X, N = 100000, 10000
Y, K = 10*X, random.randint(X/2,X)
print "N: ", N, " K: ", K, "X: ", X, " Y: ", Y
B = sorted([random.randint(X,Y) for i in range(N)])
NP = len(factors); NP1 = NP+1
A, Az, Bz = [], [[] for i in range(NP1)], [[] for i in range(NP1)]
t0 = time.time()
for b in B:
a, aj, bj = b-K, -1, -1
A.append(a) # Add a to A
for j,p in enumerate(factors):
if a % p == 0:
aj = j
Az[aj].append(a)
if b % p == 0:
bj = j
Bz[bj].append(b)
Bp = Bz.pop() # Get not-factored B-values list into Bp
di = time.time() - t0; t0 = time.time()
c = 0
for a in A:
for b in B:
if a%b == 0:
c += 1
dq = round(time.time() - t0, 3); t0 = time.time()
c=0
for i,Bd in enumerate(Bz):
Ad = Az[i]
for b in Bd:
for ak in Ad:
if ak % b == 0:
c += 1
for b in Bp:
for ak in A:
if ak % b == 0:
c += 1
dr = round(di + time.time() - t0, 3)
print "Found", c, " matches in", dq, "seconds"
print "Found", c, " matches in", dr, "seconds"
print dq, "/", dr, "=", round(dq/dr, 3)

Resources