Calculating the numbers whose binary representation has exactly required number 1's - algorithm

Okay so the problem is finding a positive integer n such that there are exactly m numbers in n+1 to 2n (both inclusive) whose binary representation has exactly k 1s.
Constraints: m<=10^18 and k<=64. Also answer is less than 10^18.
Now I can't think of an efficient way of solving this instead of going through each integer and calculating the binary 1 count in the required interval for each of them, but that would take too long. So is there any other way to go about with this?

You're correct to suspect that there's a more efficient way.
Let's start with a slightly simpler subproblem. Absent some really clever
insights, we're going to need to be able to find the number of integers in
[n+1, 2n] that have exactly k bits set in their binary representation. To
keep things short, let's call such integers "weight-k" integers (for motivation for this terminology, look up Hamming weight). We can
immediately simplify our counting problem: if we can count all weight-k integers in [0, 2n]
and we can count all weight-k integers in [0, n], we can subtract one count
from the other to get the number of weight-k integers in [n+1, 2n].
So an obvious subproblem is to count how many weight-k integers there are
in the interval [0, n], for given nonnegative integers k and n.
A standard technique for a problem of this kind is to look for a way to break
it down into smaller subproblems of the same kind; this is one aspect of
what's often called dynamic programming. In this case, there's an easy way of
doing so: consider the even numbers in [0, n] and the odd numbers in [0, n]
separately. Every even number m in [0, n] has exactly the same weight as
m/2 (because by dividing by two, all we do is remove a single zero
bit). Similarly, every odd number m has weight exactly one more than the
weight of (m-1)/2. With some thought about the appropriate base cases, this
leads to the following recursive algorithm (in this case implemented in Python,
but it should translate easily to any other mainstream language).
def count_weights(n, k):
"""
Return number of weight-k integers in [0, n] (for n >= 0, k >= 0)
"""
if k == 0:
return 1 # 0 is the only weight-0 value
elif n == 0:
return 0 # only considering 0, which doesn't have positive weight
else:
from_even = count_weights(n//2, k)
from_odd = count_weights((n-1)//2, k-1)
return from_even + from_odd
There's plenty of scope for mistakes here, so let's test our fancy recursive
algorithm against something less efficient but more direct (and, I hope, more
obviously correct):
def weight(n):
"""
Number of 1 bits in the binary representation of n (for n >= 0).
"""
return bin(n).count('1')
def count_weights_slow(n, k):
"""
Return number of weight-k integers in [0, n] (for n >= 0, k >= 0)
"""
return sum(weight(m) == k for m in range(n+1))
The results of comparing the two algorithms look convincing:
>>> count_weights(100, 5)
11
>>> count_weights_slow(100, 5)
11
>>> all(count_weights(n, k) == count_weights_slow(n, k)
... for n in range(1000) for k in range(10))
True
However, our supposedly fast count_weights function doesn't scale well to the
size numbers you need:
>>> count_weights(2**64, 5) # takes a few seconds on my machine
7624512
>>> count_weights(2**64, 6) # minutes ...
74974368
>>> count_weights(2**64, 10) # gave up waiting ...
But here's where a second key idea of dynamic programming comes in: memoize!
That is, keep a record of the results of previous calls, in case we need to use
them again. It turns out that the chain of recursive calls made will tend to
repeat lots of calls, so there's value in memoizing. In Python, this is
trivially easy to do, via the functools.lru_cache decorator. Here's our new
version of count_weights. All that's changed is the extra line at the top:
#lru_cache(maxsize=None)
def count_weights(n, k):
"""
Return number of weight-k integers in [0, n] (for n >= 0, k >= 0)
"""
if k == 0:
return 1 # 0 is the only weight-0 value
elif n == 0:
return 0 # only considering 0, which doesn't have positive weight
else:
from_even = count_weights(n//2, k)
from_odd = count_weights((n-1)//2, k-1)
return from_even + from_odd
Now testing on those larger examples again, we get results much more quickly,
without any noticeable delay.
>>> count_weights(2**64, 10)
151473214816
>>> count_weights(2**64, 32)
1832624140942590534
>>> count_weights(5853459801720308837, 27)
356506415596813420
So now we have an efficient way to count, we've got an inverse problem to
solve: given k and m, find an n such that count_weights(2*n, k) -
count_weights(n, k) == m. This one turns out to be especially easy, since the
quantity count_weights(2*n, k) - count_weights(n, k) is monotonically
increasing with n (for fixed k), and more specifically increases by either
0 or 1 every time n increases by 1. I'll leave the proofs of those
facts to you, but here's a demo:
>>> for n in range(10, 30): print(n, count_weights(n, 3))
...
10 1
11 2
12 2
13 3
14 4
15 4
16 4
17 4
18 4
19 5
20 5
21 6
22 7
23 7
24 7
25 8
26 9
27 9
28 10
29 10
This means that we're guaranteed to be able to find a solution. There may be multiple solutions, so we'll aim to find the smallest one (though it would be equally easy to find the largest one). Bisection search gives us a crude but effective way to do this. Here's the code:
def solve(m, k):
"""
Find the smallest n >= 0 such that [n+1, 2n] contains exactly
m weight-k integers.
Assumes that m >= 1 (for m = 0, the answer is trivially n = 0).
"""
def big_enough(n):
"""
Target function for our bisection search solver.
"""
diff = count_weights(2*n, k) - count_weights(n, k)
return diff >= m
low = 0
assert not big_enough(low)
# Initial phase: expand interval to identify an upper bound.
high = 1
while not big_enough(high):
high *= 2
# Bisection phase.
# Loop invariant: big_enough(high) is True and big_enough(low) is False
while high - low > 1:
mid = (high + low) // 2
if big_enough(mid):
high = mid
else:
low = mid
return high
Testing the solution:
>>> n = solve(5853459801720308837, 27)
>>> n
407324170440003813446
Let's double check that n:
>>> count_weights(2*n, 27) - count_weights(n, 27)
5853459801720308837
Looks good. And if we got our search right, this should be the smallest
n that works:
>>> count_weights(2*(n-1), 27) - count_weights(n-1, 27)
5853459801720308836
There are plenty of other opportunities for optimizations and cleanups in the
above code, and other ways to tackle the problem, but I hope this gives you a
starting point.
The OP commented that they needed to do this in C, where memoization isn't immediately available without using an external library. Here's a variant of count_weights that doesn't need memoization. It's achieved by (a) tweaking the recursion in count_weights so that the same n is used in both recursive calls, and then (b) returning, for a given n, the values of count_weights(n, k) for all k for which the answer is nonzero. In effect, we're just moving the memoization into an explicit list.
Note: as written, the code below needs Python 3.
def count_all_weights(n):
"""
Return frequencies of weights of all integers in [0, n],
as a list. The kth entry in the list gives the count
of weight-k integers in [0, n].
Example
-------
>>> count_all_weights(16)
[1, 5, 6, 4, 1]
"""
if n == 0:
return [1]
else:
wm = count_all_weights((n-1)//2)
weights = [wm[0], *(wm[i]+wm[i+1] for i in range(len(wm)-1)), wm[-1]]
if n % 2 == 0:
weights[bin(n).count('1')] += 1
return weights
An example call:
>>> count_all_weights(7590)
[1, 13, 78, 286, 714, 1278, 1679, 1624, 1139, 559, 182, 35, 3]
This function should be good enough even for larger n: count_all_weights(10**18) takes less than a half a millisecond on my machine.
Now the bisection search will work as before, replacing the call to count_weights(n, k) with count_all_weights(n)[k] (and similarly for count_weights(2*n, k)).
Finally, another possibility is to break up the interval [0, n] into a succession of smaller and smaller subintervals, where each subinterval has length a power of two. For example, we'd break the interval [0, 101] into [0, 63], [64, 95], [96, 99] and [100, 101]. The advantage of this is that we can easily compute how many weight-k integers there are in any one of these subintervals by counting combinations. For example, in [0, 63] we have all possible 6-bit combinations, so if we're after weight-3 integers, we know there must be exactly 6-choose-3 (i.e., 20) of them. And in [64, 95], we know each integer starts with a 1-bit, and then after excluding that 1-bit we have all possible 5-bit combinations, so again we know how many integers there are in this interval with any given weight.
Applying this idea, here's a complete, fast, all-in-one function that solves your original problem. It has no recursion and no memoization.
def solve(m, k):
"""
Given nonnegative integers m and k, find the smallest
nonnegative integer n such that the closed interval
[n+1, 2*n] contains exactly m weight-k integers.
Note that for k small there may be no solution:
if k == 0 then we have no solution unless m == 0,
and if k == 1 we have no solution unless m is 0 or 1.
"""
# Deal with edge cases.
if k < 2 and k < m:
raise ValueError("No solution")
elif k == 0 or m == 0:
return 0
k -= 1
# Find upper bound on n, and generate a subset of
# Pascal's triangle as we go.
rows = []
high, row = 1, [1] + [0] * k
while row[k] < m:
rows.append((high, row))
high, row = high * 2, [1, *(row[i]+row[i+1] for i in range(k))]
# Bisect to find first n that works.
low = mlow = weight = 0
while rows:
high, row = rows.pop()
mmid = mlow + row[k - weight]
if mmid < m:
low, mlow, weight = low + high, mmid, weight + 1
return low + 1

Related

How to get the intuition behind the solution?

I was solving the below problem from USACO training. I found this really fast solution for which, I am finding it unable to absorb fully.
Problem: Consider an ordered set S of strings of N (1 <= N <= 31) bits. Bits, of course, are either 0 or 1.
This set of strings is interesting because it is ordered and contains all possible strings of length N that have L (1 <= L <= N) or fewer bits that are `1'.
Your task is to read a number I (1 <= I <= sizeof(S)) from the input and print the Ith element of the ordered set for N bits with no more than L bits that are `1'.
sample input: 5 3 19
output: 10110
The two solutions I could think of:
Firstly the brute force solution which goes through all possible combinations of bits, selects and stores the strings whose count of '1's are less than equal to 'L' and returning the Ith string.
Secondly, we can find all the permutations of '1's from 5 positions with range of count(0 to L), sort the strings in increasing order and returning the Ith string.
The best Solution:
The OP who posted the solution has used combination instead of permutation. According to him, the total number of string possible is 5C0 + 5C1 + 5C2 + 5C3.
So at every position i of the string, we decide whether to include the ith bit in our output or not, based on the total number of ways we have to build the rest of the string. Below is a dry run of the entire approach for the above input.
N = 5, L = 3, I = 19
00000
at i = 0, for the rem string, we have 4C0 + 4C1 + 4C2 + 4C3 = 15
It says that, there are 15 other numbers possible with the last 4 positions. as 15 is less than 19, our first bit has to be set.
N = 5, L = 2, I = 4
10000
at i = 1, we have 3C0 + 3C1 + 3C2 (as we have used 1 from L) = 7
as 7 is greater than 4, we cannot set this bit.
N = 5, L = 2, I = 4
10000
at i = 2 we have 2C0 + 2C2 = 2
as 2 <= I(4), we take this bit in our output.
N = 5, L = 1, I = 2
10100
at i = 3, we have 1C0 + 1C1 = 2
as 2 <= I(2) we can take this bit in our output.
as L == 0, we stop and 10110 is our answer. I was amazed to find this solution. However, I am finding it difficult to get the intuition behind this solution.
How does this solution sort-of zero in directly to the Ith number in the set?
Why does the order of the bits not matter in the combinations of set bits?
Suppose we have precomputed the number of strings of length n with k or fewer bits set. Call that S(n, k).
Now suppose we want the i'th string (in lexicographic order) of length N with L or fewer bits set.
All the strings with the most significant bit zero come before those with the most significant bit 1. There's S(N-1, L) strings with the most significant bit zero, and S(N-1, L-1) strings with the most significant bit 1. So if we want the i'th string, if i<=S(N-1, L), then it must have the top bit zero and the remainder must be the i'th string of length N-1 with at most L bits set, and otherwise it must have the top bit one, and the remainder must be the (i-S(N-1, L))'th string of length N-1 with at most L-1 bits set.
All that remains to code is to precompute S(n, k), and to handle the base cases.
You can figure out a combinatorial solution to S(n, k) as your friend did, but it's more practical to use a recurrence relation: S(n, k) = S(n-1, k) + S(n-1, k-1), and S(0, k) = S(n, 0) = 1.
Here's code that does all that, and as an example prints out all 8-bit numbers with 3 or fewer bits set, in lexicographic order. If i is out of range, then it raises an IndexError exception, although in your question you assume i is always in range, so perhaps that's not necessary.
S = [[1] * 32 for _ in range(32)]
for n in range(1, 32):
for k in range(1, 32):
S[n][k] = S[n-1][k] + S[n-1][k-1]
def ith_string(n, k, i):
if n == 0:
if i != 1:
raise IndexError
return ''
elif i <= S[n-1][k]:
return "0" + ith_string(n-1, k, i)
elif k == 0:
raise IndexError
else:
return "1" + ith_string(n-1, k-1, i - S[n-1][k])
print([ith_string(8, 3, i) for i in range(1, 94)])

Find the number of pairs in an array with product between l and r incluseive

this is the actual question
however, it simplifies to
Find all SEMPIPRIMES (numbers which are products of 2 DISTINCT prime factors e.g. 6 (2*3) in range L to R
there will be multiple queries for L and R
we cant precompute semiprimes as N is large
BUT we can store primes as they are only upto 10^6 as per the question
Now, assume i have all primes by sieve of eratostheneses
i need all possible pairs of primes with product between L to R
OR THE QUESTION SIMPLIFIES TO GIVEN A SORTED ARRAY.
FIND ALL POSSIBLE PAIRS WITH PRODUCTS BETWEEN L AND R INCLUSIVE
i am including the part of code in the editorial which does this..
for(int i=0; i<cnt and ar[i]<=r; i++)
{
int lower = L/ar[i];
if(L%ar[i]>0)
lower++;
lower = max(lower, ar[i]+1);
int upper = R/ar[i];
if(upper<lower)
continue;
ans += upper_bound(ar.begin(),ar.end(),upper)-
lower_bound(ar.begin(),ar.end(),lower);
}
Here's one approach, this may not be faster but it seems reasonable.
The number of primes below 10^8 is around 5*10^6.
reference: https://en.wikipedia.org/wiki/Prime-counting_function
But we may not have to keep all the primes, as it would be rather inefficient. We can keep the Semiprimes only.
There's already the generative process for Semprimes. Each Semiprime is a product of 2 distinct prime factors.
So, we can keep an array which will store all the semiprimes, as there will be at most 10^5 semiprimes in the range, we can sort that array. For each query, we will just binary search on the array to find the number of elements in the range.
So, how to save the semiprimes?
We can slightly modify the sieve of Eratosthenes to generate semiprimes.
The idea is we will keep a countDivision array which will store the number of divisor for each integer in range. We only consider a integer semiprime is countDivision index value is 2 for that integer (2 divisors).
def createSemiPrimeSieve(n):
v = [0 for i in range(n + 1)]
# This array will initially store the indexes
# After performing below operations if any
# element of array becomes 1 this means
# that the given index is a semi-prime number
# Storing indices in each element of vector
for i in range(1, n + 1):
v[i] = i
countDivision = [0 for i in range(n + 1)]
for i in range(n + 1):
countDivision[i] = 2
# This array will initially be initialized by 2 and
# will just count the divisions of a number
# As a semiprime number has only 2 prime factors
# which means after dividing by the 2 prime numbers
# if the index countDivision[x] = 0 and v[x] = 1
# this means that x is a semiprime number
# If number a is prime then its
# countDivision[a] = 2 and v[a] = a
for i in range(2, n + 1, 1):
# If v[i] != i this means that it is
# not a prime number as it contains
# a divisor which has already divided it
# same reason if countDivision[i] != 2
if (v[i] == i and countDivision[i] == 2):
# j goes for each factor of i
for j in range(2 * i, n + 1, i):
if (countDivision[j] > 0):
# Dividing the number by i
# and storing the dividend
v[j] = int(v[j] / i)
# Decreasing the countDivision
countDivision[j] -= 1
# A new vector to store all Semi Primes
res = []
for i in range(2, n + 1, 1):
# If a number becomes one and
# its countDivision becomes 0
# it means the number has
# two prime divisors
if (v[i] == 1 and countDivision[i] == 0):
res.append(i)
return res
Credit: https://www.geeksforgeeks.org/print-all-semi-prime-numbers-less-than-or-equal-to-n/
But, generating has same complexity as sieve which is O(nloglogn). If (R-L) is < 10^5 this approach will pass. But as (R-L) can be as big as 10^8 it's not feasible.
Another approach is to count instead of generating.
Let's work on an example.
2 10
Now, let's say, we know all the primes up to 10^6 (as p and q can't be more than 10^6).
primes = [2, 3, 5, 7, 11, ...]
The number of primes below 10^6 is less than 10^5 (so we can store
them in an array) and the time complexity is also manageable.
Now, we can scan our primes array to count the contribution for each prime to generate semiprimes in range (L, R).
First, let's start with 2, how many semiprimes we'll generate with the help of 2.
Let's look at primes = [2, 3, 5, 7, 11, ..]
We can't choose 2, because 2 and 2 are not different (p, q must be different). But, 3 is in, as 2*3 <= 10, so is 5, 2*5 2*5 <= 10.
How to count this?
We will take the lower bound as 2//2 (L//primes[i]) but we have to make sure, we don't consider the current prime again as (p and q must be different) so we take the max of L//primes[i] and primes[i]+1.
For 2, our start number is 3 (because of 2+1 = 3, we can't start at 1 or 2 as if we consider 2 then we'll calculate cases like 2*2 = 4 which is not valid). Our end number is 10//2 = 5, how many numbers are there within range 3, 5 in primes. It's 2 and can be found via a simple binary search.
Rest is easy, we'll have to binary search how many primes are there in range (max(L//primes[i], primes[i]+1), R//primes[i]).
This has complexity pre-processing time complexity of O(10^6*loglog(10^6)) + O(10^6*log(10^6)) and O(log(10^6)) per query.

Counting valid sequences with dynamic programming

I am pretty new to Dynamic Programming, but I am trying to get better. I have an exercise from a book, which asks me the following question (slightly abridged):
You want to construct sequence of length N from numbers from the set {1, 2, 3, 4, 5, 6}. However, you cannot place the number i (i = 1, 2, 3, 4, 5, 6) more than A[i] times consecutively, where A is a given array. Given the sequence length N (1 <= N <= 10^5) and the constraint array A (1 <= A[i] <= 50), how many sequences are possible?
For instance if A = {1, 2, 1, 2, 1, 2} and N = 2, this would mean you can only have one consecutive 1, two consecutive 2's, one consecutive 3, etc. Here, something like "11" is invalid since it has two consecutive 1's, whereas something like "12" or "22" are both valid. It turns out that the actual answer for this case is 33 (there are 36 total two-digit sequences, but "11", "33", and "55" are all invalid, which gives 33).
Somebody told me that one way to solve this problem is to use dynamic programming with three states. More specifically, they say to keep a 3d array dp(i, j, k) with i representing the current position we are at in the sequence, j representing the element put in position i - 1, and k representing the number of times that this element has been repeated in the block. They also told me that for the transitions, we can put in position i every element different from j, and we can only put j in if A[j] > k.
It all makes sense to me in theory, but I've been struggling with implementing this. I have no clue how to begin with the actual implementation other than initializing the matrix dp. Typically, most of the other exercises had some sort of "base case" that were manually set in the matrix, and then a loop was used to fill in the other entries.
I guess I am particularly confused because this is a 3D array.
For a moment let's just not care about the array. Let's implement this recursively. Let dp(i, j, k) be the number of sequences with length i, last element j, and k consecutive occurrences of j at the end of the array.
The question now becomes how do we write the solution of dp(i, j, k) recursively.
Well we know that we are adding a j the kth time, so we have to take each sequence of length i - 1, and has j occurring k - 1 times, and add another j to that sequence. Notice that this is simply dp(i - 1, j, k - 1).
But what if k == 1? If that's the case we can add one occurence of j to every sequence of length i - 1 that doesn't end with j. Essentially we need the sum of all dp(i, x, k), such that A[x] >= k and x != j.
This gives our recurrence relation:
def dp(i, j, k):
# this is the base case, the number of sequences of length 1
# one if k is valid, otherwise zero
if i == 1: return int(k == 1)
if k > 1:
# get all the valid sequences [0...i-1] and add j to them
return dp(i - 1, j, k - 1)
if k == 1:
# get all valid sequences that don't end with j
res = 0
for last in range(len(A)):
if last == j: continue
for n_consec in range(1, A[last] + 1):
res += dp(i - 1, last, n_consec)
return res
We know that our answer will be all valid subsequences of length N, so our final answer is sum(dp(N, j, k) for j in range(len(A)) for k in range(1, A[j] + 1))
Believe it or not this is the basis of dynamic programming. We just broke our main problem down into a set of subproblems. Of course, right now our time is exponential because of the recursion. We have two ways to lower this:
Caching, we can simply keep track of the result of each (i, j, k) and then spit out what we originally computed when it's called again.
Use an array. We can reimplement this idea with bottom-up dp, and have an array dp[i][j][k]. All of our function calls just become array accesses in a for loop. Note that using this method forces us iterate over the array in topological order which may be tricky.
There are 2 kinds of dp approaches: top-down and bottom-up
In bottom up, you fill the terminal cases in dp table and then use for loops to build up from that. Lets consider bottom-up algo to generate Fibonacci sequence. We set dp[0] = 1 and dp[1] = 1 and run a for loop from i = 2 to n.
In top down approach, we start from the "top" view of the problem and go down from there. Consider the recursive function to get n-th Fibonacci number:
def fib(n):
if n <= 1:
return 1
if dp[n] != -1:
return dp[n]
dp[n] = fib(n - 1) + fib(n - 2)
return dp[n]
Here we don't fill the complete table, but only the cases we encounter.
Why I am talking about these 2 types is because when you start learning dp, it is often difficult to come up with bottom-up approaches (like you are trying to). When this happens, first you want to come up with a top-down approach, and then try to get a bottom up solution from that.
So let's create a recursive dp function first:
# let m be size of A
# initialize dp table with all values -1
def solve(i, j, k, n, m):
# first write terminal cases
if k > A[j]:
# this means sequence is invalid. so return 0
return 0
if i >= n:
# this means a valid sequence.
return 1
if dp[i][j][k] != -1:
return dp[i][j][k]
result = 0
for num = 1 to m:
if num == j:
result += solve(i + 1, num, k + 1, n)
else:
result += solve(i + 1, num, 1, n)
dp[i][j][k] = result
return dp[i][j][k]
So we know what terminal cases are. We create a dp table of size dp[n + 1][m][50]. Initialize it with all values 0, not -1.
So we can do bottom-up as:
# initially all values in table are zero. With loop below, we set the valid endings as 1.
# So any state trying to reach valid terminal states will get 1, but invalid states will
# return the values 0
for num = 1 to m:
for occour = 1 to A[num]:
dp[n][num][occour] = 1
# now to build up from bottom, we start by filling n-1 th position
for i = n-1 to 1:
for num = 1 to m:
for occour = 1 to A[num]:
for next_num = 1 to m:
if next_num != num:
dp[i][num][occour] += dp[i + 1][next_num][1]
else:
dp[i][num][occour] += dp[i + 1][num][occour + 1]
The answer will be:
sum = 0
for num = 1 to m:
sum += dp[1][num][1]
I am sure there must be some more elegant dp solution, but I believe this answers your question. Note that I considered that k is the number of times j-th number has been repeated consecutively, correct me if I am wrong with this.
Edit:
With the given constraints the size of the table will be, in the worst case, 10^5 * 6 * 50 = 3e7. This would be > 100MB. It is workable, but can be considered too much space use (I think some kernels doesn't allow that much stack space to a process). One way to reduce it would be to use a hash-map instead of an array with top down approach since top-down doesn't visit all the states. That would be mostly true in this case, for example if A[1] is 2, then all the other states where 1 has occoured more that twice need not be stored. Ofcourse this would not save much space if A[i] has large values, say [50, 50, 50, 50, 50, 50]. Another approach would be to modify our approach a bit. We dont actually need to store the dimension k, i.e. the times j has appeared consecutively:
dp[i][j] = no of ways from i-th position if (i - 1)th position didn't have j and i-th position is j.
Then, we would need to modify our algo to be like:
def solve(i, j):
if i == n:
return 1
if i > n:
return 0
if dp[i][j] != -1
return dp[i][j]
result = 0
# we will first try 1 consecutive j, then 2 consecutive j's then 3 and so on
for count = 1 to A[j]:
for num = 1 to m:
if num != j:
result += solve(i + count, num)
dp[i][j] = result
return dp[i][j]
This approach will reduce our space complexity to O(10^6) ~= 2mb, while time complexity is still the same : O(N * 6 * 50)

Complex Combinatorial Conditions on Dynamic Programming

I am exploring how a Dynamic Programming design approach relates to the underlying combinatorial properties of problems.
For this, I am looking at the canonical instance of the coin change problem: Let S = [d_1, d_2, ..., d_m] and n > 0 be a requested amount. In how many ways can we add up to n using nothing but the elements in S?
If we follow a Dynamic Programming approach to design an algorithm for this problem that would allow for a solution with polynomial complexity, we would start by looking at the problem and how it is related to smaller and simpler sub-problems. This would yield a recursive relation describing an inductive step representing the problem in terms of the solutions to its related subproblems. We can then implement either a memoization technique or a tabulation technique to efficiently implement this recursive relation in a top-down or a bottom-up manner, respectively.
A recursive relation to solve this instance of the problem could be the following (Python 3.6 syntax and 0-based indexing):
def C(S, m, n):
if n < 0:
return 0
if n == 0:
return 1
if m <= 0:
return 0
count_wout_high_coin = C(S, m - 1, n)
count_with_high_coin = C(S, m, n - S[m - 1])
return count_wout_high_coin + count_with_high_coin
This recursive relation yields a correct amount of solutions but disregarding the order. However, this relation:
def C(S, n):
if n < 0:
return 0
if n == 0:
return 1
return sum([C(S, n - coin) for coin in S])
yields a correct amount of solutions while regarding the order.
I am interested in capturing more subtle combinatorial patterns through a recursion relation that can be further optimized via memorization/tabulation.
For example, this relation:
def C(S, m, n, p):
if n < 0:
return 0
if n == 0 and not p:
return 1
if n == 0 and p:
return 0
if m == 0:
return 0
return C(S, m - 1, n, p) + C(S, m, n - S[n - 1], not p)
yields a solution disregarding order but counting only solutions with an even number of summands. The same relation can be modified to regard order and counting number of even number of summands:
def C(S, n, p):
if n < 0:
return 0
if n == 0 and not p:
return 1
if n == 0 and p:
return 0
return sum([C(S, n - coin, not p) for coin in S])
However, what if we have more than 1 person among which we want to split the coins? Say I want to split n among 2 persons s.t. each person gets the same number of coins, regardless of the total sum each gets. From the 14 solutions, only 7 include an even number of coins so that I can split them evenly. But I want to exclude redundant assignments of coins to each person. For example, 1 + 2 + 2 + 1 and 1 + 2 + 1 + 2 are different solutions when order matters, BUT they represent the same split of coins to two persons, i.e. person B would get 1 + 2 = 2 + 1. I am having a hard time coming up with a recursion to count splits in a non-redundant manner.
(Before I elaborate on a possible answer, let me just point out that counting the splits of the coin exchange, for even n, by sum rather than coin-count would be more or less trivial since we can count the number of ways to exchange n / 2 and multiply it by itself :)
Now, if you'd like to count splits of the coin exchange according to coin count, and exclude redundant assignments of coins to each person (for example, where splitting 1 + 2 + 2 + 1 into two equal size parts is only either (1,1) | (2,2), (2,2) | (1,1) or (1,2) | (1,2) and element order in each part does not matter), we could rely on your first enumeration of partitions where order is disregarded.
However, we would need to know the multiset of elements in each partition (or an aggregate of similar ones) in order to count the possibilities of dividing them in two. For example, to count the ways to split 1 + 2 + 2 + 1, we would first count how many of each coin we have:
def partitions_with_even_number_of_parts_as_multiset(n, coins):
results = []
def C(m, n, s, p):
if n < 0 or m <= 0:
return
if n == 0:
if not p:
results.append(s)
return
C(m - 1, n, s, p)
_s = s[:]
_s[m - 1] += 1
C(m, n - coins[m - 1], _s, not p)
C(len(coins), n, [0] * len(coins), False)
return results
Output:
=> partitions_with_even_number_of_parts_as_multiset(6, [1,2,6])
=> [[6, 0, 0], [2, 2, 0]]
^ ^ ^ ^ this one represents two 1's and two 2's
Now since we are counting the ways to choose half of these, we need to find the coefficient of x^2 in the polynomial multiplication
(x^2 + x + 1) * (x^2 + x + 1) = ... 3x^2 ...
which represents the three ways to choose two from the multiset count [2,2]:
2,0 => 1,1
0,2 => 2,2
1,1 => 1,2
In Python, we can use numpy.polymul to multiply polynomial coefficients. Then we lookup the appropriate coefficient in the result.
For example:
import numpy
def count_split_partitions_by_multiset_count(multiset):
coefficients = (multiset[0] + 1) * [1]
for i in xrange(1, len(multiset)):
coefficients = numpy.polymul(coefficients, (multiset[i] + 1) * [1])
return coefficients[ sum(multiset) / 2 ]
Output:
=> count_split_partitions_by_multiset_count([2,2,0])
=> 3
Here is a table implementation and a little elaboration on algrid's beautiful answer. This produces an answer for f(500, [1, 2, 6, 12, 24, 48, 60]) in about 2 seconds.
The simple declaration of C(n, k, S) = sum(C(n - s_i, k - 1, S[i:])) means adding all the ways to get to the current sum, n using k coins. Then if we split n into all ways it can be partitioned in two, we can just add all the ways each of those parts can be made from the same number, k, of coins.
The beauty of fixing the subset of coins we choose from to a diminishing list means that any arbitrary combination of coins will only be counted once - it will be counted in the calculation where the leftmost coin in the combination is the first coin in our diminishing subset (assuming we order them in the same way). For example, the arbitrary subset [6, 24, 48], taken from [1, 2, 6, 12, 24, 48, 60], would only be counted in the summation for the subset [6, 12, 24, 48, 60] since the next subset, [12, 24, 48, 60] would not include 6 and the previous subset [2, 6, 12, 24, 48, 60] has at least one 2 coin.
Python code (see it here; confirm here):
import time
def f(n, coins):
t0 = time.time()
min_coins = min(coins)
m = [[[0] * len(coins) for k in xrange(n / min_coins + 1)] for _n in xrange(n + 1)]
# Initialize base case
for i in xrange(len(coins)):
m[0][0][i] = 1
for i in xrange(len(coins)):
for _i in xrange(i + 1):
for _n in xrange(coins[_i], n + 1):
for k in xrange(1, _n / min_coins + 1):
m[_n][k][i] += m[_n - coins[_i]][k - 1][_i]
result = 0
for a in xrange(1, n + 1):
b = n - a
for k in xrange(1, n / min_coins + 1):
result = result + m[a][k][len(coins) - 1] * m[b][k][len(coins) - 1]
total_time = time.time() - t0
return (result, total_time)
print f(500, [1, 2, 6, 12, 24, 48, 60])

Number of different binary sequences of length n generated using exactly k flip operations

Consider a binary sequence b of length N. Initially, all the bits are set to 0. We define a flip operation with 2 arguments, flip(L,R), such that:
All bits with indices between L and R are "flipped", meaning a bit with value 1 becomes a bit with value 0 and vice-versa. More exactly, for all i in range [L,R]: b[i] = !b[i].
Nothing happens to bits outside the specified range.
You are asked to determine the number of possible different sequences that can be obtained using exactly K flip operations modulo an arbitrary given number, let's call it MOD.
More specifically, each test contains on the first line a number T, the number of queries to be given. Then there are T queries, each one being of the form N, K, MOD with the meaning from above.
1 ≤ N, K ≤ 300 000
T ≤ 250
2 ≤ MOD ≤ 1 000 000 007
Sum of all N-s in a test is ≤ 600 000
time limit: 2 seconds
memory limit: 65536 kbytes
Example :
Input :
1
2 1 1000
Output :
3
Explanation :
There is a single query. The initial sequence is 00. We can do the following operations :
flip(1,1) ⇒ 10
flip(2,2) ⇒ 01
flip(1,2) ⇒ 11
So there are 3 possible sequences that can be generated using exactly 1 flip.
Some quick observations that I've made, although I'm not sure they are totally correct :
If K is big enough, that is if we have a big enough number of flips at our disposal, we should be able to obtain 2n sequences.
If K=1, then the result we're looking for is N(N+1)/2. It's also C(n,1)+C(n,2), where C is the binomial coefficient.
Currently trying a brute force approach to see if I can spot a rule of some kind. I think this is a sum of some binomial coefficients, but I'm not sure.
I've also come across a somewhat simpler variant of this problem, where the flip operation only flips a single specified bit. In that case, the result is
C(n,k)+C(n,k-2)+C(n,k-4)+...+C(n,(1 or 0)). Of course, there's the special case where k > n, but it's not a huge difference. Anyway, it's pretty easy to understand why that happens.I guess it's worth noting.
Here are a few ideas:
We may assume that no flip operation occurs twice (otherwise, we can assume that it did not happen). It does affect the number of operations, but I'll talk about it later.
We may assume that no two segments intersect. Indeed, if L1 < L2 < R1 < R2, we can just do the (L1, L2 - 1) and (R1 + 1, R2) flips instead. The case when one segment is inside the other is handled similarly.
We may also assume that no two segments touch each other. Otherwise, we can glue them together and reduce the number of operations.
These observations give the following formula for the number of different sequences one can obtain by flipping exactly k segments without "redundant" flips: C(n + 1, 2 * k) (we choose 2 * k ends of segments. They are always different. The left end is exclusive).
If we had perform no more than K flips, the answer would be sum for k = 0...K of C(n + 1, 2 * k)
Intuitively, it seems that its possible to transform the sequence of no more than K flips into a sequence of exactly K flips (for instance, we can flip the same segment two more times and add 2 operations. We can also split a segment of more than two elements into two segments and add one operation).
By running the brute force search (I know that it's not a real proof, but looks correct combined with the observations mentioned above) that the answer this sum minus 1 if n or k is equal to 1 and exactly the sum otherwise.
That is, the result is C(n + 1, 0) + C(n + 1, 2) + ... + C(n + 1, 2 * K) - d, where d = 1 if n = 1 or k = 1 and 0 otherwise.
Here is code I used to look for patterns running a brute force search and to verify that the formula is correct for small n and k:
reachable = set()
was = set()
def other(c):
"""
returns '1' if c == '0' and '0' otherwise
"""
return '0' if c == '1' else '1'
def flipped(s, l, r):
"""
Flips the [l, r] segment of the string s and returns the result
"""
res = s[:l]
for i in range(l, r + 1):
res += other(s[i])
res += s[r + 1:]
return res
def go(xs, k):
"""
Exhaustive search. was is used to speed up the search to avoid checking the
same string with the same number of remaining operations twice.
"""
p = (xs, k)
if p in was:
return
was.add(p)
if k == 0:
reachable.add(xs)
return
for l in range(len(xs)):
for r in range(l, len(xs)):
go(flipped(xs, l, r), k - 1)
def calc_naive(n, k):
"""
Counts the number of reachable sequences by running an exhaustive search
"""
xs = '0' * n
global reachable
global was
was = set()
reachable = set()
go(xs, k)
return len(reachable)
def fact(n):
return 1 if n == 0 else n * fact(n - 1)
def cnk(n, k):
if k > n:
return 0
return fact(n) // fact(k) // fact(n - k)
def solve(n, k):
"""
Uses the formula shown above to compute the answer
"""
res = 0
for i in range(k + 1):
res += cnk(n + 1, 2 * i)
if k == 1 or n == 1:
res -= 1
return res
if __name__ == '__main__':
# Checks that the formula gives the right answer for small values of n and k
for n in range(1, 11):
for k in range(1, 11):
assert calc_naive(n, k) == solve(n, k)
This solution is much better than the exhaustive search. For instance, it can run in O(N * K) time per test case if we compute the coefficients using Pascal's triangle. Unfortunately, it is not fast enough. I know how to solve it more efficiently for prime MOD (using Lucas' theorem), but O do not have a solution in general case.
Multiplicative modular inverses can't solve this problem immediately as k! or (n - k)! may not have an inverse modulo MOD.
Note: I assumed that C(n, m) is defined for all non-negative n and m and is equal to 0 if n < m.
I think I know how to solve it for an arbitrary MOD now.
Let's factorize the MOD into prime factors p1^a1 * p2^a2 * ... * pn^an. Now can solve this problem for each prime factor independently and combine the result using the Chinese remainder theorem.
Let's fix a prime p. Let's assume that p^a|MOD (that is, we need to get the result modulo p^a). We can precompute all p-free parts of the factorial and the maximum power of p that divides the factorial for all 0 <= n <= N in linear time using something like this:
powers = [0] * (N + 1)
p_free = [i for i in range(N + 1)]
p_free[0] = 1
for cur_p in powers of p <= N:
i = cur_p
while i < N:
powers[i] += 1
p_free[i] /= p
i += cur_p
Now the p-free part of the factorial is the product of p_free[i] for all i <= n and the power of p that divides n! is the prefix sum of the powers.
Now we can divide two factorials: the p-free part is coprime with p^a so it always has an inverse. The powers of p are just subtracted.
We're almost there. One more observation: we can precompute the inverses of p-free parts in linear time. Let's compute the inverse for the p-free part of N! using Euclid's algorithm. Now we can iterate over all i from N to 0. The inverse of the p-free part of i! is the inverse for i + 1 times p_free[i] (it's easy to prove it if we rewrite the inverse of the p-free part as a product using the fact that elements coprime with p^a form an abelian group under multiplication).
This algorithm runs in O(N * number_of_prime_factors + the time to solve the system using the Chinese remainder theorem + sqrt(MOD)) time per test case. Now it looks good enough.
You're on a good path with binomial-coefficients already. There are several factors to consider:
Think of your number as a binary-string of length n. Now we can create another array counting the number of times a bit will be flipped:
[0, 1, 0, 0, 1] number
[a, b, c, d, e] number of flips.
But even numbers of flips all lead to the same result and so do all odd numbers of flips. So basically the relevant part of the distribution can be represented %2
Logical next question: How many different combinations of even and odd values are available. We'll take care of the ordering later on, for now just assume the flipping-array is ordered descending for simplicity. We start of with k as the only flipping-number in the array. Now we want to add a flip. Since the whole flipping-array is used %2, we need to remove two from the value of k to achieve this and insert them into the array separately. E.g.:
[5, 0, 0, 0] mod 2 [1, 0, 0, 0]
[3, 1, 1, 0] [1, 1, 1, 0]
[4, 1, 0, 0] [0, 1, 0, 0]
As the last example shows (remember we're operating modulo 2 in the final result), moving a single 1 doesn't change the number of flips in the final outcome. Thus we always have to flip an even number bits in the flipping-array. If k is even, so will the number of flipped bits be and same applies vice versa, no matter what the value of n is.
So now the question is of course how many different ways of filling the array are available? For simplicity we'll start with mod 2 right away.
Obviously we start with 1 flipped bit, if k is odd, otherwise with 1. And we always add 2 flipped bits. We can continue with this until we either have flipped all n bits (or at least as many as we can flip)
v = (k % 2 == n % 2) ? n : n - 1
or we can't spread k further over the array.
v = k
Putting this together:
noOfAvailableFlips:
if k < n:
return k
else:
return (k % 2 == n % 2) ? n : n - 1
So far so well, there are always v / 2 flipping-arrays (mod 2) that differ by the number of flipped bits. Now we come to the next part permuting these arrays. This is just a simple permutation-function (permutation with repetition to be precise):
flipArrayNo(flippedbits):
return factorial(n) / (factorial(flippedbits) * factorial(n - flippedbits)
Putting it all together:
solutionsByFlipping(n, k):
res = 0
for i in [k % 2, noOfAvailableFlips(), step=2]:
res += flipArrayNo(i)
return res
This also shows that for sufficiently large numbers we can't obtain 2^n sequences for the simply reason that we can not arrange operations as we please. The number of flips that actually affect the outcome will always be either even or odd depending upon k. There's no way around this. The best result one can get is 2^(n-1) sequences.
For completeness, here's a dynamic program. It can deal easily with arbitrary modulo since it is based on sums, but unfortunately I haven't found a way to speed it beyond O(n * k).
Let a[n][k] be the number of binary strings of length n with k non-adjacent blocks of contiguous 1s that end in 1. Let b[n][k] be the number of binary strings of length n with k non-adjacent blocks of contiguous 1s that end in 0.
Then:
# we can append 1 to any arrangement of k non-adjacent blocks of contiguous 1's
# that ends in 1, or to any arrangement of (k-1) non-adjacent blocks of contiguous
# 1's that ends in 0:
a[n][k] = a[n - 1][k] + b[n - 1][k - 1]
# we can append 0 to any arrangement of k non-adjacent blocks of contiguous 1's
# that ends in either 0 or 1:
b[n][k] = b[n - 1][k] + a[n - 1][k]
# complete answer would be sum (a[n][i] + b[n][i]) for i = 0 to k
I wonder if the following observations might be useful: (1) a[n][k] and b[n][k] are zero when n < 2*k - 1, and (2) on the flip side, for values of k greater than ⌊(n + 1) / 2⌋ the overall answer seems to be identical.
Python code (full matrices are defined for simplicity, but I think only one row of each would actually be needed, space-wise, for a bottom-up method):
a = [[0] * 11 for i in range(0,11)]
b = [([1] + [0] * 10) for i in range(0,11)]
def f(n,k):
return fa(n,k) + fb(n,k)
def fa(n,k):
global a
if a[n][k] or n == 0 or k == 0:
return a[n][k]
elif n == 2*k - 1:
a[n][k] = 1
return 1
else:
a[n][k] = fb(n-1,k-1) + fa(n-1,k)
return a[n][k]
def fb(n,k):
global b
if b[n][k] or n == 0 or n == 2*k - 1:
return b[n][k]
else:
b[n][k] = fb(n-1,k) + fa(n-1,k)
return b[n][k]
def g(n,k):
return sum([f(n,i) for i in range(0,k+1)])
# example
print(g(10,10))
for i in range(0,11):
print(a[i])
print()
for i in range(0,11):
print(b[i])

Resources