Counting the bits set in the Fibonacci number system? - algorithm

We know that every non-negative integer can be represented uniquely as a sum of Fibonacci numbers (here we are concerned with the minimal representation, i.e. no two consecutive Fibonacci numbers are used, and each Fibonacci number is used at most once).
For example:
1 -> 1
2 -> 10
3 -> 100
4 -> 101
Here f1 = 1, f2 = 2 and f(n) = f(n-1) + f(n-2).
So each decimal number can be represented in the Fibonacci system as a binary sequence. If we write all natural numbers successively in the Fibonacci system, we obtain a sequence like this: 110100101… This is called the “Fibonacci bit sequence of natural numbers”.
My task is to count the number of times that bit 1 appears in the first N bits of this sequence. Since N can take any value from 1 to 10^15, can I do this without storing the Fibonacci sequence?
For example: if N is 5, the answer is 3.
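For checking faster approaches on small inputs, a brute-force reference can help. The following is only a sketch of the definition above, not a solution for N up to 10^15; the function names `zeckendorf` and `ones_in_prefix_bruteforce` are made up for this illustration.

```python
def zeckendorf(n):
    """Return the Fibonacci representation of n >= 1 as a bit string."""
    fibs = [1, 2]                      # f1 = 1, f2 = 2, as in the question
    while fibs[-1] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    bits = []
    for f in reversed(fibs):           # greedy: always take the largest term that fits
        if f <= n:
            bits.append("1")
            n -= f
        else:
            bits.append("0")
    return "".join(bits).lstrip("0")


def ones_in_prefix_bruteforce(n_bits):
    """Count the 1 bits among the first n_bits of 1, 10, 100, 101, ... concatenated."""
    parts = []
    length = 0
    number = 1
    while length < n_bits:
        rep = zeckendorf(number)
        parts.append(rep)
        length += len(rep)
        number += 1
    return "".join(parts)[:n_bits].count("1")
```

With this sketch, `ones_in_prefix_bruteforce(5)` reproduces the answer 3 from the example.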

So this is just a preliminary sketch of an algorithm. It works when the upper bound is itself a Fibonacci number, but I'm not sure how to adapt it for general upper bounds. Hopefully someone can improve upon this.
The general idea is to look at the structure of the Fibonacci encodings. Here are the first few numbers:
0
1
10
100
101
1000
1001
1010
10000
10001
10010
10100
10101
100000
The invariant in each of these numbers is that there's never a pair of consecutive 1s. Given this invariant, we can increment from one number to the next using the following pattern:
1. If the last digit is 0, set it to 1.
2. If the last digit is 1, then since there aren't any consecutive 1s, set the last digit to 0 and the next digit to 1.
3. Eliminate any doubled 1s by setting them both to 0 and setting the next digit to a 1, repeating until all doubled 1s are eliminated.
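The increment rules above can be sketched on bit strings as follows. This is an illustrative sketch, assuming the input is a valid representation with no two adjacent 1s; the name `increment` is made up here.

```python
def increment(bits):
    """Return the Fibonacci representation of the next number, as a string."""
    if bits.endswith("0"):
        bits = bits[:-1] + "1"         # last digit 0: set it to 1
    else:
        # last digit 1: the digit before it is 0 (or missing), so "01" becomes "10"
        bits = bits[:-2] + "10"
    while "11" in bits:                # eliminate doubled 1s, lowest pair first
        i = bits.rfind("11")
        head = bits[:i]
        # the digit before the pair becomes 1; grow the string if there is none
        head = head[:-1] + "1" if head else "1"
        bits = head + "00" + bits[i + 2:]
    return bits
```

Starting from "0" and applying `increment` repeatedly walks through the list 0, 1, 10, 100, 101, 1000, ... shown earlier.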
The reason that this is important is that property (3) tells us something about the structure of these numbers. Let's revisit the first few Fibonacci-encoded numbers once more. Look, for example, at the first three numbers:
00
01
10
Now, look at all four-bit numbers:
1000
1001
1010
The next number will have five digits, as shown here:
1011 → 1100 → 10000
The interesting detail to notice is that the number of numbers with four digits is equal to the number of values with up to two digits. In fact, we get the four-digit numbers by just prefixing the at-most-two-digit-numbers with 10.
Now, look at three-digit numbers:
000
001
010
100
101
And look at five-digit numbers:
10000
10001
10010
10100
10101
Notice that the five-digit numbers are just the three-digit numbers with 10 prefixed.
This gives us a very interesting way of counting how many 1s there are. Specifically, each number with exactly k+2 digits is just a number of at most k digits (padded with leading zeros to k digits) with 10 prefixed to it. This means that if there are B 1s in total in all the numbers of at most k digits, the total number of 1s in the numbers with exactly k+2 digits is B plus the number of numbers of at most k digits, since we're just replaying the sequence with an extra 1 prepended to each number.
We can exploit this to compute the number of 1s in the Fibonacci codings that have at most k digits in them. The trick is as follows - if for each number of digits we keep track of
How many numbers have at most that many digits (call this N(d)), and
How many 1s appear in the numbers with at most d digits (call this B(d)),
We can use this information to compute these two pieces of information for one more digit. It's a beautiful DP recurrence. Initially, we seed it as follows. For one digit, N(d) = 2 and B(d) is 1, since for one digit the numbers are 0 and 1. For two digits, N(d) = 3 (there's just one two-digit number, 10, and the two one-digit numbers 0 and 1) and B(d) is 2 (one from 1, one from 10). From there, we have that
N(d + 2) = N(d) + N(d + 1). This is because the number of numbers with up to d + 2 digits is the number of numbers with up to d + 1 digits (N(d + 1)), plus the numbers formed by prefixing 10 to numbers with d digits (N(d))
B(d + 2) = B(d + 1) + B(d) + N(d) (The number of total 1 bits in numbers of length at most d + 2 is the total number of 1 bits in numbers of length at most d + 1, plus the extra we get from numbers of just d + 2 digits)
For example, we get the following:
d    N(d)   B(d)
----------------
1    2      1
2    3      2
3    5      5
4    8      10
5    13     20
We can actually check this. For 1-digit numbers, there are a total of 1 one bit used. For 2-digit numbers, there are two ones (1 and 10). For 3-digit numbers, there are five 1s (1, 10, 100, 101). For four-digit numbers, there are 10 ones (the five previous, plus 1000, 1001, 1010). Extending this outward gives us the sequence that we'd like.
This is extremely easy to compute - we can compute the value for k digits in time O(k) with just O(1) memory usage if we reuse space from before. Since the Fibonacci numbers grow exponentially quickly, this means that if we have some number N and want to find the sum of all 1s bits to the largest Fibonacci number smaller than N, we can do so in time O(log N) and space O(1).
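As a minimal sketch of the recurrence for N(d) and B(d), using O(1) extra memory as described (the function name `counts_up_to_digits` is made up for this illustration):

```python
def counts_up_to_digits(max_digits):
    """Return (N(d), B(d)) for d = max_digits >= 1.

    N(d): how many numbers (including 0) have at most d digits.
    B(d): how many 1 bits those numbers contain in total.
    """
    n_prev, b_prev = 2, 1              # d = 1: the numbers 0 and 1
    if max_digits == 1:
        return n_prev, b_prev
    n_cur, b_cur = 3, 2                # d = 2: the numbers 0, 1 and 10
    for _ in range(max_digits - 2):
        # N(d+2) = N(d) + N(d+1);  B(d+2) = B(d+1) + B(d) + N(d)
        n_prev, n_cur, b_prev, b_cur = (
            n_cur, n_prev + n_cur, b_cur, b_prev + b_cur + n_prev)
    return n_cur, b_cur
```

Calling it for d = 1 to 5 gives the same rows as the table above.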
That said, I'm not sure how to adapt this to work with general upper bounds. However, I'm optimistic that there is some way to do it. This is a beautiful recurrence and there just has to be a nice way to generalize it.
Hope this helps! Thanks for an awesome problem!

Let's solve 3 problems. Each is harder than the previous one, and each uses the result of the previous one.
1. How many ones are set if you write down every number from 0 to fib[i]-1?
Call this dp[i]. Let's look at the numbers
0
1
10
100
101
1000
1001
1010 <-- we want to count ones up to here
10000
If you write all numbers up to fib[i]-1, first you write all numbers up to fib[i-1]-1 (dp[i-1]), then you write the last block of numbers. There are exactly fib[i-2] of those numbers, each has a one on the first position, so we add fib[i-2], and if you erase those ones
000
001
010
then remove leading zeros, you can see that each number from 0 to fib[i-2]-1 is written down. The number of ones there is equal to dp[i-2], which gives us:
dp[i] = fib[i-2] + dp[i-2] + dp[i-1];
2. How many ones are set if you write down every number from 0 to n.
0
1
10
100
101
1000
1001 <-- we want to count ones up to here
1010
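Let's call this solNumber(n).
Suppose that your number is f[i] + x, where f[i] is the largest Fibonacci number not exceeding n. Then the answer is dp[i] + (x + 1) + solNumber(x): dp[i] counts the ones in 0 .. f[i]-1, each of the x + 1 numbers f[i] .. f[i]+x has a leading one, and the remaining digits of those numbers are the representations of 0 .. x. This can be proved in the same way as in point 1.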
3. How many ones are set in first n digits.
3a. How many numbers have a representation of length exactly l?
The answer is fib[l-1], using the indexing of the code below (fib[0] = fib[1] = 1); this gives 1 for l = 1 and l = 2.
To see this, erase the leading one and the zero that must follow it: the l-2 digits that remain run through every number from 0 to fib[l-1]-1, exactly fib[l-1] numbers.
//End of 3a
Now find the number m such that writing all numbers from 1 to m takes at most n digits in total, while writing all numbers from 1 to m+1 takes more than n. Count manually the ones in the leading digits of m+1 that still fit, and add solNumber(m).
All 3 problems are solved in O(log n).
#include <iostream>
using namespace std;
#define FOR(i, a, b) for(int i = a; i < b; ++i)
#define RFOR(i, b, a) for(int i = b - 1; i >= a; --i)
#define REP(i, N) FOR(i, 0, N)
#define RREP(i, N) RFOR(i, N, 0)
typedef long long Long;
//80 entries cover N up to 10^15: fib[79] is about 2.3e16 and dp[79] still fits in a long long
const int MAXL = 80;
long long fib[MAXL];
//dp[i]: how many ones appear if you write down the representations of 0 .. fib[i]-1
long long dp[MAXL];
void buildDP()
{
fib[0] = 1;
fib[1] = 1;
FOR(i,2,MAXL)
fib[i] = fib[i-1] + fib[i-2];
dp[0] = 0;
dp[1] = 0;
dp[2] = 1;
FOR(i,3,MAXL)
dp[i] = fib[i-2] + dp[i-2] + dp[i-1];
}
//How many ones appear if you write down the representations of 0 .. n
Long solNumber(Long n)
{
if(n == 0)
return n;
Long res = 0;
RREP(i,MAXL)
if(n>=fib[i])
{
n -= fib[i];
res += dp[i];
res += (n+1);
}
return res;
}
int solManual(Long num, Long n)
{
int cr = 0;
RREP(i,MAXL)
{
if(n == 0)
break;
if(num>=fib[i])
{
num -= fib[i];
++cr;
}
if(cr != 0)
--n;
}
return cr;
}
Long num(int l)
{
if(l<=2)
return 1;
return fib[l-1];
}
Long sol(Long n)
{
//length of fibonacci representation
int l = 1;
//total accumulated length (Long, since it can exceed the int range for large n)
Long cl = 0;
while(num(l)*l + cl <= n)
{
cl += num(l)*l;
++l;
}
//Number of digits, that represent numbers with maxlength
Long nn = n - cl;
//Number of full numbers;
Long t = nn/l;
//The last full number
n = fib[l] + t-1;
return solNumber(n) + solManual(n+1, nn%l);
}
int main(int argc, char** argv)
{
ios_base::sync_with_stdio(false);
buildDP();
Long n;
while(cin>>n)
cout<<"ANS: "<<sol(n)<<endl;
return 0;
}

Compute m, the number responsible for the (N+1)th bit of the sequence. Compute the contribution of m to the count.
We have reduced the problem to counting the number of one bits in the range [1, m). In the style of interval trees, partition this range into O(log N) subranges, each having an associated glob like 10100???? that matches the representations of exactly the numbers belonging to that range. It is easy to compute the contribution of the prefixes.
We have reduced the problem to counting the total number T(k) of one bits in all Fibonacci words of length k (i.e., the ???? part of the globs). T(k) is given by the following recurrence, where F(j) is the number of Fibonacci words of length j (F(0) = 1, F(1) = 2).
T(0) = 0
T(1) = 1
T(k) = T(k - 1) + T(k - 2) + F(k - 2)
Mathematica says there's a closed form solution, but it looks awful and isn't needed for this polylog(N)-time algorithm.
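The glob decomposition and the recurrence for T(k) can be combined as in the following sketch. It counts the one bits in the representations of 0 .. m-1, assuming the weights 1, 2, 3, 5, ... from the question; the names `ones_below`, `words` and `total` are made up for this illustration, and locating m from N is left out.

```python
def ones_below(m):
    """Total number of 1 bits in the Fibonacci representations of 0 .. m-1."""
    fibs = [1, 2]                      # fibs[i] is the weight of digit position i
    while fibs[-1] <= m:
        fibs.append(fibs[-1] + fibs[-2])
    k_max = len(fibs)
    words = [1, 2]                     # words[k]: Fibonacci words of length k
    total = [0, 1]                     # total[k]: T(k), 1 bits over all those words
    for k in range(2, k_max + 1):
        words.append(words[k - 1] + words[k - 2])
        total.append(total[k - 1] + total[k - 2] + words[k - 2])
    result = 0
    prefix_ones = 0                    # 1 digits of m already passed
    rest = m
    for i in range(k_max - 1, -1, -1):
        if fibs[i] <= rest:
            # glob: the digits of m above position i, a 0 at position i,
            # then i free digits forming any Fibonacci word
            result += prefix_ones * words[i] + total[i]
            prefix_ones += 1
            rest -= fibs[i]
    return result
```

Each 1 digit of m contributes one glob, so the loop does O(log m) work after the tables are built.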

This is not a full answer but it does outline how you can do this calculation without using brute force.
The Fibonacci representation of Fn is a 1 followed by n-1 zeros.
For the numbers from Fn up to but not including F(n+1), the number of 1's consists of two parts:
There are F(n-1) such numbers, so there are F(n-1) leading 1's.
The digits after the leading 1 are just the Fibonacci representations of all numbers up to but not including F(n-1).
So, if we call the total number of 1 bits in the sequence up to but not including the nth Fibonacci number a(n), then we have the following recursion:
a(n+1) = a(n) + F(n-1) + a(n-1)
You can also easily get the number of bits in the sequence up to Fn.
If it takes k Fibonacci numbers to get to (but not pass) N, then you can count those bits with the above formula, and after some further manipulation reduce the problem to counting the number of bits in the remaining sequence.

[Edit]: Basically I have used the property that any number n to be represented in Fibonacci base can be split as n = x + (n - x), where x is the largest Fibonacci number not exceeding n. Applying this repeatedly breaks any number into its bits.
The first step is finding the decimal number in which the Nth bit falls.
All numbers from Fibonacci number F(n) up to, but not including, F(n+1) have the same number of bits. Using this, we can pre-calculate a table and find the appropriate number.
Let's say D is the decimal number that contains the Nth bit.
Now, let X be the largest Fibonacci number less than or equal to D.
To find the set bits for all numbers from 1 to D, we represent the numbers from X to D as
X+0, X+1, X+2, ..., X+(D-X). Each of these has a leading 1 contributed by X, so we have broken the problem into a much smaller sub-problem: we need to find all set bits up to D-X. We keep doing this recursively. Using the same logic, we can build a table holding the set-bit count for all Fibonacci numbers (up to the limit). We use this table to find the number of set bits from 1 to X.
So,
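Findsetbits(D) { // finds number of set bits from 1 to D.
if (D == 0) return 0; // nothing left to count
find X; // largest Fibonacci number less than or equal to D
ans = tablesetbits[X]; // must cover only the numbers below X, since X's own 1 is counted on the next line
ans += 1 * (D-X+1); // leading 1s of X+0, X+1, ..., D
ans += Findsetbits(D-X);
return ans;
}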
I tried some examples by hand and saw the pattern.
I have coded a rough solution which I have checked by hand for N <= 35. It works pretty fast for large numbers, though I can't be sure that it is correct. If it is an online judge problem, please give the link to it.
#include<iostream>
#include<vector>
#include<map>
#include<algorithm>
using namespace std;
#define pb push_back
typedef long long LL;
vector<LL>numbits;
vector<LL>fib;
vector<LL>numones;
vector<LL>cfones;
void init() {
fib.pb(1);
fib.pb(2);
int i = 2;
LL c = 1;
while ( c < 100000000000000LL ) {
c = fib[i-1] + fib[i-2];
i++;
fib.pb(c);
}
}
LL answer(LL n) {
if (n <= 3) return n;
int a = (lower_bound(fib.begin(),fib.end(),n))-fib.begin();
int c = 1;
if (fib[a] == n) {
c = 0;
}
LL ans = cfones[a-1-c] ;
return ans + answer(n - fib[a-c]) + 1 * (n - fib[a-c] + 1);
}
int fillarr(vector<int>& a, LL n) {
if (n == 0)return -1;
if (n == 1) {
a[0] = 1;
return 0;
}
int in = lower_bound(fib.begin(),fib.end(),n) - fib.begin(),v=0;
if (fib[in] != n) v = 1;
LL c = n - fib[in-v];
a[in-v] = 1;
fillarr(a, c);
return in-v;
}
int main() {
init();
numbits.pb(1);
int b = 2;
LL c;
for (int i = 1; i < fib.size()-2; i++) {
c = fib[i+1] - fib[i] ;
c = c*(LL)b;
b++;
numbits.pb(c);
}
for (int i = 1; i < numbits.size(); i++) {
numbits[i] += numbits[i-1];
}
numones.pb(1);
cfones.pb(1);
numones.pb(1);
cfones.pb(2);
numones.pb(1);
cfones.pb(5);
for (int i = 3; i < fib.size(); i++ ) {
LL c = 0;
c += cfones[i-2]+ 1 * fib[i-1];
numones.pb(c);
cfones.pb(c + cfones[i-1]);
}
for (int i = 1; i < numones.size(); i++) {
numones[i] += numones[i-1];
}
LL N;
cin>>N;
if (N == 1) {
cout<<1<<"\n";
return 0;
}
// find the integer just before Nth bit
int pos;
for (int i = 0;; i++) {
if (numbits[i] >= N) {
pos = i;
break;
}
}
LL temp = (N-numbits[pos-1])/(pos+1);
LL temp1 = (N-numbits[pos-1]);
LL num = fib[pos]-1 + (temp1>0?temp+(temp1%(pos+1)?1:0):0);
temp1 -= temp*(pos+1);
if(!temp1) temp1 = pos+1;
vector<int>arr(70,0);
int in = fillarr(arr, num);
int sub = 0;
for (int i = in-(temp1); i >= 0; i--) {
if (arr[i] == 1)
sub += 1;
}
cout<<"\nNumber answer "<<num<<" "<<answer(num) - sub<<"\n";
return 0;
}

Here is O((log n)^3).
Let's compute how many numbers fit in the first N bits.
Imagine that we have a function:
long long number_of_all_bits_in_sequence(long long M);
It computes the length of the "Fibonacci bit sequence of natural numbers" formed by all numbers not greater than M.
With this function we can use binary search to find how many numbers fit in the first N bits.
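A possible sketch of this step follows. Here `number_of_all_bits_in_sequence` is computed directly by grouping numbers by representation length rather than through `kth_bit_equal_1`, which is an assumption of this example and not necessarily what the answer intends; `whole_numbers_in_prefix` is a made-up name.

```python
def number_of_all_bits_in_sequence(m):
    """Total length of the Fibonacci representations of 1 .. m written back to back."""
    total = 0
    lo, hi = 1, 2                      # the numbers lo .. hi-1 have `length` digits
    length = 1
    while lo <= m:
        count = min(hi, m + 1) - lo    # how many of them are <= m
        total += count * length
        lo, hi = hi, lo + hi           # next pair of consecutive Fibonacci numbers
        length += 1
    return total


def whole_numbers_in_prefix(n_bits):
    """Largest M such that the numbers 1 .. M fit entirely into the first n_bits bits."""
    lo, hi = 0, 1
    while number_of_all_bits_in_sequence(hi) <= n_bits:
        hi *= 2                        # exponential search for an upper bound
    while hi - lo > 1:                 # invariant: bits(lo) <= n_bits < bits(hi)
        mid = (lo + hi) // 2
        if number_of_all_bits_in_sequence(mid) <= n_bits:
            lo = mid
        else:
            hi = mid
    return lo
```

Both loops take a logarithmic number of steps, so this stays fast even for N = 10^15.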
How many bits are 1s in the representations of the first M numbers
Let's create a function which calculates how many numbers <= M have a 1 at the k-th bit.
long long kth_bit_equal_1(long long M, int k);
First, let's precompute the results of this function for all small values, let's say M <= 1000000.
Implementation for M > PREPROCESS_LIMIT:
long long kth_bit_equal_1(long long M, int k) {
if (M <= PREPROCESS_LIMIT) return preprocess_result[M][k];
long long fib_number = greatest_fib_which_isnt_greater_than(M);
int fib_index = index_of_fib_in_fibonnaci_sequence(fib_number);
if (fib_index < k) {
// all numbers are smaller than k-th fibbonacci number
return 0;
}
if (fib_index == k) {
// only numbers between [fib_number, M] have k-th bit set to 1
return M - fib_number + 1;
}
if (fib_index > k) {
long long result = 0;
// all numbers in [fib_number, M] have the bit at fib_index set to 1,
// so let's subtract fib_number from all numbers in this interval;
// the interval then becomes [0, M - fib_number]
// let's calculate how many numbers in this interval have the k-th bit set.
result += kth_bit_equal_1(M - fib_number, k);
// don't forget about remaining numbers (interval [1, fib_number - 1])
result += kth_bit_equal_1(fib_number - 1, k);
return result;
}
}
The complexity of this function is O(M / PREPROCESS_LIMIT).
Notice that in the recurrence one of the arguments is always one less than a Fibonacci number:
kth_bit_equal_1(fib_number - 1, k);
So if we memoize all computed results, the complexity improves to T(N) = T(N/2) + O(1), i.e. T(N) = O(log N).
Let's get back to number_of_all_bits_in_sequence.
We can slightly modify kth_bit_equal_1 so that it also counts bits equal to 0.

Here's a way to count all the one digits in the set of numbers up to a given digit length bound. This seems to me to be a reasonable starting point for a solution
Consider 10 digits. Start by writing;
0000000000
Now we can turn some number of these zeros into ones, keeping the last digit always as a 0. Consider the possibilities case by case.
0 There's just one way to choose 0 of these to be ones. Summing the 1-bits in this one case gives 0.
1 There are {9 choose 1} ways to turn one of the zeros into a one. Each of these contributes 1.
2 There are {8 choose 2} ways to turn two of the zeros into ones. Each of these contributes 2.
...
5 There are {5 choose 5} ways to turn five of the zeros into ones. Each of these contributes 5 to the bit count.
It's easy to think of this as a tiling problem. The string of 10 zeros is a 10x1 board, which we want to tile with 1x1 squares and 2x1 dominoes. Choosing some number of the zeros to be ones is then the same as choosing some of the tiles to be dominoes. My solution is closely related to Identity 4 in "Proofs that really count" by Benjamin and Quinn.
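The case-by-case count can be written as a single sum: with n digits and the last digit fixed at 0, there are C(n - j, j) ways to place j non-adjacent ones, each contributing j. The sketch below compares that sum with a direct enumeration; both function names are made up for this illustration.

```python
from itertools import product
from math import comb


def ones_by_binomials(n):
    """Sum of j * C(n - j, j): total 1 bits over all valid strings of n digits."""
    return sum(j * comb(n - j, j) for j in range(n // 2 + 1))


def ones_by_enumeration(n):
    """Same total, found by listing every n-digit string that ends in 0."""
    total = 0
    for bits in product("01", repeat=n - 1):
        word = "".join(bits) + "0"     # the last digit is always 0
        if "11" not in word:           # no two adjacent ones
            total += word.count("1")
    return total
```

For the 10-digit example this gives 9 + 56 + 105 + 60 + 5 = 235.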
Second step: now try to use the above construction to solve the original problem.
Suppose we want to count the one bits in the first 100100010 bits (the number is in Fibonacci representation, of course). Start by overcounting the sum for all ways to replace the x's with zeros and ones in 10xxxxx0. To overcompensate for the overcounting, subtract the count for 10xxx0. Continue the procedure of overcounting and overcompensation.

This problem has a dynamic programming solution, as illustrated by the tested algorithm below.
Some points to keep in mind, which are evident in the code:
The best solution for each number i is obtained by using the Fibonacci number f where f == i,
OR, where no such f exists, the largest Fibonacci number f less than i together with the remainder n = i - f, so that i = f + n.
Note that the fib sequence is memoized over the entire algorithm.
public static int[] fibonacciBitSequenceOfNaturalNumbers(int num) {
int[] setBits = new int[num + 1];
setBits[0] = 0;//anchor case of fib seq
setBits[1] = 1;//anchor case of fib seq
int a = 1, b = 1;//anchor case of fib seq
for (int i = 2; i <= num; i++) {
int c = b;
while (c < i) {
c = a + b;
a = b;
b = c;
}//fib
if (c == i) {
setBits[i] = 1;
continue;
}
// a is now the largest Fibonacci number below i; the remainder is i - a
int tmp = i - a;
setBits[i] = 1 + setBits[tmp];
}//done
return setBits;
}
Test with:
public static void main(String... args) {
int[] arr = fibonacciBitSequenceOfNaturalNumbers(23);
//print result
for(int i=1; i<arr.length; i++)
System.out.format("%d has %d%n", i, arr[i]);
}
RESULT OF TEST: i has x set bits
1 has 1
2 has 1
3 has 1
4 has 2
5 has 1
6 has 2
7 has 2
8 has 1
9 has 2
10 has 2
11 has 2
12 has 3
13 has 1
14 has 2
15 has 2
16 has 2
17 has 3
18 has 2
19 has 3
20 has 3
21 has 1
22 has 2
23 has 2
EDIT BASED ON COMMENT:
//to return total number of set between 1 and n inclusive
//instead of returning as in original post, replace with this code
int total = 0;
for(int i: setBits)
total+=i;
return total;

Related

Sort numbers in lexicographical order of their prime factors

I am trying to solve an algorithmic problem which requires me to sort a list of numbers of size N (N <= 1e6) based on the lexicographical order of their prime factors. Each number in the list is in [2,1e6].
Link to problem.
For example,
2 3 4 5 6
would be sorted to:
2 4 6 3 5
Their prime factors are shown below:
2 = 2
3 = 3
4 = 2 * 2
5 = 5
6 = 2 * 3
My attempt:
I am able to devise a correct solution for this by using an O(log n) prime factorization method on each of the numbers and storing the results in a 1e6 * 21 2d array, because a number <= 1e6 has fewer than 20 prime factors (2^20 > 1e6).
Thereafter I sort each of the numbers using the lexographical order of these prime factors.
My program is able to run well under the time limit of 2 seconds but uses too much memory (the memory limit is 32mb).
Could someone please advise me on a better way to solve this problem?
p.s. This problem was tagged with "depth-first-search" but I can't see how this would work anyway.
This sounds like a partitioning problem to me. The first step would be to partition the array so that the numbers that divide by 2 come first. Then partition that group by the ones that divide by 2 a second time. Recurse until you have an empty subgroup. Now do it again with a divisor of 3. Continue up the list of primes until you reach sqrt(1e6) or you've found all the divisors for each number.
Since you are already well within your time limit, you just need a more efficient way to store all the prime factors for each number so that you can easily look them up. You can do this with a 2-level lookup table.
In one table, store the numbers up to 1e3 (1000) as you are storing them now. This will require a 1e3 x 10 2d array (1000 x 10 x 4 = 40000 bytes).
To store all the prime factors for numbers up to 1e6 you need to store up to 3 numbers for each one (that's 12 million bytes). To compute the 3 numbers, start with the list of prime factors and multiply them back together until you can't multiply another one without going over 1000. Store that in the first entry, then do the same with remainder and store in the second number and if you have any left over just put it in the 3rd position (you'll never need more than 3 - if you had 4 it would mean the last two multiplied together would be over 1000, which would imply the first 2 multiplied together would be < 1000, in which case they wouldn't be stored separately). If there is a prime factor over 1000 in the list you only need 2 because all the others will multiply to < 1000.
To retrieve the original list of prime factors for an entry take each of your three numbers (which will be prime or composite numbers 1000 or less or prime numbers > 1000), if they are under 1000 look up their prime factors in the small table and if not take them as is, and you can reconstruct the list.
For example, to store 515130 (2*3*5*7*11*223):
1st number: 210 (2*3*5*7) can't multiply by 11 without going over 1000
2nd number: 11 (prime) can't multiply by 223 without going over
3rd number: 223
667023 (3*7*23*1381)
1st: 483 (3*7*23)
2nd: 1381 (prime)
This will work even if the list of prime factors is unsorted.
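The packing scheme can be sketched as follows. This is an illustration only: trial division stands in for both the fast factorization and the small lookup table, and the names `LIMIT`, `prime_factors`, `pack` and `unpack` are made up here.

```python
LIMIT = 1000                           # entries at or below this go through the small table


def prime_factors(n):
    """Prime factors of n in non-decreasing order (plain trial division)."""
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors


def pack(factors):
    """Greedily multiply factors together while the product stays <= LIMIT."""
    packed, cur = [], 1
    for p in factors:
        if cur * p <= LIMIT:
            cur *= p
        else:
            if cur > 1:
                packed.append(cur)
            cur = p
    if cur > 1:
        packed.append(cur)
    return packed


def unpack(packed):
    """Rebuild the factor list; a value above LIMIT is itself a prime."""
    factors = []
    for value in packed:
        factors.extend(prime_factors(value) if value <= LIMIT else [value])
    return factors
```

For the two examples above, `pack` yields [210, 11, 223] and [483, 1381], and `unpack` returns the original factor lists.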
Actually, a simple modification to my algorithm did the trick. Rather than storing the prime factorization of each integer, I made use of what I already had, which is a prime factorization method that works in O(logn). I created a custom sort method that uses the factorization method to factorize two integers while I compared their prime factors. Hence, the time complexity remained the same and there was no need to store the prime factorization of any integer.
For those curious to know how this fast factorization method works, here is a link (see the method that uses lowest divisors).
For future readers who face the same problem, here is my accepted code:
#include<cstdio>
#include<algorithm>
#define FACTOR_LIM ((int) 1e6 + 2) // size limit for the sieve built by preFactor(n); parenthesized so the macro expands safely
using namespace std;
int lowestDiv[FACTOR_LIM+1], a[FACTOR_LIM], n;
void preFactor(int n) {
int root = 2;
for(int i = 2; i*i <= n; i++) {
if(lowestDiv[i]) continue;
root = lowestDiv[i] = i;
for(int j = i*i; j <= n; j+=i) {
lowestDiv[j] = (lowestDiv[j]) ? lowestDiv[j] : i;
}
}
for(int i = root; i <= n; i++) {
if(!lowestDiv[i]) {
lowestDiv[i] = i;
}
}
}
bool cmp(const int i, const int j) {
int x = i, y = j;
while (x != y) {
int p = lowestDiv[x];
int q = lowestDiv[y];
if (p != q) return p < q;
x /= p;
y /= q;
}
return false;
}
int main() {
preFactor(FACTOR_LIM-1);
scanf("%d",&n);
for(int i = 0; i < n; i++) {
scanf("%d",&a[i]);
}
sort(a,a+n,cmp);
for(int i = 0; i < n; i++) {
printf("%d\n",a[i]);
}
return 0;
}

More efficient "First K numbers, that their digit sum is S" algorithm

The whole problem reads:
"We have 2 numbers on input, K and S. We want to print the first (lowest) K numbers whose digit sum is exactly S."
There is an easy naive algorithm for this problem (which I was able to construct). Its principle is to have a function bigint digitSum(i) (I write bigint because S is not limited in any way, and I just want a more efficient algorithm) which returns the digit sum of its argument. We start from 0 and keep incrementing by 1, passing each number to the function. If the function returns a sum equal to S, we print that number, and we continue until we have printed K numbers.
Function code is here:
bigint digitSum(bigint number){
bigint total = 0;
while(number > 0)
{
total += number % 10;
number /= 10;
}
return total;
}
The asymptotic complexity of the algorithm is the cost of searching through the numbers 0, 1, 2, 3, ..., n until we find exactly K matching numbers, multiplied by the cost of the digit-sum function, which is logarithmic because it divides the number by 10 in every step.
Is there any algorithm or way to make it more efficient? Thanks!
This is a recursive algorithm that will give you the K smallest numbers whose digits sum up to S. The complexity is definitely better than your brute force algorithm, although I'm not sure what it would be in big O notation.
The algorithm goes as follows:
Find all the combinations of nDigits=1 that sum up to S
Then nDigits=2, nDigits=3, ... until count == K
all combinations of nDigit → all combinations of 1 + all combinations of nDigit-1
Here's the code in Java:
public static void main(String[] args) {
int[] currentCount = {0};
int k = 10, s = 10;
for(int n = s/9 ; currentCount[0] != k ; n++) {
digitSum(new StringBuilder(), n, 0, s, k, currentCount);
}
}
public static void digitSum(StringBuilder subNumber, int nDigit, int currentSum, int s, int k, int[] currentCount) {
if(nDigit == 0) {
if(currentSum == s) {
System.out.println(subNumber);
currentCount[0]++;
}
return;
}
if(currentCount[0] == k) return; //if already have k numbers, terminate
int remaining = s-currentSum;
if(remaining > nDigit*9) return; //if not enough digits to reach S, terminate
final int bound = Integer.min(9, remaining); //what's the largest valid digit
//zero digit is only valid if subNumber != 0
if(subNumber.length()!=0) digitSum(new StringBuilder(subNumber).append('0'), nDigit-1, currentSum, s, k, currentCount);
for(int i = 1 ; i <= bound ; i++) digitSum(new StringBuilder(subNumber).append((char)(i+'0')), nDigit-1, currentSum+i, s, k, currentCount);
}
EDIT
I roughly measured the time complexity, and it looks like O(N) (the plot that followed here is not preserved).
To solve such problems, you have to catch some regularities. For example, build a sequence of the first numbers, for which digit sum is S
S      0  1  ..  9  10  11  12  ..  18  19   20   ..  31    ...
F(S)   0  1  ..  9  19  29  39  ..  99  199  299  ..  4999  ...
We can see that the first number could be found using values
M = S div 9
R = S mod 9
as
F(S) = R(9xM) ////concatenation of digit R and M 9s
for S=31 M=3,R=4, and
F(31) = 4(9x3) = 4999 //concatenation of 4 and three nines
So we can determine the first needed number in O(1).
Then work out rules for finding the next number with the same digit sum (note that often N(i+1) = N(i) + 9).
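The formula for the first number with digit sum S can be sketched and checked against a direct search like this (the function names are made up for this illustration; only the first number is covered, not the rules for the following ones):

```python
from itertools import count


def first_with_digit_sum(s):
    """Smallest non-negative integer whose digits sum to s: digit R followed by M nines."""
    m, r = divmod(s, 9)                # M = S div 9, R = S mod 9
    return int(str(r) + "9" * m)       # int() drops the leading zero when R == 0


def first_with_digit_sum_bruteforce(s):
    """Same value found by trying 0, 1, 2, ... in order."""
    return next(n for n in count() if sum(int(c) for c in str(n)) == s)
```

For S = 31 this gives M = 3 and R = 4, so the result is 4999, as in the example above.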

Use of masks in SRM 655 Div 2 Hard on Topcoder

The question goes as follows...
Bob's little sister Alice is nine years old. Bob is testing her mathematical prowess by asking her to compute the remainder a number gives when divided by 9.
Today, Bob gave Alice exactly N such questions. We will number the questions 0 through N-1. In each question, Bob gave Alice the same M-digit number. (Note that Bob's number is allowed to have some leading zeros.)
In some of those cases Alice may have skipped some of the digits when reading the number. However, she never made any other mistakes in her calculations. For example, if Bob gave Alice the number 012345 three times, she may have read it as 0145 the first time, 012345 the second time, and 135 the third time. Then, her answers would be 145 modulo 9 = 1, 12345 modulo 9 = 6, and 135 modulo 9 = 0.
You are given the int N and a int[] d with M elements. For each i, the number d[i] corresponds to the digit of the order 10^i in Bob's number. For each i and j, Alice read digit i when answering question j if and only if bit number j of the number d[i] is 1.
For example, suppose that d[3] = 6. In binary, 6 is 110. In other words, the binary digits number 0, 1, and 2 are 0, 1, and 1. Hence, Alice skipped the corresponding digit in question 0 but she read it in questions 1 and 2.
A surprising thing happened in today's experiment: For each of the N questions, Alice's answer was that the remainder is 0. Bob found that interesting. He now wonders: given N and d, how many different M-digit numbers have this property?
Let X be the answer to Bob's question. Compute and return the value (X modulo 1,000,000,007).
After thinking about the solution for quite a while without any idea turning up, I referred to the editorial given below.
Add, don't multiply
When problems ask you to do things using modulo 9 specifically (ask for the remainder when dividing by 9), it is worth knowing a special property of modulo 9 and digits. 10 ≡ 1 (mod 9), which means that all powers of 10 are also 1 modulo 9. As a result, a quick way to get a number modulo 9 is to first add its digits and then take the remainder after dividing by 9. For example: 4671 mod 9 ≡ (4⋅1000 + 6⋅100 + 7⋅10 + 1) mod 9 ≡ (4⋅1 + 6⋅1 + 7⋅1 + 1⋅1) mod 9 ≡ (4 + 6 + 7 + 1) mod 9.
The N modulos
Let's think of deciding each digit in order from 0 to M−1. There are N questions and each of them represents the total sum of some of the digits of the number modulo 9. We need the final sum for each of the questions to be 0, but while filling each digit the current sums might vary and be distinct to 0.
Let's think of this as a state problem. We start with an empty M digit number and we wish to fill it. Initially, all the sums of digits for each of the N questions are 0. Imagine we decide that digit with index 0 will be 7. This means that some of the sums s will become (s+7)mod9. We do not need to remember the whole sum s, just the modulo 9 of it. Some other sums will stay at 0. It all depends whether or not 0 is included in the specific question's sum.
This allows us to think of a recurrence like : f(s0,s1,...,sN−1,p). Which will give us the number of ways to fill the digits with indexes greater than or equal to p, such that si is the current sum for each question.
Base case: p=M, this means we have ran out of digits. We no longer need to add any digits and the sums cannot change anymore, all si must be equal to 0.
Else we can try each digit i as the digit that will be used in position p of the number and see how the state changes. The new sums will be s'0,s'1,...s'N−1 and they will add i or not depending if digit #p is included in the respective sum. The number of ways to fill the remaining positions is: f(s'0,s'1,...s'N−1,p+1). We need to add this result for each of the digit we try.
This recurrence solves the problem. It is acyclic and the state size is not very large. Note that each si can be one of 9 numbers (results modulo 9), so there are at most 9^5 different combinations of si. p is O(M), M≤20. For each state we need to try 10 digits and update 5 sums. If we use memoization or implement the function iteratively, a worst case will look like 9^5⋅20⋅10⋅5, which is very appropriate for the time limit.
Implementing the memoization might be complicated because of the flexible number of arguments. We can work around this in two ways. One is assuming there are always 5 questions, but 5−N of the questions ignore ALL digits.
The other is to use base-9 encoding to represent the states.
This is very similar to using bit-masks (base 2 numbers) but with base 9.
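To make the base-9 encoding concrete: the N running sums, each in 0..8, are treated as the digits of one base-9 integer, so a single number identifies the whole tuple (s0, ..., sN−1). The sketch below shows packing, unpacking, and a small memoized recurrence in the spirit of f(mask, p); the helper names `encode`, `decode` and `count` are made up for this illustration and are not the editorial's code.

```python
from functools import lru_cache

MOD = 1_000_000_007


def encode(sums):
    """Pack residues (each 0..8) into one integer; sums[0] is the lowest base-9 digit."""
    mask = 0
    for s in reversed(sums):
        mask = mask * 9 + s
    return mask


def decode(mask, n):
    """Inverse of encode: read n base-9 digits, lowest first."""
    sums = []
    for _ in range(n):
        mask, s = divmod(mask, 9)
        sums.append(s)
    return sums


def count(n, d):
    """Number of digit strings for which all n masked digit sums are 0 mod 9."""
    @lru_cache(maxsize=None)
    def f(mask, p):
        if p == len(d):
            return 1 if mask == 0 else 0      # every sum must be 0 at the end
        sums = decode(mask, n)
        ways = 0
        for digit in range(10):
            # question j adds the digit only if bit j of d[p] is set
            new_sums = [(s + digit) % 9 if (d[p] >> j) & 1 else s
                        for j, s in enumerate(sums)]
            ways += f(encode(new_sums), p + 1)
        return ways % MOD
    return f(0, 0)
```

In the editorial's code, `(mask / pow9[j]) % 9` plays the role of `decode`, and `mask2 = mask2 * 9 + o` plays the role of `encode`.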
I understood almost everything including the choice of the dynamic programming state. But what I couldn't understand was the use of masks in the solution. Can someone please explain how it was done i.e. the implementation of the mask.
Here is the code given in the editorial
const int MOD = 1000000007;
int pow9[6];
int N;
vector<int> d;
long dp[9*9*9*9*9][21];
long f(int mask, int p)
{
    long & res = dp[mask][p];
    if (res == -1) {
        res = 0;
        if (p == d.size() ) {
            // base case
            if (mask == 0) {
                // good
                res = 1;
            }
        } else {
            // pick a digit for the number
            for (int i = 0; i <= 9; i++) {
                // calculate the new mask:
                int mask2 = 0;
                for (int j = N-1; j >= 0; j--) {
                    int o = (mask / pow9[j]) % 9;
                    if ( (d[p] & (1<<j)) != 0 ) {
                        o = (o + i) % 9;
                    }
                    mask2 = mask2 * 9 + o;
                }
                res += f(mask2, p+1);
            }
            res %= MOD;
        }
    }
    return res;
}
int count(int N, vector<int> d)
{
    this->d = d;
    this->N = N;
    memset(dp, -1, sizeof(dp));
    pow9[0] = 1;
    for (int i = 1; i <= N; i++) {
        pow9[i] = 9 * pow9[i-1];
    }
    return (int)f(0,0);
}
NOTE: I know about masks a bit and I have used them in the subset sum problem. Though it was quite different from this implementation.
Any help would be appreciated. Thanks in advance!

Calculate the index of a given number within a sorted set

Not sure if this question should be on Math-Overflow or here, so will try here first:
Suppose we are given a number with N 1s and M 0s.
There are (M+N)!/(M!*N!) different such numbers, that can be sorted in a countable set.
For example, the sorted set of all numbers with 2 ones and 3 zeros, is:
0 00011
1 00101
2 00110
3 01001
4 01010
5 01100
6 10001
7 10010
8 10100
9 11000
How can we efficiently calculate the index of a given number within the corresponding set?
Note: the input to this question is only the number, and not the entire (corresponding) set.
Let choose (n, k) = n! / k! / (n-k)!.
Observe the following structure of your sorted set:
0 0|0011
1 0|0101
2 0|0110
3 0|1001
4 0|1010
5 0|1100
------
6 1|0001
7 1|0010
8 1|0100
9 1|1000
In the sorted set, there are choose (N + M, M) numbers (binary strings of length N + M) in total.
First come the numbers starting with a zero, and there are choose (N + M-1, M-1) of them. Then come the numbers starting with a one, and there are choose (N-1 + M, M) of them. Each of these two sections is also sorted.
So, if your number b1b2...bk starts with a zero (b1 = 0), its index in the sorted set is the same as the index of b2...bk in the sorted set of all binary strings of N ones and M-1 zeroes. If it starts with a one (b1 = 1), its index in the sorted set is the same as the index of b2...bk in the sorted set of all binary strings of N-1 ones and M zeroes, plus the total number of binary strings starting with a zero, which is choose (N + M-1, M-1).
In this way, you recursively descend to subproblems involving suffixes of your original binary string, increasing the sought number by some amount whenever you meet a 1. In the end, you come to an empty binary string, which clearly is the one and only string consisting of 0 zeroes and 0 ones, so its index is 0.
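The descent described above can be sketched iteratively in Python 3.8+ (math.comb is needed). This is only an illustration under the assumption that the number is given as a string of '0' and '1' characters; the function name rank is hypothetical:

```python
from math import comb


def rank(bits):
    """0-based index of bits among all strings with the same counts of ones and zeros."""
    ones = bits.count("1")
    zeros = bits.count("0")
    index = 0
    for b in bits:
        if b == "1":
            # Every string with a 0 in this position comes first; that needs a spare zero.
            if zeros > 0:
                index += comb(ones + zeros - 1, zeros - 1)
            ones -= 1
        else:
            zeros -= 1
    return index
```

For the example set with 2 ones and 3 zeros, rank("01100") gives 5 and rank("11000") gives 9, matching the table in the question.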
This is called ranking in combinatorial algorithms. Here is a C function that does that for you:
unsigned long rank_choose(unsigned long n, unsigned long k, unsigned long c) {
    unsigned long res = 0;
    for (; n > 0; n--) {
        if (c & 1) { res += binomial(n-1, k); k--; }
        c >>= 1;
    }
    return res;
}
It assumes that you have a function binomial(n, k) which computes the coefficient n!/k!/(n-k)!. Be careful: the solution I propose here uses a reversed bit order, i.e. the least significant bit of c is treated as the first character of the string:
const int m = 5, n = 2;
int k = 12;
std::cout << std::bitset<m>(k) << " " << rank_choose(m, n, k) << std::endl;
k = 9;
std::cout << std::bitset<m>(k) << " " << rank_choose(m, n, k) << std::endl;
Returns:
01100 2
01001 7
Here is a solution with the other bit order (most significant bit first):
unsigned long rank_choose_rev(unsigned long n, unsigned long k, unsigned long c) {
    // 1UL keeps the shift in unsigned long, so it also works for n > 31
    unsigned long res = 0, mask = 1UL << (n-1);
    for (; n > 0; n--) {
        if (c & mask) { res += binomial(n-1, k); k--; }
        mask >>= 1;
    }
    return res;
}
Then
01100 5
01001 3
Note: the algorithm is nicely described by @Gassa (+1 to him).
I'll explain one solution with the help of an example. Say there are 3 ones and 3 zeroes and we have to find the index of 010110.
The first one from the left is at the second position. Hence all numbers that start with two zeroes are less than this number:
00---- (C(4,3)) = 4
Now place this one at its position and move to the next one. Next one is at position 4. Hence all the following numbers shall be less than the candidate:
0100-- (C(2,2)) = 1
Now place this one at its position and move to the next one. Next one is at position 5. Hence all the following numbers shall be less than the candidate:
01010- (C(1,1)) = 1
Hence the number of numbers less than the candidate = 4 + 1 + 1 = 6, so its 0-based index is 6, as the start of the sorted set confirms:
000111
001011
001101
001110
010011
010101
010110
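A quick brute-force check of this worked example, as a Python 3 sketch: it simply generates the whole sorted set, so it is only practical for short strings, and the helper name sorted_strings is hypothetical.

```python
from itertools import combinations


def sorted_strings(ones, zeros):
    """All binary strings with the given counts of ones and zeros, in increasing order."""
    n = ones + zeros
    result = []
    for positions in combinations(range(n), ones):
        s = ["0"] * n
        for p in positions:
            s[p] = "1"
        result.append("".join(s))
    return sorted(result)


strings = sorted_strings(3, 3)
index = strings.index("010110")  # 0-based position of the candidate
```

The index found this way agrees with the count of 6 smaller numbers derived above.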

Double Squares: counting numbers which are sums of two perfect squares

Source: Facebook Hacker Cup Qualification Round 2011
A double-square number is an integer X which can be expressed as the sum of two perfect squares. For example, 10 is a double-square because 10 = 3^2 + 1^2. Given X, how can we determine the number of ways in which it can be written as the sum of two squares? For example, 10 can only be written as 3^2 + 1^2 (we don't count 1^2 + 3^2 as being different). On the other hand, 25 can be written as 5^2 + 0^2 or as 4^2 + 3^2.
You need to solve this problem for 0 ≤ X ≤ 2,147,483,647.
Examples:
10 => 1
25 => 2
3 => 0
0 => 1
1 => 1
Factor the number n, and check if it has a prime factor p with odd valuation, such that p = 3 (mod 4). It does if and only if n is not a sum of two squares.
The number of solutions has a closed form expression involving the number of divisors of n. See this, Theorem 3 for a precise statement.
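The factorization test can be sketched in Python 3 with plain trial division. This is only an illustration (the function name is hypothetical, and trial division is an assumption made for simplicity); it answers whether any representation exists, not how many:

```python
def is_sum_of_two_squares(n):
    """True if n = a*a + b*b for some integers a, b >= 0."""
    if n == 0:
        return True  # 0 = 0^2 + 0^2
    p = 2
    while p * p <= n:
        exponent = 0
        while n % p == 0:
            n //= p
            exponent += 1
        # A prime p = 3 (mod 4) with odd exponent rules out any representation.
        if p % 4 == 3 and exponent % 2 == 1:
            return False
        p += 1
    # What remains is 1 or a single prime factor with exponent 1.
    return n % 4 != 3
```

Composite values of p never divide n here, because their prime factors have already been divided out, so only primes are effectively tested.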
Here is my simple answer in O(sqrt(n)) complexity
x^2 + y^2 = n
x^2 = n-y^2
x = sqrt(n - y^2)
x should be integer so (n-y^2) should be perfect square. Loop to y=[0, sqrt(n)] and check whether (n-y^2) is perfect square or not
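Pseudocode:
count = 0;
for y in [0, floor(sqrt(n))]   // both ends inclusive
    if (isPerfectSquare(n - y^2))
        count++
return (count + 1) / 2   // integer division; rounds up so a solution with x = y is not lost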
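A runnable Python 3.8+ version of this loop might look like the following. It is a sketch that uses math.isqrt to stay in exact integer arithmetic, and it rounds the halved count up so that a solution with x = y (which the loop sees only once) is still counted; the function name is hypothetical.

```python
from math import isqrt


def count_double_squares(n):
    """Number of unordered pairs {x, y} with x, y >= 0 and x*x + y*y == n."""
    ordered = 0
    for y in range(isqrt(n) + 1):
        rest = n - y * y
        x = isqrt(rest)
        if x * x == rest:
            ordered += 1
    # Pairs with x != y were seen twice, a pair with x == y only once.
    return (ordered + 1) // 2
```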
Here's a much simpler solution:
create list of squares in the given range (that's 46340 values for the example given)
for each square value x
if list contains a value y such that x + y = target value (i.e. does [target - x] exist in list)
output √x, √y as solution (roots can be stored in a std::map lookup created in the first step)
Looping through all pairs (a, b) is infeasible given the constraints on X. There is a faster way though!
For fixed a, we can work out b: b = √(X - a^2). b won't always be an integer though, so we have to check this. Due to precision issues, perform the check with a small tolerance: if b is x.99999, we can be fairly certain it's an integer. So we loop through all possible values of a and count all cases where b is an integer. We need to be careful not to double-count, so we place the constraint that a <= b. For X = a^2 + b^2, a will be at most √(X/2) with this constraint.
Here is an implementation of this algorithm in C++:
int count = 0;
// add EPS to avoid flooring x.99999 to x
for (int a = 0; a <= sqrt(X/2) + EPS; a++) {
    int b2 = X - a*a; // b^2
    int b = (int) (sqrt(b2) + EPS);
    if (abs(b - sqrt(b2)) < EPS) // check b is an integer
        count++;
}
cout << count << endl;
See it on ideone with sample input
Here's a version which is trivially O(sqrt(N)) and avoids all loop-internal branches.
Start by generating all squares up to the limit, easily done without any multiplications, then initialize a l and r index.
In each iteration you calculate the sum, then update the two indices and the count based on a comparison with the target value. This is sqrt(N) iterations to generate the table and at most sqrt(N) iterations of the search loop. Estimated running time with a reasonable compiler is at most 10 clock cycles per iteration, so for a maximum input value of 2^31 (sqrt(N) ~= 46341) this should correspond to less than 500K clock cycles, or a few tenths of a millisecond:
unsigned countPairs(unsigned n)
{
    if (n == 0) return 1; // 0 = 0^2 + 0^2; also keeps r from wrapping below zero
    unsigned sq = 0, i;
    unsigned square[65536];
    for (i = 0; sq <= n; i++) {
        square[i] = sq;
        sq += i+i+1;
    }
    unsigned l = 0, r = i-1, count = 0;
    do {
        unsigned sum = square[l] + square[r];
        l += sum <= n;     // Increment l if the sum is <= N
        count += sum == n; // Increment the count if a match
        r -= sum >= n;     // Decrement r if the sum is >= N
    } while (l <= r);
    return count;
}
A good compiler can note that the three compares at the end are all using the same operands so it only needs a single CMP opcode followed by three different conditional move operations (CMOVcc).
I was in a hurry, so solved it using a rather brute-force approach (very similar to marcog's) using Python 2.6.
import math

def is_perfect_square(x):
    rt = int(math.sqrt(x))
    return rt*rt == x

def double_sqaures(n):
    rng = int(math.sqrt(n))
    ways = 0
    for i in xrange(rng+1):
        if is_perfect_square(n - i*i):
            ways += 1
    if ways % 2 == 0:
        ways = ways // 2
    else:
        ways = ways // 2 + 1
    return ways
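Note: ways will be odd when the number is twice a perfect square (n = 2a^2, including n = 0), because the solution with both terms equal to a^2 is then found only once.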
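For n ≥ 1, the number of solutions (x,y) of
x^2+y^2=n
over the integers (counting signs and order separately) is exactly 4 times the difference between the number of divisors of n congruent to 1 mod 4 and the number of divisors of n congruent to 3 mod 4.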
Similar identities exist also for the problems
x^2 + 2y^2 = n
and
x^2 + y^2 + z^2 + w^2 = n.
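For the two-square case, the divisor identity (Jacobi's two-square theorem, r2(n) = 4(d1(n) − d3(n)), where d1 and d3 count the divisors of n congruent to 1 and 3 mod 4) can be sketched in Python 3 as follows. The function name r2 is hypothetical, and the result counts all integer pairs (x, y), so signs and order are distinguished:

```python
def r2(n):
    """Number of integer pairs (x, y) with x*x + y*y == n, for n >= 1."""
    d1 = d3 = 0
    d = 1
    while d * d <= n:
        if n % d == 0:
            # The set avoids counting a square-root divisor twice.
            for div in {d, n // d}:
                if div % 4 == 1:
                    d1 += 1
                elif div % 4 == 3:
                    d3 += 1
        d += 1
    return 4 * (d1 - d3)
```

For example, r2(25) is 12: the four pairs (0, ±5), (±5, 0) plus the eight pairs built from 3 and 4 with all sign and order choices.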

Resources