Finding Smallest Number of Elements to make a Sum - algorithm

I have a simple algorithmic question:
If I have certain elements that have integer values, like:
1 1 1 1 1 1 1 1 1 1 1 1 10 12 2
and I have to make the sum 12, the minimum number of elements needed would be 1: I would just use 12.
Thus, my question is how would you:
find the minimum number of elements needed to make some sum, and output -1 if it cannot be made.
Please suggest an algorithm I can look into so I can solve this efficiently. I've already tried brute force, but it is much too slow for my needs.

The problem is NP-hard (its decision version is NP-complete), since subset sum is a special case of it, and it can be modeled as a 0/1 knapsack problem. There is a pseudo-polynomial time algorithm that solves it using dynamic programming. The following solution uses the knapsack analogy:
1. Knapsack capacity = Sum
2. Items have same weight and value
3. Maximize profit
4. if max_profit == Sum then there is a solution
5. else Sum cannot be made from the items given.
6. Track the minimum number of items needed in a second matrix alongside the DP.
7. You can also reconstruct all solutions and pick the smallest one.
Time complexity: O(Sum*Items)
Java implementation:
public class SubSetSum {
    static int[][] costs;
    static int[][] minItems;
    public static void calSets(int target,int[] arr) {
        costs = new int[arr.length][target+1];
        minItems = new int[arr.length][target+1];
        for(int j=0;j<=target;j++) {
            if(arr[0]<=j) {
                costs[0][j] = arr[0];
                minItems[0][j] = 1;
            }
        }
        for(int i=1;i<arr.length;i++) {
            for(int j=0;j<=target;j++) {
                costs[i][j] = costs[i-1][j];
                minItems[i][j] = minItems[i-1][j];
                if(arr[i]<=j) {
                    costs[i][j] = Math.max(costs[i][j],costs[i-1][j-arr[i]]+arr[i]);
                    if(costs[i-1][j]==costs[i-1][j-arr[i]]+arr[i]) {
                        minItems[i][j] = Math.min(minItems[i][j],minItems[i-1][j-arr[i]]+1);
                    }
                    else if(costs[i-1][j]<costs[i-1][j-arr[i]]+arr[i]) {
                        minItems[i][j] = minItems[i-1][j-arr[i]]+1;
                    }
                }
            }
        }
        // System.out.println(costs[arr.length-1][target]);
        if(costs[arr.length-1][target]==target) {
            System.out.println("Minimum items need : "+minItems[arr.length-1][target]);
        }
        else System.out.println("No such Set found");
    }
    public static void main(String[] args) {
        int[] arr = {1,1,1,1, 1 ,1 ,1, 1 ,1, 1 ,1 ,1, 10 ,12, 2};
        calSets(12, arr);
    }
}
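For comparison, the same table can be kept in one dimension. The following Python sketch is not part of the answer above: it assumes every item is a positive integer, and the name min_items_for_sum is made up for illustration. For every sum up to the target it stores the fewest items that produce it.

```python
def min_items_for_sum(items, target):
    """Fewest items (each used at most once) adding up to target, or -1."""
    INF = float("inf")
    # best[s] = fewest items seen so far that add up to exactly s
    best = [0] + [INF] * target
    for value in items:
        # Walk sums downwards so each item is used at most once.
        for s in range(target, value - 1, -1):
            if best[s - value] + 1 < best[s]:
                best[s] = best[s - value] + 1
    return best[target] if best[target] != INF else -1


print(min_items_for_sum([1] * 12 + [10, 12, 2], 12))
```

Like the Java version, this runs in O(Sum*Items) time, but it needs only O(Sum) memory.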

Here is a recursive approach that should be rather fast:
1) If your target is 0, return 0. If your input vector has length 1, return 1 if its value equals the target, or -1 if it doesn't. Similarly, if your target is smaller than every item in your input vector, return -1.
2) Otherwise, loop over the (unique) values in your input vector (in descending order, for performance):
2a) remove the value from your vector, and subtract it from your target.
2b) recursively call this function on the new vector and the new target.
Note: you can pass a max.step parameter down the algorithm, so that if you have already found a solution of length K, the recursive calls stop at that depth and go no deeper. Remember to decrease your max.step value in each recursive call.
3) Collect all the values from the recursive calls, take the minimum (ignoring any -1), add 1 to it and return it; or, if all values in the loop are -1, return -1.
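A minimal Python sketch of this kind of search is shown below. It is one possible reading of the steps, not the answerer's own code: it assumes positive integers, uses an index instead of physically removing items, and tracks the best solution found so far in place of the max.step parameter. The name min_elements_search is hypothetical.

```python
def min_elements_search(items, target):
    """Depth-first search with pruning; returns the fewest items or -1."""
    items = sorted(items, reverse=True)   # try large values first
    best = [len(items) + 1]               # sentinel: no solution found yet

    def search(start, remaining, used):
        if remaining == 0:
            best[0] = min(best[0], used)
            return
        if used + 1 >= best[0]:
            return                         # cannot beat the best known answer
        previous = None
        for i in range(start, len(items)):
            value = items[i]
            if value == previous or value > remaining:
                continue                   # skip duplicates and oversized items
            previous = value
            search(i + 1, remaining - value, used + 1)

    search(0, target, 0)
    return best[0] if best[0] <= len(items) else -1


print(min_elements_search([1] * 12 + [10, 12, 2], 12))
```

The worst case is still exponential, so this is only attractive when a short solution exists and the pruning cuts in early.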

Disclaimer: This is an advertisement for nice but relatively simple mathematics which leads to very clever and fast counting formulas and algorithms. I'm aware that you can find a much simpler and more efficient solution using ordinary programming. I just like the fact that, using a Computer Algebra System properly, you can do it in a one-liner. Let's get 19 with this list:
sage: l = [1,1,1,2,5,2,1,3,12,1,3]; goal = 19
sage: prod((1+t*x^i) for i in l).expand().collect(x).coefficient(x,goal).low_degree(t)
3
What about 25:
sage: goal=25
sage: prod((1+t*x^i) for i in l).expand().collect(x).coefficient(x,goal).low_degree(t)
5
36 is not feasible (the coefficient of x^36 is zero, so the result is 0):
sage: goal=36
sage: prod((1+t*x^i) for i in l).expand().collect(x).coefficient(x,goal).low_degree(t)
0
Here are some details: just expand the product
(1+t*x^l[0]) (1+t*x^l[1]) ... (1+t*x^l[n-1])
where l is your list of n elements. Then, to find the minimum number of elements required to get the sum S, collect the coefficient of x^S and return the minimum degree of a term in t.
Here is how it could be done in Sage:
sage: var("x t")
(x, t)
sage: l = [1,1,1,2,5,2,1,3,12,1,3]
sage: s = prod((1+t*x^i) for i in l)
sage: s = expand(s).collect(x)
Now
sage: print(s)
t^11*x^32 + 5*t^10*x^31 + 2*(t^10 + 5*t^9)*x^30 + 2*(t^10 + 5*t^9 + 5*t^8)*x^29 + (11*t^9 + 20*t^8 + 5*t^7)*x^28 + (t^10 + 4*t^9 + 25*t^8 + 20*t^7 + t^6)*x^27 + 2*(3*t^9 + 10*t^8 + 15*t^7 + 5*t^6)*x^26 + (2*t^9 + 17*t^8 + 40*t^7 + 20*t^6 + 2*t^5)*x^25 + (2*t^9 + 12*t^8 + 30*t^7 + 40*t^6 + 7*t^5)*x^24 + (11*t^8 + 30*t^7 + 35*t^6 + 20*t^5 + t^4)*x^23 + 2*(2*t^8 + 13*t^7 + 20*t^6 + 13*t^5 + 2*t^4)*x^22 + (t^8 + 20*t^7 + 35*t^6 + 30*t^5 + 11*t^4)*x^21 + (t^10 + 7*t^7 + 40*t^6 + 30*t^5 + 12*t^4 + 2*t^3)*x^20 + (5*t^9 + 2*t^7 + 20*t^6 + 40*t^5 + 17*t^4 + 2*t^3)*x^19 + 2*(t^9 + 5*t^8 + 5*t^6 + 15*t^5 + 10*t^4 + 3*t^3)*x^18 + (2*t^9 + 10*t^8 + 10*t^7 + t^6 + 20*t^5 + 25*t^4 + 4*t^3 + t^2)*x^17 + (11*t^8 + 20*t^7 + 5*t^6 + 5*t^5 + 20*t^4 + 11*t^3)*x^16 + (t^9 + 4*t^8 + 25*t^7 + 20*t^6 + t^5 + 10*t^4 + 10*t^3 + 2*t^2)*x^15 + 2*(3*t^8 + 10*t^7 + 15*t^6 + 5*t^5 + 5*t^3 + t^2)*x^14 + (2*t^8 + 17*t^7 + 40*t^6 + 20*t^5 + 2*t^4 + 5*t^2)*x^13 + (2*t^8 + 12*t^7 + 30*t^6 + 40*t^5 + 7*t^4 + t)*x^12 + (11*t^7 + 30*t^6 + 35*t^5 + 20*t^4 + t^3)*x^11 + 2*(2*t^7 + 13*t^6 + 20*t^5 + 13*t^4 + 2*t^3)*x^10 + (t^7 + 20*t^6 + 35*t^5 + 30*t^4 + 11*t^3)*x^9 + (7*t^6 + 40*t^5 + 30*t^4 + 12*t^3 + 2*t^2)*x^8 + (2*t^6 + 20*t^5 + 40*t^4 + 17*t^3 + 2*t^2)*x^7 + 2*(5*t^5 + 15*t^4 + 10*t^3 + 3*t^2)*x^6 + (t^5 + 20*t^4 + 25*t^3 + 4*t^2 + t)*x^5 + (5*t^4 + 20*t^3 + 11*t^2)*x^4 + 2*(5*t^3 + 5*t^2 + t)*x^3 + 2*(5*t^2 + t)*x^2 + 5*t*x + 1
OK, this is a huge expression. The nice feature here is that if I take the coefficient of, say, x^17, I get:
sage: s.coefficient(x, 17)
2*t^9 + 10*t^8 + 10*t^7 + t^6 + 20*t^5 + 25*t^4 + 4*t^3 + t^2
which says the following: the term 10*t^7 tells me that there are 10 different ways to obtain the sum 17 using 7 numbers. As another example, there are 25 ways to get 17 using 4 numbers (25*t^4).
Also, since this expression ends with t^2, I learn that I only need two numbers to get 17. Unfortunately this doesn't tell me which numbers.
If you want to understand the trick, look at the Wikipedia article on generating functions and This Page.
Note 1: this is not the most efficient method, since I compute much more than what you need. The huge expression actually describes, and in some sense computes, all possible choices (there are 2^n of them for a list of length n). But it's a one-liner:
sage: prod((1+t*x^i) for i in l).expand().collect(x).coefficient(x,17).low_degree(t)
2
And still relatively efficient:
sage: %timeit prod((1+t*x^i) for i in l).expand().collect(x).coefficient(x,17).low_degree(t)
10 loops, best of 3: 42.6 ms per loop
Note 2: After thinking carefully about it I also realized the following: the generating series is just a compact encoding of what you would have written if you had tried to implement a dynamic programming solution.
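To make that connection concrete, here is a plain-Python sketch that multiplies out the same product by hand, without Sage. It assumes non-negative integer values, drops every power of x above the goal (those terms can never contribute), and the function names are made up for illustration.

```python
def series_coefficient(values, goal):
    """Return {k: ways} for the coefficient of x^goal in prod(1 + t*x^v)."""
    poly = {0: {0: 1}}                     # the constant polynomial 1
    for v in values:
        # Multiplying by (1 + t*x^v): keep every term, plus a shifted copy.
        updated = {s: dict(terms) for s, terms in poly.items()}
        for s, terms in poly.items():
            if s + v > goal:
                continue                   # powers above the goal are not needed
            target = updated.setdefault(s + v, {})
            for k, ways in terms.items():
                target[k + 1] = target.get(k + 1, 0) + ways
        poly = updated
    return poly.get(goal, {})


def min_elements_series(values, goal):
    """Lowest degree in t of the x^goal coefficient, or -1 if it is zero."""
    terms = series_coefficient(values, goal)
    return min(terms) if terms else -1


l = [1, 1, 1, 2, 5, 2, 1, 3, 12, 1, 3]
print(min_elements_series(l, 19), min_elements_series(l, 36))
```

Unlike low_degree, this sketch reports -1 rather than 0 when the sum cannot be made, which matches what the question asks for.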

I don't think this solution is optimal, but it's very easy to understand and use: sort the elements in decreasing order, then take each element in turn and try to fit it into what remains of your number. If you have the sequence [5,6,2,7] and you need to make 15, you reorder the sequence to [7,6,5,2] and take 7; then you still need 8, so you take 6; then you need 2 more, so you check 5, but it's too big, so you skip it and check the last number, 2, which fits perfectly and completes your number. So you'd print out 3. This is the worst case of the algorithm, which is O(n). In your example with 12 it is O(1), because you pick 12 on the first check of the ordered sequence. (The running time applies only to choosing items, not to sorting.) Be aware that this greedy choice can miss a solution or return a non-minimal one: with [6,5,5] and target 10 it takes 6 first and reports -1, although 5+5 works.
resolve_sum(ordered_items[], n, number) {
    count = 0;
    aux = number;
    // n is the number of items; stop early once the sum is complete
    for (i = 0; i < n && aux > 0; i = i + 1) {
        if (ordered_items[i] <= aux) {
            // the item fits: take it
            count = count + 1;
            aux = aux - ordered_items[i];
        }
        // otherwise the item is too big: skip it
    }
    if (aux == 0) return count;
    else return -1;
}
I haven't included a sorting algorithm; you can choose the one you know best or try to learn a new efficient one. Link with sorting algorithms and their running time. This is just sample code that you can adapt to C/C++, Java or whatever you need. I hope it isn't way too much brute force.
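As a quick illustration, here is the greedy idea as a small Python sketch (the name greedy_count is hypothetical, and positive integers are assumed). The last two calls show inputs where taking the largest fitting element first goes wrong, so the result should be treated as a heuristic rather than a guaranteed minimum.

```python
def greedy_count(items, target):
    """Take the largest element that still fits until the target is reached."""
    remaining = target
    count = 0
    for value in sorted(items, reverse=True):
        if remaining == 0:
            break                      # the sum is already complete
        if value <= remaining:
            remaining -= value         # the element fits: take it
            count += 1
    return count if remaining == 0 else -1


print(greedy_count([5, 6, 2, 7], 15))           # the worked example from the text
print(greedy_count([6, 5, 5], 10))              # misses the solution 5 + 5
print(greedy_count([7, 5, 5, 1, 1, 1], 10))     # finds 7+1+1+1 although 5+5 is shorter
```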

Related

Multiply polynomials using DFT algorithm [duplicate]

I am new to FFTs, so I am slightly confused about some concepts. So far, the FFT examples I've seen for polynomial multiplication involve polynomials with consecutive exponents (i.e. A(x) = 1 + 3x + 5x^2 + ... and B(x) = 4 + 6x + 9x^2 + ... and C(x) = A(x)*B(x)). However, is it possible to use FFT on two polynomials whose exponents do not match? For example, is it possible to use FFT to multiply:
A(x) = 1 + 3x^2 + 9x^8
and
B(x) = 5x + 6 x^3 + 10x^8
in O(nlogn) time?
If not, are there any cases where the runtime will be O(nlogn)? For example, if the number of terms in the product is O(n) instead of O(n^2)?
Even if the runtime is more than O(nlogn), how can we use FFT to minimize the runtime?
Yes, it is possible to use DFFT on polynomials whose exponents do not match.
The missing exponents are just terms with coefficient 0, which is also a number. Just rewrite your polynomials:
A(x) = 1 + 3x^2 + 9x^8
B(x) = 5x + 6x^3 + 10x^8
to something like this:
A(x) = 1x^0 + 0x^1 + 3x^2 + 0x^3 + 0x^4+ 0x^5+ 0x^6+ 0x^7 + 9x^8
B(x) = 0x^0 + 5x^1 + 0x^2 + 6x^3 + 0x^4+ 0x^5+ 0x^6+ 0x^7 + 10x^8
so your vectors for DFFT are:
A = (1,0,3,0,0,0,0,0, 9)
B = (0,5,0,6,0,0,0,0,10)
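Pad with zeros so that the vector is large enough for the result (max A exponent + 1 + max B exponent + 1 is a safe size; the product itself has max A exponent + max B exponent + 1 coefficients), and then round up to the nearest power of 2 for DFFT usage. So the original sizes are 9,9 -> 9+9 -> 18 -> round up -> 32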
A = (1,0,3,0,0,0,0,0, 9,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0)
B = (0,5,0,6,0,0,0,0,10,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0)
// | original | correct result | nearest power of 2 |
and do the DFFT stuff you want ... I assume you want to do something like this:
A' = DFFT(A)
B' = DFFT(B)
C(i)' = A'(i) * B'(i) // i=0..n-1
C= IDFFT(C')
which is O(n*log(n)). Do not forget that if you use DFFT (not DFT), n = 32 and not 18, because n must be a power of 2 for the fast DFT algorithm. Also, if you want performance improvements, look at the DFFT weight matrices for DFFT(A) and DFFT(B): they are the same, so there is no need to compute them twice.
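The steps above can be sketched in pure Python as follows. This is an illustrative recursive radix-2 FFT, not a tuned implementation; the function names are made up, integer coefficients are assumed (so the result can be rounded), and the padding length is chosen from the exact result size of len(a) + len(b) - 1.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of 2."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)   # twiddle factor
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out


def multiply(a, b):
    """Multiply two integer coefficient vectors (lowest exponent first)."""
    result_len = len(a) + len(b) - 1
    size = 1
    while size < result_len:
        size *= 2                                      # round up to a power of 2
    fa = fft(a + [0] * (size - len(a)))
    fb = fft(b + [0] * (size - len(b)))
    fc = [x * y for x, y in zip(fa, fb)]               # pointwise product
    c = fft(fc, invert=True)
    return [round((v / size).real) for v in c[:result_len]]


A = [1, 0, 3, 0, 0, 0, 0, 0, 9]     # 1 + 3x^2 + 9x^8
B = [0, 5, 0, 6, 0, 0, 0, 0, 10]    # 5x + 6x^3 + 10x^8
print(multiply(A, B))
```

Note that the cost depends on the padded length, that is on the largest exponent, and not on how many coefficients are non-zero.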

Is there a general term formula for 3 queens problem?

The specific question description is:
Put 3 queens on a chessboard of M columns and N rows. How do you determine the number of ways to place them so that no two of them attack each other?
Note that M is not necessarily equal to N, and M and N may be too large for an int in C, so there is no way to solve this with a classical search algorithm such as DFS/BFS (for time and memory complexity reasons).
I guess this can be calculated with permutations or combinations, but I am not good at math, so please help me.
Yes.
Searching for keyword "3 queens" in OEIS gives us A047659, and in the Formula section, Vaclav Kotesovec wrote that:
In general, for m <= n, n >= 3, the number of ways to place 3 nonattacking queens on an m X n board is n^3/6*(m^3 - 3m^2 + 2m) - n^2/2*(3m^3 - 9m^2 + 6m) + n/6(2m^4 + 20m^3 - 77m^2 + 58m) - 1/24*(39m^4 - 82m^3 - 36m^2 + 88m) + 1/16*(2m - 4n + 1)(1 + (-1)^(m+1)) + 1/2(1 + abs(n - 2m + 3) - abs(n - 2m + 4))(1/24((n - 2m + 11)^4 - 42(n - 2m + 11)^3 + 656(n - 2m + 11)^2 - 4518(n - 2m + 11) + 11583) - 1/16(4m - 2n - 1)*(1 + (-1)^(n+1))) [Panos Louridas, idee & form 93/2007, pp. 2936-2938].
This formula can be manually confirmed on small Ns and Ms. A simple Python script for this purpose is shown below:
import fractions # to avoid floating error
m = fractions.Fraction(4)
n = fractions.Fraction(4)
assert m<=n
one = fractions.Fraction(1)
ans = n**3/6*(m**3 - 3*m**2 + 2*m) - n**2/2*(3*m**3 - 9*m**2 + 6*m) + n/6*(2*m**4 + 20*m**3 - 77*m**2 + 58*m) - one/24*(39*m**4 - 82*m**3 - 36*m**2 + 88*m) + one/16*(2*m - 4*n + 1)*(1 + (-1)**(m+1)) + one/2*(1 + abs(n - 2*m + 3) - abs(n - 2*m + 4))*(one/24*((n - 2*m + 11)**4 - 42*(n - 2*m + 11)**3 + 656*(n - 2*m + 11)**2 - 4518*(n - 2*m + 11) + 11583) - one/16*(4*m - 2*n - 1)*(1 + (-1)**(n+1)))
print(ans)
Unfortunately, I failed to find the proof of this formula ([Panos Louridas, idee & form 93/2007, pp. 2936-2938], as Vaclav Kotesovec cited). The journal idee & form does not seem to be freely accessible. But due to the reputation of Vaclav Kotesovec (the author of Non-attacking chess pieces), I think we should trust this result.
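For small boards, a count can also be obtained by brute force and compared against whatever the formula gives. The sketch below is only meant for such spot checks (it enumerates every triple of squares, so it is hopeless for large M and N); the function names are made up for illustration.

```python
from itertools import combinations


def attacks(a, b):
    """True if queens on squares a and b (row, column) attack each other."""
    (r1, c1), (r2, c2) = a, b
    return r1 == r2 or c1 == c2 or abs(r1 - r2) == abs(c1 - c2)


def count_three_queens(rows, cols):
    """Count unordered placements of 3 mutually non-attacking queens."""
    squares = [(r, c) for r in range(rows) for c in range(cols)]
    return sum(
        1
        for trio in combinations(squares, 3)
        if not any(attacks(p, q) for p, q in combinations(trio, 2))
    )


print(count_three_queens(4, 4))
```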
The simple answer is inclusion/exclusion.
We start by counting the number of ways to place 3 queens in order, which is just (n*m) * (n*m - 1) * (n*m - 2).
Now we have overcounted, because we don't want to count the placements in which queens 1,2 attack each other, or queens 1,3, or queens 2,3. For each of those three pairs, the count is the sum over rows, columns and diagonals of length l of l * (l-1) * (m*n-2).
But now we have undercounted, because if all 3 queens attack each other then we added them once in the first step, subtracted them 3x in the second step, and now need to add them back 2x to end up counting them 0 times. For three queens on one common line that is the sum over rows, columns and diagonals of length l of l * (l-1) * (l-2). Placements in which the attacking pairs lie on different lines (for example, queen 1 sharing a row with queen 2 and a column with queen 3) were also subtracted more than once and need their own correction terms.
This is the count of ways to place the queens in order. Given 3 queens on the board, there are 6 orders in which we could have placed them, so divide the whole answer by 6.
This can be turned into a program that is O(n+m) to run, which should be fast enough for your purposes. If you were willing to do a bunch more analysis, we could actually produce an O(1) formula.
The field with coordinates (i, j) is vulnerable to the queen located at (qi, qj) if
i == qi || j == qj || abs(i - qi) == abs(j - qj)
This boolean expression should be false for feasible coordinates of each queen. Finding three such fields should not be hard. One can try a Monte Carlo method, which has complexity O(M * N) in the worst case.

Is this a correct recurrence relationship I've found for the coin change challenge?

I'm trying to solve the "coin change problem" and I think I've come up with a recursive solution but I want to verify.
As an example, let's suppose we have pennies, nickels and dimes and are trying to make change for 22 cents.
C = { penny = 1, nickle = 5, dime = 10 }
K = 22
Then the number of ways to make change is
f(C,K) = f({1,5,10},22)
=
(# of ways to make change with 0 dimes)
+ (# of ways to make change with 1 dimes)
+ (# of ways to make change with 2 dimes)
= f(C\{dime},22-0*10) + f(C\{dime},22-1*10) + f(C\{dime},22-2*10)
= f({1,5},22) + f({1,5},12) + f({1,5},2)
and
f({1,5},22)
= f({1,5}\{nickle},22-0*5) + f({1,5}\{nickle},22-1*5) + f({1,5}\{nickle},22-2*5) + f({1,5}\{nickle},22-3*5) + f({1,5}\{nickle},22-4*5)
= f({1},22) + f({1},17) + f({1},12) + f({1},7) + f({1},2)
= 5
and so forth.
In other words, my algorithm is like
let f(C,K) be the number of ways to make change for K cents with coins C
and have the following implementation
if(C is empty or K=0)
return 0
sum = 0
m = C.PopLargest()
A = {0, 1, ..., K / m}
for(i in A)
sum += f(C,K-i*m)
return sum
Is there any flaw in that?
It would be linear time, I think.
Rethink your base cases:
1. What if K < 0? Then no solution exists, i.e. the number of ways is 0.
2. When K = 0, there is exactly 1 way to make change: take no coins at all.
3. When the coin array is empty (and K > 0), the number of ways is 0.
The rest of the logic is correct. But your belief that the algorithm is linear is wrong.
Let's compute the complexity:
Popping the largest element is O(C.length). However, this step can be optimised by sorting the whole array at the beginning.
Your for loop runs O(K/C.max) times in every call, and in every iteration it calls the function recursively.
So if you write the recurrence for it, it should be something like:
T(N) = O(N) + K*T(N-1)
And this is exponential in terms of N (the size of the array).
If you are looking for an improvement, I would suggest you use dynamic programming.
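A minimal sketch of that dynamic programming approach in Python, assuming positive integer coin values and an unlimited supply of each coin (the function name count_change is made up for illustration):

```python
def count_change(coins, amount):
    """Number of ways to make `amount` from unlimited coins of the given values."""
    # ways[s] = number of ways to make s with the coins processed so far
    ways = [1] + [0] * amount          # one way to make 0: use no coins
    for coin in coins:
        for s in range(coin, amount + 1):
            ways[s] += ways[s - coin]
    return ways[amount]


print(count_change([1, 5, 10], 22))
```

Processing one coin type at a time counts each combination once regardless of order, and the running time is O(K * number of coin types).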

Find the sum of Fibonacci Series

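I am given a set A, and I have to find the sum of the Fibonacci numbers of all the subset sums of A (each non-empty subset contributes the Fibonacci number of the sum of its elements).
Fibonacci(X) is the Xth element of the Fibonacci series.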
For example, for A = {1,2,3}:
Fibonacci(1) + Fibonacci(2) + Fibonacci(3) + Fibonacci(1+2) + Fibonacci(2+3) + Fibonacci(1+3) + Fibonacci(1+2+3)
1 + 1 + 2 + 2 + 5 + 3 + 8 = 22
Is there any way I can find the sum without generating the subsets?
I can already find the sum of all subsets easily,
i.e. Sum of all subsets = (1+2+3)*(pow(2,length of set-1))
There surely is.
First, let's recall that the nth Fibonacci number equals
F(n) = [φ^n - (-φ)^(-n)]/√5
where φ = (√5 + 1)/2 (the golden ratio) and (-φ)^(-1) = (1-√5)/2. To make this shorter, let me denote φ as A and (-φ)^(-1) as B.
Next, let's notice that a sum of Fibonacci numbers is a sum of powers of A and B:
[F(n) + F(m)]*√5 = A^n + A^m - B^n - B^m
Now it is enough to compute (in the {1,2,3} example)
A^1 + A^2 + A^3 + A^{1+2} + A^{1+3} + A^{2+3} + A^{1+2+3}.
But hey, there's a simpler expression for this:
(A^1 + 1)(A^2 + 1)(A^3 + 1) - 1
Now, it is time to get the whole result.
Let our set be {n1, n2, ..., nk}. Then our sum will be equal to
Sum = 1/√5 * [(A^n1 + 1)(A^n2 + 1)...(A^nk + 1) - (B^n1 + 1)(B^n2 + 1)...(B^nk + 1)]
I think, mathematically, this is the "simplest" form of the answer, as there's no relation between the n_i. However, there could be some room for computational optimization of this expression. In fact, I'm not sure at all whether this (using real numbers) will work faster than the "straightforward" summing, but the question was about avoiding subset generation, so here's the answer.
I tested the answer from YakovL using Python 2.7. It works very well and is plenty quick. I cannot imagine that summing the sequence values would be quicker. Here's the implementation.
_phi = (5.**0.5 + 1.)/2.
A = lambda n: _phi**n
B = lambda n: (-_phi)**(-n)
prod = lambda it: reduce(lambda x, y: x*y, it)
subset_sum = lambda s: (prod(A(n)+1 for n in s) - prod(B(n)+1 for n in s))/5**0.5
And here are some test results:
print subset_sum({1, 2, 3})
# 22.0
# [Finished in 0.1s]
print subset_sum({1, 2, 4, 8, 16, 32, 64, 128, 256, 512})
# 7.29199318438e+213
# [Finished in 0.1s]
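Floating point loses exactness for large inputs such as the second test. One way around this, sketched below in Python 3, is to do the same product with exact integers in the ring of numbers a + b*φ, using φ^2 = φ + 1. The product of (φ^n + 1) over the set then has the form a + b*φ, its conjugate is a + b*B, and their difference divided by √5 is just b. This is an alternative formulation rather than the code tested above, and the function names are made up for illustration.

```python
def mul(p, q):
    """Multiply (a + b*phi) by (c + d*phi), using phi^2 = phi + 1."""
    a, b = p
    c, d = q
    return (a * c + b * d, a * d + b * c + b * d)


def phi_power(n):
    """phi^n as a pair (a, b) meaning a + b*phi, by repeated squaring."""
    result, base = (1, 0), (0, 1)
    while n:
        if n & 1:
            result = mul(result, base)
        base = mul(base, base)
        n >>= 1
    return result


def fib_subset_sum(s):
    """Sum of Fibonacci(sum of subset) over all non-empty subsets of s."""
    product = (1, 0)
    for n in s:
        a, b = phi_power(n)
        product = mul(product, (a + 1, b))   # multiply by (phi^n + 1)
    return product[1]                         # the coefficient of phi


print(fib_subset_sum({1, 2, 3}))
```

Non-negative integer elements are assumed; the result is an exact Python integer of whatever size is needed.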

Sum of all numbers written with particular digits in a given range

My objective is to find the sum of all numbers from 4 to 666554 that consist of the digits 4, 5 and 6 only.
SUM = 4+5+6+44+45+46+54+55+56+64+65+66+.....................+666554.
The simple method is to run a loop and add the numbers made of 4, 5 and 6 only.
long long sum = 0;
for(int i=4;i <=666554;i++){
    /*check if number contains only 4,5 and 6.
    if condition is true then add the number to the sum*/
}
But it seems to be inefficient. Checking that the number is made up of 4, 5 and 6 will take time. Is there any way to increase the efficiency? I have tried a lot but have not found a new approach. Please help.
For 1-digit numbers, note that
4 + 5 + 6 == 5 * 3
For 2-digits numbers:
(44 + 45 + 46) + (54 + 55 + 56) + (64 + 65 + 66)
== 45 * 3 + 55 * 3 + 65 * 3
== 55 * 9
and so on.
In general, for n-digit numbers, there are 3^n of them that consist of 4,5,6 only, and their average value is exactly 5...5 (n digits). In code, their sum is ('5' * n).to_i * 3 ** n (Ruby), or int('5' * n) * 3 ** n (Python).
You calculate this for up to 6-digit numbers, then subtract the sum of the qualifying numbers from 666555 to 666666.
P.S: for small numbers like 666554, using pattern matching is fast enough. (example)
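Put together, the idea might look like the following Python sketch. It uses the closed form for every digit length that lies completely below the limit and simply enumerates the one length that is cut off by the limit; the function name sum_456_up_to is made up for illustration, and the enumeration step is a simplification rather than part of the formula above.

```python
from itertools import product


def sum_456_up_to(limit):
    """Sum of all numbers <= limit whose digits are all 4, 5 or 6."""
    total = 0
    length = 1
    while int("4" * length) <= limit:
        if int("6" * length) <= limit:
            # Every number of this length qualifies: average 55...5, 3^length of them.
            total += int("5" * length) * 3 ** length
        else:
            # Only part of this length is below the limit: enumerate it.
            total += sum(
                n
                for n in (int("".join(d)) for d in product("456", repeat=length))
                if n <= limit
            )
        length += 1
    return total


print(sum_456_up_to(666554))
```

The enumeration touches at most 3^k numbers for the longest length k, which is tiny here; for very long limits a digit-by-digit treatment of the partial length would be needed instead.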
Implement a counter in base 3 (number of digit values), e.g. 0,1,2,10,11,12,20,21,22,100.... and then translate the base-3 number into a decimal with the digits 4,5,6 (0->4, 1->5, 2->6), and add to running total. Repeat until the limit.
def compute_sum(digits, max_val):
    def _next_val(cur_val):
        for pos in range(len(cur_val)):
            cur_val[pos]+=1
            if cur_val[pos]<len(digits):
                return
            cur_val[pos]=0
        cur_val.append(0)
    def _get_val(cur_val):
        digit_val=1
        num_val=0
        for x in cur_val:
            num_val+=digits[x]*digit_val
            digit_val*=10
        return num_val
    cur_val=[]
    sum=0
    while(True):
        _next_val(cur_val)
        num_val=_get_val(cur_val)
        if num_val>max_val:
            break
        sum+=num_val
    return sum
def main():
    digits=[4,5,6]
    max_val=666554
    print(digits, max_val)
    print(compute_sum(digits, max_val))
Mathematics is good, but not all problems are trivially "compressible", so knowing how to deal with them without mathematics can be worthwhile.
In this problem the summation is trivial; at first glance, the difficulty is efficiently enumerating the numbers that need to be added.
The "filter" route is a possibility: generate all possible numbers, incrementally, and filter out those which do not match. However, it is also quite inefficient (in general):
the condition might not be trivial to match: in this case, the easiest way is a conversion to string (fairly heavy on divisions and tests) followed by string matching
the ratio of filtering is not too bad to start with, at 30% per digit, but it scales very poorly, as gen-y-s remarked: for a 4-digit number it is at about 1%, i.e. generating and checking 100 numbers to get only 1 out of them.
I would therefore advise a "generational" approach: only generate numbers that match the condition (and all of them).
I would note that generating all numbers composed of 4, 5 and 6 is like counting (in ternary):
starts from 4
45 becomes 46 (beware of carry-overs)
66 becomes 444 (extreme carry-over)
Let's go, in Python, as a generator:
def generator():
    def convert(array):
        i = 0
        for e in array:
            i *= 10
            i += e
        return i
    def increment(array):
        result = []
        carry = True
        for e in array[::-1]:
            if carry:
                e += 1
                carry = False
            if e > 6:
                e = 4
                carry = True
            result = [e,] + result
        if carry:
            result = [4,] + result
        return result
    array = [4]
    while True:
        num = convert(array)
        if num > 666554: break
        yield num
        array = increment(array)
Its result can be printed with sum(generator()):
$ time python example.py
409632209
python example.py 0.03s user 0.00s system 82% cpu 0.043 total
And here is the same in C++.
"Start with a simpler problem." —Polya
Sum the n-digit numbers which consist of the digits 4,5,6 only
As Yu Hao explains above, there are 3**n numbers and their average by symmetry is eg. 555555, so the sum is 3**n * (10**n-1)*5/9. But if you didn't spot that, here's how you might solve the problem another way.
The problem has a recursive construction, so let's try a recursive solution. Let g(n) be the sum of all 456-numbers of exactly n digits. Then we have the recurrence relation:
g(n) = (4+5+6)*10**(n-1)*3**(n-1) + 3*g(n-1)
To see this, separate the first digit of each number in the sum (eg. for n=3, the hundreds column). That gives the first term. The second term is sum of the remaining digits, one count of g(n-1) for each prefix of 4,5,6.
If that's still unclear, write out the n=2 sum and separate tens from units:
g(2) = 44+45+46 + 54+55+56 + 64+65+66
     = (40+50+60)*3 + 3*(4+5+6)
     = (4+5+6)*10*3 + 3*g(1)
Cool. At this point, the keen reader might like to check Yu Hao's formula for g(n) satisfies our recurrence relation.
To solve OP's problem, the sum of all 456-numbers from 4 to 666666 is g(1) + g(2) + g(3) + g(4) + g(5) + g(6). In Python, with dynamic programming:
def sum456(n):
    """Find the sum of all numbers at most n digits which consist of 4,5,6 only"""
    g = [0] * (n+1)
    for i in range(1,n+1):
        g[i] = 15*10**(i-1)*3**(i-1) + 3*g[i-1]
    print(g) # show the array of partial solutions
    return sum(g)
For n=6
>>> sum456(6)
[0, 15, 495, 14985, 449955, 13499865, 404999595]
418964910
Edit: I note that OP truncated his sum at 666554, so it doesn't fit the general pattern. The result is the full sum less the last few terms:
>>> sum456(6) - (666555 + 666556 + 666564 + 666565 + 666566 + 666644 + 666645 + 666646 + 666654 + 666655 + 666656 + 666664 + 666665 + 666666)
409632209
The sum of 4 through 666666 is:
total = sum([15*(3**i)*int('1'*(i+1)) for i in range(6)])
>>> 418964910
The sum of the few numbers between 666554 and 666666 is:
rest = (666555+666556+666564+666565+666566+
        666644+666645+666646+
        666654+666655+666656+
        666664+666665+666666)
>>> 9332701
total - rest
>>> 409632209
Java implementation for the question:
This reports the answer modulo 10^9 + 7.
public static long compute_sum(long[] digits, long max_val, long count[]) {
    List<Long> cur_val = new ArrayList<>();
    long sum = 0;
    long mod = ((long)Math.pow(10,9))+7;
    long num_val = 0;
    while (true) {
        _next_val(cur_val, digits);
        num_val = _get_val(cur_val, digits, count);
        sum =(sum%mod + (num_val)%mod)%mod;
        if (num_val == max_val) {
            break;
        }
    }
    return sum;
}
public static void _next_val(List<Long> cur_val, long[] digits) {
    for (int pos = 0; pos < cur_val.size(); pos++) {
        cur_val.set(pos, cur_val.get(pos) + 1);
        if (cur_val.get(pos) < digits.length)
            return;
        cur_val.set(pos, 0L);
    }
    cur_val.add(0L);
}
public static long _get_val(List<Long> cur_val, long[] digits, long count[]) {
    long digit_val = 1;
    long num_val = 0;
    long[] digitAppearanceCount = new long[]{0,0,0};
    for (Long x : cur_val) {
        digitAppearanceCount[x.intValue()] = digitAppearanceCount[x.intValue()]+1;
        if (digitAppearanceCount[x.intValue()]>count[x.intValue()]){
            num_val=0;
            break;
        }
        num_val = num_val+(digits[x.intValue()] * digit_val);
        digit_val *= 10;
    }
    return num_val;
}
public static void main(String[] args) {
    long [] digits=new long[]{4,5,6};
    long count[] = new long[]{1,1,1};
    long max_val= 654;
    System.out.println(compute_sum(digits, max_val, count));
}
The answer by #gen-y-s (https://stackoverflow.com/a/31286947/8398943) is wrong when each digit may only be used a limited number of times (x fours, y fives, z sixes): for x=y=z=1 it includes 44, 55 and 66, which exceed the available 4s, 5s and 6s. It gives 12189 as output, but it should be 3675 for x=y=z=1.
The logic by #Yu Hao (https://stackoverflow.com/a/31285816/8398943) has the same mistake. It gives 12189 as output, but it should be 3675 for x=y=z=1.
