Sum of all numbers written with particular digits in a given range - algorithm

My objective is to find the sum of all numbers from 4 to 666554 which consists of 4,5,6 only.
SUM = 4+5+6+44+45+46+54+55+56+64+65+66+.....................+666554.
Simple method is to run a loop and add the numbers made of 4,5 and 6 only.
long long sum = 0;
for(int i=4;i <=666554;i++){
/*check if number contains only 4,5 and 6.
if condition is true then add the number to the sum*/
}
But it seems to be inefficient. Checking that the number is made up of 4,5 and 6 will take time. Is there any way to increase the efficiency. I have tried a lot but no new approach i have found.Please help.

For 1-digit numbers, note that
4 + 5 + 6 == 5 * 3
For 2-digits numbers:
(44 + 45 + 46) + (54 + 55 + 56) + (64 + 65 + 66)
== 45 * 3 + 55 * 3 + 65 * 3
== 55 * 9
and so on.
In general, for n-digits numbers, there are 3n of them consist of 4,5,6 only, their average value is exactly 5...5(n digits). Using code, the sum of them is ('5' * n).to_i * 3 ** n (Ruby), or int('5' * n) * 3 ** n (Python).
You calculate up to 6-digits numbers, then subtract the sum of 666555 to 666666.
P.S: for small numbers like 666554, using pattern matching is fast enough. (example)

Implement a counter in base 3 (number of digit values), e.g. 0,1,2,10,11,12,20,21,22,100.... and then translate the base-3 number into a decimal with the digits 4,5,6 (0->4, 1->5, 2->6), and add to running total. Repeat until the limit.
def compute_sum(digits, max_val):
def _next_val(cur_val):
for pos in range(len(cur_val)):
cur_val[pos]+=1
if cur_val[pos]<len(digits):
return
cur_val[pos]=0
cur_val.append(0)
def _get_val(cur_val):
digit_val=1
num_val=0
for x in cur_val:
num_val+=digits[x]*digit_val
digit_val*=10
return num_val
cur_val=[]
sum=0
while(True):
_next_val(cur_val)
num_val=_get_val(cur_val)
if num_val>max_val:
break
sum+=num_val
return sum
def main():
digits=[4,5,6]
max_val=666554
print(digits, max_val)
print(compute_sum(digits, max_val))

Mathematics are good, but not all problems are trivially "compressible", so knowing how to deal with them without mathematics can be worthwhile.
In this problem, the summation is trivial, the difficulty is efficiently enumerating the numbers that need be added, at first glance.
The "filter" route is a possibility: generate all possible numbers, incrementally, and filter out those which do not match; however it is also quite inefficient (in general):
the condition might not be trivial to match: in this case, the easier way is a conversion to string (fairly heavy on divisions and tests) followed by string-matching
the ratio of filtering is not too bad to start with at 30% per digit, but it scales very poorly as gen-y-s remarked: for a 4 digits number it is at 1%, or generating and checking 100 numbers to only get 1 out of them.
I would therefore advise a "generational" approach: only generate numbers that match the condition (and all of them).
I would note that generating all numbers composed of 4, 5 and 6 is like counting (in ternary):
starts from 4
45 becomes 46 (beware of carry-overs)
66 becomes 444 (extreme carry-over)
Let's go, in Python, as a generator:
def generator():
def convert(array):
i = 0
for e in array:
i *= 10
i += e
return i
def increment(array):
result = []
carry = True
for e in array[::-1]:
if carry:
e += 1
carry = False
if e > 6:
e = 4
carry = True
result = [e,] + result
if carry:
result = [4,] + result
return result
array = [4]
while True:
num = convert(array)
if num > 666554: break
yield num
array = increment(array)
Its result can be printed with sum(generator()):
$ time python example.py
409632209
python example.py 0.03s user 0.00s system 82% cpu 0.043 total
And here is the same in C++.

"Start with a simpler problem." —Polya
Sum the n-digit numbers which consist of the digits 4,5,6 only
As Yu Hao explains above, there are 3**n numbers and their average by symmetry is eg. 555555, so the sum is 3**n * (10**n-1)*5/9. But if you didn't spot that, here's how you might solve the problem another way.
The problem has a recursive construction, so let's try a recursive solution. Let g(n) be the sum of all 456-numbers of exactly n digits. Then we have the recurrence relation:
g(n) = (4+5+6)*10**(n-1)*3**(n-1) + 3*g(n-1)
To see this, separate the first digit of each number in the sum (eg. for n=3, the hundreds column). That gives the first term. The second term is sum of the remaining digits, one count of g(n-1) for each prefix of 4,5,6.
If that's still unclear, write out the n=2 sum and separate tens from units:
g(2) = 44+45+46 + 54+55+56 + 64+65+66
= (40+50+60)*3 + 3*(4+5+6)
= (4+5+6)*10*3 + 3*g(n-1)
Cool. At this point, the keen reader might like to check Yu Hao's formula for g(n) satisfies our recurrence relation.
To solve OP's problem, the sum of all 456-numbers from 4 to 666666 is g(1) + g(2) + g(3) + g(4) + g(5) + g(6). In Python, with dynamic programming:
def sum456(n):
"""Find the sum of all numbers at most n digits which consist of 4,5,6 only"""
g = [0] * (n+1)
for i in range(1,n+1):
g[i] = 15*10**(i-1)*3**(i-1) + 3*g[i-1]
print(g) # show the array of partial solutions
return sum(g)
For n=6
>>> sum456(6)
[0, 15, 495, 14985, 449955, 13499865, 404999595]
418964910
Edit: I note that OP truncated his sum at 666554 so it doesn't fit the general pattern. It will be less the last few terms
>>> sum456(6) - (666555 + 666556 + 666564 + 666565 + 666566 + 666644 + 666645 + 666646 + 666654 + 666655 + 666656 + + 666664 + 666665 + 666666)
409632209

The sum of 4 through 666666 is:
total = sum([15*(3**i)*int('1'*(i+1)) for i in range(6)])
>>> 418964910
The sum of the few numbers between 666554 and 666666 is:
rest = 666555+666556+666564+666565+666566+
666644+666645+666646+
666654+666655+666656+
666664+666665+666666
>>> 9332701
total - rest
>>> 409632209

Java implementation of question:-
This uses the modulo(10^9 +7) for the answer.
public static long compute_sum(long[] digits, long max_val, long count[]) {
List<Long> cur_val = new ArrayList<>();
long sum = 0;
long mod = ((long)Math.pow(10,9))+7;
long num_val = 0;
while (true) {
_next_val(cur_val, digits);
num_val = _get_val(cur_val, digits, count);
sum =(sum%mod + (num_val)%mod)%mod;
if (num_val == max_val) {
break;
}
}
return sum;
}
public static void _next_val(List<Long> cur_val, long[] digits) {
for (int pos = 0; pos < cur_val.size(); pos++) {
cur_val.set(pos, cur_val.get(pos) + 1);
if (cur_val.get(pos) < digits.length)
return;
cur_val.set(pos, 0L);
}
cur_val.add(0L);
}
public static long _get_val(List<Long> cur_val, long[] digits, long count[]) {
long digit_val = 1;
long num_val = 0;
long[] digitAppearanceCount = new long[]{0,0,0};
for (Long x : cur_val) {
digitAppearanceCount[x.intValue()] = digitAppearanceCount[x.intValue()]+1;
if (digitAppearanceCount[x.intValue()]>count[x.intValue()]){
num_val=0;
break;
}
num_val = num_val+(digits[x.intValue()] * digit_val);
digit_val *= 10;
}
return num_val;
}
public static void main(String[] args) {
long [] digits=new long[]{4,5,6};
long count[] = new long[]{1,1,1};
long max_val= 654;
System.out.println(compute_sum(digits, max_val, count));
}
The Answer by #gen-y-s (https://stackoverflow.com/a/31286947/8398943) is wrong (It includes 55,66,44 for x=y=z=1 which is exceeding the available 4s, 5s, 6s). It gives output as 12189 but it should be 3675 for x=y=z=1.
The logic by #Yu Hao (https://stackoverflow.com/a/31285816/8398943) has the same mistake as mentioned above. It gives output as 12189 but it should be 3675 for x=y=z=1.

Related

Algorithm to sum up all digits of a number

Can you please explain to me how this loop works? What is going on after first loop and after second etc.
def sum(n):
s = 0
while n:
s += n % 10
n /= 10
return s
>>> print sum(123)
6
def sum(n):
s = 0
while n:
s += n % 10
n /= 10
return s
Better rewrite this way (easier to understand):
def sum(n):
s = 0 // start with s = 0
while n > 0: // while our number is bigger than 0
s += n % 10 // add the last digit to s, for example 54%10 = 4
n /= 10 // integer division = just removing last digit, for example 54/10 = 5
return s // return the result
n > 0 in Python can be simply written as n
but I think it is bad practice for beginners
so basically, what we are doing in this algorithm is that we are taking one digit at a time from least significant digit of the number and adding that in our s (which is sum variable), and once we have added the least significant digit, we are then removing it and doing the above thing again and again till the numbers remains to be zero, so how do we know the least significant digit, well just take the remainder of the n by dividing it with 10, now how do we remove the last digit(least significant digit) , we just divide it with 10, so here you go, let me know if it is not understandable.
int main()
{
int t;
cin>>t;
cout<<floor(log10(t)+1);
return 0;
}
Output
254
3

Excess (-1) in base 4 representation

I've been trying to wrap my head around this one problem for the last couple of days, and I can't figure out a way to solve it. So, here it goes:
Given the base 4(that is 0, 1, 2, 3 as digits for a number), find the excess (-1) in base 4 representation of any negative or positive integer number.
examples:
-6 = (-1)22
conversely, (-1)22 in excess (-1) of base 4 = 2 * 4^0 + 2 * 4^1 + (-1) * 4^2 = 2 + 8 - 16 = 10 - 16 = -6 in base 10
27 = 2(-1)(-1)
conversely, 2(-1)(-1) = (-1) * 4^0 + (-1) * 4^1 + 2 * 4^2 = -1 - 4 + 32 = 27
I did come up with a few algorithms for positive numbers, but none of them hold true for all negative numbers, so into the trash they went.
Can anyone give me some kind of clue here? Thanks!
----------------
Edit: I'm going to try to rephrase this question in such a way that it does not raise any confusions.
Consider the radix obtained by subtracting 1 from every digit, called the excess-(-1) of base 4. In this radix, any number can be represented using the digits -1, 0, 1, 2. So, the problem asks for an algorithm that gets as an input any integer number, and gives as output the representation of that given number.
Examples:
decimal -6 = -1 2 2 for the excess-(-1) of base 4.
To verify this, we take the representation -1 -1 2 and transform it to a decimal number, start from the right-most digit and use the generic base n to base 10 algorithm, like so:
number = 2 * 4^0 + 2 * 4^1 + (-1) * 4^2 = 2 + 4 - 16 = -6
I don't know if "quaterit" is the correct word for the radix in this representation, but I'm going to use it anyway.
Since you say you already have an algorithm for positive numbers, I'll try to take a negative number as an input and write something that uses what you already have. The code below doesn't quite work, but I'll explain why at the end.
int[] BaseFourExcessForNegativeNumbers(int x) {
int powerOfFour = 1;
while (-powerOfFour > x) {
powerOfFour *= 4;
}
int firstQuaterit = -1;
int remainder = x + powerOfFour;
int[] otherQuaterits;
if (remainder >= 0) {
otherQuaterits = BaseFourExcessForPositiveNumbers(remainder);
} else {
otherQuaterits = BaseFourExcessForNegativeNumbers(remainder);
}
int[] result = new int[otherQuaterits.Length + 1];
result[0] = firstQuaterit;
for (int index = 0; index < otherQuaterits.Length; ++index) {
result[index + 1] = otherQuaterits[index];
}
return result;
}
The idea here is that every negative number x will start with a (-1) in this representation. If that (-1) is in the 4^n position, we want to find out how to represent x - (-1)*4^n to see how to represent the rest of the number.
The reason the code I wrote won't work is that it doesn't take into consideration the possibility that the second quaterit is a 0. If that happens, the array my code will produce will be missing that 0. In fact, if BaseFourExcessForPositiveNumbers is written in the same way, the resulting array will be missing every 0, but will otherwise be correct. A workaround is to keep track of which place the first quaterit takes, and then make the array that size, and fill it from the back to the front.

Algorithm for separating integer into a sum of products of single digit numbers? [duplicate]

A couple of days ago I played around with Befunge which is an esoteric programming language. Befunge uses a LIFO stack to store data. When you write programs the digits from 0 to 9 are actually Befunge-instructions which push the corresponding values onto the stack. So for exmaple this would push a 7 to stack:
34+
In order to push a number greater than 9, calculations must be done with numbers less than or equal to 9. This would yield 123.
99*76*+
While solving Euler Problem 1 with Befunge I had to push the fairly large number 999 to the stack. Here I began to wonder how I could accomplish this task with as few instructions as possible. By writing a term down in infix notation and taking out common factors I came up with
9993+*3+*
One could also simply multiply two two-digit numbers which produce 999, e.g.
39*66*1+*
I thought about this for while and then decided to write a program which puts out the smallest expression according to these rules in reverse polish notation for any given integer. This is what I have so far (written in NodeJS with underscorejs):
var makeExpr = function (value) {
if (value < 10) return value + "";
var output = "", counter = 0;
(function fn (val) {
counter++;
if(val < 9) { output += val; return; };
var exp = Math.floor(Math.log(val) / Math.log(9));
var div = Math.floor(val / Math.pow(9, exp));
_( exp ).times(function () { output += "9"; });
_(exp-1).times(function () { output += "*"; });
if (div > 1) output += div + "*";
fn(val - Math.pow(9, exp) * div);
})(value);
_(counter-1).times(function () { output+= "+"; });
return output.replace(/0\+/, "");
};
makeExpr(999);
// yields 999**99*3*93*++
This piece of code constructs the expression naively and is obvously way to long. Now my questions:
Is there an algorithm to simplify expressions in reverse polish notation?
Would simplification be easier in infix notation?
Can an expression like 9993+*3+* be proofed to be the smallest one possible?
I hope you can give some insights. Thanks in advance.
When only considering multiplication and addition, it's pretty easy to construct optimal formula's, because that problem has the optimal substructure property. That is, the optimal way to build [num1][num2]op is from num1 and num2 that are both also optimal. If duplication is also considered, that's no longer true.
The num1 and num2 give rise to overlapping subproblems, so Dynamic Programming is applicable.
We can simply, for a number i:
For every 1 < j <= sqrt(i) that evenly divides i, try [j][i / j]*
For every 0 < j < i/2, try [j][i - j]+
Take the best found formula
That is of course very easy to do bottom-up, just start at i = 0 and work your way up to whatever number you want. Step 2 is a little slow, unfortunately, so after say 100000 it starts to get annoying to wait for it. There might be some trick that I'm not seeing.
Code in C# (not tested super well, but it seems to work):
string[] n = new string[10000];
for (int i = 0; i < 10; i++)
n[i] = "" + i;
for (int i = 10; i < n.Length; i++)
{
int bestlen = int.MaxValue;
string best = null;
// try factors
int sqrt = (int)Math.Sqrt(i);
for (int j = 2; j <= sqrt; j++)
{
if (i % j == 0)
{
int len = n[j].Length + n[i / j].Length + 1;
if (len < bestlen)
{
bestlen = len;
best = n[j] + n[i / j] + "*";
}
}
}
// try sums
for (int j = 1; j < i / 2; j++)
{
int len = n[j].Length + n[i - j].Length + 1;
if (len < bestlen)
{
bestlen = len;
best = n[j] + n[i - j] + "+";
}
}
n[i] = best;
}
Here's a trick to optimize searching for the sums. Suppose there is an array that contains, for every length, the highest number that can be made with that length. An other thing that is perhaps less obvious that this array also gives us, is a quick way to determine the shortest number that is bigger than some threshold (by simply scanning through the array and noting the first position that crosses the threshold). Together, that gives a quick way to discard huge portions of the search space.
For example, the biggest number of length 3 is 81 and the biggest number of length 5 is 728. Now if we want to know how to get 1009 (prime, so no factors found), first we try the sums where the first part has length 1 (so 1+1008 through 9+1000), finding 9+1000 which is 9 characters long (95558***+).
The next step, checking the sums where the first part has length 3 or less, can be skipped completely. 1009 - 81 = 929, and 929 (the lowest that the second part of the sum can be if the first part is to be 3 characters or less) is bigger than 728 so numbers of 929 and over must be at least 7 characters long. So if the first part of the sum is 3 characters, the second part must be at least 7 characters, and then there's also a + sign on the end, so the total is at least 11 characters. The best so far was 9, so this step can be skipped.
The next step, with 5 characters in the first part, can also be skipped, because 1009 - 728 = 280, and to make 280 or high we need at least 5 characters. 5 + 5 + 1 = 11, bigger than 9, so don't check.
Instead of checking about 500 sums, we only had to check 9 this way, and the check to make the skipping possible is very quick. This trick is good enough that generating all numbers up to a million only takes 3 seconds on my PC (before, it would take 3 seconds to get to 100000).
Here's the code:
string[] n = new string[100000];
int[] biggest_number_of_length = new int[n.Length];
for (int i = 0; i < 10; i++)
n[i] = "" + i;
biggest_number_of_length[1] = 9;
for (int i = 10; i < n.Length; i++)
{
int bestlen = int.MaxValue;
string best = null;
// try factors
int sqrt = (int)Math.Sqrt(i);
for (int j = 2; j <= sqrt; j++)
{
if (i % j == 0)
{
int len = n[j].Length + n[i / j].Length + 1;
if (len < bestlen)
{
bestlen = len;
best = n[j] + n[i / j] + "*";
}
}
}
// try sums
for (int x = 1; x < bestlen; x += 2)
{
int find = i - biggest_number_of_length[x];
int min = int.MaxValue;
// find the shortest number that is >= (i - biggest_number_of_length[x])
for (int k = 1; k < biggest_number_of_length.Length; k += 2)
{
if (biggest_number_of_length[k] >= find)
{
min = k;
break;
}
}
// if that number wasn't small enough, it's not worth looking in that range
if (min + x + 1 < bestlen)
{
// range [find .. i] isn't optimal
for (int j = find; j < i; j++)
{
int len = n[i - j].Length + n[j].Length + 1;
if (len < bestlen)
{
bestlen = len;
best = n[i - j] + n[j] + "+";
}
}
}
}
// found
n[i] = best;
biggest_number_of_length[bestlen] = i;
}
There's still room for improvement. This code will re-check sums that it has already checked. There are simple ways to make it at least not check the same sum twice (by remembering the last find), but that made no significant difference in my tests. It should be possible to find a better upper bound.
There's also 93*94*1+*, which is basically 27*37.
Were I to attack this problem, I'd start by first trying to evenly divide the number. So given 999 I would divide by 9 and get 111. Then I'd try to divide by 9, 8, 7, etc. until I discovered that 111 is 3*37.
37 is prime, so I go greedy and divide by 9, giving me 4 with a remainder of 1.
That seems to give me optimum results for the half dozen I've tried. It's a little expensive, of course, testing for even divisibility. But perhaps not more expensive than generating a too-long expression.
Using this, 100 becomes 55*4*. 102 works out to 29*5*6+.
101 brings up an interesting case. 101/9 = (9*11) + 2. Or, alternately, (9*9)+20. Let's see:
983+*2+ (9*11) + 2
99*45*+ (9*9) + 20
Whether it's easier to generate the postfix directly or generate infix and convert, I really don't know. I can see benefits and drawbacks to each.
Anyway, that's the approach I'd take: try to divide evenly at first, and then be greedy dividing by 9. Not sure exactly how I'd structure it.
I'd sure like to see your solution once you figure it out.
Edit
This is an interesting problem. I came up with a recursive function that does a credible job of generating postfix expressions, but it's not optimum. Here it is in C#.
string GetExpression(int val)
{
if (val < 10)
{
return val.ToString();
}
int quo, rem;
// first see if it's evenly divisible
for (int i = 9; i > 1; --i)
{
quo = Math.DivRem(val, i, out rem);
if (rem == 0)
{
// If val < 90, then only generate here if the quotient
// is a one-digit number. Otherwise it can be expressed
// as (9 * x) + y, where x and y are one-digit numbers.
if (val >= 90 || (val < 90 && quo <= 9))
{
// value is (i * quo)
return i + GetExpression(quo) + "*";
}
}
}
quo = Math.DivRem(val, 9, out rem);
// value is (9 * quo) + rem
// optimization reduces (9 * 1) to 9
var s1 = "9" + ((quo == 1) ? string.Empty : GetExpression(quo) + "*");
var s2 = GetExpression(rem) + "+";
return s1 + s2;
}
For 999 it generates 9394*1+**, which I believe is optimum.
This generates optimum expressions for values <= 90. Every number from 0 to 90 can be expressed as the product of two one-digit numbers, or by an expression of the form (9x + y), where x and y are one-digit numbers. However, I don't know that this guarantees an optimum expression for values greater than 90.
There is 44 solutions for 999 with lenght 9:
39149*+**
39166*+**
39257*+**
39548*+**
39756*+**
39947*+**
39499**+*
39669**+*
39949**+*
39966**+*
93149*+**
93166*+**
93257*+**
93548*+**
93756*+**
93947*+**
93269**+*
93349**+*
93366**+*
93439**+*
93629**+*
93636**+*
93926**+*
93934**+*
93939+*+*
93948+*+*
93957+*+*
96357**+*
96537**+*
96735**+*
96769+*+*
96778+*+*
97849+*+*
97858+*+*
97867+*+*
99689+*+*
956*99*+*
968*79*+*
39*149*+*
39*166*+*
39*257*+*
39*548*+*
39*756*+*
39*947*+*
Edit:
I have working on some search space pruning improvements so sorry I have not posted it immediately. There is script in Erlnag. Original one takes 14s for 999 but this one makes it in around 190ms.
Edit2:
There is 1074 solutions of length 13 for 9999. It takes 7 minutes and there is some of them below:
329+9677**+**
329+9767**+**
338+9677**+**
338+9767**+**
347+9677**+**
347+9767**+**
356+9677**+**
356+9767**+**
3147789+***+*
31489+77***+*
3174789+***+*
3177489+***+*
3177488*+**+*
There is version in C with more aggressive pruning of state space and returns only one solution. It is way faster.
$ time ./polish_numbers 999
Result for 999: 39149*+**, length 9
real 0m0.008s
user 0m0.004s
sys 0m0.000s
$ time ./polish_numbers 99999
Result for 99999: 9158*+1569**+**, length 15
real 0m34.289s
user 0m34.296s
sys 0m0.000s
harold was reporting his C# bruteforce version makes same number in 20s so I was curious if I can improve mine. I have tried better memory utilization by refactoring data structure. Searching algorithm mostly works with length of solution and it's existence so I separated this information to one structure (best_rec_header). I have also make solution as tree branches separated in another (best_rec_args). Those data are used only when new better solution for given number. There is code.
Result for 99999: 9158*+1569**+**, length 15
real 0m31.824s
user 0m31.812s
sys 0m0.012s
It was still too much slow. So I tried some other versions. First I added some statistics to demonstrate that mine code is not computing all smaller numbers.
Result for 99999: 9158*+1569**+**, length 15, (skipped 36777, computed 26350)
Then I have tried change code to compute + solutions for bigger numbers first.
Result for 99999: 1956**+9158*+**, length 15, (skipped 0, computed 34577)
real 0m17.055s
user 0m17.052s
sys 0m0.008s
It was almost as twice faster. But there was another idea that may be sometimes I give up find solution for some number as limited by current best_len limit. So I tried to make small numbers (up to half of n) unlimited (note 255 as best_len limit for first of operands finding).
Result for 99999: 9158*+1569**+**, length 15, (skipped 36777, computed 50000)
real 0m12.058s
user 0m12.048s
sys 0m0.008s
Nice improvement but what if I limit solutions for those numbers by best solution found so far. It needs some sort of computation global state. Code becomes more complicated but result even faster.
Result for 99999: 97484777**+**+*, length 15, (skipped 36997, computed 33911)
real 0m10.401s
user 0m10.400s
sys 0m0.000s
It was even able to compute ten times bigger number.
Result for 999999: 37967+2599**+****, length 17, (skipped 440855)
real 12m55.085s
user 12m55.168s
sys 0m0.028s
Then I decided to try also brute force method and this was even faster.
Result for 99999: 9158*+1569**+**, length 15
real 0m3.543s
user 0m3.540s
sys 0m0.000s
Result for 999999: 37949+2599**+****, length 17
real 5m51.624s
user 5m51.556s
sys 0m0.068s
Which shows, that constant matter. It is especially true for modern CPU when brute force approach gets advantage from better vectorization, better CPU cache utilization and less branching.
Anyway, I think there is some better approach using better understanding of number theory or space searching by algorithms as A* and so. And for really big numbers there may be good idea to use genetic algorithms.
Edit3:
harold came with new idea to eliminate trying to much sums. I have implemented it in this new version. It is order of magnitude faster.
$ time ./polish_numbers 99999
Result for 99999: 9158*+1569**+**, length 15
real 0m0.153s
user 0m0.152s
sys 0m0.000s
$ time ./polish_numbers 999999
Result for 999999: 37949+2599**+****, length 17
real 0m3.516s
user 0m3.512s
sys 0m0.004s
$ time ./polish_numbers 9999999
Result for 9999999: 9788995688***+***+*, length 19
real 1m39.903s
user 1m39.904s
sys 0m0.032s
Don't forget, you can also push ASCII values!!
Usually, this is longer, but for higher numbers it can get much shorter:
If you needed the number 123, it would be much better to do
"{" than 99*76*+

Efficient algorithm for getting from 1 to n with 3 specific operations

The question:
Given those 3 valid operations over numbers and an integer n:
add 1 to the number
multiply the number by 2
multiply the number by 3
describe an efficient algorithm for the minimal number of operations of getting from 1 to n with the 3 operations mentioned above.
For example for n = 28 the answer is 4 : 1 * 3 * 3 * 3 + 1 = 27 + 1 = 28.
I wrote the code below in java and it takes a lot of time to end for num>=1000. Can you help me understand what is wrong w.r.t. efficiency. I am trying to make a more efficient algorithm to solve the problem. Please help.
public static int ToN(int num){
return to(1,num);
}
public static int to(int x,int y){
if (x==y)
return 0;
if (x>y)
return y;
return Math.min(Math.min(to(x+1,y),to(x*2,y)),to(x*3,y))+1;
}
Dynamic Programming
Using previously computed results to compute new results. This makes the solution linear, as opposed to exponential in your case.
Python program
import sys
n = int(raw_input())
dp = [sys.maxint]*(n+1)
dp[1] = 1
for i in xrange(2, n+1):
dp[i] = dp[i-1]+1
if i%2==0: dp[i] = min(dp[i], dp[i/2]+1)
if i%3==0: dp[i] = min(dp[i], dp[i/3]+1)
print dp[n]
Your solution is slow, because you're not caching previously calculated results (best solutions for i < num), and you recalculating them each time. This make it working for exponential time, because each call of toN function spawns 3 other calls, and so on recursivelly
public static int ToN(int num){
int[] opsRequired = new int[num+1];
opsRequired[1] = 1;
for (int i=2; i <= num; i++) {
opsRequired[i] = opsRequired[i-1];
if (i % 2 == 0)
opsRequired[i] = Math.min(opsRequired[i], opsRequired[i/2]);
if (i % 3 == 0)
opsRequired[i] = Math.min(opsRequired[i], opsRequired[i/3]);
opsRequired[i]++;
}
}
Two things that you can introspect:
1) Why does your recursive method need two parameters? If the problem involves moving from 1 to n at all times, we can manage with just a single parameter. Note that while recursing, you pass y as is (and you don't operate on y's value inside your method either). This usually indicates that the parameter itself is redundant.
2) After having optimized the parameters, we can now memoize the solution. This would speed up your method greatly.
For positive integers, multiplying by 3 will always be faster than by 2, which will always be faster than adding 1. So the algorithm is trivial.
if x == y return 0
if x <= y/3 return 1 + to(x*3,y)
if x <= y/2 return 1 + to(x*2,y)
return y-x

Find number of binary numbers with certain constraints

This is more of a puzzle than a coding problem. I need to find how many binary numbers can be generated satisfying certain constraints. The inputs are
(integer) Len - Number of digits in the binary number
(integer) x
(integer) y
The binary number has to be such that taking any x adjacent digits from the binary number should contain at least y 1's.
For example -
Len = 6, x = 3, y = 2
0 1 1 0 1 1 - Length is 6, Take any 3 adjacent digits from this and
there will be 2 l's
I had this C# coding question posed to me in an interview and I cannot figure out any algorithm to solve this. Not looking for code (although it's welcome), any sort of help, pointers are appreciated
This problem can be solved using dynamic programming. The main idea is to group the binary numbers according to the last x-1 bits and the length of each binary number. If appending a bit sequence to one number yields a number satisfying the constraint, then appending the same bit sequence to any number in the same group results in a number satisfying the constraint also.
For example, x = 4, y = 2. both of 01011 and 10011 have the same last 3 bits (011). Appending a 0 to each of them, resulting 010110 and 100110, both satisfy the constraint.
Here is pseudo code:
mask = (1<<(x-1)) - 1
count[0][0] = 1
for(i = 0; i < Len-1; ++i) {
for(j = 0; j < 1<<i && j < 1<<(x-1); ++j) {
if(i<x-1 || count1Bit(j*2+1)>=y)
count[i+1][(j*2+1)&mask] += count[i][j];
if(i<x-1 || count1Bit(j*2)>=y)
count[i+1][(j*2)&mask] += count[i][j];
}
}
answer = 0
for(j = 0; j < 1<<i && j < 1<<(x-1); ++j)
answer += count[Len][j];
This algorithm assumes that Len >= x. The time complexity is O(Len*2^x).
EDIT
The count1Bit(j) function counts the number of 1 in the binary representation of j.
The only input to this algorithm are Len, x, and y. It starts from an empty binary string [length 0, group 0], and iteratively tries to append 0 and 1 until length equals to Len. It also does the grouping and counting the number of binary strings satisfying the 1-bits constraint in each group. The output of this algorithm is answer, which is the number of binary strings (numbers) satisfying the constraints.
For a binary string in group [length i, group j], appending 0 to it results in a binary string in group [length i+1, group (j*2)%(2^(x-1))]; appending 1 to it results in a binary string in group [length i+1, group (j*2+1)%(2^(x-1))].
Let count[i,j] be the number of binary strings in group [length i, group j] satisfying the 1-bits constraint. If there are at least y 1 in the binary representation of j*2, then appending 0 to each of these count[i,j] binary strings yields a binary string in group [length i+1, group (j*2)%(2^(x-1))] which also satisfies the 1-bit constraint. Therefore, we can add count[i,j] into count[i+1,(j*2)%(2^(x-1))]. The case of appending 1 is similar.
The condition i<x-1 in the above algorithm is to keep the binary strings growing when length is less than x-1.
Using the example of LEN = 6, X = 3 and Y = 2...
Build an exhaustive bit pattern generator for X bits. A simple binary counter can do this. For example, if X = 3
then a counter from 0 to 7 will generate all possible bit patterns of length 3.
The patterns are:
000
001
010
011
100
101
110
111
Verify the adjacency requirement as the patterns are built. Reject any patterns that do not qualify.
Basically this boils down to rejecting any pattern containing fewer than 2 '1' bits (Y = 2). The list prunes down to:
011
101
110
111
For each member of the pruned list, add a '1' bit and retest the first X bits. Keep the new pattern if it passes the
adjacency test. Do the same with a '0' bit. For example this step proceeds as:
1011 <== Keep
1101 <== Keep
1110 <== Keep
1111 <== Keep
0011 <== Reject
0101 <== Reject
0110 <== Keep
0111 <== Keep
Which leaves:
1011
1101
1110
1111
0110
0111
Now repeat this process until the pruned set is empty or the member lengths become LEN bits long. In the end
the only patterns left are:
111011
111101
111110
111111
110110
110111
101101
101110
101111
011011
011101
011110
011111
Count them up and you are done.
Note that you only need to test the first X bits on each iteration because all the subsequent patterns were verified in prior steps.
Considering that input values are variable and wanted to see the actual output, I used recursive algorithm to determine all combinations of 0 and 1 for a given length :
private static void BinaryNumberWithOnes(int n, int dump, int ones, string s = "")
{
if (n == 0)
{
if (BinaryWithoutDumpCountContainsnumberOfOnes(s, dump,ones))
Console.WriteLine(s);
return;
}
BinaryNumberWithOnes(n - 1, dump, ones, s + "0");
BinaryNumberWithOnes(n - 1, dump, ones, s + "1");
}
and BinaryWithoutDumpCountContainsnumberOfOnes to determine if the binary number meets the criteria
private static bool BinaryWithoutDumpCountContainsnumberOfOnes(string binaryNumber, int dump, int ones)
{
int current = 0;
int count = binaryNumber.Length;
while(current +dump < count)
{
var fail = binaryNumber.Remove(current, dump).Replace("0", "").Length < ones;
if (fail)
{
return false;
}
current++;
}
return true;
}
Calling BinaryNumberWithOnes(6, 3, 2) will output all binary numbers that match
010011
011011
011111
100011
100101
100111
101011
101101
101111
110011
110101
110110
110111
111011
111101
111110
111111
Sounds like a nested for loop would do the trick. Pseudocode (not tested).
value = '0101010111110101010111' // change this line to format you would need
for (i = 0; i < (Len-x); i++) { // loop over value from left to right
kount = 0
for (j = i; j < (i+x); j++) { // count '1' bits in the next 'x' bits
kount += value[j] // add 0 or 1
if kount >= y then return success
}
}
return fail
The naive approach would be a tree-recursive algorithm.
Our recursive method would slowly build the number up, e.g. it would start at xxxxxx, return the sum of a call with 1xxxxx and 0xxxxx, which themselves will return the sum of a call with 10, 11 and 00, 01, etc. except if the x/y conditions are NOT satisfied for the string it would build by calling itself it does NOT go down that path, and if you are at a terminal condition (built a number of the correct length) you return 1. (note that since we're building the string up from left to right, you don't have to check x/y for the entire string, just also considering the newly added digit!)
By returning a sum over all calls then all of the returned 1s will pool together and be returned by the initial call, equalling the number of constructed strings.
No idea what the big O notation for time complexity is for this one, it could be as bad as O(2^n)*O(checking x/y conditions) but it will prune lots of branches off the tree in most cases.
UPDATE: One insight I had is that all branches of the recursive tree can be 'merged' if they have identical last x digits so far, because then the same checks would be applied to all digits hereafter so you may as well double them up and save a lot of work. This now requires building the tree explicitly instead of implicitly via recursive calls, and maybe some kind of hashing scheme to detect when branches have identical x endings, but for large length it would provide a huge speedup.
My approach is to start by getting the all binary numbers with the minimum number of 1's, which is easy enough, you just get every unique permutation of a binary number of length x with y 1's, and cycle each unique permutation "Len" times. By flipping the 0 bits of these seeds in every combination possible, we are guaranteed to iterate over all of the binary numbers that fit the criteria.
from itertools import permutations, cycle, combinations
def uniq(x):
d = {}
for i in x:
d[i]=1
return d.keys()
def findn( l, x, y ):
window = []
for i in xrange(y):
window.append(1)
for i in xrange(x-y):
window.append(0)
perms = uniq(permutations(window))
seeds=[]
for p in perms:
pr = cycle(p)
seeds.append([ pr.next() for i in xrange(l) ]) ###a seed is a binary number fitting the criteria with minimum 1 bits
bin_numbers=[]
for seed in seeds:
if seed in bin_numbers: continue
indexes = [ i for i, x in enumerate(seed) if x == 0] ### get indexes of 0 "bits"
exit = False
for i in xrange(len(indexes)+1):
if( exit ): break
for combo in combinations(indexes, i): ### combinatorically flipping the zero bits in the seed
new_num = seed[:]
for index in combo: new_num[index]+=1
if new_num in bin_numbers:
### if our new binary number has been seen before
### we can break out since we are doing a depth first traversal
exit=True
break
else:
bin_numbers.append(new_num)
print len(bin_numbers)
findn(6,3,2)
Growth of this approach is definitely exponential, but I thought I'd share my approach in case it helps someone else get to a lower complexity solution...
Set some condition and introduce simple help variable.
L = 6, x = 3 , y = 2 introduce d = x - y = 1
Condition: if the list of the next number hypotetical value and the previous x - 1 elements values has a number of 0-digits > d next number concrete value must be 1, otherwise add two brances with both 1 and 0 as concrete value.
Start: check(Condition) => both 0,1 due to number of total zeros in the 0-count check.
Empty => add 0 and 1
Step 1:Check(Condition)
0 (number of next value if 0 and previous x - 1 zeros > d(=1)) -> add 1 to sequence
1 -> add both 0,1 in two different branches
Step 2: check(Condition)
01 -> add 1
10 -> add 1
11 -> add 0,1 in two different branches
Step 3:
011 -> add 0,1 in two branches
101 -> add 1 (the next value if 0 and prev x-1 seq would be 010, so we prune and set only 1)
110 -> add 1
111 -> add 0,1
Step 4:
0110 -> obviously 1
0111 -> both 0,1
1011 -> both 0,1
1101 -> 1
1110 -> 1
1111 -> 0,1
Step 5:
01101 -> 1
01110 -> 1
01111 -> 0,1
10110 -> 1
10111 -> 0,1
11011 -> 0,1
11101 -> 1
11110 -> 1
11111 -> 0,1
Step 6 (Finish):
011011
011101
011110
011111
101101
101110
101111
110110
110111
111011
111101
111110
111111
Now count. I've tested for L = 6, x = 4 and y = 2 too, but consider to check the algorithm for special cases and extended cases.
Note: I'm pretty sure some algorithm with Disposition Theory bases should be a really massive improvement of my algorithm.
So in a series of Len binary digits, you are looking for a x-long segment that contains y 1's ..
See the execution: http://ideone.com/xuaWaK
Here's my Algorithm in Java:
import java.util.*;
import java.lang.*;
class Main
{
public static ArrayList<String> solve (String input, int x, int y)
{
int s = 0;
ArrayList<String> matches = new ArrayList<String>();
String segment = null;
for (int i=0; i<(input.length()-x); i++)
{
s = 0;
segment = input.substring(i,(i+x));
System.out.print(" i: "+i+" ");
for (char c : segment.toCharArray())
{
System.out.print("*");
if (c == '1')
{
s = s + 1;
}
}
if (s == y)
{
matches.add(segment);
}
System.out.println();
}
return matches;
}
public static void main (String [] args)
{
String input = "011010101001101110110110101010111011010101000110010";
int x = 6;
int y = 4;
ArrayList<String> matches = null;
matches = solve (input, x, y);
for (String match : matches)
{
System.out.println(" > "+match);
}
System.out.println(" Number of matches is " + matches.size());
}
}
The number of patterns of length X that contain at least Y 1 bits is countable. For the case x == y we know there is exactly one pattern of the 2^x possible patterns that meets the criteria. For smaller y we need to sum up the number of patterns which have excess 1 bits and the number of patterns that have exactly y bits.
choose(n, k) = n! / k! (n - k)!
numPatterns(x, y) {
total = 0
for (int j = x; j >= y; j--)
total += choose(x, j)
return total
}
For example :
X = 4, Y = 4 : 1 pattern
X = 4, Y = 3 : 1 + 4 = 5 patterns
X = 4, Y = 2 : 1 + 4 + 6 = 11 patterns
X = 4, Y = 1 : 1 + 4 + 6 + 4 = 15 patterns
X = 4, Y = 0 : 1 + 4 + 6 + 4 + 1 = 16
(all possible patterns have at least 0 1 bits)
So let M be the number of X length patterns that meet the Y criteria. Now, that X length pattern is a subset of N bits. There are (N - x + 1) "window" positions for the sub pattern, and 2^N total patterns possible. If we start with any of our M patterns, we know that appending a 1 to the right and shifting to the next window will result in one of our known M patterns. The question is, how many of the M patterns can we add a 0 to, shift right, and still have a valid pattern in M?
Since we are adding a zero, we have to be either shifting away from a zero, or we have to already be in an M where we have an excess of 1 bits. To flip that around, we can ask how many of the M patterns have exactly Y bits and start with a 1. Which is the same as "how many patterns of length X-1 have Y-1 bits", which we know how to answer:
shiftablePatternCount = M - choose(X-1, Y-1)
So starting with M possibilities, we are going to increase by shiftablePatternCount when we slide to the right. All patterns in the new window are in the set of M, with some patterns now duplicated. We are going to shift a number of times to fill up N by (N - X), each time increasing the count by shiftablePatternCount, so the full answer should be :
totalCountOfMatchingPatterns = M + (N - X)*shiftablePatternCount
edit - realized a mistake. I need to count the duplicates of the shiftable patterns that are generated. I think that's doable. (draft still)
I am not sure about my answer but here is my view.just take a look at it,
Len=4,
x=3,
y=2.
i just took out two patterns,cause pattern must contain at least y's 1.
X 1 1 X
1 X 1 X
X - represent don't care
now count for 1st expression is 2 1 1 2 =4
and for 2nd expression 1 2 1 2 =4
but 2 pattern is common between both so minus 2..so there will be total 6 pair which satisfy the condition.
I happen to be using a algoritem similar to your problem, trying to find a way to improve it, I found your question. So I will share
static int GetCount(int length, int oneBits){
int result = 0;
double count = Math.Pow(2, length);
for (int i = 1; i <= count - 1; i++)
{
string str = Convert.ToString(i, 2).PadLeft(length, '0');
if (str.ToCharArray().Count(c => c == '1') == oneBits)
{
result++;
}
}
return result;
}
not very efficent I think, but elegent solution.

Resources