Any useful mathematical function / algorithm to break down big numbers? - algorithm

So what I want to do is break down numbers that are tens of thousands in size into smaller numbers, preferably 2~9.
The first thing that came to my mind was prime factorization; for instance, the number 49392 can be expressed as (2 x 2 x 2 x 2 x 3 x 3 x 7 x 7 x 7). But there are prime numbers, and numbers such as 25378 = 2 × 12689, that can't be expressed with only multiplication.
So I want to break these numbers down using multiplication and addition. For example, the number 25378 could be expressed as 25346 + 32 = (2 × 19 × 23 × 29) + (2^5). Still, 23 and 29 are too big, but I just picked random numbers to show what I mean by using addition and multiplication together to express big numbers; I'm sure there's a better combination of numbers to express 25378 than 25346 and 32.
Anyways, I thought programming this would involve a ton of unnecessary if statements and would be incredibly slow in the big picture. So I was wondering: is there a mathematical algorithm or function that does this? If not, I can just optimize the code myself, but I was curious. I couldn't find anything on Google myself.

Assuming the problem is to write a number as the simplest expression containing the numbers 1-9, addition and multiplication (simplest = smallest number of operators), then this Python program does this in O(N^2) time.
A number N can be written as the sum or product of two smaller numbers, so if you've precalculated the simplest way of constructing the numbers 1..N-1, then you can find the simplest way of constructing N in O(N) time. Then it's just a matter of avoiding duplicate work (for example, assuming A <= B without loss of generality in the expressions A+B and A*B) and nicely printing out the final expression.
def nice_exp(x, pri):
    if isinstance(x, int):
        return str(x)
    else:
        oppri = 1 if x[0] == '*' else 0
        if oppri < pri:
            bracks = '()'
        else:
            bracks = ['', '']
        return '%s%s %s %s%s' % (bracks[0], nice_exp(x[1], oppri), x[0], nice_exp(x[2], oppri), bracks[1])

def solve(N):
    infinity = 1e12
    size = [infinity] * (N+1)
    expr = [None] * (N+1)
    for i in range(N+1):
        if i < 10:
            size[i] = 1
            expr[i] = i
            continue
        for j in range(2, i):
            if j * j > i: break
            if i%j == 0 and size[j] + size[i//j] + 1 < size[i]:
                size[i] = size[j] + size[i//j] + 1
                expr[i] = ('*', expr[j], expr[i//j])
        for j in range(1, i):
            if j > i-j: break
            if size[j] + size[i-j] + 1 < size[i]:
                size[i] = size[j] + size[i-j] + 1
                expr[i] = ('+', expr[j], expr[i-j])
    return nice_exp(expr[N], 0)

print(solve(25378))
Output:
2 * (5 + 4 * 7 * (5 + 7 * 8 * 8))

Related

Sum of all numbers written with particular digits in a given range

My objective is to find the sum of all numbers from 4 to 666554 which consist of the digits 4, 5, 6 only.
SUM = 4+5+6+44+45+46+54+55+56+64+65+66+.....................+666554.
The simple method is to run a loop and add the numbers made of 4, 5 and 6 only.
long long sum = 0;
for(int i=4;i <=666554;i++){
    /*check if number contains only 4,5 and 6.
      if condition is true then add the number to the sum*/
}
But it seems to be inefficient. Checking that a number is made up of 4, 5 and 6 will take time. Is there any way to increase the efficiency? I have tried a lot but have found no new approach. Please help.
For 1-digit numbers, note that
4 + 5 + 6 == 5 * 3
For 2-digit numbers:
(44 + 45 + 46) + (54 + 55 + 56) + (64 + 65 + 66)
== 45 * 3 + 55 * 3 + 65 * 3
== 55 * 9
and so on.
In general, for n-digit numbers, there are 3^n of them consisting of 4, 5, 6 only, and their average value is exactly 55...5 (n digits). Using code, the sum of them is ('5' * n).to_i * 3 ** n (Ruby), or int('5' * n) * 3 ** n (Python).
You calculate the total up to 6-digit numbers, then subtract the sum of the 4/5/6-only numbers from 666555 to 666666.
P.S: for small numbers like 666554, using pattern matching is fast enough. (example)
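For concreteness, here is a small Python sketch of this closed-form approach (the function name and the use of itertools.product to list the oversized 6-digit numbers are mine, not part of the answer):

from itertools import product

def sum_of_456_numbers(limit=666554, max_digits=6):
    # closed-form total for all 1..6 digit numbers made of 4, 5, 6 only
    total = sum(int('5' * n) * 3 ** n for n in range(1, max_digits + 1))
    # subtract the few 6-digit 4/5/6-numbers that exceed the requested limit
    too_big = sum(int(''.join(d)) for d in product('456', repeat=max_digits)
                  if int(''.join(d)) > limit)
    return total - too_big

print(sum_of_456_numbers())  # 409632209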
Implement a counter in base 3 (the number of digit values), e.g. 0,1,2,10,11,12,20,21,22,100..., and then translate each base-3 number into a decimal number with the digits 4,5,6 (0->4, 1->5, 2->6) and add it to a running total. Repeat until the limit is reached.
def compute_sum(digits, max_val):
    def _next_val(cur_val):
        for pos in range(len(cur_val)):
            cur_val[pos]+=1
            if cur_val[pos]<len(digits):
                return
            cur_val[pos]=0
        cur_val.append(0)

    def _get_val(cur_val):
        digit_val=1
        num_val=0
        for x in cur_val:
            num_val+=digits[x]*digit_val
            digit_val*=10
        return num_val

    cur_val=[]
    sum=0
    while(True):
        _next_val(cur_val)
        num_val=_get_val(cur_val)
        if num_val>max_val:
            break
        sum+=num_val
    return sum

def main():
    digits=[4,5,6]
    max_val=666554
    print(digits, max_val)
    print(compute_sum(digits, max_val))
Mathematics is good, but not all problems are trivially "compressible", so knowing how to deal with them without mathematics can be worthwhile.
In this problem, the summation is trivial; the difficulty, at first glance, is efficiently enumerating the numbers that need to be added.
The "filter" route is a possibility: generate all possible numbers incrementally and filter out those which do not match; however, it is also quite inefficient (in general):
the condition might not be trivial to match: in this case, the easiest way is a conversion to string (fairly heavy on divisions and tests) followed by string matching
the ratio of filtering is not too bad to start with, at 30% per digit, but it scales very poorly, as gen-y-s remarked: for a 4-digit number it is at about 1%, i.e. generating and checking 100 numbers to get only 1 of them.
I would therefore advise a "generational" approach: only generate numbers that match the condition (and all of them).
I would note that generating all numbers composed of 4, 5 and 6 is like counting (in ternary):
starts from 4
45 becomes 46 (beware of carry-overs)
66 becomes 444 (extreme carry-over)
Let's go, in Python, as a generator:
def generator():
    def convert(array):
        i = 0
        for e in array:
            i *= 10
            i += e
        return i

    def increment(array):
        result = []
        carry = True
        for e in array[::-1]:
            if carry:
                e += 1
                carry = False
            if e > 6:
                e = 4
                carry = True
            result = [e,] + result
        if carry:
            result = [4,] + result
        return result

    array = [4]
    while True:
        num = convert(array)
        if num > 666554: break
        yield num
        array = increment(array)
Its result can be printed with sum(generator()):
$ time python example.py
409632209
python example.py 0.03s user 0.00s system 82% cpu 0.043 total
And here is the same in C++.
"Start with a simpler problem." —Polya
Sum the n-digit numbers which consist of the digits 4,5,6 only
As Yu Hao explains above, there are 3**n numbers and their average by symmetry is, e.g., 555555, so the sum is 3**n * (10**n-1)*5/9. But if you didn't spot that, here's how you might solve the problem another way.
The problem has a recursive construction, so let's try a recursive solution. Let g(n) be the sum of all 456-numbers of exactly n digits. Then we have the recurrence relation:
g(n) = (4+5+6)*10**(n-1)*3**(n-1) + 3*g(n-1)
To see this, separate the first digit of each number in the sum (e.g. for n=3, the hundreds column). That gives the first term. The second term is the sum of the remaining digits, with one copy of g(n-1) for each leading digit 4, 5, 6.
If that's still unclear, write out the n=2 sum and separate tens from units:
g(2) = 44+45+46 + 54+55+56 + 64+65+66
= (40+50+60)*3 + 3*(4+5+6)
= (4+5+6)*10*3 + 3*g(1)
Cool. At this point, the keen reader might like to check that Yu Hao's formula for g(n) satisfies our recurrence relation.
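(For the record, substituting the closed form g(n) = 3**n * (10**n - 1) * 5/9 into the right-hand side gives
15*10**(n-1)*3**(n-1) + 3*g(n-1)
= (5/3) * 3**(n-1) * (9*10**(n-1) + 10**(n-1) - 1)
= (5/3) * 3**(n-1) * (10**n - 1)
= 3**n * (10**n - 1) * 5/9,
which is g(n), as required.)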
To solve OP's problem, the sum of all 456-numbers from 4 to 666666 is g(1) + g(2) + g(3) + g(4) + g(5) + g(6). In Python, with dynamic programming:
def sum456(n):
    """Find the sum of all numbers at most n digits which consist of 4,5,6 only"""
    g = [0] * (n+1)
    for i in range(1,n+1):
        g[i] = 15*10**(i-1)*3**(i-1) + 3*g[i-1]
    print(g) # show the array of partial solutions
    return sum(g)
For n=6
>>> sum456(6)
[0, 15, 495, 14985, 449955, 13499865, 404999595]
418964910
Edit: I note that OP truncated his sum at 666554, so it doesn't fit the general pattern. It will be smaller by the last few terms:
>>> sum456(6) - (666555 + 666556 + 666564 + 666565 + 666566 + 666644 + 666645 + 666646 + 666654 + 666655 + 666656 + 666664 + 666665 + 666666)
409632209
The sum of 4 through 666666 is:
total = sum([15*(3**i)*int('1'*(i+1)) for i in range(6)])
>>> 418964910
The sum of the few numbers between 666554 and 666666 is:
rest = (666555+666556+666564+666565+666566+
        666644+666645+666646+
        666654+666655+666656+
        666664+666665+666666)
>>> 9332701
total - rest
>>> 409632209
Java implementation of the question:
This uses modulo (10^9 + 7) for the answer.
public static long compute_sum(long[] digits, long max_val, long count[]) {
    List<Long> cur_val = new ArrayList<>();
    long sum = 0;
    long mod = ((long)Math.pow(10,9))+7;
    long num_val = 0;
    while (true) {
        _next_val(cur_val, digits);
        num_val = _get_val(cur_val, digits, count);
        sum =(sum%mod + (num_val)%mod)%mod;
        if (num_val == max_val) {
            break;
        }
    }
    return sum;
}

public static void _next_val(List<Long> cur_val, long[] digits) {
    for (int pos = 0; pos < cur_val.size(); pos++) {
        cur_val.set(pos, cur_val.get(pos) + 1);
        if (cur_val.get(pos) < digits.length)
            return;
        cur_val.set(pos, 0L);
    }
    cur_val.add(0L);
}

public static long _get_val(List<Long> cur_val, long[] digits, long count[]) {
    long digit_val = 1;
    long num_val = 0;
    long[] digitAppearanceCount = new long[]{0,0,0};
    for (Long x : cur_val) {
        digitAppearanceCount[x.intValue()] = digitAppearanceCount[x.intValue()]+1;
        if (digitAppearanceCount[x.intValue()]>count[x.intValue()]){
            num_val=0;
            break;
        }
        num_val = num_val+(digits[x.intValue()] * digit_val);
        digit_val *= 10;
    }
    return num_val;
}

public static void main(String[] args) {
    long [] digits=new long[]{4,5,6};
    long count[] = new long[]{1,1,1};
    long max_val= 654;
    System.out.println(compute_sum(digits, max_val, count));
}
The answer by @gen-y-s (https://stackoverflow.com/a/31286947/8398943) is wrong (it includes 55, 66, 44 for x=y=z=1, which exceeds the available 4s, 5s, 6s). It gives the output as 12189 but it should be 3675 for x=y=z=1.
The logic by @Yu Hao (https://stackoverflow.com/a/31285816/8398943) has the same mistake as mentioned above. It gives the output as 12189 but it should be 3675 for x=y=z=1.

Algorithm for determining number of possible combinations

I need to write an algorithm for a given problem: You have infinite pennies, nickels, dimes, and quarters. Write a class method that will output all combinations of coins such that the total is 99 cents.
It seems like a permutation nPr problem. Any algorithm for it?
Regards,
Priyank
I think this problem is most easily answered using recursion with a table of denominations
{5000, 2000, ... 1} // $50's to one penny
You would start with:
WaysToMakeChange(10000, 0) // ie. $100...highest denomination index is 0 ($50)
WaysToMakeChange(amount, maxdenomindex) would calculate the count using 0 or more of the max denomination.
The recurrence is something like
WaysToMakeChange(amount - usedbymaxdenom, maxdenomindex - 1)
I programmed this and it can be optimized in many ways:
1) multithreading
2) Caching. This is very important. Because of the way the algorithm works, WaysToMakeChange(m,n) will be called many times with the same initial values:
For example. Changing $100 can be done by:
1 $50 + 0 $20's + 0 $10's + ways to make $50 with highest denomination $5 (i.e. WaysToMakeChange(5000, index for $5))
0 $50's + 2 $20's + 1 $10 + ways to make $50 with highest denomination $5 (i.e. WaysToMakeChange(5000, index for $5))
Clearly WaysToMakeChange(5000, index for $5) can be cached so that the subsequent call does not need to be made
3) Short-circuiting the lowest recursion.
Suppose static const int coins[] = {5000, 2000, 1000, 500, 200, 100, 50, 25, 10, 5, 1};
The first test for WaysToMakeChange(int total, int coinIndex) should be something like:
if( coins[_countof(coins)-1] == 1 && coinIndex == _countof(coins) - 2){
    return total / coins[_countof(coins)-2] + 1;
}
What does this mean? Well, if your lowest denomination is 1 then you only have to go as far as the second-lowest denomination (say a nickel). Then there are 1 + total/second-lowest-denomination ways left. For example:
49c -> 9 nickels + 4 pennies, 8 nickels + 9 pennies, ..., 0 nickels + 49 pennies = 1 + total/second-lowest denomination ways
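To make the recursion-plus-caching idea concrete, here is a minimal Python sketch (the names, lru_cache memoisation and hard-coded denominations are mine, not this answer's code); it includes the nickel/penny short-circuit described above, which assumes the lowest denomination is 1:

from functools import lru_cache

DENOMS = (25, 10, 5, 1)  # quarters, dimes, nickels, pennies

@lru_cache(maxsize=None)
def ways_to_make_change(amount, max_index=0):
    """Count the ways to make `amount` using DENOMS[max_index:]."""
    if amount == 0:
        return 1
    if max_index == len(DENOMS) - 1:   # only pennies left: exactly one way
        return 1
    if max_index == len(DENOMS) - 2:   # nickels + pennies: short-circuit
        return amount // DENOMS[max_index] + 1
    coin = DENOMS[max_index]
    return sum(ways_to_make_change(amount - used * coin, max_index + 1)
               for used in range(amount // coin + 1))

print(ways_to_make_change(99))  # 213 combinations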
The easiest way is probably to spend a few moments thinking about the problem. There is a relatively nice, recursive, algorithm that lends itself neatly to either memoization or reworking into a dynamic programming solution.
This problem is a classic Dynamic Programming problem. You can read about it here
http://www.algorithmist.com/index.php/Coin_Change
The Python code is:
def count( n, m ):
    if n == 0:
        return 1
    if n < 0:
        return 0
    if m <= 0 and n >= 1:
        return 0
    return count( n, m - 1 ) + count( n - S[m], m )
Here S[m] gives the value of the denomination and S is a sorted array of denominations
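One detail worth noting: with the base case m <= 0, the code above effectively treats S as 1-indexed (S[0] is never used as a coin), so a usage sketch for the 99-cent question might look like this (the placeholder at index 0 is my assumption, not part of the answer):

S = [0, 1, 5, 10, 25]          # S[1..4] are the denominations; S[0] is unused
print(count(99, len(S) - 1))   # 213 ways to make 99 cents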
This problem looks like a Diophantine equation, i.e. for a*x + b*y + ... = n, find a solution where all letters are integers. The simplest, but not the most elegant, solution would be an iterative one (displayed in Python; note that I skip the variable l because it resembles the number 1):
dioph_combinations = list()
# range() excludes its stop value, so use 100 as the bound to let the
# remainder be taken up entirely by pennies
for i in range(0, 100, 25):
    for j in range(0, 100-i, 10):
        for k in range(0, 100-i-j, 5):
            for m in range(0, 100-i-j-k, 1):
                if i + j + k + m == 99:
                    dioph_combinations.append( (i/25, j/10, k/5, m) )
The resulting list dioph_combinations will contain the possible combinations.

Find the smallest regular number that is not less than N

Regular numbers are numbers that evenly divide powers of 60. As an example, 60^2 = 3600 = 48 × 75, so both 48 and 75 are divisors of a power of 60. Thus, they are also regular numbers.
This is an extension of rounding up to the next power of two.
I have an integer value N which may contain large prime factors and I want to round it up to a number composed of only small prime factors (2, 3 and 5)
Examples:
f(18) == 18 == 2^1 * 3^2
f(19) == 20 == 2^2 * 5^1
f(257) == 270 == 2^1 * 3^3 * 5^1
What would be an efficient way to find the smallest number satisfying this requirement?
The values involved may be large, so I would like to avoid enumerating all regular numbers starting from 1 or maintaining an array of all possible values.
One can produce arbitrarily thin a slice of the Hamming sequence around the n-th member in time ~ n^(2/3) by direct enumeration of triples (i,j,k) such that N = 2^i * 3^j * 5^k.
The algorithm works from log2(N) = i + j*log2(3) + k*log2(5); it enumerates all possible ks and, for each one, all possible js, finds the top i and thus the triple (k,j,i), and keeps it in a "band" if it is inside the given "width" below the given high logarithmic top value (when width < 1 there can be at most one such i), then sorts them by their logarithms.
WP says that n ~ (log N)^3, i.e. run time ~ (log N)^2. Here we don't care for the exact position of the found triple in the sequence, so all the count calculations from the original code can be thrown away:
slice hi w = sortBy (compare `on` fst) b where -- hi>log2(N) is a top value
lb5=logBase 2 5 ; lb3=logBase 2 3 -- w<1 (NB!) is log2(width)
b = concat -- the slice
[ [ (r,(i,j,k)) | frac < w ] -- store it, if inside width
| k <- [ 0 .. floor ( hi /lb5) ], let p = fromIntegral k*lb5,
j <- [ 0 .. floor ((hi-p)/lb3) ], let q = fromIntegral j*lb3 + p,
let (i,frac)=properFraction(hi-q) ; r = hi - frac ] -- r = i + q
-- properFraction 12.7 == (12, 0.7)
-- update: in pseudocode:
def slice(hi, w):
    lb5, lb3 = logBase(2, 5), logBase(2, 3) -- logs base 2 of 5 and 3
    for k from 0 step 1 to floor(hi/lb5) inclusive:
        p = k*lb5
        for j from 0 step 1 to floor((hi-p)/lb3) inclusive:
            q = j*lb3 + p
            i = floor(hi-q)
            frac = hi-q-i -- frac < 1 , always
            r = hi - frac -- r == i + q
            if frac < w:
                place (r,(i,j,k)) into the output array
    sort the output array's entries by their "r" component
      in ascending order, and return thus sorted array
Having enumerated the triples in the slice, it is a simple matter of sorting and searching, taking practically O(1) time (for arbitrarily thin a slice) to find the first triple above N. Well, actually, for constant width (logarithmic), the amount of numbers in the slice (members of the "upper crust" in the (i,j,k)-space below the log(N) plane) is again m ~ n^2/3 ~ (log N)^2 and sorting takes m log m time (so that searching, even linear, takes ~ m run time then). But the width can be made smaller for bigger Ns, following some empirical observations; and constant factors for the enumeration of triples are much higher than for the subsequent sorting anyway.
Even with constant width (logarithmic) it runs very fast, calculating the 1,000,000-th value in the Hamming sequence instantly and the billionth in 0.05s.
The original idea of "top band of triples" is due to Louis Klauder, as cited in my post on a DDJ blogs discussion back in 2008.
update: as noted by GordonBGood in the comments, there's no need for the whole band but rather just about one or two values above and below the target. The algorithm is easily amended to that effect. The input should also be tested for being a Hamming number itself before proceeding with the algorithm, to avoid round-off issues with double precision. There are no round-off issues comparing the logarithms of the Hamming numbers known in advance to be different (though going up to a trillionth entry in the sequence uses about 14 significant digits in logarithm values, leaving only 1-2 digits to spare, so the situation may in fact be turning iffy there; but for 1-billionth we only need 11 significant digits).
update2: turns out the Double precision for logarithms limits this to numbers below about 20,000 to 40,000 decimal digits (i.e. 10 trillionth to 100 trillionth Hamming number). If there's a real need for this for such big numbers, the algorithm can be switched back to working with the Integer values themselves instead of their logarithms, which will be slower.
Okay, hopefully third time's a charm here. A recursive, branching algorithm for an initial input of p, where N is the number being 'built' within each thread. NB 3a-c here are launched as separate threads or otherwise done (quasi-)asynchronously.
1. Calculate the next-largest power of 2 after p, call this R. N = p.
2. Is N > R? Quit this thread. Is p composed of only small prime factors? You're done. Otherwise, go to step 3.
3. After any of 3a-c, go to step 4.
   a) Round p up to the nearest multiple of 2. This number can be expressed as m * 2.
   b) Round p up to the nearest multiple of 3. This number can be expressed as m * 3.
   c) Round p up to the nearest multiple of 5. This number can be expressed as m * 5.
4. Go to step 2, with p = m.
I've omitted the bookkeeping to do regarding keeping track of N but that's fairly straightforward I take it.
Edit: Forgot 6, thanks ypercube.
Edit 2: Had this up to 30, (5, 6, 10, 15, 30) realized that was unnecessary, took that out.
Edit 3: (The last one I promise!) Added the power-of-30 check, which helps prevent this algorithm from eating up all your RAM.
Edit 4: Changed power-of-30 to power-of-2, per finnw's observation.
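Here is a short memoized Python sketch of the branching idea (my own reading of the steps above, with the per-branch threads and power-of-2 cut-off replaced by plain memoisation; the function name is mine):

from functools import lru_cache

@lru_cache(maxsize=None)
def round_up_regular(p):
    """Smallest number >= p whose only prime factors are 2, 3 and 5."""
    if p <= 1:
        return 1
    # round p up to a multiple of each small prime and recurse on the cofactor m
    return min(d * round_up_regular(-(-p // d)) for d in (2, 3, 5))

print(round_up_regular(257))  # 270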
Here's a solution in Python, based on Will Ness answer but taking some shortcuts and using pure integer math to avoid running into log space numerical accuracy errors:
import math

def next_regular(target):
    """
    Find the next regular number greater than or equal to target.
    """
    # Check if it's already a power of 2 (or a non-integer)
    try:
        if not (target & (target-1)):
            return target
    except TypeError:
        # Convert floats/decimals for further processing
        target = int(math.ceil(target))

    if target <= 6:
        return target

    match = float('inf') # Anything found will be smaller
    p5 = 1
    while p5 < target:
        p35 = p5
        while p35 < target:
            # Ceiling integer division, avoiding conversion to float
            # (quotient = ceil(target / p35))
            # From https://stackoverflow.com/a/17511341/125507
            quotient = -(-target // p35)
            # Quickly find next power of 2 >= quotient
            # See https://stackoverflow.com/a/19164783/125507
            try:
                p2 = 2**((quotient - 1).bit_length())
            except AttributeError:
                # Fallback for Python <2.7
                p2 = 2**(len(bin(quotient - 1)) - 2)

            N = p2 * p35
            if N == target:
                return N
            elif N < match:
                match = N
            p35 *= 3
            if p35 == target:
                return p35
        if p35 < match:
            match = p35
        p5 *= 5
        if p5 == target:
            return p5
    if p5 < match:
        match = p5
    return match
In English: iterate through every combination of 5s and 3s, quickly finding the next power of 2 >= target for each pair and keeping the smallest result. (It's a waste of time to iterate through every possible multiple of 2 if only one of them can be correct). It also returns early if it ever finds that the target is already a regular number, though this is not strictly necessary.
I've tested it pretty thoroughly, testing every integer from 0 to 51200000 and comparing to the list on OEIS http://oeis.org/A051037, as well as many large numbers that are ±1 from regular numbers, etc. It's now available in SciPy as fftpack.helper.next_fast_len, to find optimal sizes for FFTs (source code).
I'm not sure if the log method is faster because I couldn't get it to work reliably enough to test it. I think it has a similar number of operations, though? I'm not sure, but this is reasonably fast. It takes <3 seconds (or 0.7 seconds with gmpy) to calculate that 2^142 × 3^80 × 5^444 is the next regular number above 2^2 × 3^454 × 5^249 + 1 (the 100,000,000th regular number, which has 392 digits).
You want to find the smallest number m that is m >= N and m = 2^i * 3^j * 5^k where all i,j,k >= 0.
Taking logarithms the equations can be rewritten as:
log m >= log N
log m = i*log2 + j*log3 + k*log5
You can calculate log2, log3, log5 and logN to (sufficiently high, depending on the size of N) accuracy. Then this problem looks like an integer linear programming problem and you could try to solve it using one of the known algorithms for this NP-hard problem.
EDITED/CORRECTED: Corrected the codes to pass the scipy tests:
Here's an answer based on endolith's answer, but it almost eliminates long multi-precision integer calculations by using float64 logarithm representations to do a base comparison that finds triple values passing the criteria, resorting to full-precision comparisons only when there is a chance that the logarithm value may not be accurate enough, which only occurs when the target is very close to either the previous or the next regular number:
import math

def next_regular(target):
    """
    Find the next regular number greater than or equal to target.
    """
    if target < 2: return ( 0, 0, 0 )
    log2hi = 0
    mant = 0
    # Check if it's already a power of 2 (or a non-integer)
    try:
        mant = target & (target - 1)
        target = int(target) # take care of case where not int/float/decimal
    except TypeError:
        # Convert floats/decimals for further processing
        target = int(math.ceil(target))
        mant = target & (target - 1)
    # Quickly find next power of 2 >= target
    # See https://stackoverflow.com/a/19164783/125507
    try:
        log2hi = target.bit_length()
    except AttributeError:
        # Fallback for Python <2.7
        log2hi = len(bin(target)) - 2
    # exit if this is a power of two already...
    if not mant: return ( log2hi - 1, 0, 0 )
    # take care of trivial cases...
    if target < 9:
        if target < 4: return ( 0, 1, 0 )
        elif target < 6: return ( 0, 0, 1 )
        elif target < 7: return ( 1, 1, 0 )
        else: return ( 3, 0, 0 )
    # find log of target, which may exceed the float64 limit...
    if log2hi < 53: mant = target << (53 - log2hi)
    else: mant = target >> (log2hi - 53)
    log2target = log2hi + math.log2(float(mant) / (1 << 53))
    # log2 constants
    log2of2 = 1.0; log2of3 = math.log2(3); log2of5 = math.log2(5)
    # calculate range of log2 values close to target;
    # desired number has a logarithm of log2target <= x <= top...
    fctr = 6 * log2of3 * log2of5
    top = (log2target**3 + 2 * fctr)**(1/3) # for up to 2 numbers higher
    btm = 2 * log2target - top # or up to 2 numbers lower
    match = log2hi # Anything found will be smaller
    result = ( log2hi, 0, 0 ) # placeholder for eventual matches
    count = 0 # only used for debugging counting band
    fives = 0; fiveslmt = int(math.ceil(top / log2of5))
    while fives < fiveslmt:
        log2p = top - fives * log2of5
        threes = 0; threeslmt = int(math.ceil(log2p / log2of3))
        while threes < threeslmt:
            log2q = log2p - threes * log2of3
            twos = int(math.floor(log2q)); log2this = top - log2q + twos
            if log2this >= btm: count += 1 # only used for counting band
            if log2this >= btm and log2this < match:
                # logarithm precision may not be enough to differentiate between
                # the next lower regular number and the target, so do
                # a full resolution comparison to eliminate this case...
                if (2**twos * 3**threes * 5**fives) >= target:
                    match = log2this; result = ( twos, threes, fives )
            threes += 1
        fives += 1
    return result

print(next_regular(2**2 * 3**454 * 5**249 + 1)) # prints (142, 80, 444)
Since most long multi-precision calculations have been eliminated, gmpy isn't needed, and on IDEOne the above code takes 0.11 seconds instead of 0.48 seconds for endolith's solution to find the next regular number greater than the 100 millionth one as shown; it takes 0.49 seconds instead of 5.48 seconds to find the next regular number past the billionth (next one is (761,572,489) past (1334,335,404) + 1), and the difference will get even larger as the range goes up as the multi-precision calculations get increasingly longer for the endolith version compared to almost none here. Thus, this version could calculate the next regular number from the trillionth in the sequence in about 50 seconds on IDEOne, where it would likely take over an hour with the endolith version.
The English description of the algorithm is almost the same as for the endolith version, differing as follows:
1) calculates the float log estimation of the argument target value (we can't use the built-in log function directly as the range may be much too large for representation as a 64-bit float),
2) compares the log representation values in determining qualifying values inside an estimated range above and below the target value of only about two or three numbers (depending on round-off),
3) compare multi-precision values only if within the above defined narrow band,
4) outputs the triple indices rather than the full long multi-precision integer (would be about 840 decimal digits for the one past the billionth, ten times that for the trillionth), which can then easily be converted to the long multi-precision value if required.
This algorithm uses almost no memory other than for the potentially very large multi-precision integer target value, the intermediate evaluation comparison values of about the same size, and the output expansion of the triples if required. This algorithm is an improvement over the endolith version in that it successfully uses the logarithm values for most comparisons in spite of their lack of precision, and that it narrows the band of compared numbers to just a few.
This algorithm will work for argument ranges somewhat above ten trillion (a few minutes' calculation time at IDEOne rates), beyond which it will no longer be correct due to lack of precision in the log representation values, as per @WillNess's discussion; in order to fix this, we can change the log representation to a "roll-your-own" logarithm representation consisting of a fixed-length integer (124 bits for about double the exponent range, good for targets of over a hundred thousand digits if one is willing to wait); this will be a little slower due to the smallish multi-precision integer operations being slower than float64 operations, but not that much slower since the size is limited (maybe a factor of three or so slower).
Now none of these Python implementations (without using C or Cython or PyPy or something) are particularly fast, as they are about a hundred times slower than when implemented in a compiled language. For reference's sake, here is a Haskell version:
{-# OPTIONS_GHC -O3 #-}
import Data.Word
import Data.Bits
nextRegular :: Integer -> ( Word32, Word32, Word32 )
nextRegular target
| target < 2 = ( 0, 0, 0 )
| target .&. (target - 1) == 0 = ( fromIntegral lg2hi - 1, 0, 0 )
| target < 9 = case target of
3 -> ( 0, 1, 0 )
5 -> ( 0, 0, 1 )
6 -> ( 1, 1, 0 )
_ -> ( 3, 0, 0 )
| otherwise = match
where
lg3 = logBase 2 3 :: Double; lg5 = logBase 2 5 :: Double
lg2hi = let cntplcs v cnt =
let nv = v `shiftR` 31 in
if nv <= 0 then
let cntbts x c =
if x <= 0 then c else
case c + 1 of
nc -> nc `seq` cntbts (x `shiftR` 1) nc in
cntbts (fromIntegral v :: Word32) cnt
else case cnt + 31 of ncnt -> ncnt `seq` cntplcs nv ncnt
in cntplcs target 0
lg2tgt = let mant = if lg2hi <= 53 then target `shiftL` (53 - lg2hi)
else target `shiftR` (lg2hi - 53)
in fromIntegral lg2hi +
logBase 2 (fromIntegral mant / 2^53 :: Double)
lg2top = (lg2tgt^3 + 2 * 6 * lg3 * lg5)**(1/3) -- for 2 numbers or so higher
lg2btm = 2* lg2tgt - lg2top -- or two numbers or so lower
match =
let klmt = floor (lg2top / lg5)
loopk k mtchlgk mtchtplk =
if k > klmt then mtchtplk else
let p = lg2top - fromIntegral k * lg5
jlmt = fromIntegral $ floor (p / lg3)
loopj j mtchlgj mtchtplj =
if j > jlmt then loopk (k + 1) mtchlgj mtchtplj else
let q = p - fromIntegral j * lg3
( i, frac ) = properFraction q; r = lg2top - frac
( nmtchlg, nmtchtpl ) =
if r < lg2btm || r >= mtchlgj then
( mtchlgj, mtchtplj ) else
if 2^i * 3^j * 5^k >= target then
( r, ( i, j, k ) ) else ( mtchlgj, mtchtplj )
in nmtchlg `seq` nmtchtpl `seq` loopj (j + 1) nmtchlg nmtchtpl
in loopj 0 mtchlgk mtchtplk
in loopk 0 (fromIntegral lg2hi) ( fromIntegral lg2hi, 0, 0 )
trival :: ( Word32, Word32, Word32 ) -> Integer
trival (i,j,k) = 2^i * 3^j * 5^k
main = putStrLn $ show $ nextRegular $ (trival (1334,335,404)) + 1 -- (1126,16930,40)
This code calculates the next regular number following the billionth in too small a time to be measured and following the trillionth in 0.69 seconds on IDEOne (and potentially could run even faster except that IDEOne doesn't support LLVM). Even Julia will run at something like this Haskell speed after the "warm-up" for JIT compilation.
EDIT_ADD: The Julia code is as per the following:
function nextregular(target :: BigInt) :: Tuple{ UInt32, UInt32, UInt32 }
    # trivial case of first value or anything less...
    target < 2 && return ( 0, 0, 0 )
    # Check if it's already a power of 2 (or a non-integer)
    mant = target & (target - 1)
    # Quickly find next power of 2 >= target
    log2hi :: UInt32 = 0
    test = target
    while true
        next = test & 0x7FFFFFFF
        test >>>= 31; log2hi += 31
        test <= 0 && (log2hi -= leading_zeros(UInt32(next)) - 1; break)
    end
    # exit if this is a power of two already...
    mant == 0 && return ( log2hi - 1, 0, 0 )
    # take care of trivial cases...
    if target < 9
        target < 4 && return ( 0, 1, 0 )
        target < 6 && return ( 0, 0, 1 )
        target < 7 && return ( 1, 1, 0 )
        return ( 3, 0, 0 )
    end
    # find log of target, which may exceed the Float64 limit...
    if log2hi < 53 mant = target << (53 - log2hi)
    else mant = target >>> (log2hi - 53) end
    log2target = log2hi + log(2, Float64(mant) / (1 << 53))
    # log2 constants
    log2of2 = 1.0; log2of3 = log(2, 3); log2of5 = log(2, 5)
    # calculate range of log2 values close to target;
    # desired number has a logarithm of log2target <= x <= top...
    fctr = 6 * log2of3 * log2of5
    top = (log2target^3 + 2 * fctr)^(1/3) # for 2 numbers or so higher
    btm = 2 * log2target - top # or 2 numbers or so lower
    # scan for values in the given narrow range that satisfy the criteria...
    match = log2hi # Anything found will be smaller
    result :: Tuple{UInt32,UInt32,UInt32} = ( log2hi, 0, 0 ) # placeholder for eventual matches
    fives :: UInt32 = 0; fiveslmt = UInt32(ceil(top / log2of5))
    while fives < fiveslmt
        log2p = top - fives * log2of5
        threes :: UInt32 = 0; threeslmt = UInt32(ceil(log2p / log2of3))
        while threes < threeslmt
            log2q = log2p - threes * log2of3
            twos = UInt32(floor(log2q)); log2this = top - log2q + twos
            if log2this >= btm && log2this < match
                # logarithm precision may not be enough to differentiate between
                # the next lower regular number and the target, so do
                # a full resolution comparison to eliminate this case...
                if (big(2)^twos * big(3)^threes * big(5)^fives) >= target
                    match = log2this; result = ( twos, threes, fives )
                end
            end
            threes += 1
        end
        fives += 1
    end
    result
end
Here's another possibility I just thought of:
If N is X bits long, then the smallest regular number R ≥ N will be in the range
[2^(X-1), 2^X]
e.g. if N = 257 (binary 100000001) then we know R is 1xxxxxxxx, unless R is exactly equal to the next power of 2 (512)
To generate all the regular numbers in this range, we can generate the odd regular numbers (i.e. multiples of powers of 3 and 5) first, then take each value and multiply by 2 (by bit-shifting) as many times as necessary to bring it into this range.
In Python:
from itertools import ifilter, takewhile
from Queue import PriorityQueue

def nextPowerOf2(n):
    p = max(1, n)
    while p != (p & -p):
        p += p & -p
    return p

# Generate multiples of powers of 3, 5
def oddRegulars():
    q = PriorityQueue()
    q.put(1)
    prev = None
    while not q.empty():
        n = q.get()
        if n != prev:
            prev = n
            yield n
            if n % 3 == 0:
                q.put(n // 3 * 5)
            q.put(n * 3)

# Generate regular numbers with the same number of bits as n
def regularsCloseTo(n):
    p = nextPowerOf2(n)
    numBits = len(bin(n))
    for i in takewhile(lambda x: x <= p, oddRegulars()):
        yield i << max(0, numBits - len(bin(i)))

def nextRegular(n):
    bigEnough = ifilter(lambda x: x >= n, regularsCloseTo(n))
    return min(bigEnough)
You know what? I'll put money on the proposition that actually, the 'dumb' algorithm is fastest. This is based on the observation that the next regular number does not, in general, seem to be much larger than the given input. So simply start counting up, and after each increment, refactor and see if you've found a regular number. But create one processing thread for each available core you have, and for N cores have each thread examine every Nth number. When each thread has found a number or crossed the power-of-2 threshold, compare the results (keep a running best number) and there you are.
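For what it's worth, a single-threaded Python sketch of that "count up and refactor" idea (the per-core splitting of candidates described above is left out, and the function name is mine):

def next_regular_bruteforce(n):
    m = n
    while True:
        k = m
        for p in (2, 3, 5):      # strip out all small prime factors
            while k % p == 0:
                k //= p
        if k == 1:               # nothing but 2s, 3s and 5s left: m is regular
            return m
        m += 1

print(next_regular_bruteforce(257))  # 270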
I wrote a small C# program to solve this problem. It's not very optimised, but it's a start.
This solution is pretty fast for numbers as big as 11 digits.
private long GetRegularNumber(long n)
{
    long result = n - 1;
    long quotient = result;
    while (quotient > 1)
    {
        result++;
        quotient = result;
        quotient = RemoveFactor(quotient, 2);
        quotient = RemoveFactor(quotient, 3);
        quotient = RemoveFactor(quotient, 5);
    }
    return result;
}

private static long RemoveFactor(long dividend, long divisor)
{
    long remainder = 0;
    long quotient = dividend;
    while (remainder == 0)
    {
        dividend = quotient;
        quotient = Math.DivRem(dividend, divisor, out remainder);
    }
    return dividend;
}

Find the minimum number of operations required to compute a number using a specified range of numbers

Let me start with an example -
I have a range of numbers from 1 to 9. And let's say the target number that I want is 29.
In this case the minimum number of operations required would be 2: (9*3)+2 = 29. Similarly, for 18 the minimum number of operations is 1 (9*2=18).
I can use any of the 4 arithmetic operators - +, -, / and *.
How can I programmatically find out the minimum number of operations required?
Thanks in advance for any help provided.
clarification: integers only, no decimals allowed mid-calculation. i.e. the following is not valid (from comments below): ((9/2) + 1) * 4 == 22
I must admit I didn't think about this thoroughly, but for my purpose it doesn't matter if decimal numbers appear mid-calculation. ((9/2) + 1) * 4 == 22 is valid. Sorry for the confusion.
For the special case where set Y = [1..9] and n > 0:
n <= 9 : 0 operations
n <=18 : 1 operation (+)
otherwise : Remove any divisor found in Y. If this is not enough, do a recursion on the remainder for all offsets -9 .. +9. Offset 0 can be skipped as it has already been tried.
Notice how division is not needed in this case. For other Y this does not hold.
This algorithm is exponential in log(n). The exact analysis is a job for somebody with more knowledge about algebra than I.
For more speed, add pruning to eliminate some of the search for larger numbers.
Sample code:
def findop(n, maxlen=9999):
    # Return a short postfix list of numbers and operations

    # Simple solution to small numbers
    if n<=9: return [n]
    if n<=18: return [9,n-9,'+']

    # Find direct multiply
    x = divlist(n)
    if len(x) > 1:
        mults = len(x)-1
        x[-1:] = findop(x[-1], maxlen-2*mults)
        x.extend(['*'] * mults)
        return x

    shortest = 0
    for o in range(1,10) + range(-1,-10,-1):
        x = divlist(n-o)
        if len(x) == 1: continue
        mults = len(x)-1
        # We spent len(divlist) + mults + 2 fields for offset.
        # The last number is expanded by the recursion, so it doesn't count.
        recursion_maxlen = maxlen - len(x) - mults - 2 + 1
        if recursion_maxlen < 1: continue
        x[-1:] = findop(x[-1], recursion_maxlen)
        x.extend(['*'] * mults)
        if o > 0:
            x.extend([o, '+'])
        else:
            x.extend([-o, '-'])
        if shortest == 0 or len(x) < shortest:
            shortest = len(x)
            maxlen = shortest - 1
            solution = x[:]

    if shortest == 0:
        # Fake solution, it will be discarded
        return '#' * (maxlen+1)

    return solution

def divlist(n):
    l = []
    for d in range(9,1,-1):
        while n%d == 0:
            l.append(d)
            n = n/d
    if n>1: l.append(n)
    return l
The basic idea is to test all possibilities with k operations, for k starting from 0. Imagine you create a tree of height k that branches for every possible new operation with operand (4*9 branches per level). You need to traverse and evaluate the leaves of the tree for each k before moving to the next k.
I didn't test this pseudo-code:
for every k from 0 to infinity
    for every n from 1 to 9
        if compute(n,0,k):
            return k

boolean compute(n,j,k):
    if (j == k):
        return (n == target)
    else:
        for each operator in {+,-,*,/}:
            for every i from 1 to 9:
                if compute((n operator i),j+1,k):
                    return true
        return false
It doesn't take into account arithmetic operator precedence and braces; that would require some rework.
Really cool question :)
Notice that you can start from the end! From your example (9*3)+2 = 29 is equivalent to saying (29-2)/3=9. That way we can avoid the double loop in cyborg's answer. This suggests the following algorithm for set Y and result r:
nextleaves = {r}
nops = 0
while(true):
    nops = nops+1
    leaves = nextleaves
    nextleaves = {}
    for leaf in leaves:
        for y in Y:
            if (leaf+y) or (leaf-y) or (leaf*y) or (leaf/y) is in Y:
                return(nops)
            else:
                add (leaf+y) and (leaf-y) and (leaf*y) and (leaf/y) to nextleaves
This is the basic idea; performance can certainly be improved, for instance by avoiding "backtracks" such as r+a-a or r*a*b/a.
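A small Python sketch of this backward search (breadth-first, so the first hit gives the minimum number of operations; the integer-only restriction and the crude pruning bound are my assumptions, not part of the answer):

from collections import deque

def min_ops(target, Y=range(1, 10)):
    if target in Y:
        return 0
    seen = {target}
    frontier = deque([(target, 0)])
    while frontier:
        value, ops = frontier.popleft()
        for y in Y:
            # undo a forward +, -, * or / by y (division only when it is exact)
            candidates = [value - y, value + y, value * y]
            if value % y == 0:
                candidates.append(value // y)
            for c in candidates:
                if c in Y:
                    return ops + 1
                if 0 < c <= 10 * target and c not in seen:  # crude pruning bound
                    seen.add(c)
                    frontier.append((c, ops + 1))
    return None

print(min_ops(29))  # 2, e.g. (9*3)+2
print(min_ops(18))  # 1, e.g. 9*2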
I guess my idea is similar to the one from Peer Sommerlund:
For big numbers, you advance fast by multiplying by big digits.
Is Y=29 prime? If not, divide it by the largest divisor from 2 to 9.
Else you could subtract a number to reach a divisible number. 27 is fine, since it is divisible by 9, so
(29-2)/9=3 =>
3*9+2 = 29
So maybe - I didn't think this through to the end: search for the next number below Y that is divisible by 9. If you don't reach a single-digit number, repeat.
The formula is the steps reversed.
(I'll try it for some numbers. :) )
I tried with 2551, which is
echo $((((3*9+4)*9+4)*9+4))
But I didn't test whether every intermediate result is prime.
But
echo $((8*8*8*5-9))
is 2 operations less. Maybe I can investigate this later.

Why is my implementation of the Atkin sieve slower than Eratosthenes? [closed]

I'm doing problems from Project Euler in Ruby and implemented Atkin's sieve for finding prime numbers, but it runs slower than the sieve of Eratosthenes. What is the problem?
def atkin_sieve(n)
  primes = [2,3,5]
  sieve = Array.new(n+1, false)

  y_upper = n-4 > 0 ? Math.sqrt(n-4).truncate : 1
  for x in (1..Math.sqrt(n/4).truncate)
    for y in (1..y_upper)
      k = 4*x**2 + y**2
      sieve[k] = !sieve[k] if k%12 == 1 or k%12 == 5
    end
  end

  y_upper = n-3 > 0 ? Math.sqrt(n-3).truncate : 1
  for x in (1..Math.sqrt(n/3).truncate)
    for y in (1..y_upper)
      k = 3*x**2 + y**2
      sieve[k] = !sieve[k] if k%12 == 7
    end
  end

  for x in (1..Math.sqrt(n).truncate)
    for y in (1..x)
      k = 3*x**2 - y**2
      if k < n and k%12 == 11
        sieve[k] = !sieve[k]
      end
    end
  end

  for j in (5...n)
    if sieve[j]
      prime = true
      for i in (0...primes.length)
        if j % (primes[i]**2) == 0
          prime = false
          break
        end
      end
      primes << j if prime
    end
  end
  primes
end

def erato_sieve(n)
  primes = []
  for i in (2..n)
    if primes.all?{|x| i % x != 0}
      primes << i
    end
  end
  primes
end
As Wikipedia says, "The modern sieve of Atkin is more complicated, but faster when properly optimized" (my emphasis).
The first obvious place to save some time in the first set of loops would be to stop iterating over y when 4*x**2 + y**2 is greater than n. For example, if n is 1,000,000 and x is 450, then you should stop iterating when y is greater than 435 (instead of continuing to 999 as you do at the moment). So you could rewrite the first loop as:
for x in (1..Math.sqrt(n/4).truncate)
  X = 4 * x ** 2
  for y in (1..Math.sqrt(n - X).truncate)
    k = X + y ** 2
    sieve[k] = !sieve[k] if k%12 == 1 or k%12 == 5
  end
end
(This also avoids re-computing 4*x**2 each time round the loop, though that is probably a very small improvement, if any.)
Similar remarks apply, of course, to the other loops over y.
A second place where you could speed things up is in the strategy for looping over y. You loop over all values of y in the range, and then check to see which ones lead to values of k with the correct remainders modulo 12. Instead, you could just loop over the right values of y only, and avoid testing the remainders altogether.
If 4*x**2 is 4 modulo 12, then y**2 must be 1 or 9 modulo 12, and so y must be 1, 3, 5, 7, 9, or 11 modulo 12 (that is, y must be odd). If 4*x**2 is 8 modulo 12, then y**2 must be 5 or 9 modulo 12, so y must be 3 or 9 modulo 12. And finally, if 4*x**2 is 0 modulo 12, then y**2 must be 1 or 5 modulo 12, so y must be 1, 5, 7, or 11 modulo 12.
I also note that your sieve of Eratosthenes is doing useless work by testing divisibility by all primes below i. You can halt the iteration once you've tested divisibility by all primes less than or equal to the square root of i.
It would help a lot if you actually implemented the Sieve of Eratosthenes properly in the first place.
The critical feature of that sieve is that you only do one operation per time a prime divides a number. By contrast you are doing work for every prime less than the number. The difference is subtle, but the performance implications are huge.
Here is the actual sieve that you failed to implement:
def eratosthenes_primes(n)
  primes = []
  could_be_prime = (0..n).map{|i| true}
  could_be_prime[0] = false
  could_be_prime[1] = false
  i = 0
  while i*i <= n
    if could_be_prime[i]
      j = i*i
      while j <= n
        could_be_prime[j] = false
        j += i
      end
    end
    i += 1
  end
  return (2..n).find_all{|i| could_be_prime[i]}
end
Compare this with your code for finding all of the primes up to 50,000. Also note that this can easily be sped up by a factor of 2 by special casing the logic for even numbers. With that tweak, this algorithm should be fast enough for every Project Euler problem that needs you to compute a lot of primes.
@Gareth mentions some redundant calculations regarding 4x^2 + y^2. Both here and in other places where you have calculations within a loop, you can make use of calculations you've already performed and reduce this to simple addition.
Rather than X=4 * x ** 2, you could rely on the fact that X already has the value of 4 * (x-1) ** 2. Since 4x^2 = 4(x-1)^2 + 4(2x - 1), all you need to do is add 8 * x - 4 to X. You can use this same trick for k, and the other places where you have repeated calculations (like 3x^2 + y^2).
