Fastest algorithm to find factors of 2 and 5 - algorithm

I've read this question: Which is the fastest algorithm to find prime numbers?, but I'd like to do this only for 2 and 5 primes.
For example, the number 42000 is factorized as:
2^4 • 3^1 • 5^3 • 7^1
I'm only interested in finding the powers of 2 and 5: 4 and 3 in this example.
My naive approach is to successively divide by 2 while the remainder is 0, then successively divide by 5 while the remainder is 0.
The number of successful divisions (with zero remainder) are the powers of 2 and 5.
This involves performing (x + y + 2) divisions, where x is the power of 2 and y is the power of 5.
Is there a faster algorithm to find the powers of 2 and 5?
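For reference, the naive approach in a few lines of Python (a sketch using an ordinary integer, rather than the base-10 string the library actually stores):

def powers_of_2_and_5(n):
    # Successively divide by 2, then by 5, counting the exact divisions.
    x = 0
    while n % 2 == 0:
        n //= 2
        x += 1
    y = 0
    while n % 5 == 0:
        n //= 5
        y += 1
    return x, y

print(powers_of_2_and_5(42000))  # (4, 3)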

Following the conversation, I do think your idea is the fastest way to go, with one exception:
Division (in most cases) is expensive. On the other hand, checking the last digit of the number is (usually?) faster, so I would check the last digit (0/5 and 0/2/4/6/8) before dividing.

I am basing this off this comment by the OP:
my library is written in PHP and the number is actually stored as a string in base 10. That's not the most efficient indeed, but this is what worked best within the technical limits of the language.
If you are committed to strings-in-php, then the following pseudo-code will speed things up compared to actual general-purpose repeated modulus and division:
while the string ends in 0, but is not 0:
    chop a zero off the end
    increment ctr2 and ctr5
switch repeatedly on the last digit:
    if it is a 5:
        divide the string by 5
        increment ctr5
    if it is a 2, 4, 6 or 8:
        divide the string by 2
        increment ctr2
    otherwise:
        you have finished
This does not require any modulus operations, and you can implement divide-by-5 and divide-by-2 cheaper than a general-purpose long-number division.
On the other hand, if you want performance, using string representations for unlimited-size integers is suicide. Use gmp (which has a php library) for your math, and convert to strings only when necessary.
edit:
you can gain extra efficiency (and simplify your operations) using the following pseudocode:
if the string is zero, terminate early
while the last non-zero character of the string is a '5',
add the string to itself
decrement ctr2
count the '0's at the end of the string into a ctr0
chop off ctr0 zeros from the string
ctr2 += ctr0
ctr5 += ctr0
while the last digit is 2, 4, 6, 8
divide the string by 2
increment ctr2
Chopping many 0s at once is better than looping one digit at a time. And mul2 beats div5 in terms of speed (it can be implemented by adding the number to itself once).
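A rough Python rendering of that improved pseudocode (a sketch only; double() and halve() stand in for the cheap digit-by-digit string routines you would actually write, and are faked here via int() for brevity):

def factor_2_5(s):
    # Count the powers of 2 and 5 in a decimal-string number using only
    # doubling, halving and chopping trailing zeros.
    if s.strip('0') == '':
        return 0, 0                      # the number is zero: terminate early

    def double(t):                       # "add the string to itself"
        return str(int(t) * 2)           # placeholder for schoolbook addition

    def halve(t):                        # "divide the string by 2"
        return str(int(t) // 2)          # placeholder for short division

    ctr2 = ctr5 = 0
    # While the last non-zero digit is a 5, doubling turns it into a 0;
    # each doubling sneaks in an extra factor of 2, so pre-decrement ctr2.
    while s.rstrip('0')[-1] == '5':
        s = double(s)
        ctr2 -= 1
    # Every trailing zero is one factor of 2 and one factor of 5.
    stripped = s.rstrip('0')
    zeros = len(s) - len(stripped)
    ctr2 += zeros
    ctr5 += zeros
    s = stripped
    # Remaining factors of 2 show up as an even last digit.
    while s[-1] in '2468':
        s = halve(s)
        ctr2 += 1
    return ctr2, ctr5

print(factor_2_5("42000"))   # (4, 3), since 42000 = 2^4 * 3^1 * 5^3 * 7^1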

If you have a billion-digit number, you do not want to do divisions on it unless it's really necessary. If you have no reason to believe that it is one of the roughly 1-in-2^1000 numbers divisible by 2^1000, then it makes sense to use much faster tests that only look at the last few digits. You can tell whether a number is divisible by 2 by looking at the last digit, whether it is divisible by 4 by looking at the last 2 digits, and whether it is divisible by 2^n by looking at the last n digits. Similarly, you can tell whether a number is divisible by 5 by looking at the last digit, whether it is divisible by 25 by looking at the last 2 digits, and whether it is divisible by 5^n by looking at the last n digits.
I suggest that you first count and remove the trailing 0s, then decide from the last digit whether you are testing for powers of 2 (last digit 2,4,6, or 8) or powers of 5 (last digit 5).
If you are testing for powers of 2, then take the last 2, 4, 8, 16, ..., 2^i digits, and multiply them by 25, 625, ..., 5^(2^i), counting the trailing 0s up to 2^i (but not beyond). If you get fewer than 2^i trailing 0s, then stop.
If you are testing for powers of 5, then take the last 2, 4, 8, 16, ..., 2^i digits, and multiply them by 4, 16, ..., 2^(2^i), counting the trailing 0s up to 2^i (but not beyond). If you get fewer than 2^i trailing 0s, then stop.
For example, suppose the number you are analyzing is 283,795,456. Multiply 56 by 25, you get 1400 which has 2 trailing 0s, continue. Multiply 5,456 by 625, you get 3,410,000, which has 4 trailing 0s, continue. Multiply 83,795,456 by 5^8=390,625, you get 32,732,600,000,000, which has 8 trailing 0s, continue. Multiply 283,795,456 by 5^16 to get 43,303,750,000,000,000,000 which has only 13 trailing 0s. That's less than 16, so stop, the power of 2 in the prime factorization is 2^13.
I hope that for larger multiplications you are implementing an n log n algorithm for multiplying n digit numbers, but even if you aren't, this technique should outperform anything involving division on typical large numbers.
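A Python sketch of that scheme (the function and names are mine; it strips trailing zeros first, as suggested above, and assumes a positive decimal string):

def powers_of_2_and_5_by_tails(n_str):
    # Powers of 2 and 5 in a positive decimal-string integer, found by
    # multiplying ever-longer tails by matching powers of 5 (or 2) and
    # counting trailing zeros.
    stripped = n_str.rstrip('0')
    zeros = len(n_str) - len(stripped)   # each trailing 0 is one 2 and one 5
    two, five = zeros, zeros

    def tail_power(s, other):
        # Multiply the last d digits by other**d; if the product has fewer
        # than d trailing zeros, that count is the answer, else double d.
        d = 2
        while True:
            prod = str(int(s[-d:]) * other ** d)
            t = len(prod) - len(prod.rstrip('0'))
            if t < d:
                return t
            d *= 2

    last = stripped[-1]
    if last in '2468':
        two += tail_power(stripped, 5)   # only 2s remain to pair with the 5s
    elif last == '5':
        five += tail_power(stripped, 2)  # only 5s remain to pair with the 2s
    return two, five

print(powers_of_2_and_5_by_tails("283795456000"))   # (16, 3): 2^13 from the example above, plus 3 trailing zeros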
Let's look at the average-case time complexity of various algorithms, assuming that each n-digit number is equally likely.
Addition or subtraction of two n-digit numbers takes theta(n) steps.
Dividing an n-digit number by a small number like 5 takes theta(n) steps. Dividing by the base is O(1).
Dividing an n-digit number by another large number takes theta(n log n) steps using the FFT, or theta(n^2) by a naive algorithm. The same is true for multiplication.
The algorithm of repeatedly dividing a base 10 number by 2 has an average case time complexity of theta(n): It takes theta(n) time for the first division, and on average, you need to do only O(1) divisions.
Computing a large power of 2 with at least n digits takes theta(n log n) by repeated squaring, or theta(n^2) with simple multiplication. Performing Euclid's algorithm to compute the GCD takes an average of theta(n) steps. Although divisions take theta(n log n) time, most of the steps can be done as repeated subtractions and it takes only theta(n) time to do those. It takes O(n^2 log log n) to perform Euclid's algorithm this way. Other improvements might bring this down to theta(n^2).
Checking the last digit for divisibility by 2 or 5 before performing a more expensive calculation is good, but it only results in a constant factor improvement. Applying the original algorithm after this still takes theta(n) steps on average.
Checking the last d digits for divisibility by 2^d or 5^d takes O(d^2) time, O(d log d) with the FFT. It is very likely that we only need to do this when d is small. The fraction of n-digit numbers divisible by 2^d is 1/2^d. So, the average time spent on these checks is O(sum(d^2 / 2^d)) and that sum is bounded independent of n, so it takes theta(1) time on average. When you use the last digits to check for divisibility, you usually don't have to do any operations on close to n digits.

Depends on whether you're starting with a native binary number or some bigint string.
Chopping off a very long chain of trailing zeros in a bigint string is a lot easier than trying to extract the powers of 2 and 5 separately - e.g. 23456789 x 10^66:
23456789000000000000000000000000000000000000000000000000000000000000000000
This particular integer is, on the surface, 244 bits in total, requiring a 177-bit-wide mantissa (178-bit precision minus 1 implicit bit) to handle it losslessly, so even newer data types such as uint128 won't suffice:
11010100011010101100101010010000110000101000100001000110100101
01011011111101001110100110111100001001010000110111110101101101
01001000011001110110010011010100001001101000010000110100000000
0000000000000000000000000000000000000000000000000000000000
The sequential approach is to spend 132 loop cycles in a bigint package to get them out; the last few iterations look like this:
129  63  2932098625
130  64  586419725
131  65  117283945
132  66  23456789
leaving the factorization 2^66 x 5^66 x 23456789.
But once you quickly realize there's a chain of 66 trailing zeros, the bigint package becomes fully optional, since the residual value is less than 24.5 bits in total width:
2^66 x 5^66 x 23456789
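In Python, the whole shortcut is a couple of string operations (a trivial sketch of the idea):

s = "23456789" + "0" * 66            # the example value, 23456789 x 10^66
stripped = s.rstrip('0')
zeros = len(s) - len(stripped)       # 66 trailing zeros = 66 twos and 66 fives
residue = int(stripped)              # 23456789, small enough for a machine word
print("2^%d x 5^%d x %d" % (zeros, zeros, residue))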

I think your algorithm will be the fastest. But I have a couple of suggestions.
One alternative is based on the greatest common divisor. Take the gcd of your input number with the smallest power of 2 greater than your input number; that will give you all the factors of 2. Divide by the gcd, then repeat the same operation with 5; that will give you all the factors of 5. Divide again by the gcd, and the remainder tells you if there are any other factors.
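A quick Python sketch of the gcd idea (names are mine; assumes the input is a positive integer):

import math

def powers_via_gcd(n):
    # The gcd with the smallest power of 2 above n captures every factor of 2.
    g2 = math.gcd(n, 1 << n.bit_length())
    two = g2.bit_length() - 1            # g2 is itself a power of 2
    n //= g2
    # Same trick with 5: the smallest power of 5 above n.
    p5 = 5
    while p5 <= n:
        p5 *= 5
    g5 = math.gcd(n, p5)
    n //= g5
    five = 0
    while g5 > 1:                        # g5 is a power of 5; count its factors
        g5 //= 5
        five += 1
    return two, five, n                  # n is now coprime to 10

print(powers_via_gcd(42000))   # (4, 3, 21)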
Another alternative is based on binary search. Split the binary representation of your input number in half; if the right half is 0, move left, otherwise move right. Once you have the factors of 2, divide, then apply the same algorithm on the remainder using powers of 5.
I'll leave it to you to implement and time these algorithms. But my gut feeling is that repeated division will be hard to beat.
I just read your comment that your input number is stored in base 10. In that case, divide repeatedly by 10 as long as the remainder is 0; that gives factors of both 2 and 5. Then apply your algorithm on the reduced number.

Related

How to implement large numbers?

So I tried to make a program that is about Multiplication Consistency. However, that required me to use large numbers. I decided to implement my own structure for handling them, as I thought it would be better to use something customized for the program. But I managed to design two ways to handle big numbers, and I do not know which is better for this case.
The program does a lot of divisions and number generation, without too much printing.
Number generation is done by taking a number that is a product of powers of 2, 3, 5 and 7 (2^a * 3^b * 5^c * 7^d) and generating all numbers whose digit product equals that number, excluding numbers containing 1s, because such digits could be inserted infinitely. Generation is handled recursively, so I do not generate new numbers from scratch.
Once a number is generated, it is tested to see by which powers of 2, 3, 5 and 7 it is divisible; this is where the program does the divisions, and where most of its work is done.
So, I came up with two ways to handle numbers: a straightforward string of single digits and the structure n * INT + k (n times the max integer value plus a smaller number; INT does not have to be explicitly int32).
Which would be faster?
Digits array:
Each digit is a number, so working with the string is working with numbers
Printing is as simple as printing an array
Division can be done with integers up to 9, or with other strings
The more digits, the longer the division process
Numbers are easy to build
Practically no limit to the length of the number
Testing if the number is divisible by some numbers (3, 9, 2^n and 5^n) can be sped up by looking at the digits, but that can become complex to implement
Division with this solution is done by scanning the array.
n * INT + k
The structure is about values
Printing requires turning the values into a digit string
Division can be done with integers up to the maximum value, and each division makes the number smaller for further divisions
The larger the values, the slower the process
Numbers require a special algorithm to build
Maximum value is INT * INT + INT, so going around that limit would require further development
No digits, so testing for divisibility is done by dividing
Division of this structure is recursive: the result of dividing by d is (n / d) * INT + k / d with a remainder of (n % d) * INT + k % d. The remainder is a number in its own right, so it is divided further and the resulting quotient is added to the first (see the sketch below).
I must note that I have not implemented dividing big numbers by big numbers, so big numbers get divided only by integers.
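For what it's worth, here is a small Python sketch of dividing the n * INT + k form by a small integer, along the lines described above (B and the names are illustrative, not your exact structure):

B = 2**32    # the "INT" radix

def div_small(hi, lo, d):
    # Divide the two-limb value hi*B + lo by a small positive d.
    # Returns ((q_hi, q_lo), r) with hi*B + lo == (q_hi*B + q_lo)*d + r.
    q_hi, r_hi = divmod(hi, d)           # divide each limb separately
    q_lo, r_lo = divmod(lo, d)
    # The leftover r_hi*B + r_lo fits in a double-width integer (r_hi < d),
    # so it can be divided directly instead of recursing.
    extra_q, r = divmod(r_hi * B + r_lo, d)
    q_lo += extra_q                      # fold that quotient back in
    q_hi, q_lo = q_hi + q_lo // B, q_lo % B   # carry if the low limb overflowed
    return (q_hi, q_lo), r

print(div_small(42000, 123, 7))   # ((6000, 17), 4)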
While the array of digits requires a lot of scanning while dividing, the other option can be slow at first, but the number can be tested for division with much larger numbers, which can quickly lower the value, and such divisions are faster on their own.
How are the two solutions compared?

Stuck on complexity analysis of a tricky program

Really stuck on the complexity analysis of this problem.
Given the digits 0–9, we need to find all the numbers of max length k whose digits are in non-decreasing order.
For example, if k = 3, the numbers can be 0, 00, 000, 01, 02, 03, 04, ..., 1, 11, 111, 12, ...
So the question is basically: if repetitions of digits are allowed, how many such combinations are possible, i.e. how many numbers of at most k digits are there whose digits, read from left to right, are in non-decreasing order?
Numbers with at most k digits that are weakly increasing are in 1-1 correspondence with binary strings of length k+10 with exactly ten 1s. For each digit d (0 through 9), the number of consecutive 0s just before the (d+1)-th 1 in the binary string is the number of times d appears in the original number. For example, if k=7, then 001119 maps to 00100011111111010 (2 zeros, 3 ones, 0 twos, 0 threes, ..., 0 eights, 1 nine, and 1 digit left over to pad the length up to 7).
These binary strings are easy to count: there's choose(k+10, 10)-1 of them (missing one because the empty number is disallowed). This can be computed in O(1) arithmetic operations (actually 10 additions, 18 multiplications and one division).
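A quick brute-force check of the formula for small k (a sketch; each weakly increasing number corresponds to exactly one non-empty multiset of digits):

from math import comb
from itertools import combinations_with_replacement

def brute(k):
    # Count non-decreasing digit strings of every length from 1 to k.
    return sum(1 for length in range(1, k + 1)
                 for _ in combinations_with_replacement('0123456789', length))

for k in range(1, 7):
    assert brute(k) == comb(k + 10, 10) - 1   # e.g. k=2: 65 = 10 + 55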
I don't have enough reputation either, so I cannot comment on Paul's or Globe's answers.
Globe's answer choose(k+9,9) is not perfect, because it only counts the solutions where the numbers have exactly k digits. But the original problem allows numbers with fewer digits too.
Paul's answer choose(k+10,10) counts these shorter numbers too, but it also allows numbers with zero digits. Let's say k=7 then the following binary string describes a number with no digits: 11111111110000000. We have to exclude this one.
So the solution is: choose(k+10,10)-1
I don't have enough reputation to comment on Paul's answer, so I'm adding another answer. The formula isn't choose(k+10, 10) as specified by Paul, it's choose(k+9, 9).
For instance if we have k=2, choose(2+10, 10) gives us 66, when there are only 55 numbers that satisfy the property.
We pick stars and separators, where the separators divide our digits into buckets from 0 to 9, and stars tell us how many digits to pick from a bucket. (E.g. **|**||*||||||* corresponding to 001139)
The reasoning behind it being k+9 and not k+10 is as follows:
we have to pick 9 separators between 10 digits, so while we have k choices for the stars, we only have 9 choices for the separators.

What makes this prime factorization so efficient?

I've been doing some Project Euler problems to learn/practice Lua, and my initial quick-and-dirty way of finding the largest prime factor of n was pretty bad, so I looked up some code to see how others were doing it (in an attempt to understand different factoring methodologies).
I ran across the following (originally in Python - this is my Lua):
function Main()
    local n = 102
    local i = 2
    while i^2 < n do
        while n%i==0 do n = n / i end
        i = i+1
    end
    print(n)
end
This factored huge numbers in a very short time - almost immediately. The thing I noticed about the algorithm that I wouldn't have divined:
n = n / i
This seems to be in all of the decent algorithms. I've worked it out on paper with smaller numbers and I can see that it makes the numbers converge, but I don't understand why this operation converges on the largest prime factor.
Can anyone explain?
In this case, i is the prime factor candidate. Consider, n is composed of the following prime numbers:
n = p1^n1 * p2^n2 * p3^n3
When i reaches p1, the statement n = n / i = n / p1 removes one occurrence of p1:
n / p1 = p1^(n1-1) * p2^n2 * p3^n3
The inner while iterates as long as there are p1s in n. Thus, after the iteration is complete (when i = i + 1 is executed), all occurrences of p1 have been removed and:
n' = p2^n2 * p3^n3
Let's skip some iterations until i reaches p3. The remaining n is then:
n'' = p3^n3
Here, we find a first mistake in the code. If n3 is 2, then the outer condition does not hold and we remain with p3^2. It should be while i^2 <= n.
As before, the inner while removes all occurrences of p3, leaving us with n''' = 1. This is the second mistake. It should be while n%i==0 and n>i (not sure about the Lua syntax), which keeps the very last occurrence.
So the above code works for all numbers n where the largest prime factor occurs only once, by successively removing all other factors. For all other numbers, the mentioned corrections should make it work, too.
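With both corrections applied, the loop looks like this (a Python sketch rather than Lua, for brevity):

def largest_prime_factor(n):
    i = 2
    while i * i <= n:                      # first fix: <= instead of <
        while n % i == 0 and n > i:        # second fix: keep the last occurrence
            n //= i
        i += 1
    return n

assert largest_prime_factor(102) == 17         # 102 = 2 * 3 * 17
assert largest_prime_factor(9) == 3            # repeated largest prime factor
assert largest_prime_factor(1_000_000) == 5    # only 2s and 5s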
This eliminates all the known smaller prime factors from n, so n becomes smaller and sqrt(n) can be reached earlier. That is where the performance boost comes from: you no longer need to test divisors up to the square root of the original N. Say n is a million, consisting only of 2s and 5s; naive trial division would need to check all primes up to 1000, while dividing by 2 first yields 15625, and then dividing by 5 yields 1, effectively factoring the big number in two quick passes. (By the way, the algorithm as posted will return 1 in that case! To fix it, if the loop exits with n=1, return the last i that divided n instead.) But this only helps with "common" numbers that have a single high prime factor and a bunch of smaller ones; a number n=p*q, where p and q are both primes and close together, won't benefit from this boost.
The n=n/i line works because if you are seeking another prime besides the i you have currently found as a divisor, the quotient is still divisible by that other prime, by the definition of prime numbers. Read here: https://en.wikipedia.org/wiki/Fundamental_theorem_of_arithmetic . It also only works in your case because i runs from 2 upward, so you always divide by primes before their composites. Otherwise, if your number had 3 as its largest prime factor but were also divisible by 2, and you checked 6 first, you'd break the principle of dividing only by primes (say with 72: if you first divide by 6 repeatedly, you end up with 2, while the answer is 3) by accidentally dividing by a composite that contains the largest prime.
This algorithm (when corrected) takes O(max(p2,sqrt(p1))) steps to find the prime factorization of n, where p1 is the largest prime factor and the p2 is the second largest prime factor. In case of a repeated largest prime factor, p1=p2.
Knuth and Trabb Pardo studied the behavior of this algorithm in "Analysis of a Simple Factorization Algorithm", Theoretical Computer Science 3 (1976), 321-348. They argued against the usual analysis, such as computing the average number of steps taken when factoring integers up to n. Although a few numbers with large prime factors boost the average value, in a cryptographic context what may be more relevant is that some of the percentiles are quite low. For example, 44.7% of numbers satisfy max(sqrt(p1),p2) < n^(1/3), and 1.2% of numbers satisfy max(sqrt(p1),p2) < n^(1/5).
A simple improvement is to test the remainder for primality after you find a new prime factor. It is very fast to test whether a number is prime. This reduces the time to O(p2) by avoiding the trial divisions between p2 and sqrt(p1). The median size of the second largest prime is about n^0.21. This means it is feasible to factor many 45-digit numbers rapidly (in a few processor-seconds) using this improvement on trial division. By comparison, Pollard-rho factorization on a product of two primes takes O(sqrt(p2)) steps on average, according to one model.
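A sketch of that improvement, using sympy.isprime as a stand-in for whichever fast primality test you have available:

from sympy import isprime

def largest_prime_factor_fast(n):
    # Trial division that stops as soon as the remaining cofactor is prime,
    # skipping the trial divisions between p2 and sqrt(p1).
    i = 2
    while i * i <= n:
        if n % i == 0:
            while n % i == 0 and n > i:
                n //= i
            if isprime(n):
                return n
        i += 1
    return n

print(largest_prime_factor_fast(2**4 * 5**3 * 999999937))   # finds 999999937 quickly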

Distinct digit count

Is it possible to count the distinct digits in a number in constant time O(1)?
Suppose n=1519; the output should be 3, as there are 3 distinct digits (1, 5, 9).
I have done it in O(N) time but anyone knows how to find it in O(1) time?
I assume N is the number of digits of n. If the size of n is unlimited, it can't be done in general in O(1) time.
Consider the number n=11111...111, with 2 trillion digits. If I switch one of the digits from a 1 to a 2, there is no way to discover this without in some way looking at every single digit. Thus processing a number with 2 trillion digits must take (of the order of) 2 trillion operations at least, and in general, a number with N digits must take (of the order of) N operations at least.
However, for almost all numbers, the simple O(N) algorithm finishes very quickly because you can just stop as soon as you get to 10 distinct digits. Almost all numbers of sufficient length will have all 10 digits: e.g. the probability of not terminating with the answer '10' after looking at the first 100 digits is about 0.00027, and after the first 1000 digits it's about 1.7e-45. But unfortunately, there are some oddities which make the worst case O(N).
After seeing that someone really posted a serious answer to this question, I'd rather repeat my own cheat here, which is a special case of the answer described by #SimonNickerson:
O(1) is not possible, unless you are on radix 2, because that way, every number other than 0 has both 1 and 0, and thus my "solution" works not only for integers...
EDIT
How about 2^k - 1? Isn't that all 1s?
Drat! True... I should have known that when something seems so easy, it is flawed somehow... If I got the all 0 case covered, I should have covered the all 1 case too.
Luckily this case can be tested quite quickly (if addition and bitwise AND are considered O(1) operations): if x is the number to be tested, compute y = (x+1) AND x. If y = 0, then x = 2^k - 1, because this is the only case where the addition has to flip all the bits. Of course, this is itself a bit flawed, as with bit lengths exceeding the bus width, the bitwise operators are not O(1) anymore, but rather O(N).
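A small illustration of that test (a sketch; with Python's unbounded ints the O(1) claim only holds for machine-width values):

def all_ones(x):
    # True iff x is 2**k - 1 for some k >= 1, i.e. its binary form is all 1s.
    return x > 0 and (x + 1) & x == 0

print([n for n in range(1, 20) if all_ones(n)])   # [1, 3, 7, 15]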
At the same time, I think it can be brought down to O(logN), by breaking the number into bus width size chunks, and AND-ing together the neighboring ones, repeating until only one is left: if there were no 0s in the number tested, the last one will be full 1s too...
EDIT2: I was wrong... This is still O(N).

Perfect powers of numbers which can fit in 64 bit size integer (using priority queues)

How can we print out all perfect powers that can be represented as 64-bit long integers: 4, 8, 9, 16, 25, 27, ...? A perfect power is a number that can be written as a^b for integers a and b ≥ 2.
It's not a homework problem, I found it in job interview questions section of an algorithm design book. Hint, the chapter was based on priority queues.
Most of the ideas I have are quadratic in nature: they keep finding powers until they stop fitting in 64 bits, but that's not what an interviewer will look for. Also, I'm not able to understand how PQs would help here.
Using a small priority queue, with one entry per power, is a reasonable way to list the numbers. See the following Python code.
import Queue # in Python 3 say: queue

pmax, vmax = 10, 150
Q = Queue.PriorityQueue(pmax)
p = 2
for e in range(2, pmax):
    p *= 2
    Q.put((p, 2, e))
print 1, 1, 2
while not Q.empty():
    (v, b, e) = Q.get()
    if v < vmax:
        print v, b, e
        b += 1
        Q.put((b**e, b, e))
With pmax, vmax as in the code above, it produces the following output. For the proposed problem, replace pmax and vmax with 64 and 2**64.
1 1 2
4 2 2
8 2 3
9 3 2
16 2 4
16 4 2
25 5 2
27 3 3
32 2 5
36 6 2
49 7 2
64 2 6
64 4 3
64 8 2
81 3 4
81 9 2
100 10 2
121 11 2
125 5 3
128 2 7
144 12 2
The complexity of this method is O(vmax^0.5 * log(pmax)). This is because the number of perfect squares is dominant over the number of perfect cubes, fourth powers, etc., and for each square we do O(log(pmax)) work for get and put queue operations. For higher powers, we do O(log(pmax)) work when computing b**e.
When pmax, vmax = 64, 2**64, there will be about 2*(2^32 + 2^21 + 2^16 + 2^12 + ...) queue operations, i.e. about 2^33 queue ops.
Added note: This note addresses cf16's comment, “one remark only, I don't think "the number of perfect squares is dominant over the number of perfect cubes, fourth powers, etc." they all are infinite. but yes, if we consider finite set”. It is true that in the overall mathematical scheme of things, the cardinalities are the same. That is, if P(j) is the set of all j'th powers of integers, then the cardinality of P(j) == P(k) for all integers j,k > 0. Elements of any two sets of powers can be put into 1-1 correspondence with each other.
Nevertheless, when computing perfect powers in ascending order, no matter how many are computed, finite or not, the work of delivering squares dominates that for any other power. For any given x, the density of perfect kth powers in the region of x declines exponentially as k increases. As x increases, the density of perfect kth powers in the region of x is proportional to x^(1/k)/x, hence third powers, fourth powers, etc. become vanishingly rare compared to squares as x increases.
As a concrete example, among perfect powers between 1e8 and 1e9 the number of (2; 3; 4; 5; 6)th powers is about (21622; 535; 77; 24; 10). There are more than 30 times as many squares between 1e8 and 1e9 than there are instances of any higher powers than squares. Here are ratios of the number of perfect squares between two numbers, vs the number of higher perfect powers: 10¹⁰–10¹⁵, r≈301; 10¹⁵–10²⁰, r≈2K; 10²⁰–10²⁵, r≈15K; 10²⁵–10³⁰, r≈100K. In short, as x increases, squares dominate more and more when perfect powers are delivered in ascending order.
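Those counts are easy to reproduce (a small sketch counting k-th powers strictly between 1e8 and 1e9):

lo, hi = 10**8, 10**9
for k in range(2, 7):
    count = sum(1 for b in range(2, 40000) if lo < b**k < hi)
    print(k, count)          # 2 21622, 3 535, 4 77, 5 24, 6 10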
A priority queue helps, for example, if you want to avoid duplicates in the output, or if you want to list the values particularly sorted.
Priority queues can often be replaced by sorting and vice versa. You could therefore generate all combinations of a^b, then sort the results and remove adjacent duplicates. In this application, this approach appears to be slightly, but perhaps not dramatically, memory-inefficient, as witnessed by one of the sister answers.
A priority queue can be superior to sorting, if you manage to remove duplicates as you go; or if you want to avoid storing and processing the whole result to be generated in memory. The other sister answer is an example of the latter but it could easily do both with a slight modification.
Here it makes the difference between an array taking up ~16 GB of RAM and a queue with less than 64 items taking up several kilobytes at worst. Such a huge difference in memory consumption also translates to RAM access time versus cache access time difference, so the memory lean algorithm may end up much faster even if the underlying data structure incurs some overhead by maintaining itself and needs more instructions compared to the naive algorithm that uses sorting.
Because the size of the input is fixed, it is not technically possible that the methods you thought of are quadratic in nature. Having two nested loops does not make an algorithm quadratic, unless you can say that the upper bound of each such loop is proportional to the input size (and often not even then). What really matters is how many times the innermost logic actually executes.
In this case the competition is between feasible constants and non-feasible constants.
The only way I can see the priority queue making much sense is that you want to print numbers as they become available, in strictly increasing order, and of course without printing any number twice. So you start off with a prime generator (that uses the sieve of Eratosthenes or some smarter technique to generate the sequence 2, 3, 5, 7, 11, ...). You start by putting a triple representing the fact that 2^2 = 4 onto the queue. Then you repeat a process of removing the smallest item (the triple with the smallest exponentiation result) from the queue, printing it, increasing the exponent by one, and putting it back onto the queue (with its priority determined by the result of the new exponentiation). You interleave this process with one that generates new primes as needed (sometime before p^2 is output).
Since the largest base we can possibly have is 2^32 (because (2^32)^2 = 2^64), the number of elements on the queue shouldn't exceed the number of primes less than 2^32, which is evidently 203,280,221, which I guess is a tractable number.
