Given Prime Number N, Compute the Next Prime? - algorithm

A coworker just told me that the C# Dictionary collection resizes by prime numbers for arcane reasons relating to hashing. And my immediate question was, "how does it know what the next prime is? Do they store a giant table or compute it on the fly? That's a scary non-deterministic runtime for an insert that causes a resize."
So my question is, given N, which is a prime number, what is the most efficient way to compute the next prime number?

About a year ago I was working in this area for libc++ while implementing the
unordered (hash) containers for C++11. I thought I would share
my experiences here. They support marcog's accepted answer: for a
reasonable definition of "brute force", even a simple brute-force search is fast
enough in most circumstances, taking O(ln(p)*sqrt(p)) on average.
I developed several implementations of size_t next_prime(size_t n) where the
spec for this function is:
Returns: The smallest prime that is greater than or equal to n.
Each implementation of next_prime is accompanied by a helper function is_prime. is_prime should be considered a private implementation detail, not meant to be called directly by the client. Each of these implementations was of course tested for correctness, but also
tested with the following performance test:
#include <chrono>
#include <cstddef>
#include <iostream>

int main()
{
typedef std::chrono::high_resolution_clock Clock;
typedef std::chrono::duration<double, std::milli> ms;
Clock::time_point t0 = Clock::now();
std::size_t n = 100000000;
std::size_t e = 100000;
for (std::size_t i = 0; i < e; ++i)
n = next_prime(n+1);
Clock::time_point t1 = Clock::now();
std::cout << e/ms(t1-t0).count() << " primes/millisecond\n";
return n;
}
I should stress that this is a performance test, and does not reflect typical
usage, which would look more like:
// Overflow checking not shown for clarity purposes
n = next_prime(2*n + 1);
All performance tests were compiled with:
clang++ -stdlib=libc++ -O3 main.cpp
Implementation 1
There are seven implementations. The purpose for displaying the first
implementation is to demonstrate that if you fail to stop testing the candidate
prime x for factors at sqrt(x) then you have failed to even achieve an
implementation that could be classified as brute force. This implementation is
brutally slow.
bool
is_prime(std::size_t x)
{
if (x < 2)
return false;
for (std::size_t i = 2; i < x; ++i)
{
if (x % i == 0)
return false;
}
return true;
}
std::size_t
next_prime(std::size_t x)
{
for (; !is_prime(x); ++x)
;
return x;
}
For this implementation only I had to set e to 100 instead of 100000, just to
get a reasonable running time:
0.0015282 primes/millisecond
Implementation 2
This implementation is the slowest of the brute force implementations and the
only difference from implementation 1 is that it stops testing for primeness
when the factor surpasses sqrt(x).
bool
is_prime(std::size_t x)
{
if (x < 2)
return false;
for (std::size_t i = 2; true; ++i)
{
std::size_t q = x / i;
if (q < i)
return true;
if (x % i == 0)
return false;
}
return true;
}
std::size_t
next_prime(std::size_t x)
{
for (; !is_prime(x); ++x)
;
return x;
}
Note that sqrt(x) isn't directly computed, but inferred by q < i. This
speeds things up by a factor of thousands:
5.98576 primes/millisecond
and validates marcog's prediction:
... this is well within the constraints of
most problems taking on the order of
a millisecond on most modern hardware.
Implementation 3
One can nearly double the speed (at least on the hardware I'm using) by
avoiding use of the % operator:
bool
is_prime(std::size_t x)
{
if (x < 2)
return false;
for (std::size_t i = 2; true; ++i)
{
std::size_t q = x / i;
if (q < i)
return true;
if (x == q * i)
return false;
}
return true;
}
std::size_t
next_prime(std::size_t x)
{
for (; !is_prime(x); ++x)
;
return x;
}
11.0512 primes/millisecond
Implementation 4
So far I haven't even used the common knowledge that 2 is the only even prime.
This implementation incorporates that knowledge, nearly doubling the speed
again:
bool
is_prime(std::size_t x)
{
for (std::size_t i = 3; true; i += 2)
{
std::size_t q = x / i;
if (q < i)
return true;
if (x == q * i)
return false;
}
return true;
}
std::size_t
next_prime(std::size_t x)
{
if (x <= 2)
return 2;
if (!(x & 1))
++x;
for (; !is_prime(x); x += 2)
;
return x;
}
21.9846 primes/millisecond
Implementation 4 is probably what most people have in mind when they think
"brute force".
Implementation 5
Using the following formula you can easily choose all numbers which are
divisible by neither 2 nor 3:
6 * k + {1, 5}
where k >= 1. The following implementation uses this formula, but implemented
with a cute xor trick:
bool
is_prime(std::size_t x)
{
std::size_t o = 4;
for (std::size_t i = 5; true; i += o)
{
std::size_t q = x / i;
if (q < i)
return true;
if (x == q * i)
return false;
o ^= 6;
}
return true;
}
std::size_t
next_prime(std::size_t x)
{
switch (x)
{
case 0:
case 1:
case 2:
return 2;
case 3:
return 3;
case 4:
case 5:
return 5;
}
std::size_t k = x / 6;
std::size_t i = x - 6 * k;
std::size_t o = i < 2 ? 1 : 5;
x = 6 * k + o;
for (i = (3 + o) / 2; !is_prime(x); x += i)
i ^= 6;
return x;
}
This effectively means that the algorithm has to check only 1/3 of the
integers for primeness instead of 1/2 of the numbers and the performance test
shows the expected speed up of nearly 50%:
32.6167 primes/millisecond
Implementation 6
This implementation is a logical extension of implementation 5: It uses the
following formula to compute all numbers which are not divisible by 2, 3 and 5:
30 * k + {1, 7, 11, 13, 17, 19, 23, 29}
It also unrolls the inner loop within is_prime, and creates a list of "small
primes" that is useful for dealing with numbers less than 30.
static const std::size_t small_primes[] =
{
2,
3,
5,
7,
11,
13,
17,
19,
23,
29
};
static const std::size_t indices[] =
{
1,
7,
11,
13,
17,
19,
23,
29
};
bool
is_prime(std::size_t x)
{
const size_t N = sizeof(small_primes) / sizeof(small_primes[0]);
for (std::size_t i = 3; i < N; ++i)
{
const std::size_t p = small_primes[i];
const std::size_t q = x / p;
if (q < p)
return true;
if (x == q * p)
return false;
}
for (std::size_t i = 31; true;)
{
std::size_t q = x / i;
if (q < i)
return true;
if (x == q * i)
return false;
i += 6;
q = x / i;
if (q < i)
return true;
if (x == q * i)
return false;
i += 4;
q = x / i;
if (q < i)
return true;
if (x == q * i)
return false;
i += 2;
q = x / i;
if (q < i)
return true;
if (x == q * i)
return false;
i += 4;
q = x / i;
if (q < i)
return true;
if (x == q * i)
return false;
i += 2;
q = x / i;
if (q < i)
return true;
if (x == q * i)
return false;
i += 4;
q = x / i;
if (q < i)
return true;
if (x == q * i)
return false;
i += 6;
q = x / i;
if (q < i)
return true;
if (x == q * i)
return false;
i += 2;
}
return true;
}
std::size_t
next_prime(std::size_t n)
{
const size_t L = 30;
const size_t N = sizeof(small_primes) / sizeof(small_primes[0]);
// If n is small enough, search in small_primes
if (n <= small_primes[N-1])
return *std::lower_bound(small_primes, small_primes + N, n);
// Else n > largest small_primes
// Start searching list of potential primes: L * k0 + indices[in]
const size_t M = sizeof(indices) / sizeof(indices[0]);
// Select first potential prime >= n
// Known a-priori n >= L
size_t k0 = n / L;
size_t in = std::lower_bound(indices, indices + M, n - k0 * L) - indices;
n = L * k0 + indices[in];
while (!is_prime(n))
{
if (++in == M)
{
++k0;
in = 0;
}
n = L * k0 + indices[in];
}
return n;
}
This is arguably getting beyond "brute force" and is good for boosting the
speed another 27.5% to:
41.6026 primes/millisecond
Implementation 7
It is practical to play the above game for one more iteration, developing a
formula for numbers not divisible by 2, 3, 5 and 7:
210 * k + {1, 11, ...},
The source code isn't shown here, but is very similar to implementation 6.
This is the implementation I chose to actually use for the unordered containers
of libc++, and that source code is open source (found at the link).
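For illustration only, here is a hypothetical sketch of what the candidate generation for such a mod-210 wheel could look like. This is not the libc++ source; the indices210 name, the forward declaration of is_prime (as in implementation 6), and the assumption that n is already past the small-prime range are all mine:
#include <algorithm>
#include <cstddef>

bool is_prime(std::size_t x);   // trial-division helper, e.g. as in implementation 6

// The 48 residues modulo 210 that are coprime to 2, 3, 5 and 7.
static const std::size_t indices210[] =
{
      1,  11,  13,  17,  19,  23,  29,  31,  37,  41,  43,  47,
     53,  59,  61,  67,  71,  73,  79,  83,  89,  97, 101, 103,
    107, 109, 113, 121, 127, 131, 137, 139, 143, 149, 151, 157,
    163, 167, 169, 173, 179, 181, 187, 191, 193, 197, 199, 209
};

std::size_t next_prime_210(std::size_t n)   // assumes n is beyond the small primes
{
    const std::size_t L = 210;
    const std::size_t M = sizeof(indices210) / sizeof(indices210[0]);   // 48
    std::size_t k0 = n / L;
    std::size_t in = std::lower_bound(indices210, indices210 + M, n - k0 * L)
                     - indices210;
    n = L * k0 + indices210[in];
    while (!is_prime(n))
    {
        if (++in == M)
        {
            ++k0;
            in = 0;
        }
        n = L * k0 + indices210[in];
    }
    return n;
}
The structure mirrors implementation 6's next_prime; only the wheel modulus and the index table change.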
This final iteration is good for another 14.6% speed boost to:
47.685 primes/millisecond
Use of this algorithm assures that clients of libc++'s hash tables can choose
any prime they decide is most beneficial to their situation, and the performance
for this application is quite acceptable.

Just in case somebody is curious:
Using reflector I determined that .NET uses a static class that contains a hard-coded list of ~72 primes ranging up to 7199369, which it scans for the smallest prime that is at least twice the current size; for sizes larger than that, it computes the next prime by trial division of all odd numbers up to the square root of the candidate. This class is immutable and thread-safe (i.e. larger primes are not stored for future use).
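For illustration only, here is a rough C++ sketch of that strategy; the table values, names and cutoffs are mine, not the actual .NET source:
#include <cstddef>

// Trial division by odd numbers up to sqrt(x).
bool is_prime_trial(std::size_t x)
{
    if (x < 2) return false;
    if (x % 2 == 0) return x == 2;
    for (std::size_t d = 3; d * d <= x; d += 2)
        if (x % d == 0) return false;
    return true;
}

// Return a prime >= min_size: first by scanning a small hard-coded table,
// otherwise by testing odd candidates one by one.
std::size_t expand_prime(std::size_t min_size)
{
    static const std::size_t table[] = { 3, 7, 11, 17, 23, 29, 37, 47, 59, 71 };  // abbreviated, illustrative only
    for (std::size_t p : table)
        if (p >= min_size) return p;
    std::size_t c = min_size | 1;           // first odd candidate >= min_size
    while (!is_prime_trial(c)) c += 2;
    return c;
}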

The gaps between consecutive prime numbers are known to be quite small, with the first gap of over 100 occurring after the prime 370261. That means that even a simple brute force will be fast enough in most circumstances, taking O(ln(p)*sqrt(p)) on average.
For p=10,000 that's O(921) operations. Bearing in mind we'll be performing this once every O(ln(p)) insertion (roughly speaking), this is well within the constraints of most problems taking on the order of a millisecond on most modern hardware.

A nice trick is to use a partial sieve. For example, what is the next prime that follows the number N = 2534536543556?
Check the modulus of N with respect to a list of small primes. Thus...
mod(2534536543556,[3 5 7 11 13 17 19 23 29 31 37])
ans =
2 1 3 6 4 1 3 4 22 16 25
We know that the next prime following N must be odd, and the moduli above let us sieve out the odd multiples of those small primes without any trial divisions. Were we to use the small primes up to 200, we could use this scheme to immediately discard most potential primes greater than N, except for a small list.
More explicitly, if we are looking for a prime number beyond 2534536543556, it cannot be divisible by 2, so we need consider only the odd numbers beyond that value. The moduli above show that 2534536543556 is congruent to 2 mod 3, therefore 2534536543556+1 is congruent to 0 mod 3, as must be 2534536543556+7, 2534536543556+13, etc. Effectively, we can sieve out many of the numbers without any need to test them for primality and without any trial divisions.
Similarly, the fact that
mod(2534536543556,7) = 3
tells us that 2534536543556+4 is congruent to 0 mod 7. Of course, that number is even, so we can ignore it. But 2534536543556+11 is an odd number that is divisible by 7, as is 2534536543556+25, etc. Again, we can exclude these numbers as clearly composite (because they are divisible by 7) and so not prime.
Using only the small list of primes up to 37, we can exclude most of the numbers that immediately follow our starting point of 2534536543556, only excepting a few:
{2534536543573 , 2534536543579 , 2534536543597}
Of those numbers, are they prime?
2534536543573 = 1430239 * 1772107
2534536543579 = 99833 * 25387763
I've made the effort of providing the prime factorizations of the first two numbers in the list. See that they are composite, but the prime factors are large. Of course, this makes sense, since we've already ensured that no number that remains can have small prime factors. The third one in our short list (2534536543597) is in fact the very first prime number beyond N. The sieving scheme I've described will tend to result in numbers that are either prime, or are composed of generally large prime factors. So we needed to actually apply an explicit test for primality to only a few numbers before finding the next prime.
A similar scheme quickly yields the next prime exceeding N = 1000000000000000000000000000, as 1000000000000000000000000103.
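A minimal C++ sketch of the partial-sieve idea described above (my own illustration; the window size, the small-prime list and the plain trial-division test on the survivors are all choices of this sketch, not the answerer's code):
#include <cstdint>
#include <iostream>
#include <vector>

static bool is_prime_trial(std::uint64_t x)
{
    if (x < 2) return false;
    for (std::uint64_t d = 2; d * d <= x; ++d)
        if (x % d == 0) return false;
    return true;
}

std::uint64_t next_prime_after(std::uint64_t n)
{
    const std::uint64_t small[] = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37};
    const std::size_t window = 256;                    // consider offsets n+1 .. n+window
    for (;;)
    {
        std::vector<bool> composite(window + 1, false);
        for (std::uint64_t p : small)
        {
            std::uint64_t o = (p - n % p) % p;         // first offset with (n + o) % p == 0
            if (o == 0) o = p;
            for (; o <= window; o += p) composite[o] = true;
        }
        for (std::size_t o = 1; o <= window; ++o)      // test only the survivors
            if (!composite[o] && is_prime_trial(n + o))
                return n + o;
        n += window;                                   // rare: no prime in this window
    }
}

int main()
{
    // The answer above reports 2534536543597 as the next prime beyond this N.
    std::cout << next_prime_after(2534536543556ULL) << "\n";
}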

Just a few experiments with inter-primes distance.
This is a complement to visualize other answers.
I got the primes from the 100,000th (= 1,299,709) to the 200,000th (= 2,750,159)
Some data:
Maximum interprime distance = 148
Mean interprime distance = 15
Interprime distance frequency plot and interprime distance vs. prime number plot (images not shown).
Just to see it's "random". However ...
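For reference, a rough C++ sketch (my own, not the answerer's code) of how such gap statistics can be gathered, using the prime values quoted above:
#include <algorithm>
#include <cstddef>
#include <iostream>
#include <vector>

int main()
{
    const int limit = 2750159;                      // the 200,000th prime
    std::vector<bool> composite(limit + 1, false);
    std::vector<int> primes;
    for (int i = 2; i <= limit; ++i)
        if (!composite[i])
        {
            primes.push_back(i);
            for (long long j = 1LL * i * i; j <= limit; j += i)
                composite[j] = true;
        }
    int max_gap = 0;
    long long sum_gap = 0, count = 0;
    for (std::size_t k = 100000; k < primes.size(); ++k)   // gaps from the 100,000th prime on
    {
        int gap = primes[k] - primes[k - 1];
        max_gap = std::max(max_gap, gap);
        sum_gap += gap;
        ++count;
    }
    std::cout << "max gap = " << max_gap
              << ", mean gap = " << double(sum_gap) / count << "\n";
}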

There's no function f(n) to calculate the next prime number. Instead a number must be tested for primality.
It is also very useful, when finding the nth prime, to already know all the primes from the 1st up to the (n-1)th, because those are the only numbers that need to be tested as factors.
For these reasons, I would not be surprised if there is a precalculated set of large prime numbers; it doesn't really make sense to me to recalculate the same primes repeatedly.

As others have already noted, no formula has been found that produces the next prime directly from the current one. Therefore most algorithms focus on using a fast means of checking primality, since you have to test roughly half of the numbers between your known prime and the next one.
Depending upon the application, you can also get away with just hard-coding a look-up table, as noted by Paul Wheeler.

For sheer novelty, there’s always this approach:
#!/usr/bin/perl
for $p ( 2 .. 200 ) {
next if (1x$p) =~ /^(11+)\1+$/;
for ($n=1x(1+$p); $n =~ /^(11+)\1+$/; $n.=1) { }
printf "next prime after %d is %d\n", $p, length($n);
}
which produces
next prime after 2 is 3
next prime after 3 is 5
next prime after 5 is 7
next prime after 7 is 11
next prime after 11 is 13
next prime after 13 is 17
next prime after 17 is 19
next prime after 19 is 23
next prime after 23 is 29
next prime after 29 is 31
next prime after 31 is 37
next prime after 37 is 41
next prime after 41 is 43
next prime after 43 is 47
next prime after 47 is 53
next prime after 53 is 59
next prime after 59 is 61
next prime after 61 is 67
next prime after 67 is 71
next prime after 71 is 73
next prime after 73 is 79
next prime after 79 is 83
next prime after 83 is 89
next prime after 89 is 97
next prime after 97 is 101
next prime after 101 is 103
next prime after 103 is 107
next prime after 107 is 109
next prime after 109 is 113
next prime after 113 is 127
next prime after 127 is 131
next prime after 131 is 137
next prime after 137 is 139
next prime after 139 is 149
next prime after 149 is 151
next prime after 151 is 157
next prime after 157 is 163
next prime after 163 is 167
next prime after 167 is 173
next prime after 173 is 179
next prime after 179 is 181
next prime after 181 is 191
next prime after 191 is 193
next prime after 193 is 197
next prime after 197 is 199
next prime after 199 is 211
All fun and games aside, it is well known that the optimal hash table size is rigorously provably a prime number of the form 4N−1. So just finding the next prime is insufficient. You have to do the other check, too.

As far as I remember, it uses the prime number next after the double of the current size. It doesn't calculate that prime number on the fly; there is a table with preloaded primes up to some big value (I don't remember exactly, somewhere around 10,000,000). When that number is exceeded, it uses some naive algorithm to get the next candidate (like curNum = curNum + 1) and validates it using some of these methods: http://en.wikipedia.org/wiki/Prime_number#Verifying_primality

Related

Divide a number by 3 without using division, multiplication or modulus

Without using /, % and * operators, write a function to divide a number by 3. itoa() is available.
I was asked the above in an interview and I couldn't really come up with an answer. I thought of converting the number to a string and adding all the digits, but that will just tell me whether the number is divisible or not. Or, by repeated subtraction, it can also tell me the remainder. But how do I obtain the quotient of the division?
The below code takes in 2 integers, and divides the first by the second. It supports negative numbers.
int divide(int a, int b) {
    if (b == 0)
        throw new ArithmeticException("division by zero");
    // isPos tracks whether the answer is positive or negative
    boolean isPos = true;
    // if the signs are different, the answer will be negative
    if ((a < 0 && b > 0) || (a > 0 && b < 0))
        isPos = false;
    a = Math.abs(a);
    b = Math.abs(b);
    int ans = 0;
    while (a >= b) {
        a = a - b;
        ans++;
    }
    return isPos ? ans : 0 - ans;
}
Since itoa() is available, the number is an integer.
int divide(int a, int b)
{
    int n = 0;
    while (a >= b)
    {
        a -= b;
        n = n + 1;
    }
    return n;
}
Just count how many times b goes into a by subtracting it.
Edit: Removed the limit
The "count how many times you subtract 3" algorithm takes theta(|input|) steps. You could argue that theta(|input|) is fine for 32-bit integers, in which case why do any programming? Just use a lookup table. However, there are much faster methods which can be used for larger inputs.
You can perform a binary search for the quotient, testing whether a candidate quotient q is too large or too small by comparing q+q+q with the input. Binary search takes theta(log |input|) time.
Binary search uses division by 2, which can be done by the shift operator instead of /, or you can implement this yourself on arrays of bits if the shift operator is too close to division.
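A hedged sketch of that binary search (the function name is mine; only addition, comparison and shifts are used):
unsigned div3_binary_search(unsigned n)
{
    unsigned lo = 0, hi = n;                     // the quotient lies in [0, n]
    while (lo < hi)
    {
        // upper middle, computed with a shift instead of a division by 2
        unsigned mid = lo + ((hi - lo + 1) >> 1);
        // test the candidate quotient by comparing mid+mid+mid with the input
        if ((unsigned long long)mid + mid + mid <= n)
            lo = mid;                            // mid is still feasible
        else
            hi = mid - 1;
    }
    return lo;                                   // floor(n / 3)
}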
It is tempting to use the fact that 1/3 is the sum of the geometric series 1/4 + 1/16 + 1/64 + 1/256 + ... by trying (n>>2) + (n>>4) + (n>>6) + ... however this produces the wrong answer for n=3,6,7,9, 11, 12, 13, 14, 15, 18, ... It is off by two for n=15,30,31, 39, .... In general, this is off by O(log n). For n nonnegative,
(n>>2) + (n>>4) + (n>>6) + ... = (n-wt4(n))/3
where wt4(n) is the sum of the base 4 digits of n, and the / on the right hand side is exact, not integer division. We can compute n/3 by adding wt4(n)/3 to (n>>2)+(n>>4)+(n>>6)+... We can compute the base 4 digits of n and therefore wt4(n) using only addition and the right shift.
int oneThirdOf(int n){
if (0<=n && n<3)
return 0;
if (n==3)
return 1;
return sum(n) + oneThirdOf(wt4(n));
}
// Compute (n>>2) + (n>>4) + (n>>6) + ... recursively.
int sum(int n){
if (n<4)
return 0;
return (n>>2) + sum(n>>2);
}
// Compute the sum of the digits of n base 4 recursively.
int wt4(int n){
if (n<4)
return n;
int fourth = n>>2;
int lastDigit = n-fourth-fourth-fourth-fourth;
return wt4(fourth) + lastDigit;
}
This also takes theta(log input) steps.

Most efficient method of generating a random number with a fixed number of bits set

I need to generate a random number, but it needs to be selected from the set of binary numbers with equal numbers of set bits. E.g. choose a random byte value with exactly 2 bits set...
00000000 - no
00000001 - no
00000010 - no
00000011 - YES
00000100 - no
00000101 - YES
00000110 - YES
...
=> Set of possible numbers 3, 5, 6...
Note that this is a simplified set of numbers. Think more along the lines of 'Choose a random 64-bit number with exactly 40 bits set'. Each number from the set must be equally likely to arise.
Do a random selection from the set of all bit positions, then set those bits.
Example in Python:
import random

def random_bits(word_size, bit_count):
    number = 0
    for bit in random.sample(range(word_size), bit_count):
        number |= 1 << bit
    return number
Results of running the above 10 times:
0xb1f69da5cb867efbL
0xfceff3c3e16ea92dL
0xecaea89655befe77L
0xbf7d57a9b62f338bL
0x8cd1fee76f2c69f7L
0x8563bfc6d9df32dfL
0xdf0cdaebf0177e5fL
0xf7ab75fe3e2d11c7L
0x97f9f1cbb1f9e2f8L
0x7f7f075de5b73362L
I have found an elegant solution: random dichotomy.
The idea is that on average:
AND with a random number divides the number of set bits by 2,
OR with a random number adds 50% more set bits.
C code to compile with gcc (to have __builtin_popcountll):
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
/// Return a random number, with nb_bits bits set out of the width LSB
uint64_t random_bits(uint8_t width, uint8_t nb_bits)
{
assert(nb_bits <= width);
assert(width <= 64);
uint64_t x_min = 0;
uint64_t x_max = width == 64 ? (uint64_t)-1 : (1UL<<width)-1;
int n = 0;
while (n != nb_bits)
{
// generate a random value of at least width bits
uint64_t x = random();
if (width > 31)
x ^= random() << 31;
if (width > 62)
x ^= random() << 33;
x = x_min | (x & x_max); // x_min is a subset of x, which is a subset of x_max
n = __builtin_popcountll(x);
printf("x_min = 0x%016lX, %d bits\n", x_min, __builtin_popcountll(x_min));
printf("x_max = 0x%016lX, %d bits\n", x_max, __builtin_popcountll(x_max));
printf("x = 0x%016lX, %d bits\n\n", x, n);
if (n > nb_bits)
x_max = x;
else
x_min = x;
}
return x_min;
}
In general fewer than 10 loops are needed to reach the requested number of bits (and with luck it can take 2 or 3 loops). The corner cases (nb_bits = 0, 1, width-1, width) work, even though handling them as special cases would be faster.
Example of result:
x_min = 0x0000000000000000, 0 bits
x_max = 0x1FFFFFFFFFFFFFFF, 61 bits
x = 0x1492717D79B2F570, 33 bits
x_min = 0x0000000000000000, 0 bits
x_max = 0x1492717D79B2F570, 33 bits
x = 0x1000202C70305120, 14 bits
x_min = 0x0000000000000000, 0 bits
x_max = 0x1000202C70305120, 14 bits
x = 0x0000200C10200120, 7 bits
x_min = 0x0000200C10200120, 7 bits
x_max = 0x1000202C70305120, 14 bits
x = 0x1000200C70200120, 10 bits
x_min = 0x1000200C70200120, 10 bits
x_max = 0x1000202C70305120, 14 bits
x = 0x1000200C70201120, 11 bits
x_min = 0x1000200C70201120, 11 bits
x_max = 0x1000202C70305120, 14 bits
x = 0x1000200C70301120, 12 bits
width = 61, nb_bits = 12, x = 0x1000200C70301120
Of course, you need a good prng. Otherwise you can face an infinite loop.
Say the number of bits to set is b and the word size is w. I would create a vector v of length w with the first b values set to 1 and the rest set to 0. Then just shuffle v.
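A hedged C++ sketch of that idea (word size, RNG and function name are my choices):
#include <algorithm>
#include <cstdint>
#include <random>
#include <vector>

// Assumes 0 <= bit_count <= word_size <= 64.
std::uint64_t random_bits_shuffle(int word_size, int bit_count)
{
    std::vector<int> v(word_size, 0);
    std::fill(v.begin(), v.begin() + bit_count, 1);     // first bit_count entries are 1
    static std::mt19937_64 rng{std::random_device{}()};
    std::shuffle(v.begin(), v.end(), rng);              // random permutation of the 1s
    std::uint64_t x = 0;
    for (int i = 0; i < word_size; ++i)
        if (v[i])
            x |= std::uint64_t(1) << i;
    return x;
}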
Here is another option which is very simple and reasonably fast in practice.
choose a bit at random
if it is already set
do nothing
else
set it
increment count
end if
Repeat until count equals the number of bits you want set.
This will only be slow when the number of bits you want set (call it k) is more than half the word length (call it N). In that case, use the algorithm to set N - k bits instead and then flip all the bits in the result.
I bet the expected running time here is pretty good, although I am too lazy/stupid to compute it precisely right now. But I can bound it as less than 2*k... The expected number of flips of a coin to get "heads" is two, and each iteration here has a better than 1/2 chance of succeeding.
If you don't have the convenience of Python's random.sample, you might do this in C using the classic sequential sampling algorithm:
unsigned long k_bit_helper(int n, int k, unsigned long bit, unsigned long accum) {
    if (!(n && k))
        return accum;
    if (k > rand() % n)
        return k_bit_helper(n - 1, k - 1, bit + bit, accum + bit);
    else
        return k_bit_helper(n - 1, k, bit + bit, accum);
}

unsigned long random_k_bits(int k) {
    return k_bit_helper(64, k, 1, 0);
}
The cost of the above will be dominated by the cost of generating the random numbers (true in the other solutions, also). You can optimize this a bit if you have a good prng by batching: for example, since you know that the random numbers will be in steadily decreasing ranges, you could get the random numbers for n through n-3 by getting a random number in the range 0..(n * (n - 1) * (n - 2) * (n - 3)) and then extracting the individual random numbers:
r = randint(0, n * (n - 1) * (n - 2) * (n - 3) - 1);
rn = r % n; r /= n
rn1 = r % (n - 1); r /= (n - 1);
rn2 = r % (n - 2); r /= (n - 2);
rn3 = r % (n - 3); r /= (n - 3);
The maximum value of n is presumably 64 or 2^6, so the maximum value of the product above is certainly less than 2^24. Indeed, if you used a 64-bit prng, you could extract as many as 10 random numbers out of it. However, don't do this unless you know the prng you use produces independently random bits.
I have another suggestion based on enumeration: choose a random number i between 1 and n choose k, and generate the i-th combination. For example, for n = 6, k = 3 the 20 combinations are:
000111
001011
010011
100011
001101
010101
100101
011001
101001
110001
001110
010110
100110
011010
101010
110010
011100
101100
110100
111000
Let's say we randomly choose combination number 7. We first check whether it has a 1 in the last position: it has, because the first 10 (5 choose 2) combinations have. We then recursively check the remaining positions. Here is some C++ code:
word ithCombination(int n, int k, word i) {
// i is zero-based
word x = 0;
word b = 1;
while (k) {
word c = binCoeff[n - 1][k - 1];
if (i < c) {
x |= b;
--k;
} else {
i -= c;
}
--n;
b <<= 1;
}
return x;
}
word randomKBits(int k) {
word i = randomRange(0, binCoeff[BITS_PER_WORD][k] - 1);
return ithCombination(BITS_PER_WORD, k, i);
}
To be fast, we use precalculated binomial coefficients in binCoeff. The function randomRange returns a random integer between the two bounds (inclusively).
I did some timings (source). With the C++11 default random number generator, most time is spent in generating random numbers. Then this solution is fastest, since it uses the absolute minimum number of random bits possible. If I use a fast random number generator, then the solution by mic006 is fastest. If k is known to be very small, it's best to just randomly set bits until k are set.
Not exactly an algorithm suggestion, but just found a really neat solution in JavaScript to get random bits directly from Math.random output bits using ArrayBuffer.
//Swap var out with const and let for maximum performance! I like to use var because of prototyping ease
var randomBitList = function(n){
var floats = Math.ceil(n/64)+1;
var buff = new ArrayBuffer(floats*8);
var floatView = new Float64Array(buff);
var int8View = new Uint8Array(buff);
var intView = new Int32Array(buff);
for(var i = 0; i < (floats-1)*2; i++){
floatView[floats-1] = Math.random();
int8View[(floats-1)*8] = int8View[(floats-1)*8+4];
intView[i] = intView[(floats-1)*2];
}
this.get = function(idx){
var i = idx>>5;//divide by 32
var j = idx%32;
return (intView[i]>>j)&1;
//return Math.random()>0.5?0:1;
};
this.getBitList = function(){
var arr = [];
for(var idx = 0; idx < n; idx++){
var i = idx>>5;//divide by 32
var j = idx%32;
arr[idx] = (intView[i]>>j)&1;
}
return arr;
}
};

Segmented Sieve of Eratosthenes?

It's easy enough to make a simple sieve:
for (int i=2; i<=N; i++){
    if (sieve[i]==0){
        cout << i << " is prime" << endl;
        for (int j = i; j<=N; j+=i){
            sieve[j]++;
        }
    }
    cout << i << " has " << sieve[i] << " distinct prime factors\n";
}
But what about when N is very large and I can't hold that kind of array in memory? I've looked up segmented sieve approaches and they seem to involve finding primes up until sqrt(N) but I don't understand how it works. What if N is very large (say 10^18)?
The basic idea of a segmented sieve is to choose the sieving primes less than the square root of n, choose a reasonably large segment size that nevertheless fits in memory, and then sieve each of the segments in turn, starting with the smallest. At the first segment, the smallest multiple of each sieving prime that is within the segment is calculated, then multiples of the sieving prime are marked as composite in the normal way; when all the sieving primes have been used, the remaining unmarked numbers in the segment are prime. Then, for the next segment, for each sieving prime you already know the first multiple in the current segment (it was the multiple that ended the sieving for that prime in the prior segment), so you sieve on each sieving prime, and so on until you are finished.
The size of n doesn't matter, except that a larger n will take longer to sieve than a smaller n; the size that matters is the size of the segment, which should be as large as convenient (say, the size of the primary memory cache on the machine).
You can see a simple implementation of a segmented sieve here. Note that a segmented sieve will be very much faster than O'Neill's priority-queue sieve mentioned in another answer; if you're interested, there's an implementation here.
EDIT: I wrote this for a different purpose, but I'll show it here because it might be useful:
Though the Sieve of Eratosthenes is very fast, it requires O(n) space. That can be reduced to O(sqrt(n)) for the sieving primes plus O(1) for the bitarray by performing the sieving in successive segments. At the first segment, the smallest multiple of each sieving prime that is within the segment is calculated, then multiples of the sieving prime are marked composite in the normal way; when all the sieving primes have been used, the remaining unmarked numbers in the segment are prime. Then, for the next segment, the smallest multiple of each sieving prime is the multiple that ended the sieving in the prior segment, and so the sieving continues until finished.
Consider the example of sieve from 100 to 200 in segments of 20. The five sieving primes are 3, 5, 7, 11 and 13. In the first segment from 100 to 120, the bitarray has ten slots, with slot 0 corresponding to 101, slot k corresponding to 100+2k+1, and slot 9 corresponding to 119. The smallest multiple of 3 in the segment is 105, corresponding to slot 2; slots 2+3=5 and 5+3=8 are also multiples of 3. The smallest multiple of 5 is 105 at slot 2, and slot 2+5=7 is also a multiple of 5. The smallest multiple of 7 is 105 at slot 2, and slot 2+7=9 is also a multiple of 7. And so on.
Function primesRange takes arguments lo, hi and delta; lo and hi must be even, with lo < hi, and lo must be greater than sqrt(hi). The segment size is twice delta. Ps is a linked list containing the sieving primes less than sqrt(hi), with 2 removed since even numbers are ignored. Qs is a linked list containing the offset into the sieve bitarray of the smallest multiple in the current segment of the corresponding sieving prime. After each segment, lo advances by twice delta, so the number corresponding to an index i of the sieve bitarray is lo + 2i + 1.
function primesRange(lo, hi, delta)
    function qInit(p)
        return (-1/2 * (lo + p + 1)) % p
    function qReset(p, q)
        return (q - delta) % p
    sieve := makeArray(0..delta-1)
    ps := tail(primes(sqrt(hi)))
    qs := map(qInit, ps)
    while lo < hi
        for i from 0 to delta-1
            sieve[i] := True
        for p,q in ps,qs
            for i from q to delta step p
                sieve[i] := False
        qs := map(qReset, ps, qs)
        for i,t from 0,lo+1 to delta-1,hi step 1,2
            if sieve[i]
                output t
        lo := lo + 2 * delta
When called as primesRange(100, 200, 10), the sieving primes ps are [3, 5, 7, 11, 13]; qs is initially [2, 2, 2, 10, 8] corresponding to smallest multiples 105, 105, 105, 121 and 117, and is reset for the second segment to [1, 2, 6, 0, 11] corresponding to smallest multiples 123, 125, 133, 121 and 143.
You can see this program in action at http://ideone.com/iHYr1f. And in addition to the links shown above, if you are interested in programming with prime numbers I modestly recommend this essay at my blog.
It's just the ordinary sieve, applied one segment at a time.
The basic idea: say we have to find the prime numbers between 85 and 100.
We apply the traditional sieve, but in the fashion described below:
We take the first prime number, 2, divide the starting number by 2 (85/2) and round down to get p=42; multiplying by 2 again gives p=84, and from there onwards we keep adding 2 until the last number. So we have removed all the multiples of 2 (86, 88, 90, 92, 94, 96, 98, 100) in the range.
We take the next prime number, 3, divide the starting number by 3 (85/3) and round down to get p=28; multiplying by 3 again gives p=84, and from there onwards we keep adding 3 until the last number. So we have removed all the multiples of 3 (87, 90, 93, 96, 99) in the range.
Take the next prime number, 5, and so on.
Keep doing the above steps. You can get the primes (2, 3, 5, 7, ...) up to sqrt(n) using the traditional sieve, and then use them for the segmented sieve.
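For concreteness, a small C++ sketch of exactly these steps for the 85..100 example (my own illustration, not the answerer's code):
#include <iostream>

int main()
{
    const int lo = 85, hi = 100;
    const int primes[] = {2, 3, 5, 7};          // primes up to sqrt(100), from the traditional sieve
    bool composite[hi - lo + 1] = {};
    for (int p : primes)
    {
        int start = (lo / p) * p;               // e.g. (85 / 2) * 2 = 84
        if (start < lo) start += p;             // first multiple of p inside the range
        for (int m = start; m <= hi; m += p)
            composite[m - lo] = true;
    }
    for (int n = lo; n <= hi; ++n)
        if (!composite[n - lo])
            std::cout << n << " ";              // prints 89 97
    std::cout << "\n";
}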
There's a version of the Sieve based on priority queues that yields as many primes as you request, rather than all of them up to an upper bound. It's discussed in the classic paper "The Genuine Sieve of Eratosthenes" and googling for "sieve of eratosthenes priority queue" turns up quite a few implementations in various programming languages.
If someone would like to see a C++ implementation, here is mine:
#include <cmath>
#include <iostream>
#include <memory>
#include <vector>

void sito_delta( int delta, std::vector<int> &res)
{
std::unique_ptr<int[]> results(new int[delta+1]);
for(int i = 0; i <= delta; ++i)
results[i] = 1;
int pierw = sqrt(delta);
for (int j = 2; j <= pierw; ++j)
{
if(results[j])
{
for (int k = 2*j; k <= delta; k+=j)
{
results[k]=0;
}
}
}
for (int m = 2; m <= delta; ++m)
if (results[m])
{
res.push_back(m);
std::cout<<","<<m;
}
};
void sito_segment(int n,std::vector<int> &fiPri)
{
int delta = sqrt(n);
if (delta>10)
{
sito_segment(delta,fiPri);
// COmpute using fiPri as primes
// n=n,prime = fiPri;
std::vector<int> prime=fiPri;
int offset = delta;
int low = offset;
int high = offset * 2;
while (low < n)
{
if (high >=n ) high = n;
int mark[offset+1];
for (int s=0;s<=offset;++s)
mark[s]=1;
for(int j = 0; j< prime.size(); ++j)
{
int lowMinimum = (low/prime[j]) * prime[j];
if(lowMinimum < low)
lowMinimum += prime[j];
for(int k = lowMinimum; k<=high;k+=prime[j])
mark[k-low]=0;
}
for(int i = low; i <= high; i++)
if(mark[i-low])
{
fiPri.push_back(i);
std::cout<<","<<i;
}
low=low+offset;
high=high+offset;
}
}
else
{
std::vector<int> prime;
sito_delta(delta, prime);
//
fiPri = prime;
//
int offset = delta;
int low = offset;
int high = offset * 2;
// Process segments one by one
while (low < n)
{
if (high >= n) high = n;
int mark[offset+1];
for (int s = 0; s <= offset; ++s)
mark[s] = 1;
for (int j = 0; j < prime.size(); ++j)
{
// find the minimum number in [low..high] that is
// multiple of prime[i] (divisible by prime[j])
int lowMinimum = (low/prime[j]) * prime[j];
if(lowMinimum < low)
lowMinimum += prime[j];
//Mark multiples of prime[i] in [low..high]
for (int k = lowMinimum; k <= high; k+=prime[j])
mark[k-low] = 0;
}
for (int i = low; i <= high; i++)
if(mark[i-low])
{
fiPri.push_back(i);
std::cout<<","<<i;
}
low = low + offset;
high = high + offset;
}
}
};
int main()
{
std::vector<int> fiPri;
sito_segment(1013,fiPri);
}
Based on Swapnil Kumar's answer I wrote the following algorithm in C. It was built with mingw32-make.exe.
#include<math.h>
#include<stdio.h>
#include<stdlib.h>
int main()
{
const int MAX_PRIME_NUMBERS = 5000000;//The number of prime numbers we are looking for
long long *prime_numbers = malloc(sizeof(long long) * MAX_PRIME_NUMBERS);
prime_numbers[0] = 2;
prime_numbers[1] = 3;
prime_numbers[2] = 5;
prime_numbers[3] = 7;
prime_numbers[4] = 11;
prime_numbers[5] = 13;
prime_numbers[6] = 17;
prime_numbers[7] = 19;
prime_numbers[8] = 23;
prime_numbers[9] = 29;
const int BUFFER_POSSIBLE_PRIMES = 29 * 29;//Because the greatest prime number we have is 29 in the 10th position so I started with a block of 841 numbers
int qt_calculated_primes = 10;//10 because we initialized the array with the ten first primes
int possible_primes[BUFFER_POSSIBLE_PRIMES];//Will store the booleans to check valid primes
long long iteration = 0;//Used as multiplier to the range of the buffer possible_primes
int i;//Simple counter for loops
while(qt_calculated_primes < MAX_PRIME_NUMBERS)
{
for (i = 0; i < BUFFER_POSSIBLE_PRIMES; i++)
possible_primes[i] = 1;//set the number as prime
int biggest_possible_prime = sqrt((iteration + 1) * BUFFER_POSSIBLE_PRIMES);
int k = 0;
long long prime = prime_numbers[k];//First prime to be used in the check
while (prime <= biggest_possible_prime)//We don't need to check primes bigger than the square root
{
for (i = 0; i < BUFFER_POSSIBLE_PRIMES; i++)
if ((iteration * BUFFER_POSSIBLE_PRIMES + i) % prime == 0)
possible_primes[i] = 0;
if (++k == qt_calculated_primes)
break;
prime = prime_numbers[k];
}
for (i = 0; i < BUFFER_POSSIBLE_PRIMES; i++)
if (possible_primes[i])
{
if ((qt_calculated_primes < MAX_PRIME_NUMBERS) && ((iteration * BUFFER_POSSIBLE_PRIMES + i) != 1))
{
prime_numbers[qt_calculated_primes] = iteration * BUFFER_POSSIBLE_PRIMES + i;
printf("%lld\n", prime_numbers[qt_calculated_primes]);
qt_calculated_primes++;
} else if (!(qt_calculated_primes < MAX_PRIME_NUMBERS))
break;
}
iteration++;
}
return 0;
}
It sets a maximum count of prime numbers to be found, then an array is initialized with known primes like 2, 3, 5, ..., 29. Next we make a buffer that stores the segments of possible primes; this buffer can't be greater than the square of the greatest initial prime, which in this case is 29.
I'm sure there are plenty of optimizations that could improve the performance, like parallelizing the segment analysis and skipping numbers that are multiples of 2, 3 and 5, but it serves as an example of low memory consumption.
A number is prime if none of the smaller prime numbers divides it. Since we iterate over the prime numbers in order, we already marked all numbers, who are divisible by at least one of the prime numbers, as divisible. Hence if we reach a cell and it is not marked, then it isn't divisible by any smaller prime number and therefore has to be prime.
Remember these points:
// Generate all prime numbers up to MAX
// Create an array of size (R-L+1): true means prime, false means composite
#include<bits/stdc++.h>
using namespace std;
#define MAX 100001
vector<int>* sieve(){
bool isPrime[MAX];
for(int i=0;i<MAX;i++){
isPrime[i]=true;
}
for(int i=2;i*i<MAX;i++){
if(isPrime[i]){
for(int j=i*i;j<MAX;j+=i){
isPrime[j]=false;
}
}
}
vector<int>* primes = new vector<int>();
primes->push_back(2);
for(int i=3;i<MAX;i+=2){
if(isPrime[i]){
primes->push_back(i);
}
}
return primes;
}
void printPrimes(long long l, long long r, vector<int>*&primes){
bool isprimes[r-l+1];
for(int i=0;i<=r-l;i++){
isprimes[i]=true;
}
for(int i=0;primes->at(i)*(long long)primes->at(i)<=r;i++){
int currPrimes=primes->at(i);
//just smaller or equal value to l
long long base =(l/(currPrimes))*(currPrimes);
if(base<l){
base=base+currPrimes;
}
//mark all multiplies within L to R as false
for(long long j=base;j<=r;j+=currPrimes){
isprimes[j-l]=false;
}
//there may be a case where base is itself a prime number
if(base==currPrimes){
isprimes[base-l]= true;
}
}
for(int i=0;i<=r-l;i++){
if(isprimes[i]==true){
cout<<i+l<<endl;
}
}
}
int main(){
vector<int>* primes=sieve();
int t;
cin>>t;
while(t--){
long long l,r;
cin>>l>>r;
printPrimes(l,r,primes);
}
return 0;
}

How to check if an integer is a power of 3?

I saw this question, and pop up this idea.
There exists a constant time (pretty fast) method for integers of limited size (e.g. 32-bit integers).
Note that for an integer N that is a power of 3 the following is true:
For any M <= N that is a power of 3, M divides N.
For any M <= N that is not a power of 3, M does not divide N.
The biggest power of 3 that fits into 32 bits is 3486784401 (3^20). This gives the following code:
bool isPower3(std::uint32_t value) {
return value != 0 && 3486784401u % value == 0;
}
Similarly for signed 32 bits it is 1162261467 (3^19):
bool isPower3(std::int32_t value) {
return value > 0 && 1162261467 % value == 0;
}
In general the magic number is:
pow(3, floor(log(MAX) / log(3)))
Careful with floating point rounding errors, use a math calculator like Wolfram Alpha to calculate the constant. For example for 2^63-1 (signed int64) both C++ and Java give 4052555153018976256, but the correct value is 4052555153018976267.
while (n % 3 == 0) {
n /= 3;
}
return n == 1;
Note that 1 is the zeroth power of three.
Edit: You also need to check for zero before the loop, as the loop will not terminate for n = 0 (thanks to Bruno Rothgiesser).
I find myself slightly thinking that if by 'integer' you mean 'signed 32-bit integer', then (pseudocode)
return (n == 1)
or (n == 3)
or (n == 9)
...
or (n == 1162261467)
has a certain beautiful simplicity to it (the last number is 3^19, so there aren't an absurd number of cases). Even for an unsigned 64-bit integer there would still be only 41 cases (thanks @Alexandru for pointing out my brain-slip). And of course this would be impossible for arbitrary-precision arithmetic...
I'm surprised at this. Everyone seems to have missed the fastest algorithm of all.
The following algorithm is faster on average - and dramatically faster in some cases - than a simple while(n%3==0) n/=3; loop:
bool IsPowerOfThree(uint n)
{
// Optimizing lines to handle the most common cases extremely quickly
if(n%3 != 0) return n==1;
if(n%9 != 0) return n==3;
// General algorithm - works for any uint
uint r;
n = Math.DivRem(n, 59049, out r); if(n!=0 && r!=0) return false;
n = Math.DivRem(n+r, 243, out r); if(n!=0 && r!=0) return false;
n = Math.DivRem(n+r, 27, out r); if(n!=0 && r!=0) return false;
n += r;
return n==1 || n==3 || n==9;
}
The numeric constants in the code are 3^10, 3^5, and 3^3.
Performance calculations
In modern CPUs, DivRem is often a single instruction that takes one cycle. On others it expands to a div followed by a mul and an add, which would take more like three cycles altogether. Each step of the general algorithm looks long but it actually consists only of: DivRem, cmp, cmove, cmp, cand, cjmp, add. There is a lot of parallelism available, so on a typical two-way superscalar processor each step will likely execute in about 4 clock cycles, giving a guaranteed worst-case execution time of about 25 clock cycles.
If input values are evenly distributed over the range of UInt32, here are the probabilities associated with this algorithm:
Return in or before the first optimizing line: 66% of the time
Return in or before the second optimizing line: 89% of the time
Return in or before the first general algorithm step: 99.998% of the time
Return in or before the second general algorithm step: 99.99998% of the time
Return in or before the third general algorithm step: 99.999997% of the time
This algorithm outperforms the simple while(n%3==0) n/=3 loop, which has the following probabilities:
Return in the first iteration: 66% of the time
Return in the first two iterations: 89% of the time
Return in the first three iterations: 97% of the time
Return in the first four iterations: 98.8% of the time
Return in the first five iterations: 99.6% of the time ... and so on to ...
Return in the first twelve iterations: 99.9998% of the time ... and beyond ...
What is perhaps even more important, this algorithm handles midsize and large powers of three (and multiples thereof) much more efficiently: In the worst case the simple algorithm will consume over 100 CPU cycles because it will loop 20 times (41 times for 64 bits). The algorithm I present here will never take more than about 25 cycles.
Extending to 64 bits
Extending the above algorithm to 64 bits is trivial - just add one more step. Here is a 64 bit version of the above algorithm optimized for processors without efficient 64 bit division:
bool IsPowerOfThree(ulong nL)
{
// General algorithm only
ulong rL;
nL = Math.DivRem(nL, 3486784401, out rL); if(nL!=0 && rL!=0) return false;
nL = Math.DivRem(nL+rL, 59049, out rL); if(nL!=0 && rL!=0) return false;
uint n = (uint)nL + (uint)rL;
uint r;
n = Math.DivRem(n, 243, out r); if(n!=0 && r!=0) return false;
n = Math.DivRem(n+r, 27, out r); if(n!=0 && r!=0) return false;
n += r;
return n==1 || n==3 || n==9;
}
The new constant is 3^20. The optimization lines are omitted from the top of the method because under our assumption that 64 bit division is slow, they would actually slow things down.
Why this technique works
Say I want to know if "100000000000000000" is a power of 10. I might follow these steps:
I divide by 10^10 and get a quotient of 10000000 and a remainder of 0. These add to 10000000.
I divide by 10^5 and get a quotient of 100 and a remainder of 0. These add to 100.
I divide by 10^3 and get a quotient of 0 and a remainder of 100. These add to 100.
I divide by 10^2 and get a quotient of 1 and a remainder of 0. These add to 1.
Because I started with a power of 10, every time I divided by a power of 10 I ended up with either a zero quotient or a zero remainder. Had I started out with anything except a power of 10 I would have sooner or later ended up with a nonzero quotient or remainder.
In this example I selected exponents of 10, 5, and 3 to match the code provided previously, and added 2 just for the heck of it. Other exponents would also work: There is a simple algorithm for selecting the ideal exponents given your maximum input value and the maximum power of 10 allowed in the output, but this margin does not have enough room to contain it.
NOTE: You may have been thinking in base ten throughout this explanation, but the entire explanation above can be read and understood identically if you're thinking in base three, except the exponents would have been expressed differently (instead of "10", "5", "3" and "2" I would have to say "101", "12", "10" and "2").
This is a summary of all the good answers to this question, and the performance figures can be found in the LeetCode article.
1. Loop Iteration
Time complexity O(log(n)), space complexity O(1)
public boolean isPowerOfThree(int n) {
if (n < 1) {
return false;
}
while (n % 3 == 0) {
n /= 3;
}
return n == 1;
}
2. Base Conversion
Convert the integer to a base 3 number, and check if it is written as a leading 1 followed by all 0s. It is inspired by the solution for checking whether a number is a power of 2 by doing n & (n - 1) == 0.
Time complexity: O(log(n)) depending on language and compiler, space complexity: O(log(n))
public boolean isPowerOfThree(int n) {
return Integer.toString(n, 3).matches("^10*$");
}
3. Mathematics
If n = 3^i, then i = log(n) / log(3), and thus we come to the solution
Time complexity: depending on language and compiler, space complexity: O(1)
public boolean isPowerOfThree(int n) {
return (Math.log(n) / Math.log(3) + epsilon) % 1 <= 2 * epsilon;
}
4. Integer Limitations
Because 3^19 = 1162261467 is the largest power of 3 that fits in a 32-bit integer, we can do
Time complexity: O(1), space complexity: O(1)
public boolean isPowerOfThree(int n) {
return n > 0 && 1162261467 % n == 0;
}
5. Integer Limitations with Set
The idea is similar to #4 but uses a set to store all possible powers of 3 (from 3^0 to 3^19). It makes the code more readable; a sketch follows.
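For illustration, a sketch of idea #5 in C++ (rather than the Java used above; my own code):
#include <unordered_set>

bool isPowerOfThree(int n)
{
    // All powers of three that fit in a signed 32-bit int: 3^0 .. 3^19.
    static const std::unordered_set<int> powers = [] {
        std::unordered_set<int> s;
        for (long long p = 1; p <= 1162261467; p *= 3)
            s.insert((int)p);
        return s;
    }();
    return n > 0 && powers.count(n) > 0;
}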
6. Recursive (C++11)
This solution is specific to C++11, using template metaprogramming so that the compiler will replace the call isPowerOf3<Your Input>::cValue with the calculated result.
Time complexity: O(1), space complexity: O(1)
template<int N>
struct isPowerOf3 {
static const bool cValue = (N % 3 == 0) && isPowerOf3<N / 3>::cValue;
};
template<>
struct isPowerOf3<0> {
static const bool cValue = false;
};
template<>
struct isPowerOf3<1> {
static const bool cValue = true;
};
int main() {
cout<<isPowerOf3<1162261467>::cValue;
return 0;
}
if (log n) / (log 3) is integral then n is a power of 3.
Recursively divide by 3, check that the remainder is zero and re-apply to the quotient.
Note that 1 is a valid answer, as 3 to the zero power is 1; this is an edge case to beware of.
Very interesting question, I like the answer from starblue,
and this is a variation of his algorithm which will converge a little bit faster to the solution:
private bool IsPow3(int n)
{
if (n == 0) return false;
while (n % 9 == 0)
{
n /= 9;
}
return (n == 1 || n == 3);
}
Between powers of two there is at most one power of three.
So the following is a fast test:
Find the binary logarithm of n by finding the position of the leading 1 bit in the number. This is very fast, as modern processors have a special instruction for that. (Otherwise you can do it by bit twiddling, see Bit Twiddling Hacks).
Look up the potential power of three in a table indexed by this position and compare to n (if there is no power of three you can store any number with a different binary logarithm).
If they are equal return yes, otherwise no.
The runtime depends mostly on the time needed for accessing the table entry. If we are using machine integers the table is small, and probably in cache (we are using it many millions of times, otherwise this level of optimization wouldn't make sense).
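A hedged C++ sketch of this table-lookup idea (my own code, not starblue's; the GCC/Clang __builtin_clz intrinsic stands in for the "position of the leading 1 bit" instruction):
#include <array>
#include <cstdint>

bool is_power_of_three_bitlog(std::uint32_t n)
{
    // table[b] holds the unique power of three whose highest set bit is b,
    // or 0 if no power of three has that binary logarithm.
    static const auto table = [] {
        std::array<std::uint32_t, 32> t{};
        for (std::uint64_t p = 1; p <= 3486784401ULL; p *= 3)     // 3^0 .. 3^20
            t[63 - __builtin_clzll(p)] = static_cast<std::uint32_t>(p);
        return t;
    }();
    if (n == 0)
        return false;
    return table[31 - __builtin_clz(n)] == n;
}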
Here is a nice and fast implementation of Ray Burns' method in C:
bool is_power_of_3(unsigned x) {
if (x > 0x0000ffff)
x *= 0xb0cd1d99; // multiplicative inverse of 59049
if (x > 0x000000ff)
x *= 0xd2b3183b; // multiplicative inverse of 243
return x <= 243 && ((x * 0x71c5) & 0x5145) == 0x5145;
}
It uses the multiplicative inverse trick to first divide by 3^10 and then by 3^5. Finally, it needs to check whether the result is 1, 3, 9, 27, 81, or 243, which is done by some simple hashing that I found by trial-and-error.
On my CPU (Intel Sandy Bridge), it is quite fast, but not as fast as the method of starblue that uses the binary logarithm (which is implemented in hardware on that CPU). But on a CPU without such an instruction, or when lookup tables are undesirable, it might be an alternative.
How large is your input? With O(log(N)) memory you can do faster, O(log(log(N))): precompute the powers of 3 and then do a binary search on the precomputed values.
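A small sketch of that suggestion (my own code):
#include <algorithm>
#include <cstdint>
#include <vector>

bool is_power_of_three_bsearch(std::uint64_t n)
{
    // All powers of three representable in 64 bits (3^0 .. 3^40), built once.
    static const std::vector<std::uint64_t> powers = [] {
        std::vector<std::uint64_t> v;
        std::uint64_t p = 1;
        for (;;)
        {
            v.push_back(p);
            if (p > UINT64_MAX / 3) break;    // p * 3 would overflow
            p *= 3;
        }
        return v;
    }();
    return std::binary_search(powers.begin(), powers.end(), n);
}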
Simple and constant-time solution:
return n == power(3, round(log(n) / log(3)))
For really large numbers n, you can use the following math trick to speed up the operation of
n % 3 == 0
which is really slow and most likely the choke point of any algorithm that relies on repeated checking of remainders. You have to understand modular arithmetic to follow what I am doing, which is part of elementary number theory.
Let x = Σ_k a_k 2^k be the number of interest. We can let the upper bound of the sum be ∞, with the understanding that a_k = 0 for all k > M. Then
x ≡ Σ_k a_k 2^k ≡ Σ_k (a_{2k} 2^{2k} + a_{2k+1} 2^{2k+1}) ≡ Σ_k 2^{2k} (a_{2k} + 2 a_{2k+1}) ≡ Σ_k (a_{2k} + 2 a_{2k+1}) (mod 3),
since 2^{2k} ≡ 4^k ≡ 1^k ≡ 1 (mod 3).
Given a binary representation of a number x with 2n+2 bits as
x_0 x_1 x_2 ... x_{2n+1},
where x_k ∈ {0,1}, you can group the bits into even/odd pairs
(x_0 x_1) (x_2 x_3) ... (x_{2n} x_{2n+1}).
Let q denote the number of pairings of the form (1 0) and let r denote the number of pairings of the form (0 1). Then it follows from the equation above that 3 | x if and only if 3 | (q + 2r). Furthermore, you can show that 3|(q + 2r) if and only if q and r have the same remainder when divided by 3.
So an algorithm for determining whether a number is divisible by 3 could be done as follows
q = 0, r = 0
for i in {0, 1, ..., n}
    pair <- (x_{2i} x_{2i+1})
    if pair == (1 0)
        switch(q)
            case 0: q = 1; break;
            case 1: q = 2; break;
            case 2: q = 0; break;
    else if pair == (0 1)
        switch(r)
            case 0: r = 1; break;
            case 1: r = 2; break;
            case 2: r = 0; break;
return q == r
This algorithm is more efficient than the use of %.
--- Edit many years later ----
I took a few minutes to implement a rudimentary version of this in Python and checked that it agrees for all numbers up to 10^4. I include it below for reference. Obviously, to make use of this one would implement it as close to the hardware as possible. This scanning technique can be extended to divisibility by any number one wants by altering the derivation. I also conjecture that the "scanning" portion of the algorithm can be reformulated in a recursive O(log n) type formulation similar to an FFT, but I'd have to think on it.
#!/usr/bin/python

def bits2num(bits):
    num = 0
    for i, b in enumerate(bits):
        num += int(b) << i
    return num

def num2bits(num):
    base = 0
    bits = list()
    while True:
        op = 1 << base
        if op > num:
            break
        bits.append(op & num != 0)
        base += 1
    return "".join(map(str, map(int, bits)))[::-1]

def div3(bits):
    n = len(bits)
    if n % 2 != 0:
        bits = bits + '0'
        n = len(bits)
    assert n % 2 == 0
    q = 0
    r = 0
    for i in range(n/2):
        pair = bits[2*i:2*i+2]
        if pair == '10':
            if q == 0:
                q = 1
            elif q == 1:
                q = 2
            elif q == 2:
                q = 0
        elif pair == '01':
            if r == 0:
                r = 1
            elif r == 1:
                r = 2
            elif r == 2:
                r = 0
        else:
            pass
    return q == r

for i in range(10000):
    truth = (i % 3) == 0
    bits = num2bits(i)
    check = div3(bits)
    assert truth == check
You can do better than repeated division, which takes O(lg(X) * |division|) time. Essentially you do a binary search on powers of 3. Really we will be doing a binary search on N, where 3^N = input value. Setting the Pth binary digit of N corresponds to multiplying the candidate by 3^(2^P), and values of the form 3^(2^P) can be computed by repeated squaring.
Algorithm
Let the input value be X.
Generate a list L of repeated squared values which ends once you pass X.
Let your candidate value be T, initialized to 1.
For each E in reversed L, if T*E <= X then let T *= E.
Return T == X.
Complexity:
O(lg(lg(X)) * |multiplication|)
- Generating and iterating over L takes lg(lg(X)) iterations, and multiplication is the most expensive operation in an iteration.
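A hedged C++ sketch following the listed steps (my own code; the divisions appear only as overflow guards for the comparisons):
#include <cstdint>
#include <vector>

bool is_power_of_three_squares(std::uint64_t x)
{
    if (x == 0) return false;
    // Step 1: the list L of repeated squares 3, 3^2, 3^4, 3^8, ... up to x.
    std::vector<std::uint64_t> squares;
    for (std::uint64_t e = 3; e <= x; e *= e)
    {
        squares.push_back(e);
        if (e > x / e) break;                 // e*e would exceed x (or overflow)
    }
    // Steps 2-4: greedily multiply the candidate T by the largest squares first.
    std::uint64_t t = 1;
    for (auto it = squares.rbegin(); it != squares.rend(); ++it)
        if (t <= x / *it)                     // i.e. t * (*it) <= x, overflow-safe
            t *= *it;
    return t == x;                            // T equals X exactly for powers of 3
}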
The fastest solution is either testing if n > 0 && 3**19 % n == 0 as given in another answer or perfect hashing (below). First I'm giving two multiplication-based solutions.
Multiplication
I wonder why everybody missed that multiplication is much faster than division:
for (int i = 0, pow = 1; i <= 19; ++i, pow *= 3) {
    if (pow >= n) {
        return pow == n;
    }
}
return false;
Just try all powers, stop when it grew too big. Avoid overflow as 3**19 = 0x4546B3DB is the biggest power fitting in signed 32-bit int.
Multiplication with binary search
Binary search could look like
int pow = 1;
int next = pow * 6561; // 3**8
if (n >= next) pow = next;
next = pow * 81; // 3**4
if (n >= next) pow = next;
next = pow * 81; // 3**4; REPEATED
if (n >= next) pow = next;
next = pow * 9; // 3**2
if (n >= next) pow = next;
next = pow * 3; // 3**1
if (n >= next) pow = next;
return pow == n;
One step is repeated, so that the maximum exponent 19 = 8+4+4+2+1 can exactly be reached.
Perfect hashing
There are 20 powers of three fitting into a signed 32-bit int, so we take a table of 32 elements. With some experimentation, I found the perfect hash function
def hash(x):
    return (x ^ (x >> 1) ^ (x >> 2)) & 31
mapping each power to a distinct index between 0 and 31. The remaining stuff is trivial:
# Create a table and fill it with some power of three.
table = [1 for i in range(32)]
# Fill the buckets.
for n in range(20):
    table[hash(3**n)] = 3**n
Now we have
table = [
1162261467, 1, 3, 729, 14348907, 1, 1, 1,
1, 1, 19683, 1, 2187, 81, 1594323, 9,
27, 43046721, 129140163, 1, 1, 531441, 243, 59049,
177147, 6561, 1, 4782969, 1, 1, 1, 387420489]
and can test very fast via
def isPowerOfThree(x):
    return table[hash(x)] == x
Your question is fairly easy to answer by defining a simple function to run the check for you. The example implementation shown below is written in Python but should not be difficult to rewrite in other languages if needed. Unlike the last version of this answer, the code shown below is far more reliable.
Python 3.6.0 (v3.6.0:41df79263a11, Dec 23 2016, 08:06:12) [MSC v.1900 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import math
>>> def power_of(number, base):
        return number == base ** round(math.log(number, base))

>>> base = 3
>>> for power in range(21):
        number = base ** power
        print(f'{number} is '
              f'{"" if power_of(number, base) else "not "}'
              f'a power of {base}.')
        number += 1
        print(f'{number} is '
              f'{"" if power_of(number, base) else "not "}'
              f'a power of {base}.')
        print()
1 is a power of 3.
2 is not a power of 3.
3 is a power of 3.
4 is not a power of 3.
9 is a power of 3.
10 is not a power of 3.
27 is a power of 3.
28 is not a power of 3.
81 is a power of 3.
82 is not a power of 3.
243 is a power of 3.
244 is not a power of 3.
729 is a power of 3.
730 is not a power of 3.
2187 is a power of 3.
2188 is not a power of 3.
6561 is a power of 3.
6562 is not a power of 3.
19683 is a power of 3.
19684 is not a power of 3.
59049 is a power of 3.
59050 is not a power of 3.
177147 is a power of 3.
177148 is not a power of 3.
531441 is a power of 3.
531442 is not a power of 3.
1594323 is a power of 3.
1594324 is not a power of 3.
4782969 is a power of 3.
4782970 is not a power of 3.
14348907 is a power of 3.
14348908 is not a power of 3.
43046721 is a power of 3.
43046722 is not a power of 3.
129140163 is a power of 3.
129140164 is not a power of 3.
387420489 is a power of 3.
387420490 is not a power of 3.
1162261467 is a power of 3.
1162261468 is not a power of 3.
3486784401 is a power of 3.
3486784402 is not a power of 3.
>>>
NOTE: The last revision has caused this answer to become nearly the same as TMS' answer.
Set based solution...
DECLARE @LastExponent smallint, @SearchCase decimal(38,0)
SELECT
@LastExponent = 79, -- 38 for bigint
@SearchCase = 729
;WITH CTE AS
(
SELECT
POWER(CAST(3 AS decimal(38,0)), ROW_NUMBER() OVER (ORDER BY c1.object_id)) AS Result,
ROW_NUMBER() OVER (ORDER BY c1.object_id) AS Exponent
FROM
sys.columns c1, sys.columns c2
)
SELECT
Result, Exponent
FROM
CTE
WHERE
Exponent <= @LastExponent
AND
Result = @SearchCase
With SET STATISTICS TIME ON it records the lowest possible time, 1 millisecond.
Another approach is to generate a table at compile time. The good thing is that you can extend this to powers of 4, 5, 6, 7, or whatever.
template<std::size_t... Is>
struct seq
{ };
template<std::size_t N, std::size_t... Is>
struct gen_seq : gen_seq<N-1, N-1, Is...>
{ };
template<std::size_t... Is>
struct gen_seq<0, Is...> : seq<Is...>
{ };
template<std::size_t N>
struct PowersOfThreeTable
{
std::size_t indexes[N];
std::size_t values[N];
static constexpr std::size_t size = N;
};
template<typename LambdaType, std::size_t... Is>
constexpr PowersOfThreeTable<sizeof...(Is)>
generatePowersOfThreeTable(seq<Is...>, LambdaType evalFunc)
{
return { {Is...}, {evalFunc(Is)...} };
}
template<std::size_t N, typename LambdaType>
constexpr PowersOfThreeTable<N> generatePowersOfThreeTable(LambdaType evalFunc)
{
return generatePowersOfThreeTable(gen_seq<N>(), evalFunc);
}
template<std::size_t Base, std::size_t Exp>
struct Pow
{
static constexpr std::size_t val = Base * Pow<Base, Exp-1ULL>::val;
};
template<std::size_t Base>
struct Pow<Base, 0ULL>
{
static constexpr std::size_t val = 1ULL;
};
template<std::size_t Base>
struct Pow<Base, 1ULL>
{
static constexpr std::size_t val = Base;
};
constexpr std::size_t tableFiller(std::size_t val)
{
return Pow<3ULL, val>::val;
}
bool isPowerOfThree(std::size_t N)
{
static constexpr unsigned tableSize = 41; // chosen by fair dice roll
static constexpr PowersOfThreeTable<tableSize> table =
generatePowersOfThreeTable<tableSize>(tableFiller);
for(auto a : table.values)
if(a == N)
return true;
return false;
}
I measured times (C#, Platform target x64) for some solutions.
using System;
class Program
{
static void Main()
{
var sw = System.Diagnostics.Stopwatch.StartNew();
for (uint n = ~0u; n > 0; n--) ;
Console.WriteLine(sw.Elapsed); // nada 1.1 s
sw.Restart();
for (uint n = ~0u; n > 0; n--) isPow3a(n);
Console.WriteLine(sw.Elapsed); // 3^20 17.3 s
sw.Restart();
for (uint n = ~0u; n > 0; n--) isPow3b(n);
Console.WriteLine(sw.Elapsed); // % / 10.6 s
Console.Read();
}
static bool isPow3a(uint n) // Elric
{
return n > 0 && 3486784401 % n == 0;
}
static bool isPow3b(uint n) // starblue
{
if (n > 0) while (n % 3 == 0) n /= 3;
return n == 1;
}
}
Another way (of splitting hairs).
using System;
class Program
{
static void Main()
{
Random rand = new Random(0); uint[] r = new uint[512];
for (int i = 0; i < 512; i++)
r[i] = (uint)(rand.Next(1 << 30)) << 2 | (uint)(rand.Next(4));
var sw = System.Diagnostics.Stopwatch.StartNew();
for (int i = 1 << 23; i > 0; i--)
for (int j = 0; j < 512; j++) ;
Console.WriteLine(sw.Elapsed); // 0.3 s
sw.Restart();
for (int i = 1 << 23; i > 0; i--)
for (int j = 0; j < 512; j++) isPow3c(r[j]);
Console.WriteLine(sw.Elapsed); // 10.6 s
sw.Restart();
for (int i = 1 << 23; i > 0; i--)
for (int j = 0; j < 512; j++) isPow3b(r[j]);
Console.WriteLine(sw.Elapsed); // 9.0 s
Console.Read();
}
static bool isPow3c(uint n)
{ return (n & 1) > 0 && 3486784401 % n == 0; }
static bool isPow3b(uint n)
{ if (n > 0) while (n % 3 == 0) n /= 3; return n == 1; }
}
Python program to check whether a number is a power of 3 or not.
def power(Num1):
    while Num1 % 3 == 0:
        Num1 //= 3   # integer division keeps Num1 exact
    return Num1 == 1
Num1 = int(input("Enter a Number: "))
print(power(Num1))
Python solution
from math import floor
from math import log
def IsPowerOf3(number):
    p = int(floor(log(number) / log(3)))
    power_floor = pow(3, p)
    power_ceil = power_floor * 3
    if power_floor == number or power_ceil == number:
        return True
    return False
This is much faster than the simple divide by 3 solution.
Proof: 3 ^ p = number
p log(3) = log(number) (taking log on both sides)
p = log(number) / log(3)
Here's a general algorithm for finding out if a number is a power of another number:
bool IsPowerOf(int n,int b)
{
if (n > 1)
{
while (n % b == 0)
{
n /= b;
}
}
return n == 1;
}
#include<iostream>
#include<string>
#include<cmath>
using namespace std;
int main()
{
int n, power=0;
cout<<"enter a number"<<endl;
cin>>n;
if (n>0){
for(int i=0; i<=n; i++)
{
int r=n%3;
n=n/3;
if (r==0){
power++;
}
else{
cout<<"not exactly power of 3";
return 0;
}
}
}
cout<<"the power is "<<power<<endl;
}
This is a constant-time method! Yes, O(1), for numbers of fixed length, say 32 bits.
Given that we need to check if an integer n is a power of 3, let us start thinking about this problem in terms of what information is already at hand.
1162261467 is the largest power of 3 that fits into a Java int.
1162261467 = 3^19 + 0
The given n can be expressed as [(a power of 3) + (some x)]. I think it is fairly elementary to prove that if x is 0 (which happens iff n is a power of 3), then 1162261467 % n = 0.
The general idea is that if X is some power of 3, X can be expressed as Y / 3^a, where a is some integer and X < Y. It follows the exact same principle for Y < X. The Y = X case is elementary.
So, to check whether a given integer n is a power of three, check n > 0 && 1162261467 % n == 0.
Python:
return n > 0 and 1162261467 % n == 0
OR Calculate log:
lg = round(log(n,3))
return 3**lg == n
The first approach is faster than the second one.
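For convenience, here are both snippets above wrapped into complete, runnable functions (a sketch; the function names are mine):
from math import log

def is_power_of_three_div(n):
    # 1162261467 == 3**19, the largest power of 3 that fits in a signed 32-bit int
    return n > 0 and 1162261467 % n == 0

def is_power_of_three_log(n):
    lg = round(log(n, 3))
    return 3**lg == n

assert all(is_power_of_three_div(3**k) for k in range(20))
assert not is_power_of_three_div(2 * 243)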

Algorithm to find Largest prime factor of a number

What is the best approach to calculating the largest prime factor of a number?
I'm thinking the most efficient would be the following:
Find lowest prime number that divides cleanly
Check if result of division is prime
If not, find next lowest
Go to 2.
I'm basing this assumption on it being easier to calculate the small prime factors. Is this about right? What other approaches should I look into?
Edit: I've now realised that my approach is futile if there are more than 2 prime factors in play, since step 2 fails when the result is a product of two other primes, therefore a recursive algorithm is needed.
Edit again: And now I've realised that this does still work, because the last found prime number has to be the highest one, therefore any further testing of the non-prime result from step 2 would result in a smaller prime.
Here's the best algorithm I know of (in Python)
def prime_factors(n):
    """Returns all the prime factors of a positive integer"""
    factors = []
    d = 2
    while n > 1:
        while n % d == 0:
            factors.append(d)
            n //= d          # integer division keeps n an int
        d = d + 1
    return factors
pfs = prime_factors(1000)
largest_prime_factor = max(pfs) # The largest element in the prime factor list
The above method runs in O(n) in the worst case (when the input is a prime number).
EDIT:
Below is the O(sqrt(n)) version, as suggested in the comments. Here is the code, once more:
def prime_factors(n):
    """Returns all the prime factors of a positive integer"""
    factors = []
    d = 2
    while n > 1:
        while n % d == 0:
            factors.append(d)
            n //= d
        d = d + 1
        if d*d > n:
            if n > 1: factors.append(n)
            break
    return factors
pfs = prime_factors(1000)
largest_prime_factor = max(pfs) # The largest element in the prime factor list
Actually there are several more efficient ways to find factors of big numbers (for smaller ones trial division works reasonably well).
One method which is very fast if the input number has two factors very close to its square root is known as Fermat factorisation. It makes use of the identity N = (a + b)(a - b) = a^2 - b^2 and is easy to understand and implement. Unfortunately it's not very fast in general.
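To make the idea concrete, here is a minimal sketch of Fermat factorisation (my own illustrative code, assuming an odd composite input):
import math

def fermat_factor(n):
    # assumes n is odd and composite; returns one pair of factors (a - b, a + b)
    a = math.isqrt(n)
    if a * a < n:
        a += 1
    while True:
        b2 = a * a - n
        b = math.isqrt(b2)
        if b * b == b2:
            return a - b, a + b   # n = a^2 - b^2 = (a - b)(a + b)
        a += 1

print(fermat_factor(5959))   # (59, 101), found quickly because both factors are near sqrt(5959)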
The best known method for factoring numbers up to 100 digits long is the Quadratic sieve. As a bonus, part of the algorithm is easily done with parallel processing.
Yet another algorithm I've heard of is Pollard's Rho algorithm. It's not as efficient as the Quadratic Sieve in general but seems to be easier to implement.
Once you've decided on how to split a number into two factors, here is the fastest algorithm I can think of to find the largest prime factor of a number:
Create a priority queue which initially stores the number itself. Each iteration, you remove the highest number from the queue, and attempt to split it into two factors (not allowing 1 to be one of those factors, of course). If this step fails, the number is prime and you have your answer! Otherwise you add the two factors into the queue and repeat.
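Here is a hedged sketch of that priority-queue idea (my own code; I use plain trial division as the "split into two factors" step, which the paragraph above deliberately leaves open):
import heapq

def smallest_factor(m):
    # returns the smallest factor > 1 of m, or m itself if m is prime
    d = 2
    while d * d <= m:
        if m % d == 0:
            return d
        d += 1
    return m

def largest_prime_factor(n):
    heap = [-n]                      # max-heap via negated values
    while heap:
        m = -heapq.heappop(heap)     # take the largest remaining number
        d = smallest_factor(m)
        if d == m:                   # m could not be split, so it is prime
            return m
        heapq.heappush(heap, -d)     # otherwise push both factors and repeat
        heapq.heappush(heap, -(m // d))

print(largest_prime_factor(600851475143))   # 6857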
My answer is based on Triptych's, but improves a lot on it. It is based on the fact that beyond 2 and 3, all the prime numbers are of the form 6n-1 or 6n+1.
var largestPrimeFactor;
if(n mod 2 == 0)
{
largestPrimeFactor = 2;
n = n / 2 while(n mod 2 == 0);
}
if(n mod 3 == 0)
{
largestPrimeFactor = 3;
n = n / 3 while(n mod 3 == 0);
}
multOfSix = 6;
while(multOfSix - 1 <= n)
{
if(n mod (multOfSix - 1) == 0)
{
largestPrimeFactor = multOfSix - 1;
n = n / largestPrimeFactor while(n mod largestPrimeFactor == 0);
}
if(n mod (multOfSix + 1) == 0)
{
largestPrimeFactor = multOfSix + 1;
n = n / largestPrimeFactor while(n mod largestPrimeFactor == 0);
}
multOfSix += 6;
}
I recently wrote a blog article explaining how this algorithm works.
I would venture that a method in which there is no need for a test for primality (and no sieve construction) would run faster than one which does use those. If that is the case, this is probably the fastest algorithm here.
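For readers who prefer runnable code, here is a straightforward Python rendering of the 6n-1 / 6n+1 pseudocode above (a sketch, not the author's original code):
def largest_prime_factor(n):
    largest = None
    for p in (2, 3):
        if n % p == 0:
            largest = p
            while n % p == 0:
                n //= p
    k = 6
    while k - 1 <= n:                # mirrors the pseudocode's loop bound
        for cand in (k - 1, k + 1):
            if n % cand == 0:
                largest = cand
                while n % cand == 0:
                    n //= cand
        k += 6
    return largest

print(largest_prime_factor(600851475143))   # 6857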
JavaScript code:
'option strict';
function largestPrimeFactor(val, divisor = 2) {
let square = (val) => Math.pow(val, 2);
while ((val % divisor) != 0 && square(divisor) <= val) {
divisor++;
}
return square(divisor) <= val
? largestPrimeFactor(val / divisor, divisor)
: val;
}
Usage Example:
let result = largestPrimeFactor(600851475143);
Similar to @Triptych's answer but also different. In this example no list or dictionary is used. The code is written in Ruby:
def largest_prime_factor(number)
i = 2
while number > 1
if number % i == 0
number /= i;
else
i += 1
end
end
return i
end
largest_prime_factor(600851475143)
# => 6857
All numbers can be expressed as the product of primes, eg:
102 = 2 x 3 x 17
712 = 2 x 2 x 2 x 89
You can find these by starting at 2 and simply continuing to divide for as long as the current number is still divisible:
712 / 2 = 356 .. 356 / 2 = 178 .. 178 / 2 = 89 .. 89 / 89 = 1
Using this method you don't have to actually calculate any primes: every factor you find will be prime, because you've already divided the number as far as possible by all the preceding numbers.
number = 712;
currNum = number; // the value we'll actually be working with
for (currFactor in 2 .. number) {
while (currNum % currFactor == 0) {
// keep on dividing by this number until we can divide no more!
currNum = currNum / currFactor // reduce the currNum
}
if (currNum == 1) return currFactor; // once it hits 1, we're done.
}
//this method skips unnecessary trial divisions and makes
//trial division more feasible for finding large primes
public static void main(String[] args)
{
long n= 1000000000039L; //this is a large prime number
long i = 2L;
int test = 0;
while (n > 1)
{
while (n % i == 0)
{
n /= i;
}
i++;
if(i*i > n && n > 1)
{
System.out.println(n); //prints n if it's prime
test = 1;
break;
}
}
if (test == 0)
System.out.println(i-1); //prints i-1, the largest prime factor
}
The simplest solution is a pair of mutually recursive functions.
The first function generates all the prime numbers:
Start with a list of all natural numbers greater than 1.
Remove all numbers that are not prime. That is, numbers that have prime factors other than themselves. See below.
The second function returns the prime factors of a given number n in increasing order.
Take a list of all the primes (see above).
Remove all the numbers that are not factors of n.
The largest prime factor of n is the last number given by the second function.
This algorithm requires a lazy list or a language (or data structure) with call-by-need semantics.
For clarification, here is one (inefficient) implementation of the above in Haskell:
import Control.Monad
-- All the primes
primes = 2 : filter (ap (<=) (head . primeFactors)) [3,5..]
-- Gives the prime factors of its argument
primeFactors = factor primes
  where factor [] n = []
        factor xs@(p:ps) n =
            if p*p > n then [n]
            else let (d,r) = divMod n p in
                 if r == 0 then p : factor xs d
                 else factor ps n
-- Gives the largest prime factor of its argument
largestFactor = last . primeFactors
Making this faster is just a matter of being more clever about detecting which numbers are prime and/or factors of n, but the algorithm stays the same.
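Here is a rough Python analogue of the same idea, with generators standing in for Haskell's lazy lists (my own sketch, and just as inefficient as the Haskell version above):
def primes():
    yield 2
    n = 3
    while True:
        if next(prime_factors(n)) == n:   # n is prime iff its smallest prime factor is n itself
            yield n
        n += 2

def prime_factors(n):
    for p in primes():
        if p * p > n:
            yield n                       # whatever is left is prime
            return
        while n % p == 0:
            yield p
            n //= p
            if n == 1:
                return

def largest_prime_factor(n):
    return list(prime_factors(n))[-1]

print(largest_prime_factor(600851475143))   # 6857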
n = abs(number);
result = 1;
if (n mod 2 == 0) {
result = 2;
while (n mod 2 == 0) n /= 2;
}
for(i=3; i<=sqrt(n); i+=2) {
if (n mod i == 0) {
result = i;
while (n mod i == 0) n /= i;
}
}
return max(n,result)
There are some modulo tests that are superfluous, as n can never be divisible by 6 if all factors 2 and 3 have been removed. You could allow only primes for i, which is shown in several other answers here.
You could actually intertwine the sieve of Eratosthenes here:
First create the list of integers up to sqrt(n).
In the for loop mark all multiples of i up to the new sqrt(n) as not prime, and use a while loop instead.
Set i to the next prime number in the list.
Also see this question.
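A simplified sketch along those lines (my own code): rather than literally interleaving the marking step with the division loop, it sieves the primes up to sqrt(n) first and then trial-divides by those primes only.
def largest_prime_factor(n):
    limit = int(n ** 0.5) + 1
    sieve = [True] * (limit + 1)
    primes = []
    for i in range(2, limit + 1):
        if sieve[i]:
            primes.append(i)
            for j in range(i * i, limit + 1, i):
                sieve[j] = False
    result = 1
    for p in primes:
        if p * p > n:
            break
        while n % p == 0:
            result = p
            n //= p
    return n if n > 1 else result     # any leftover n > 1 is itself a prime factor

print(largest_prime_factor(600851475143))   # 6857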
I'm aware this is not a fast solution. I'm posting it as a hopefully easier-to-understand slow solution.
public static long largestPrimeFactor(long n) {
// if n is composite, its smallest factor is at most sqrt(n)
long sqrt = (long)Math.ceil(Math.sqrt((double)n));
long largest = -1;
for(long i = 2; i <= sqrt; i++) {
if(n % i == 0) {
long test = largestPrimeFactor(n/i);
if(test > largest) {
largest = test;
}
}
}
if(largest != -1) {
return largest;
}
// number is prime
return n;
}
Python approach: recursively remove prime factors from the number
def primef(n):
    if n <= 3:
        return n
    if n % 2 == 0:
        return primef(n//2)
    elif n % 3 == 0:
        return primef(n//3)
    else:
        for i in range(5, int((n)**0.5) + 1, 6):
            #print i
            if n % i == 0:
                return primef(n//i)
            if n % (i + 2) == 0:
                return primef(n//(i+2))
        return n
I am using an algorithm that keeps dividing the number by its current prime factor.
My solution in Python 3:
def PrimeFactor(n):
    m = n
    while n % 2 == 0:
        n = n//2
    if n == 1:            # check whether 2 is the only (and thus largest) prime factor
        return 2
    i = 3
    sqrt = int(m**(0.5))  # loop till square root of the number
    last = 0              # to store the last prime factor, i.e. the largest prime factor
    while i <= sqrt:
        while n % i == 0:
            n = n//i      # reduce the number by dividing it by its prime factor
            last = i
        i += 2
    if n > last:          # the remaining number (n) is also a factor of the number
        return n
    else:
        return last
print(PrimeFactor(int(input())))
Input : 10
Output : 5
Input : 600851475143
Output : 6857
Inspired by your question I decided to implement my own version of factorization (and of finding the largest prime factor) in Python.
Probably the simplest to implement, yet quite efficient, factoring algorithm that I know is Pollard's Rho algorithm. It has a running time of at most O(N^(1/4)), which is much faster than the O(N^(1/2)) time of the trial division algorithm. Both algorithms have these running times only in the case of composite (non-prime) numbers, which is why a primality test should be used to filter out prime (non-factorable) numbers.
I used the following algorithms in my code: Fermat Primality Test ..., Pollard's Rho Algorithm ..., Trial Division Algorithm. The Fermat primality test is used before running Pollard's Rho in order to filter out prime numbers. Trial division is used as a fallback because Pollard's Rho may, in very rare cases, fail to find a factor, especially for some small numbers.
Obviously, after fully factorizing a number into a sorted list of prime factors, the largest prime factor will be the last element in that list. In the general case (for any random number) I don't know of any other way to find the largest prime factor besides fully factorizing the number.
As an example, my code factors the first 190 fractional digits of Pi; it factorizes this number within 1 second and shows the largest prime factor, which is 165 digits (545 bits) in size!
Try it online!
def is_fermat_probable_prime(n, *, trials = 32):
    # https://en.wikipedia.org/wiki/Fermat_primality_test
    import random
    if n <= 16:
        return n in (2, 3, 5, 7, 11, 13)
    for i in range(trials):
        if pow(random.randint(2, n - 2), n - 1, n) != 1:
            return False
    return True

def pollard_rho_factor(N, *, trials = 16):
    # https://en.wikipedia.org/wiki/Pollard%27s_rho_algorithm
    import random, math
    for j in range(trials):
        i, stage, y, x = 0, 2, 1, random.randint(1, N - 2)
        while True:
            r = math.gcd(N, x - y)
            if r != 1:
                break
            if i == stage:
                y = x
                stage <<= 1
            x = (x * x + 1) % N
            i += 1
        if r != N:
            return [r, N // r]
    return [N] # Pollard-Rho failed

def trial_division_factor(n, *, limit = None):
    # https://en.wikipedia.org/wiki/Trial_division
    fs = []
    while n & 1 == 0:
        fs.append(2)
        n >>= 1
    d = 3
    while d * d <= n and (limit is None or d <= limit):  # parentheses so the limit check applies only when a limit is given
        q, r = divmod(n, d)
        if r == 0:
            fs.append(d)
            n = q
        else:
            d += 2
    if n > 1:
        fs.append(n)
    return fs

def factor(n):
    if n <= 1:
        return []
    if is_fermat_probable_prime(n):
        return [n]
    fs = trial_division_factor(n, limit = 1 << 12)
    if len(fs) >= 2:
        return sorted(fs[:-1] + factor(fs[-1]))
    fs = pollard_rho_factor(n)
    if len(fs) >= 2:
        return sorted([e1 for e0 in fs for e1 in factor(e0)])
    return trial_division_factor(n)

def demo():
    import time, math
    # http://www.math.com/tables/constants/pi.htm
    # pi = 3.
    # 1415926535 8979323846 2643383279 5028841971 6939937510 5820974944 5923078164 0628620899 8628034825 3421170679
    # 8214808651 3282306647 0938446095 5058223172 5359408128 4811174502 8410270193 8521105559 6446229489 5493038196
    # n = first 190 fractional digits of Pi
    n = 1415926535_8979323846_2643383279_5028841971_6939937510_5820974944_5923078164_0628620899_8628034825_3421170679_8214808651_3282306647_0938446095_5058223172_5359408128_4811174502_8410270193_8521105559_6446229489
    print('Number:', n)
    tb = time.time()
    fs = factor(n)
    print('All Prime Factors:', fs)
    print('Largest Prime Factor:', f'({math.log2(fs[-1]):.02f} bits, {len(str(fs[-1]))} digits)', fs[-1])
    print('Time Elapsed:', round(time.time() - tb, 3), 'sec')

if __name__ == '__main__':
    demo()
Output:
Number: 1415926535897932384626433832795028841971693993751058209749445923078164062862089986280348253421170679821480865132823066470938446095505822317253594081284811174502841027019385211055596446229489
All Prime Factors: [3, 71, 1063541, 153422959, 332958319, 122356390229851897378935483485536580757336676443481705501726535578690975860555141829117483263572548187951860901335596150415443615382488933330968669408906073630300473]
Largest Prime Factor: (545.09 bits, 165 digits) 122356390229851897378935483485536580757336676443481705501726535578690975860555141829117483263572548187951860901335596150415443615382488933330968669408906073630300473
Time Elapsed: 0.593 sec
Here is my attempt in C#. The last printout is the largest prime factor of the number. I checked and it works.
namespace Problem_Prime
{
class Program
{
static void Main(string[] args)
{
/*
The prime factors of 13195 are 5, 7, 13 and 29.
What is the largest prime factor of the number 600851475143 ?
*/
long x = 600851475143;
long y = 2;
while (y < x)
{
if (x % y == 0)
{
// y is a factor of x, but is it prime
if (IsPrime(y))
{
Console.WriteLine(y);
}
x /= y;
}
y++;
}
Console.WriteLine(y);
Console.ReadLine();
}
static bool IsPrime(long number)
{
//check for evenness
if (number % 2 == 0)
{
if (number == 2)
{
return true;
}
return false;
}
//don't need to check past the square root
long max = (long)Math.Sqrt(number);
for (int i = 3; i <= max; i += 2)
{
if ((number % i) == 0)
{
return false;
}
}
return true;
}
}
}
#python implementation
import math
n = 600851475143
i = 2
factors = set([])
while i < math.sqrt(n):
    while n % i == 0:
        n = n/i
        factors.add(i)
    i += 1
factors.add(n)
largest = max(factors)
print factors
print largest
Calculates the largest prime factor of a number using recursion in C++. The working of the code is explained in the inline comments:
int getLargestPrime(int number) {
int factor = number; // assumes that the largest prime factor is the number itself
for (int i = 2; (i*i) <= number; i++) { // iterates to the square root of the number till it finds the first(smallest) factor
if (number % i == 0) { // checks if the current number(i) is a factor
factor = max(i, number / i); // stores the larger number among the factors
break; // breaks the loop on when a factor is found
}
}
if (factor == number) // base case of recursion
return number;
return getLargestPrime(factor); // recursively calls itself
}
Here is my approach to quickly calculate the largest prime factor.
It is based on the fact that, because we divide x by each factor as soon as it is found, every factor we find is prime. So the only thing left is to return the largest factor found; it is already prime.
The code (Haskell):
f max' x i | i > x          = max'
           | x `rem` i == 0 = f i (x `div` i) i   -- Divide x by its factor
           | otherwise      = f max' x (i + 1)    -- Check for the next possible factor

g x = f 2 x 2
The following C++ algorithm is not the best one, but it works for numbers under a billion and it's pretty fast:
#include <iostream>
using namespace std;
// ------ is_prime ------
// Determines if the integer accepted is prime or not
bool is_prime(int n){
int i,count=0;
if(n==1 || n==2)
return true;
if(n%2==0)
return false;
for(i=1;i<=n;i++){
if(n%i==0)
count++;
}
if(count==2)
return true;
else
return false;
}
// ------ nextPrime -------
// Finds and returns the next prime number
int nextPrime(int prime){
bool a = false;
while (a == false){
prime++;
if (is_prime(prime))
a = true;
}
return prime;
}
// ----- M A I N ------
int main(){
int value = 13195;
int prime = 2;
bool done = false;
while (done == false){
if (value%prime == 0){
value = value/prime;
if (is_prime(value)){
done = true;
}
} else {
prime = nextPrime(prime);
}
}
cout << "Largest prime factor: " << value << endl;
}
Found this solution on the web by "James Wang"
public static int getLargestPrime( int number) {
if (number <= 1) return -1;
for (int i = number - 1; i > 1; i--) {
if (number % i == 0) {
number = i;
}
}
return number;
}
Prime factor using a sieve:
#include <bits/stdc++.h>
using namespace std;
#define N 10001
typedef long long ll;
bool visit[N];
vector<int> prime;
void sieve()
{
memset( visit , 0 , sizeof(visit));
for( int i=2;i<N;i++ )
{
if( visit[i] == 0)
{
prime.push_back(i);
for( int j=i*2; j<N; j=j+i )
{
visit[j] = 1;
}
}
}
}
void sol(long long n, vector<int>&prime)
{
ll ans = n;
for(int i=0; i<(int)prime.size() && prime[i]<=n; i++) // && (not ||) so we stop at the end of the prime list
{
while(n%prime[i]==0)
{
n=n/prime[i];
ans = prime[i];
}
}
ans = max(ans, n);
cout<<ans<<endl;
}
int main()
{
ll tc, n;
sieve();
cin>>n;
sol(n, prime);
return 0;
}
I guess there is no immediate way, short of performing a factorization as the examples above have done, i.e.
in each iteration you identify a "small" factor f of a number N, then continue with the reduced problem "find the largest prime factor of N' := N/f with factor candidates >= f".
From a certain size of f onward, the expected search time is lower if you run a primality test on the reduced N', which, if it passes, confirms that N' is already the largest prime factor of the initial N.
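Here is a sketch of that strategy (my own illustrative code, not taken from any answer above): strip small factors by trial division and, after each reduction, run a probabilistic primality test (a Miller-Rabin style test here, as an assumption on my part) on the remaining cofactor; once it passes, that cofactor is the answer.
import random

def is_probable_prime(n, trials = 20):
    # Miller-Rabin style probable-prime test
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(trials):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True

def largest_prime_factor(n):
    f = 2
    while n > 1 and not is_probable_prime(n):
        if n % f == 0:
            n //= f                   # reduce N to N' = N / f
        else:
            f += 1                    # try the next candidate factor
    return n

print(largest_prime_factor(600851475143))   # 6857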
Here is my attempt in Clojure. It only walks the odd numbers for prime? and only the primes for the prime factors, i.e. the sieve. Using lazy sequences helps produce the values just before they are needed.
(defn prime?
([n]
(let [oddNums (iterate #(+ % 2) 3)]
(prime? n (cons 2 oddNums))))
([n [i & is]]
(let [q (quot n i)
r (mod n i)]
(cond (< n 2) false
(zero? r) false
(> (* i i) n) true
:else (recur n is)))))
(def primes
(let [oddNums (iterate #(+ % 2) 3)]
(lazy-seq (cons 2 (filter prime? oddNums)))))
;; Sieve of Eratosthenes
(defn sieve
([n]
(sieve primes n))
([[i & is :as ps] n]
(let [q (quot n i)
r (mod n i)]
(cond (< n 2) nil
(zero? r) (lazy-seq (cons i (sieve ps q)))
(> (* i i) n) (when (> n 1) (lazy-seq [n]))
:else (recur is n)))))
(defn max-prime-factor [n]
(last (sieve n)))
Recursion in C
Algorithm could be
Check if n is a factor of t
Check if n is prime. If so, remember n
Increment n
Repeat until n > sqrt(t)
Here's an example of a (tail)recursive solution to the problem in C:
#include <stdio.h>
#include <stdbool.h>
bool is_factor(long int t, long int n){
return ( t%n == 0);
}
bool is_prime(long int n0, long int n1, bool acc){
if ( n1 * n1 > n0 || acc < 1 )
return acc;
else
return is_prime(n0, n1+2, acc && (n0%n1 != 0));
}
int gpf(long int t, long int n, long int acc){
if (n * n > t)
return acc;
if (is_factor(t, n)){
if (is_prime(n, 3, true))
return gpf(t, n+2, n);
else
return gpf(t, n+2, acc);
}
else
return gpf(t, n+2, acc);
}
int main(int argc, char ** argv){
printf("%d\n", gpf(600851475143, 3, 0));
return 0;
}
The solution is composed of three functions. One to test if the candidate is a factor, another to test if that factor is prime, and finally one to compose those two together.
Some key ideas here are:
1- Stopping the recursion at sqrt(600851475143)
2- Only testing odd numbers as candidate factors
3- Only testing candidate factors for primality against odd divisors
It seems to me that step #2 of the algorithm given isn't going to be all that efficient an approach. You have no reasonable expectation that it is prime.
Also, the previous answer suggesting the Sieve of Eratosthenes is utterly wrong. I just wrote two programs to factor 123456789. One was based on the Sieve, one was based on the following:
1) Test = 2
2) Current = Number to test
3) If Current Mod Test = 0 then
3a) Current = Current Div Test
3b) Largest = Test
3c) Goto 3.
4) Inc(Test)
5) If Test <= Current goto 3
6) Return Largest
This version was 90x faster than the Sieve.
The thing is, on modern processors the type of operation matters far less than the number of operations, not to mention that the algorithm above can run in cache, the Sieve can't. The Sieve uses a lot of operations striking out all the composite numbers.
Note, also, that my dividing out factors as they are identified reduces the space that must be tested.
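A direct Python rendering of the numbered steps above (a sketch; the variable names follow the pseudocode):
def largest_prime_factor(number):
    test = 2
    current = number
    largest = 1
    while test <= current:
        if current % test == 0:       # step 3: divide out this factor and remember it
            current //= test
            largest = test
        else:
            test += 1                 # step 4: try the next divisor
    return largest

print(largest_prime_factor(123456789))   # 3803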
Compute a list storing prime numbers first, e.g. 2, 3, 5, 7, 11, 13, ...
Every time you prime factorize a number, use the implementation by Triptych but iterate over this list of prime numbers rather than over the natural integers.
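A sketch of that suggestion (my own code): build the prime list with a small sieve, then reuse it for every factorization instead of trial-dividing by all integers.
def primes_up_to(limit):
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, limit + 1, i):
                sieve[j] = False
    return [i for i, is_p in enumerate(sieve) if is_p]

def prime_factors(n, prime_list):
    factors = []
    for p in prime_list:
        if p * p > n:
            break
        while n % p == 0:
            factors.append(p)
            n //= p
    if n > 1:
        factors.append(n)              # the leftover is prime, provided prime_list covers sqrt(n)
    return factors

primes = primes_up_to(1000000)         # reusable across many factorizations
print(max(prime_factors(13195, primes)))         # 29
print(max(prime_factors(600851475143, primes)))  # 6857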
With Java:
For int values:
public static int[] primeFactors(int value) {
int[] a = new int[31];
int i = 0, j;
int num = value;
while (num % 2 == 0) {
a[i++] = 2;
num /= 2;
}
j = 3;
while (j <= Math.sqrt(num) + 1) {
if (num % j == 0) {
a[i++] = j;
num /= j;
} else {
j += 2;
}
}
if (num > 1) {
a[i++] = num;
}
int[] b = Arrays.copyOf(a, i);
return b;
}
For long values:
static long[] getFactors(long value) {
long[] a = new long[63];
int i = 0;
long num = value;
while (num % 2 == 0) {
a[i++] = 2;
num /= 2;
}
long j = 3;
while (j <= Math.sqrt(num) + 1) {
if (num % j == 0) {
a[i++] = j;
num /= j;
} else {
j += 2;
}
}
if (num > 1) {
a[i++] = num;
}
long[] b = Arrays.copyOf(a, i);
return b;
}
This is probably not always faster, but it is more optimistic about finding a big prime divisor:
N is your number
If it is prime then return(N)
Calculate primes up until Sqrt(N)
Go through the primes in descending order (largest first)
If N is divisible by Prime then Return(Prime)
Edit: In step 3 you can use the Sieve of Eratosthenes or Sieve of Atkin or whatever you like, but by itself the sieve won't find you the biggest prime factor. (That's why I wouldn't choose SQLMenace's post as an official answer...)
Here is the same function @Triptych provided, rewritten as a generator and simplified slightly.
def primes(n):
    d = 2
    while (n > 1):
        while (n % d == 0):
            yield d
            n //= d
        d += 1
the max prime can then be found using:
n= 373764623
max(primes(n))
and a list of factors found using:
list(primes(n))
I think it would be good to store somewhere all possible primes smaller than n and just iterate through them to find the biggest divisor. You can get primes from prime-numbers.org.
Of course I assume that your number isn't too big :)
#include<stdio.h>
#include<conio.h>
#include<math.h>
#include <time.h>
factor(long int n)
{
long int i,j;
while(n>=4)
{
if(n%2==0) { n=n/2; i=2; }
else
{ i=3;
j=0;
while(j==0)
{
if(n%i==0)
{j=1;
n=n/i;
}
i=i+2;
}
i-=2;
}
}
return i;
}
void main()
{
clock_t start = clock();
long int n,sp;
clrscr();
printf("enter value of n");
scanf("%ld",&n);
sp=factor(n);
printf("largest prime factor is %ld",sp);
printf("Time elapsed: %f\n", ((double)clock() - start) / CLOCKS_PER_SEC);
getch();
}

Resources