Explanation of the following algorithm to find nCr modulo P - algorithm

I was trying to solve a problem involving large factorials modulo a prime, and found the following algorithm in someone else's solution:
long long factMod (long long n, long long p)
{
    long long ans = 1;
    while (n > 1)
    {
        long long cur = 1;
        for (long long i = 1; i < p; i++)
        {
            cur = (cur * i) % p;
        }
        ans = (ans * modPow(cur, n / p, p)) % p;
        for (long long i = 1; i <= n % p; i++)
        {
            ans = (ans * i) % p;
        }
        n /= p;
    }
    return (ans % p);
}

long long nChooseK(long long n, long long k, long long p)
{
    int num_degree = get_degree(n, p) - get_degree(n - k, p);
    int den_degree = get_degree(k, p);
    if (num_degree > den_degree) { return 0; }
    long long nFact = factMod(n, p);
    long long kFact = factMod(k, p);
    long long nMinusKFact = factMod(n - k, p);
    long long ans = (((nFact * modPow(kFact, p - 2, p)) % p) * modPow(nMinusKFact, p - 2, p)) % p;
    return ans;
}
I know the basics of number theory but can't seem to figure out how this works.
The nChooseK function appears to use the definition of combination [n!/((n-k)! k!)] with the modular inverse computed using Fermat's little theorem to replace the division. However, according to one of the answers, the factMod function does not actually compute the factorial. If this is the case, how does the nChooseK function work?

Yes, n! ≡ 0 mod p if and only if n ≥ p, but factMod isn't computing n! mod p; it's computing (n!/p^k) mod p, where k is the exponent of p in the prime factorization of n!, perhaps for the purpose of computing a binomial coefficient. Iteration i of the loop (counting from 0) counts the contribution of those factors among 1…n whose prime factorization includes p^i. The statement n /= p; yields the subproblem on the multiples of p.
The function get_degree(n, p) presumably returns the exponent of p in the prime factorization of n!. If get_degree(n, p) == get_degree(k, p) + get_degree(n - k, p), then the factors of p in numerator and denominator exactly cancel, and we can use factMod to account for the other factors. Otherwise, the number of combinations is divisible by p, so we return 0.
Since (p-1)! ≡ -1 mod p by Wilson's theorem, the first inner loop is redundant.
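For reference, here is a minimal Python sketch of what get_degree could look like, using Legendre's formula; the question does not show the actual helper, so this implementation is an assumption:

import math

def get_degree(n, p):
    # Legendre's formula: the exponent of prime p in n! is floor(n/p) + floor(n/p^2) + ...
    # (a guess at the helper used above; the question does not show its implementation)
    e = 0
    while n:
        n //= p
        e += n
    return e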

Related

Running time of GCD Algorithm?

unsigned int gcd(unsigned int n, unsigned int m)
{
    if (n == 0)
        return m;
    if (m == 0)
        return n;
    while (m != n)
    {
        if (n > m)
            n = n - m;
        else
            m = m - n;
    }
    return n;
}
Some pseudocode for an iterative GCD algorithm using a while loop. I see no places where anything is divided by 2, so I do not think it is logarithmic. Since the while loop runs for a time directly proportional to N, does that make it linear, i.e. O(N)?
I would say it is O(max(n, m)), because the big-O bound should be symmetric in n and m, since the algorithm is.
@PaulHankin improves this to O(max(n, m)/gcd(n, m)).
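For intuition, here is a small Python sketch (my own, assuming positive inputs) that counts the iterations of the subtractive loop above; for inputs like (1, 1000000) the count comes out close to max(n, m)/gcd(n, m):

def subtractive_gcd_steps(n, m):
    # counts iterations of the subtractive while-loop from the question (n, m > 0 assumed)
    steps = 0
    while m != n:
        if n > m:
            n -= m
        else:
            m -= n
        steps += 1
    return n, steps

print(subtractive_gcd_steps(1, 1000000))  # (1, 999999): roughly max(n, m)/gcd(n, m) iterations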

Another Pollard Rho Implementation

In an attempt to solve the 3rd problem on Project Euler (https://projecteuler.net/problem=3), I decided to implement Pollard's rho algorithm (at least part of it; I'm planning on including the cycle detection later). The odd thing is that it works for numbers such as 82123 (factor = 41) and 16843009 (factor = 257). However, when I try the Project Euler number 600851475143, I end up getting 71 when the largest prime factor is 6857. Here's my implementation (sorry for the wall of code and lack of type casting):
#include <iostream>
#include <math.h>
#include <vector>
using namespace std;

long long int gcd(long long int a, long long int b);
long long int f(long long int x);

int main()
{
    long long int i, x, y, N, factor, iterations = 0, counter = 0;
    vector<long long int> factors;
    factor = 1;
    x = 631;
    N = 600851475143;
    factors.push_back(x);
    while (factor == 1)
    {
        y = f(x);
        y = y % N;
        factors.push_back(y);
        cout << "\niteration" << iterations << ":\t";
        i = 0;
        while (factor == 1 && (i < factors.size() - 1))
        {
            factor = gcd(abs(factors.back() - factors[i]), N);
            cout << factor << " ";
            i++;
        }
        x = y;
        //factor = 2;
        iterations++;
    }
    system("PAUSE");
    return 0;
}

long long int gcd(long long int a, long long int b)
{
    long long int remainder;
    do
    {
        remainder = a % b;
        a = b;
        b = remainder;
    } while (remainder != 0);
    return a;
}

long long int f(long long int x)
{
    //x = x*x * 1024 + 32767;
    x = x*x + 1;
    return x;
}
Pollard's rho algorithm guarantees nothing. It doesn't guarantee to find the largest factor. It doesn't guarantee that any factor it finds is prime. It doesn't even guarantee to find a factor at all. The rho algorithm is probabilistic; it will probably find a factor, but not necessarily. Since your function returns a factor, it works.
That said, your implementation isn't very good. It's not necessary to store all previous values of the function and compute the gcd against each of them every time through the loop. Here is pseudocode for a better version of the function:
function rho(n)
    for c from 1 to infinity
        h, t := 1, 1
        repeat
            h := (h*h+c) % n  # the hare runs ...
            h := (h*h+c) % n  # ... twice as fast
            t := (t*t+c) % n  # as the tortoise
            g := gcd(t-h, n)
        while g == 1
        if g < n then return g
This function returns a single factor of n, which may be either prime or composite. It stores only two values of the random sequence, and stops when it finds a cycle (when g == n), restarting with a different random sequence (by incrementing c). Otherwise it keeps going until it finds a factor, which shouldn't take too long as long as you limit the input to 64-bit integers. Find more factors by applying rho to the remaining cofactor, or to the found factor itself if it is composite, stopping when all the prime factors have been found.
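Here is a rough Python transcription of that pseudocode, plus a small driver that recurses on the factors to produce a full factorization; the is_prime argument is a placeholder for whatever primality test you supply (e.g. Miller-Rabin), and the sketch assumes rho is only called on odd composite numbers:

import math

def rho(n):
    # transcription of the pseudocode above; assumes n is odd, composite and > 1
    c = 1
    while True:
        h = t = 1
        g = 1
        while g == 1:
            h = (h * h + c) % n   # the hare runs ...
            h = (h * h + c) % n   # ... twice as fast
            t = (t * t + c) % n   # as the tortoise
            g = math.gcd(t - h, n)
        if g < n:
            return g
        c += 1                    # cycle found without a factor: retry with a new constant

def prime_factors(n, is_prime):
    # 'is_prime' is a placeholder primality test you supply; strip factors of 2 beforehand
    if n == 1:
        return []
    if is_prime(n):
        return [n]
    d = rho(n)
    return sorted(prime_factors(d, is_prime) + prime_factors(n // d, is_prime))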
By the way, you don't need Pollard's rho algorithm to solve Project Euler #3; simple trial division is sufficient. This algorithm finds all the prime factors of a number, from which you can extract the largest:
function factors(n)
    f := 2
    while f * f <= n
        while n % f == 0
            print f
            n := n / f
        f := f + 1
    if n > 1 then print n

Fastest way to generate binomial coefficients

I need to calculate combinations for a number.
What is the fastest way to calculate nCp where n >> p?
I need a fast way to generate binomial coefficients for a polynomial equation, and I need to get the coefficients of all the terms and store them in an array.
(a+b)^n = a^n + nC1 a^(n-1) b + nC2 a^(n-2) b^2 + ... + nC(n-1) a b^(n-1) + b^n
What is the most efficient way to calculate nCp?
You can use dynamic programming in order to generate binomial coefficients.
You can create an array and then use an O(N^2) loop to fill it:
C[n, k] = C[n-1, k-1] + C[n-1, k];
where
C[n, 0] = C[n, n] = 1
After that, in your program you can get the C(n, k) value by just looking at your 2D array at indices [n, k].
UPDATE: something like that:
for (int k = 1; k <= K; k++) C[0][k] = 0;
for (int n = 0; n <= N; n++) C[n][0] = 1;

for (int n = 1; n <= N; n++)
    for (int k = 1; k <= K; k++)
        C[n][k] = C[n-1][k-1] + C[n-1][k];
where N and K are the maximum values of your n and k.
If you need to compute them for all n, Ribtoks's answer is probably the best.
For a single n, you're better off doing it like this:
C[0] = 1
for (int k = 0; k < n; ++k)
    C[k+1] = (C[k] * (n-k)) / (k+1)
The division is exact, if done after the multiplication.
And beware of overflow in C[k] * (n-k): use large enough integers.
If you want complete expansions for large values of n, FFT convolution might be the fastest way. In the case of a binomial expansion with equal coefficients (e.g. a series of fair coin tosses) and an even order (e.g. number of tosses) you can exploit symmetries thus:
Theory
Represent the results of two coin tosses (e.g. half the difference between the total number of heads and tails) with the expression A + A*cos(Pi*n/N). N is the number of samples in your buffer - a binomial expansion of even order O will have O+1 coefficients and require a buffer of N >= O/2 + 1 samples - n is the sample number being generated, and A is a scale factor that will usually be either 2 (for generating binomial coefficients) or 0.5 (for generating a binomial probability distribution).
Notice that, in frequency, this expression resembles the binomial distribution of those two coin tosses - there are three symmetrical spikes at positions corresponding to the number (heads-tails)/2. Since modelling the overall probability distribution of independent events requires convolving their distributions, we want to convolve our expression in the frequency domain, which is equivalent to multiplication in the time domain.
In other words, by raising our cosine expression for the result of two tosses to a power (e.g. to simulate 500 tosses, raise it to the power of 250 since it already represents a pair), we can arrange for the binomial distribution for a large number to appear in the frequency domain. Since this is all real and even, we can substitute the DCT-I for the DFT to improve efficiency.
Algorithm
decide on a buffer size, N, that is at least O/2 + 1 and can be conveniently DCTed
initialise it with the expression pow(A + A*cos(Pi*n/N),O/2)
apply the forward DCT-I
read out the coefficients from the buffer - the first number is the central peak where heads=tails, and subsequent entries correspond to symmetrical pairs successively further from the centre
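Here is a minimal Python sketch of those steps for the symmetric case with A = 2, using SciPy's unnormalized DCT-I; the buffer size, the 2*(N-1) normalization and the helper name are my own choices made to match SciPy's convention, not something prescribed above:

import numpy as np
from scipy.fft import dct

def central_binomials_via_dct(order, N=1024):
    # order must be even, and N comfortably larger than order/2 + 1
    M = N - 1
    n = np.arange(N)
    x = (2.0 + 2.0 * np.cos(np.pi * n / M)) ** (order // 2)   # initialise the buffer (A = 2)
    y = dct(x, type=1)                                        # forward DCT-I
    coeffs = np.rint(y / (2 * M)).astype(np.int64)            # undo SciPy's scaling, round to ints
    # coeffs[k] == C(order, order//2 - k): the central coefficient first, then the
    # symmetric pairs moving outward (valid while the values fit in 64-bit integers)
    return coeffs[: order // 2 + 1]

print(central_binomials_via_dct(10))   # C(10,5)..C(10,0): 252 210 120 45 10 1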
Accuracy
There's a limit to how high O can be before accumulated floating-point rounding errors rob you of accurate integer values for the coefficients, but I'd guess the number is pretty high. Double-precision floating-point can represent 53-bit integers with complete accuracy, and I'm going to ignore the rounding loss involved in the use of pow() because the generating expression will take place in FP registers, giving us an extra 11 bits of mantissa to absorb the rounding error on Intel platforms. So assuming we use a 1024-point DCT-I implemented via the FFT, that means losing 10 bits' accuracy to rounding error during the transform and not much else, leaving us with ~43 bits of clean representation. I don't know what order of binomial expansion generates coefficients of that size, but I dare say it's big enough for your needs.
Asymmetrical expansions
If you want the asymmetrical expansions for unequal coefficients of a and b, you'll need to use a two-sided (complex) DFT and a complex pow() function. Generate the expression A*A*e^(-Pi*i*n/N) + A*B + B*B*e^(+Pi*i*n/N) [using the complex pow() function to raise it to the power of half the expansion order] and DFT it. What you have in the buffer is, again, the central point (but not the maximum if A and B are very different) at offset zero, and it is followed by the upper half of the distribution. The upper half of the buffer will contain the lower half of the distribution, corresponding to heads-minus-tails values that are negative.
Notice that the source data is Hermitian symmetrical (the second half of the input buffer is the complex conjugate of the first), so this algorithm is not optimal and can be performed using a complex-to-complex FFT of half the required size for optimum efficiency.
Needless to say, all the complex exponentiation will chew more CPU time and hurt accuracy compared to the purely real algorithm for symmetrical distributions above.
This is my version:
def binomial(n, k):
    if k == 0:
        return 1
    elif 2*k > n:
        return binomial(n, n - k)
    else:
        e = n - k + 1
        for i in range(2, k + 1):
            e *= (n - k + i)
            e //= i
        return e
I recently wrote a piece of code that needed to compute a binomial coefficient about 10 million times. So I took a combined lookup-table/calculation approach that's still not too wasteful of memory. You might find it useful (and my code is in the public domain). The code is at
http://www.etceterology.com/fast-binomial-coefficients
It's been suggested that I inline the code here. A big honking lookup table seems like a waste, so here's the final function, and a Python script that generates the table:
#include <assert.h>

extern long long bctable[]; /* See below */

long long binomial(int n, int k) {
    int i;
    long long b;
    assert(n >= 0 && k >= 0);

    if (0 == k || n == k) return 1LL;
    if (k > n) return 0LL;
    if (k > (n - k)) k = n - k;
    if (1 == k) return (long long)n;

    if (n <= 54 && k <= 54) {
        return bctable[(((n - 3) * (n - 3)) >> 2) + (k - 2)];
    }
    /* Last resort: actually calculate */
    b = 1LL;
    for (i = 1; i <= k; ++i) {
        b *= (n - (k - i));
        if (b < 0) return -1LL; /* Overflow */
        b /= i;
    }
    return b;
}
#!/usr/bin/env python3
import sys

class App(object):
    def __init__(self, max):
        self.table = [[0 for k in range(max + 1)] for n in range(max + 1)]
        self.max = max

    def build(self):
        for n in range(self.max + 1):
            for k in range(self.max + 1):
                if k == 0: b = 1
                elif k > n: b = 0
                elif k == n: b = 1
                elif k == 1: b = n
                elif k > n - k: b = self.table[n][n - k]
                else:
                    b = self.table[n - 1][k] + self.table[n - 1][k - 1]
                self.table[n][k] = b

    def output(self, val):
        if val > 2**63: val = -1
        text = " {0}LL,".format(val)
        if self.column + len(text) > 76:
            print("\n ", end = "")
            self.column = 3
        print(text, end = "")
        self.column += len(text)

    def dump(self):
        count = 0
        print("long long bctable[] = {", end = "")
        self.column = 999
        for n in range(self.max + 1):
            for k in range(self.max + 1):
                if n < 4 or k < 2 or k > n - k:
                    continue
                self.output(self.table[n][k])
                count += 1
        print("\n}}; /* {0} Entries */".format(count))

    def run(self):
        self.build()
        self.dump()
        return 0

def main(args):
    return App(54).run()

if __name__ == "__main__":
    sys.exit(main(sys.argv))
If you really only need the case where n is much larger than p, one way to go would be to use Stirling's formula for the factorials (if n >> 1 and p is of order one, use Stirling to approximate n! and (n-p)!, keep p! as it is, etc.).
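A rough Python sketch of that idea (the exact arrangement below is my own): Stirling's approximation for the two large factorials, with p! kept exact:

import math

def approx_binom(n, p):
    # Stirling: ln m! ~ m*ln(m) - m + 0.5*ln(2*pi*m), applied to n! and (n-p)!; p! is exact
    stirling = lambda m: m * math.log(m) - m + 0.5 * math.log(2 * math.pi * m)
    log_c = stirling(n) - stirling(n - p) - math.log(math.factorial(p))
    return math.exp(log_c)

print(approx_binom(1000, 3))   # about 1.6617e8; the exact value is 166167000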
The fastest reasonable approximation in my own benchmarking is the approximation used by the Apache Commons Math library: http://commons.apache.org/proper/commons-math/apidocs/org/apache/commons/math3/special/Gamma.html#logGamma(double)
My colleagues and I tried to see if we could beat it while using exact calculations rather than approximations. All approaches failed miserably (many orders of magnitude slower) except one, which was 2-3 times slower. The best performing approach uses https://math.stackexchange.com/a/202559/123948; here is the code (in Scala):
var i: Int = 0
var binCoeff: Double = 1
while (i < k) {
  binCoeff *= (n - i) / (k - i).toDouble
  i += 1
}
binCoeff
The really bad approaches were various attempts at implementing Pascal's Triangle using tail recursion.
nCp = n! / ( p! (n-p)! ) =
( n * (n-1) * (n-2) * ... * (n - p) * (n - p - 1) * ... * 1 ) /
( p * (p-1) * ... * 1 * (n - p) * (n - p - 1) * ... * 1 )
If we cancel the common terms of the numerator and the denominator, we are left with the minimal number of multiplications required. We can write a function in C that performs 2p multiplications and 1 division to get nCp:
int binom(int p, int n) {
    if (p == 0) return 1;
    int num = n;
    int den = p;
    while (p > 1) {
        p--;
        num *= n - p;
        den *= p;
    }
    return num / den;
}
I was looking for the same thing and couldn't find it, so I wrote one myself that seems optimal for any binomial coefficient for which the end result fits into a long.
// Calculate Binomial Coefficient
// Jeroen B.P. Vuurens
public static long binomialCoefficient(int n, int k) {
    // take the lowest possible k to reduce computing using: n over k = n over (n-k)
    k = java.lang.Math.min(k, n - k);

    // holds the high numbers: fi. (1000 over 990) holds 991..1000
    long highnumber[] = new long[k];
    for (int i = 0; i < k; i++)
        highnumber[i] = n - i; // the high number first, order is important

    // holds the dividers: fi. (1000 over 990) holds 2..10
    int dividers[] = new int[k - 1];
    for (int i = 0; i < k - 1; i++)
        dividers[i] = k - i;

    // for every divider there always exists a highnumber that can be divided by it,
    // the number of highnumbers being a sequence that equals the number of
    // dividers. Thus, the only trick needed is to divide in reverse order, so
    // divide by the highest divider first, trying it on the highest highnumber first.
    // That way you do not need to do any tricks with primes.
    for (int divider : dividers) {
        boolean eliminated = false;
        for (int i = 0; i < k; i++) {
            if (highnumber[i] % divider == 0) {
                highnumber[i] /= divider;
                eliminated = true;
                break;
            }
        }
        if (!eliminated) throw new Error(n + "," + k + " divider=" + divider);
    }

    // multiply the remainder of the highnumbers
    long result = 1;
    for (long high : highnumber)
        result *= high;
    return result;
}
If I understand the notation in the question, you don't just want nCp, you actually want all of nC1, nC2, ... nC(n-1). If this is correct, we can leverage the following relationship to make this fairly trivial:
for all k>0: nCk = prod_{from i=1..k}( (n-i+1)/i )
i.e. for all k>0: nCk = nC(k-1) * (n-k+1) / k
Here's a python snippet implementing this approach:
def binomial_coef_seq(n, k):
    """Returns a list of all binomial terms from choose(n,0) up to choose(n,k)"""
    b = [1]
    for i in range(1, k + 1):
        b.append(b[-1] * (n - i + 1) // i)
    return b
If you need all coefficients up to some k > ceiling(n/2), you can use symmetry to reduce the number of operations you need to perform by stopping at the coefficient for ceiling(n/2) and then just backfilling as far as you need.
import numpy as np

def binomial_coef_seq2(n, k):
    """Returns a list of all binomial terms from choose(n,0) up to choose(n,k)"""
    k2 = int(np.ceil(n / 2))
    use_symmetry = k > k2
    if use_symmetry:
        k_requested = k
        k = k2
    b = [1]
    for i in range(1, k + 1):
        b.append(b[-1] * (n - i + 1) // i)
    if use_symmetry:
        # mirror the entries already computed: choose(n, j) == choose(n, n - j)
        b2 = [b[n - j] for j in range(k2 + 1, k_requested + 1)]
        b.extend(b2)
    return b
Time Complexity : O(denominator)
Space Complexity : O(1)
public class binomialCoeff {

    static double binomialcoeff(int numerator, int denominator)
    {
        double res = 1;

        // invalid numbers
        if (denominator > numerator || denominator < 0 || numerator < 0) {
            res = -1;
            return res;
        }

        // default values
        if (denominator == numerator || denominator == 0 || numerator == 0)
            return res;

        // Since C(n, k) = C(n, n-k)
        if (denominator > (numerator - denominator))
            denominator = numerator - denominator;

        // Calculate value of [n * (n-1) *---* (n-k+1)] / [k * (k-1) *----* 1]
        while (denominator >= 1)
        {
            res *= numerator;
            res = res / denominator;
            denominator--;
            numerator--;
        }
        return res;
    }

    /* Driver program to test above function */
    public static void main(String[] args)
    {
        int numerator = 120;
        int denominator = 20;
        System.out.println("Value of C(" + numerator + ", " + denominator + ") "
                + "is" + " " + binomialcoeff(numerator, denominator));
    }
}

Is there any quick way to determine the first k digits of n^n

I am writing a program where I need to know only the first k (k can be anywhere between 1 and 5) digits of another big number, which can be represented as n^n where n is a very large number.
Currently I am actually calculating n^n and then parsing it as a string. I wonder if a better, faster method exists.
There are two possibilities.
If you want the first k leading digits (as in: the leading digit of 12345 is 1), then you can use the fact that
n^n = 10^(n*Log10(n))
so you compute the fractional part f of n*Log10(n), and then the first k digits of 10^f will be your result. This works for numbers up to about 10^10 before round-off errors start kicking in if you use double precision. For example, for n = 2^20, f = 0.57466709..., 10^f = 3.755494... so your first 5 digits are 37554. For n = 4, f = 0.4082..., 10^f = 2.56 so your first digit is 2.
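A short Python sketch of that leading-digit computation; as noted above, double precision limits how large n can be before round-off spoils the digits:

import math

def leading_digits(n, k):
    # first k digits of n**n, via the fractional part of n*log10(n)
    f = math.fmod(n * math.log10(n), 1.0)
    return int(10 ** (f + k - 1))     # shift so exactly k digits sit left of the decimal point

print(leading_digits(2**20, 5))   # 37554, matching the worked example above
print(leading_digits(4, 1))       # 2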
If you want the first k trailing digits (as in: the trailing digit of 12345 is 5), then you can use modular arithmetic. I would use the squaring trick:
factor = n mod 10^k
result = 1
while (n != 0)
    if (n is odd) then result = (result * factor) mod 10^k
    factor = (factor * factor) mod 10^k
    n >>= 1
Taking n=2^20 as an example again, we find that result = 88576. For n=4, we have factor = 1, 4, 6 and result = 1, 1, 6 so the answer is 6.
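In Python, that squaring trick is exactly what the built-in three-argument pow provides, so a sketch is a one-liner:

def trailing_digits(n, k):
    # last k digits of n**n, by modular exponentiation
    return pow(n, n, 10 ** k)

print(trailing_digits(2**20, 5))   # 88576, matching the worked example above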
If you mean the least significant or rightmost digits, this can be done with modular multiplication. It's O(N) complexity and doesn't require any special bignum data types.
#include <cmath>
#include <cstdio>

// returns ((base ^ exponent) % mod)
int modularExponentiation(int base, int exponent, int mod) {
    int result = 1;
    for (int i = 0; i < exponent; i++) {
        result = (result * base) % mod;
    }
    return result;
}

int firstKDigitsOfNToThePowerOfN(int k, int n) {
    return modularExponentiation(n, n, pow(10, k));
}

int main() {
    int n = 11;
    int result = firstKDigitsOfNToThePowerOfN(3, n);
    printf("%d", result);
}
This will print 611, the last three digits of 11^11 = 285311670611.
This implementation is suitable for values of N less than sqrt(INT_MAX), which will vary but on my machine and language it's over 46,000.
Furthermore, if it so happens that (10^k)^2 is less than your INT_MAX, you can change modularExponentiation to handle any N that can fit in an int:
int modularExponentiation(int base, int exponent, int mod) {
    int result = 1;
    for (int i = 0; i < exponent; i++) {
        result = (result * (base % mod)) % mod; // doesn't overflow as long as mod * mod < INT_MAX
    }
    return result;
}
If O(n) time is insufficient for you, we can take advantage of the property of exponentiation that A^(2*C) = (A^C)^2, and get logarithmic efficiency.
// returns ((base ^ exponent) % mod)
int modularExponentiation(int base, int exponent, int mod) {
    if (exponent == 0) { return 1; }
    if (exponent == 1) { return base % mod; }
    if (exponent % 2 == 1) {
        return ((base % mod) * modularExponentiation(base, exponent - 1, mod)) % mod;
    }
    else {
        int newBase = modularExponentiation(base, exponent / 2, mod);
        return (newBase * newBase) % mod;
    }
}

Algorithm to find Largest prime factor of a number

What is the best approach to calculating the largest prime factor of a number?
I'm thinking the most efficient would be the following:
1) Find the lowest prime number that divides cleanly
2) Check if the result of the division is prime
3) If not, find the next lowest
4) Go to 2
I'm basing this assumption on it being easier to calculate the small prime factors. Is this about right? What other approaches should I look into?
Edit: I've now realised that my approach is futile if there are more than 2 prime factors in play, since step 2 fails when the result is a product of two other primes, therefore a recursive algorithm is needed.
Edit again: And now I've realised that this does still work, because the last found prime number has to be the highest one, therefore any further testing of the non-prime result from step 2 would result in a smaller prime.
Here's the best algorithm I know of (in Python)
def prime_factors(n):
    """Returns all the prime factors of a positive integer"""
    factors = []
    d = 2
    while n > 1:
        while n % d == 0:
            factors.append(d)
            n //= d
        d = d + 1
    return factors

pfs = prime_factors(1000)
largest_prime_factor = max(pfs)  # The largest element in the prime factor list
The above method runs in O(n) in the worst case (when the input is a prime number).
EDIT:
Below is the O(sqrt(n)) version, as suggested in the comment. Here is the code, once more.
def prime_factors(n):
    """Returns all the prime factors of a positive integer"""
    factors = []
    d = 2
    while n > 1:
        while n % d == 0:
            factors.append(d)
            n //= d
        d = d + 1
        if d*d > n:
            if n > 1: factors.append(n)
            break
    return factors

pfs = prime_factors(1000)
largest_prime_factor = max(pfs)  # The largest element in the prime factor list
Actually there are several more efficient ways to find factors of big numbers (for smaller ones trial division works reasonably well).
One method which is very fast if the input number has two factors very close to its square root is known as Fermat factorisation. It makes use of the identity N = (a + b)(a - b) = a^2 - b^2 and is easy to understand and implement. Unfortunately it's not very fast in general.
The best known method for factoring numbers up to 100 digits long is the Quadratic sieve. As a bonus, part of the algorithm is easily done with parallel processing.
Yet another algorithm I've heard of is Pollard's Rho algorithm. It's not as efficient as the Quadratic Sieve in general but seems to be easier to implement.
Once you've decided on how to split a number into two factors, here is the fastest algorithm I can think of to find the largest prime factor of a number:
Create a priority queue which initially stores the number itself. Each iteration, you remove the highest number from the queue, and attempt to split it into two factors (not allowing 1 to be one of those factors, of course). If this step fails, the number is prime and you have your answer! Otherwise you add the two factors into the queue and repeat.
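A small Python sketch of that priority-queue idea; split is a hypothetical helper standing in for whichever splitting routine you chose (rho, Fermat factorisation, ...), returning a nontrivial factor pair or None when its argument is prime:

import heapq

def largest_prime_factor(n, split):
    # 'split(m)' is a placeholder: returns (a, b) with a*b == m and 1 < a, b < m,
    # or None if m is prime
    heap = [-n]                       # max-heap via negated values
    while heap:
        m = -heapq.heappop(heap)      # take the largest remaining factor
        pair = split(m)
        if pair is None:              # the largest entry is prime: that's the answer
            return m
        heapq.heappush(heap, -pair[0])
        heapq.heappush(heap, -pair[1])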
My answer is based on Triptych's, but improves a lot on it. It is based on the fact that beyond 2 and 3, all the prime numbers are of the form 6n-1 or 6n+1.
var largestPrimeFactor;

if (n mod 2 == 0)
{
    largestPrimeFactor = 2;
    n = n / 2 while (n mod 2 == 0);
}

if (n mod 3 == 0)
{
    largestPrimeFactor = 3;
    n = n / 3 while (n mod 3 == 0);
}

multOfSix = 6;
while (multOfSix - 1 <= n)
{
    if (n mod (multOfSix - 1) == 0)
    {
        largestPrimeFactor = multOfSix - 1;
        n = n / largestPrimeFactor while (n mod largestPrimeFactor == 0);
    }

    if (n mod (multOfSix + 1) == 0)
    {
        largestPrimeFactor = multOfSix + 1;
        n = n / largestPrimeFactor while (n mod largestPrimeFactor == 0);
    }

    multOfSix += 6;
}
I recently wrote a blog article explaining how this algorithm works.
I would venture that a method in which there is no need for a test for primality (and no sieve construction) would run faster than one which does use those. If that is the case, this is probably the fastest algorithm here.
JavaScript code:
'option strict';
function largestPrimeFactor(val, divisor = 2) {
let square = (val) => Math.pow(val, 2);
while ((val % divisor) != 0 && square(divisor) <= val) {
divisor++;
}
return square(divisor) <= val
? largestPrimeFactor(val / divisor, divisor)
: val;
}
Usage Example:
let result = largestPrimeFactor(600851475143);
Similar to @Triptych's answer, but also different. In this example no list or dictionary is used. The code is written in Ruby.
def largest_prime_factor(number)
  i = 2
  while number > 1
    if number % i == 0
      number /= i
    else
      i += 1
    end
  end
  return i
end
largest_prime_factor(600851475143)
# => 6857
All numbers can be expressed as the product of primes, eg:
102 = 2 x 3 x 17
712 = 2 x 2 x 2 x 89
You can find these by simply starting at 2 and continuing to divide for as long as the result is still evenly divisible:
712 / 2 = 356 .. 356 / 2 = 178 .. 178 / 2 = 89 .. 89 / 89 = 1
Using this method you don't have to actually calculate any primes: they'll all be primes, based on the fact that you've already factorised the number as much as possible with all preceding numbers.
number = 712;
currNum = number;   // the value we'll actually be working with

for (currFactor in 2 .. number) {
    while (currNum % currFactor == 0) {
        // keep on dividing by this number until we can divide no more!
        currNum = currNum / currFactor   // reduce the currNum
    }
    if (currNum == 1) return currFactor; // once it hits 1, we're done.
}
// this method skips unnecessary trial divisions and makes
// trial division more feasible for finding large primes
public static void main(String[] args)
{
    long n = 1000000000039L; // this is a large prime number
    long i = 2L;
    int test = 0;
    while (n > 1)
    {
        while (n % i == 0)
        {
            n /= i;
        }
        i++;
        if (i * i > n && n > 1)
        {
            System.out.println(n); // prints n if it's prime
            test = 1;
            break;
        }
    }
    if (test == 0)
        System.out.println(i - 1); // prints i-1 if it's the largest prime factor
}
The simplest solution is a pair of mutually recursive functions.
The first function generates all the prime numbers:
Start with a list of all natural numbers greater than 1.
Remove all numbers that are not prime. That is, numbers that have no prime factors (other than themselves). See below.
The second function returns the prime factors of a given number n in increasing order.
Take a list of all the primes (see above).
Remove all the numbers that are not factors of n.
The largest prime factor of n is the last number given by the second function.
This algorithm requires a lazy list or a language (or data structure) with call-by-need semantics.
For clarification, here is one (inefficient) implementation of the above in Haskell:
import Control.Monad
-- All the primes
primes = 2 : filter (ap (<=) (head . primeFactors)) [3,5..]
-- Gives the prime factors of its argument
primeFactors = factor primes
where factor [] n = []
factor xs#(p:ps) n =
if p*p > n then [n]
else let (d,r) = divMod n p in
if r == 0 then p : factor xs d
else factor ps n
-- Gives the largest prime factor of its argument
largestFactor = last . primeFactors
Making this faster is just a matter of being more clever about detecting which numbers are prime and/or factors of n, but the algorithm stays the same.
n = abs(number);
result = 1;

if (n mod 2 == 0) {
    result = 2;
    while (n mod 2 == 0) n /= 2;
}

for (i = 3; i <= sqrt(n); i += 2) {
    if (n mod i == 0) {
        result = i;
        while (n mod i == 0) n /= i;
    }
}

return max(n, result)
There are some modulo tests that are superfluous, as n can never be divisible by 6 once all factors of 2 and 3 have been removed. You could allow only primes for i, which is shown in several other answers here.
You could actually intertwine the sieve of Eratosthenes here:
First create the list of integers up to sqrt(n).
In the for loop mark all multiples of i up to the new sqrt(n) as not prime, and use a while loop instead.
Set i to the next prime number in the list.
A sketch of this combination follows below. Also see this question.
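Here is a Python sketch of that combination; for simplicity it sieves the whole range up front rather than interleaving, and it keeps the original sqrt(n) bound even as n shrinks, which only costs a little extra sieving:

def largest_prime_factor(n):
    # trial-divide only by primes, sieving the range [2, sqrt(n)] with Eratosthenes
    limit = int(n ** 0.5) + 1
    is_composite = [False] * (limit + 1)
    result = 1
    for i in range(2, limit + 1):
        if is_composite[i]:
            continue                       # i is not prime, skip it
        for j in range(i * i, limit + 1, i):
            is_composite[j] = True         # mark multiples of the prime i
        if n % i == 0:
            result = i
            while n % i == 0:
                n //= i                    # divide out every power of i
    return max(n, result)                  # any leftover n > 1 is itself prime

print(largest_prime_factor(600851475143))   # 6857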
I'm aware this is not a fast solution. Posting it as a hopefully easier-to-understand slow solution.
public static long largestPrimeFactor(long n) {
    // the smallest factor of a composite n is at most sqrt(n)
    long sqrt = (long) Math.ceil(Math.sqrt((double) n));
    long largest = -1;
    for (long i = 2; i <= sqrt; i++) {
        if (n % i == 0) {
            long test = largestPrimeFactor(n / i);
            if (test > largest) {
                largest = test;
            }
        }
    }
    if (largest != -1) {
        return largest;
    }
    // number is prime
    return n;
}
A recursive Python approach that removes prime factors from the number one at a time:
def primef(n):
    if n <= 3:
        return n
    if n % 2 == 0:
        return primef(n // 2)
    elif n % 3 == 0:
        return primef(n // 3)
    else:
        for i in range(5, int(n**0.5) + 1, 6):
            # print(i)
            if n % i == 0:
                return primef(n // i)
            if n % (i + 2) == 0:
                return primef(n // (i + 2))
    return n
I am using an algorithm that keeps dividing the number by its current prime factor.
My solution in Python 3:
def PrimeFactor(n):
    m = n
    while n % 2 == 0:
        n = n // 2
    if n == 1:              # check whether 2 is the only (and largest) prime factor
        return 2
    i = 3
    sqrt = int(m**0.5)      # loop till the square root of the number
    last = 0                # stores the last prime factor found, i.e. the largest
    while i <= sqrt:
        while n % i == 0:
            n = n // i      # reduce the number by dividing it by its prime factor
            last = i
        i += 2
    if n > last:            # the remaining number (n) is also a factor of the number
        return n
    else:
        return last

print(PrimeFactor(int(input())))
Input : 10
Output : 5
Input : 600851475143
Output : 6857
Inspired by your question I decided to implement my own version of factorization (and finding largest prime factor) in Python.
Probably the simplest to implement, yet quite efficient, factoring algorithm that I know is Pollard's rho algorithm. It has a running time of O(N^(1/4)) at most, which is much faster than the O(N^(1/2)) time of the trial division algorithm. Both algorithms have these running times only in the case of composite (non-prime) numbers; that's why a primality test should be used to filter out prime (non-factorable) numbers.
I used the following algorithms in my code: Fermat primality test ..., Pollard's rho algorithm ..., trial division algorithm. The Fermat primality test is used before running Pollard's rho in order to filter out prime numbers. Trial division is used as a fallback because Pollard's rho may, in very rare cases, fail to find a factor, especially for some small numbers.
Obviously, after fully factorizing a number into a sorted list of prime factors, the largest prime factor will be the last element in that list. In the general case (for any random number) I don't know of any other way to find the largest prime factor besides fully factorizing the number.
As an example, in my code I'm factoring the first 190 fractional digits of Pi; the code factorizes this number within 1 second and shows the largest prime factor, which is 165 digits (545 bits) in size!
Try it online!
def is_fermat_probable_prime(n, *, trials = 32):
    # https://en.wikipedia.org/wiki/Fermat_primality_test
    import random
    if n <= 16:
        return n in (2, 3, 5, 7, 11, 13)
    for i in range(trials):
        if pow(random.randint(2, n - 2), n - 1, n) != 1:
            return False
    return True

def pollard_rho_factor(N, *, trials = 16):
    # https://en.wikipedia.org/wiki/Pollard%27s_rho_algorithm
    import random, math
    for j in range(trials):
        i, stage, y, x = 0, 2, 1, random.randint(1, N - 2)
        while True:
            r = math.gcd(N, x - y)
            if r != 1:
                break
            if i == stage:
                y = x
                stage <<= 1
            x = (x * x + 1) % N
            i += 1
        if r != N:
            return [r, N // r]
    return [N] # Pollard-Rho failed

def trial_division_factor(n, *, limit = None):
    # https://en.wikipedia.org/wiki/Trial_division
    fs = []
    while n & 1 == 0:
        fs.append(2)
        n >>= 1
    d = 3
    while d * d <= n and (limit is None or d <= limit):
        q, r = divmod(n, d)
        if r == 0:
            fs.append(d)
            n = q
        else:
            d += 2
    if n > 1:
        fs.append(n)
    return fs

def factor(n):
    if n <= 1:
        return []
    if is_fermat_probable_prime(n):
        return [n]
    fs = trial_division_factor(n, limit = 1 << 12)
    if len(fs) >= 2:
        return sorted(fs[:-1] + factor(fs[-1]))
    fs = pollard_rho_factor(n)
    if len(fs) >= 2:
        return sorted([e1 for e0 in fs for e1 in factor(e0)])
    return trial_division_factor(n)

def demo():
    import time, math
    # http://www.math.com/tables/constants/pi.htm
    # pi = 3.
    # 1415926535 8979323846 2643383279 5028841971 6939937510 5820974944 5923078164 0628620899 8628034825 3421170679
    # 8214808651 3282306647 0938446095 5058223172 5359408128 4811174502 8410270193 8521105559 6446229489 5493038196
    # n = first 190 fractional digits of Pi
    n = 1415926535_8979323846_2643383279_5028841971_6939937510_5820974944_5923078164_0628620899_8628034825_3421170679_8214808651_3282306647_0938446095_5058223172_5359408128_4811174502_8410270193_8521105559_6446229489
    print('Number:', n)
    tb = time.time()
    fs = factor(n)
    print('All Prime Factors:', fs)
    print('Largest Prime Factor:', f'({math.log2(fs[-1]):.02f} bits, {len(str(fs[-1]))} digits)', fs[-1])
    print('Time Elapsed:', round(time.time() - tb, 3), 'sec')

if __name__ == '__main__':
    demo()
Output:
Number: 1415926535897932384626433832795028841971693993751058209749445923078164062862089986280348253421170679821480865132823066470938446095505822317253594081284811174502841027019385211055596446229489
All Prime Factors: [3, 71, 1063541, 153422959, 332958319, 122356390229851897378935483485536580757336676443481705501726535578690975860555141829117483263572548187951860901335596150415443615382488933330968669408906073630300473]
Largest Prime Factor: (545.09 bits, 165 digits) 122356390229851897378935483485536580757336676443481705501726535578690975860555141829117483263572548187951860901335596150415443615382488933330968669408906073630300473
Time Elapsed: 0.593 sec
Here is my attempt in C#. The last printout is the largest prime factor of the number. I checked and it works.
using System;

namespace Problem_Prime
{
    class Program
    {
        static void Main(string[] args)
        {
            /*
              The prime factors of 13195 are 5, 7, 13 and 29.
              What is the largest prime factor of the number 600851475143 ?
            */
            long x = 600851475143;
            long y = 2;
            while (y < x)
            {
                if (x % y == 0)
                {
                    // y is a factor of x, but is it prime
                    if (IsPrime(y))
                    {
                        Console.WriteLine(y);
                    }
                    x /= y;
                }
                y++;
            }
            Console.WriteLine(y);
            Console.ReadLine();
        }

        static bool IsPrime(long number)
        {
            // check for evenness
            if (number % 2 == 0)
            {
                if (number == 2)
                {
                    return true;
                }
                return false;
            }
            // don't need to check past the square root
            long max = (long)Math.Sqrt(number);
            for (int i = 3; i <= max; i += 2)
            {
                if ((number % i) == 0)
                {
                    return false;
                }
            }
            return true;
        }
    }
}
# python implementation
import math
n = 600851475143
i = 2
factors = set([])
while i < math.sqrt(n):
    while n % i == 0:
        n = n / i
        factors.add(i)
    i += 1
factors.add(n)
largest = max(factors)
print factors
print largest
Calculates the largest prime factor of a number using recursion in C++. The working of the code is explained below:
int getLargestPrime(int number) {
    int factor = number; // assumes that the largest prime factor is the number itself
    for (int i = 2; (i * i) <= number; i++) { // iterates to the square root of the number till it finds the first (smallest) factor
        if (number % i == 0) { // checks if the current number (i) is a factor
            factor = max(i, number / i); // stores the larger number among the factors
            break; // breaks the loop when a factor is found
        }
    }
    if (factor == number) // base case of recursion
        return number;
    return getLargestPrime(factor); // recursively calls itself
}
Here is my approach to quickly calculate the largest prime factor.
It is based on the fact that, because we divide x by each factor as soon as it is found, every factor we find is prime. Then the only thing left is to return the largest factor found; it is already prime.
The code (Haskell):
f max' x i | i > x          = max'
           | x `rem` i == 0 = f i (x `div` i) i    -- Divide x by its factor
           | otherwise      = f max' x (i + 1)     -- Check for the next possible factor

g x = f 2 x 2
The following C++ algorithm is not the best one, but it works for numbers under a billion and it's pretty fast:
#include <iostream>
using namespace std;

// ------ is_prime ------
// Determines if the integer accepted is prime or not
bool is_prime(int n) {
    int i, count = 0;
    if (n == 1 || n == 2)
        return true;
    if (n % 2 == 0)
        return false;
    for (i = 1; i <= n; i++) {
        if (n % i == 0)
            count++;
    }
    if (count == 2)
        return true;
    else
        return false;
}

// ------ nextPrime -------
// Finds and returns the next prime number
int nextPrime(int prime) {
    bool a = false;
    while (a == false) {
        prime++;
        if (is_prime(prime))
            a = true;
    }
    return prime;
}

// ----- M A I N ------
int main() {
    int value = 13195;
    int prime = 2;
    bool done = false;
    while (done == false) {
        if (value % prime == 0) {
            value = value / prime;
            if (is_prime(value)) {
                done = true;
            }
        } else {
            prime = nextPrime(prime);
        }
    }
    cout << "Largest prime factor: " << value << endl;
}
Found this solution on the web by "James Wang"
public static int getLargestPrime(int number) {
    if (number <= 1) return -1;

    for (int i = number - 1; i > 1; i--) {
        if (number % i == 0) {
            number = i;
        }
    }
    return number;
}
Prime factor using sieve :
#include <bits/stdc++.h>
using namespace std;

#define N 10001
typedef long long ll;

bool visit[N];
vector<int> prime;

void sieve()
{
    memset(visit, 0, sizeof(visit));
    for (int i = 2; i < N; i++)
    {
        if (visit[i] == 0)
        {
            prime.push_back(i);
            for (int j = i * 2; j < N; j = j + i)
            {
                visit[j] = 1;
            }
        }
    }
}

void sol(long long n, vector<int> &prime)
{
    ll ans = n;
    for (int i = 0; i < (int)prime.size() && prime[i] <= n; i++)
    {
        while (n % prime[i] == 0)
        {
            n = n / prime[i];
            ans = prime[i];
        }
    }
    ans = max(ans, n);
    cout << ans << endl;
}

int main()
{
    ll tc, n;
    sieve();
    cin >> n;
    sol(n, prime);
    return 0;
}
I guess there is no immediate way other than performing a factorization, as the examples above have done; i.e.,
in each iteration you identify a "small" factor f of the number N, then continue with the reduced problem "find the largest prime factor of N' := N/f, with factor candidates >= f".
From a certain size of f onward, the expected search time is lower if you run a primality test on the reduced N', which, if it succeeds, confirms that N' is already the largest prime factor of the initial N.
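A sketch of that early-exit idea in Python; is_probable_prime is a placeholder for whatever test you prefer (for example the Fermat test shown elsewhere in this thread):

def largest_prime_factor(n, is_probable_prime):
    # trial division that stops as soon as the remaining cofactor N' is (probably) prime
    f = 2
    largest = 1
    while f * f <= n:
        if n % f == 0:
            largest = f
            while n % f == 0:
                n //= f
            if n > 1 and is_probable_prime(n):
                return n              # the reduced cofactor is already the largest prime factor
        f += 1
    return n if n > 1 else largest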
Here is my attempt in Clojure. Only walking the odds for prime?, and only the primes for prime factors, i.e. sieve. Using lazy sequences helps produce the values just before they are needed.
(defn prime?
  ([n]
   (let [oddNums (iterate #(+ % 2) 3)]
     (prime? n (cons 2 oddNums))))
  ([n [i & is]]
   (let [q (quot n i)
         r (mod n i)]
     (cond (< n 2) false
           (zero? r) false
           (> (* i i) n) true
           :else (recur n is)))))

(def primes
  (let [oddNums (iterate #(+ % 2) 3)]
    (lazy-seq (cons 2 (filter prime? oddNums)))))

;; Sieve of Eratosthenes
(defn sieve
  ([n]
   (sieve primes n))
  ([[i & is :as ps] n]
   (let [q (quot n i)
         r (mod n i)]
     (cond (< n 2) nil
           (zero? r) (lazy-seq (cons i (sieve ps q)))
           (> (* i i) n) (when (> n 1) (lazy-seq [n]))
           :else (recur is n)))))

(defn max-prime-factor [n]
  (last (sieve n)))
Recursion in C
The algorithm could be:
1) Check if n is a factor of t
2) Check if n is prime. If so, remember n
3) Increment n
4) Repeat until n > sqrt(t)
Here's an example of a (tail)recursive solution to the problem in C:
#include <stdio.h>
#include <stdbool.h>

bool is_factor(long int t, long int n) {
    return (t % n == 0);
}

bool is_prime(long int n0, long int n1, bool acc) {
    if (n1 * n1 > n0 || acc < 1)
        return acc;
    else
        return is_prime(n0, n1 + 2, acc && (n0 % n1 != 0));
}

int gpf(long int t, long int n, long int acc) {
    if (n * n > t)
        return acc;
    if (is_factor(t, n)) {
        if (is_prime(n, 3, true))
            return gpf(t, n + 2, n);
        else
            return gpf(t, n + 2, acc);
    }
    else
        return gpf(t, n + 2, acc);
}

int main(int argc, char **argv) {
    printf("%d\n", gpf(600851475143, 3, 0));
    return 0;
}
The solution is composed of three functions. One to test if the candidate is a factor, another to test if that factor is prime, and finally one to compose those two together.
Some key ideas here are:
1- Stopping the recursion at sqrt(600851475143)
2- Only test odd numbers for factorness
3- Only testing candidate factors for primeness with odd numbers
It seems to me that step #2 of the algorithm given isn't going to be all that efficient an approach. You have no reasonable expectation that it is prime.
Also, the previous answer suggesting the Sieve of Eratosthenes is utterly wrong. I just wrote two programs to factor 123456789. One was based on the Sieve, one was based on the following:
1) Test = 2
2) Current = Number to test
3) If Current Mod Test = 0 then
3a) Current = Current Div Test
3b) Largest = Test
3c) Goto 3.
4) Inc(Test)
5) If Current < Test goto 4
6) Return Largest
This version was 90x faster than the Sieve.
The thing is, on modern processors the type of operation matters far less than the number of operations, not to mention that the algorithm above can run in cache, the Sieve can't. The Sieve uses a lot of operations striking out all the composite numbers.
Note, also, that my dividing out factors as they are identified reduces the space that must be tested.
Compute a list storing prime numbers first, e.g. 2, 3, 5, 7, 11, 13, ...
Every time you prime-factorize a number, use the implementation by Triptych, but iterate over this list of prime numbers rather than the natural integers.
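A Python sketch of that suggestion, reusing Triptych's division loop but drawing trial divisors from a precomputed prime list (which must cover every prime up to sqrt(n)):

def prime_factors_from_list(n, primes):
    # same division loop as Triptych's answer, but only prime trial divisors are used
    factors = []
    for d in primes:
        if d * d > n:
            break
        while n % d == 0:
            factors.append(d)
            n //= d
    if n > 1:
        factors.append(n)   # whatever is left over is prime
    return factors

print(max(prime_factors_from_list(1000, [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31])))   # 5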
With Java:
For int values:
public static int[] primeFactors(int value) {
    int[] a = new int[31];
    int i = 0, j;
    int num = value;
    while (num % 2 == 0) {
        a[i++] = 2;
        num /= 2;
    }
    j = 3;
    while (j <= Math.sqrt(num) + 1) {
        if (num % j == 0) {
            a[i++] = j;
            num /= j;
        } else {
            j += 2;
        }
    }
    if (num > 1) {
        a[i++] = num;
    }
    int[] b = Arrays.copyOf(a, i);
    return b;
}
For long values:
static long[] getFactors(long value) {
    long[] a = new long[63];
    int i = 0;
    long num = value;
    while (num % 2 == 0) {
        a[i++] = 2;
        num /= 2;
    }
    long j = 3;
    while (j <= Math.sqrt(num) + 1) {
        if (num % j == 0) {
            a[i++] = j;
            num /= j;
        } else {
            j += 2;
        }
    }
    if (num > 1) {
        a[i++] = num;
    }
    long[] b = Arrays.copyOf(a, i);
    return b;
}
This is probably not always faster, but it is more optimistic about finding a big prime divisor:
1) N is your number
2) If it is prime then return(N)
3) Calculate primes up until Sqrt(N)
4) Go through the primes in descending order (largest first)
5) If N is divisible by Prime then Return(Prime)
Edit: In step 3 you can use the Sieve of Eratosthenes or Sieve of Atkin or whatever you like, but by itself the sieve won't find you the biggest prime factor. (That's why I wouldn't choose SQLMenace's post as an official answer...)
Here is the same function @Triptych provided, written as a generator; it has also been simplified slightly.
def primes(n):
    d = 2
    while n > 1:
        while n % d == 0:
            yield d
            n //= d
        d += 1
the max prime can then be found using:
n= 373764623
max(primes(n))
and a list of factors found using:
list(primes(n))
I think it would be good to store somewhere all possible primes smaller than n and just iterate through them to find the biggest divisor. You can get primes from prime-numbers.org.
Of course I assume that your number isn't too big :)
#include <stdio.h>
#include <conio.h>
#include <math.h>
#include <time.h>

factor(long int n)
{
    long int i, j;
    while (n >= 4)
    {
        if (n % 2 == 0) { n = n / 2; i = 2; }
        else
        {
            i = 3;
            j = 0;
            while (j == 0)
            {
                if (n % i == 0)
                {
                    j = 1;
                    n = n / i;
                }
                i = i + 2;
            }
            i -= 2;
        }
    }
    return i;
}

void main()
{
    clock_t start = clock();
    long int n, sp;
    clrscr();
    printf("enter value of n");
    scanf("%ld", &n);
    sp = factor(n);
    printf("largest prime factor is %ld", sp);
    printf("Time elapsed: %f\n", ((double)clock() - start) / CLOCKS_PER_SEC);
    getch();
}
