Feasible implementation of a prime-counting function [closed] - performance

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking for code must demonstrate a minimal understanding of the problem being solved. Include attempted solutions, why they didn't work, and the expected results. See also: Stack Overflow question checklist
Closed 9 years ago.
The community reviewed whether to reopen this question 5 months ago and left it closed:
Original close reason(s) were not resolved
What would be a computationally feasible pseudocode of any prime-counting function implementation?
I initially attempted coding the Hardy-Wright algorithm, but its factorials began generating miserable overflows, and many others appear bound to yield similar problems. I've scoured Google for practical solutions, but, at best, have found very esoteric mathematics which I haven't ever seen implemented in conventional programs.

The prime-counting function pi(x) computes the number of primes not exceeding x, and has fascinated mathematicians for centuries. At the beginning of the nineteenth century, Adrien-Marie Legendre gave a formula using an auxiliary function phi(x,a) that counts the numbers not greater than x that are not stricken by sieving with the first a primes; for instance, phi(50,3) = 14 for the numbers 1, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47 and 49. The phi function can be computed as phi(x,a) = phi(x,a-1) - phi(x/P(a),a-1), where phi(x,1) is the number of odd integers not exceeding x and P(a) is the a-th prime number (counting from P(1)=2).
function phi(x, a)
    if (phi(x, a) is in cache)
        return phi(x, a) from cache
    if (a == 1)
        return (x + 1) // 2
    t := phi(x, a-1) - phi(x // p[a], a-1)
    insert phi(x, a) = t in cache
    return t
An array p stores the a-th prime for small a, calculated by sieving. The cache is important; without it, run time would be exponential. Given phi, Legendre's prime-counting formula is pi(x) = phi(x,a) + a - 1, where a = pi(floor(sqrt(x))). Legendre used his formula to calculate pi(10^6), but he reported 78526 instead of the correct answer of 78498, which, even though wrong, was astonishingly close for an intricate manual calculation.
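The pseudocode above translates almost line for line into Python. This is a rough sketch, not a tuned implementation: the sieve limit of 1000 and the small-x base case are my own choices (together they cover x up to about 10^6), and lru_cache plays the role of the cache.

```python
from functools import lru_cache
from math import isqrt

def sieve_primes(limit):
    """Simple Sieve of Eratosthenes; returns the list of primes up to limit."""
    flags = bytearray([1]) * (limit + 1)
    flags[0] = flags[1] = 0
    for i in range(2, isqrt(limit) + 1):
        if flags[i]:
            flags[i*i::i] = bytearray(len(flags[i*i::i]))
    return [i for i, f in enumerate(flags) if f]

P = sieve_primes(1000)          # enough primes for x up to about 10**6

@lru_cache(maxsize=None)        # the cache; without it run time is exponential
def phi(x, a):
    if a == 1:
        return (x + 1) // 2     # odd integers not exceeding x
    return phi(x, a - 1) - phi(x // P[a - 1], a - 1)   # P[a-1] is the a-th prime

def prime_pi(x):
    """Legendre's formula: pi(x) = phi(x, a) + a - 1 with a = pi(floor(sqrt(x)))."""
    if x < 9:                   # small base case so that a >= 2 below
        return sum(1 for p in (2, 3, 5, 7) if p <= x)
    a = prime_pi(isqrt(x))
    return phi(x, a) + a - 1
```

With this, prime_pi(10**6) returns 78498, the correct value that Legendre just missed.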
In the 1950s, Derrick H. Lehmer gave an improved algorithm for counting primes:
function pi(x)
    if (x < limit) return count(primes(x))
    a := pi(root(x, 4)) # fourth root of x
    b := pi(root(x, 2)) # square root of x
    c := pi(root(x, 3)) # cube root of x
    sum := phi(x, a) + (b+a-2) * (b-a+1) / 2
    for i from a+1 to b
        w := x / p[i]
        lim := pi(sqrt(w))
        sum := sum - pi(w)
        if (i <= c)
            for j from i to lim
                sum := sum - pi(w / p[j]) + j - 1
    return sum
For example, pi(10^12) = 37607912018. Even with these algorithms, and their modern variants, and very fast computers, it remains appallingly tedious to calculate large values of pi; at this writing, the largest known value is pi(10^24) = 18435599767349200867866.
To use this algorithm to calculate the n-th prime, a corollary to the Prime Number Theorem bounds the n-th prime P(n) between n log n and n(log n + log log n) for n > 5, so compute pi at the bounds and use bisection to determine the n-th prime, switching to sieving when the bounds are close.
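As a sketch of that corollary: rather than implementing a fast pi and bisecting, the toy version below simply sieves up to the upper bound n(log n + log log n) and counts, which is fine for modest n. The function name and the hard-coded small cases are mine.

```python
from math import log, isqrt

def nth_prime(n):
    """Return the n-th prime (1-indexed) by sieving up to the PNT upper bound."""
    if n < 6:
        return [2, 3, 5, 7, 11][n - 1]
    hi = int(n * (log(n) + log(log(n)))) + 1   # P(n) < n(log n + log log n) for n >= 6
    flags = bytearray([1]) * (hi + 1)
    flags[0] = flags[1] = 0
    for i in range(2, isqrt(hi) + 1):
        if flags[i]:
            flags[i*i::i] = bytearray(len(flags[i*i::i]))
    count = 0
    for p in range(2, hi + 1):
        if flags[p]:
            count += 1
            if count == n:
                return p
```

For example, nth_prime(1000) returns 7919.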
I discuss prime numbers in several entries at my blog.

Wikipedia might help too. The article on prime counting contains a few pointers. For starters I'd recommend the algorithm by Meissel in the section "Algorithms for evaluating π(x)", which is one of the simplest algorithms that does not generate all primes.
I also find the book by Crandall and Pomerance, "Prime Numbers: A Computational Perspective", helpful. It has a detailed and quite accessible description of prime-counting methods. But keep in mind that the topic is by its nature a bit advanced for most readers here.

Related

Number of Fibonacci numbers smaller than number k. Sub O(n)

Interview question: how many Fibonacci numbers exist that are less than a given number k? Can you find a function in terms of k to get the number of Fibonacci numbers less than k?
Example: k = 6
Answer: 6, as (0, 1, 1, 2, 3, 5)
Easy enough, write a loop or use the recursive definition of Fibonacci. However, that sounds too easy... is there a way to do this using the closed-form definition? (https://en.wikipedia.org/wiki/Fibonacci_number#Closed-form_expression)
Here is a close-form Python solution which is O(1). It uses Binet's formula (from the Wikipedia article that you linked to):
>>> from math import sqrt,log
>>> def numFibs(n): return int(log(sqrt(5)*n)/log((1+sqrt(5))/2))
>>> numFibs(10)
6
Which tracks with 1,1,2,3,5,8
The point is that the second term in Binet's formula is negligible and it is easy enough to invert the result of neglecting it.
The above formula counts the number of Fibonacci numbers which are less than or equal to n. It jumps by 1 with each new Fibonacci number. So, for example, numFibs(12) = 6 and numFibs(13) = 7. 13 is the 7th Fibonacci number, so if you want the number of Fibonacci numbers which are strictly smaller than n, you have to introduce a lag. Something like:
def smallerFibs(n):
    if n <= 1:
        return 0
    else:
        return min(numFibs(n-1), numFibs(n))
Now smallerFibs(13) is still 6 but then smallerFibs(14) = 7. This is of course still O(1).
I think it's fairly easy to see the growth of this number, at least. By the Binet / De-Moivre formula,
F(n) = (φ^n − ψ^n) / √5
Since |ψ| < 1 < φ, we have
F(n) ∼ φ^n / √5.
From this it follows that the number of Fibonacci numbers smaller than x grows like log_φ(√5 · x).

Find n, where its factorial is a product of factorials

Stack Overflow promoted this mathematics question in the right pane, from the Mathematics Stack Exchange site.
I got curious to see the answers. It turned out that the question was about a specific case (3!⋅5!⋅7! = n!) and the answers were for that specific case too.
From a programmer's point of view, I wonder what would be the most efficient algorithm to solve the problem.
There are two situations, though: either the problem has an answer or it doesn't.
Input is 3, 5, 7 and output is 10, as in the linked question.
The algorithm would be reasonably straightforward:
Order inputs a, b, and c so that a <= b <= c
c! can be divided out from n!
This makes a! * b! equal to the product of numbers between c+1 and n, inclusive
Find the next prime p > c. This number cannot be produced by multiplying a! * b!, because both a and b are strictly less than p, hence they do not contain p among their factors
Try all candidate n between c+1 and p-1, inclusive
If you do not find an answer, there is no solution
In case of a=3, b=5, and c=7 you find the next prime above 7, which is 11, and try all numbers between 7+1 and 11-1, inclusive (i.e. 8, 9, and 10) as candidates for n.
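The steps above can be sketched in Python with exact big integers. trial_is_prime and find_n are hypothetical names of mine, and the trial-division primality test is only meant for the small numbers involved here:

```python
from math import factorial

def trial_is_prime(m):
    """Trial-division primality test; fine for the small candidates involved here."""
    if m < 2:
        return False
    return all(m % d for d in range(2, int(m**0.5) + 1))

def find_n(a, b, c):
    """Return n with a! * b! * c! == n!, or None if no such n exists."""
    a, b, c = sorted((a, b, c))
    target = factorial(a) * factorial(b) * factorial(c)
    p = c + 1
    while not trial_is_prime(p):   # the next prime above c caps the search range
        p += 1
    for n in range(c, p):          # candidates between c and p-1
        if factorial(n) == target:
            return n
    return None
```

find_n(3, 5, 7) returns 10, matching the linked question.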
Factor a!, b!, c!, .... For a given prime p, it appears in a! this many times:
floor(a / p) + floor(a / p^2) + ...
For example, 5 appears in 26!:
26 / 5 + 26 / 25 = 5 + 1 = 6 times
Now you can binary search your n and check if each of the prime factors in a!*b!*c!*... occurs exactly as many times in m!, where m is the middle point of one of your binary search iterations.
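The per-prime count from Legendre's formula is a short helper (the function name is mine):

```python
def prime_exponent_in_factorial(a, p):
    """Exponent of the prime p in a!, i.e. floor(a/p) + floor(a/p^2) + ..."""
    e, q = 0, p
    while q <= a:
        e += a // q
        q *= p
    return e
```

For the example above, prime_exponent_in_factorial(26, 5) gives 6.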

Why is only one of the given statements about complexity classes correct? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
Apparently the correct answer to the following question is (C), but why are the other options not correct until we know the value of n?
If n=1, all of these seem correct except (B)! Where am I going wrong?
Which of the following is not O(n^2)?
(A) 15^10 * n + 12099
(B) n^1.98
(C) n^3 / √n
(D) 2^20 * n
Wikipedia says:
Big O notation describes the limiting behavior of a function when the argument tends towards a particular value or infinity, usually in terms of simpler functions. A description of a function in terms of big O notation usually only provides an upper bound on the growth rate of the function.
An upper bound means that f(n) = n can be expressed as O(n), O(n^2), O(n^3), and so on, but not as O(1) or O(log n).
So, in the same spirit, you can eliminate the options as shown below:
15^10 * n + 12099 < c * n^2 for a suitable constant c once n > 110 (= √12100) --- hence it is O(n^2).
n^1.98 < n^2 for all n > 1 --- so it is O(n^2).
n^3 / √n = n^(5/2), and n^(5/2) / n^2 = √n, which is unbounded --- hence it isn't O(n^2).
2^20 * n = 1024 * 1024 * n < c * n^2 for all n > 1, where c = 1024 * 1024 --- hence it is O(n^2).
So the only option which isn't O(n^2) is option (C), f(n) = n^3 / √n.
While Shekhar Suman’s answer explains why the official answer is right in each case, it does not address this part of the question: “Apparently the correct answer to the following question is (C), but why are the other options not correct until we know the value of n? If n = 1, all of these seem correct except (B)! Where am I going wrong?”
I would suggest that the phrases “until we know the value of n” and “If n = 1” indicate what the questioner – at the time of asking – had not grasped, namely that O-notation is used to express information about functions, not about particular values of expressions.
This means it makes no sense to say “If n = 1, f(n) is O(n^2)”, as that is a statement about one value, f(1), and not about the function f.
Similarly, “until we know the value of n” makes no sense when talking about functions, because n is a bound variable, i.e. is only used to relate n to an expression involving n in order to define a function.
None of the options are of order n^2.
(15^10) * n + 12099 is of order n
n^1.98 is of order n^1.98
n^3 / (sqrt(n)) is of order n^2.5
(2^20) * n is of order n
You can check whether two functions are of the same order by dividing one by the other: the ratio should tend to a nonzero constant as n goes to infinity.
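As a quick numeric illustration of that ratio test (the helper and the sample points are my own choices): a pair of the same order settles near a nonzero constant, while mismatched orders run off to 0 or infinity.

```python
def ratios(f, g, points=(10**2, 10**4, 10**6)):
    """Evaluate f(n)/g(n) at a few growing values of n."""
    return [f(n) / g(n) for n in points]

# (A) 15^10 * n + 12099 against n^2: the ratio shrinks toward 0 (strictly lower order)
linear = ratios(lambda n: 15**10 * n + 12099, lambda n: n * n)

# (C) n^3 / sqrt(n) against n^2: the ratio is sqrt(n), growing without bound (higher order)
half = ratios(lambda n: n**3 / n**0.5, lambda n: n * n)
```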

Need help with a math trick question [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
Easy interview question got harder: given numbers 1..100, find the missing number(s)
Hi guys, I wasn't sure where to ask this, but since this is an algorithmic question, here goes. I've come face to face with a math problem and can't seem to get past it for the last couple of days. It goes like this:
You are given an adding machine that sums a set of N+1 numbers consisting of the positive integers 1 to N as it's given the numbers (e.g. the machine is given 3 as the first number and outputs 3. It's then given 6 as the second number and outputs 9. It's given 11 as the third number and outputs 20. Et cetera, until it has processed N+1 numbers). One (and only one) of the numbers is repeated. How do you determine which number is repeated?
It seems like a trick question and I'd be really annoyed if it is just that a question to which the answer is 'not possible' - any ideas here?
Subtract (1+2+..+N) = N*(N+1)/2 from the sum.
EDIT: in case N is not known, find the largest triangular number smaller than the given sum and subtract.
If you know N and the sum S, the answer is d = S - N*(N+1)/2.
This is because the sum of all numbers from 1 to N is a triangular number, and each number from 1 to N occurs once (except for one that is repeated).
If you do not know N, you can take N = floor((sqrt(8*S+1)-1)/2). That can be deduced from the quadratic equation (n^2 + n)/2 = S.
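Both cases, N known and N unknown, fit in a few lines of Python (the function name is mine):

```python
from math import isqrt

def repeated_number(total, n=None):
    """Given the final sum of 1..n plus one duplicated value, return the duplicate."""
    if n is None:
        # largest n with n(n+1)/2 < total, from solving (n^2 + n)/2 = total
        n = (isqrt(8 * total + 1) - 1) // 2
    return total - n * (n + 1) // 2
```

For 1..10 with 7 entered twice, the running total ends at 62, and repeated_number(62) returns 7.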
Ok, you have:
X = 1 + 2 + ... + N + p, where 1<=p<=N
Or
X = N(N+1)/2 + p, 1<=p<=N
Declare:
S(N) = N(N+1)/2
And you know that
S(N) < X < S(N+1), because 1<=p<=N
You can find N by finding the largest N such that S(N) < X.
If you have found S(N), subtract it from X and you find the duplicate number.

Sum of digits of a factorial

Link to the original problem
It's not a homework question. I just thought that someone might know a real solution to this problem.
I was on a programming contest back in 2004, and there was this problem:
Given n, find the sum of the digits of n!. n can be from 0 to 10000. Time limit: 1 second. I think there were up to 100 numbers in each test set.
My solution was pretty fast but not fast enough, so I just let it run for some time. It built an array of pre-calculated values which I could use in my code. It was a hack, but it worked.
But there was a guy, who solved this problem with about 10 lines of code and it would give an answer in no time. I believe it was some sort of dynamic programming, or something from number theory. We were 16 at that time so it should not be a "rocket science".
Does anyone know what kind of an algorithm he could use?
EDIT: I'm sorry if I didn't make the question clear. As mquander said, there should be a clever solution, without bignum, with just plain Pascal code, a couple of loops, O(n^2) or something like that. 1 second is not a constraint anymore.
I found here that if n > 5, then 9 divides the sum of the digits of n!. We can also find how many zeros there are at the end of the number. Can we use that?
Ok, another problem from programming contest from Russia. Given 1 <= N <= 2 000 000 000, output N! mod (N+1). Is that somehow related?
I'm not sure who is still paying attention to this thread, but here goes anyway.
First, in the official-looking linked version, it only has to be 1000 factorial, not 10000 factorial. Also, when this problem was reused in another programming contest, the time limit was 3 seconds, not 1 second. This makes a huge difference in how hard you have to work to get a fast enough solution.
Second, for the actual parameters of the contest, Peter's solution is sound, but with one extra twist you can speed it up by a factor of 5 with 32-bit architecture. (Or even a factor of 6 if only 1000! is desired.) Namely, instead of working with individual digits, implement multiplication in base 100000. Then at the end, total the digits within each super-digit. I don't know how good a computer you were allowed in the contest, but I have a desktop at home that is roughly as old as the contest. The following sample code takes 16 milliseconds for 1000! and 2.15 seconds for 10000! The code also ignores trailing 0s as they show up, but that only saves about 7% of the work.
#include <stdio.h>

int main() {
    unsigned int dig[10000], first = 0, last = 0, carry, n, x, sum = 0;
    dig[0] = 1;
    for (n = 2; n <= 10000; n++) {
        carry = 0;
        for (x = first; x <= last; x++) {
            carry = dig[x] * n + carry;
            dig[x] = carry % 100000;
            if (x == first && !(carry % 100000)) first++;  /* skip trailing zero super-digits */
            carry /= 100000;
        }
        if (carry) dig[++last] = carry;
    }
    for (x = first; x <= last; x++)
        sum += dig[x] % 10 + (dig[x] / 10) % 10 + (dig[x] / 100) % 10
             + (dig[x] / 1000) % 10 + (dig[x] / 10000) % 10;
    printf("Sum: %d\n", sum);
    return 0;
}
Third, there is an amazing and fairly simple way to speed up the computation by another sizable factor. With modern methods for multiplying large numbers, it does not take quadratic time to compute n!. Instead, you can do it in O-tilde(n) time, where the tilde means that you can throw in logarithmic factors. There is a simple acceleration due to Karatsuba that does not bring the time complexity down to that, but still improves it and could save another factor of 4 or so. In order to use it, you also need to divide the factorial itself into equal sized ranges. You make a recursive algorithm prod(k,n) that multiplies the numbers from k to n by the pseudocode formula
prod(k,n) = prod(k,floor((k+n)/2))*prod(floor((k+n)/2)+1,n)
Then you use Karatsuba to do the big multiplication that results.
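The recursive range product looks like this in Python. CPython's big integers already switch to Karatsuba for large operands, so the balanced split alone is enough to see the benefit on large n:

```python
def prod(k, n):
    """Product of the integers k..n, splitting the range to keep operand sizes balanced."""
    if k > n:
        return 1
    if k == n:
        return k
    m = (k + n) // 2
    return prod(k, m) * prod(m + 1, n)

def fact(n):
    """n! via the balanced range product."""
    return prod(2, n) if n >= 2 else 1
```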
Even better than Karatsuba is the Fourier-transform-based Schonhage-Strassen multiplication algorithm. As it happens, both algorithms are part of modern big number libraries. Computing huge factorials quickly could be important for certain pure mathematics applications. I think that Schonhage-Strassen is overkill for a programming contest. Karatsuba is really simple and you could imagine it in an A+ solution to the problem.
Part of the question posed is some speculation that there is a simple number theory trick that changes the contest problem entirely. For instance, if the question were to determine n! mod n+1, then Wilson's theorem says that the answer is -1 when n+1 is prime, and it's a really easy exercise to see that it's 2 when n=3 and otherwise 0 when n+1 is composite. There are variations of this too; for instance n! is also highly predictable mod 2n+1. There are also some connections between congruences and sums of digits. The sum of the digits of x mod 9 is also x mod 9, which is why the sum is 0 mod 9 when x = n! for n >= 6. The alternating sum of the digits of x mod 11 equals x mod 11.
The problem is that if you want the sum of the digits of a large number, not modulo anything, the tricks from number theory run out pretty quickly. Adding up the digits of a number doesn't mesh well with addition and multiplication with carries. It's often difficult to promise that the math does not exist for a fast algorithm, but in this case I don't think that there is any known formula. For instance, I bet that no one knows the sum of the digits of a googol factorial, even though it is just some number with roughly 100 digits.
This is A004152 in the Online Encyclopedia of Integer Sequences. Unfortunately, it doesn't have any useful tips about how to calculate it efficiently - its Maple and Mathematica recipes take the naive approach.
I'd attack the second problem, to compute N! mod (N+1), using Wilson's theorem. That reduces the problem to testing whether N is prime.
Small, fast Python script found at http://www.penjuinlabs.com/blog/?p=44. It's elegant but still brute force.
import sys
from functools import reduce

for arg in sys.argv[1:]:
    n = int(arg)
    # 1 * 2 * ... * n is n!; sum the decimal digits of the result
    print(sum(int(d) for d in str(reduce(lambda x, y: x * y, range(1, n + 1), 1))))
The original run processed the six inputs 432, 951, 5436, 606, 14 and 9520 in about 1.25 seconds of real time.
Assume you have big numbers (this is the least of your problems, assuming that N is really big, and not 10000), and let's continue from there.
The trick below is to factor N! by factoring all n<=N, and then compute the powers of the factors.
Have a vector of counters, one counter for each prime number up to N, all set to 0. For each n <= N, factor n and increase the counters of its prime factors accordingly (factor smartly: start with the small primes, construct the primes while factoring, and remember that division by 2 is a shift). Then subtract the counter of 5 from the counter of 2, and set the counter of 5 to zero (nobody cares about factors of 10 here).
Compute all the primes up to N, then run the following loop:
for (j = 0; j < last_prime; ++j) {
    count[j] = 0;
    for (i = N / primes[j]; i; i /= primes[j])
        count[j] += i;
}
Note that in the previous block we only used (very) small numbers.
For each prime factor P you have to compute P to the power of the appropriate counter, that takes log(counter) time using iterative squaring; now you have to multiply all these powers of prime numbers.
All in all you have about N log(N) operations on small numbers (log N prime factors per n), and log(N) * log(log N) operations on big numbers.
and after the improvement in the edit, only N operations on small numbers.
HTH
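The whole scheme can be sketched in Python; the built-in pow does the iterated squaring, and the inner loop is the same counting loop as above (the function names are mine):

```python
from math import isqrt

def primes_upto(n):
    """Sieve of Eratosthenes; returns the primes up to n."""
    flags = bytearray([1]) * (n + 1)
    flags[0:2] = b"\x00\x00"
    for i in range(2, isqrt(n) + 1):
        if flags[i]:
            flags[i*i::i] = bytearray(len(flags[i*i::i]))
    return [i for i, f in enumerate(flags) if f]

def factorial_by_primes(n):
    """n! assembled from its prime factorization."""
    result = 1
    for p in primes_upto(n):
        e, q = 0, n // p
        while q:                 # Legendre's count: e = floor(n/p) + floor(n/p^2) + ...
            e += q
            q //= p
        result *= pow(p, e)      # pow uses iterated squaring internally
    return result

def digit_sum(x):
    return sum(int(d) for d in str(x))
```

For instance, digit_sum(factorial_by_primes(100)) gives 648, the well-known digit sum of 100!.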
1 second? Why can't you just compute n! and add up the digits? That's 10000 multiplications and no more than a few ten thousand additions, which should take approximately one zillionth of a second.
You have to compute the factorial.
1 * 2 * 3 * 4 * 5 = 120.
If you only want to calculate the sum of digits, you can ignore the ending zeroes.
For 6! you can do 12 x 6 = 72 instead of 120 * 6
For 7! you can use (72 * 7) MOD 10
EDIT.
I wrote a response too quickly...
10 is the result of two prime numbers 2 and 5.
Each time you have these 2 factors, you can ignore them.
1 * 2 * 3 * 4 * 5 * 6 * 7 * 8 * 9 * 10 * 11 * 12 * 13 * 14 * 15 ...
Written out as prime factors, that is:
1, 2, 3, 2*2, 5, 2*3, 7, 2*2*2, 3*3, 2*5, 11, 2*2*3, 13, 2*7, 3*5, ...
The factor 5 appears at 5, 10, 15, ..., so an ending zero will appear after multiplying by 5, 10, 15, ...
We have a lot of 2s and 3s... We'll overflow soon :-(
Then, you still need a library for big numbers.
I deserve to be downvoted!
Let's see. We know that the calculation of n! for any reasonably large number will eventually lead to a number with lots of trailing zeroes, which don't contribute to the sum. How about lopping off the zeroes along the way? That'd keep the size of the number a bit smaller?
Hmm. Nope. I just checked, and integer overflow is still a big problem even then...
Even without arbitrary-precision integers, this should be brute-forceable. In the problem statement you linked to, the biggest factorial that would need to be computed would be 1000!. This is a number with about 2500 digits. So just do this:
Allocate an array of 3000 bytes, with each byte representing one digit in the factorial. Start with a value of 1.
Run grade-school multiplication on the array repeatedly, in order to calculate the factorial.
Sum the digits.
Doing the repeated multiplications is the only potentially slow step, but I feel certain that 1000 of the multiplications could be done in a second, which is the worst case. If not, you could compute a few "milestone" values in advance and just paste them into your program.
One potential optimization: Eliminate trailing zeros from the array when they appear. They will not affect the answer.
OBVIOUS NOTE: I am taking a programming-competition approach here. You would probably never do this in professional work.
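Those three steps can be sketched in Python, using a list of decimal digits (least significant first) instead of a byte array; the trailing-zero optimization is left out for clarity:

```python
def factorial_digit_sum(n):
    """Grade-school multiplication on an array of base-10 digits, then sum the digits."""
    digits = [1]                       # least significant digit first
    for m in range(2, n + 1):
        carry = 0
        for i, d in enumerate(digits):
            carry, digits[i] = divmod(d * m + carry, 10)
        while carry:                   # append whatever the carry spills over
            carry, r = divmod(carry, 10)
            digits.append(r)
    return sum(digits)
```

factorial_digit_sum(100) gives 648, the well-known digit sum of 100!.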
Another solution, using BigInteger:
static long q20() {
    long sum = 0;
    String factorial = factorial(new BigInteger("100")).toString();
    for (int i = 0; i < factorial.length(); i++) {
        sum += factorial.charAt(i) - '0';
    }
    return sum;
}

static BigInteger factorial(BigInteger n) {
    BigInteger one = new BigInteger("1");
    if (n.equals(one)) return one;
    return n.multiply(factorial(n.subtract(one)));
}
