Optimize Haskell function with a huge number of power function calls - performance

The following function finds a number n such that 1^3 + 2^3 + ... + (n-1)^3 + n^3 = m. Is there any chance this function can be optimized for speed?
findNb :: Integer -> Integer
findNb m = findNb' 1 0
  where
    findNb' n m' =
      if m' == m then n - 1
      else if m' < m then findNb' (n + 1) (m' + n^3)
      else -1
I know there is a faster solution using a math formula.
The reason I'm asking is that a similar implementation in JavaScript / C# seems far faster than the Haskell one. I'm just curious whether it can be optimized. Thanks.
EDIT1: Add more evidence on the run time
Haskell Version:
With main = print (findNb2 152000000000000000000000):
Compile with -O2 and profiling: ghc -o testo2.exe -O2 -prof -fprof-auto -rtsopts pileofcube.hs. Here is total time from profiling report:
total time = 0.19 secs (190 milliseconds) (190 ticks @ 1000 us, 1 processor)
Compile with -O2 but no profiling: ghc -o testo22.exe -O2 pileofcube.hs. Run it with Measure-Command {./testo22.exe} in powershell. The result is:
Milliseconds : 157
JavaScript Version:
Code:
function findNb(m) {
    let n = 0;
    let sum = 0;
    while (sum < m) {
        n++;
        sum += Math.pow(n, 3);
    }
    return sum === m ? n : -1;
}
var d1 = new Date();
findNb(152000000000000000000000);
console.log(new Date() - d1);
Result: 45 milliseconds running in Chrome on the same machine
EDIT2: Add C# Version
As @Berji and @Bakuriu commented, comparing to the JavaScript version above is not fair, as it uses double-precision floating point numbers under the hood and cannot even give the correct answer. So I implemented it in C#; here is the code and the result:
static void Main(string[] args)
{
    BigInteger m = BigInteger.Parse("152000000000000000000000");
    var s = new Stopwatch();
    s.Start();
    long n = 0;
    BigInteger sum = 0;
    while (sum < m)
    {
        n++;
        sum += BigInteger.Pow(n, 3);
    }
    Console.WriteLine(sum == m ? n : -1);
    s.Stop();
    Console.WriteLine($"Escaped Time: {s.ElapsedMilliseconds} milliseconds.");
}
Result: Escaped Time: 457 milliseconds.
Conclusion
The Haskell version is faster than the C# one...
I was wrong at the start because I didn't realize JavaScript uses double-precision floating point numbers under the hood, due to my poor JavaScript knowledge.
At this point it seems the question does not make sense anymore...

Haskell too can use Double to get the wrong answer in less time:
% time ./so
./so 0.03s user 0.00s system 95% cpu 0.038 total
And Javascript too can get the correct result via npm-installing big-integer and using bigInt everywhere instead of Double:
% node so.js
^C
node so.js 35.62s user 0.30s system 93% cpu 38.259 total
... or maybe it isn't as trivial as that.

EDIT: I realized afterwards that's not what the author of the question wanted. I'll keep it here in case someone wants to know the formula in question, but otherwise please disregard.
There is indeed a formula that lets you compute this in constant time (rather than n iterations). Since I couldn't remember the exact formula from school, I did a bit of searching, and here it is: https://proofwiki.org/wiki/Sum_of_Sequence_of_Cubes.
In haskell code, that would translate to
findNb n = n ^ 2 * (n + 1) ^ 2 `div` 4
which I believe should be much faster.
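If the goal is to go the other way (recover n from m), the closed form can be inverted with an integer square root: since 1^3 + ... + n^3 = (n(n+1)/2)^2, take s = sqrt m and then solve n(n+1)/2 = s. Below is a minimal sketch of that idea; isqrt and findNbClosed are names made up here, and the Newton-style isqrt is just one convenient way to get an exact integer square root.

-- Integer square root by Newton's method (illustrative only).
isqrt :: Integer -> Integer
isqrt 0 = 0
isqrt x = go x
  where
    go g
      | g * g <= x && (g + 1) * (g + 1) > x = g
      | otherwise                           = go ((g + x `div` g) `div` 2)

-- Invert 1^3 + ... + n^3 = (n*(n+1) `div` 2)^2 = m; returns -1 if m is not such a sum.
findNbClosed :: Integer -> Integer
findNbClosed m
  | s * s /= m           = -1     -- m must be a perfect square
  | n * (n + 1) == 2 * s = n      -- and s must be the n-th triangular number
  | otherwise            = -1
  where
    s = isqrt m
    n = (isqrt (8 * s + 1) - 1) `div` 2

For example, findNbClosed 1071225 should give 45, since (45*46/2)^2 = 1035^2 = 1071225.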

Not sure if this wording of that algorithm is faster, but try this?
findNb :: Integer -> Integer
findNb m = fromIntegral $ length $ takeWhile (<= m) $ scanl1 (+) [n^3 | n <- [1..]]
(This has different semantics in the undefined case, though.)

Related

Algorithm for separating integer into a sum of products of single digit numbers? [duplicate]

A couple of days ago I played around with Befunge, which is an esoteric programming language. Befunge uses a LIFO stack to store data. When you write programs, the digits from 0 to 9 are actually Befunge instructions which push the corresponding values onto the stack. So for example this would push a 7 onto the stack:
34+
In order to push a number greater than 9, calculations must be done with numbers less than or equal to 9. This would yield 123.
99*76*+
While solving Euler Problem 1 with Befunge I had to push the fairly large number 999 to the stack. Here I began to wonder how I could accomplish this task with as few instructions as possible. By writing a term down in infix notation and taking out common factors I came up with
9993+*3+*
One could also simply multiply two two-digit numbers which produce 999, e.g.
39*66*1+*
I thought about this for while and then decided to write a program which puts out the smallest expression according to these rules in reverse polish notation for any given integer. This is what I have so far (written in NodeJS with underscorejs):
var makeExpr = function (value) {
if (value < 10) return value + "";
var output = "", counter = 0;
(function fn (val) {
counter++;
if(val < 9) { output += val; return; };
var exp = Math.floor(Math.log(val) / Math.log(9));
var div = Math.floor(val / Math.pow(9, exp));
_( exp ).times(function () { output += "9"; });
_(exp-1).times(function () { output += "*"; });
if (div > 1) output += div + "*";
fn(val - Math.pow(9, exp) * div);
})(value);
_(counter-1).times(function () { output+= "+"; });
return output.replace(/0\+/, "");
};
makeExpr(999);
// yields 999**99*3*93*++
This piece of code constructs the expression naively and is obviously way too long. Now my questions:
Is there an algorithm to simplify expressions in reverse polish notation?
Would simplification be easier in infix notation?
Can an expression like 9993+*3+* be proven to be the smallest one possible?
I hope you can give some insights. Thanks in advance.
When only considering multiplication and addition, it's pretty easy to construct optimal formulas, because that problem has the optimal substructure property. That is, the optimal way to build [num1][num2]op is from num1 and num2 that are both also optimal. If duplication is also considered, that's no longer true.
The num1 and num2 give rise to overlapping subproblems, so Dynamic Programming is applicable.
We can simply, for a number i:
For every 1 < j <= sqrt(i) that evenly divides i, try [j][i / j]*
For every 0 < j < i/2, try [j][i - j]+
Take the best found formula
That is of course very easy to do bottom-up, just start at i = 0 and work your way up to whatever number you want. Step 2 is a little slow, unfortunately, so after say 100000 it starts to get annoying to wait for it. There might be some trick that I'm not seeing.
Code in C# (not tested super well, but it seems to work):
string[] n = new string[10000];
for (int i = 0; i < 10; i++)
n[i] = "" + i;
for (int i = 10; i < n.Length; i++)
{
int bestlen = int.MaxValue;
string best = null;
// try factors
int sqrt = (int)Math.Sqrt(i);
for (int j = 2; j <= sqrt; j++)
{
if (i % j == 0)
{
int len = n[j].Length + n[i / j].Length + 1;
if (len < bestlen)
{
bestlen = len;
best = n[j] + n[i / j] + "*";
}
}
}
// try sums
for (int j = 1; j < i / 2; j++)
{
int len = n[j].Length + n[i - j].Length + 1;
if (len < bestlen)
{
bestlen = len;
best = n[j] + n[i - j] + "+";
}
}
n[i] = best;
}
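For comparison, the same bottom-up DP fits naturally in a lazy Haskell array; this is only an illustrative sketch (shortestExprs is a made-up name, and nothing here is tuned):

import Data.Array (Array, listArray, (!))
import Data.List (minimumBy)
import Data.Ord (comparing)

-- (shortestExprs limit) ! i is a shortest RPN expression for i built from digits, + and *.
shortestExprs :: Int -> Array Int String
shortestExprs limit = table
  where
    table = listArray (0, limit) (map build [0 .. limit])
    build i
      | i < 10    = show i
      | otherwise = minimumBy (comparing length) (products ++ sums)
      where
        isqrt    = floor (sqrt (fromIntegral i :: Double))
        products = [ table ! j ++ table ! (i `div` j) ++ "*"
                   | j <- [2 .. isqrt], i `mod` j == 0 ]
        sums     = [ table ! j ++ table ! (i - j) ++ "+"
                   | j <- [1 .. i `div` 2] ]

putStrLn (shortestExprs 999 ! 999) then prints one optimal expression (length 9 for 999, per the discussion above).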
Here's a trick to optimize searching for the sums. Suppose there is an array that contains, for every length, the highest number that can be made with that length. Another thing this array gives us, perhaps less obviously, is a quick way to determine the shortest number that is bigger than some threshold (by simply scanning through the array and noting the first position that crosses the threshold). Together, that gives a quick way to discard huge portions of the search space.
For example, the biggest number of length 3 is 81 and the biggest number of length 5 is 728. Now if we want to know how to get 1009 (prime, so no factors found), first we try the sums where the first part has length 1 (so 1+1008 through 9+1000), finding 9+1000 which is 9 characters long (95558***+).
The next step, checking the sums where the first part has length 3 or less, can be skipped completely. 1009 - 81 = 929, and 929 (the lowest that the second part of the sum can be if the first part is to be 3 characters or less) is bigger than 728 so numbers of 929 and over must be at least 7 characters long. So if the first part of the sum is 3 characters, the second part must be at least 7 characters, and then there's also a + sign on the end, so the total is at least 11 characters. The best so far was 9, so this step can be skipped.
The next step, with 5 characters in the first part, can also be skipped, because 1009 - 728 = 280, and to make 280 or higher we need at least 5 characters. 5 + 5 + 1 = 11, bigger than 9, so don't check.
Instead of checking about 500 sums, we only had to check 9 this way, and the check to make the skipping possible is very quick. This trick is good enough that generating all numbers up to a million only takes 3 seconds on my PC (before, it would take 3 seconds to get to 100000).
Here's the code:
string[] n = new string[100000];
int[] biggest_number_of_length = new int[n.Length];
for (int i = 0; i < 10; i++)
n[i] = "" + i;
biggest_number_of_length[1] = 9;
for (int i = 10; i < n.Length; i++)
{
int bestlen = int.MaxValue;
string best = null;
// try factors
int sqrt = (int)Math.Sqrt(i);
for (int j = 2; j <= sqrt; j++)
{
if (i % j == 0)
{
int len = n[j].Length + n[i / j].Length + 1;
if (len < bestlen)
{
bestlen = len;
best = n[j] + n[i / j] + "*";
}
}
}
// try sums
for (int x = 1; x < bestlen; x += 2)
{
int find = i - biggest_number_of_length[x];
int min = int.MaxValue;
// find the shortest number that is >= (i - biggest_number_of_length[x])
for (int k = 1; k < biggest_number_of_length.Length; k += 2)
{
if (biggest_number_of_length[k] >= find)
{
min = k;
break;
}
}
// if that number wasn't small enough, it's not worth looking in that range
if (min + x + 1 < bestlen)
{
// range [find .. i] isn't optimal
for (int j = find; j < i; j++)
{
int len = n[i - j].Length + n[j].Length + 1;
if (len < bestlen)
{
bestlen = len;
best = n[i - j] + n[j] + "+";
}
}
}
}
// found
n[i] = best;
biggest_number_of_length[bestlen] = i;
}
There's still room for improvement. This code will re-check sums that it has already checked. There are simple ways to make it at least not check the same sum twice (by remembering the last find), but that made no significant difference in my tests. It should be possible to find a better upper bound.
There's also 93*94*1+*, which is basically 27*37.
Were I to attack this problem, I'd start by first trying to evenly divide the number. So given 999 I would divide by 9 and get 111. Then I'd try to divide by 9, 8, 7, etc. until I discovered that 111 is 3*37.
37 is prime, so I go greedy and divide by 9, giving me 4 with a remainder of 1.
That seems to give me optimum results for the half dozen I've tried. It's a little expensive, of course, testing for even divisibility. But perhaps not more expensive than generating a too-long expression.
Using this, 100 becomes 55*4*. 102 works out to 29*5*6+.
101 brings up an interesting case. 101/9 = (9*11) + 2. Or, alternately, (9*9)+20. Let's see:
983+*2+ (9*11) + 2
99*45*+ (9*9) + 20
Whether it's easier to generate the postfix directly or generate infix and convert, I really don't know. I can see benefits and drawbacks to each.
Anyway, that's the approach I'd take: try to divide evenly at first, and then be greedy dividing by 9. Not sure exactly how I'd structure it.
I'd sure like to see your solution once you figure it out.
Edit
This is an interesting problem. I came up with a recursive function that does a credible job of generating postfix expressions, but it's not optimum. Here it is in C#.
string GetExpression(int val)
{
if (val < 10)
{
return val.ToString();
}
int quo, rem;
// first see if it's evenly divisible
for (int i = 9; i > 1; --i)
{
quo = Math.DivRem(val, i, out rem);
if (rem == 0)
{
// If val < 90, then only generate here if the quotient
// is a one-digit number. Otherwise it can be expressed
// as (9 * x) + y, where x and y are one-digit numbers.
if (val >= 90 || (val < 90 && quo <= 9))
{
// value is (i * quo)
return i + GetExpression(quo) + "*";
}
}
}
quo = Math.DivRem(val, 9, out rem);
// value is (9 * quo) + rem
// optimization reduces (9 * 1) to 9
var s1 = "9" + ((quo == 1) ? string.Empty : GetExpression(quo) + "*");
var s2 = GetExpression(rem) + "+";
return s1 + s2;
}
For 999 it generates 9394*1+**, which I believe is optimum.
This generates optimum expressions for values <= 90. Every number from 0 to 90 can be expressed as the product of two one-digit numbers, or by an expression of the form (9x + y), where x and y are one-digit numbers. However, I don't know that this guarantees an optimum expression for values greater than 90.
There are 44 solutions for 999 with length 9:
39149*+**
39166*+**
39257*+**
39548*+**
39756*+**
39947*+**
39499**+*
39669**+*
39949**+*
39966**+*
93149*+**
93166*+**
93257*+**
93548*+**
93756*+**
93947*+**
93269**+*
93349**+*
93366**+*
93439**+*
93629**+*
93636**+*
93926**+*
93934**+*
93939+*+*
93948+*+*
93957+*+*
96357**+*
96537**+*
96735**+*
96769+*+*
96778+*+*
97849+*+*
97858+*+*
97867+*+*
99689+*+*
956*99*+*
968*79*+*
39*149*+*
39*166*+*
39*257*+*
39*548*+*
39*756*+*
39*947*+*
Edit:
I have been working on some search-space pruning improvements, so sorry I have not posted it immediately. There is a script in Erlang. The original one takes 14s for 999, but this one does it in around 190ms.
Edit2:
There are 1074 solutions of length 13 for 9999. It takes 7 minutes, and some of them are below:
329+9677**+**
329+9767**+**
338+9677**+**
338+9767**+**
347+9677**+**
347+9767**+**
356+9677**+**
356+9767**+**
3147789+***+*
31489+77***+*
3174789+***+*
3177489+***+*
3177488*+**+*
There is a version in C with more aggressive pruning of the state space that returns only one solution. It is way faster.
$ time ./polish_numbers 999
Result for 999: 39149*+**, length 9
real 0m0.008s
user 0m0.004s
sys 0m0.000s
$ time ./polish_numbers 99999
Result for 99999: 9158*+1569**+**, length 15
real 0m34.289s
user 0m34.296s
sys 0m0.000s
harold reported that his C# brute-force version makes the same number in 20s, so I was curious whether I could improve mine. I tried better memory utilization by refactoring the data structure. The searching algorithm mostly works with the length of a solution and its existence, so I separated this information into one structure (best_rec_header). I also kept the solution as tree branches, separated into another structure (best_rec_args). Those data are used only when a new better solution is found for a given number. Here is the code.
Result for 99999: 9158*+1569**+**, length 15
real 0m31.824s
user 0m31.812s
sys 0m0.012s
It was still too slow. So I tried some other versions. First I added some statistics to demonstrate that my code is not computing all smaller numbers.
Result for 99999: 9158*+1569**+**, length 15, (skipped 36777, computed 26350)
Then I tried changing the code to compute + solutions for bigger numbers first.
Result for 99999: 1956**+9158*+**, length 15, (skipped 0, computed 34577)
real 0m17.055s
user 0m17.052s
sys 0m0.008s
It was almost twice as fast. But there was another idea: maybe I sometimes give up finding a solution for some number because of the current best_len limit. So I tried to make small numbers (up to half of n) unlimited (note 255 as the best_len limit when finding the first operand).
Result for 99999: 9158*+1569**+**, length 15, (skipped 36777, computed 50000)
real 0m12.058s
user 0m12.048s
sys 0m0.008s
A nice improvement, but what if I limit solutions for those numbers by the best solution found so far? It needs some sort of global computation state. The code becomes more complicated, but the result is even faster.
Result for 99999: 97484777**+**+*, length 15, (skipped 36997, computed 33911)
real 0m10.401s
user 0m10.400s
sys 0m0.000s
It was even able to compute a ten times bigger number.
Result for 999999: 37967+2599**+****, length 17, (skipped 440855)
real 12m55.085s
user 12m55.168s
sys 0m0.028s
Then I decided to also try the brute force method, and this was even faster.
Result for 99999: 9158*+1569**+**, length 15
real 0m3.543s
user 0m3.540s
sys 0m0.000s
Result for 999999: 37949+2599**+****, length 17
real 5m51.624s
user 5m51.556s
sys 0m0.068s
Which shows that constants matter. It is especially true for modern CPUs, where the brute force approach gets an advantage from better vectorization, better CPU cache utilization and less branching.
Anyway, I think there is some better approach using a better understanding of number theory, or searching the space with algorithms such as A* and so on. And for really big numbers it may be a good idea to use genetic algorithms.
Edit3:
harold came up with a new idea to eliminate trying too many sums. I have implemented it in this new version. It is an order of magnitude faster.
$ time ./polish_numbers 99999
Result for 99999: 9158*+1569**+**, length 15
real 0m0.153s
user 0m0.152s
sys 0m0.000s
$ time ./polish_numbers 999999
Result for 999999: 37949+2599**+****, length 17
real 0m3.516s
user 0m3.512s
sys 0m0.004s
$ time ./polish_numbers 9999999
Result for 9999999: 9788995688***+***+*, length 19
real 1m39.903s
user 1m39.904s
sys 0m0.032s
Don't forget, you can also push ASCII values!!
Usually, this is longer, but for higher numbers it can get much shorter:
If you needed the number 123, it would be much better to do
"{" than 99*76*+

Efficiency in Haskell when counting primes

I have the following set of functions to count the number of primes less than or equal to a number n in Haskell.
The algorithm takes a number, checks if it is divisible by two, and then checks if it is divisible by odd numbers up to the square root of the number being checked.
-- is a number, n, prime?
isPrime :: Int -> Bool
isPrime n = n > 1 &&
            foldr (\d r -> d * d > n || (n `rem` d /= 0 && r))
                  True divisors

-- list of divisors for which to test primality
divisors :: [Int]
divisors = 2:[3,5..]

-- pi(n) - the prime counting function, the number of prime numbers <= n
primesNo :: Int -> Int
primesNo 2 = 1
primesNo n
  | isPrime n = 1 + primesNo (n-1)
  | otherwise = 0 + primesNo (n-1)

main = print $ primesNo (2^22)
Using GHC with the -O2 optimisation flag, counting the number of primes for n = 2^22 takes ~3.8 sec on my system. The following C code takes ~0.8 sec:
#include <stdio.h>
#include <math.h>
/*
compile with: gcc -std=c11 -lm -O2 c_primes.c -o c_orig
*/
int isPrime(int n) {
if (n < 2)
return 0;
else if (n == 2)
return 1;
else if (n % 2 == 0)
return 0;
int uL = sqrt(n);
int i = 3;
while (i <= uL) {
if (n % i == 0)
return 0;
i+=2;
}
return 1;
}
int main() {
int noPrimes = 0, limit = 4194304;
for (int n = 0; n <= limit; n++) {
if (isPrime(n))
noPrimes++;
}
printf("Number of primes in the interval [0,%d]: %d\n", limit, noPrimes);
return 0;
}
This algorithm takes about 0.9 sec in Java and 1.8 sec in JavaScript (on Node), so it just feels like the Haskell version is slower than I would expect it to be. Is there any way I can code this more efficiently in Haskell without changing the algorithm?
EDIT
The following version of isPrime, offered by @dfeuer, shaves one second off the running time, taking it down to 2.8 sec (down from 3.8). Though this is still slower than JavaScript (Node), which takes approx 1.8 sec as shown here: Yet Another Language Speed Test.
isPrime :: Int -> Bool
isPrime n
  | n <= 2    = n == 2
  | otherwise = odd n && go 3
  where
    go factor
      | factor * factor > n = True
      | otherwise           = n `rem` factor /= 0 && go (factor+2)
EDIT
In the above isPrime function, go computes factor * factor for each divisor tried against a single n. I would imagine that it would be more efficient to compare factor to the square root of n, as this would only have to be calculated once per n. However, using the following code, computation time increases by approximately 10%. Is the square root of n being re-calculated every time the inequality is evaluated (for each factor)?
isPrime :: Int -> Bool
isPrime n
  | n <= 2    = n == 2
  | otherwise = odd n && go 3
  where
    go factor
      | factor > upperLim = True
      | otherwise         = n `rem` factor /= 0 && go (factor+2)
      where
        upperLim = (floor.sqrt.fromIntegral) n
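(For what it's worth, one way to make sure the bound is computed only once per number is to bind it in isPrime's where clause rather than next to go; a sketch is below. Whether this actually changes anything depends on how GHC floats the binding under -O2.)

isPrime :: Int -> Bool
isPrime n
  | n <= 2    = n == 2
  | otherwise = odd n && go 3
  where
    upperLim = floor (sqrt (fromIntegral n :: Double))  -- computed once per n
    go factor
      | factor > upperLim = True
      | otherwise         = n `rem` factor /= 0 && go (factor + 2)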
I urge you to use a different algorithm, such as the Sieve of Eratosthenes discussed in the paper by Melissa O'Neill, or the version used in Math.NumberTheory.Primes from the arithmoi package, which also offers an optimized prime counting function. However, this might get you better constant factors:
-- is a number, n, prime?
isPrime :: Int -> Bool
isPrime n
  | n <= 2    = n == 2
  | otherwise = odd n &&   -- Put the 2 here instead
      foldr (\d r -> d * d > n || (n `rem` d /= 0 && r))
            True divisors

-- list of divisors for which to test primality
divisors :: [Int]
{-# INLINE divisors #-}  -- No guarantee, but it might possibly inline and stay inlined,
                         -- so the numbers will be generated on each call instead of
                         -- being pulled in (expensively) from RAM.
divisors = [3,5..]  -- No more 2:
The reason to get rid of the 2: is that an optimization called "foldr/build fusion", "short cut deforestation", or just "list fusion" can, potentially, make your divisors list go away, but, at least with GHC < 7.10.1, that 2: will block the optimization.
Edit: it seems that's not working for you, so here's something else to try:
isPrime n
  | n <= 2    = n == 2
  | otherwise = odd n && go 3
  where
    go factor
      | factor * factor > n = True
      | otherwise           = n `rem` factor /= 0 && go (factor+2)
In general I've found that looping in Haskell is about 3-4 times slower than what can be accomplished with C.
To help understand the performance difference I slightly modified the
programs so that a fixed number of divisor tests are made per iteration
and added a parameter e to control how many iterations are made -
the number of (outer) iterations performed is 2^e. For each outer iteration
approx. 2^11 divisor tests are made.
The source code for each program and scripts to run and analyze the
results may be found here: https://github.com/erantapaa/loopbench
Pull-requests to improve the benchmarking are welcome.
Here are the results I get on a 2.4 GHz Intel Core 2 Duo using ghc 7.8.3 (under OSX). The gcc used was "Apple LLVM version 6.0 (clang-600.0.56) (based on LLVM 3.5svn)".
 e   ctime    htime    allocated  gc-bytes  alloc/iter   h/c    dns
10   0.0101   0.0200      87424      3408               1.980   4.61
11   0.0151   0.0345     112000      3408               2.285   4.51
12   0.0263   0.0700     161152      3408               2.661   5.09
13   0.0472   0.1345     259456      3408               2.850   5.08
14   0.0819   0.2709     456200      3408               3.308   5.50
15   0.1575   0.5382     849416      9616               3.417   5.54
16   0.3112   1.0900    1635848     15960               3.503   5.66
17   0.6105   2.1682    3208848     15984               3.552   5.66
18   1.2167   4.3536    6354576     16032      24.24    3.578   5.70
19   2.4092   8.7336   12646032     16128      24.12    3.625   5.75
20   4.8332  17.4109   25229080     16320      24.06    3.602   5.72
e = exponent parameter
ctime = running time of the C program
htime = running time of the Haskell program
allocated = bytes allocated in the heap (Haskell program)
gc-bytes = bytes copied during GC (Haskell program)
alloc/iter = bytes allocated in the heap / 2^e
h / c = htime divided by ctime
dns = (htime - ctime) divided by the number of divisor tests made
in nanoseconds
# divisor tests made = 2^e * 2^11
Some observations:
The Haskell program performs heap allocation at a rate of about 24 bytes per (outer) loop iteration. The C program clearly does not perform any allocation and runs completely in L1 cache.
The gc-bytes count remains constant for e between 10 and 14 because no garbage collections were performed for those runs.
The time ratio h/c gets progressively worse as more allocations are made.
dns is a measure of the extra time the Haskell program takes per divisor test; it increases with the total amount of allocation made. Also there are some plateaus, which suggests this is due to cache effects.
It is well known that GHC does not produce the same tight loop code that
a C compiler produces. The penalty you pay is approx. 4.6 ns per iteration.
Moreover, it looks like Haskell is also affected by cache effects due to
heap allocation.
24 bytes per allocation and 5 ns per loop iteration is not a lot for
most programs, but when you have 2^20 allocations and 2^40 loop iterations
it becomes a factor.
The C code uses 32-bit integers, while the Haskell code uses 64-bit integers.
The original C code runs in 0.63 secs on my computer. However, if I replace the int-s with long-s, it runs in 2.07 seconds with gcc and 2.17 secs with clang.
In comparison, the updated isPrime function (see it in the thread question) runs in 2.09 seconds (with -O2 and -fllvm). Note that this is slightly better than the clang-compiled C code, even though they use the same LLVM code generator.
The original Haskell code runs in 3.2 secs, which I think is an acceptable overhead for the convenience of using lists for iteration.
Inline everything, lose the superfluous tests, add strictness annotations just to be sure:
{-# LANGUAGE BangPatterns #-}

-- pi(n) - the prime counting function, the number of prime numbers <= n
primesNo :: Int -> Int
primesNo n
  | n < 2     = 0
  | otherwise = g 3 1
  where
    g k !cnt
      | k > n     = cnt
      | go 3      = g (k+2) (cnt+1)
      | otherwise = g (k+2) cnt
      where
        go f
          | f*f > k   = True
          | otherwise = k `rem` f /= 0 && go (f+2)

main = print $ primesNo (2^22)
The go testing function is as in dfeuer's answer. Compile with -O2 as usual, and always test by running a standalone executable (with something like > test +RTS -s).
Calls to g can be made direct (that's really micro-optimizing it):
primesNo n
  | n < 2     = 0
  | otherwise = g 3 1
  where
    g k !cnt
      | k > n     = cnt
      | otherwise = go 3
      where
        go f
          | f*f > k        = g (k+2) (cnt+1)
          | k `rem` f == 0 = g (k+2) cnt
          | otherwise      = go (f+2)
A more substantial change (still keeping the algorithm arguably the same), which might or might not speed it up, is to turn it inside out to spare the squares computations: test by [3] all odds from 9 to 23, by [3,5] all odds from 25 to 47, etc., along the lines of this segmented code:
import Data.List (inits)

primesNo n = length (takeWhile (<= n) $ 2 : oddprimes)
  where
    oddprimes = sieve 3 9 [3,5..] (inits [3,5..])
    sieve x q ~(_:t) (fs:ft) =
      filter ((`all` fs) . ((/=0).) . rem) [x,x+2..q-2]
      ++ sieve (q+2) (head t^2) t ft
Sometimes tweaking your code into using and instead of all changes the speed too. Further speedup might be attempted by inlining and simplifying everything (replace length with counting etc.).
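For instance, the all-based test in the segmented code above can be spelled with and over a list comprehension; just a sketch of the kind of tweak meant here:

-- equivalent to filter ((`all` fs) . ((/=0).) . rem), written with `and`
noDivisorIn :: Integral a => [a] -> a -> Bool
noDivisorIn fs x = and [x `rem` f /= 0 | f <- fs]

-- ... filter (noDivisorIn fs) [x, x+2 .. q-2] ...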

to calculate one million prime numbers

I have got one question: to print one million prime numbers. I have written a Java program for that. It's currently taking approx 1.5 minutes to calculate. I think my solution is not that efficient. I have used the below algo:
Adding 1 2 3 to the prime list initially
Calculating the last digit of the number to be checked
Checking if the digit is 0 , 2 or 4 or 6 or 8 then skipping the number
else calculating the square root of the number ..
Trying to Divide the number starting from 2 till the square root of the number
if number is divisible then skipping the number else adding it to the prime list
I have read several other solutions as well, but I didn't find a good answer. Please suggest what the approximate minimum time to calculate this should ideally be, and what changes are required to make the algorithm more efficient.
If you added 1 to your list, your answer is wrong already :)
Anyway, the Sieve of Eratosthenes is where you should begin; it's incredibly simple and quite efficient.
Once you're familiar with the idea of sieves and how they work, you can move on to Sieve of Atkin, which is a bit more complicated but obviously more efficient.
Key things:
Skip all even numbers. Start with 5, and just add two at a time.
1 isn't a prime number...
Test a number by finding the mod of all prime numbers up to the square root of the number. No need to test anything but primes (see the sketch below).
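A compact rendering of those three points in Haskell (trial division by the previously found primes only, odd candidates only) might look like the sketch below; it is correct but noticeably slower than a real sieve, so treat it purely as an illustration:

primes :: [Int]
primes = 2 : 3 : filter isPrime [5, 7 ..]
  where
    -- test a candidate only against primes up to its square root
    isPrime n = all (\p -> n `rem` p /= 0)
                    (takeWhile (\p -> p * p <= n) primes)

-- primes !! 999999 is the millionth prime, 15485863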
A simple sieve of Eratosthenes runs like the clappers. This calculates the 1,000,000th prime in less than a second on my box:
class PrimeSieve
{
public List<int> Primes;
private BitArray Sieve;
public PrimeSieve(int max)
{
Primes = new List<int> { 2, 3 }; // Must include at least 2, 3.
Sieve = new BitArray(max + 1);
foreach (var p in Primes)
for (var i = p * p; i < Sieve.Length; i += p) Sieve[i] = true;
}
public int Extend()
{
var p = Primes.Last() + 2; // Skip the even numbers.
while (Sieve[p]) p += 2;
for (var i = p * p; i < Sieve.Length; i += p) Sieve[i] = true;
Primes.Add(p);
return p;
}
}
EDIT: sieving optimally starts from p^2, not 2p, as Will Ness correctly points out (all composite numbers below p^2 will have been marked in earlier iterations).
You might want to implement the Sieve of Eratosthenes algorithm to find prime numbers from 1 to n, and iteratively increase the range while you are doing it if needed (i.e. you did not find 1,000,000 primes yet).
First, 1 is not a prime number.
Second, the millionth prime is 15,485,863, so you need to be prepared for some large data-handling.
Third, you probably want to use the Sieve of Eratosthenes; here's a simple version:
function sieve(n)
    bits := makeArray(0..n, True)
    for p from 2 to n step 1
        if bits[p]
            output p
            for i from p*p to n step p
                bits[i] := False
That may not work for the size of array that you will need to calculate the first million primes. In that case, you will want to implement a Segmented Sieve of Eratosthenes.
I've done a lot of work with prime numbers at my blog, including an essay that provides an optimized Sieve of Eratosthenes, with implementations in five programming languages.
No matter what you do, with any programming language, you should be able to compute the first million primes in no more than a few seconds.
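As a rough illustration of the pseudocode sieve above (the plain, non-segmented version), here is a direct array-based rendering in Haskell; the names and the use of accumArray are assumptions of this sketch, not part of the answer:

import Data.Array.Unboxed (UArray, accumArray, assocs)

-- Mark every multiple p*p, p*p+p, ... <= n as composite, then read off the primes.
primesUpTo :: Int -> [Int]
primesUpTo n = [p | (p, False) <- assocs composite]
  where
    composite :: UArray Int Bool
    composite = accumArray (\_ _ -> True) False (2, n)
                  [ (m, ()) | p <- [2 .. isqrt n], m <- [p * p, p * p + p .. n] ]
    isqrt = floor . sqrt . (fromIntegral :: Int -> Double)

-- length (primesUpTo 15485863) should come out to exactly one million.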
Here's an OCaml program that implements the Trial division sieve (which is sort of the inverse of Eratosthenes, as correctly pointed out by Will):
(* Creates a function for streaming integers from x onward *)
let stream x =
let counter = ref (x) in
fun () ->
let _ = counter := !counter + 1 in
!counter;;
(* Filter the given stream of any multiples of x *)
let filter s x = fun () ->
let rec filter' () = match s () with
n when n mod x = 0 ->
filter' ()|
n ->
n in
filter' ();;
(* Get next prime, apply a new filter by that prime to the remainder of the stream *)
let primes count =
let rec primes' count' s = match count' with
0 ->
[]|
_ ->
let n = s () in
n :: primes' (count' - 1) (filter s n) in
primes' count (stream 1);;
It works on a stream of integers. Each time a new prime number is discovered, a filter is added to the stream so that the remainder of the stream gets filtered of any multiples of that prime number. This program can be altered to generate prime numbers on-demand as well.
It should be fairly easy to take the same approach in Java.
Hope this helps!
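The same filter-a-stream idea translates almost directly into Haskell with lazy lists; again only a sketch, since this elegant formulation is known to be slow for something like a million primes:

primes :: [Integer]
primes = go [2 ..]
  where
    -- each newly found prime adds one more filter over the rest of the stream
    go (p : xs) = p : go [x | x <- xs, x `mod` p /= 0]
    go []       = []

-- take 10 primes == [2,3,5,7,11,13,17,19,23,29]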
Here's a JavaScript solution that uses recursion and iteration to reach the millionth prime. It's not as fast as the Sieve of Eratosthenes, but does not require one to know the value of the millionth prime (i.e., the size of the required sieve) in advance:
function findPrimes(n, current, primes) {
if (!n || current < 2) return []
var isPrime = true
for (var i = 0; i < primes.length; i++) {
if (current % primes[i] == 0) {
isPrime = false
break
}
}
if (isPrime) primes.push(current)
if (primes.length < n) return findPrimes(n, current + 1, primes)
else return primes
}
var primes = [2,3]
for (var i = 1; i <= 1000; i++) {
primes = findPrimes(i*1000, primes[primes.length - 1]+1, primes)
console.log(i*1000 + 'th prime: ' + primes[primes.length-1])
}
process.exit()
Output:
...
996000th prime: 15419293
997000th prime: 15435941
998000th prime: 15452873
999000th prime: 15469313
1000000th prime: 15485863
Process finished with exit code 0
At a fresher level I will try this one, so any improvement to make it more efficient and faster is appreciated:
public static void main(String ar[]) {
ArrayList primeNumbers = new ArrayList();
for(int i = 2; primeNumbers.size() < 1000000; i++) {//first 1 million prime number
// for(int i = 2; i < 1000000; i++) {//prime numbers from 1 to 1 million
boolean divisible = false;
for(int j=2;j<i/2;j++){
if((i % j) == 0) {
divisible = true;
break;
}
}
if(divisible == false) {
primeNumbers.add(i);
// System.out.println(i + " ");
}
}
System.out.println(primeNumbers);
}
Adding 1 2 3 to the prime list initially
Actually, just 2 is sufficient. Hard-coding 3 might save, at most, a millisecond. There's no need to harp on 1. I am convinced that including it was an honest mistake. You already knew, and working on this program would have helped confirm this.
Calculating the last digit of the number to be checked
The last digit? In what base? Base 10? I think this might be your problem.
Checking if the digit is 0, 2 or 4 or 6 or 8 then skipping the number
else calculating the square root of the number
I think this is where the problem lies. Your program should simply skip even numbers, because, aside from −2 and 2, they're all composite. On the other hand, this won't halve running time because odd numbers like 91 and 2209 might require more effort to be ruled out as not prime.
Trying to Divide the number starting from 2 till the square root of the number
if number is divisible then skipping the number else adding it to the prime list
Does "2 till the square root of the number" include numbers like 4, 6 and 9? The only potential factors that need to be checked are numbers that have already been proven prime. If n is not divisible by 7, it won't be divisible by 49 either. If you're building up a list, you might as well use it to check potential primes.
Benchmarking Java's a little difficult because you're at the mercy of the runtime system. Still, a minute and a half, while it would have been considered miraculous by Mersenne, is too slow today. Five, ten seconds, that I'd find acceptable.
Maybe this is one of those cases where you should avoid the use of objects in favor of an array of primitives. My first draft took even longer than yours. Eventually I came up with this:
static int[] fillWithPrimes(int quantity) {
int[] primes = new int[quantity];
primes[0] = 2;
int currPi = 1;
int currIndex = 0;
int currNum = 3;
int currPrime;
boolean coPrimeFlag;
double squareRoot;
while (currPi < quantity) {
squareRoot = Math.sqrt(currNum);
do {
currPrime = primes[currIndex];
coPrimeFlag = (currNum % currPrime != 0);
currIndex++;
} while (coPrimeFlag && currPrime <= squareRoot);
if (coPrimeFlag) {
primes[currPi] = currNum;
currPi++;
}
currNum += 2;
currIndex = 0;
}
return primes;
}
Then I wrote a main() that notes the time before calling fillWithPrimes() with a quantity parameter of 1,000,000, and reports on the results:
run:
Operation took 2378 milliseconds
10th prime is 29
100th prime is 541
1000th prime is 7919
10000th prime is 104729
100000th prime is 1299709
1000000th prime is 15485863
BUILD SUCCESSFUL (total time: 2 seconds)
I'm sure it can be optimized further. Me, personally, I'm satisfied with two and a half seconds.
Isn't everything after 5 ending in a five divisible by 5 as well? So you can skip things whose right(1,numb)<>"5", for example 987, 985. I made one in Excel that will test a million numbers for primes and spit them into a column in about 15 seconds, but it gets crazy around 15 million.

Code Golf: Leibniz formula for Pi

I recently posted one of my favourite interview whiteboard coding questions in "What's your more controversial programming opinion", which is to write a function that computes Pi using the Leibniz formula.
It can be approached in a number of different ways, and the exit condition takes a bit of thought, so I thought it might make an interesting code golf question. Shortest code wins!
Given that Pi can be estimated using the function 4 * (1 - 1/3 + 1/5 - 1/7 + ...) with more terms giving greater accuracy, write a function that calculates Pi to within 0.00001.
Edit: 3 Jan 2008
As suggested in the comments I changed the exit condition to be within 0.00001, as that's what I really meant (an accuracy of 5 decimal places is much harder due to rounding and so I wouldn't want to ask that in an interview, whereas within 0.00001 is an easier to understand and implement exit condition).
Also, to answer the comments, I guess my intention was that the solution should compute the number of iterations, or check when it had done enough, but there's nothing to prevent you from pre-computing the number of iterations and using that number. I really asked the question out of interest to see what people would come up with.
J, 14 chars
4*-/%>:+:i.1e6
Explanation
1e6 is number 1 followed by 6 zeroes (1000000).
i.y generates the first y non negative numbers.
+: is a function that doubles each element in the list argument.
>: is a function that increments by one each element in the list argument.
So, the expression >:+:i.1e6 generates the first one million odd numbers:
1 3 5 7 ...
% is the reciprocal operator (numerator "1" can be omitted).
-/ does an alternate sum of each element in the list argument.
So, the expression -/%>:+:i.1e6 generates the alternate sum of the reciprocals of the first one million odd numbers:
1 - 1/3 + 1/5 - 1/7 + ...
4* is multiplication by four. If you multiply by four the previous sum, you have π.
That's it! J is a powerful language for mathematics.
Edit: since generating 9! (362880) terms for the alternate sum is sufficient to have 5 decimal digit accuracy, and since the Leibniz formula can be written also this way:
4 - 4/3 + 4/5 - 4/7 + ...
...you can write a shorter, 12 chars version of the program:
-/4%>:+:i.9!
Language: Brainfuck, Char count: 51/59
Does this count? =]
Because there are no floating-point numbers in Brainfuck, it was pretty difficult to get the divisions working properly. Grr.
Without newline (51):
+++++++[>+++++++<-]>++.-----.+++.+++.---.++++.++++.
With newline (59):
+++++++[>+++++++>+<<-]>++.-----.+++.+++.---.++++.++++.>+++.
Perl
26 chars
26 just the function, 27 to compute, 31 to print. From the comments to this answer.
sub _{$-++<1e6&&4/$-++-&_} # just the sub
sub _{$-++<1e6&&4/$-++-&_}_ # compute
sub _{$-++<1e6&&4/$-++-&_}say _ # print
28 chars
28 just computing, 34 to print. From the comments. Note that this version cannot use 'say'.
$.=.5;$\=2/$.++-$\for 1..1e6 # no print
$.=.5;$\=2/$.++-$\for$...1e6;print # do print, with bonus obfuscation
36 chars
36 just computing, 42 to print. Hudson's take at dreeves's rearrangement, from the comments.
$/++;$\+=8/$//($/+2),$/+=4for$/..1e6
$/++;$\+=8/$//($/+2),$/+=4for$/..1e6;print
About the iteration count: as far as my math memories go, 400000 is provably enough to be accurate to 0.00001. But a million (or as low as 8e5) actually makes the decimal expansion match 5 fractional places, and it's the same character count, so I kept that.
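For reference, the standard alternating-series bound makes the "provably enough" claim easy to check: the truncation error of 4*(1 - 1/3 + 1/5 - ...) after N terms is at most the first omitted term, 4/(2N+1), so N = 200000 already guarantees an error below 0.00001 (and 400000 certainly does). A tiny, purely illustrative check in Haskell:

-- smallest N whose first omitted term 4/(2N+1) drops below the 0.00001 tolerance
errBound :: Int -> Double
errBound n = 4 / (2 * fromIntegral n + 1)

main :: IO ()
main = print (head [n | n <- [1 ..], errBound n < 1e-5])   -- prints 200000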
Ruby, 33 characters
(0..1e6).inject{|a,b|2/(0.5-b)-a}
Another C# version:
(60 characters)
4*Enumerable.Range(0, 500000).Sum(x => Math.Pow(-1, x)/(2*x + 1)); // = 3,14159
52 chars in Python:
print 4*sum(((-1.)**i/(2*i+1)for i in xrange(5**8)))
(51 dropping the 'x' from xrange.)
36 chars in Octave (or Matlab):
l=0:5^8;disp((-1).^l*(4./(2.*l+1))')
(execute "format long;" to show all the significant digits.) Omitting 'disp' we reach 30 chars:
octave:5> l=0:5^8;(-1).^l*(4./(2.*l+1))'
ans = 3.14159009359631
Oracle SQL 73 chars
select -4*sum(power(-1,level)/(level*2-1)) from dual connect by level<1e6
Language: C, Char count: 71
float p;main(i){for(i=1;1E6/i>5;i+=2)p-=(i%4-2)*4./i;printf("%g\n",p);}
Language: C99, Char count: 97 (including required newline)
#include <stdio.h>
float p;int main(){for(int i=1;1E6/i>5;i+=2)p-=(i%4-2)*4./i;printf("%g\n",p);}
I should note that the above versions (which are the same) keep track of whether an extra iteration would affect the result at all. Thus, it performs a minimum number of operations. To add more digits, replace 1E6 with 1E(num_digits+1) or 4E5 with 4E(num_digits) (depending on the version). For the full programs, %g may need to be replaced. float may need to be changed to double as well.
Language: C, Char count: 67 (see notes)
double p,i=1;main(){for(;i<1E6;i+=4)p+=8/i/(i+2);printf("%g\n",p);}
This version uses a modified version of posted algorithm, as used by some other answers. Also, it is not as clean/efficient as the first two solutions, as it forces 100 000 iterations instead of detecting when iterations become meaningless.
Language: C, Char count: 24 (cheating)
main(){puts("3.14159");}
Doesn't work with digit counts > 6, though.
Haskell
I got it down to 34 characters:
foldl subtract 4$map(4/)[3,5..9^6]
This expression yields 3.141596416935556 when evaluated.
Edit: here's a somewhat shorter version (at 33 characters) that uses foldl1 instead of foldl:
foldl1 subtract$map(4/)[1,3..9^6]
Edit 2: 9^6 instead of 10^6. One has to be economical ;)
Edit 3: Replaced foldl' and foldl1' with foldl and foldl1 respectively; as a result of Edit 2, it no longer overflows. Thanks to ShreevatsaR for noticing this.
23 chars in MATLAB:
a=1e6;sum(4./(1-a:4:a))
F#:
Attempt #1:
let pi = 3.14159
Cheating? No, it's winning with style!
Attempt #2:
let pi =
    seq { 0 .. 100 }
    |> Seq.map (fun x -> float x)
    |> Seq.fold (fun x y -> x + (Math.Pow(-1.0, y)/(2.0 * y + 1.0))) 0.0
    |> (fun x -> x * 4.0)
It's not as compact as it could possibly get, but pretty idiomatic F#.
common lisp, 55 chars.
(loop for i from 1 upto 4e5 by 4 sum (/ 8d0 i (+ i 2)))
Mathematica, 27 chars (arguably as low as 26, or as high as 33)
NSum[8/i/(i+2),{i,1,9^9,4}]
If you remove the initial "N" then it returns the answer as a (huge) fraction.
If it's cheating that Mathematica doesn't need a print statement to output its result then prepend "Print#" for a total of 33 chars.
NB:
If it's cheating to hardcode the number of terms, then I don't think any answer has yet gotten this right. Checking when the current term is below some threshold is no better than hardcoding the number of terms. Just because the current term is only changing the 6th or 7th digit doesn't mean that the sum of enough subsequent terms won't change the 5th digit.
Using the formula for the error term in an alternating series (and thus the necessary number of iterations to achieve the desired accuracy is not hard coded into the program):
public static void Main(string[] args) {
double tolerance = 0.000001;
double piApproximation = LeibnizPi(tolerance);
Console.WriteLine(piApproximation);
}
private static double LeibnizPi(double tolerance) {
double quarterPiApproximation = 0;
int index = 1;
double term;
int sign = 1;
do {
term = 1.0 / (2 * index - 1);
quarterPiApproximation += ((double)sign) * term;
index++;
sign = -sign;
} while (term > tolerance);
return 4 * quarterPiApproximation;
}
C#:
public static double Pi()
{
double pi = 0;
double sign = 1;
for (int i = 1; i < 500002; i += 2)
{
pi += sign / i;
sign = -sign;
}
return 4 * pi;
}
Perl :
$i+=($_&1?4:-4)/($_*2-1)for 1..1e6;print$i
for a total of 42 chars.
Ruby, 41 chars (using irb):
s=0;(3..3e6).step(4){|i|s+=8.0/i/(i-2)};s
Or this slightly longer, non-irb version:
s=0;(3..3e6).step(4){|i|s+=8.0/i/(i-2)};p s
This is a modified Leibniz:
Combine pairs of terms. This gives you 2/3 + 2/35 + 2/99 + ...
Pi becomes 8 * (1/(1 * 3) + 1/(5 * 7) + 1/(9 * 11) + ...)
F# (Interactive Mode) (59 Chars)
{0.0..1E6}|>Seq.fold(fun a x->a+ -1.**x/(2.*x+1.))0.|>(*)4.
(Yields a warning but omits the casts)
Here's a solution in MUMPS.
pi(N)
N X,I
S X=1 F I=3:4:N-2 S X=X-(1/I)+(1/(I+2))
Q 4*X
Parameter N indicates how many repeated fractions to use. That is, if you pass in 5 it will evaluate 4 * (1 - 1/3 + 1/5 - 1/7 + 1/9 - 1/11)
Some empirical testing showed that N=272241 is the lowest value that gives a correct value of 3.14159 when truncated to 5 decimal points. You have to go to N=852365 to get a value that rounds to 3.14159.
C# using iterator block:
static IEnumerable<double> Pi()
{
double i = 4, j = 1, k = 4;
for (;;)
{
yield return k;
k += (i *= -1) / (j += 2);
}
}
For the record, this Scheme implementation has 95 characters ignoring unnecessary whitespace.
(define (f)
(define (p a b)
(if (> a b)
0
(+ (/ 1.0 (* a (+ a 2))) (p (+ a 4) b))))
(* 8 (p 1 1e6)))
Javascript:
a=0,b=-1,d=-4,c=1e6;while(c--)a+=(d=-d)/(b+=2)
In javascript. 51 characters. Obviously not going to win but eh. :P
Edit -- updated to be 46 characters now, thanks to Strager. :)
UPDATE (March 30 2010)
A faster (precise only to 5 decimal places) 43 character version by David Murdoch
for(a=0,b=1,d=4,c=~4e5;++c;d=-d)a-=d/(b-=2)
Here's a recursive answer using C#. It will only work using the x64 JIT in Release mode because that's the only JIT that applies tail-call optimisation, and as the series converges so slowly it will result in a StackOverflowException without it.
It would be nice to have the IteratePi function as an anonymous lambda, but as it's self-recursive we'd have to start doing all manner of horrible things with Y-combinators so I've left it as a separate function.
public static double CalculatePi()
{
return IteratePi(0.0, 1.0, true);
}
private static double IteratePi(double result, double denom, bool add)
{
var term = 4.0 / denom;
if (term < 0.00001) return result;
var next = add ? result + term : result - term;
return IteratePi(next, denom + 2.0, !add);
}
Most of the current answers assume that they'll get 5 digits accuracy within some number of iterations and this number is hardcoded into the program. My understanding of the question was that the program itself is supposed to figure out when it's got an answer accurate to 5 digits and stop there. On that assumption here's my C# solution. I haven't bothered to minimise the number of characters since there's no way it can compete with some of the answers already out there, so I thought I'd make it readable instead. :)
private static double GetPi()
{
double acc = 1, sign = -1, lastCheck = 0;
for (double div = 3; ; div += 2, sign *= -1)
{
acc += sign / div;
double currPi = acc * 4;
double currCheck = Math.Round(currPi, 5);
if (currCheck == lastCheck)
return currPi;
lastCheck = currCheck;
}
}
Language: C99 (implicit return 0), Char count: 99 (95 + 4 required spaces)
exit condition depends on current value, not on a fixed count
#include <stdio.h>
float p, s=4, d=1;
int main(void) {
for (; 4/d > 1E-5; d += 2)
p -= (s = -s) / d;
printf("%g\n", p);
}
compacted version
#include<stdio.h>
float
p,s=4,d=1;int
main(void){for(;4/d>1E-5;d+=2)p-=(s=-s)/d;printf("%g\n",p);}
Language: dc, Char count: 35
dc -e '9k0 1[d4r/r2+sar-lad274899>b]dsbxrp'
Ruby:
irb(main):031:0> 4*(1..10000).inject {|s,x| s+(-1)**(x+1)*1.0/(2*x-1)}
=> 3.14149265359003
64 chars in AWK:
~# awk 'BEGIN {p=1;for(i=3;i<10^6;i+=4){p=p-1/i+1/(i+2)}print p*4}'
3.14159
C# cheating - 50 chars:
static single Pi(){
return Math.Round(Math.PI, 5));
}
It only says "taking into account the formula write a function..." it doesn't say reproduce the formula programmatically :) Think outside the box...
C# LINQ - 78 chars:
static double pi = 4 * Enumerable.Range(0, 1000000)
.Sum(n => Math.Pow(-1, n) / (2 * n + 1));
C# Alternate LINQ - 94 chars:
static double pi = return 4 * (from n in Enumerable.Range(0, 1000000)
select Math.Pow(-1, n) / (2 * n + 1)).Sum();
And finally - this takes the previously mentioned algorithm and condenses it mathematically so you don't have to worry about keep changing signs.
C# longhand - 89 chars (not counting unrequired spaces):
static double pi()
{
var t = 0D;
for (int n = 0; n < 1e6; t += Math.Pow(-1, n) / (2 * n + 1), n++) ;
return 4 * t;
}
#!/usr/bin/env python
from math import *

denom = 1.0
imm = 0.0
sgn = 1
it = 0
for i in xrange(0, int(1e6)):
    imm += (sgn*1/denom)
    denom += 2
    sgn *= -1
print str(4*imm)

Algorithm to calculate the number of divisors of a given number

What would be the most optimal algorithm (performance-wise) to calculate the number of divisors of a given number?
It'll be great if you could provide pseudocode or a link to some example.
EDIT: All the answers have been very helpful, thank you. I'm implementing the Sieve of Atkin and then I'm going to use something similar to what Jonathan Leffler indicated. The link posted by Justin Bozonier has further information on what I wanted.
Dmitriy is right that you'll want the Sieve of Atkin to generate the prime list but I don't believe that takes care of the whole issue. Now that you have a list of primes you'll need to see how many of those primes act as a divisor (and how often).
Here's some Python for the algo: look here and search for "Subject: math - need divisors algorithm". Just count the number of items in the list instead of returning them, however.
Here's a Dr. Math that explains what exactly it is you need to do mathematically.
Essentially it boils down to if your number n is:
n = a^x * b^y * c^z
(where a, b, and c are n's prime divisors and x, y, and z are the number of times that divisor is repeated)
then the total count for all of the divisors is:
(x + 1) * (y + 1) * (z + 1).
Edit: BTW, to find a, b, c, etc. you'll want to do what amounts to a greedy algo, if I'm understanding this correctly. Start with your largest prime divisor and multiply it by itself until a further multiplication would exceed the number n. Then move to the next lowest factor, multiply the previous prime (raised to the number of times it was used) by the current prime, and keep multiplying by that prime until the next multiplication would exceed n... etc. Keep track of the number of times you multiply the divisors together and apply those numbers in the formula above.
Not 100% sure about my algo description, but if that isn't it, it's something similar.
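A small Haskell sketch of that formula (illustrative only: numDivisors and factorize are made-up names, and the factorization is plain trial division rather than anything clever):

import Data.List (group)

-- tau(n): factor n, then multiply (exponent + 1) over the prime factorization.
numDivisors :: Integer -> Integer
numDivisors = product . map ((+ 1) . fromIntegral . length) . group . factorize
  where
    factorize n = go n (2 : [3, 5 ..])
      where
        go 1 _ = []
        go m (p : ps)
          | p * p > m      = [m]                 -- whatever is left is prime
          | m `mod` p == 0 = p : go (m `div` p) (p : ps)
          | otherwise      = go m ps
        go _ [] = []                             -- unreachable: the divisor list is infinite

-- numDivisors 36 == 9, matching (2+1)*(2+1) for 36 = 2^2 * 3^2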
There are a lot more techniques to factoring than the sieve of Atkin. For example suppose we want to factor 5893. Well its sqrt is 76.76... Now we'll try to write 5893 as a product of squares. Well (77*77 - 5893) = 36 which is 6 squared, so 5893 = 77*77 - 6*6 = (77 + 6)(77-6) = 83*71. If that hadn't worked we'd have looked at whether 78*78 - 5893 was a perfect square. And so on. With this technique you can quickly test for factors near the square root of n much faster than by testing individual primes. If you combine this technique for ruling out large primes with a sieve, you will have a much better factoring method than with the sieve alone.
And this is just one of a large number of techniques that have been developed. This is a fairly simple one. It would take you a long time to learn, say, enough number theory to understand the factoring techniques based on elliptic curves. (I know they exist. I don't understand them.)
Therefore unless you are dealing with small integers, I wouldn't try to solve that problem myself. Instead I'd try to find a way to use something like the PARI library that already has a highly efficient solution implemented. With that I can factor a random 40 digit number like 124321342332143213122323434312213424231341 in about .05 seconds. (Its factorization, in case you wondered, is 29*439*1321*157907*284749*33843676813*4857795469949. I am quite confident that it didn't figure this out using the sieve of Atkin...)
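To make the 5893 example concrete, here is a toy version of that difference-of-squares step (Fermat's method) for odd n; the floating-point square root is only adequate for small inputs like this, and the names are made up:

-- Find a with a*a - n a perfect square b*b, so n = (a + b) * (a - b). Assumes n odd.
fermatFactor :: Integer -> (Integer, Integer)
fermatFactor n = go start
  where
    isqrt = floor . sqrt . (fromIntegral :: Integer -> Double)
    r     = isqrt n
    start = if r * r == n then r else r + 1
    go a
      | b * b == d = (a + b, a - b)
      | otherwise  = go (a + 1)
      where
        d = a * a - n
        b = isqrt d

-- fermatFactor 5893 == (83,71), which is exactly the 77*77 - 6*6 step described above.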
@Yasky
Your divisors function has a bug in that it does not work correctly for perfect squares.
Try:
int divisors(int x) {
int limit = x;
int numberOfDivisors = 0;
if (x == 1) return 1;
for (int i = 1; i < limit; ++i) {
if (x % i == 0) {
limit = x / i;
if (limit != i) {
numberOfDivisors++;
}
numberOfDivisors++;
}
}
return numberOfDivisors;
}
I disagree that the sieve of Atkin is the way to go, because it could easily take longer to check every number in [1,n] for primality than it would to reduce the number by divisions.
Here's some code that, although slightly hackier, is generally much faster:
import operator

# A slightly efficient superset of primes.
def PrimesPlus():
    yield 2
    yield 3
    i = 5
    while True:
        yield i
        if i % 6 == 1:
            i += 2
        i += 2

# Returns a dict d with n = product p ^ d[p]
def GetPrimeDecomp(n):
    d = {}
    primes = PrimesPlus()
    for p in primes:
        while n % p == 0:
            n /= p
            d[p] = d.setdefault(p, 0) + 1
        if n == 1:
            return d

def NumberOfDivisors(n):
    d = GetPrimeDecomp(n)
    powers_plus = map(lambda x: x+1, d.values())
    return reduce(operator.mul, powers_plus, 1)
ps That's working python code to solve this problem.
Here is a straightforward O(sqrt(n)) algorithm. I used this to solve Project Euler:
def divisors(n):
    count = 2  # accounts for 'n' and '1'
    i = 2
    while i ** 2 < n:
        if n % i == 0:
            count += 2
        i += 1
    if i ** 2 == n:
        count += 1
    return count
This interesting question is much harder than it looks, and it has not been answered. The question can be factored into 2 very different questions.
1 given N, find the list L of N's prime factors
2 given L, calculate number of unique combinations
All answers I see so far refer to #1 and fail to mention it is not tractable for enormous numbers. For moderately sized N, even 64-bit numbers, it is easy; for enormous N, the factoring problem can take "forever". Public key encryption depends on this.
Question #2 needs more discussion. If L contains only unique numbers, it is a simple calculation using the combination formula for choosing k objects from n items. Actually, you need to sum the results from applying the formula while varying k from 1 to sizeof(L). However, L will usually contain multiple occurrences of multiple primes. For example, L = {2,2,2,3,3,5} is the factorization of N = 360. Now this problem is quite difficult!
Restating #2, given collection C containing k items, such that item a has a' duplicates, and item b has b' duplicates, etc. how many unique combinations of 1 to k-1 items are there? For example, {2}, {2,2}, {2,2,2}, {2,3}, {2,2,3,3} must each occur once and only once if L = {2,2,2,3,3,5}. Each such unique sub-collection is a unique divisor of N by multiplying the items in the sub-collection.
An answer to your question depends greatly on the size of the integer. Methods for small numbers, e.g. less than 100 bits, and for numbers of ~1000 bits (such as used in cryptography) are completely different.
general overview: http://en.wikipedia.org/wiki/Divisor_function
values for small n and some useful references: A000005: d(n) (also called tau(n) or sigma_0(n)), the number of divisors of n.
real-world example: factorization of integers
JUST one line
I have thought very carefully about your question and I have tried to write a highly efficient and performant piece of code.
To print all divisors of a given number on screen we need just one line of code!
(use option -std=c99 while compiling via gcc)
for(int i=1,n=9;((!(n%i)) && printf("%d is a divisor of %d\n",i,n)) || i<=(n/2);i++);//n is your number
For finding the number of divisors you can use the following very, very fast function (works correctly for all integers except 1 and 2):
int number_of_divisors(int n)
{
int counter,i;
for(counter=0,i=1;(!(n%i) && (counter++)) || i<=(n/2);i++);
return counter;
}
Or, if you treat the given number as a divisor itself (works correctly for all integers except 1 and 2):
int number_of_divisors(int n)
{
int counter,i;
for(counter=0,i=1;(!(n%i) && (counter++)) || i<=(n/2);i++);
return ++counter;
}
NOTE: the two functions above work correctly for all positive integers except 1 and 2,
so they are functional for all numbers greater than 2.
But if you need to cover 1 and 2, you can use one of the following functions (a little slower):
int number_of_divisors(int n)
{
int counter,i;
for(counter=0,i=1;(!(n%i) && (counter++)) || i<=(n/2);i++);
if (n==2 || n==1)
{
return counter;
}
return ++counter;
}
OR
int number_of_divisors(int n)
{
int counter,i;
for(counter=0,i=1;(!(i==n) && !(n%i) && (counter++)) || i<=(n/2);i++);
return ++counter;
}
small is beautiful :)
The sieve of Atkin is an optimized version of the sieve of Eratosthenes which gives all prime numbers up to a given integer. You should be able to google this for more detail.
Once you have that list, it's a simple matter to divide your number by each prime to see if it's an exact divisor (i.e., remainder is zero).
The basic steps for calculating the divisors of a number (n) are [this is pseudocode converted from real code, so I hope I haven't introduced errors]:
for z in 1..n:
    prime[z] = false
prime[2] = true;
prime[3] = true;
for x in 1..sqrt(n):
    xx = x * x
    for y in 1..sqrt(n):
        yy = y * y
        z = 4*xx+yy
        if (z <= n) and ((z mod 12 == 1) or (z mod 12 == 5)):
            prime[z] = not prime[z]
        z = z-xx
        if (z <= n) and (z mod 12 == 7):
            prime[z] = not prime[z]
        z = z-yy-yy
        if (z <= n) and (x > y) and (z mod 12 == 11):
            prime[z] = not prime[z]
for z in 5..sqrt(n):
    if prime[z]:
        zz = z*z
        x = zz
        while x <= n:
            prime[x] = false
            x = x + zz
for z in 2,3,5..n:
    if prime[z]:
        if n modulo z == 0 then print z
You might try this one. It's a bit hackish, but it's reasonably fast.
def factors(n):
    for x in xrange(2,n):
        if n%x == 0:
            return (x,) + factors(n/x)
    return (n,1)
Once you have the prime factorization, there is a way to find the number of divisors. Add one to each of the exponents on each individual factor and then multiply the exponents together.
For example:
36
Prime Factorization: 2^2*3^2
Divisors: 1, 2, 3, 4, 6, 9, 12, 18, 36
Number of Divisors: 9
Add one to each exponent 2^3*3^3
Multiply exponents: 3*3 = 9
Before you commit to a solution consider that the Sieve approach might not be a good answer in the typical case.
A while back there was a primality question and I ran a timing test; for 32-bit integers, at least, using a sieve to determine primality was slower than brute force. There are two factors going on:
1) While a division takes a human a while, it is very quick on a computer, similar in cost to looking the answer up in a table.
2) If you do not have a prime table, you can make a loop that runs entirely in the L1 cache, which makes it faster.
This is an efficient solution:
#include <iostream>

int main() {
    int num = 20;
    int numberOfDivisors = 1;
    for (int i = 2; i <= num; i++)
    {
        int exponent = 0;
        while (num % i == 0) {
            exponent++;
            num /= i;
        }
        numberOfDivisors *= (exponent + 1);
    }
    std::cout << numberOfDivisors << std::endl;
    return 0;
}
Divisors do something spectacular: they divide completely. If you want to check the number of divisors for a number n, it is clearly redundant to span the whole range 1...n. I have not done any in-depth research on this, but I solved Project Euler's problem 12 on Triangular Numbers. My solution for the greater-than-500-divisors test ran for 309504 microseconds (~0.3 s). I wrote this divisor function for the solution.
int divisors(int x) {
    int limit = x;
    int numberOfDivisors = 1;          // accounts for the (1, x) pair
    for (int i(2); i < limit; ++i) {   // starting at 2 avoids dividing by zero
        if (x % i == 0) {
            limit = x / i;
            numberOfDivisors++;
        }
    }
    return numberOfDivisors * 2;
}
Every algorithm has a weak point. I thought this one was weak against prime numbers. But since triangular numbers are not prime, it served its purpose flawlessly. From my profiling, I think it did pretty well.
Happy Holidays.
You want the Sieve of Atkin, described here: http://en.wikipedia.org/wiki/Sieve_of_Atkin
Number theory textbooks call the divisor-counting function tau. The first interesting fact is that it is multiplicative, i.e. τ(ab) = τ(a)τ(b), when a and b have no common factor. (Proof: each pair of divisors of a and b gives a distinct divisor of ab.)
Now note that for p a prime, τ(p**k) = k+1 (the powers of p). Thus you can easily compute τ(n) from its factorisation.
However, factorising large numbers can be slow (the security of RSA cryptography depends on the product of two large primes being hard to factorise). That suggests this optimised algorithm:
Test if the number is prime (fast)
If so, return 2
Otherwise, factorise the number (slow if multiple large prime factors)
Compute τ(n) from the factorisation
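As a rough sketch of that outline in Python, leaning on sympy's isprime and factorint (the library choice is mine, not part of the answer):

from sympy import isprime, factorint

def tau(n):
    if isprime(n):                            # steps 1-2: a prime has exactly two divisors
        return 2
    count = 1
    for exponent in factorint(n).values():    # step 3: factorise (slow for hard numbers)
        count *= exponent + 1                 # step 4: tau is multiplicative
    return count

print(tau(36))   # 9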
This is the most basic way of computing the number of divisors:
import java.util.Scanner;

class PrintDivisors
{
    public static void main(String args[])
    {
        System.out.println("Enter the number");

        // Create Scanner object for taking input
        Scanner s = new Scanner(System.in);

        // Read an int
        int n = s.nextInt();

        // Loop from 1 to 'n'
        for (int i = 1; i <= n; i++)
        {
            // If the remainder is 0 when 'n' is divided by 'i', then 'i' is a divisor
            if (n % i == 0)
            {
                System.out.print(i + ", ");
            }
        }

        // Print [not necessary]
        System.out.print("are divisors of " + n);
    }
}
The prime-number method is quite clear here.
P[] is a list of the primes less than or equal to sq = sqrt(n):
int count = 1, nd;   // 'size' is the number of entries in P[]
for (int i = 0; i < size && P[i] <= sq; i++) {
    nd = 1;
    while (n % P[i] == 0) {
        n /= P[i];
        nd++;
    }
    count *= nd;
    if (n == 1) break;
}
if (n != 1) count *= 2;   // the confusing line :D :P (whatever remains is a single prime factor > sqrt(n))
I will leave the understanding to the reader.
I now look forward to a more optimized method.
The following is a C program to find the number of divisors of a given number.
The complexity of this algorithm is O(sqrt(n)).
It works correctly both for numbers that are perfect squares and for numbers that are not.
Note that the upper limit of the loop is set to the square root of the number to keep the algorithm as efficient as possible.
Note also that storing the upper limit in a separate variable saves time: you should not call the sqrt function in the condition section of the for loop, as that wastes computational time.
#include <stdio.h>
#include <math.h>

int main()
{
    int i, n, limit, numberOfDivisors = 2;   /* counts 1 and n itself (n = 1 would need special handling) */
    printf("Enter the number : ");
    scanf("%d", &n);
    limit = (int)sqrt((double)n);
    for (i = 2; i <= limit; i++)
        if (n % i == 0)
        {
            if (i != n / i)
                numberOfDivisors += 2;
            else
                numberOfDivisors++;
        }
    printf("%d\n", numberOfDivisors);
    return 0;
}
Instead of the above for loop you can also use the following loop which is even more efficient as this removes the need to find the square-root of the number.
for(i=2;i*i<=n;i++)
{
...
}
Here is a function that I wrote. Its worst-case time complexity is O(sqrt(n)); its best case, on the other hand, is O(log(n)). It gives you all the prime divisors, each repeated as many times as it occurs.
// requires: import java.util.ArrayList; import java.util.List;
public static List<Integer> divisors(int n) {
    ArrayList<Integer> aList = new ArrayList<>();
    int top_count = (int) Math.round(Math.sqrt(n));
    int new_n = n;
    for (int i = 2; i <= top_count; i++) {
        if (new_n == (new_n / i) * i) {
            aList.add(i);
            new_n = new_n / i;
            top_count = (int) Math.round(Math.sqrt(new_n));
            i = 1;   // restart the search from 2 on the next iteration
        }
    }
    aList.add(new_n);
    return aList;
}
#Kendall
I tested your code and made some improvements; now it is even faster.
I also tested it against #هومن جاویدپور's code, and it is faster than that one as well.
long long int FindDivisors(long long int n) {
    long long int count = 0;
    long long int i, m = (long long int)sqrt(n);
    for (i = 1; i <= m; i++) {
        if (n % i == 0)
            count += 2;
    }
    if (n / m == m && n % m == 0)
        count--;
    return count;
}
Isn't this just a question of factoring the number - determining all the factors of the number? You can then decide whether you need all combinations of one or more factors.
So, one possible algorithm would be:
factor(N)
    divisor = first_prime
    list_of_factors = { 1 }
    while (N > 1)
        while (N % divisor == 0)
            add divisor to list_of_factors
            N /= divisor
        divisor = next_prime
    return list_of_factors
It is then up to you to combine the factors to determine the rest of the answer.
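For example, a hedged sketch in Python of that last step (the names are mine): turn the factor list returned by factor(N) into exponent counts and apply the same (e1+1)(e2+1)... rule described in other answers:

from collections import Counter

def count_divisors(list_of_factors):
    """list_of_factors is what factor(N) above returns, e.g. [1, 2, 2, 3] for 12."""
    counts = Counter(f for f in list_of_factors if f != 1)   # ignore the initial 1
    count = 1
    for exponent in counts.values():
        count *= exponent + 1
    return count

print(count_divisors([1, 2, 2, 3]))   # 6 divisors of 12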
I think this is what you are looking for. It does exactly what you asked for.
Copy and paste it into Notepad, save it as *.bat, and run it. Enter a number, multiply the number of processes by 2, and that is the number of divisors. I made it that way on purpose so that it determines the divisors faster:
Please note that a CMD variable cannot hold values over 999999999
#echo off
modecon:cols=100 lines=100
:start
title Enter the Number to Determine
cls
echo Determine a number as a product of 2 numbers
echo.
echo Ex1 : C = A * B
echo Ex2 : 8 = 4 * 2
echo.
echo Max Number length is 9
echo.
echo If there is only 1 proces done it
echo means the number is a prime number
echo.
echo Prime numbers take time to determine
echo Number not prime are determined fast
echo.
set /p number=Enter Number :
if %number% GTR 999999999 goto start
echo.
set proces=0
set mindet=0
set procent=0
set B=%Number%
:Determining
set /a mindet=%mindet%+1
if %mindet% GTR %B% goto Results
set /a solution=%number% %%% %mindet%
if %solution% NEQ 0 goto Determining
if %solution% EQU 0 set /a proces=%proces%+1
set /a B=%number% / %mindet%
set /a procent=%mindet%*100/%B%
if %procent% EQU 100 set procent=%procent:~0,3%
if %procent% LSS 100 set procent=%procent:~0,2%
if %procent% LSS 10 set procent=%procent:~0,1%
title Progress : %procent% %%%
if %solution% EQU 0 echo %proces%. %mindet% * %B% = %number%
goto Determining
:Results
title %proces% Results Found
echo.
#pause
goto start
I guess this one will be handy as well as precise.
Python:
factors = [x for x in range(1, n + 1) if n % x == 0]
print(len(factors))
Try something along these lines:
int divisors(int myNum) {
    int limit = myNum;
    int divisorCount = 0;
    if (myNum == 1)
        return 1;
    for (int i = 1; i < limit; ++i) {
        if (myNum % i == 0) {
            limit = myNum / i;
            if (limit != i)
                divisorCount++;
            divisorCount++;
        }
    }
    return divisorCount;
}
I don't know the MOST efficient method, but I'd do the following:
Create a table of primes to find all primes less than or equal to the square root of the number (Personally, I'd use the Sieve of Atkin)
Count all primes less than or equal to the square root of the number and multiply that by two. If the square root of the number is an integer, then subtract one from the count variable.
Should work \o/
If you need, I can code something up tomorrow in C to demonstrate.
