Fast algorithm for finding prime numbers? [duplicate] - algorithm

This question already has answers here:
Which is the fastest algorithm to find prime numbers?
(20 answers)
Closed 9 years ago.
First of all - I checked a lot in this forum and I haven't found something fast enough.
I try to make a function that returns me the prime numbers in a specified range.
For example I did this function (in C#) using the sieve of Eratosthenes. I tried also Atkin's sieve but the Eratosthenes one runs faster (in my implementation):
public static void SetPrimesSieve(int Range)
{
Primes = new List<uint>();
Primes.Add(2);
int Half = (Range - 1) >> 1;
BitArray Nums = new BitArray(Half, false);
int Sqrt = (int)Math.Sqrt(Range);
for (int i = 3, j; i <= Sqrt; )
{
for (j = ((i * i) >> 1) - 1; j < Half; j += i)
Nums[j] = true;
do
i += 2;
while (i <= Sqrt && Nums[(i >> 1) - 1]);
}
for (int i = 0; i < Half; ++i)
if (!Nums[i])
Primes.Add((uint)(i << 1) + 3);
}
It runs about twice faster than codes & algorithms I found...
There's should be a faster way to find prime numbers, could you help me?

When searching around for algorithms on this topic (for project Euler) I don't remember finding anything faster. If speed is really the concern, have you thought about just storing the primes so you simply need to look it up?
EDIT: quick google search found this, confirming that the fastest method would be just to page the results and look them up as needed.
One more edit - you may find more information here, essentially a duplicate of this topic. Top post there states that atkin's sieve was faster than eras' as far as generating on the fly.

The fastest algorithm in my experience so far is the Sieve of Erathostenes with wheel factorization for 2, 3 and 5, where the primes among the remaining numbers are represented as bits in a byte array. In Java on one core of my 3 year old Laptop it takes 23 seconds to compute the primes up to 1 billion.
With wheel factorization the Sieve of Atkin was about a factor of two slower, while with an ordinary BitSet it was about 30% faster.
See also this answer.

I did an algorithm that can find prime numbers from range 2-90 000 000 for 0.65 sec on I 350M-notebook, written in C .... you have to use bitwise operations and have "code" for recalculating index of your array to index of concrete bit you want. for example If you want folds of number 2, concrete bits will be for example ....10101000 ... so if you read from left ... you get index 4,6,8 ... thats it

Several comments.
For speed, precompute then load from disk. It's super fast. I did it in Java long ago.
Don't store as an array, store as a bitsequence for odd numbers. Way more efficient on memory
If your speed question is that you want this particular computation to run fast (you need to justify why you can't precompute and load it from disk) you need to code a better Atkin's sieve. It is faster. But only slightly.
You haven't indicated the end use for these primes. We may be missing something completely because you've not told us the application. Tell us a sketch of the application and the answers will be targetted better for your context.
Why on earth do you think something faster exists? You haven't justified your hunch. This is a very hard problem. (that is to find something faster)

You can do better than that using the Sieve of Atkin, but it is quite tricky to implement it fast and correctly. A simple translation of the Wikipedia pseudo-code is probably not good enough.

Related

Factorial of a big number [duplicate]

This question already has answers here:
Calculating factorial of large numbers in C
(16 answers)
Closed 2 years ago.
Consider problem of calculating factorial of a number.
When result is bigger than 2^32 then we will get overflow error.
How can we design a program to calculate factorial of big numbers?
EDIT: assume we are using C++ language.
EDIT2: it is a duplicate question of this one
As a question with just algorithm tagged. Your 2^32 is not an issue because an algorithm can never have an Overflow error. Implementations of an algorithm can and do have overflow errors. So what language are you using?
Most languages have a BigNumber or BigInteger that can be used.
Here's a C++ BigInteger library: https://mattmccutchen.net/bigint/
I suggest that you google for: c++ biginteger
If you can live with approximate values, consider using the Stirling approximation and compute it in double precision.
If you want exact values, you'll need arbitrary-precision arithmetic and a lot of computation time...
Doing this requires you to take one of a few approaches, but basically boils down to:
splitting your number across multiple variables (stored in an array) and
managing your operations across the array.
That way each int/element in the array has a positional magnitude and can be strung together in the end to make your whole number.
A good example here in C: http://www.go4expert.com/forums/c-program-calculate-factorial-t25352/
Test this script:
import gmpy as gm
print gm.fac(3000)
For very big number is difficult to stock or print result.
For some purposes, such as working out the number of combinations, it is sufficient to compute the logarithm of the factorial, because you will be dividing factorials by factorials and the final result is of a more reasonable size - you just subtract logarithms before taking the exponential of the result.
You can compute the logarithm of the factorial by adding logarithms, or by using the http://en.wikipedia.org/wiki/Gamma_function, which is often available in mathematical libraries (there are good ways to approximate this).
First invent a way to store and use big numbers. Common way is to interpret array of integers as digits of a big number. Then add basic operations to your system, such as multiplication. Then multiply.
Or use already made solutions. Google for: c++ big integer library
You can use BigInteger for finding factorial of a Big numbers probably greater than 65 as the range of data type long ends at 65! and it starts returning 0 after that. Please refer to below Java code. Hope it would help:
import java.math.BigInteger;
public class factorial {
public factorial() {
// TODO Auto-generated constructor stub
}
public static void main(String args[])
{
factorial f = new factorial();
System.out.println(f.fact(100));
}
public BigInteger fact(int num)
{
BigInteger sum = BigInteger.valueOf(1);
for(int i = num ; i>= 2; i --)
{
sum = sum.multiply(BigInteger.valueOf(i));
}
return sum;
}
}
If you want to improve the range of your measurement, you can use logarithms. Logarithms will convert your multiplication to additions making it much smaller to store.
factorial(n) => n * factorial(n-1)
log(factorial(n)) => log(n) * log(factorial(n-1))
5! = 5*4*3*2*1 = 120
log(5!) = log(5) + log(4) + log(3) + log(2) + log(1) = 2.0791812460476247
In this example, I used base 10 logarithms, but any base works.
10^2.0791812460476247
Or 10^0.0791812460476247*10^2 or 1.2*10^2
Implementation example in javascript

Sorting algorithm for list of integers

I have a list of about 200 integers whose values are between 1 and 5.
I want to get into learning about sorting algorithms and knowing where to apply each because at the moment I use bubble-sort for everything which I've been told is a terrible way to do things.
What would be the fastest sorting algorithm for this integer sorting?
EDIT: It turns out that because I know the numbers are 1 to 5 then I can use a bucket sort (?) algorithm which if I'm not mistaken - and I definitely could be - means that for each integer of value 1, I put it in the 1 group, value 2 I put it in the 2 group etc, then concatenate the groups at the end. This seems like a simple and efficient way to do it.
However since this is (currently) a learning excercise for me I am going to remove the 1 - 5 limitation and try to implement bubble-sort and merge-sort then compare the two to see which is faster.
Thanks for your help!
... which I've been told is a terrible way to do things.
First off, don't accept as gospel anything you hear from random bods on the internet (even me).
Bubble sort is fine under certain conditions, such as when the data is already mostly sorted, or the item count is relatively small (such as 200) (a), or you have no sort functionality built into the language and you're on a tight deadline where lack of performance will annoy the customer but lack of functionality will get you fired :-)
This bias against bubble sort is similar to the "only one exit point from a function" and "no goto" rules. You should understand the reasoning behind them so that you know when the rules can be ignored safely.
Anyway, on to the question proper. An efficient way for your specific case is to just count the items then output them, something like:
dim count[1..5] = {0, 0, 0, 0, 0};
for each item in list:
count[item] = count[item] + 1
for val in 1..5:
for quant in 1..count[val]:
output val
That's an O(n) time and O(1) space solution and you won't find a more efficient big-O for a generalised sort routine - it's only possible in this case because of the extra information you have about the data (limited to the values 1 through 5).
If you wanted to examine all the different sort algorithms, the Wikipedia Sorting Algorithm page is a useful starting point, including the major algorithms and their properties.
(a) As an aside, the following code (using worst case data for bubble sort), when run under CygWin on a not-very-powerful IBM T60 (2GHz dual core) laptop, completes in, on average, 0.157 seconds (5 samples: 0.150, 0.125, 0.192, 0.199, 0.115).
I wouldn't use it for sorting a million items (everyone knows bubble sort scales poorly) but 200 should be fine in most cases:
#include <stdio.h>
#define COUNT 200
int main (void) {
int i, swapped, tmp, item[COUNT];
// Set up worst case (reverse order) data.
for (i = 0; i < COUNT; i++)
item[i] = 200 - i;
// Slightly optimised bubble sort.
swapped = 1;
while (swapped) {
swapped = 0;
for (i = 1; i < COUNT; i++) {
if (item[i-1] > item[i]) {
tmp = item[i-1];
item[i-1] = item[i];
item[i] = tmp;
swapped = 1;
}
}
}
// for (i = 0; i < COUNT; i++)
// printf ("%d ", item[i]);
// putchar ('\n');
return 0;
}
You may not need sorting here, since you only have 5 possible values.
You could use 5 containers (or buckets) and as you scan your list of integers you place the values in the right bucket.
At the end, join the buckets together, in order.
Merge sort is an O(n log n) I think its way better than QuickSort
You can find some C# code here.

Algorithm to check if a number if a perfect number

I am looking for an algorithm to find if a given number is a perfect number.
The most simple that comes to my mind is :
Find all the factors of the number
Get the prime factors [except the number itself, if it is prime] and add them up to check if it is a perfect number.
Is there a better way to do this ?.
On searching, some Euclids work came up, but didnt find any good algorithm. Also this golfscript wasnt helpful: https://stackoverflow.com/questions/3472534/checking-whether-a-number-is-mathematically-a-perfect-number .
The numbers etc can be cached etc in real world usage [which I dont know where perfect nos are used :)]
However, since this is being asked in interviews, I am assuming there should be a "derivable" way of optimizing it.
Thanks !
If the input is even, see if it is of the form 2^(p-1)*(2^p-1), with p and 2^p-1 prime.
If the input is odd, return "false". :-)
See the Wikipedia page for details.
(Actually, since there are only 47 perfect numbers with fewer than 25 million digits, you might start with a simple table of those. Ask the interviewer if you can assume you are using 64-bit numbers, for instance...)
Edit: Dang, I failed the interview! :-(
In my over zealous attempt at finding tricks or heuristics to improve upon the "factorize + enumerate divisors + sum them" approach, I failed to note that being 1 modulo 9 was merely a necessary, and certainly not a sufficient condition for at number (other than 6) to be perfect...
Duh... with on average 1 in 9 even number satisfying this condition, my algorithm would sure find a few too many perfect numbers ;-).
To redeem myself, persist and maintain the suggestion of using the digital root, but only as a filter, to avoid the more expensive computation of the factor, in most cases.
[Original attempt: hall of shame]
If the number is even,<br>
compute its [digital root][1].
if the digital root is 1, the number is perfect, otherwise it isn't.
If the number is odd...
there are no shortcuts, other than...
"Not perfect" if the number is smaller than 10^300
For bigger values, one would then need to run the algorithm described in
the question, possibly with a few twists typically driven by heuristics
that prove that the sum of divisors will be lacking when the number
doesn't have some of the low prime factors.
My reason for suggesting the digital root trick for even numbers is that this can be computed without the help of an arbitrary length arithmetic library (like GMP). It is also much less computationally expensive than the decomposition in prime factors and/or the factorization (2^(p-1) * ((2^p)-1)). Therefore if the interviewer were to be satisfied with a "No perfect" response for odd numbers, the solution would be both very efficient and codable in most computer languages.
[Second and third attempt...]
If the number is even,<br>
if it is 6
The number is PERFECT
otherwise compute its [digital root][1].
if the digital root is _not_ 1
The number is NOT PERFECT
else ...,
Compute the prime factors
Enumerate the divisors, sum them
if the sum of these divisor equals the 2 * the number
it is PERFECT
else
it is NOT PERFECT
If the number is odd...
same as previously
On this relatively odd interview question...
I second andrewdski's comment to another response in this post, that this particular question is rather odd in the context of an interview for a general purpose developer. As with many interview questions, it can be that the interviewer isn't seeking a particular solution, but rather is providing an opportunity for the candidate to demonstrate his/her ability to articulate the general pros and cons of various approaches. Also, if the candidate is offered an opportunity to look-up generic resources such as MathWorld or Wikipedia prior to responding, this may also be a good test of his/her ability to quickly make sense of the info offered there.
Here's a quick algorithm just for fun, in PHP - using just a simple for loop. You can easliy port that to other languages:
function isPerfectNumber($num) {
$out = false;
if($num%2 == 0) {
$divisors = array(1);
for($i=2; $i<$num; $i++) {
if($num%$i == 0)
$divisors[] = $i;
}
if(array_sum($divisors) == $num)
$out = true;
}
return $out ? 'It\'s perfect!' : 'Not a perfect number.';
}
Hope this helps, not sure if this is what you're looking for.
#include<stdio.h>
#include<stdlib.h>
int sumOfFactors(int );
int main(){
int x, start, end;
printf("Enter start of the range:\n");
scanf("%d", &start);
printf("Enter end of the range:\n");
scanf("%d", &end);
for(x = start;x <= end;x++){
if(x == sumOfFactors(x)){
printf("The numbers %d is a perfect number\n", x);
}
}
return 0;
}
int sumOfFactors(int x){
int sum = 1, i, j;
for(j=2;j <= x/2;j++){
if(x % j == 0)
sum += j;
}
return sum;
}

square numbers given in an sequence

Given a positive integer sequence of numbers in an array with common difference 2
for e.g 2 4 6 8
Now replace each number by its square. Perform the computations efficiently.
I was asked this question in an interview and i gave him o(n) solution using bitwise operator since it is operation in the multiples of 2.If there is any better method please suggest.
I dunno if its better but it's recursive!!! :-)
(n+2)(n+2) = n**2 + 4*n + 4 // and you got n**2
class Square
{
public static int[] sequence(int[] array)
{
int[] result=new int[array.length];
for(int i=0;i<array.length;i++)
{
result[i]=array[i]*array[i];
}
return result;
}
}
// test cases:
// Square.sequence(new int[]{2,4,6,8})
//out put->{ 4, 16, 36, 64 }
It really depends on the interviewer, and what they think is "the right thing". If it were me, I'd think the (n << 2) + 4 were neat, but on the other hand, I'd hate to see it in my code. It takes more thinking to maintain, and there's a fair chance a good optimizer might do just as good a job.
I think the phrase "perform the operation efficiently" is probably our clue that the interviewer was looking for a fast computation. It's still O(n), but let's not forget that when you are comparing two O(n) algorithms, the coefficients start to matter again.

random permutation

I would like to genrate a random permutation as fast as possible.
The problem: The knuth shuffle which is O(n) involves generating n random numbers.
Since generating random numbers is quite expensive.
I would like to find an O(n) function involving a fixed O(1) amount of random numbers.
I realize that this question has been asked before, but I did not see any relevant answers.
Just to stress a point: I am not looking for anything less than O(n), just an algorithm involving less generation of random numbers.
Thanks
Create a 1-1 mapping of each permutation to a number from 1 to n! (n factorial). Generate a random number in 1 to n!, use the mapping, get the permutation.
For the mapping, perhaps this will be useful: http://en.wikipedia.org/wiki/Permutation#Numbering_permutations
Of course, this would get out of hand quickly, as n! can become really large soon.
Generating a random number takes long time you say? The implementation of Javas Random.nextInt is roughly
oldseed = seed;
nextseed = (oldseed * multiplier + addend) & mask;
return (int)(nextseed >>> (48 - bits));
Is that too much work to do for each element?
See https://doi.org/10.1145/3009909 for a careful analysis of the number of random bits required to generate a random permutation. (It's open-access, but it's not easy reading! Bottom line: if carefully implemented, all of the usual methods for generating random permutations are efficient in their use of random bits.)
And... if your goal is to generate a random permutation rapidly for large N, I'd suggest you try the MergeShuffle algorithm. An article published in 2015 claimed a factor-of-two speedup over Fisher-Yates in both parallel and sequential implementations, and a significant speedup in sequential computations over the other standard algorithm they tested (Rao-Sandelius).
An implementation of MergeShuffle (and of the usual Fisher-Yates and Rao-Sandelius algorithms) is available at https://github.com/axel-bacher/mergeshuffle. But caveat emptor! The authors are theoreticians, not software engineers. They have published their experimental code to github but aren't maintaining it. Someday, I imagine someone (perhaps you!) will add MergeShuffle to GSL. At present gsl_ran_shuffle() is an implementation of Fisher-Yates, see https://www.gnu.org/software/gsl/doc/html/randist.html?highlight=gsl_ran_shuffle.
Not what you asked exactly, but if provided random number generator doesn't satisfy you, may be you should try something different. Generally, pseudorandom number generation can be very simple.
Probably, best-known algorithm
http://en.wikipedia.org/wiki/Linear_congruential_generator
More
http://en.wikipedia.org/wiki/List_of_pseudorandom_number_generators
As other answers suggest, you can make a random integer in the range 0 to N! and use it to produce a shuffle. Although theoretically correct, this won't be faster in general since N! grows fast and you'll spend all your time doing bigint arithmetic.
If you want speed and you don't mind trading off some randomness, you will be much better off using a less good random number generator. A linear congruential generator (see http://en.wikipedia.org/wiki/Linear_congruential_generator) will give you a random number in a few cycles.
Usually there is no need in full-range of next random value, so to use exactly the same amount of randomness you can use next approach (which is almost like random(0,N!), I guess):
// ...
m = 1; // range of random buffer (single variant)
r = 0; // random buffer (number zero)
// ...
for(/* ... */) {
while (m < n) { // range of our buffer is too narrow for "n"
r = r*RAND_MAX + random(); // add another random to our random-buffer
m *= RAND_MAX; // update range of random-buffer
}
x = r % n; // pull-out next random with range "n"
r /= n; // remove it from random-buffer
m /= n; // fix range of random-buffer
// ...
}
P.S. of course there will be some errors related with division by value different from 2^n, but they will be distributed among resulted samples.
Generate N numbers (N < of the number of random number you need) before to do the computation, or store them in an array as data, with your slow but good random generator; then pick up a number simply incrementing an index into the array inside your computing loop; if you need different seeds, create multiple tables.
Are you sure that your mathematical and algorithmical approach to the problem is correct?
I hit exactly same problem where Fisher–Yates shuffle will be bottleneck in corner cases. But for me the real problem is brute force algorithm that doesn't scale well to all problems. Following story explains the problem and optimizations that I have come up with so far.
Dealing cards for 4 players
Number of possible deals is 96 bit number. That puts quite a stress for random number generator to avoid statical anomalies when selecting play plan from generated sample set of deals. I choose to use 2xmt19937_64 seeded from /dev/random because of the long period and heavy advertisement in web that it is good for scientific simulations.
Simple approach is to use Fisher–Yates shuffle to generate deals and filter out deals that don't match already collected information. Knuth shuffle takes ~1400 CPU cycles per deal mostly because I have to generate 51 random numbers and swap 51 times entries in the table.
That doesn't matter for normal cases where I would only need to generate 10000-100000 deals in 7 minutes. But there is extreme cases when filters may select only very small subset of hands requiring huge number of deals to be generated.
Using single number for multiple cards
When profiling with callgrind (valgrind) I noticed that main slow down was C++ random number generator (after switching away from std::uniform_int_distribution that was first bottleneck).
Then I came up with idea that I can use single random number for multiple cards. The idea is to use least significant information from the number first and then erase that information.
int number = uniform_rng(0, 52*51*50*49);
int card1 = number % 52;
number /= 52;
int cards2 = number % 51;
number /= 51;
......
Of course that is only minor optimization because generation is still O(N).
Generation using bit permutations
Next idea was exactly solution asked in here but I ended up still with O(N) but with larger cost than original shuffle. But lets look into solution and why it fails so miserably.
I decided to use idea Dealing All the Deals by John Christman
void Deal::generate()
{
// 52:26 split, 52!/(26!)**2 = 495,918,532,948,1041
max = 495918532948104LU;
partner = uniform_rng(eng1, max);
// 2x 26:13 splits, (26!)**2/(13!)**2 = 10,400,600**2
max = 10400600LU*10400600LU;
hands = uniform_rng(eng2, max);
// Create 104 bit presentation of deal (2 bits per card)
select_deal(id, partner, hands);
}
So far good and pretty good looking but select_deal implementation is PITA.
void select_deal(Id &new_id, uint64_t partner, uint64_t hands)
{
unsigned idx;
unsigned e, n, ns = 26;
e = n = 13;
// Figure out partnership who owns which card
for (idx = CARDS_IN_SUIT*NUM_SUITS; idx > 0; ) {
uint64_t cut = ncr(idx - 1, ns);
if (partner >= cut) {
partner -= cut;
// Figure out if N or S holds the card
ns--;
cut = ncr(ns, n) * 10400600LU;
if (hands > cut) {
hands -= cut;
n--;
} else
new_id[idx%NUM_SUITS] |= 1 << (idx/NUM_SUITS);
} else
new_id[idx%NUM_SUITS + NUM_SUITS] |= 1 << (idx/NUM_SUITS);
idx--;
}
unsigned ew = 26;
// Figure out if E or W holds a card
for (idx = CARDS_IN_SUIT*NUM_SUITS; idx-- > 0; ) {
if (new_id[idx%NUM_SUITS + NUM_SUITS] & (1 << (idx/NUM_SUITS))) {
uint64_t cut = ncr(--ew, e);
if (hands >= cut) {
hands -= cut;
e--;
} else
new_id[idx%NUM_SUITS] |= 1 << (idx/NUM_SUITS);
}
}
}
Now that I had the O(N) permutation solution done to prove algorithm could work I started searching for O(1) mapping from random number to bit permutation. Too bad it looks like only solution would be using huge lookup tables that would kill CPU caches. That doesn't sound good idea for AI that will be using very large amount of caches for double dummy analyzer.
Mathematical solution
After all hard work to figure out how to generate random bit permutations I decided go back to maths. It is entirely possible to apply filters before dealing cards. That requires splitting deals to manageable number of layered sets and selecting between sets based on their relative probabilities after filtering out impossible sets.
I don't yet have code ready for that to tests how much cycles I'm wasting in common case where filter is selecting major part of deal. But I believe this approach gives the most stable generation performance keeping the cost less than 0.1%.
Generate a 32 bit integer. For each index i (maybe only up to half the number of elements in the array), if bit i % 32 is 1, swap i with n - i - 1.
Of course, this might not be random enough for your purposes. You could probably improve this by not swapping with n - i - 1, but rather by another function applied to n and i that gives better distribution. You could even use two functions: one for when the bit is 0 and another for when it's 1.

Resources