Change the range of IRAND() in Fortran 77 [duplicate] - random

This is a follow on from a previously posted question:
How to generate a random number in C?
I wish to be able to generate a random number from within a particular range, such as 1 to 6 to mimic the sides of a die.
How would I go about doing this?

All the answers so far are mathematically wrong. Returning rand() % N does not give a uniformly distributed number in the range [0, N) unless N divides RAND_MAX + 1, the number of values rand() can return (and since RAND_MAX + 1 is typically a power of 2, that means N must be a power of 2 as well). Furthermore, one has no idea whether the residues of rand() are independent: it's possible that they go 0, 1, 2, ..., which is uniform but not very random. The only assumption it seems reasonable to make is that rand() behaves like an ideal uniform source: any two nonoverlapping subintervals of the same size are equally likely and independent. For a finite set of values, this implies a uniform distribution and also ensures that the values of rand() are nicely scattered.
This means that the only correct way of changing the range of rand() is to divide it into boxes; for example, if RAND_MAX == 11 and you want a range of 1..6, you should assign {0,1} to 1, {2,3} to 2, and so on. These are disjoint, equally-sized intervals and thus are uniformly and independently distributed.
The suggestion to use floating-point division is mathematically plausible but suffers from rounding issues in principle. Perhaps double is high-enough precision to make it work; perhaps not. I don't know and I don't want to have to figure it out; in any case, the answer is system-dependent.
The correct way is to use integer arithmetic. That is, you want something like the following:
#include <stdlib.h> // For random(), RAND_MAX
// Assumes 0 <= max <= RAND_MAX
// Returns in the closed interval [0, max]
long random_at_most(long max) {
    unsigned long
        // max <= RAND_MAX < ULONG_MAX, so this is okay.
        num_bins = (unsigned long) max + 1,
        num_rand = (unsigned long) RAND_MAX + 1,
        bin_size = num_rand / num_bins,
        defect   = num_rand % num_bins;

    long x;
    do {
        x = random();
    }
    // This is carefully written not to overflow
    while (num_rand - defect <= (unsigned long)x);

    // Truncated division is intentional
    return x/bin_size;
}
The loop is necessary to get a perfectly uniform distribution. For example, if you are given random numbers from 0 to 2 and you want only ones from 0 to 1, you just keep pulling until you don't get a 2; it's not hard to check that this gives 0 or 1 with equal probability. This method is also described in the link that nos gave in their answer, though coded differently. I'm using random() rather than rand() as it has a better distribution (as noted by the man page for rand()).
If you want to get random values outside the default range [0, RAND_MAX], then you have to do something tricky. Perhaps the most expedient is to define a function random_extended() that pulls n bits (using random_at_most()) and returns in [0, 2**n), and then apply random_at_most() with random_extended() in place of random() (and 2**n - 1 in place of RAND_MAX) to pull a random value less than 2**n, assuming you have a numerical type that can hold such a value. Finally, of course, you can get values in [min, max] using min + random_at_most(max - min), including negative values.
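For instance, the random_extended() idea might be sketched as follows. This is only an illustration (the function name and the 16-bit chunk size are my own choices, not part of the answer), assuming random_at_most() above behaves as described:

#include <stdlib.h>

long random_at_most(long max); /* as defined above */

/* Sketch: build a uniform value in [0, 2^32) from two 16-bit pulls.
 * Assumes RAND_MAX >= 65535 so that random_at_most(65535) is valid. */
unsigned long random_extended32(void) {
    unsigned long hi = (unsigned long) random_at_most(65535);
    unsigned long lo = (unsigned long) random_at_most(65535);
    return (hi << 16) | lo;
}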

Following on from Ryan Reich's answer, I thought I'd offer my cleaned-up version. The first bounds check isn't required given the second bounds check, and I've made it iterative rather than recursive. It returns values in the range [min, max], where max >= min and 1 + max - min < RAND_MAX.
unsigned int rand_interval(unsigned int min, unsigned int max)
{
    int r;
    const unsigned int range = 1 + max - min;
    const unsigned int buckets = RAND_MAX / range;
    const unsigned int limit = buckets * range;

    /* Create equal size buckets all in a row, then fire randomly towards
     * the buckets until you land in one of them. All buckets are equally
     * likely. If you land off the end of the line of buckets, try again. */
    do
    {
        r = rand();
    } while (r >= limit);

    return min + (r / buckets);
}

Here is a formula for when you know the max and min values of a range and you want to generate numbers in that range, inclusive of both bounds:
r = (rand() % (max + 1 - min)) + min

unsigned int
randr(unsigned int min, unsigned int max)
{
    /* Divide by RAND_MAX + 1.0 so scaled stays in [0, 1) and the result
     * can never be max + 1 when rand() returns RAND_MAX. */
    double scaled = (double)rand() / ((double)RAND_MAX + 1.0);
    return (max - min + 1) * scaled + min;
}
See here for other options.

Wouldn't you just do:
srand(time(NULL));
int r = ( rand() % 6 ) + 1;
% is the modulus operator. It divides by 6 and returns the remainder, which is in the range 0 to 5; the + 1 shifts that to 1 to 6.
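For completeness, a minimal compilable version of that idea (keeping in mind the modulo bias discussed in the first answer):

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void) {
    srand((unsigned) time(NULL));  /* seed once per run */
    int r = (rand() % 6) + 1;      /* 1..6; slightly biased unless 6 divides RAND_MAX + 1 */
    printf("You rolled %d\n", r);
    return 0;
}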

For those who understand the bias problem but can't stand the unpredictable run-time of rejection-based methods, this series produces a progressively less biased random integer in the [0, n-1] interval:
r = n / 2;
r = (rand() * n + r) / (RAND_MAX + 1);
r = (rand() * n + r) / (RAND_MAX + 1);
r = (rand() * n + r) / (RAND_MAX + 1);
...
It does so by synthesising a high-precision fixed-point random number of i * log_2(RAND_MAX + 1) bits (where i is the number of iterations) and performing a long multiplication by n.
When the number of bits is sufficiently large compared to n, the bias becomes immeasurably small.
It does not matter if RAND_MAX + 1 is less than n (as in this question), or if it is not a power of two, but care must be taken to avoid integer overflow if RAND_MAX * n is large.
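A sketch of that iteration wrapped in a function (my own wrapper, not code from the answer), using 64-bit intermediates so that rand() * n cannot overflow:

#include <stdint.h>
#include <stdlib.h>

/* Progressively less biased value in [0, n-1]; the bias shrinks as iterations grow. */
int rand_low_bias(int n, int iterations) {
    uint64_t r = (uint64_t) n / 2;
    for (int i = 0; i < iterations; i++)
        r = ((uint64_t) rand() * (uint64_t) n + r) / ((uint64_t) RAND_MAX + 1);
    return (int) r;
}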

Here is a slightly simpler algorithm than Ryan Reich's solution:
/// Begin and end are *inclusive*; => [begin, end]
uint32_t getRandInterval(uint32_t begin, uint32_t end) {
    uint32_t range = (end - begin) + 1;
    uint32_t limit = ((uint64_t)RAND_MAX + 1) - (((uint64_t)RAND_MAX + 1) % range);

    /* Imagine range-sized buckets all in a row, then fire randomly towards
     * the buckets until you land in one of them. All buckets are equally
     * likely. If you land off the end of the line of buckets, try again. */
    uint32_t randVal = rand();
    while (randVal >= limit) randVal = rand();

    /// Return the position you hit in the bucket + begin as random number
    return (randVal % range) + begin;
}
Example (RAND_MAX := 16, begin := 2, end := 7)
=> range := 6 (1 + end - begin)
=> limit := 12 (RAND_MAX + 1) - ((RAND_MAX + 1) % range)
The limit is always a multiple of the range,
so we can split it into range-sized buckets:
Possible-rand-output: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Buckets: [0, 1, 2, 3, 4, 5][0, 1, 2, 3, 4, 5][X, X, X, X, X]
Buckets + begin: [2, 3, 4, 5, 6, 7][2, 3, 4, 5, 6, 7][X, X, X, X, X]
1st call to rand() => 13
→ 13 is not in the bucket-range anymore (>= limit), while-condition is true
→ retry...
2nd call to rand() => 7
→ 7 is in the bucket-range (< limit), while-condition is false
→ Get the corresponding bucket-value 1 (randVal % range) and add begin
=> 3

In order to avoid the modulo bias (suggested in other answers) you can always use:
arc4random_uniform(N)
which returns a uniform value in [0, N). To cover an inclusive range, use
arc4random_uniform(MAX - MIN + 1) + MIN
where "MAX" is the upper bound and "MIN" is the lower bound. For example, for numbers between 10 and 20 inclusive:
arc4random_uniform(20 - 10 + 1) + 10
arc4random_uniform(11) + 10
Simple solution and better than using "rand() % N".
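A usage sketch (arc4random_uniform() is declared in <stdlib.h> on BSD and macOS; on other platforms it may require libbsd or a recent libc):

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    /* uniform in [10, 20], both bounds included */
    unsigned int r = arc4random_uniform(20 - 10 + 1) + 10;
    printf("%u\n", r);
    return 0;
}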

While Ryan is correct, the solution can be much simpler based on what is known about the source of the randomness. To re-state the problem:
There is a source of randomness, outputting integer numbers in range [0, MAX) with uniform distribution.
The goal is to produce uniformly distributed random integer numbers in range [rmin, rmax] where 0 <= rmin < rmax < MAX.
In my experience, if the number of bins (or "boxes") is significantly smaller than the range of the original numbers, and the original source is cryptographically strong, there is no need to go through all that rigmarole: simple modulo division suffices (like output = rnd.next() % (rmax+1), if rmin == 0) and produces random numbers that are distributed uniformly "enough", without any loss of speed. The key factor is the randomness source (i.e., kids, don't try this at home with rand()).
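In the general case (rmin not necessarily 0), that shortcut might be written as below. This is only a sketch of the modulo mapping described here, with next_random() standing in for the strong source; it inherits the (tiny) bias discussed above unless the source range dwarfs the target range:

/* next_random() is a placeholder for the cryptographically strong source,
 * assumed uniform on [0, MAX). */
unsigned long long next_random(void);

unsigned long long ranged(unsigned long long rmin, unsigned long long rmax) {
    return rmin + next_random() % (rmax - rmin + 1);
}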
Here's an example/proof of how it works in practice. I wanted to generate random numbers from 1 to 22, having a cryptographically strong source that produced random bytes (based on Intel RDRAND). The results are:
Rnd distribution test (22 boxes, numbers of entries in each box):
1: 409443 4.55%
2: 408736 4.54%
3: 408557 4.54%
4: 409125 4.55%
5: 408812 4.54%
6: 409418 4.55%
7: 408365 4.54%
8: 407992 4.53%
9: 409262 4.55%
10: 408112 4.53%
11: 409995 4.56%
12: 409810 4.55%
13: 409638 4.55%
14: 408905 4.54%
15: 408484 4.54%
16: 408211 4.54%
17: 409773 4.55%
18: 409597 4.55%
19: 409727 4.55%
20: 409062 4.55%
21: 409634 4.55%
22: 409342 4.55%
total: 100.00%
This is as close to uniform as I need for my purpose (fair dice throw, generating cryptographically strong codebooks for WWII cipher machines such as http://users.telenet.be/d.rijmenants/en/kl-7sim.htm, etc). The output does not show any appreciable bias.
Here's the source of cryptographically strong (true) random number generator:
Intel Digital Random Number Generator
and a sample code that produces 64-bit (unsigned) random numbers.
int rdrand64_step(unsigned long long int *therand)
{
    unsigned long long int foo;
    int cf_error_status;
    asm("rdrand %%rax; \
        mov $1, %%edx; \
        cmovae %%rax, %%rdx; \
        mov %%edx, %1; \
        mov %%rax, %0;"
        : "=r"(foo), "=r"(cf_error_status)
        :
        : "%rax", "%rdx");
    *therand = foo;
    return cf_error_status;
}
I compiled it on Mac OS X with clang-6.0.1 (straight), and with gcc-4.8.3 using "-Wa,q" flag (because GAS does not support these new instructions).
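A hypothetical usage of rdrand64_step() for the 1..22 example above (the asm leaves 1 in the return value when RDRAND reports success; the simple modulo mapping is acceptable here precisely because the source is cryptographically strong, as argued above):

#include <stdio.h>

int rdrand64_step(unsigned long long int *therand); /* as defined above */

int main(void) {
    unsigned long long r;
    if (rdrand64_step(&r) == 1) {      /* 1 == hardware success, per the asm above */
        printf("%llu\n", 1 + r % 22);  /* map to 1..22 */
    }
    return 0;
}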

As said before, modulo alone isn't sufficient because it skews the distribution. Here's my code, which masks off bits and uses them to ensure the distribution isn't skewed.
static uint32_t randomInRange(uint32_t a, uint32_t b) {
    uint32_t v;
    uint32_t range;
    uint32_t upper;
    uint32_t lower;
    uint32_t mask;

    if (a == b) {
        return a;
    }

    if (a > b) {
        upper = a;
        lower = b;
    } else {
        upper = b;
        lower = a;
    }

    range = upper - lower;

    mask = 0;
    //XXX calculate range with log and mask? nah, too lazy :).
    while (1) {
        if (mask >= range) {
            break;
        }
        mask = (mask << 1) | 1;
    }

    while (1) {
        v = rand() & mask;
        if (v <= range) {
            return lower + v;
        }
    }
}
The following simple code lets you look at the distribution:
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

int main() {
    unsigned long long int i;
    unsigned int n = 10;
    unsigned int numbers[n];
    for (i = 0; i < n; i++) {
        numbers[i] = 0;
    }
    for (i = 0; i < 10000000; i++) {
        uint32_t r = randomInRange(0, n - 1);
        if (r >= n) {
            printf("bug: rand out of range %u\n", (unsigned int)r);
            return 1;
        }
        numbers[r] += 1;
    }
    for (i = 0; i < n; i++) {
        printf("%llu: %u\n", i, numbers[i]);
    }
    return 0;
}

Will return a floating point number in the range [0,1]:
#define rand01() (((double)random())/((double)(RAND_MAX)))

Related

generate a random int given a random bit function

Suppose you're given an int randBit() function which returns, uniformly distributed, 0 or 1.
Write a randNumber(int max) function.
This is my implementation, but I can't prove/disprove that it's right.
// max number of bits
int i = (int) Math.floor(Math.log(max) / Math.log(2)) + 1;
int ret = randBit();
while (i-- > 0) {
    ret = ret << 1 | randBit();
}
return ret;
The basic idea I had is that
find the number of bits present in the number
then generate the number by continuously concatenating the LSB until the bitlength is met
The approach of filling an int with random bits is the right way in my opinion. However, since your algorithm only works when max is a power of 2 and is off by one in the loop, I'd suggest this modification:
// max number of bits
int i = (int) Math.floor(Math.log(max) / Math.log(2)) + 1;
int rnd = 0;
int mask = 1;
while (i-- > 0) {
    rnd = rnd << 1 | randBit();
    mask <<= 1; // or: mask *= 2
}
double q = (double) rnd / mask; // range is [0, 1)
return (int) ((max + 1) * q);
Let's take a look at this:
i will always be equal to the number of bits that max occupies. When the loop is finished, rnd will contain that many randomly filled bits, and mask - 1 will contain that many bits set to 1. So it's safe to assume that the quotient of rnd and mask - 1 is uniformly distributed between 0 and 1. Multiplied by max, this would yield results in the range between 0 and max, also uniformly distributed, in terms of floating/real values.
Now this result has to be mapped to integers, and of course you'd want them also to be uniformly distributed. The only catch here is the 1. If the quotient of rnd and mask-1 is exactly 1, there'd be an edge case that would cause trouble when scaling to the desired result range: There would be 0 .. max-1 values uniformly distributed, but max would be a rare exception.
To take care of this condition the quotient has to be built such that it ranges from 0 to 1, with 1 exclusive. This is achieved by rnd / mask. This range can be easily mapped to uniformly spread integers 0 .. max by multiplying with max + 1 and casting to int.
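For readers who want exact uniformity rather than the scaled-quotient mapping, a rejection variant of the same bit-filling idea might look like this (a sketch in C, assuming a randBit() that returns 0 or 1 uniformly):

int randBit(void);  /* assumed: returns 0 or 1, each with probability 1/2 */

/* Uniform integer in [0, max]: draw just enough bits to cover max,
 * and retry whenever the draw lands above max. */
int randNumber(int max) {
    int bits = 0;
    while (bits < 31 && (1 << bits) <= max)
        bits++;                       /* smallest bits with 2^bits > max */
    int r;
    do {
        r = 0;
        for (int i = 0; i < bits; i++)
            r = (r << 1) | randBit();
    } while (r > max);                /* acceptance probability is at least 1/2 */
    return r;
}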

What would be the most efficient algorithm to find the kth digit from the right of a^b, i.e. a raised to the power b?

I recently came across this question. I know the naive approach i.e to find a^b and then extract least significant digits of this number 'k' times.
I am looking for a better approach.
'a' and 'b' are integers.
The naive approach breaks as soon as an intermediate power overflows, even though only the last k digits are needed. A solution which only requires 2*10^k - 2 to fit into the variables is to compute (a*b) mod 10^k using Russian peasant multiplication: it calculates the product by repeatedly doubling a and halving b, and taking the modulus after each step prevents the overflow.
Here is a c++ implementation of function calculating (a*b)%m without an overflow:
unsigned long long abModm(unsigned long long a, unsigned long long b,unsigned long long m){
unsigned long long res=0;
a=a%m;
b=b%m;
while (b>0){
if (b&1==1){//is b odd
res=(res+a)%m;//collect the result
}
a=(a<<1)%m;//multiply a
b>>=1;//divide b
}
return res;
}
Then you can use this to solve the problem as already suggested by others:
int kthDigit(unsigned long long a, unsigned long long b, int k) {
    unsigned long long m = 1;
    for (int i = 0; i < k; ++i) m *= 10;
    unsigned long long res = 1;
    for (unsigned long long i = 0; i < b; ++i) {
        res = abModm(res, a, m);
    }
    m /= 10;
    return res / m;
}
The exponent calculation above is O(b); you can do it in O(log(b)) with:
unsigned long long res = 1;
while (b) {
    if (b & 1) res = abModm(res, a, m);
    b >>= 1;
    a = abModm(a, a, m);
}
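Putting the pieces together, here is a sketch of the full O(log b) routine (my own combination of the two snippets above, not code from the answer):

unsigned long long abModm(unsigned long long a, unsigned long long b,
                          unsigned long long m); /* as defined above */

/* k-th digit from the right of a^b (k >= 1), computed modulo 10^k. */
int kthDigitFast(unsigned long long a, unsigned long long b, int k) {
    unsigned long long m = 1;
    for (int i = 0; i < k; ++i) m *= 10;   /* m = 10^k */
    unsigned long long res = 1;
    while (b) {                            /* fast exponentiation mod m */
        if (b & 1) res = abModm(res, a, m);
        b >>= 1;
        a = abModm(a, a, m);
    }
    return (int)(res / (m / 10));
}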
Check for the special case that a is divisible by 10. If k < b the result is 0, if k ≥ b then it's the (k - b'th) digit of (a/10)^b.
Do the calculation modulo 10^(k + 1). Replace a with a modulo 10^(k + 1). With 64 bit arithmetic, the calculation is easy if k ≤ 18 and a < 2^32.
Do the power by multiplying in steps, and in each step, discard the highest digits that will not influence the digit you're looking for. This will allow you to go beyond the integer size limitations of your implementation. In Javascript, which is limited to 2^53 - 1, you can calculate e.g. the 9th digit of 999999^999999.
function powerDigit(a, b, k) {
    var c = 1, max = Math.pow(10, k);
    a %= max;
    while (b--) {
        c *= a;
        // if (c >= Math.pow(2, 53)) return NaN; // Javascript limitation
        c %= max;
    }
    return Math.floor(c * 10 / max);
}
document.write(powerDigit(9, 9, 9) + "<BR>"); // 3 ; 387420489
document.write(powerDigit(99, 9, 9) + "<BR>"); // 4 ; 913517247483640899
document.write(powerDigit(99, 99, 9) + "<BR>"); // 2 ; 3.697296376497267726e+197
document.write(powerDigit(999, 999, 9) + "<BR>"); // 4 ; 3.680634882592232678e+2996
document.write(powerDigit(999999, 999999, 9)); // 9 ; millions of digits
First you need to find a^b, then divide it by 10^(k-1), and the result modulo 10 gives the kth digit from the right.
Here is an example in C:
double r = pow(a, b) / pow(10, k - 1);
int result = (int) r % 10;  /* only reliable while a^b is exactly representable in a double */

A problem from a programming competition... Digit Sums

I need help solving problem N from this earlier competition:
Problem N: Digit Sums
Given 3 positive integers A, B and C,
find how many positive integers less
than or equal to A, when expressed in
base B, have digits which sum to C.
Input will consist of a series of
lines, each containing three integers,
A, B and C, 2 ≤ B ≤ 100, 1 ≤ A, C ≤
1,000,000,000. The numbers A, B and C
are given in base 10 and are separated
by one or more blanks. The input is
terminated by a line containing three
zeros.
Output will be the number of numbers,
for each input line (it must be given
in base 10).
Sample input
100 10 9
100 10 1
750000 2 2
1000000000 10 40
100000000 100 200
0 0 0
Sample output
10
3
189
45433800
666303
The relevant rules:
Read all input from the keyboard, i.e. use stdin, System.in, cin or equivalent. Input will be redirected from a file to form the input to your submission.
Write all output to the screen, i.e. use stdout, System.out, cout or equivalent. Do not write to stderr. Do NOT use, or even include, any module that allows direct manipulation of the screen, such as conio, Crt or anything similar. Output from your program is redirected to a file for later checking. Use of direct I/O means that such output is not redirected and hence cannot be checked. This could mean that a correct program is rejected!
Unless otherwise stated, all integers in the input will fit into a standard 32-bit computer word. Adjacent integers on a line will be separated by one or more spaces.
Of course, it's fair to say that I should learn more before trying to solve this, but I'd really appreciate it if someone here told me how it's done.
Thanks in advance, John.
Other people pointed out the trivial solution: iterate over all numbers from 1 to A. But this problem can actually be solved in nearly constant time: O(length of A), which is O(log(A)).
The code provided is for base 10. Adapting it for an arbitrary base is trivial.
To reach the above time estimate, you need to add memoization to the recursion. Let me know if you have questions about that part.
Now, recursive function itself. Written in Java, but everything should work in C#/C++ without any changes. It's big, but mostly because of comments where I try to clarify algorithm.
// returns amount of numbers strictly less than 'num' with sum of digits 'sum'
// pay attention to word 'strictly'
int count(int num, int sum) {
    // no numbers with negative sum of digits
    if (sum < 0) {
        return 0;
    }

    int result = 0;

    // imagine, 'num' == 1234
    // let's check numbers 1233, 1232, 1231, 1230 manually
    while (num % 10 > 0) {
        --num;
        // check if current number is good
        if (sumOfDigits(num) == sum) {
            // one more result
            ++result;
        }
    }

    if (num == 0) {
        // zero reached, no more numbers to check
        return result;
    }

    num /= 10;

    // Using example above (1234), now we're left with numbers
    // strictly less than 1230 to check (1..1229)
    // It means, any number less than 123 with arbitrary digit appended to the right
    // E.g., if this digit in the right (last digit) is 3,
    // then sum of the other digits must be "sum - 3"
    // and we need to add to result 'count(123, sum - 3)'

    // let's iterate over all possible values of last digit
    for (int digit = 0; digit < 10; ++digit) {
        result += count(num, sum - digit);
    }

    return result;
}
Helper function
// returns sum of digits, plain and simple
int sumOfDigits(int x) {
    int result = 0;
    while (x > 0) {
        result += x % 10;
        x /= 10;
    }
    return result;
}
Now, let's write a little tester
int A = 12345;
int C = 13;

// recursive solution
System.out.println(count(A + 1, C));

// brute-force solution
int total = 0;
for (int i = 1; i <= A; ++i) {
    if (sumOfDigits(i) == C) {
        ++total;
    }
}
System.out.println(total);
You can write more comprehensive tester checking all values of A, but overall solution seems to be correct. (I tried several random A's and C's.)
Don't forget, you can't test the solution for A == 1000000000 without memoization: it'll run too long. But with memoization, you can test it even for A == 10^1000.
edit
Just to prove the concept, here's a poor man's memoization (in Java; in other languages hashtables are declared differently). But if you want to learn something, it might be better to try to do it yourself.
// hold values here
private Map<String, Integer> mem;

int count(int num, int sum) {
    // no numbers with negative sum of digits
    if (sum < 0) {
        return 0;
    }
    String key = num + " " + sum;
    if (mem.containsKey(key)) {
        return mem.get(key);
    }
    // ...
    // continue as above...
    // ...
    mem.put(key, result);
    return result;
}
Here's the same memoized recursive solution that Rybak posted, but with a simpler implementation, in my humble opinion:
HashMap<String, Integer> cache = new HashMap<String, Integer>();

int count(int bound, int base, int sum) {
    // No negative digit sums.
    if (sum < 0)
        return 0;
    // Handle one digit case.
    if (bound < base)
        return (sum <= bound) ? 1 : 0;
    String key = bound + " " + sum;
    if (cache.containsKey(key))
        return cache.get(key);
    int count = 0;
    for (int digit = 0; digit < base; digit++)
        count += count((bound - digit) / base, base, sum - digit);
    cache.put(key, count);
    return count;
}
This is not the complete solution (no input parsing). To get the number in base B, repeatedly take the modulo B, and then divide by B until the result is 0. This effectively computes the base-B digit from the right, and then shifts the number right.
int A, B, C; // from input
int total = 0;
for (int x = 1; x <= A; x++)  // "less than or equal to A"
{
    int sumDigits = 0;
    int v = x;
    while (v != 0) {
        sumDigits += (v % B);
        v /= B;
    }
    if (sumDigits == C)
        total++;
}
cout << total;
This is a brute force approach. It may be possible to compute this quicker by determining which sets of base B digits add up to C, arranging these in all permutations that are less than A, and then working backwards from that to create the original number.
Yum.
Try this:
int number, digitSum, resultCounter = 0;
for (int i = 1; i <= A; i++)
{
    number = i; // to avoid screwing up our counter
    digitSum = 0;
    while (number > 1)
    {
        // this is the next "digit" of the number as it would be in base B;
        // works with any base including 10.
        digitSum += (number % B);
        // remove this digit from the number, rinse, repeat
        number /= B;
    }
    digitSum += number;
    // Does the sum match?
    if (digitSum == C)
        resultCounter++;
}
That's your basic algorithm for one line. Now you wrap this in another for loop for each input line you received, preceded by the input collection phase itself. This process can be simplified, but I don't feel like coding your entire answer to see if my algorithm works, and this looks right, whereas the simpler tricks are harder to verify by inspection.
The way this works is by modulo dividing by powers of the base. Simple example, 1234 in base 10:
1234 % 10 = 4
1234 / 10 = 123 //integer division truncates any fraction
123 % 10 = 3 //sum is 7
123 / 10 = 12
12 % 10 = 2 //sum is 9
12 / 10 = 1 //end condition, add this and the sum is 10
A harder example to figure out by inspection would be the same number in base 12:
1234 % 12 = 10 //you can call it "A" like in hex, but we need a sum anyway
1234 / 12 = 102
102 % 12 = 6 // sum 16
102/12 = 8
8 % 12 = 8 //sum 24
8 / 12 = 0 //end condition, sum still 24.
So 1234 in base 12 would be written 86A. Check the math:
8*12^2 + 6*12 + 10 = 1152 + 72 + 10 = 1234
Have fun wrapping the rest of the code around this.

How to find a binary logarithm very fast? (O(1) at best)

Is there any very fast method to find a binary logarithm of an integer number? For example, given a number
x=52656145834278593348959013841835216159447547700274555627155488768, such an algorithm must find y = log2(x), which is 215. x is always a power of 2.
The problem seems to be really simple. All that is required is to find the position of the most significant 1 bit. There is a well-known method, FloorLog, but it is not very fast, especially for very long multi-word integers.
What is the fastest method?
A quick hack: Most floating-point number representations automatically normalise values, meaning that they effectively perform the loop Christoffer Hammarström mentioned in hardware. So simply converting from an integer to FP and extracting the exponent should do the trick, provided the numbers are within the FP representation's exponent range! (In your case, your integer input requires multiple machine words, so multiple "shifts" will need to be performed in the conversion.)
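As a sketch of that trick for a value that fits in one machine word (my own example; it assumes the value is within the double's exponent range and uses frexp() from <math.h>):

#include <math.h>

/* floor(log2(x)) for x > 0, via the hardware's FP normalisation.
 * Beware of rounding for values with more than 53 significant bits. */
int floor_log2(unsigned long long x) {
    int exp;
    frexp((double) x, &exp);  /* x == m * 2^exp with m in [0.5, 1) */
    return exp - 1;
}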
If the integers are stored in a uint32_t a[], then my obvious solution would be as follows:
Run a linear search over a[] to find the highest-indexed non-zero uint32_t value a[i] in a[] (test using uint64_t for that search if your machine has native uint64_t support)
Apply the bit twiddling hacks to find the binary log b of the uint32_t value a[i] you found in step 1.
Evaluate 32*i+b.
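A sketch of those three steps for a little-endian array of 32-bit limbs (the names and length parameter are illustrative, not part of the answer):

#include <stdint.h>

/* Returns 32*i + b, i.e. the bit position of the most significant set bit,
 * or -1 if every limb is zero. */
int bin_log_multiword(const uint32_t a[], int nwords) {
    int i = nwords - 1;
    while (i >= 0 && a[i] == 0)   /* step 1: highest non-zero limb */
        i--;
    if (i < 0) return -1;
    uint32_t v = a[i];
    int b = 0;
    while (v >>= 1)               /* step 2: floor(log2) of that limb */
        b++;
    return 32 * i + b;            /* step 3 */
}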
The answer is implementation or language dependent. Any implementation can store the number of significant bits along with the data, as it is often useful. If it must be calculated, then find the most significant word/limb and the most significant bit in that word.
If you're using fixed-width integers then the other answers already have you pretty-well covered.
If you're using arbitrarily large integers, like int in Python or BigInteger in Java, then you can take advantage of the fact that their variable-size representation uses an underlying array, so the base-2 logarithm can be computed easily and quickly in O(1) time using the length of the underlying array. The base-2 logarithm of a power of 2 is simply one less than the number of bits required to represent the number.
So when n is an integer power of 2:
In Python, you can write n.bit_length() - 1 (docs).
In Java, you can write n.bitLength() - 1 (docs).
You can create an array of logarithms beforehand. This will find logarithmic values up to log(N):
#define N 100000
int naj[N + 1];

naj[2] = 1;
for (int i = 3; i <= N; i++)
{
    naj[i] = naj[i - 1];
    if ((1 << (naj[i] + 1)) <= i)
        naj[i]++;
}
The array naj holds your logarithm values: naj[k] = log(k), where the log is base two.
This uses binary search for finding the closest power of 2.
public static int binLog(int x, boolean shouldRoundResult) {
    // assuming 32-bit integer
    int lo = 0;
    int hi = 31;
    int rangeDelta = hi - lo;
    int expGuess = 0;
    int guess;
    while (rangeDelta > 1) {
        expGuess = (lo + hi) / 2; // or (loGuess + hiGuess) >> 1
        guess = 1 << expGuess;
        if (guess < x) {
            lo = expGuess;
        } else if (guess > x) {
            hi = expGuess;
        } else {
            lo = hi = expGuess;
        }
        rangeDelta = hi - lo;
    }
    if (shouldRoundResult && hi > lo) {
        int loGuess = 1 << lo;
        int hiGuess = 1 << hi;
        int loDelta = Math.abs(x - loGuess);
        int hiDelta = Math.abs(hiGuess - x);
        if (loDelta < hiDelta)
            expGuess = lo;
        else
            expGuess = hi;
    } else {
        expGuess = lo;
    }
    int result = expGuess;
    return result;
}
The best option off the top of my head would be an O(log(log n)) approach, using binary search. Here is an example for a 64-bit (<= 2^63 - 1) number (in C++):
int log2(int64_t num) {
    int res = 0;
    for (int i = 32; i > 0; i--) {
        res += i;
        if (((1LL << res) - 1) & num)
            res -= i;
    }
    return res;
}
This algorithm basically provides the highest number res such that ((2^res - 1) & num) == 0. Of course, for any number, you can work it out in a similar manner:
int log2_better(int64_t num) {
    int res = 0;
    for (int i = 32; i > 0; i >>= 1) {
        if ((1LL << (res + i)) <= num)
            res += i;
    }
    return res;
}
Note that this method relies on the fact that the "bitshift" operation is more or less O(1). If this is not the case, you would have to precompute either all the powers of 2, or the numbers of the form 2^2^i (2^1, 2^2, 2^4, 2^8, etc.) and do some multiplications (which in this case aren't O(1)).
The example in the OP is an integer string of 65 characters, which is not representable by an INT64 or even an INT128. It is still very easy to get Log(2, x) from this string by converting it to a double-precision number. This at least gives you easy access to integers up to 2^1023.
Below you find some form of pseudocode
# 1. read the string
string="52656145834278593348959013841835216159447547700274555627155488768"
# 2. extract the length of the string
l=length(string) # l = 65
# 3. read the first min(l,17) digits in a float
float=to_float(string(1: min(17,l) ))
# 4. multiply with the correct power of 10
float = float * 10^(l-min(17,l) ) # float = 5.2656145834278593E64
# 5. Take the log2 of this number and round to the nearest integer
log2 = Round( Log(float,2) ) # 215
Note:
some computer languages can convert arbitrary strings into a double precision number. So steps 2,3 and 4 could be replaced by x=to_float(string)
Step 5 could be done more quickly by reading the double-precision exponent field directly (the 11 bits just below the sign bit) and subtracting 1023 from it; see the sketch below.
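A sketch of that shortcut in C (assumes a 64-bit IEEE-754 double and x > 0; not part of the original pseudocode):

#include <stdint.h>
#include <string.h>

/* Unbiased exponent of a positive double: equals floor(log2(x)),
 * and exactly log2(x) when x is a power of 2. */
int double_exponent(double x) {
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);             /* reinterpret the bit pattern */
    return (int)((bits >> 52) & 0x7FF) - 1023;  /* 11-bit exponent field, minus bias */
}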
Quick example code: If you have awk you can quickly test this algorithm.
The following code creates the first 300 powers of two:
awk 'BEGIN{for(n=0;n<300; n++) print 2^n}'
The following reads the input and does the above algorithm:
awk '{ l=length($0); m = (l > 17 ? 17 : l)
x = substr($0,1,m) * 10^(l-m)
print log(x)/log(2)
}'
So the following bash-command is a convoluted way to create a consecutive list of numbers from 0 to 299:
$ awk 'BEGIN{for(n=0;n<300; n++) print 2^n}' | awk '{ l=length($0); m = (l > 17 ? 17 : l); x = substr($0,1,m) * 10^(l-m); print log(x)/log(2) }'
0
1
2
...
299

Is there a simple algorithm that can determine if X is prime?

I have been trying to work my way through Project Euler, and have noticed a handful of problems ask for you to determine a prime number as part of it.
I know I can just divide x by 2, 3, 4, 5, ..., square root of X and if I get to the square root, I can (safely) assume that the number is prime. Unfortunately this solution seems quite klunky.
I've looked into better algorithms on how to determine if a number is prime, but get confused fast.
Is there a simple algorithm that can determine if X is prime, and not confuse a mere mortal programmer?
Thanks much!
The first algorithm is quite good and used a lot on Project Euler. If you know the maximum number that you want you can also research Eratosthenes's sieve.
If you maintain a list of primes, you can also refine the first algorithm to divide only by primes up to the square root of the number.
With these two algorithms (trial division and the sieve) you should be able to solve the problems.
Edit: fixed name as noted in comments
To generate all prime numbers less than a limit Sieve of Eratosthenes (the page contains variants in 20 programming languages) is the oldest and the simplest solution.
In Python:
def iprimes_upto(limit):
    is_prime = [True] * limit
    for n in range(2, limit):
        if is_prime[n]:
            yield n
            for i in range(n*n, limit, n): # start at ``n`` squared
                is_prime[i] = False
Example:
>>> list(iprimes_upto(15))
[2, 3, 5, 7, 11, 13]
I see that Fermat's primality test has already been suggested, but I've been working through Structure and Interpretation of Computer Programs, and they also give the Miller-Rabin test (see Section 1.2.6, problem 1.28) as another alternative. I've been using it with success for the Euler problems.
Here's a simple optimization of your method that isn't quite the Sieve of Eratosthenes but is very easy to implement: first try dividing X by 2 and 3, then loop over j=1..sqrt(X)/6, trying to divide by 6*j-1 and 6*j+1. This automatically skips over all numbers divisible by 2 or 3, gaining you a pretty nice constant factor acceleration.
Keeping in mind the following facts (from MathsChallenge.net):
All primes except 2 are odd.
All primes greater than 3 can be written in the form 6k - 1 or 6k + 1.
You don't need to check past the square root of n
Here's the C++ function I use for relatively small n:
bool isPrime(unsigned long n)
{
    if (n == 1) return false;       // 1 is not prime
    if (n < 4) return true;         // 2 and 3 are both prime
    if ((n % 2) == 0) return false; // exclude even numbers
    if (n < 9) return true;         // we have already excluded 4, 6, and 8
    if ((n % 3) == 0) return false; // exclude remaining multiples of 3

    unsigned long r = floor( sqrt(n) );
    unsigned long f = 5;
    while (f <= r)
    {
        if ((n % f) == 0) return false;
        if ((n % (f + 2)) == 0) return false;
        f = f + 6;
    }
    return true; // (in all other cases)
}
You could probably think of more optimizations of your own.
I'd recommend Fermat's primality test. It is a probabilistic test, but it is correct surprisingly often. And it is incredibly fast when compared with the sieve.
For reasonably small numbers, x%n for up to sqrt(x) is awfully fast and easy to code.
Simple improvements:
test 2 and odd numbers only.
test 2, 3, and numbers of the form 6k ± 1 (all primes other than 2 and 3 have the form 6k ± 1, so you're essentially just skipping all even numbers and all multiples of 3)
test only prime numbers (requires calculating or storing all primes up to sqrt(x))
You can use the sieve method to quickly generate a list of all primes up to some arbitrary limit, but it tends to be memory intensive. You can use the multiples of 6 trick to reduce memory usage down to 1/3 of a bit per number.
I wrote a simple prime class (C#) that uses two bitfields for multiples of 6+1 and multiples of 6-1, then does a simple lookup... and if the number I'm testing is outside the bounds of the sieve, then it falls back on testing by 2, 3, and multiples of 6 +/- 1. I found that generating a large sieve actually takes more time than calculating primes on the fly for most of the Euler problems I've solved so far. KISS principle strikes again!
I wrote a prime class that uses a sieve to pre-calculate smaller primes, then relies on testing by 2, 3, and multiples of six +/- 1 for ones outside the range of the sieve.
For Project Euler, having a list of primes is really essential. I would suggest maintaining a list that you use for each problem.
I think what you're looking for is the Sieve of Eratosthenes.
You're right, the simplest is the slowest. You can optimize it somewhat.
Look into using modulus instead of square roots.
Keep track of your primes. You only need to divide 7 by 2, 3, and 5, since 6 is a multiple of 2 and 3, and 4 is a multiple of 2.
Rslite mentioned the Eratosthenes sieve. It is fairly straightforward. I have it in several languages at home. Add a comment if you want me to post that code later.
Here is my C++ one. It has plenty of room to improve, but it is fast compared to the dynamic language versions.
// Author: James J. Carman
// Project: Sieve of Eratosthenes
// Description: I take an array of 2 ... max values. Instead of removing the non-prime numbers,
//              I mark them as 0 and ignore them.
// More info: http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
#include <iostream>

int main(void) {
    // using unsigned short.
    // maximum value is around 65000
    const unsigned short max = 50000;
    unsigned short x[max];

    for (unsigned short i = 0; i < max; i++)
        x[i] = i + 2;

    for (unsigned short outer = 0; outer < max; outer++) {
        if (x[outer] == 0)
            continue;
        unsigned short item = x[outer];
        for (unsigned short multiplier = 2; (multiplier * item) < x[max - 1]; multiplier++) {
            unsigned int searchvalue = item * multiplier;
            unsigned int maxValue = max + 1;
            for (unsigned short maxIndex = max - 1; maxIndex > 0; maxIndex--) {
                if (x[maxIndex] != 0) {
                    maxValue = x[maxIndex];
                    break;
                }
            }
            for (unsigned short searchindex = multiplier; searchindex < max; searchindex++) {
                if (searchvalue > maxValue)
                    break;
                if (x[searchindex] == searchvalue) {
                    x[searchindex] = 0;
                    break;
                }
            }
        }
    }

    for (unsigned short printindex = 0; printindex < max; printindex++) {
        if (x[printindex] != 0)
            std::cout << x[printindex] << "\t";
    }

    return 0;
}
I will throw up the Perl and Python code I have as well, as soon as I find it. They are similar in style, just fewer lines.
Here is a simple primality test in D (Digital Mars):
/**
 * to compile:
 *   $ dmd -run prime_trial.d
 * to optimize:
 *   $ dmd -O -inline -release prime_trial.d
 */
module prime_trial;

import std.conv : to;
import std.stdio : w = writeln;

/// Adapted from: http://www.devx.com/vb2themax/Tip/19051
bool isprime(Integer)(in Integer number)
{
    /* manually test 1, 2, 3 and multiples of 2 and 3 */
    if (number == 2 || number == 3)
        return true;
    else if (number < 2 || number % 2 == 0 || number % 3 == 0)
        return false;

    /* we can now avoid considering multiples
     * of 2 and 3. This can be done really simply
     * by starting at 5 and incrementing by 2 and 4
     * alternately, that is:
     *   5, 7, 11, 13, 17, 19, 23, 25, 29, 31, 35, 37, ...
     * we don't need to go higher than the square root of the number */
    for (Integer divisor = 5, increment = 2; divisor * divisor <= number;
         divisor += increment, increment = 6 - increment)
        if (number % divisor == 0)
            return false;

    return true; // if we get here, the number is prime
}

/// print all prime numbers less than a given limit
void main(char[][] args)
{
    const limit = (args.length == 2) ? to!(uint)(args[1]) : 100;
    for (uint i = 0; i < limit; ++i)
        if (isprime(i))
            w(i);
}
I am working through the Project Euler problems as well and in fact just finished #3 (by id), which is the search for the highest prime factor of a composite number (the number in the question is 600851475143).
I looked at all of the info on primes (the sieve techniques already mentioned here) and on integer factorization on Wikipedia and came up with a brute-force trial division algorithm that I decided would do.
So as I am doing the Euler problems to learn Ruby, I was looking into coding my algorithm and stumbled across the mathn library, which has a Prime class and an Integer class with a prime_division method. How cool is that? I was able to get the correct answer to the problem with this Ruby snippet:
require "mathn.rb"
puts 600851475143.prime_division.last.first
This snippet outputs the correct answer to the console. Of course, I ended up doing a ton of reading and learning before I stumbled upon this little beauty; I just thought I would share it with everyone...
I like this python code.
def primes(limit):
    limit += 1
    x = range(limit)
    for i in xrange(2, limit):
        if x[i] == i:
            x[i] = 1
            for j in xrange(i*i, limit, i):
                x[j] = i
    return [j for j in xrange(2, limit) if x[j] == 1]
A variant of this can be used to generate the factors of a number.
def factors(limit):
    limit += 1
    x = range(limit)
    for i in xrange(2, limit):
        if x[i] == i:
            x[i] = 1
            for j in xrange(i*i, limit, i):
                x[j] = i
    result = []
    y = limit - 1
    while x[y] != 1:
        divisor = x[y]
        result.append(divisor)
        y /= divisor
    result.append(y)
    return result
Of course, if I were factoring a batch of numbers, I would not recalculate the cache; I'd do it once and do lookups in it.
It's not optimized, but it's a very simple function.
function isprime(number) {
    if (number == 1)
        return false;

    var times = 0;
    for (var i = 1; i <= number; i++) {
        if (number % i == 0) {
            times++;
        }
    }
    if (times > 2) {
        return false;
    }
    return true;
}
Maybe this implementation in Java can be helpful:
public class SieveOfEratosthenes {

    /**
     * Calling this method with argument 7 will return: true true false false true false true false
     * which must be interpreted as: 0 is NOT prime, 1 is NOT prime, 2 IS prime, 3 IS prime, 4 is NOT prime,
     * 5 IS prime, 6 is NOT prime, 7 IS prime.
     * Caller may either invert the array for easier reading, count the number of primes or extract the prime values
     * by looping.
     * @param upTo Find prime numbers up to this value. Must be a positive integer.
     * @return a boolean array where the index represents the integer value and the value at that index tells
     *         whether the number is NOT prime.
     */
    public static boolean[] isIndexNotPrime(int upTo) {
        if (upTo < 2) {
            return new boolean[0];
        }

        // 0-index array, upper limit must be upTo + 1
        final boolean[] isIndexNotPrime = new boolean[upTo + 1];

        isIndexNotPrime[0] = true; // 0 is not a prime number.
        isIndexNotPrime[1] = true; // 1 is not a prime number.

        // Find all non primes starting from 2 by finding 2 * 2, 2 * 3, 2 * 4 until 2 * multiplier > isIndexNotPrime.len
        // Find next by 3 * 3 (since 2 * 3 was found before), 3 * 4, 3 * 5 until 3 * multiplier > isIndexNotPrime.len
        // Move to 4; since isIndexNotPrime[4] is already true (not prime), no need to loop.
        // Move to 5: 5 * 5 (2 * 5 and 3 * 5 were already set to true) until 5 * multiplier > isIndexNotPrime.len
        // Repeat process until i * i > isIndexNotPrime.len.
        // Assume we are looking up to 100. Break once you reach 11, since 11 * 11 == 121 and we are not interested in
        // primes above 121.
        for (int i = 2; i < isIndexNotPrime.length; i++) {
            if (i * i >= isIndexNotPrime.length) {
                break;
            }
            if (isIndexNotPrime[i]) {
                continue;
            }
            int multiplier = i;
            while (i * multiplier < isIndexNotPrime.length) {
                isIndexNotPrime[i * multiplier] = true;
                multiplier++;
            }
        }

        return isIndexNotPrime;
    }

    public static void main(String[] args) {
        final boolean[] indexNotPrime = SieveOfEratosthenes.isIndexNotPrime(7);

        assert !indexNotPrime[2]; // Not (not prime)
        assert !indexNotPrime[3]; // Not (not prime)
        assert indexNotPrime[4];  // (not prime)
        assert !indexNotPrime[5]; // Not (not prime)
        assert indexNotPrime[6];  // (not prime)
        assert !indexNotPrime[7]; // Not (not prime)
    }
}
The AKS prime testing algorithm:
Input: integer n > 1

if (n has the form a^b with b > 1) then output COMPOSITE

r := 2
while (r < n) {
    if (gcd(n, r) is not 1) then output COMPOSITE
    if (r is a prime greater than 2) then {
        let q be the largest factor of r-1
        if (q > 4*sqrt(r)*log n) and (n^((r-1)/q) is not 1 (mod r)) then break
    }
    r := r+1
}

for a = 1 to 2*sqrt(r)*log n {
    if ( (x-a)^n is not (x^n - a) (mod x^r - 1, n) ) then output COMPOSITE
}

output PRIME;
Another way in Python is:
import math

def main():
    print 2          # 2 is the only even prime; handle it up front
    count = 3        # then test odd numbers only
    while True:
        isprime = True
        for x in range(2, int(math.sqrt(count) + 1)):
            if count % x == 0:
                isprime = False
                break
        if isprime:
            print count
        count += 2

if __name__ == '__main__':
    main()
