Fastest way to generate random numbers - random

libc has random, which "uses a nonlinear additive feedback random number generator employing a default table of size 31 long integers to return successive pseudo-random numbers".
I'm looking for a random function written in Rust with the same speed. It doesn't need to be crypto-safe, pseudo-random is good enough.
Looking through the rand crate it seems that XorShiftRng is the closest fit to this need:
The Xorshift algorithm is not suitable for cryptographic purposes but is very fast
When I use it like this:
extern crate rand;
extern crate time;
use rand::Rng;

let mut rng = rand::XorShiftRng::new_unseeded();
rng.next_u64();
it is about 33% slower than libc's random. (Sample code generating 8'000'000 random numbers).
In the end, I'll need i64 random numbers, so when I run rng.gen() this is already 100% slower than libc's random. When casting with rng.next_u64() as i64, it is 60% slower.
Is there any way to achieve the same speed in Rust without using any unsafe code?

Be sure to compile the code you are measuring in release mode, otherwise your benchmark is not representative of Rust's performance.
In order to obtain meaningful numbers, you must also modify the benchmark to do something with the generated numbers, e.g. collect them into a vector [1]. Failing to do so can make the compiler optimize the whole loop away because it has no side effects. This is what happened in your second attempt that led you to conclude that XorShiftRng is 760,000 thousand times faster than libc::random.
With the changed benchmark run in release mode, XorShiftRng ends up approximately 2x faster than libc::random:
PT0.101378490S seconds for libc::random
PT0.050827393S seconds for XorShiftRng
[1] The compiler could also be smart enough to realize that the vector is unused and optimize it away as well, but current rustc does not do so, and storing elements into a vector is enough. A simple and future-proof way to ensure that the generation is not optimized away is to sum up the numbers and write out the result.
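The shape of such a benchmark can be sketched as follows. This is a minimal illustration in Python rather than the Rust code being measured, and the names benchmark and next_value are made up for the example. The Python interpreter does not eliminate dead loops the way an optimizing compiler can, so the sketch only shows the structure: every generated value feeds a checksum, and the checksum is printed.

```python
import random
import time


def benchmark(next_value, count):
    """Time `count` calls of next_value and return (elapsed_seconds, checksum)."""
    checksum = 0
    start = time.perf_counter()
    for _ in range(count):
        # Summing every value means the generated numbers are actually used.
        checksum += next_value()
    elapsed = time.perf_counter() - start
    return elapsed, checksum


rng = random.Random(12345)  # fixed seed so the checksum is repeatable
elapsed, checksum = benchmark(lambda: rng.getrandbits(64), 100_000)
# Writing out the checksum is the side effect that keeps the loop observable.
print(f"{elapsed:.6f} s, checksum {checksum}")
```

In the Rust benchmark the equivalent would be to accumulate the results of rng.next_u64() and print the sum after the timed loop.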

Related

Generate random number in interval in PostScript

I am struggling to find a way to generate a random number within a given interval in PostScript.
Basically PostScript has three functions to help you generate (pseudo-)random numbers. Those are rand, srand and rrand.
The latter two are for passing a seed to the number generator to be able to reproduce specific results. At least that's what I understood they are for. Anyway, they don't seem suitable for my case.
So rand seems to be the only function I can use to generate a random number, but...
rand returns a random integer in the range 0 to 2^31 − 1 (From the PostScript Language Reference, page 637 (651 in the PDF))
This is far beyond the interval I'm looking for. I am more interested in values up to the small thousands, maybe 10,000 or something like that, and small float values, up to 100, all with a lower limit of 0.
I thought I could just narrow my numbers down by simple divisions and extracting the root, but that tends to give me unusably small values in quite a lot of cases. I am wondering if there are robust ways to either shrink a large number down to what I need or, as I'd prefer, only generate numbers in the desired interval.
Besides: while-loops are not possible in PostScript, otherwise I'd have written a function to generate numbers until they fit in my interval.
Any hints on what to look for breaking numbers down into my interval?
mod is often good enough and it's fast. But you may get a more uniform distribution by using floating-point ops.
rand 16#7fffffff div 100 mul cvi
This is because mod discards the upper bits of the input. And the PRNG is usually trying to randomize over all the bits. By scaling down then up, they all contribute something in the way of rounding effects.
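The two reductions can be compared side by side in a short sketch. It is written in Python rather than PostScript, and the function names and the RAND_MAX constant are made up for the illustration. Note that, like the PostScript one-liner above, the scaling variant returns n itself in the single case where the input equals the maximum.

```python
import random

RAND_MAX = 2**31 - 1  # largest value PostScript's rand can return


def reduce_by_mod(r, n):
    """Keep only the remainder: the result depends on the low-order part of r."""
    return r % n


def reduce_by_scaling(r, n):
    """Scale r down to [0, 1] and back up to [0, n], as in
    `rand 16#7fffffff div 100 mul cvi`; cvi truncates like int()."""
    return int(r / RAND_MAX * n)


r = random.randrange(RAND_MAX + 1)  # stand-in for one call to rand
print(r, reduce_by_mod(r, 100), reduce_by_scaling(r, 100))
```

Dividing by RAND_MAX + 1 instead of RAND_MAX would keep the scaled result strictly below n, if that edge case matters.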
Just use the modulo operator to get it down to the size you want:
GS>rand 100 mod stack
7

C++ random_shuffle() how does it work?

I have a Deck vector with 52 Cards, and I want to shuffle it.
vector<Card^> cards;
So I used this:
random_shuffle(cards.begin(), cards.end());
The problem was that it gave me the same result every time, so I used srand to randomize it:
srand(unsigned(time(NULL)));
random_shuffle(cards.begin(),cards.end());
This was still not truly random. When I started dealing cards, it was the same as in the last run. For example: "1. deal: A,6,3,2,K; 2. deal: Q,8,4,J,2", and when I restarted the program I got exactly the same order of deals.
Then I used srand() and random_shuffle with its 3rd parameter:
int myrandom (int i) {
return std::rand()%i;
}
srand(unsigned(time(NULL)));
random_shuffle(cards.begin(),cards.end(), myrandom);
Now it's working and always gives me different results on re-runs, but I don't know why it works this way. How do these functions work, what did I do here?
This answer required some investigation: I looked at the C++ Standard Library headers in VC++ and at the C++ standard itself. I knew what the standard said, but I was curious how VC++ (including C++/CLI) implemented it.
First, what does the standard say about std::random_shuffle? We can find that here. In particular it says:
Reorders the elements in the given range [first, last) such that each possible permutation of those elements has equal probability of appearance.
1) The random number generator is implementation-defined, but the function std::rand is often used.
The key part is that the random number generator is implementation-defined. The standard says that the RNG can be implementation specific (so results across different compilers will vary). The standard suggests that std::rand is often used, but this isn't a requirement. So if an implementation doesn't use std::rand, then it follows that it likely won't use std::srand for a starting seed. An interesting footnote is that the std::random_shuffle functions are deprecated as of C++14. However, std::shuffle remains. My guess is that since std::shuffle requires you to provide a function object, you are explicitly defining the behavior you want when generating random numbers, and that is an advantage over the older std::random_shuffle.
I took my VS2013 and looked at the C++ standard library headers and discovered that <algorithm> uses a template class that uses a completely different pseudo-RNG (PRNG) than std::rand, with an index (seed) set to zero. Although this may vary in detail between different versions of VC++ (including C++/CLI), I think it is probable that most versions of VC++/CLI do something similar. This would explain why each time you run your application you get the same shuffled decks.
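The effect of a fixed seed is easy to reproduce outside of C++. The sketch below uses Python's random module only as a stand-in for the library's internal PRNG, and shuffled_deck is a made-up helper: a generator that always starts from the same seed always produces the same permutation, which is the behaviour observed with the two-argument random_shuffle.

```python
import random


def shuffled_deck(seed):
    """Shuffle card indices 0..51 with a PRNG started from `seed`."""
    deck = list(range(52))
    random.Random(seed).shuffle(deck)
    return deck


# Same seed, same order on every run: the symptom from the question.
print(shuffled_deck(0) == shuffled_deck(0))
# A different seed gives a different order.
print(shuffled_deck(0) == shuffled_deck(1))
```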
The option I would opt for if I am looking for a Pseudo RNG and I'm not doing cryptography is to use something well established like Mersenne Twister:
Advantages: The commonly-used version of Mersenne Twister, MT19937, which produces a sequence of 32-bit integers, has the following desirable properties:
It has a very long period of 2^19937 − 1. While a long period is not a guarantee of quality in a random number generator, short periods (such as the 2^32 common in many older software packages) can be problematic.
It is k-distributed to 32-bit accuracy for every 1 ≤ k ≤ 623 (see definition below).
It passes numerous tests for statistical randomness, including the Diehard tests.
Luckily for us, the C++11 Standard Library (which I believe should work on VS2010 and later C++/CLI) includes a Mersenne Twister function object that can be used with std::shuffle. Please see this C++ documentation for more details. The C++ Standard Library reference provided earlier actually contains code that does this:
std::random_device rd;
std::mt19937 g(rd());
std::shuffle(v.begin(), v.end(), g);
The thing to note is that std::random_device produces non-deterministic (non-repeatable) unsigned integers. We need non-deterministic data to seed our Mersenne Twister (std::mt19937) PRNG. This is similar in concept to seeding rand with srand(time(NULL)) (the latter not being an especially good source of randomness).
This looks all well and good but has one disadvantage when dealing with card shuffling. An unsigned integer on the Windows platform is 4 bytes (32 bits) and can store 2^32 values. This means there are only 4,294,967,296 possible starting points (seeds) therefore only that many ways to shuffle the deck. The problem is that there are 52! (52 factorial) ways to shuffle a standard 52 card deck. That happens to be 80658175170943878571660636856403766975289505440883277824000000000000 ways, which is far bigger than the number of unique ways we can get from setting a 32-bit seed.
Thankfully, Mersenne Twister can accept seeds between 0 and 2^19937-1. 52! is a big number, but every permutation can be represented with a seed of 226 bits (or ~29 bytes). The Standard Library allows std::mt19937 to accept a seed of up to 2^19937-1 (624 32-bit words, roughly 2.5 KB of data) if we so choose. But since we need only 226 bits, the following code creates 29 bytes of non-deterministic data to be used as a suitable seed for std::mt19937:
#include <algorithm>   // std::generate_n
#include <array>
#include <functional>  // std::ref
#include <random>

// seed_data holds 29 bytes of seed data, which covers the 226 bits we need
std::array<unsigned char, 29> seed_data;
std::random_device rd;
std::generate_n(seed_data.data(), seed_data.size(), std::ref(rd));
std::seed_seq seq(std::begin(seed_data), std::end(seed_data));
// Seed the Mersenne Twister using the 29-byte sequence
std::mt19937 g(seq);
Then all you need to do is call shuffle with code like:
std::shuffle(cards.begin(),cards.end(), g);
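The figures quoted above (226 bits, 29 bytes, and the gap to a 32-bit seed) can be checked with a few lines of integer arithmetic. This is only a sanity check in Python, not part of the C++ solution.

```python
import math

ways = math.factorial(52)             # number of orderings of a 52-card deck
bits_needed = ways.bit_length()       # bits required to index every ordering
bytes_needed = (bits_needed + 7) // 8  # round up to whole bytes

print(bits_needed, bytes_needed)
# How many orderings there are per possible 32-bit seed value.
print(ways // 2**32)
```

Since a 32-bit seed can select at most 2^32 starting states, the vast majority of orderings can never be produced from it, which is the motivation for the longer seed sequence.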
On Windows VC++/CLI you will get a warning that you'll want to suppress with the code above. So at the top of the file (before other includes) you can add this:
#define _SCL_SECURE_NO_WARNINGS 1

How exactly does a PC/Mac generate a random 0 or 1?

This question is NOT about how to use any language to generate a random number between any interval. It is about generating either 0 or 1.
I understand that many random generator algorithms build on a very basic random (0 or 1) function, take a seed from the user, and use an algorithm to generate various random numbers as needed.
The question is: how does the CPU generate either 0 or 1? If I throw a coin, I can generate heads or tails. That's because I physically throw a coin and let nature decide. But how does the CPU do it? There must be an action that the CPU performs (like throwing a coin) to get either 0 or 1 randomly, right?
Could anyone tell me about it?
Thanks
(This has several facets and thus several algorithms. Keep in mind that there are many different forms of randomness used for different purposes, but I understand your question in the way that you are interested in actual randomness used for cryptography.)
The fundamental problem here is that computers are (mostly) deterministic machines. Given the same input in the same state they always yield the same result. However, there are a few ways of actually gathering entropy:
User input. Since users bring outside input into the system you can take that to derive some bits from that. Similar to how you could use radioactive decay or line noise.
Network activity. Again, an outside source of stuff.
Generally interrupts (which kinda include the first two).
As alluded to in the first item, noise from peripherals, such as audio input or a webcam can be used.
There is dedicated hardware that can generate a few hundred MiB of randomness per second. Usually they give you random numbers directly instead of their internal entropy, though.
How exactly you derive bits from that is up to you but you could use time between events, or actual content from the events, etc. – generally eliminating bias from entropy sources isn't easy or trivial and a lot of thought and algorithmic work goes into that (in the case of the aforementioned special hardware this is all done in hardware and the code using it doesn't need to care about it).
Once you have a pool of actually random bits you can just use them as random numbers (/dev/random on Linux does that). But this has downsides, since there is usually little actual entropy and possibly a higher demand for random numbers. So you can invent algorithms to “stretch” that initial randomness in a manner that makes it still impossible or at least very difficult to predict anything about following numbers (/dev/urandom on Linux or both /dev/random and /dev/urandom on FreeBSD do that). Fortuna and Yarrow are so-called cryptographically secure pseudo-random number generators and designed with that in mind. You still have a very good guarantee about the quality of random numbers you generate, but have many more before your entropy pool runs out.
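From a program's point of view, all of this machinery sits behind an operating-system interface. As a minimal sketch, Python's os.urandom asks the OS for bytes from its cryptographically secure generator (on Linux this is, as far as I know, the same source that backs /dev/urandom), and a single random bit is just one bit of one such byte. The helper name random_bit is made up for the example.

```python
import os


def random_bit():
    """Return 0 or 1 taken from the operating system's secure random source."""
    byte = os.urandom(1)[0]  # one byte, as an integer in 0..255
    return byte & 1          # keep only the lowest bit


print([random_bit() for _ in range(16)])
```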
In any case, the CPU itself cannot give you a random 0 or 1. There's a lot more involved and this usually includes the complete computer system or special hardware built for that purpose.
There is also a second class of computational randomness: Plain vanilla pseudo-random number generators (PRNGs). What I said earlier about determinism – this is the embodiment of it. Given the same so-called seed a PRNG will yield the exact same sequence of numbers every time¹. While this sounds idiotic it has practical benefits.
Suppose you run a simulation involving lots of random numbers, maybe to simulate interaction between molecules or atoms that involve certain probabilities and unpredictable behaviour. In science you want results anyone can independently verify, given the same setup and procedure (or, with computing, the same algorithms). If you used actual randomness the only option you have would be to save every single random number used to make sure others can replicate the results independently.
But with a PRNG all you need to save is the seed and remember what algorithm you used. Others can then get the exact same sequence of pseudo-random numbers independently. Very nice property to have :-)
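A small example of that property, as a sketch: a Monte Carlo estimate of pi whose whole run is determined by the seed. The function estimate_pi and the seed value are made up for the illustration; Python's random module is used as the PRNG.

```python
import random


def estimate_pi(seed, samples):
    """Estimate pi by sampling points in the unit square with a seeded PRNG."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:  # point falls inside the quarter circle
            inside += 1
    return 4.0 * inside / samples


# Publishing the seed (7) and the algorithm is enough to reproduce the run.
print(estimate_pi(7, 10_000))
print(estimate_pi(7, 10_000) == estimate_pi(7, 10_000))
```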
Footnotes
¹ This even includes the CSPRNGs mentioned above, but they are designed to be used in a special way that includes regular re-seeding with entropy to overcome that problem.
A CPU can only generate a uniform random number, U(0,1), which happens to range from 0 to 1. So mathematically, it would be defined as a random variable U in the range [0,1]. Examples of random draws of a U(0,1) random number in the range 0 to 1 would be 0.28100002, 0.34522, 0.7921, etc. The probability of any value between 0 and 1 is equal, i.e., they are equiprobable.
You can generate binary random variates that are either 0 or 1 by setting a random draw of U(0,1) to a 0 if U(0,1)<=0.5 and 1 if U(0,1)>0.5, since in theory there will be an equal number of random draws of U(0,1) below 0.5 and above 0.5.
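The thresholding rule described above can be written down directly. This is a minimal sketch; uniform_to_bit is a made-up name, and Python's random.random() (which returns values in [0, 1)) stands in for the U(0,1) draw.

```python
import random


def uniform_to_bit(u):
    """Map a U(0,1) draw to a bit: 0 if u <= 0.5, otherwise 1."""
    return 0 if u <= 0.5 else 1


rng = random.Random(1)
bits = [uniform_to_bit(rng.random()) for _ in range(10_000)]
# The two outcomes should occur roughly equally often.
print(bits.count(0), bits.count(1))
```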

Monte carlo on GPU

Today I talked with a friend of mine who told me he is trying to run some Monte Carlo simulations on a GPU. What was interesting is that he wanted to draw numbers randomly on different processors and assumed that they were uncorrelated. But they were not.
The question is whether there exists a method to draw independent sets of numbers on several GPUs. He thought that taking a different seed for each of them would solve the problem, but it does not.
If any clarifications are needed, please let me know and I will ask him to provide more details.
To generate completely independent random numbers, you need to use a parallel random number generator. Essentially, you choose a single seed and it generates M independent random number streams. So on each of the M GPUs you could then generate random numbers from independent streams.
When dealing with multiple GPUs you need to be aware that you want:
independent streams within GPUs (if RNs are generated by each GPU)
independent streams between GPUs.
It turns out that generating random numbers on each GPU core is tricky (see this question I asked a while back). In my experience playing about with GPUs and RNs, you only get a speed-up from generating random numbers on the GPU if you generate a large quantity at once.
Instead, I would generate random numbers on the CPU, since:
It's easier and sometimes quicker to generate them on the CPU and transfer across.
You can use well tested parallel random number generators
The range of off-the-shelf random number generators available for GPUs is very limited.
Current GPU random number libraries only generate RNs from a small number of distributions.
To answer your question in the comments: What do random numbers depend on?
A very basic random number generator is the linear congruential generator. Although this generator has been surpassed by newer methods, it should give you an idea of how they work. Basically, the ith random number depends on the (i-1) random number. As you point out, if you run two streams long enough, they will overlap. The big problem is, you don't know when they will overlap.
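A tiny linear congruential generator makes both points concrete. The sketch below is illustrative only: the multiplier and increment are one commonly quoted 32-bit parameter set, chosen here as an assumption rather than taken from the question. Each output is computed from the previous one, so a stream seeded with a value that another stream has already produced simply replays that stream from there on.

```python
from itertools import islice

M = 2**32        # modulus
A = 1664525      # multiplier (assumed parameter set, for illustration)
C = 1013904223   # increment


def lcg(seed):
    """Yield successive LCG outputs: x_i = (A * x_(i-1) + C) mod M."""
    x = seed % M
    while True:
        x = (A * x + C) % M
        yield x


first = list(islice(lcg(1), 10))
# Seed a second stream with the fifth output of the first stream.
second = list(islice(lcg(first[4]), 5))
# The second stream now overlaps the first one exactly.
print(second == first[5:10])
```

With arbitrary seeds the same overlap still occurs somewhere along the cycle; the difficulty is that nothing tells you how far apart the two starting points are.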
For generating iid uniform variables, you just have to initialize your generators with different seeds. With CUDA, you may use the NVIDIA cuRAND library, which implements the Mersenne Twister generator.
For example, the following code, executed by 100 kernels in parallel, will draw 10 samples of an (R^10)-uniform:
#include <curand_kernel.h>

__global__ void setup_kernel(curandState *state, int pseed)
{
    int id = blockIdx.x * blockDim.x + threadIdx.x;
    /* 10 different seeds for uncorrelated random variables,
       a different sequence number per thread, no offset */
    int seed = id % 10 + pseed;
    curand_init(seed, id, 0, &state[id]);
}
If you take any "good" generator (e.g. Mersenne Twister etc.), two sequences with different random seeds will be uncorrelated, be it on GPU or CPU. Hence I'm not sure what you mean by saying that taking different seeds on different GPUs was not enough. Would you elaborate?

Does Kernel::srand have a maximum input value?

I'm trying to seed a random number generator with the output of a hash. Currently I'm computing a SHA-1 hash, converting it to a giant integer, and feeding it to srand to initialize the RNG. This is so that I can get a predictable set of random numbers for a set of infinite cartesian coordinates (I'm hashing the coordinates).
I'm wondering whether Kernel::srand actually has a maximum value that it'll take, after which it truncates it in some way. The docs don't really make this obvious - they just say "a number".
I'll try to figure it out myself, but I'm assuming somebody out there has run into this already.
Knowing what programmers are like, it probably just calls libc's srand(). Either way, it's probably limited to 2^32-1, 2^31-1, 2^16-1, or 2^15-1.
There's also a danger that the value is clipped when cast from a biginteger to a C int/long, instead of only taking the low-order bits.
An easy test is to seed with 1 and take the first output. Then, seed with 2^i+1 for i in [1..64] or so, take the first output of each, and compare. If you get a match for some i=n and all greater i, then it's probably doing arithmetic modulo 2^n.
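The test can be sketched as follows. The sketch runs against Python's random module, not Ruby's Kernel::srand, and truncating_seed is a hypothetical seeding routine that keeps only the low 16 bits, included purely to show what a positive result looks like.

```python
import random


def first_output(seed_fn, seed):
    """Seed a fresh generator via seed_fn and return its first output."""
    rng = random.Random()
    seed_fn(rng, seed)
    return rng.random()


def matching_exponents(seed_fn, max_exp=64):
    """Return every i for which seeding with 2**i + 1 behaves like seeding with 1."""
    baseline = first_output(seed_fn, 1)
    return [i for i in range(1, max_exp + 1)
            if first_output(seed_fn, 2**i + 1) == baseline]


def full_seed(rng, seed):
    rng.seed(seed)  # the library's own seeding


def truncating_seed(rng, seed):
    rng.seed(seed & 0xFFFF)  # hypothetical: silently keeps the low 16 bits


print(matching_exponents(full_seed))
print(matching_exponents(truncating_seed))
```

For the truncating routine the matches start at i=16 and continue for every larger i, which is the signature of arithmetic modulo 2^16. The same loop translated to Ruby would probe Kernel::srand directly.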
Note that the random number generator is almost certainly limited to 32 or 48 bits of entropy anyway, so there's little point seeding it with a huge value, and an attacker can reasonably easily predict future outputs given past outputs (and an "attacker" could simply be a player on a public nethack server).
EDIT: So I was wrong.
According to the docs for Kernel::rand(),
Ruby currently uses a modified Mersenne Twister with a period of 2**19937-1.
This means it's not just a call to libc's rand(). The Mersenne Twister is statistically superior (but not cryptographically secure). But anyway.
Testing using Kernel::srand(0); Kernel::sprintf("%x",Kernel::rand(2**32)) for various output sizes (2**16, 2**32, 2**36, 2**60, 2**64, 2**32+1, 2**35, 2**34+1), a few things are evident:
It figures out how many bits it needs (number of bits in max-1).
It generates output in groups of 32 bits, most-significant-bits-first, and drops the top bits (i.e. 0x[r0][r1][r2][r3][r4] with the top bits masked off).
If it's not less than max, it does some sort of retry. It's not obvious what this is from the output.
If it is less than max, it outputs the result.
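Taken together, the steps above look like rejection sampling. The sketch below is an assumption about the general shape, not Ruby's actual implementation: the answer only observes "some sort of retry", and here the retry is simply drawing a fresh candidate. rand_below is a made-up name, and Python's getrandbits stands in for assembling 32-bit groups and masking off the top bits.

```python
import random


def rand_below(limit, rng):
    """Return an integer in [0, limit) by drawing just enough bits and retrying."""
    if limit < 1:
        raise ValueError("limit must be positive")
    bits = (limit - 1).bit_length()  # number of bits in max-1
    if bits == 0:
        return 0                     # limit == 1: only one possible value
    while True:
        candidate = rng.getrandbits(bits)  # top bits beyond `bits` never appear
        if candidate < limit:              # accept, otherwise draw again
            return candidate


rng = random.Random(0)
print([rand_below(10, rng) for _ in range(10)])
print(rand_below(2**64 + 1, rng))
```

Because the candidate has at most one bit more than strictly necessary, each attempt succeeds with probability above one half, so the expected number of retries stays small.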
I'm not sure why 2**32+1 and 2**64+1 are special (they produce the same output from Kernel::rand(2**1024), so they probably have the exact same state); I haven't found another collision.
The good news is that it doesn't simply clip to some arbitrary maximum (i.e. passing in huge numbers isn't equivalent to passing in 2**31-1), which is the most obvious thing that can go wrong. Kernel::srand() also returns the previous seed, which appears to be 128-bit, so it seems likely to be safe to pass in something large.
EDIT 2: Of course, there's no guarantee that the output will be reproducible between different Ruby versions (the docs merely say what it "currently uses"; apparently this was initially committed in 2002). Java has several portable deterministic PRNGs (SecureRandom.getInstance("SHA1PRNG","SUN"), albeit slow); I'm not aware of something similar for Ruby.
