Better Random Number Testing Suites

Better Random Number Testing Suites - random

I wanted to generate multiple random number sets and test them using conventional random number testing suites. I came across the NIST suite of testing, Diehard suite and a few other commonly mentioned ones.
These seemed to be pretty old. Are there any newer test suites that are deemed better for the modern generation schemes or if not which of these are the better choices for testing pseudo random number sets of relatively small numbers.
Thanks.

I depends a bit on what you want the random numbers for. Assuming you are doing simulations, Monte-Carlo methods, or most other non-security uses, testu01 as mentioned in another answer is very good. Also practrand http://pracrand.sourceforge.net/
and gjrand http://gjrand.sourceforge.net/
Lots of caveats. None is really easy to use out of the box. It would help to have some C or C++ experience. None can prove that the prng is good. They can find certain classes of faults if they exist (and lots of popular prngs have faults that at least one of these three suites can find).

TestU01 from prof P.L'Ecuyer seems to be pretty good choice, and it is not that old as you imply http://simul.iro.umontreal.ca/testu01/tu01.html

Related

Most suitable pseudo random number generators for Metropolis–Hastings MCMC

I am doing a lot of Metropolis-Hastings Markov chain Monte Carlo (MCMC).
Most codes I have in use, use Mersenne Twister (MT) as pseudo random number generator (PRNG).
However, I recently read, that MT is outdated and probably shouldn't be used anymore as it fails some tests and is relatively slow. So I am willing to switch.
Numpy now defaults to PCG (https://www.pcg-random.org/), which claims to be good. Other sites are rather critical. E.g. http://pcg.di.unimi.it/pcg.php.
It seems everyone praises its own work.
There is some good information already here: Pseudo-random number generator
But many answers are already a bit dated and I want to formulate my question a bit more specific.
As I said: the main use case is Metropolis-Hastings MCMC.
Therefore, I need:
uniformly distributed numbers in half-open and open intervals
around 2^50 samples, apparently per rule of thumb the PRNG should have a period of at least 2^128
sufficient quality of random numbers (whatever this might mean)
a reasonable fast PRNG (for a fixed runtime faster code means more accuracy for MCMC)
I do not need
cryptographically security
As I am by no means an expert, of course usability counts also. So I would welcome an available C++ implementation (this seems to be standard), which is sufficiently easy to use for the novice.

Comparing algorithmic performance to old methods

I have written a new algorithm for something. Now I need to compare it with existing methods, some of which are old about 10 years.
The idea I had is to look at benchmarks of different processors over the years in order to establish how much faster my processor (i7-920) is than average processor from 2003. Then I would simply divide old methods' execution time by the speedup factor and use those numbers to compare with my own algorithm.
Has something like this been done? So I don't redo the existing work.
Can such a comparison be done some other way?
Are there some scientific papers written about such comparisons which I can reference?

I don't know which of these are possible for you, but here's a list of options I can think of:
Run their implementation side-by-side on your machine against yours.
This is the best option.
Rewrite their implementations and do (1).
You preferably need to compare it against their test to ensure you get vaguely similar results.
Find a library that implements their algorithm (or multiple libraries) and do (1).
I suggest multiple libraries, if possible, since a single one may not have implemented the algorithm efficiently. You may also want to compare these against their test.
Compare the algorithms mathematically.
This may be difficult, but it's not impossible.
Do what you presented.
(a) I would not recommend this as there are other determining factors in your computer other than the processor speed that affect the speed of an algorithm. Getting an equation that perfectly balances these will likely be very difficult.
(b) There is a massive difference between top and bottom of the line computers, so using the average is not a particularly good idea. If the author didn't provide details regarding this, I'm afraid your benchmark is not likely to be too accurate.
Go out and buy a machine of similar specs to the one used by the desired test to benchmark on.
A 10-year-old machine should be pretty cheap, if you can find one. Also, see (5.b).
Contact the author to allow for any of the other options.
Papers often provide contact details of the authors, or you should be able to find them elsewhere if they have any sort of online presence and you're half-decent at using Google.

If I were reviewing your results, I would be annoyed if you attempted to demonstrate less than an order of magnitude speedup this way. There are a lot of variables determining algorithm performance, and I would be skeptical that a generic benchmark could capture the right ones. My gold standard is old and new algorithms implemented by the same programmer, with similar effort made to optimize, running on the same hardware. Using the previous authors' implementation instead of making a new one is commonplace in the experimental algorithms literature, but using different hardware isn't.

Algorithmic performance is usually measured in big-O terms, for which it is better to count basic operations, like comparisons, and do it for a range of input sizes.
If you must measure overall time, at least eliminate other sources of difference.
As #larsmans said, do it on the same processor.
Also, if there is existing work, there's no harm in repeating it.
Generally, in science, that's a good thing.

You should attempt to reduce the amount of differing factors between the two runs. I think just run-timing the two algorithms side by side on the same machine and/or comparing their Big O times are both equally valid and important. You should also attempt to use updated libraries and other external functions; using outdated ones my also be the cause of timing results.

How to test short period PRNGs?

So in short my question would be this:
How one should test a short period PRNG?
Emphasis being on short period, like from 2^8 up to at most 2^32. These small generators may have more compact and faster code, small state, might making them suitable for certain procedural generation tasks. The importance is that their generated sequence should be "reasonably random". In this case it may even be viable to test every seed one plans to use, so technically just the output stream itself may be sufficient.
I am aware of other similar questions like this, but they rather target long period, probably even cryptographically secure PRNGs.
So far I see that for example the Overlapping permutations test from the Diehard harness may work, but others seem to rather target long periods.
I tried to do X:Y plots to see correlations such as here, this seems to work and visually provide a good feedback, but I would like to see more if possible (if needed?).

Random number generation / which algorithm?

We need to migrate to a better RNG or RBG for some key value generation which will be further used for encryption of the data.
Which will be the most suitable algorithm? Shall I consider NIST doc for this?

Any pseudo random number generator that produces a Gaussian distribution and that has a wide output (say at least 32 bits) should be enough for creating keys. It's up to you to determine your needs and then find a matching RNG.
For more info, see http://www.random.org/randomness.
Depending on the language you choose to implement this, I'm sure you can find source code for pseudo-RNG on the Web, if the one built-in into your system isn't good enough.

As we are a programming site, I would seriously look at the secure random number generators at your disposal in your particular runtime environment. In general you will have to rely on system resources to generate randoms, at least to seed the pseudo random number generator. The only possible exception are CPU specific random instructions, such as the ones used on the latest Intel CPU's (hopefully well-tested secure RNGs will become a main feature of CPU's).
Within many programming environments there is very little choice but to use OpenSSL or /dev/random for seeding. In general it is hard to find useful information about the random number generator. Sometimes the RNG is really not suitable at all (e.g. the native PHP version).
If possible, try to find something that conforms to NIST requirements.

Computationally intensive algorithms

I am planning to write a bunch of programs on computationally intensive algorithms. The programs would serve as an indicator of different compiler/hardware performance.
I would want to pick up some common set of algorithms which are used in different fields, like Bioinformatics, Gaming, Image Processing, et al. The reason I want to do this would be to learn the algorithms and have a personal mini benchmark suit that would be small | useful | easy to maintain.
Any advice on algorithm selection would be tremendously helpful.

Benchmarks are not worthy of your attention!
The very best guide is to processor performance is: http://www.agner.org/optimize/
And then someone will chuck it in a box with 3GB of RAM defeating dual-channel hopes and your beautifully tuned benchmark will give widely different results again.
If you have a piece of performance critical code and you are sure you've picked the winning algorithm you can't then go use a generic benchmark to determine the best compiler. You have to actually compile your specific piece of code with each compiler and benchmark them with that. And the results you gain, whilst useful for you, will not extrapolate to others.
Case in point: people who make compression software - like zip and 7zip and the high-end stuff like PPMs and context-mixing and things - are very careful about performance and benchmark their programs. They hang out on www.encode.ru
And the situation is this: for engineers developing the same basic algorithm - say LZ or entropy coding like arithmetic-coding and huffman - the engineers all find very different compilers are better.
That is to say, two engineers solving the same problem with the same high-level algorithm will each benchmark their implementation and have results recommending different compilers...
(I have seen the same thing repeat itself repeatedly in competition programming e.g. Al Zimmermann's Programming Contests which is an equally performance-attentive community.)
(The newer GCC 4.x series is very good all round, but that's just my data-point, others still favour ICC)
(Platform benchmarks for IO-related tasks is altogether another thing; people don't appreciate how differently Linux, Windows and FreeBSD (and the rest) perform when under stress. And benchmarks there - on the same workload, same machine, different machines or different core counts - would be very generally informative. There aren't enough benchmarks like that about sadly.)

There was some work done at Berkeley about this a few years ago. The identified 13 common application paterns for parallel programming, the "13 Dwarves". The include things like linear algebra, n-body models, FFTs etc
http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.html
See page 10 onwards.
There are some sample .NET implementations here:
http://paralleldwarfs.codeplex.com/

The typical one is Fast Fourier Transform, perhaps you can also do something like the Lucas–Lehmer primality test.

I remember a guy who tested computational performance of machines, compiler versions by inverting Hilbert matrices.
For image processing, median filtering (used for removing noise, bad pixels) is always too slow. It might make a good test, given a large enough image say 1000x1000.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio