Why does Erlang's random number generation seeding require 3 integers?

Erlang's rand:seed/2 uses 3 integers to seed the RNG.
Most RNG implementations require only a single integer as a seed, so why does Erlang's use 3 specifically?

Erlang uses a variant of the Wichmann-Hill algorithm as its PRNG. This algorithm dates to 1982, when 16-bit processors were common. To achieve a reasonably long cycle (for the time), it pools the results of three different linear congruential generators (LCGs), each of which had a cycle length < 2^15. The pooled result has a combined cycle length of
Each of the LCGs has its own integer state, so three separate seed values are required.
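A minimal Python sketch of why three seeds are needed, using the constants from the published 1982 Wichmann-Hill generator (AS 183). That Erlang's variant uses exactly these constants is an assumption here, and the names wh_next and combined_period are made up for illustration.

```python
from math import lcm  # multi-argument lcm needs Python 3.9+

# Constants of the 1982 Wichmann-Hill generator (AS 183); assumed, not
# taken from the Erlang source.
MULTIPLIERS = (171, 172, 170)
MODULI = (30269, 30307, 30323)


def wh_next(state):
    """Advance all three LCGs; return (new_state, float in [0, 1))."""
    new_state = tuple(a * s % m for a, s, m in zip(MULTIPLIERS, state, MODULI))
    # Pool the three results: sum the three fractions and keep the fractional part.
    value = sum(s / m for s, m in zip(new_state, MODULI)) % 1.0
    return new_state, value


# Each LCG cycles after (modulus - 1) steps; the pooled cycle is their LCM.
combined_period = lcm(*(m - 1 for m in MODULI))

state = (1, 2, 3)  # one seed per LCG (arbitrary illustrative values)
state, x = wh_next(state)
print(state, x, combined_period)
```

Each component keeps its own integer state below 2^15, which is why the seed is a triple rather than a single number.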

Related

Do most pseudorandom number generators produce a cyclic sequence of unique numbers?

Saw the formula for a pseudorandom number generator in BASIC ages ago and it used each pseudorandom number as the seed for the next one. So if it hit the same number after a while, it would cycle the same sequence all over again and therefore the numbers in the sequence were all different.
Would the sequence include the complete set of numbers from 0 to 2^16-1 for the 16 bit version of this generator, all appearing once?
Is this what happens in most pseudorandom number generators in most languages even today?
All pseudorandom number generators (PRNGs) have cycles; they all rely on a finite state, which can be expressed as a big number with a fixed upper bound. Because only finitely many states exist, the generator must eventually revisit a state, and from then on the sequence repeats. Some PRNGs have only one cycle, others have several, and still others devolve to a cycle. "Random Invertible Mappings" has diagrams.
For example, the Mersenne Twister PRNG has a state of 19968 bits (and so has a state that can express any number less than 2^19968), so it will have a cycle no bigger than 2^19968 (and in fact it's less, namely 2^19937 - 1).
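To make the 16-bit case from the question concrete, here is a small Python sketch. The BASIC formula is not given in the question, so this assumes a linear congruential generator with illustrative constants (a = 25173, c = 13849) chosen to satisfy the full-period conditions for a power-of-two modulus; lcg16 and orbit are hypothetical names.

```python
M = 1 << 16  # 16-bit state, so at most 65536 distinct states


def lcg16(x, a=25173, c=13849):
    """One step of a 16-bit LCG (illustrative constants: a % 4 == 1, c odd)."""
    return (a * x + c) % M


def orbit(seed):
    """Collect states starting at seed until the generator returns to it."""
    states = []
    x = seed
    while True:
        states.append(x)
        x = lcg16(x)
        if x == seed:
            return states


values = orbit(0)
print(len(values), len(set(values)))
```

With these constants, the generator should visit every value from 0 to 2^16-1 exactly once before repeating. That only holds because the output here is the whole state; a generator that outputs just part of a larger state can repeat individual numbers within one cycle.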

Combining PRNG and 'true' random, fast and (perhaps) dumb way

Take a fast PRNG like xoroshiro or xorshift and a 'true' entropy-based generator like /dev/random.
Seed the PRNG with 'true' random data, but also get a single number from the 'true' random source and XOR it with every result from the PRNG to produce the final output.
Then replace this number once in a while (e.g. after 10000 random numbers are generated).
Perhaps this is naive, but I would hope this should improve some aspects of the PRNG, like period size, with negligible impact on speed. What am I getting wrong?
What I am concerned about here is generating UUIDs (fast), which are basically 128-bit numbers that should be "really unique". My concern is that, with a modern PRNG like the xorshift family with periods close to 'just' 2^128, the chance of a collision from an entropy-seeded PRNG is not as negligible as it would be with truly random numbers.
The improvements are only minor compared to the plain PRNG. For example the single true random number used for masking the result can be eliminated by taking the XOR of successive results. This will be the same value as the XOR of successive plain PRNG numbers. So if you can predict the PRNG, it is not too hard to do the same to the improved sequence.
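The cancellation is easy to check. This is a sketch in Python that uses a 32-bit xorshift (shifts 13, 17, 5) as a stand-in for the fast PRNG and a fixed constant as a stand-in for the number read from /dev/random; both choices are assumptions for illustration.

```python
MASK32 = 0xFFFFFFFF


def xorshift32(x):
    """One step of a 32-bit xorshift generator (shifts 13, 17, 5)."""
    x ^= (x << 13) & MASK32
    x ^= x >> 17
    x ^= (x << 5) & MASK32
    return x


def stream(seed, n):
    """Return the first n outputs for a nonzero seed."""
    out = []
    x = seed
    for _ in range(n):
        x = xorshift32(x)
        out.append(x)
    return out


plain = stream(2463534242, 8)
secret_mask = 0xDEADBEEF  # stands in for one 'true' random number
masked = [v ^ secret_mask for v in plain]

# XOR of successive outputs: the mask appears twice and cancels out.
plain_diffs = [a ^ b for a, b in zip(plain, plain[1:])]
masked_diffs = [a ^ b for a, b in zip(masked, masked[1:])]
print(plain_diffs == masked_diffs)
```

Since (a ^ m) ^ (b ^ m) equals a ^ b for any mask m, the masked stream exposes the same successive differences as the plain one.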

Random number from many other random numbers, is it more random?

We want to generate a uniform random number from the interval [0, 1].
Let's first generate k random booleans (for example by rand()<0.5) and decide according to these on what subinterval [m*2^{-k}, (m+1)*2^{-k}] the number will fall. Then we use one rand() to get the final output as m*2^{-k} + rand()*2^{-k}.
Let's assume we have arbitrary precision.
Will a random number generated this way be 'more random' than the usual rand()?
PS. I guess the subinterval picking amounts to just choosing the binary representation of the output 0. b_1 b_2 b_3... one digit b_i at a time and the final step is adding the representation of rand() to the end of the output.
It depends on the definition of "more random". If you use more random generators, you have more random state, which means the cycle length will be greater. But cycle length is just one property of random generators. A cycle length of 2^64 is usually OK for almost any purpose (the only exception I know of is when you need a lot of different, long sequences, as in some kinds of simulation).
However, if you combine two bad random generators, they don't necessarily become better; you have to analyze the combination. But there are generators that do work this way. KISS is an example: it combines 3 not-too-good generators, and the result is a good generator.
For card shuffling, you'll need a cryptographic RNG. Even a very good but non-cryptographic RNG is inadequate for this purpose. For example, Mersenne Twister, which is a good RNG, is not suitable for secure card shuffling! This is because, by observing enough output numbers, it is possible to figure out its internal state, so the shuffle result can be predicted.
This can help, but only if you use a different pseudorandom generator for the first and last bits. (It doesn't have to be a different pseudorandom algorithm, just a different seed.)
If you use the same generator, then you will still only be able to construct 2^n different shuffles, where n is the number of bits in the random generator's state.
If you have two generators, each with n bits of state, then you can produce up to a total of 2^(2n) different shuffles.
Tinkering with a random number generator, as you are doing by using only one bit of random space and then calling it iteratively, usually weakens its random properties. All RNGs fail some statistical tests for randomness, but you are more likely to find that a noticeable cycle crops up if you start making many calls and combining them.

Pseudorandom hash of two integers

I need an NxN matrix of 16bit or 32bit pseudorandom numbers, uniformly distributed over the whole range of values. N is unfortunately very large (at least 1e6), so it cannot be pregenerated (that would take about a TB of memory). The only viable option I can think of is using a hash of my indices i and j as matrix elements.
There are plenty of integer hash functions available, but I am not sure which ones are suitable for two reasons.
- Only 32bit unsigned integer operations are available. Since N is at least 2^20, I cannot naively concatenate the two indices into one 32bit key without creating unnecessary collisions.
- Pseudorandomness is important here; I am not building a hashtable. Most integer hashes I found are designed for hashtables and don't have very strong requirements.
A possible solution would be taking a cryptographic hash like SHA-2, but performance is important and that is just too expensive.
A suggestion on how to combine two 32bit uints into one while avoiding collision patterns would already help a great deal, since I could then pick from the whole range of 32bit to 32bit hashes.
Some insight on which 32bit to 32bit hashes have good randomness would also be much appreciated.
Pregenerating 1 or 2 Arrays of N random numbers is no problem if it helps.
In case it matters, the targets are GPUs; I am writing in recent versions of GLSL.
What about using an LCG? It is a well-known fact that in the form
x_{n+1} = (a*x_n + c) mod 2^32, where a mod 4 is 1 and c is odd, the resulting congruential sequence will have period 2^32.
Numerical Recipes uses a=1664525, c=1013904223, but there are tons of others.
Form a unique x from i and j, and compute x_{n+1}.
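A rough Python sketch of the LCG step with the Numerical Recipes constants quoted above. The name lcg32 is hypothetical, and how x is formed from i and j is left open, as in the answer.

```python
MASK32 = 0xFFFFFFFF
A, C = 1664525, 1013904223  # constants quoted from Numerical Recipes


def lcg32(x):
    """One LCG step modulo 2^32, using only 32-bit unsigned arithmetic."""
    return (A * x + C) & MASK32


# Because A is odd, the step is a bijection on 32-bit values:
# distinct inputs always give distinct outputs.
outputs = [lcg32(x) for x in range(4)]
low_bits = [v & 1 for v in outputs]
print(outputs, low_bits)
```

One caveat worth checking before relying on this as a hash: with a power-of-two modulus the low bits are weak. For consecutive x the lowest output bit simply alternates, so taking the high bits or applying further mixing may be necessary.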
I found a suitable algorithm. Block ciphers in counter mode are obviously suitable. I initially rejected the idea because of the performance implications of most block ciphers. However, I found a paper that introduces a related algorithm (basically a block cipher with fewer rounds) called Philox (Parallel Random Numbers: As Easy as 1, 2, 3 by Salmon et al.).
Link: http://www.thesalmons.org/john/random123/papers/random123sc11.pdf
The only problem left is how to combine the two indices into one 32bit number. But I guess XOR should be good enough if combined with a rotation to avoid commutativity.
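The XOR-with-rotation idea could look like the following Python sketch. The rotation amount of 16 and the names rotl32 and combine are assumptions; the result would still be fed through the actual hash (Philox or similar), which is not shown here.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x, r):
    """Rotate a 32-bit value left by r bits (0 < r < 32)."""
    return ((x << r) | (x >> (32 - r))) & MASK32


def combine(i, j, r=16):
    """Fold two indices into one 32-bit key; rotating j breaks the i/j symmetry."""
    return (i ^ rotl32(j, r)) & MASK32


# Not commutative: swapping the indices gives a different key.
print(hex(combine(1, 2)), hex(combine(2, 1)))

# Collisions remain possible once the indices need more than 16 bits each.
print(combine(0x10000, 0) == combine(0, 1))
```

With N of at least 2^20 there are 2^40 index pairs and only 2^32 keys, so some collisions are unavoidable; the rotation only removes the obvious (i, j) versus (j, i) pattern.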

Generating two different pseudorandom numbers using the same seed

Is it possible to generate two different pseudorandom numbers on two separate program runs without using time as the seed? That is, using the same seed on both runs, is it possible to get two different numbers?
In general, it is not possible to get different pseudorandom numbers using the same seed.
Pseudorandom numbers are, by definition, not truly random numbers and therefore are not composed from sources of entropy. Or, if the numbers do contain some entropy input, the input is not enough to cause the sequence to qualify as statistically "random." (An example of a property that such a sequence should have is runs of 1-bits n-bits-long with probability of 2^(-n), among many other properties of statistical randomness. The definition of statistical randomness becomes more sophisticated (in a sense, more "actual" or close to nature) as mathematics around randomness improves. This is another way of saying that, at any given time, the definitions of statistical randomness are about to become out-dated or obsolete.)
In any case, the vast majority of pseudorandom number generators are, in fact, completely deterministic.
The canonical[1] example of a pseudorandom number generator is a linear feedback shift register (LFSR). The LFSR can be implemented as a digital logic circuit containing a register which holds N bits and some number M of gates, much less than N (e.g., M=1, M=2). These are usually XOR gates, which "feed back" into the register's bits at certain "tap bits." There is a lot about this on the web.
Given the same seed input, the LFSR will always generate the same sequence.
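As a software illustration, here is a sketch of a 16-bit Fibonacci LFSR in Python. The tap positions (16, 14, 13, 11) are a commonly used textbook choice and are an assumption, not something taken from the answer; the function names are hypothetical.

```python
def lfsr16_step(state):
    """One step of a 16-bit Fibonacci LFSR with taps at bits 16, 14, 13, 11."""
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (bit << 15)


def lfsr16_sequence(seed, n):
    """Return the first n register states after a nonzero seed."""
    out = []
    state = seed
    for _ in range(n):
        state = lfsr16_step(state)
        out.append(state)
    return out


def lfsr16_period(seed):
    """Count steps until the register returns to the nonzero seed."""
    state = lfsr16_step(seed)
    steps = 1
    while state != seed:
        state = lfsr16_step(state)
        steps += 1
    return steps


run1 = lfsr16_sequence(0xACE1, 10)
run2 = lfsr16_sequence(0xACE1, 10)
print(run1 == run2, lfsr16_period(0xACE1))
```

Two runs from the same seed produce identical sequences. With these taps the register should cycle through all 65535 nonzero states before repeating.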
It is possible, using Walsh-Hadamard matrices, or otherwise called "M matrices", additionally called "sequency transform", to sample the output of an LFSR and determine that the sequence is, in fact, from an LFSR and also the structure of its gates and taps, as well as the current register content. From this information all sequence values are known, and it is possible to reverse out the possible seed values which were used as input. For these reasons, LFSRs are not suitable for security purposes such as random tokens for authentication.
[1] By canonical, I am referring to Don Knuth's use of the LFSR as an example, as well as the timeless tradition which has ensued therefrom.
Not sure if you want to generate 2 different random numbers from the same seed, or avoid it! But if you really do want that, then, similar to LFSRs, LCGs (Linear Congruential Generators) are often used to generate deterministic pseudorandom numbers. You can 'easily' create 2 simple LCGs using different constants, which will generate 2 different pseudorandom numbers for the same seed.
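A small Python sketch of that suggestion. Both constant pairs are illustrative assumptions (the first is the pair usually attributed to Numerical Recipes), and make_lcg is a hypothetical helper.

```python
MASK32 = 0xFFFFFFFF


def make_lcg(a, c):
    """Build a 32-bit LCG step function from a multiplier and an increment."""
    def step(x):
        return (a * x + c) & MASK32
    return step


lcg_a = make_lcg(1664525, 1013904223)  # illustrative constants
lcg_b = make_lcg(22695477, 1)          # a second, different pair

seed = 42
first = lcg_a(seed)
second = lcg_b(seed)
print(first, second)
```

The same seed gives two different numbers, yet each generator on its own remains fully deterministic: calling lcg_a(42) again always returns the same value.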

Resources