Effective period of Mersenne Twister generator output - random

Mersenne Twister generator has a period of (2^19937)-1, but it is period of internal states.
Any idea what is the effective period of MT 32 bit output - period over which 32 bit output does not repeat. It has to be smaller than (2^31)-1 but I couldn't find definite answer.
Thanks

I think you misunderstand what a period is.
It means that after generating a period of numbers, you'll get the exact same sequence again, the random generator repeats itself.
It is not a measure that a particular number generated again. It can (and it will) happen, that the same number will generated twice in a row. It doesn't mean that the period is 1.
So even for 32-bit output, MT's period is 2^19937-1.
For example, this 1-bit output RNG has a period of 8:
00101110 00101110 00101110 00101110 ...

Related

urandom_range(), urandom(), random() in verilog

I am confused between these three functions and I was wondering for some explanation. If I set the range how do I make the range exclusive or inclusive? Are the ranges inclusive or exclusive if I don't specify the range?
In addition to the answer from #dave_59, there are other important differences:
i) $random returns a signed 32-bit integer; $urandom and $urandom_range return unsigned 32-bit integers.
ii) The random number generator for $random is specified in IEEE Std 1800-2012. With the same seed you will get exactly the same sequence of random numbers in any SystemVerilog simulator. That is not the case for $urandom and $urandom_range, where the design of the random number generator is up to the EDA vendor.
iii) Each thread has its own random number generator for $urandom and $urandom_range, whereas there is only one random number generator for $random shared between all threads (ie only one for the entire simulation). This is really important, because having separate random number generators for each thread helps you simulation improve a property called random stability. Suppose you are using a random number generator to generate random stimulus. Suppose you find a bug and fix it. This could easily change the order in which threads (ie initial and always blocks) are executed. If that change changed the order in which random numbers were generated then you would never know whether the bug had gone away because you'd fixed it or because the stimulus has changed. If you have a random number generator for each thread then your testbench is far less vulnerable to such an effect - you can be far more sure that the bug has disappeared because you fixed it. That property is called random stability.
So, As #dave_59 says, you should only be using $urandom and $urandom_range.
You should only be using $urandom and $urandom_range. These two functions provide better quality random numbers and better seed initialization and stability than $random. The range specified by $urandom_range is always inclusive.
Although $random generates the exact same sequence of random numbers for each call, it is extremely difficult to keep the same call ordering as soon as any change is made to the design or testbench. Even more difficult when multiple threads are concurrently generating random numbers.

Is there any algorithm for random number generation whose pattern cannot be revealed?

Can it be possible to create random number whose pattern of getting the next random number never be repeated even the universe ends.
I read this security rule-of-thumb:
All processes which require non-trivial random numbers MUST attempt to
use openssl_pseudo_random_bytes(). You MAY fallback to
mcrypt_create_iv() with the source set to MCRYPT_DEV_URANDOM. You MAY
also attempt to directly read bytes from /dev/urandom. If all else
fails, and you have no other choice, you MUST instead generate a value
by strongly mixing multiple sources of available random or secret
values.
http://phpsecurity.readthedocs.org/en/latest/Insufficient-Entropy-For-Random-Values.html
In Layman's terms, no; In order to generate a particular data form, such as a string or integer, you must have an algorithm of some sort, which obviously cannot be 100% untraceable...
Basically, the final product myust come from a series of events (algorithm) in which is impossible to keep 'unrevealed'.
bignum getUniqueRandom()
{
static bignum sum = 0;
sum += rand();
return sum;
}
That way the next random number will be always greater than the previous (by a random factor between 0 and 1) and as result the numbers returned will never repeat.
edit:
The actual approach when randomness requirements are so high is to use a hardware random number generator; there are for example chips that measure atom decays of background radiation generating truly random seeds. Of course the nature or randomness is such that there is never a guarantee a pattern can't repeat, or you'd be damaging the actual randomness of the result. But the pattern can't be repeated by any technical/mathematical means, so the repeats are meaningless.
At every step in the execution of a computer program, the total internal state determines what the next total internal state will be. This internal state must be represented by some number of bits--all of the memory used by the program, registers of the processor, anything else that affects it. There can only be 2**N possible states given N bits of state information.
Since any given state T will lead to the same state T+1 (that's what "deterministic" means), the algorithm must eventually repeat itself after no more than 2**N steps. So what limits the cycle length of an RNG is the number of bits of internal state. A simple LCG might have only 32 bits of state, and therefore a cycle <= 2^32. Something like Mersenne Twister has 19968 bits of internal state, and its period is 2^19937-1.
So for any deterministic algorithm to be "unrepeatable in the history of the Universe", you'll probably need most of the atoms of the Universe to be memory for its internal state.

A PRNG that doesn't have a period of maximum length

Are there any 32-bit pseudorandom number generators (PRNG) that either have a greater or smaller period than (2^32 - 1)?
I would also appreciate a reliable source (book, research paper et.c) that uses it, thanks!
There are many PRNGs with different periods. Period is bounded by the size of the state space the PRNG maintains, not by the number of bits per integer. Java, for instance, uses a linear congruential generator with 48 bits of state. The late great George Marsaglia created "The mother of all random number generators", which pooled the results of smaller generators with relatively prime cycle lengths to get a cycle length of about 2250. As #Michael pointed out in a comment, Mersenne Twister has a whopping big period of 219937-1. Pierre L'ecuyer is considered to be a world leader on this topic, and you can read what many consider to be a pretty definitive chapter on the topic of PRNGs from the Elsevier publication "Handbooks in Operations Research and Management Science: Simulation". (Many of his other publications may also be of interest to you, both directly and as evidence of his expertise in this area.)
Linear congruential generators typically have a period of (2^n) exactly. This is, obviously, greater than (2^n-1), and uses only n bits of state (n=32 in your case) -- but it's the absolute upper limit on a PRNG with a 32-bit state.
Multiply-with-carry generators can be arranged to have a period of any safe prime, and configurations can be chosen either to use 32-bit arithmetic or to use a 32-bit state buffer, whichever you require. For a 32-bit state configuration all possible periods will be less than (2^32-1).

How exactly does PC/Mac generates random numbers for either 0 or 1?

This question is NOT about how to use any language to generate a random number between any interval. It is about generating either 0 or 1.
I understand that many random generator algorithm manipulate the very basic random(0 or 1) function and take seed from users and use an algorithm to generate various random numbers as needed.
The question is that how the CPU generate either 0 or 1? If I throw a coin, I can generate head or tailer. That's because I physically throw a coin and let the nature decide. But how does CPU do it? There must be an action that the CPU does (like throwing a coin) to get either 0 or 1 randomly, right?
Could anyone tell me about it?
Thanks
(This has several facets and thus several algorithms. Keep in mind that there are many different forms of randomness used for different purposes, but I understand your question in the way that you are interested in actual randomness used for cryptography.)
The fundamental problem here is that computers are (mostly) deterministic machines. Given the same input in the same state they always yield the same result. However, there are a few ways of actually gathering entropy:
User input. Since users bring outside input into the system you can take that to derive some bits from that. Similar to how you could use radioactive decay or line noise.
Network activity. Again, an outside source of stuff.
Generally interrupts (which kinda include the first two).
As alluded to in the first item, noise from peripherals, such as audio input or a webcam can be used.
There is dedicated hardware that can generate a few hundred MiB of randomness per second. Usually they give you random numbers directly instead of their internal entropy, though.
How exactly you derive bits from that is up to you but you could use time between events, or actual content from the events, etc. – generally eliminating bias from entropy sources isn't easy or trivial and a lot of thought and algorithmic work goes into that (in the case of the aforementioned special hardware this is all done in hardware and the code using it doesn't need to care about it).
Once you have a pool of actually random bits you can just use them as random numbers (/dev/random on Linux does that). But this has downsides, since there is usually little actual entropy and possibly a higher demand for random numbers. So you can invent algorithms to “stretch” that initial randomness in a manner that makes it still impossible or at least very difficult to predict anything about following numbers (/dev/urandom on Linux or both /dev/random and /dev/urandom on FreeBSD do that). Fortuna and Yarrow are so-called cryptographically secure pseudo-random number generators and designed with that in mind. You still have a very good guarantee about the quality of random numbers you generate, but have many more before your entropy pool runs out.
In any case, the CPU itself cannot give you a random 0 or 1. There's a lot more involved and this usually includes the complete computer system or special hardware built for that purpose.
There is also a second class of computational randomness: Plain vanilla pseudo-random number generators (PRNGs). What I said earlier about determinism – this is the embodiment of it. Given the same so-called seed a PRNG will yield the exact same sequence of numbers every time¹. While this sounds idiotic it has practical benefits.
Suppose you run a simulation involving lots of random numbers, maybe to simulate interaction between molecules or atoms that involve certain probabilities and unpredictable behaviour. In science you want results anyone can independently verify, given the same setup and procedure (or, with computing, the same algorithms). If you used actual randomness the only option you have would be to save every single random number used to make sure others can replicate the results independently.
But with a PRNG all you need to save is the seed and remember what algorithm you used. Others can then get the exact same sequence of pseudo-random numbers independently. Very nice property to have :-)
Footnotes
¹ This even includes the CSPRNGs mentioned above, but they are designed to be used in a special way that includes regular re-seeding with entropy to overcome that problem.
A CPU can only generate a uniform random number, U(0,1), which happens to range from 0 to 1. So mathematically, it would be defined as a random variable U in the range [0,1]. Examples of random draws of a U(0,1) random number in the range 0 to 1 would be 0.28100002, 0.34522, 0.7921, etc. The probability of any value between 0 and 1 is equal, i.e., they are equiprobable.
You can generate binary random variates that are either 0 or 1 by setting a random draw of U(0,1) to a 0 if U(0,1)<=0.5 and 1 if U(0,1)>0.5, since in theory there will be an equal number of random draws of U(0,1) below 0.5 and above 0.5.

Does Kernel::srand have a maximum input value?

I'm trying to seed a random number generator with the output of a hash. Currently I'm computing a SHA-1 hash, converting it to a giant integer, and feeding it to srand to initialize the RNG. This is so that I can get a predictable set of random numbers for an set of infinite cartesian coordinates (I'm hashing the coordinates).
I'm wondering whether Kernel::srand actually has a maximum value that it'll take, after which it truncates it in some way. The docs don't really make this obvious - they just say "a number".
I'll try to figure it out myself, but I'm assuming somebody out there has run into this already.
Knowing what programmers are like, it probably just calls libc's srand(). Either way, it's probably limited to 2^32-1, 2^31-1, 2^16-1, or 2^15-1.
There's also a danger that the value is clipped when cast from a biginteger to a C int/long, instead of only taking the low-order bits.
An easy test is to seed with 1 and take the first output. Then, seed with 2i+1 for i in [1..64] or so, take the first output of each, and compare. If you get a match for some i=n and all greater is, then it's probably doing arithmetic modulo 2n.
Note that the random number generator is almost certainly limited to 32 or 48 bits of entropy anyway, so there's little point seeding it with a huge value, and an attacker can reasonably easily predict future outputs given past outputs (and an "attacker" could simply be a player on a public nethack server).
EDIT: So I was wrong.
According to the docs for Kernel::rand(),
Ruby currently uses a modified Mersenne Twister with a period of 2**19937-1.
This means it's not just a call to libc's rand(). The Mersenne Twister is statistically superior (but not cryptographically secure). But anyway.
Testing using Kernel::srand(0); Kernel::sprintf("%x",Kernel::rand(2**32)) for various output sizes (2*16, 2*32, 2*36, 2*60, 2*64, 2*32+1, 2*35, 2*34+1), a few things are evident:
It figures out how many bits it needs (number of bits in max-1).
It generates output in groups of 32 bits, most-significant-bits-first, and drops the top bits (i.e. 0x[r0][r1][r2][r3][r4] with the top bits masked off).
If it's not less than max, it does some sort of retry. It's not obvious what this is from the output.
If it is less than max, it outputs the result.
I'm not sure why 2*32+1 and 2*64+1 are special (they produce the same output from Kernel::rand(2**1024) so probably have the exact same state) — I haven't found another collision.
The good news is that it doesn't simply clip to some arbitrary maximum (i.e. passing in huge numbers isn't equivalent to passing in 2**31-1), which is the most obvious thing that can go wrong. Kernel::srand() also returns the previous seed, which appears to be 128-bit, so it seems likely to be safe to pass in something large.
EDIT 2: Of course, there's no guarantee that the output will be reproducible between different Ruby versions (the docs merely say what it "currently uses"; apparently this was initially committed in 2002). Java has several portable deterministic PRNGs (SecureRandom.getInstance("SHA1PRNG","SUN"), albeit slow); I'm not aware of something similar for Ruby.

Resources