I can understand how seeding a pseudorandom generator with something like the current time does not make it truly random. But when a pseudorandom generator gets its seed from a hardware random number generator, doesn't it then become truly random, since its seed comes from a TRNG?
First of all, realize that individual numbers are not random or non-random; only large sets of numbers are.
If you seed a PRNG from a truly random source, and then just keep calling the PRNG to get more numbers, then you will just have a pseudorandom sequence of numbers, albeit well seeded.
If you seed a PRNG with a truly random source and then fetch only one value from the PRNG, then you have a hash of a truly random number. If the PRNG's seed hashing function is good, this will be just as random as its input. If it's not, it might be more predictable (for example, a PRNG with only 64 bits of internal state will only produce 2^64 different values, regardless of how many bits you seed it with).
That's not to say that it's a bad idea--game simulations and Monte Carlo systems should use a fast PRNG seeded from a TRNG source to get the best compromise of speed and quality. But cryptographic applications need cryptographically secure random values, and that's trickier.
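As a rough illustration of the first point, here is a minimal Python sketch. It assumes `os.urandom` as a stand-in for a hardware source (it is really the operating system's entropy-backed generator) and uses Python's built-in Mersenne Twister as the PRNG. However good the seed is, the sequence is fully determined by it:

```python
import os
import random


def seeded_prng() -> tuple[int, random.Random]:
    """Seed Python's Mersenne Twister with 128 bits from the OS entropy source."""
    seed = int.from_bytes(os.urandom(16), "big")
    return seed, random.Random(seed)


seed, rng = seeded_prng()
first_run = [rng.random() for _ in range(5)]

# Anyone who learns the seed can replay exactly the same sequence.
replay = random.Random(seed)
second_run = [replay.random() for _ in range(5)]

print(first_run == second_run)  # True
```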
No
Good seeds are a necessity, but they won't change the nature (and flaws) of the PRNG.
For example, even with a good, truly random seed, an RNG such as an LCG will still show correlated samples in high dimensions.
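One well-known case is RANDU, an LCG with multiplier 65539 and modulus 2^31. The sketch below (the function names are mine, chosen for illustration) checks that every three consecutive outputs satisfy x[k+2] = 6*x[k+1] - 9*x[k] (mod 2^31), whatever the seed:

```python
import secrets

MODULUS = 2**31


def randu(seed: int, count: int) -> list[int]:
    """Return `count` outputs of RANDU: x_next = 65539 * x mod 2^31."""
    values = []
    x = seed
    for _ in range(count):
        x = (65539 * x) % MODULUS
        values.append(x)
    return values


def triples_are_correlated(seed: int, count: int = 1000) -> bool:
    """Check the linear relation between every three consecutive outputs."""
    xs = randu(seed, count)
    return all(
        (xs[k + 2] - 6 * xs[k + 1] + 9 * xs[k]) % MODULUS == 0
        for k in range(count - 2)
    )


# Even a seed drawn from the OS entropy source cannot hide the structure.
seed = secrets.randbits(31) | 1
print(triples_are_correlated(seed))  # True
```

The relation follows from 65539^2 = 6 * 65539 - 9 (mod 2^31), so it is a property of the algorithm, not of the seed.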
I've been thinking about this as a thought experiment to try to understand some hashing concepts. Consider the requirement for a, say, 128-bit hash function (i.e., its output is exactly 128 bits in length).
A. You might look at something like MD5. So you input your data to be hashed, and out pops a 128 bit number.
B. Alternatively, you find a magical pseudo random number generator (PRNG). Some sort of Frankenstein version of the Twister. It seeds itself from all of your input data to be hashed, and has an internal state size >> 128 bits. You then generate 128 pseudo random bits as output.
It seems to me that both A and B effectively produce an output that is determined solely by the input data. Are these two approaches therefore equivalent?
Supplemental:
Some feedback has suggested that my two scenarios might not be equivalent in terms of security. If the pseudo random number generator were something like Java's SecureRandom (which uses SHA-1), seeded from the input data, then might A <=> B?
If you seed a PRNG with your input data and then extract 128 bits of random data from it, then you effectively leave the hashing to the PRNG seed function, and the size of the hash that it generates will be the size of the PRNG state buffer.
However, if the state of the PRNG is larger than the 128 bits you extract as a hash, then there's a risk that some of the input data used for the seed won't have any effect on the bits of the PRNG state that you extract. This makes it a really bad hash, so you don't want to do that.
PRNG seed functions are typically very weak hashes, because hashing is not their business. They're almost certainly insecure (which you did not ask about), and separate from that they're usually quite weak at avalanching. A strong hash typically tries to ensure that every bit of input has a fair chance of affecting every bit of output. Insecure hashes typically don't worry that they'll fail at this if the input data is too short, but a PRNG seed will often make no effort at all.
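To make the risk concrete, here is a deliberately weak, hypothetical seeding routine (not taken from any real PRNG) that XORs the input into a 256-bit state and then extracts only the first 128 bits. Input bytes that land in the unextracted half never affect the result, so collisions are trivial, while MD5 separates the same two inputs:

```python
import hashlib

STATE_BYTES = 32   # toy PRNG state: 256 bits
OUTPUT_BYTES = 16  # only 128 bits are extracted as the "hash"


def toy_seed_hash(data: bytes) -> bytes:
    """Hypothetical weak seeding: XOR input bytes into the state, then
    return the first 128 bits of the state."""
    state = bytearray(STATE_BYTES)
    for i, byte in enumerate(data):
        state[i % STATE_BYTES] ^= byte
    return bytes(state[:OUTPUT_BYTES])


# The two messages differ only in bytes 16..25, which land in the part
# of the state that is never extracted.
msg_a = b"A" * 16 + b"first tail"
msg_b = b"A" * 16 + b"other tail"

print(toy_seed_hash(msg_a) == toy_seed_hash(msg_b))                # True
print(hashlib.md5(msg_a).digest() == hashlib.md5(msg_b).digest())  # False
```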
Cryptographic hash functions are designed to make it hard to create input that generates a specific hash, and/or to make it hard to create two inputs that generate the same hash.
If something is designed as a random number generating algorithm, then this was not one of the requirements for the design. So if something is "just" a random number generator, there is no guarantee that it satisfies these important constraints on a cryptographic hashcode. So in that sense, they are not equivalent.
Of course there may be random number generating algorithms that were also designed as cryptographic hashing algorithms, and in that case (if the implementation did a good job at satisfying the requirements) they may be equivalent.
I tried to generate a discrete uniform random number with a + (b - a) * R, where R is a random sample generated by the linear congruential method. But I still have doubts about how to produce a discrete uniform number. Can anyone give me a correct formula for a discrete uniform random number?
A linear generator is not a true random number generator. If you want a true random generator you need a good source of entropy. In Windows there are a number of interfaces to access entropy collected by the system. In Linux there is /dev/random. Better still would be a hardware source like the new Intel DRNG instruction.
Once you have a good source, the source should be conditioned (unless conditioning is already applied). A simple way to do this is to use it as the key for a cipher that produces a keystream, such as AES in CTR mode. In fact AES/CTR is an excellent pseudorandom generator if you use a random key as the seed.
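On the formula itself: a + (b - a) * R with R in [0, 1) gives a continuous value, not an integer. A common way to get an integer in [a, b] is to scale by the number of outcomes and take the floor. The sketch below shows that mapping, plus a variant that draws from the operating system's entropy-backed generator through Python's `secrets` module. The function names are mine, and this is an illustration rather than the answerer's AES/CTR construction:

```python
import math
import secrets


def discrete_uniform_from_unit(a: int, b: int, r: float) -> int:
    """Map r in [0, 1) to an integer in [a, b], both ends inclusive."""
    return a + math.floor((b - a + 1) * r)


def discrete_uniform_secure(a: int, b: int) -> int:
    """Draw an integer in [a, b] from the OS entropy-backed generator."""
    return a + secrets.randbelow(b - a + 1)


print(discrete_uniform_from_unit(1, 6, 0.0))    # 1
print(discrete_uniform_from_unit(1, 6, 0.999))  # 6
print(discrete_uniform_secure(1, 6))            # some value from 1 to 6
```

The quality of the first function is only as good as the R fed into it, so a weak linear congruential R still gives a weak result.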
I wonder whether there is any cheap and effective function to generate pseudo-random numbers by their indices, with an interface something like this:
var rand = new PseudoRandom(seed); // all sequences for same seeds are equal
trace(rand.get(index1)); // get int number by index1, for example =0x12345678
trace(rand.get(index2));
...
trace(rand.get(index1)); // must return the SAME number, =0x12345678
Probably this isn't about randomness but about good hashing (fast and as close as possible to a uniform distribution), where the initial seed is used as a salt.
You could build such a random number generator out of the stream cipher Salsa20. One of the nice features of Salsa20 is that you can jump ahead to any offset very cheaply. And Salsa20 is fast, typically less than 20 cycles per byte. Since the cipher is indistinguishable from a truly random stream, uniformity should be excellent.
Since you probably don't need cryptographically secure random numbers, you could even reduce the number of rounds to something like 8 instead of the usual 20 rounds.
Another option is to just use the ideas behind Salsa20, how to mix up a state array (Bernstein calls that a hashing function), to build your own random number generator.
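Salsa20 is not in Python's standard library, so the sketch below swaps in a different primitive: BLAKE2b in keyed mode, with the seed as the key and the index as the message. It is not Salsa20 and makes no claim about speed; it only shows the seed-plus-index idea behind the interface in the question. The class name mirrors the question's `PseudoRandom`, and the 64-bit limits on seed and index are my assumptions:

```python
import hashlib


class PseudoRandom:
    """Index-addressable generator: value = keyed_hash(seed, index)."""

    def __init__(self, seed: int) -> None:
        # Assumption: the seed is a non-negative integer that fits in 64 bits.
        self._key = seed.to_bytes(8, "big")

    def get(self, index: int) -> int:
        """Return a 32-bit value that depends only on the seed and the index."""
        digest = hashlib.blake2b(
            index.to_bytes(8, "big"), key=self._key, digest_size=4
        ).digest()
        return int.from_bytes(digest, "big")


rand = PseudoRandom(42)
first = rand.get(1000)
rand.get(7)                       # reading other indices has no effect
print(rand.get(1000) == first)    # True
```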
Today, my friend had the idea of seeding a pseudo-random number generator multiple times, each time with a number generated by the previous one, to "make things more randomized".
An example in C#:
// Initiate one with a time-based seed
Random rand = new Random(milliseconds_since_unix_epoch());
// Then loop for a_number_of_times...
for (int i = 0; i < a_number_of_times; i++)
{
    // ... to initiate with the next random number generated
    rand = new Random(rand.Next());
}
// So is `rand` now really random?
assert(rand.Next() is really_random);
But I was thinking that this could probably increase the chance of getting a repeated seed being used for the pseudo-random number generator.
Will this
1. make things more randomized,
2. make it loop through a certain number of seeds, or
3. do nothing to the randomness (i.e. neither increase nor decrease it)?
Could any expert in pseudo-random number generators give some detailed explanations so that I can convince my friend? I would be happy to see answers explaining further detail in some pseudo-random number generator algorithm.
There are three basic levels of use for pseudorandom numbers. Each level subsumes the one below it.
1. Unexpected numbers with no particular correlation guarantees. Generators at this level typically have some hidden correlations that might matter to you, or might not.
2. Statistically independent numbers with known non-correlation. These are generally required for numerical simulations.
3. Cryptographically secure numbers that cannot be guessed. These are always required when security is at issue.
Each of these is deterministic. A random number generator is an algorithm that has some internal state. Applying the algorithm once yields a new internal state and an output number. Seeding the generator means setting up an internal state; it's not always the case that the seed interface allows setting up every possible internal state. As a good rule of thumb, always assume that the default library random() routine operates at only the weakest level, level 1.
To answer your specific question, the algorithm in the question (1) cannot increase the randomness and (2) might decrease it. The expectation of randomness, thus, is strictly lower than seeding it once at the beginning. The reason comes from the possible existence of short iterative cycles. An iterative cycle for a function F is a pair of integers n and k where F^(n) (k) = k, where the exponent is the number of times F is applied. For example, F^(3) (x) = F(F(F(x))). If there's a short iterative cycle, the random numbers will repeat more often than they would otherwise. In the code presented, the iteration function is to seed the generator and then take the first output.
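A small experiment can make the iterative-cycle idea tangible. The sketch below uses a made-up 16-bit generator (not .NET's Random): F(seed) seeds a 32-bit LCG step and returns the top 16 bits as the first output. Iterating F from any start must eventually enter a cycle, and the code reports how long that cycle is:

```python
def first_output(seed: int) -> int:
    """Toy generator (hypothetical): seed it, return its first 16-bit output."""
    state = (seed * 1103515245 + 12345) % 2**32
    return state >> 16  # top 16 bits of the 32-bit state


def find_cycle(start: int) -> tuple[int, int]:
    """Iterate F = first_output from `start`.

    Returns (steps taken before the cycle is entered, cycle length).
    """
    seen = {}
    x, step = start, 0
    while x not in seen:
        seen[x] = step
        x = first_output(x)
        step += 1
    return seen[x], step - seen[x]


tail, length = find_cycle(12345)
print(f"enters a cycle of length {length} after {tail} steps")
```

Only 2^16 seeds exist here, yet the cycle reached from a given start is typically far shorter than that, which is the loss of variety described above.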
To answer a question you didn't quite ask, but which is relevant to getting an understanding of this, seeding with a millisecond counter makes your generator fail the test of level 3, unguessability. That's because the number of possible milliseconds is cryptographically small, that is, small enough to be subject to exhaustive search. As of this writing, 2^50 should be considered cryptographically small. (For what counts as cryptographically large in any year, please find a reputable expert.) Now the number of milliseconds in a century is approximately 2^(41.5), so don't rely on that form of seeding for security purposes.
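The 2^(41.5) figure is easy to check. This short calculation assumes a 365.25-day year:

```python
import math

# Milliseconds in a century, assuming 365.25 days per year.
MS_PER_CENTURY = 100 * 365.25 * 24 * 60 * 60 * 1000

bits = math.log2(MS_PER_CENTURY)
print(f"{MS_PER_CENTURY:.3e} ms per century, about 2^{bits:.1f}")
```

That is well under the 2^50 threshold quoted above, so an attacker could try every possible millisecond seed.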
Your example won't increase the randomness because there is no increase in entropy. Everything is still derived from the time at which the program was run.
Instead of using something based on the current time, computers maintain an entropy pool, and build it up with data that is statistically random (or at least, unguessable). For example, the timing delay between network packets, or key-strokes, or hard-drive read times.
You should tap into that entropy pool if you want good random numbers. Generators that draw on it are known as cryptographically secure pseudorandom number generators (CSPRNGs).
In C#, see the System.Security.Cryptography.RandomNumberGenerator class for the right way to get a secure random number.
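For comparison, a minimal Python sketch of the same idea uses the standard `secrets` module and `random.SystemRandom`, both of which read from the operating system's generator instead of a time-based seed:

```python
import random
import secrets

# 128 unguessable bits straight from the operating system's generator.
token = secrets.token_bytes(16)

# SystemRandom offers the familiar random.Random interface on top of the
# same OS source; it cannot be seeded or replayed.
system_rng = random.SystemRandom()
die_roll = system_rng.randint(1, 6)

print(token.hex(), die_roll)
```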
This will not make things more "random".
Our seed determines the random-looking but completely determined sequence of numbers that rand.Next() gives us.
Instead of making things more random, your code defines a mapping from your initial seed to some final seed, and, given the same initial seed, you will always end up with the same final seed.
Try playing with this code and you will see what I mean:
int my_seed = 100; // change my seed to whatever you want
int a_number_of_times = 10; // any fixed count works; 10 is just an example
Random rand = new Random(my_seed);
for (int i = 0; i < a_number_of_times; i++)
{
    // reseed with the first output of the previous generator
    rand = new Random(rand.Next());
}
// does this print the same number every run if we don't change the starting seed?
Console.WriteLine(rand.Next()); // yes, it does
The Random object with this final seed is just like any other Random object. It just took you more time than necessary to create it.
I'm just curious...
How do you simulate randomness? How is it done in modern OS (Windows, Linux, etc.)?
Edit:
Okay, I don't just mean generating a random number, which can be done by calling a rand() function in most high-level programming languages.
But, I'm more concerned with how it is actually done in modern operating systems.
Please see:
Pseudo-random number generator
True random number generator
Fast pseudo random number generator for procedural content
Create Random Number Sequence with No Repeats
How do you generate a random number in C#?
Seeding a random number generator in .NET
How to get random double value out of random byte array values?
etc...