Is user delay between random takes is good improvement for PRNG? - algorithm

I thought that for making random choices for example for next track in a player or next page in the browser it could be possible to use time as 'natural phenomenon', for example decent RPNG just can continuously get next random number without program request (for example in a thread every several milliseconds or event more often) and when the time comes (based on the user decision), the choice will be naturally affected by this user delay.
Is this approach is good enough and how can it be tested? The problem for testing manually is that I can not wait that long in real world to save enough random numbers to feed them to some test program. Any artificial attempt to speed this up will make the method itself invalid.
Thanks

A good random number generator really doesn't need improvement, and even if it did, it isn't clear that user input timing would help.
Could a user ever detect a pattern in tracks selected by an LCG? Whatever your platform, its likely that its built-in random() function would be good enough (that is, it would appear completely random to a user).
If you are still worried, however, use a cryptographic quality RNG, seeded with data from the dedicated source of randomness on your system. Nowadays, many of these system RNGs use truly random bits generated through quantum events in hardware. However, they can be slow to produce bits, so its best to use them as a seed for a fast, algorithmic PRNG.
Now, if you aren't convinced these approaches are good enough, you should be very skeptical that the timing of user typing is a good source. The keys that are pressed by users are highly predictable, given the limited vocabulary in use and the patterns that tend to appear within that limited set of words. This predictability in letter sequences leads to a high degree of predictability in timing between key presses.
I know that a lot of security programs use this technique during key generation. I don't think that it is pure snake oil, but it could be a placebo to placate users. A good product will depend on the system RNG.

Acquiring the time information that you describe can indeed add entropy to a PRNG. However, from your description of your intended applications, I don't think you need it. For "random choices for example for next track in a player or next page in the browser", a trivial, unmodified PRNG is fine. For security applications such as nonces, etc. it is much more important.
Anyway, you should read about PRNG entropy sources.

I wouldn't improve PRNGs with user delays, mostly because they're quite regular: you type at around the same speed, and it takes too long to measure the delay between a click and another (assuming normal usage). I'd rather use other user-triggered events: pressed keys, distance between each click, position of the mouse at given moments.

Related

What should be used as a PRNG seed?

Most papers either do not even mention it or just say get an "initial vector" from somewhere somehow.
The approach posted in a lot of places is to use system time. However isn't this a serious vulnerability (assuming the algorithms are known)? If the time is known withing a few seconds I estimate (by doing a few trivial tests using QueryPerformanceCounter) there would be less than 24 bits of actual information (quite pathetic). Plus since time has a somewhat predictable nature, one could generate necessary information for a hypothetical attack in advance.
Is there a way to initialize a PRNG and not feel sad?
If you are needing to generate cryptographically secure random numbers, you should almost always use the operating system's CSPRNG; that is, /dev/urandom, CryptGenRandom, getrandom, getentropy, or the like. This will be secure, well-seeded, well-tested, and generally foolproof.
In the unlikely event that you actually need a different CSPRNG, then the PRNG entropy should either be drawn from the system CSPRNG or another CSPRNG that is seeded from the system one. For example, in the event you really need to generate random numbers very quickly and the OS generator is not fast enough, you could use a per-thread CSPRNG seeded from the OS generator.
We assume that an attacker knows how your code is designed and structured, so if you seed a CSPRNG with easily guessable values like the PID, time, or user ID, then it's likely that an attacker will be able to guess future output based on seeing a small amount of output.
If you don't need cryptographic security, you can draw the seed from any source you like, but the OS CSPRNG is not a bad choice if you don't care very much but just want a good seed. For non-cryptographic purposes, I like ChaCha8 (which is a cryptographic algorithm, but with 8 rounds is insufficiently conservative) as my PRNG because it is generally faster than most alternatives, has good statistical properties, and can easily be seeded with any 32- or 64-bit value by just repeating that value as the key. In such a case, as long as the seed is unlikely to repeat, it will probably produce good output.

What Type of Random Number Generator is Used in the Casino Gaming Industry? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
Given the extremely high requirements for unpredictability to prevent casinos from going bankrupt, what random number generation algorithm and seeding scheme is typically used in devices like slot machines, video poker machines, etc.?
EDIT: Related questions:
Do stateless random number generators exist?
True random number generator
Alternative Entropy Sources
There are many things that the gaming sites have to consider when choosing/implementing an RNG. Without due dilligence, it can go spectacularly wrong.
To get a licence to operate a gaming site in a particular jurisdiction usually requires that the RNG has been certified by an independent third-party. The third-party testers will analyse the source code and run statistical tests (e.g. Diehard) to ensure that the RNG behaves randomly. Reputable poker sites will usually include details of the certification that their RNG has undergone (for example: PokerStars's RNG page).
I've been involved in a few gaming projects, and for one of them I had to design and implement the RNG part, so I had to investigate all of these issues. Most poker sites will use some hardware device for entropy, but they won't rely on just hardware. Usually it will be used in conjunction with a pseudo-RNG (PRNG). There are two main reasons for this. Firstly, the hardware is slow, it can only extract a certain number of bits of entropy in a given time period from whatever physical process it is monitoring. Secondly, hardware fails in unpredictable ways that software PRNGs do not.
Fortuna is the state of the art in terms of cryptographically strong PRNGs. It can be fed entropy from one or more external sources (e.g. a hardware RNG) and is resilient in the face of attempted exploits or RNG hardware failure. It's a decent choice for gaming sites, though some might argue it is overkill.
Pokerroom.com used to just use Java's SecureRandom (they probably still do, but I couldn't find details on their site). This is mostly good enough, but it does suffer from the degrees of freedom problem.
Most stock RNG implementations (e.g. Mersenne Twister) do not have sufficient degrees of freedom to be able to generate every possible shuffle of a 52-card deck from a given initial state (this is something I tried to explain in a previous blog post).
EDIT: I've answered mostly in relation to online poker rooms and casinos, but the same considerations apply to physical video poker and video slots machines in real world casinos.
For a casino gaming applications, I think the seeding of the algorithm is the most important part to make sure all games "booted" up don't run through the same sequence or some small set of predictable sequences. That is, the source of entropy leading to the seed for the starting position is the critical thing. Beyond that, any good quality random number generator where each bit position as has a ~50/50 probability of being 1/0 and the period is relatively long would be sufficient. For example, something like the Mersenne twister PRNG has such properties.
Using cryptographically secure random generators only becomes important when the actual output of the random generator can be viewed directly. For example, if you were monitoring each number actually generated by the number generator - after viewing many numbers in the sequence - with a non-cryptographic generator information about that sequence can lead to establishing information about all the internal state of the generator. At this point, if you know what the algorithm looks like, you would be able to predict future numbers and that would be bad. The cryptographic generator prevents that reverse engineering back to the internal state so that predicting future numbers becomes "impossible".
However, in the case of a casino game, you would (or should) have no visibility to the actual numbers being generated under the hood. Each time a random number is generated - say a 32-bit number - that number will be used then, for example, mod 52 for a deck shuffling algorithm....no where in that process do you have any idea what numbers were being generated by the algorithm to shuffle that deck. That is, most of the bits of "randomness" is just being thrown out and even the ones being used you have no visibility to. Therefore, no way to reverse engineer the state.
Getting back to a true source of entropy to seed the whole process, that is the hard part. See the Wikipedia entry on entropy for some starting points on techniques.
As an aside, if you did want cryptographically sequence random numbers from a "regular" algorithm, a simple approach is to take a few random numbers in sequence, concatenate them together and then run something like MD5 or SHA-1 on them and the result is just as random and also cryptographically secure. That is, you just made your own "secure" random number generator.
We've been using the Protego R210-USB TRNG (and the non-usb version before that) as random seed generators in casino applications, with java.security.SecureRandom
on top. We had The Swedish National Laboratory of Forensic Science perform a separate audit of the R210, and it passed without a flaw.
You probably need a cryptographically secure pseudo-random generator. There are a lot of variants. Google "Blum-Blum-Shub", for example.
The security properties of these pseudo-random generators will generally be that, even when the attacker can observe polynomially many outputs from such generators, it won't be feasible to guess the next output with a probability much better than random guessing. Also, it is not feasible to distinguish the output of such generators from truly random bits. The security holds even when all the algorithms and parameters are known by the attacker (except for the secret seed).
The security of the generators is often measured with respect to a security parameter. In the case of BBS, it is the size of the modulus. This is no different from other crypto stuff. For example, RSA is secure only when the key is long enough.
Note that, the output of such generators may not be uniform (in fact, can be far away from uniform in statistical sense). But since no one can distinguish the two distributions without infinite computing power, these generators will suffice in most applications that require truly random bits.
Bear in mind, however, that these cryptographically secure pseudo-random generators are usually slow. So if speed is indeed a concern, less rigorous approaches may be more relevant, such as using hash functions, as suggested by Jeff.
Casino slot machines generate random numbers continuously at very high speed and use the most recent result(s) when the user pulls the lever (or hits the button) to spin the reels.
Even a simplistic generator can be used. Even if you knew the algorithm used, you cannot observe where in the sequence it is because nearly all the results are discarded. If somehow you did know where it was in the sequence, you'd have to have millisecond or better timing to take advantage of it.
Modern "mechanical reel" machines use PRNGs and drive the reels with stepper motors to simulate the old style spin-and-brake.
http://en.wikipedia.org/wiki/Slot_machine#Random_number_generators
http://entertainment.howstuffworks.com/slot-machine3.htm
Casinos shouldn't be using Pseudo-random number generators, they should be using hardware ones: http://en.wikipedia.org/wiki/Hardware_random_number_generator
I suppose anything goes for apps and offshore gambling nowadays, but all these other answers are incomplete, at least for Nevada Gaming Control Board licensed machines, which I believe the question is originally about.
The technical specifications for RNGs licensed in Nevada for gaming purposes are laid out in Regulation 14.040(2).
As of May 24, 2012, here is a summary of the rules a RNG must follow:
Static seeds cannot be used. You have to seed the RNG using a millisecond time source or other true entropy source which has no external readout anywhere on the machine. (This helps to reduce the incidence of "magic number" attacks)
The RNG must continue generating numbers in its sequence at least 100 times per second when the game is not being played. (This helps to avoid timing attacks)
RNG outputs cannot be reused; they must be used exactly once if at all and then thrown away.
Multi-system cabinets must use a separate RNG and separate seed for each game.
Games that use RNGs for helping to choose numbers on behalf of the player (such as Lotto Quick Pick) must use a separate RNG for that process.
Games must not roll RNGs until they are actually needed in a game. (i.e. you need to wait until the player chooses to deal or spin before generating RNGs)
The RNG must pass a 95% confidence chi-squared test based on 10,000 trials as apart of a system test. It must display a warning if this test fails, and it must disable play if it fails twice in a row.
It must remember, and be able to report on, the last 10 test results as described in 7.
Every possible game outcome must be generatable by the RNG. As a pessimal example, linear congruential generators don't actually generate every possible output in their range, so they're not very useful for gaming.
Additionally, your machine design has to be submitted to the gaming commission and it has to be approved, which is expensive and takes lots of time. There are a few third-party companies that specialize in auditing your new RNG to make sure it's random. Gaming Laboratories publishes an even stricter set of standards than Nevada does. They go into much greater detail about the limitations of hardware RNGs, and Nevada in particular likes to see core RNGs that it's previously approved. This can all get very expensive, which is why many developers prefer to license an existing previously-approved RNG for new game projects.
Here is a fun list of random number generator attacks to keep you up late at night.
For super-nerds only: the source for most USB hardware RNGs is typically an avalanche diode. However, the thermal noise produced by this type of diode is not quantum-random, and it is possible to influence the randomness of an avalanche diode by significantly lowering the temperature.
As a final note, someone above recommended just using a Mersenne Twister for random number generation. This is a Bad Idea unless you are taking additional entropy from some other source. The plain vanilla Mersenne Twister is highly inappropriate for gaming and cryptographic applications, as described by its creator.
If you want to do it properly you have to get physical - ERNIE the UK national savings number picker uses a shot noise in Neon tubes.
I for sure have seen a german gambling machine that was not allowed to be ran commercially after a given date, so I suppose it was a PNRG with a looong one time pad seed list.
Most poker sites use hardware random number generators. They will also modify the output to remove any scaling bias and often use 'pots' of numbers which can be 'stirred' using entropic events (user activity, serer i/o events etc). Quite often the resultant numbers just index pre-generated decks (starting off as a sorted list of cards).

Does any software exist for building entropy pools from user input?

It'd be nice to be able, for some purposes, to bypass any sort of algorithmically generated random numbers in favor of natural input---say, dice rolls. Cryptographic key generation, for instance, strikes me as a situation where little enough random data is needed, and the requirement that the data be truly random is high enough, that this might be a feasible and desirable thing to do.
So what I'd like to know, before I go and get my hands dirty, is this: does any software exist for building an entropy pool directly from random digit input? Note that it's not quite enough to simply convert things from radix r to radix 2; since, for instance, 3 and 2 are relatively prime, it's not entirely straightforward to turn a radix-3 (or radix-6) number into binary digits while holding onto maximal entropy in the original input.
The device /dev/random does exactly this on Linux -- maybe it would be worth looking at the source?
EDIT:
As joeytwiddle says, if sufficient randomness is unavailable, /dev/random will block, waiting for entropy to "build up" by monitoring external devices (e.g. mouse, disk drives). This may or may not be what you want. If you'd prefer never to wait and are satisfied with possibly-lower-quality randomness, use /dev/urandom instead -- it's a non-blocking pseudorandom number generator that injects randomness from /dev/random whenever it is available, making it more random than a plain deterministic PRNG. (See man /dev/urandom for further details.)
This paper suggests various approaches with implementation ideas for both UN*X and Windows.
I'm not sure what you're asking for. "Entropy pool" is just a word for "some random numbers", so you could certainly use dice rolls; simply use them as the seen to a pseudorandom number generator that has the characteristics you want.
You can get physically generated random numbers online from, eg, Lavarnd or Hotbits.
Note that the amount of entropy in the pool doesn't necessarily have to be an integer. This should mostly deal with your prime-factors-other-than-2 issue.
Even if you end up using an implementation that does require integer estimates, you need quite a few dice rolls to generate a crypto key. So you could just demand them in bunches. If the user gives you the results of 10 d6 rolls, and you estimate the entropy as 25 bits, you've only lost 0.08 bits per dice roll. Remember to round down ;-)
Btw, I would treat asking the user for TRNG data, rather than drawing it from hardware sources as /dev/random does, to be a fun toy rather than an improvement. It's difficult enough for experts to generate random numbers - you don't want to leave general users at the mercy of their own amateurism. "The generation of random numbers is too important to be left to chance" --Robert Coveyou.
By another way, the authors of BSD argue that since entropy estimation for practical sources on PC hardware isn't all that well understood (being a physics problem, not a math problem), using a PRNG isn't actually that bad an option, provided that it is well-reseeded according to Schneier / Kelsey / Ferguson's Yarrow design. Your dice idea does at least have the advantage over typical sources of entropy for /dev/random, that as long as the user can be trusted to find fair dice and roll them properly, you can confidently put a lower bound the entropy. It has the disadvantage that an observer with a good pair of binoculars and/or a means of eavesdropping on their keyboard (e.g. by its E/M emissions) can break the whole scheme, so really it all depends on your threat model.

Why is it hard for a program to generate random numbers?

My kids asked me this question and I couldn't really give a concise, understandable explanation.
So I'm hoping someone on SO can.
How about, "Because computers just follow instructions, and random numbers are the opposite of following instructions. If you make a random number by following instructions, then it's not very random! Imagine trying to give someone instructions on how to choose a random number."
Here's a kid friendly explanation:
Get a Dice (the number of sides doesn't matter)
Write these down on a piece of paper:
Move right
Move up
Move up
Turn the dice over
Move down
Move right
Show them the dice and paper. Explain that the dice represents the computer and the
paper represent the math or algorithm that tells the computer what number it will return.
Now, roll the dice. Tell them that you are "seeding" or asking the computer to start at a random dice position.
Follow each step in the paper (move right) by moving the dice.
Let's say that you threw a 6 sided die and it was seeded at 5. By moving right, you get a 4.
Explain that the computer must start with a starting value. This could be given by any number of sources such as the date or mouse movement. Show them that how they throw the dice determines the starting value.
Explain that the piece of paper is how the computer get the next number. Tell them that the instructions on the paper can be changed as easily as the algorithm for the random generator can be changed by the programmer.
Have fun showing them the various possibilities that is only limited by their imaginations.
Now for the answer to your question:
Tell them that when a good mathematician knows the starting value and what step the computer is currently at, the mathematician can tell what is the next value of the random number.
Ask the child were to hide the paper and throw the dice.
Then ask the child to follow the steps on the paper, you then write down how he gets the next random number.
Afterwards, show them your paper. Now that you have a copy of their random number generator, its easy for anyone else to "guess" the next random to come out.
No matter how creative the child is with their algorithm, you should still be able to deduce their algorithm. Tell your child that in the computer world, nothing is hidden and just by observation, even if its just the numbers that was observed, the random number algorithm can be discovered.
...as a side effect, if the child was able to come up with a good algorithm that confused you, in which you can't deduce the next sequence, then you have a bright child. :D
Here's my attempt at explaining randomness at an approximately eighth-grade level. Hope your kids find it useful!
Surprising as it may seem, a computer is not very smart. Computers must follow their instructions blindly, and are therefore completely predictable. A computer that doesn't follow its instructions in this manner is, in fact, broken! We want computers to do exactly what we tell them.
That's precisely what makes it hard to do things randomly. Computers must be told a sequence of instructions on how to generate random numbers. But that's not really random, because if you gave anybody else the instructions and the same starting point, they could come up with the same answers. So computers can't be truly random just by following instructions.
Ask them to devise a step-by-step method to generate a random number.
And don't accept "pick a number from 1 to 10" as an answer ;)
Trying out a problem should illustrate the difficulty of having to generate random numbers from a set of instructions, just like what computers actually have to do.
Because computers are deterministic machines.
Generating random numbers on a computer is like playing "Eenie meenie miney moe" when choosing who's It first in a game of tag. On the surface it does look random, but when you get into the details, it's completely deterministic. It's hard to make eenie meenie miney moe into a scheme that a person really can't predict the outcome of.
Also there's some difficulties with getting the distribution nice and even.
Because given any input, an algorithm produces the exact same output every single time. And you can't just provide a "random" input, because you're trying to generate the random number in the first place.
"Kids, unless they're broken, computers never lie, and they always do what you tell them to do. Even when we are disappointed by the results, it always turns out that they were doing what they were told to do with complete fidelity. They can only do two things: add one and one, and move a number from one place to another. If you want them to produce random numbers, you need to explain to them how to do that in terms of adding one and one and moving. Once you have explained that, the results will not be random."
Because the only true source of randomness exists at the quantum level. With suitable hardware assists, computers can access this level. for example, they can sample the decay of a radioactve isotope or the noise from a thermionic valve. But your basic PC doesn't come with this cool stuff.
A simple explanation for the children:
The definition of randomness is a philosophical and mathematical question, beyond the scope of this answer, but by definition there is no such thing as a "random" number. In a metaphysical sense, a number is only random in sequential form; however, there is a probability that a sequence follows certain statistical distributions depending on the sample size. A random number generator (in our case a pseudo-random number generator, or PRNG) is simply a device to produce a quasi-random sequence of numbers that we can only estimate (based on the given probability inherent within the sequence) to be random.
You should explain to the children that programs can only mimic these devices using complex mathematical formulas (which guarantee a lack of "randomness" by definition because they are a result of some function, or procedural algorithm). Typically, rigorous statistical analysis is necessary in order to differentiate the use of a quantum hardware PRNG (use this as an opportunity to explain to your kids the Heisenberg Principle!) and that of a strong software PRNG.
Had to be done really
Source: http://xkcd.com/221/
Because there is no such thing as a random number.
Random is a human concept that we use when we cannot comprehend data and do not understand it. If we are to believe that science will ultimately lead to an understanding of how everything works then surely everything is deterministic.
Take away the human and there is no random there is only "this". It happens because it happens, not because it is random.
Because a program is a system and everything in a system is made to run with consistency and regularity. Randomness has no place in a system.
It is hard because given the same sets of inputs and conditions, a program will produce the same result everytime. This by definition is not random.
Algorithms to generate random numbers are inevitably deterministic. They take a small random seed, and use it to obtain a long string of pseudo-random digits.
It's very difficult to do this without introducing subtle patterns into the data. A string of digits can look perfectly random but have repeated patterns which make the distribution innappropriate for applications where randomness is required.
Computers can only execute algorithmic computations, and a truly random number isn't an algorithmic thing. You can get algorithms that produce numbers that behave like random numbers; such algorithms are called 'Pseudo-Random number generators'.
At various times in the past, people have made random number generators from analog-digital converters connected to sources of electronic noise, but this tends to be fairly specialised kit.
Primarily because computers don't have any functions that behave in discrete, non-random ways. A computer is predictable, which allows us to program reliable software. If it wasn't predictable it would be easier to generate a random number (since our software could rely on this unpredictable method).
While it's possible to generate pseudo-random numbers, and numbers that are distributed randomly, you cannot generate truly random numbers without separate hardware. There is hardware that generates truly random numbers based on "quantum" interactions (at least according to the manufacturers). Online poker sites sometimes use these adapters for their generators.
Apparently there are even online services to provide random numbers - random.org for example.
As surprising as it may seem, it is difficult to get a computer to do something by chance. A computer follows its instructions blindly and is therefore completely predictable. (A computer that doesn't follow its instructions in this manner is broken.) There are two main approaches to generating random numbers using a computer: Pseudo-Random Number Generators (PRNGs) and True Random Number Generators (TRNGs).
Actually, on most modern computers it's not hard to produce numbers that are "random enough" for most purposes. As others have noted, the critical thing is having a source of randomness. You can't just write a program that will produce randomness algorithmically, but you can observe randomness in the various activities of most computers of reasonable complexity, i.e., the ones we typically think of when writing programs. One such source is timing data of interrupts from various system devices.
At one time many computers had no way to get at this data and could only offer pseudorandomness, that is, a random, but repeatable distribution of numbers based on a particular seed. For many purposes this is sufficient -- choosing a different seed each time results in good enough randomness. For other purposes, such as encryption, this isn't strong enough and you need some randomness to start with that isn't repeatable or predictable. Today, most computers (with the exception of embedded devices, perhaps) are sophisticated enough to have a source of randomness that can generate encryption-strength random numbers. For instance, Linux has /dev/random and the .NET framework supports the cryptographically strong RandomNumberGenerator class which has a number of implementations.
Its probably helpful to distinguish between a number that is hard to predict (which a computer can create) from something that is not deterministic (which is a bit tougher for computers, and theoretically, any physical being).
It's easy to come up with an algorithm that generates unexpected numbers, that appear random in some sense. But to design an algorithm that generates true random numbers, well, that's hard.
Imagine designing an algorithm to simulate a dice roll. You can easily formulate some procedure to generate different numbers on each iteration. But can you guarantee that, in the long run (I mean, up to the infinity), the amount of times that 6 came out will be the same as any other number? When designing a good random number generator, that's the kind of commitment that you have to assume. You have to provide strong guarantees (i.e. mathematical proofs) about the randomness, if the application (e.g. lottery) requires it.
It is relevant to note that humans perform very poorly at generating random numbers. Computers are worse because they just follow a strict set commands. Humans can only generate good (pseudo) random numbers when following an algorithm, a set of commands. Computers are the same.
Although it should be noted that computers can gather entropy from the "environment" connected to it, like keyboard and mouse actions, what aids in generating random numbers (either directly or by seeding a PRNG).
To make the computer generate a random number, the computer has to have a source of randomness to start with.
It has to be feeded a seed that can't be expected or calculated by just looking at the seed, if the seed comes from a clock then it can be predicted or calculated by knowing the time, if the seed comes from like filming a lavalamp and get numbers from the picture stream then it's harder to just look at the seed to know what next number will be.
The computer does not have an built in lava lamp to generate that randomness, thats whats make it hard, we have to substitute real randomness with some input that exists in the computer, maybe by logging passing tcpip-packets or other things, but its not many ways to get that randomness sources in.
Computers just don't have suitable hardware. Ordinary computer's hardware is meant to be deterministic. With suitable hardware like mentioned here random numbers are not a problem at all.
Awhile back I came across the "Dice-O-Matic"
http://GamesByEmail.com/News/DiceOMatic
Kind of interesting real world application of the problem.
Its not hard, here's a couple for free: 12, 1400, 397.6

True random number generator [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 7 years ago.
Improve this question
Sorry for this not being a "real" question, but Sometime back i remember seeing a post here about randomizing a randomizer randomly to generate truly random numbers, not just pseudo random. I dont see it if i search for it.
Does anybody know about that article?
I have to disagree with a lot of the answers to this question.
It is possible to collect random data on a computer. SSL, SSH and VPNs would not be secure if you couldn't.
The way software random number generator work is there is a pool of random data that is gathered from many different places, such as clock drift, interrupt timings, etc.
The trick to these schemes is in correctly estimating the entropy (the posh name for the randomness). It doesn't matter whether the source is bias, as long as you estimate the entropy correctly.
To illustrate this, the chance of me hitting the letter e in this comment is much higher than that of z , so if I were to use key interrupts as a source of entropy it would be bias - but there is still some randomness to be had in that input. You can't predict exactly which sequence of letters will come next in this paragraph. You can extract entropy from this uncertainty and use it part of a random byte.
Good quality real-random generators like Yarrow have quite sophisticated entropy estimation built in to them and will only emit as many bytes as it can reliably say it has in its "randomness pool."
I believe that was on thedailywtf.com - ie. not something that you want to do.
It is not possible to get a truly random number from pseudorandom numbers, no matter how many times you call randomize().
You can get "true" random numbers from special hardware. You could also collect entropy from mouse movements and things like that.
At the end of the post, I will answer your question of why you might want to use multiple random number generators for "more randomness".
There are philosophical debates about what randomness means. Here, I will mean "indistinguishable in every respect from a uniform(0,1) iid distribution over the samples drawn" I am totally ignoring philosophical questions of what random is.
Knuth volume 2 has an analysis where he attempts to create a random number generator as you suggest, and then analyzes why it fails, and what true random processes are. Volume 2 examines RNGs in detail.
The others recommend you using random physical processes to generate random numbers. However, as we can see in the Espo/vt interaction, these processes can have subtle periodic elements and other non-random elements, in part due to outside factors with deterministic behavior. In general, it is best never to assume randomness, but always to test for it, and you usually can correct for such artifacts if you are aware of them.
It is possible to create an "infinite" stream of bits that appears completely random, deterministically. Unfortunately, such approaches grow in memory with the number of bits asked for (as they would have to, to avoid repeating cycles), so their scope is limited.
In practice, you are almost always better off using a pseudo-random number generator with known properties. The key numbers to look for is the phase-space dimension (which is roughly offset between samples you can still count on being uniformally distributed) and the bit-width (the number of bits in each sample which are uniformally random with respect to each other), and the cycle size (the number of samples you can take before the distribution starts repeating).
However, since random numbers from a given generator are deterministically in a known sequence, your procedure might be exposed by someone searching through the generator and finding an aligning sequence. Therefore, you can likely avoid your distribution being immediately recognized as coming from a particular random number generator if you maintain two generators. From the first, you sample i, and then map this uniformally over one to n, where n is at most the phase dimension. Then, in the second you sample i times, and return the ith result. This will reduce your cycle size to (orginal cycle size/n) in the worst case, but for that cycle will still generate uniform random numbers, and do so in a way that makes the search for alignment exponential in n. It will also reduce the independent phase length. Don't use this method unless you understand what reduced cycle and independent phase lengths mean to your application.
An algorithm for truly random numbers cannot exist as the definition of random numbers is:
Having unpredictable outcomes and, in
the ideal case, all outcomes equally
probable; resulting from such
selection; lacking statistical
correlation.
There are better or worse pseudorandom number generators (PRNGs), i.e. completely predictable sequences of numbers that are difficult to predict without knowing a piece of information, called the seed.
Now, PRNGs for which it is extremely hard to infer the seed are cryptographically secure. You might want to look them up in Google if that is what you seek.
Another way (whether this is truly random or not is a philosophical question) is to use random sources of data. For example, unpredictable physical quantities, such as noise, or measuring radioactive decay.
These are still subject to attacks because they can be independently measured, have biases, and so on. So it's really tricky. This is done with custom hardware, which is usually quite expensive. I have no idea how good /dev/random is, but I would bet it is not good enough for cryptography (most cryptography programs come with their own RNG and Linux also looks for a hardware RNG at start-up).
According to Wikipedia /dev/random, in Unix-like operating systems, is a special file that serves as a true random number generator.
The /dev/random driver gathers environmental noise from various non-deterministic sources including, but not limited to, inter-keyboard timings and inter-interrupt timings that occur within the operating system environment. The noise data is sampled and combined with a CRC-like mixing function into a continuously updating ``entropy-pool''. Random bit strings are obtained by taking a MD5 hash of the contents of this pool. The one-way hash function distills the true random bits from pool data and hides the state of the pool from adversaries.
The /dev/random routine maintains an estimate of true randomness in the pool and decreases it every time random strings are requested for use. When the estimate goes down to zero, the routine locks and waits for the occurrence of non-deterministic events to refresh the pool.
The /dev/random kernel module also provides another interface, /dev/urandom, that does not wait for the entropy-pool to re-charge and returns as many bytes as requested. As a result /dev/urandom is considerably faster at generation compared to /dev/random which is used only when very high quality randomness is desired.
John von Neumann once said something to the effect of "anyone attempting to generate random numbers via algorithmic means is, of course, living in sin."
Not even /dev/random is random, in a mathematician's or a physicist's sense of the word. Not even radioisotope decay measurement is random. (The decay rate is. The measurement isn't. Geiger counters have a small reset time after each detected event, during which time they are unable to detect new events. This leads to subtle biases. There are ways to substantially mitigate this, but not completely eliminate it.)
Stop looking for true randomness. A good pseudorandom number generator is really what you're looking for.
If you believe in a deterministic universe, true randomness doesn't exist. :-) For example, someone has suggested that radioactive decay is truly random, but IMHO, just because scientists haven't yet worked out the pattern, doesn't mean that there isn't a pattern there to be worked out. Usually, when you want "random" numbers, what you need are numbers for encryption that no one else will be able to guess.
The closest you can get to random is to measure something natural that no enemy would also be able to measure. Usually you throw away the most significant bits, from your measurement, leaving numbers with are more likely to be evenly spread. Hard core random number users get special hardware that measures radioactive events, but you can get some randomness from the human using the computer from things like keypress intervals and mouse movements, and if the computer doesn't have direct users, from CPU temperature sensors, and from network traffic. You could also use things like web cams and microphones connected to sound cards, but I don't know if anyone does.
To summarize some of what has been said, our working definition of what a secure source of randomness is is similar to our definition of cryptographically secure: it appears random if smart folks have looked at it and weren't able to show that it isn't completely unpredictable.
There is no system for generating random numbers which couldn't conceivably be predicted, just as there is no cryptographic cipher that couldn't conceivably be cracked. The trusted solutions used for important work are merely those which have proven to be difficult to defeat so far. If anyone tells you otherwise, they're selling you something.
Cleverness is rarely rewarded in cryptography. Go with tried and true solutions.
A computer usually has many readily available physical sources of random noise:
Microphone (hopefully in a noisy place)
Compressed video from a webcam (pointed to something variable, like a lava lamp or a street)
Keyboard & mouse timing
Network packet content and timing (the whole world contributes)
And sometimes
Clock drift based hardware
Geiger counters and other detectors of rare events
All sorts of sensors attached to A/D converters
What's difficult is estimating the entropy of these sources, which is in most cases low despite the high data rates and very variable; but entropy can be estimated with conservative assumptions, or at least not wasted, to feed systems like Yarrow or Fortuna.
It's not possible to obtain 'true' random numbers, a computer is a logical construct that can't possibly create 'truly' random anything, only pseudo-random. There are better and worse pseudo-random algorithms out there, however.
In order to obtain a 'truly' random number you need a physical random source, some gambling machines actually have these built in - often it's a radioactive source, the radioactive decay (which as far as I know is truly random) is used to generate the numbers.
One of the best method to generate a random number is through Clock Drift. This primarily works with two oscillators.
An analogy of how this works is imagine a race car on a simple oval circuit with a while line at the start of the lap and also a while line on one of the tyres. When the car completes a lap, a number will be generated based on the difference between the position of the white line on the road and on the tyre.
Very easy to generate and impossible to predict.

Resources