True random number generator [closed] - algorithm

Sorry for this not being a "real" question, but some time back I remember seeing a post here about randomizing a randomizer randomly to generate truly random numbers, not just pseudo-random ones. I can't find it when I search for it.
Does anybody know about that article?

I have to disagree with a lot of the answers to this question.
It is possible to collect random data on a computer. SSL, SSH and VPNs would not be secure if you couldn't.
The way software random number generators work is that there is a pool of random data gathered from many different places, such as clock drift, interrupt timings, etc.
The trick to these schemes is in correctly estimating the entropy (the posh name for the randomness). It doesn't matter whether the source is biased, as long as you estimate the entropy correctly.
To illustrate this, the chance of me hitting the letter e in this comment is much higher than that of z, so if I were to use key interrupts as a source of entropy it would be biased, but there is still some randomness to be had in that input. You can't predict exactly which sequence of letters will come next in this paragraph. You can extract entropy from this uncertainty and use it as part of a random byte.
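To put rough numbers on the "e versus z" point, here is a minimal sketch that measures how much uncertainty a biased stream of symbols carries. The function names are made up for illustration, and the figures are only single-symbol estimates: they ignore the correlation between neighbouring letters, so for real text they overestimate the entropy. The input is assumed to be non-empty.

```python
import math
from collections import Counter

def shannon_entropy_per_symbol(text):
    """Empirical Shannon entropy of the symbols in text, in bits per symbol."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def min_entropy_per_symbol(text):
    """Min-entropy in bits per symbol: the conservative figure, based only
    on the most frequent symbol (the attacker's best single guess)."""
    counts = Counter(text)
    p_max = max(counts.values()) / len(text)
    return -math.log2(p_max)

sample = "the quick brown fox jumps over the lazy dog"
print(shannon_entropy_per_symbol(sample))
print(min_entropy_per_symbol(sample))
```

A pool that credits itself with the min-entropy figure rather than the raw bit count is the kind of conservative estimate the answer is describing.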
Good quality real-random generators like Yarrow have quite sophisticated entropy estimation built into them and will only emit as many bytes as they can reliably say they have in their "randomness pool."

I believe that was on thedailywtf.com - ie. not something that you want to do.
It is not possible to get a truly random number from pseudorandom numbers, no matter how many times you call randomize().
You can get "true" random numbers from special hardware. You could also collect entropy from mouse movements and things like that.

At the end of the post, I will answer your question of why you might want to use multiple random number generators for "more randomness".
There are philosophical debates about what randomness means. Here, I will mean "indistinguishable in every respect from a uniform(0,1) iid distribution over the samples drawn". I am totally ignoring philosophical questions of what random is.
Knuth volume 2 has an analysis where he attempts to create a random number generator as you suggest, and then analyzes why it fails, and what true random processes are. Volume 2 examines RNGs in detail.
The others recommend using random physical processes to generate random numbers. However, as we can see in the Espo/vt interaction, these processes can have subtle periodic elements and other non-random elements, in part due to outside factors with deterministic behavior. In general, it is best never to assume randomness, but always to test for it, and you usually can correct for such artifacts if you are aware of them.
It is possible to create an "infinite" stream of bits that appears completely random, deterministically. Unfortunately, such approaches grow in memory with the number of bits asked for (as they would have to, to avoid repeating cycles), so their scope is limited.
In practice, you are almost always better off using a pseudo-random number generator with known properties. The key numbers to look for are the phase-space dimension (roughly the offset between samples that you can still count on being uniformly distributed), the bit-width (the number of bits in each sample which are uniformly random with respect to each other), and the cycle size (the number of samples you can take before the distribution starts repeating).
However, since random numbers from a given generator follow a known deterministic sequence, your procedure might be exposed by someone searching through the generator's output and finding an aligning sequence. Therefore, you can likely avoid your distribution being immediately recognized as coming from a particular random number generator if you maintain two generators. From the first, you sample i, and then map this uniformly over one to n, where n is at most the phase dimension. Then, from the second you sample i times, and return the ith result. This will reduce your cycle size to (original cycle size/n) in the worst case, but for that cycle it will still generate uniform random numbers, and do so in a way that makes the search for alignment exponential in n. It will also reduce the independent phase length. Don't use this method unless you understand what reduced cycle and independent phase lengths mean to your application.
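The two-generator scheme just described can be sketched as follows. This is only an illustration: the class name and parameters are invented here, both generators are Python's built-in Mersenne Twister (which is not cryptographically secure), and n is simply passed in rather than derived from any phase dimension.

```python
import random

class SkippingGenerator:
    """One generator picks a skip count i in 1..n; a second generator is
    sampled i times and only the i-th result is returned."""

    def __init__(self, selector_seed, source_seed, n):
        self._selector = random.Random(selector_seed)
        self._source = random.Random(source_seed)
        self._n = n

    def next(self):
        i = self._selector.randint(1, self._n)  # uniform over 1..n inclusive
        for _ in range(i - 1):
            self._source.random()               # discard the first i-1 samples
        return self._source.random()            # the i-th sample

gen = SkippingGenerator(selector_seed=1, source_seed=2, n=8)
print([round(gen.next(), 3) for _ in range(5)])
```

Note that every returned value costs up to n draws from the second generator, which is where the reduced cycle size mentioned above comes from.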

An algorithm for truly random numbers cannot exist as the definition of random numbers is:
"Having unpredictable outcomes and, in the ideal case, all outcomes equally probable; resulting from such selection; lacking statistical correlation."
There are better or worse pseudorandom number generators (PRNGs), i.e. completely predictable sequences of numbers that are difficult to predict without knowing a piece of information, called the seed.
Now, PRNGs for which it is extremely hard to infer the seed are cryptographically secure. You might want to look them up in Google if that is what you seek.
Another way (whether this is truly random or not is a philosophical question) is to use random sources of data. For example, unpredictable physical quantities, such as noise, or measuring radioactive decay.
These are still subject to attacks because they can be independently measured, have biases, and so on. So it's really tricky. This is done with custom hardware, which is usually quite expensive. I have no idea how good /dev/random is, but I would bet it is not good enough for cryptography (most cryptography programs come with their own RNG and Linux also looks for a hardware RNG at start-up).

According to Wikipedia /dev/random, in Unix-like operating systems, is a special file that serves as a true random number generator.
The /dev/random driver gathers environmental noise from various non-deterministic sources including, but not limited to, inter-keyboard timings and inter-interrupt timings that occur within the operating system environment. The noise data is sampled and combined with a CRC-like mixing function into a continuously updating "entropy-pool". Random bit strings are obtained by taking a MD5 hash of the contents of this pool. The one-way hash function distills the true random bits from pool data and hides the state of the pool from adversaries.
The /dev/random routine maintains an estimate of true randomness in the pool and decreases it every time random strings are requested for use. When the estimate goes down to zero, the routine locks and waits for the occurrence of non-deterministic events to refresh the pool.
The /dev/random kernel module also provides another interface, /dev/urandom, that does not wait for the entropy-pool to re-charge and returns as many bytes as requested. As a result /dev/urandom is considerably faster at generation compared to /dev/random which is used only when very high quality randomness is desired.
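From application code you normally don't open these device files yourself. As a small sketch (the wrapper names are made up for illustration), Python exposes the operating system's generator through os.urandom and the secrets module, which on Unix-like systems draw from the kernel generator described above:

```python
import os
import secrets

def random_key(num_bytes=16):
    """Return num_bytes from the operating system's randomness source."""
    return os.urandom(num_bytes)

def random_index(upper_bound):
    """Return an unpredictable integer in range(upper_bound)."""
    return secrets.randbelow(upper_bound)

print(random_key(16).hex())
print(random_index(1000))
```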

John von Neumann once said something to the effect of "anyone attempting to generate random numbers via algorithmic means is, of course, living in sin."
Not even /dev/random is random, in a mathematician's or a physicist's sense of the word. Not even radioisotope decay measurement is random. (The decay rate is. The measurement isn't. Geiger counters have a small reset time after each detected event, during which time they are unable to detect new events. This leads to subtle biases. There are ways to substantially mitigate this, but not completely eliminate it.)
Stop looking for true randomness. A good pseudorandom number generator is really what you're looking for.

If you believe in a deterministic universe, true randomness doesn't exist. :-) For example, someone has suggested that radioactive decay is truly random, but IMHO, just because scientists haven't yet worked out the pattern, doesn't mean that there isn't a pattern there to be worked out. Usually, when you want "random" numbers, what you need are numbers for encryption that no one else will be able to guess.
The closest you can get to random is to measure something natural that no enemy would also be able to measure. Usually you throw away the most significant bits from your measurement, leaving numbers which are more likely to be evenly spread. Hard core random number users get special hardware that measures radioactive events, but you can get some randomness from the human using the computer, from things like keypress intervals and mouse movements, and, if the computer doesn't have direct users, from CPU temperature sensors and from network traffic. You could also use things like web cams and microphones connected to sound cards, but I don't know if anyone does.

To summarize some of what has been said, our working definition of a secure source of randomness is similar to our definition of cryptographically secure: it appears random if smart folks have looked at it and weren't able to show that it isn't completely unpredictable.
There is no system for generating random numbers which couldn't conceivably be predicted, just as there is no cryptographic cipher that couldn't conceivably be cracked. The trusted solutions used for important work are merely those which have proven to be difficult to defeat so far. If anyone tells you otherwise, they're selling you something.
Cleverness is rarely rewarded in cryptography. Go with tried and true solutions.

A computer usually has many readily available physical sources of random noise:
Microphone (hopefully in a noisy place)
Compressed video from a webcam (pointed to something variable, like a lava lamp or a street)
Keyboard & mouse timing
Network packet content and timing (the whole world contributes)
And sometimes
Clock drift based hardware
Geiger counters and other detectors of rare events
All sorts of sensors attached to A/D converters
What's difficult is estimating the entropy of these sources, which is in most cases low despite the high data rates and very variable; but entropy can be estimated with conservative assumptions, or at least not wasted, to feed systems like Yarrow or Fortuna.

It's not possible to obtain 'true' random numbers; a computer is a logical construct that can't possibly create 'truly' random anything, only pseudo-random. There are better and worse pseudo-random algorithms out there, however.
In order to obtain a 'truly' random number you need a physical random source. Some gambling machines actually have these built in; often it's a radioactive source, and the radioactive decay (which as far as I know is truly random) is used to generate the numbers.

One of the best methods to generate a random number is through clock drift. This primarily works with two oscillators.
An analogy of how this works: imagine a race car on a simple oval circuit with a white line at the start of the lap and also a white line on one of the tyres. When the car completes a lap, a number is generated based on the difference between the position of the white line on the road and the one on the tyre.
Very easy to generate and impossible to predict.

Related

What are typical means by which a random number can be generated in an embedded system?

What are typical means by which a random number can be generated in an embedded system? Can you offer advantages and disadvantages for each method, and/or some factors that might make you choose one method over another?
First, you have to ask a fundamental question: do you need unpredictable random numbers?
For example, cryptography requires unpredictable random numbers. That is, nobody must be able to guess what the next random number will be. This precludes any method that seeds a random number generator from common parameters such as the time: you need a proper source of entropy.
Some applications can live with a non-cryptographic-quality random number generator. For example, if you need to communicate over Ethernet, you need a random number generator for the exponential back-off; statistical randomness is enough for this¹.
Unpredictable RNG
You need an unpredictable RNG whenever an adversary might try to guess your random numbers and do something bad based on that guess. For example, if you're going to generate a cryptographic key, or use many other kinds of cryptographic algorithms, you need an unpredictable RNG.
An unpredictable RNG is made of two parts: an entropy source, and a pseudo-random number generator.
Entropy sources
An entropy source kickstarts the unpredictability. Entropy needs to come from an unpredictable source or a blend of unpredictable sources. The sources don't need to be fully unpredictable, they need to not be fully predictable. Entropy quantifies the amount of unpredictability. Estimating entropy is difficult; look for research papers or evaluations from security professionals.
There are three approaches to generating entropy.
Your device may include some non-deterministic hardware. Some devices include a dedicated hardware RNG based on physical phenomena such as unstable oscillators, thermal noise, etc. Some devices have sensors which capture somewhat unpredictable values, such as the low-order bits of light or sound sensors.
Beware that hardware RNGs often have precise usage conditions. Most methods require some time after power-up before their output is truly random. Often environmental factors such as extreme temperatures can affect the randomness. Read the RNG's usage notes very carefully. For cryptographic applications, it is generally recommended to run statistical tests on the HRNG's output and refuse to operate if these tests fail.
Never use a hardware RNG directly. The output is rarely fully unpredictable — e.g. each bit may have a 60% probability of being 1, or the probability of two consecutive bits being equal may be only 48%. Use the hardware RNG to seed a PRNG as explained below.
You can preload a random seed during manufacturing and use that afterwards. Entropy doesn't wear off when you use it²: if you have enough entropy to begin with, you'll have enough entropy during the lifetime of your device. The danger with keeping entropy around is that it must remain confidential: if the entropy pool accidentally leaks, it's toast.
If your device has a connection to a trusted third party (e.g. a server of yours, or a master node in a sensor network), it can download entropy from that (over a secure channel).
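To make the "60% probability of being 1" example from the hardware approach concrete, here is a rough worked calculation of how many raw bits you would have to collect before seeding. It is a sketch under a strong assumption: the raw bits are independent, so it ignores the consecutive-bit correlation also mentioned above, and a real design should use a vetted entropy estimate. The function names and the choice of SHA-256 as the conditioning step are illustrative only.

```python
import hashlib
import math

def min_entropy_per_bit(p_most_likely):
    """Min-entropy carried by one raw bit whose likeliest value has this probability."""
    return -math.log2(p_most_likely)

def raw_bits_needed(p_most_likely, target_entropy_bits):
    """Raw bits to collect so the sample holds at least the target entropy."""
    return math.ceil(target_entropy_bits / min_entropy_per_bit(p_most_likely))

def condition(raw_sample):
    """Compress a raw (biased) sample into a fixed-size seed for a PRNG."""
    return hashlib.sha256(raw_sample).digest()

# A bit that is 1 with probability 0.6 carries about 0.737 bits of min-entropy,
# so a 128-bit seed needs at least 174 raw bits instead of 128.
print(raw_bits_needed(0.6, 128))  # 174
```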
Pseudo-random number generator
A PRNG, also called deterministic random bit generator (DRBG), is a deterministic algorithm that generates a sequence of random numbers by transforming an internal state. The state must be seeded with sufficient entropy, after which the PRNG can run practically forever. Cryptographic-quality PRNG algorithms are based on cryptographic primitives; always use a vetted algorithm (preferably some well-audited third-party code if available).
The PRNG needs to be seeded with entropy. You can choose to inject entropy once during manufacturing, or at each boot, or periodically, or any combination.
Entropy after a reboot
You need to take care that the device doesn't boot twice in the same RNG state: otherwise an observer can repeat the same sequence of RNG calls after a reset and will know the RNG output the second time round. This is an issue for factory-injected entropy (which by definition is always the same) as well as for entropy derived from sensors (which takes time to accumulate).
If possible, save the RNG state to persistent storage. When the device boots, read the RNG state, apply some transformation to it (e.g. by generating one random word), and save the modified state. After this is done, you can start returning random numbers to applications and system services. That way, the device will boot with a different RNG state each time.
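The read, transform, save sequence might look like the following sketch. Everything here is an assumption made for illustration: the seed lives in an ordinary file, SHA-256 with two different prefixes stands in for "apply some transformation", and the file name and labels are invented. A real device would use whatever persistent storage and vetted DRBG it actually has.

```python
import hashlib
import os

SEED_SIZE = 32  # bytes

def boot_rng_state(seed_path):
    """Read the stored seed, persist a different one for the next boot,
    and return the working state for this boot."""
    with open(seed_path, "rb") as f:
        old_seed = f.read()
    if len(old_seed) < SEED_SIZE:
        raise ValueError("stored seed is too short")

    # Two distinct derivations: knowing one does not directly give the other.
    next_seed = hashlib.sha256(b"next-boot-seed" + old_seed).digest()
    working_state = hashlib.sha256(b"working-state" + old_seed).digest()

    # Save the new seed BEFORE handing out any random numbers.
    tmp_path = seed_path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(next_seed)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, seed_path)
    return working_state
```

Writing to a temporary file and renaming it is one way to avoid ending up with a half-written seed if power is lost during the update.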
If this is not possible, you need to be very careful. If your device has factory-injected entropy plus a reliable clock, you can mix the clock value into the RNG state to achieve uniqueness; however, beware that if your device loses power and the clock restarts from some fixed origin (blinking twelve), you'll be in a repeatable state.
Predictable RNG state after a reset or at the first boot is a common problem with embedded devices (and with servers). For example, a study of RSA public keys showed that many had been generated with insufficient entropy, resulting in many devices generating the same key³.
Statistical RNG
If you can't achieve cryptographic quality, you can fall back to a weaker RNG. You need to be aware that some applications (including a lot of cryptography) will be impossible.
Any RNG relies on a two-part structure: a unique seed (i.e. an entropy source) and a deterministic algorithm based on that seed.
If you can't gather enough entropy, at least gather as much as possible. In particular, make sure that no two devices start from the same state (this can usually be achieved by mixing the serial number into the RNG seed). If at all possible, arrange for the seed not to repeat after a reset.
The only excuse not to use a cryptographic DRBG is if your device doesn't have enough computing power. In that case, you can fall back to a faster algorithm that allows observers to guess some numbers based on the RNG's past or future output. The Mersenne twister is a popular choice, but there have been improvements since its invention.
¹ Even this is debatable: with non-crypto-quality random backoff, another device could cause a denial of service by aligning its retransmission time with yours. But there are other ways to cause a DoS, by transmitting more often.
² Technically, it does, but only at an astronomical scale.
³ Or at least with one factor in common, which is just as bad.
One way to do it would be to create a Pseudo Random Bit Sequence, just a train of zeros and ones, and read the bottom bits as a number.
PRBS can be generated by tapping bits off a shift register, doing some logic on them, and using that logic to produce the next bit shifted in. Seed the shift register with any non-zero number. There is math that tells you which bits you need to tap off of to generate a maximum length sequence (i.e., 2^N-1 numbers for an N-bit shift register). There are tables out there for 2-tap, 3-tap, and 4-tap implementations. You can find them if you search on "maximal length shift register sequences" or "linear feedback shift register".
from: http://www.markharvey.info/fpga/lfsr/
HOROWITZ AND HILL gave a great part of a chapter on this. Most of the math surrounds the nature of the PRBS and not the number you generate with the bit sequence. There are some papers out there on the best ways to get a number out of the bit sequence and improving correlation by playing around with masking the bits you use to generate the random number, e.g., Horan and Guinee, Correlation Analysis of Random Number Sequences based on Pseudo Random Binary Sequence Generation, In the Proc. of IEEE ISOC ITW2005 on Coding and Complexity; editor M.J. Dinneen; co-chairs U. Speidel and D. Taylor; pages 82-85
An advantage would be that this can be achieved simply by bitshifting and simple bit logic operations. A one-liner would do it. Another advantage is that the math is pretty well understood. A disadvantage is that this is only pseudorandom, not random. Also, I don't know much about random numbers, and there might be better ways to do this that I simply don't know about.
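For a feel of how little code this takes, here is a sketch of a 16-bit shift register generator. The tap positions (16, 14, 13, 11) are a commonly tabulated maximal-length choice and are my pick for the example, not something from the references above; the function names are also just illustrative.

```python
def lfsr16_step(state):
    """Advance a 16-bit Fibonacci LFSR with taps at bits 16, 14, 13 and 11."""
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (bit << 15)

def prbs_bits(seed, count):
    """Return count pseudo-random bits; the seed must be non-zero."""
    state = seed & 0xFFFF
    if state == 0:
        raise ValueError("an all-zero register never leaves the zero state")
    bits = []
    for _ in range(count):
        bits.append(state & 1)      # output the bit about to be shifted out
        state = lfsr16_step(state)
    return bits

def lfsr16_period(seed):
    """Count steps until the register returns to its starting state."""
    state = lfsr16_step(seed)
    steps = 1
    while state != seed:
        state = lfsr16_step(state)
        steps += 1
    return steps

print(prbs_bits(0xACE1, 16))
print(lfsr16_period(0xACE1))  # 65535, i.e. 2^16 - 1
```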
How much energy you expend on this would depend on how random you need the number to be. If I were running a gambling site, and needed random numbers to generate deals, I wouldn't depend on Pseudo Random Bit Sequences. In those cases, I would probably look into analog noise techniques, maybe Johnson Noise around a big honking resistor or some junction noise on a PN junction, amplify that and sample it. The advantages of that are that if you get it right, you have a pretty good random number. The disadvantages are that sometimes you want a pseudorandom number where you can exactly reproduce a sequence by storing a seed. Also, this uses hardware, which someone must pay for, instead of a line or two of code, which is cheap. It also uses A/D conversion, which is yet another peripheral to use. Lastly, if you do it wrong -- say make a mistake where 60Hz ends up overwhelming your white noise-- you can get a pretty lousy random number.
What are typical means by which a random number can be generated in an embedded system?
Giles indirectly stated this: it depends on the use.
If you are using the generator to drive a simulation, then all you need is a uniform distribution and a linear congruential generator (LCG) will work fine.
If you need a secure generator, then it's a trickier problem. I'm side-stepping what it means to be secure, but from 10,000 feet think "wrap it in a cryptographic transformation", like a SHA-1/HMAC or SHA-512/HMAC. There are other ways, like sampling random events, but they may not be viable.
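As a 10,000-foot illustration of "wrap it in a cryptographic transformation", the sketch below runs a counter through HMAC-SHA-512 keyed with a secret seed. It is deliberately simplified and is not a standardized DRBG: it has no reseeding and no protection if the key leaks later, and the class name is invented for this example. For real use, pick a vetted implementation.

```python
import hashlib
import hmac

class HmacCounterGenerator:
    """Toy generator: output block k is HMAC-SHA-512(seed, k)."""

    def __init__(self, seed):
        if len(seed) < 32:
            raise ValueError("seed should carry at least 256 bits")
        self._key = seed
        self._counter = 0

    def read(self, num_bytes):
        out = bytearray()
        while len(out) < num_bytes:
            message = self._counter.to_bytes(16, "big")
            out += hmac.new(self._key, message, hashlib.sha512).digest()
            self._counter += 1
        return bytes(out[:num_bytes])  # unused tail of the last block is dropped
```

The output is only as unpredictable as the seed, which is why the rest of this answer is about getting a good one.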
When you need secure random numbers, some low resource devices are notoriously difficult to work with. See, for example, Mining Your Ps and Qs: Detection of Widespread Weak Keys in Network Devices and Traffic sensor flaw that could allow driver tracking fixed. And a caveat for Linux 3.0 kernel users: the kernel removed a couple of entropy sources, so entropy depletion and starvation might have gotten worse. See Appropriate sources of entropy on LWN.
If you have a secure generator, then your problem becomes getting your hands on a good seed (or seeds over time). One of the better methods I have seen for environments that are constrained is Hedging. Hedging was proposed for Virtual Machines where a program could produce the same sequence after a VM reset.
The idea for hedging is to extract the randomness provided by your peer, and use it to keep your secure generator fit. For example, in the case of TLS, there is a client_random and a server_random. If the device is a server, then it would stir in the client_random. If the device is a client, then it would stir in server_random.
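A toy version of "stirring in" the peer's value is sketched below. It only illustrates the idea as described here, not the constructions from the hedging papers; the pool class and its hash-based mixing are assumptions made for the example.

```python
import hashlib

class HedgedPool:
    """Tiny pool that can absorb a peer-supplied random value."""

    def __init__(self, initial_state):
        self._state = hashlib.sha256(b"init" + initial_state).digest()

    def stir(self, peer_random):
        # e.g. the client_random if we are the server, server_random if the client
        self._state = hashlib.sha256(b"stir" + self._state + peer_random).digest()

    def output(self):
        out = hashlib.sha256(b"out" + self._state).digest()
        self._state = hashlib.sha256(b"next" + self._state).digest()
        return out

# Two instances restored from the same snapshot (as after a VM reset)
# diverge as soon as they absorb different peer values.
a = HedgedPool(b"same snapshot state")
b = HedgedPool(b"same snapshot state")
a.stir(b"client_random from connection 1")
b.stir(b"client_random from connection 2")
print(a.output() != b.output())  # True
```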
You can find the two papers of interest that address hedging at:
When Good Randomness Goes Bad: Virtual Machine Reset Vulnerabilities and Hedging Deployed Cryptography
When Virtual is Harder than Real: Resource Allocation Challenges in Virtual Machine Based IT Environments
Using client_random and a server_random is consistent with Peter Gutmann's view on the subject: "mix every entropy source you can get your hands on into your PRNG, including less-than-perfect ones". Gutmann is the author of Engineering Security.
Hedging only solves part of the problem. You will still need to solve other problems, like how to bootstrap the entropy pool, how to regenerate system key pairs when the pool is in a bad state, and how to persist the entropy across reboots when there's no filesystem.
Although it may not be the most complex or sound method, it can be fun to use external stimuli as your seed for random number generation. Consider using analogue input from a photodiode, or a thermistor. Even random noise from a floating pin could potentially yield some interesting results.

Is there a somewhat-reliable way to detect that a list of integers came from a common PRNG?

Basically I'm looking for a detective function. I pass it a list of integers (probably between 20 and 100 integers) and it tells me "Yeah, 84% chance this came from a PRNG, I tested it against the main ones that most modern programming languages use", or "No, only 12% chance this came from a well-known PRNG".
If it helps (or hinders), the integers will always be between 1 and 999.
Does this exist?
Unless you are prepared to break new ground in number theory, you would only be able to detect obsolete, badly designed, or poorly seeded PRNGs. Good PRNGs are explicitly designed to prevent what you are trying to do. Random number generation is a critical part of digital cryptography, so a lot of effort goes into producing random numbers that meet all known tests.
There are batteries of tests to profile PRNGs. See for example this NIST page.
As the comments point out, the first two sentences are overstated and are only strictly true for PRNGs that may be used in cryptography. Weaker (i.e. more predictable) PRNGs might be chosen for other domains in order to improve time or space performance.
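To give an idea of what one test in such a battery looks like, here is a sketch of the simplest one in the NIST suite, the frequency (monobit) test. It only checks whether ones and zeros are roughly balanced, so passing it says very little on its own; the function name is illustrative and the input is assumed to be a non-empty sequence of 0/1 values.

```python
import math

def monobit_p_value(bits):
    """Frequency (monobit) test: p-value for the hypothesis that ones and
    zeros are equally likely. Small values (e.g. below 0.01) suggest bias."""
    n = len(bits)
    s = sum(1 if b else -1 for b in bits)   # +1 for each 1, -1 for each 0
    s_obs = abs(s) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))

print(monobit_p_value([1, 0] * 500))   # perfectly balanced: 1.0
print(monobit_p_value([1] * 1000))     # all ones: essentially 0
```

Note that the alternating sequence passes with a perfect score even though it is completely predictable, which is why real batteries run many different tests.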
You can write a battery of tests for a list of candidate generators, but there are a lot of generators, and some have enormous state, where adjacent values of a well-seeded generator will reveal nothing useful and you'll have to wait a long time before you get two data points with an informative relationship.
On the plus side; while the list of random number generators that you might encounter is vast, there are telltale signs that will help you identify some classes of simple generators quickly and then you can perform focussed analysis to derive the specific configuration.
Unfortunately even a simple generator like KISS shows that while the generator can be trivially broken when you know its configuration, it can hide its signature from anything that does not know its configuration, leaving you in a situation where you have to individually test for every possible configuration.
There are quality tests like dieharder and TestU01 which will consume many megabytes of data to identify any weakness in a generator; however, these can also identify weaknesses in real RNGs, so they could give a strong false positive.
With only 100 integers to work from, you would really need to have a list of generators in mind. For example, to detect an LCG used inappropriately, you simply test to see if the bottom three bits cycle through a repeating pattern of 8 values, but this is by far the easiest case.
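That easiest case can be sketched like this. The multiplier and increment are one widely published 32-bit parameter set chosen for the example, and the check assumes you see the generator's raw outputs; once the values have been reduced to a range like 1 to 999 the pattern may no longer be visible this directly.

```python
def lcg_outputs(seed, count, a=1664525, c=1013904223, m=2**32):
    """Raw outputs of a linear congruential generator x -> (a*x + c) mod m."""
    x = seed
    out = []
    for _ in range(count):
        x = (a * x + c) % m
        out.append(x)
    return out

def low_bits_repeat(values, bits=3):
    """True if the lowest `bits` bits repeat with period 2**bits throughout."""
    period = 1 << bits
    mask = period - 1
    low = [v & mask for v in values]
    if len(low) <= period:
        return False  # not enough data to see even one repetition
    return all(low[i] == low[i + period] for i in range(len(low) - period))

print([v & 7 for v in lcg_outputs(42, 16)])   # the same 8 values, twice
print(low_bits_repeat(lcg_outputs(42, 100)))  # True
```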
If you had a sequence of 625 or more 32-bit integers, you could detect with high confidence whether it was from consecutive calls to Mersenne Twister. That is because it leaks state information in the output values.
For an example of how it is done, see this blog entry.
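As a rough sketch of the state leak, the code below inverts the Mersenne Twister output scrambling ("tempering") for 624 consecutive 32-bit outputs and loads the result into a second generator, which then predicts everything that follows. It leans on the fact that Python's random module is a Mersenne Twister and uses its getstate/setstate tuple layout; the helper names are invented for the example. To use it as a detector, you would check whether the 625th observed value matches the clone's prediction.

```python
import random

def _undo_right_shift_xor(y, shift):
    """Invert y = x ^ (x >> shift) for 32-bit x."""
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ (x >> shift)
    return x

def _undo_left_shift_xor(y, shift, mask):
    """Invert y = x ^ ((x << shift) & mask) for 32-bit x."""
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ ((x << shift) & mask)
    return x

def untemper(y):
    """Recover the internal state word from one 32-bit output."""
    y = _undo_right_shift_xor(y, 18)
    y = _undo_left_shift_xor(y, 15, 0xEFC60000)
    y = _undo_left_shift_xor(y, 7, 0x9D2C5680)
    y = _undo_right_shift_xor(y, 11)
    return y

def clone_mt19937(outputs):
    """Build a generator that continues the sequence after 624 observed outputs."""
    if len(outputs) != 624:
        raise ValueError("exactly 624 consecutive 32-bit outputs are required")
    state = tuple(untemper(v) for v in outputs)
    clone = random.Random()
    clone.setstate((3, state + (624,), None))  # index 624: regenerate on next draw
    return clone

victim = random.Random()
observed = [victim.getrandbits(32) for _ in range(624)]
clone = clone_mt19937(observed)
print(clone.getrandbits(32) == victim.getrandbits(32))  # True
```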
Similar results are in theory possible when you don't have ideal data such as full 32-bit integers, but you would need a longer sequence and the maths gets harder. You would also need to know - or perhaps guess by trying obvious options - how the numbers were being reduced from the larger range to the smaller one.
Similar results are possible from other PRNGs, but generally only the non-cryptographic ones.
In principle you could identify specific PRNG sequences with very high confidence, but even simple barriers such as missing numbers from the strict sequence can make it a lot harder. There will also be many PRNGs that you will not be able to reliably detect, and typically you will either have close to 100% confidence of a match (to a hackable PRNG) or 0% confidence of any match.
Whether or not a PRNG is hackable (and therefore could be detected by the numbers it emits) is not a general indicator of PRNG quality. Obviously, "hackable" is opposite to a requirement for "secure", so don't consider Mersenne Twister for creating unguessable codes. However, do consider it as a source of randomness for e.g. neural networks, genetic algorithms, Monte Carlo simulations and other places where you need a lot of statistically random-looking data.

True random number generation

How exactly is it that we talk about "true random" numbers when we are actually measuring something? I mean, isn't measuring almost the opposite of randomness?
Some articles say that, for example, throwing a die is "true random". Of course it isn't pseudo-random, but is it even random? If you had a machine that threw dice from exactly the same position, always in the same direction and with exactly the same force, wouldn't it always turn up the same number? (I think it would.)
Please, can someone help me understand "true random" numbers?
Randomness is essentially a measure of how much we don't know. The universe may or may not be truly deterministic, it doesn't matter - we don't know (and have no foreseeable way of knowing) what the exact time between 2 cosmic ray impacts will be. For pseudorandom numbers, we do, in principle, have a way of knowing, because we can recreate the initial conditions and get the same output again.
Quantum effects are the source of this "True Randomness". E.g. the Heisenberg Uncertainty Principle says that your dice thrower can't exactly define both the momentum and the location of its throwing arm. (Reading up on pop-sci quantum physics can be scary - the predictability and stability of our world seems to be no more than a great feat of statistics.)
[edit] Since it came up in the comments: There are other, less "obscure" processes "looking random", e.g. wear and air turbulence for a die roll. However, all these things could be argued to be beyond our knowledge but fundamentally deterministic (assuming an objective reality.) Quantum processes are truly random at least under the widely accepted Copenhagen interpretation. [/edit]
There are - as mentioned in other replies - appliances that turn quantum effects into observable random number generators. There are algorithms to "extract" the randomness of any stream of data. There are test algorithms to check if a stream of data "behaves" like a random stream.
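One classic example of such an "extraction" algorithm is von Neumann's debiasing trick, sketched below (my choice of example, not one named in this thread). It assumes the input bits are independent with a fixed bias; for correlated input it does not give unbiased output.

```python
def von_neumann_extract(bits):
    """Look at non-overlapping pairs: 01 -> 0, 10 -> 1, 00 and 11 are dropped."""
    out = []
    for first, second in zip(bits[0::2], bits[1::2]):
        if first != second:
            out.append(first)
    return out

print(von_neumann_extract([0, 1, 1, 0, 1, 1, 0, 0]))  # [0, 1]
```

The price is throughput: the more biased the source, the more pairs get thrown away.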
OTOH you can argue rather successfully that "random" is a man-made concept, i.e. something that isn't integral part of the objective world, but our limit of understanding (though the uncertainty principle is considered to be not just an observer effect).
When someone asks for any random number generator, the counter question should be: for what application? In the context of this discussion: who do you need to fool? Pseudo vs. True are just generation mechanisms, not fundamental opposites.
In that sense, chaotic behavior is often "random enough" for most purposes, and can already be created with few degrees of freedom.
I think that when someone talks about "true random" numbers in IT this is always from measuring/observing something that is thought to be random in contrast to the pseudo-random algorithms that will always return the very same pattern (given the same starting point or after wrapping around after a certain length). For example, I've heard about devices that measure the electric noise produced by some components like transistors. This is indeed "more" random than a deterministic algorithm.
To increase the "randomness" I know that for example Linux tries to incorporate various external events into its random number generator, for example mouse movements, key presses (AFAIK even duration of key presses), timings from the HD, etc. That is, they try to improve the deterministic algorithm by adding non-deterministic sources to it.
For true randomness you'll need to observe physical events. Try this.
True random numbers are those impossible to predict even when you have all the information you can currently collect. For example, the decay of radioactive atoms, wind direction and velocity at different places in the world or even the noise generated by a webcam (this list is in decreasing degrees of impossibility to predict.) There is no guarantee that what's random now will be random a thousand years from now.
Pseudo-random numbers are entirely possible to predict with the right information, either by exploiting flaws or by knowing the seeds.
To get as close as possible to true random numbers in a computer, you'd need some special hardware.
The crucial difference is that we currently don't know how to predict stuff considered random, but we do currently know how to predict pseudo random numbers.
See this question for all the information you could possibly want about this.
I suppose, theoretically, a precise machine could be built that could skew the results of a die throw. In practice, though, there is always some level of variation that can't be predicted. That's where the randomness comes from. Certainly when a person throws a die, there is so much variation in each throw that the result is "truly random".
Computers can generate "true random" numbers by making use of random phenomena like quantum mechanical effects, or electro-magnetic noise.
On a computer (Quartz) you can't generate true random numbers, because 2+2 is always 4. Your random numbers can therefore only be pseudo-random, and how good they are depends on how well the output is hashed.
True randomization is a problem when you are working with logic, because logic isn't random (at least not if it's working correctly). That's the reason why some cryptographic programs ask you to move your mouse in a random pattern, since it's hard to reverse engineer you ;)
Anyway, as #DarkDust said, and #mdrg mentioned, you have to rely on physical observations. An example would be to hook up a radiation meter and observe when some radioactive material decays. Or measure the wind speed outside. Or measure the noise in some transistor. With some mathematical transformation, it's then impossible (apart from brute force) to reverse engineer that random number.
Randomness is really important for a large set of problem solving techniques in AI, economics, physics etc. The need to impose a probability distribution over a set of possible outcomes drives the need for better and better random number generation.
That said, true randomness is probably a debatable concept. Deterministically speaking it shouldn't happen - a la your dice tossing example. I think this is kind of a sensitive argument for philosophers. In reality we can take 'random' measurement with a geiger counter and some radioactive material. In an ideal setting this gives us a pretty good result made by measurement.
From a human perspective the randomness of our number generators only needs to achieve a certain probability of being random given a priori knowledge of the desired complexity of the outcome the random numbers are going to be required for.
If you think about using Bayes' principle, given the degree of true randomness measured by some arbitrary notion of how good your random numbers are (in the form of a probability distribution), then you can say something about the 'trueness' of man-made random number generation. In fact the 'trueness' will approach zero, as the period of a truly random number generator is infinite. This only matters when you get that far, but we can't, so 'truly random' is a pretty useless distinction for computer scientists who know how to design a nice pseudo-random (everything is pseudo-random relative to some scale) number generator.
Experiments have shown that coin tossing by a human is not random - it appears that there is roughly a 51% chance that the face upwards when the coin is tossed will show when it lands.
Any physical event that is based on very large numbers is likely to generate true random numbers - examples are white noise or the last few digits of the number of transactions in a day on a major stock market.
Measurement is not the opposite of randomness. Measuring randomness can only be done on very large numbers of the random event, and is statistical in nature. What measuring randomness does is look for patterns in the event at different levels - single events, runs of two events, runs of three events etc. A pseudo random generator will generate patterns, if only the full cycle of the generator, but the better generators show fewer patterns.
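A minimal Python sketch of that "look for patterns at different levels" idea, assuming the data is a list of bits. The function name `pattern_counts` is made up for the example. The alternating sequence looks perfectly balanced at the single-event level and only fails once runs of two are counted.

```python
from collections import Counter

def pattern_counts(bits, n):
    """Count every overlapping run of n consecutive bits."""
    return Counter(tuple(bits[i:i + n]) for i in range(len(bits) - n + 1))

bits = [0, 1] * 50  # strictly alternating, obviously not random

singles = pattern_counts(bits, 1)  # 50 zeros and 50 ones: looks balanced
pairs = pattern_counts(bits, 2)    # only (0, 1) and (1, 0) ever occur

print(singles)
print(pairs)
```

A real test suite would compare these counts against their expected values statistically and over much longer sequences.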
From Japan, we are producing modules and PC-boards for True random number generator with the self check function.
I think you can study what true randomness is from our "theory" web page, since knowing how to check the randomness of random numbers is equivalent to understanding true randomness.
Please visit our web site, www.letech-rng.jp, and you can see, we joined Monte-Carlo conference 2010, and presented this theory. And also, you can download our paper at the conference, if you like.
Any number produced by applying classical physics cannot be truly random, because the parameters can be known and outcomes can be influenced by outside interference. The throw of the dice for example is not random. However, since influencing or determining the result of the throw would be very complicated, most people would call this a "true" random result. For all intents and purposes, it can be considered random. But strictly speaking, it is not truly random. Even the weather is not random. It can (theoretically) be influenced and predicting it is immensely complicated. In theory, you can know all parameters that influence it. In practice, you can't, but that's not good enough for true randomness, where actual theoretical impossibility of prediction or influence is a must.
The only true source of randomness, where the result is not predictable even when all involved parameters are known and outside interference cannot influence the result in any predictable manner, is the observation of certain quantum events. It has been mathematically proven that quantum behavior is unpredictable. Radioactive decay, for example. Random number generators based on radioactive decay do actually exist. An easier source of true randomness is the observation of photons reflecting off of a semi-transparent mirror. Such RNGs also exist. A search for "quantum random number generators" should give some quite interesting reads.
I have created a random pad using microphone audio input of the room noise combined with a pseudorandom. This is the only possible way I could think of (adding some kind of an analog, unpredicted, signal) to create true randomness.

What Type of Random Number Generator is Used in the Casino Gaming Industry? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
Given the extremely high requirements for unpredictability to prevent casinos from going bankrupt, what random number generation algorithm and seeding scheme is typically used in devices like slot machines, video poker machines, etc.?
EDIT: Related questions:
Do stateless random number generators exist?
True random number generator
Alternative Entropy Sources
There are many things that gaming sites have to consider when choosing or implementing an RNG. Without due diligence, it can go spectacularly wrong.
To get a licence to operate a gaming site in a particular jurisdiction usually requires that the RNG has been certified by an independent third-party. The third-party testers will analyse the source code and run statistical tests (e.g. Diehard) to ensure that the RNG behaves randomly. Reputable poker sites will usually include details of the certification that their RNG has undergone (for example: PokerStars's RNG page).
I've been involved in a few gaming projects, and for one of them I had to design and implement the RNG part, so I had to investigate all of these issues. Most poker sites will use some hardware device for entropy, but they won't rely on just hardware. Usually it will be used in conjunction with a pseudo-RNG (PRNG). There are two main reasons for this. Firstly, the hardware is slow, it can only extract a certain number of bits of entropy in a given time period from whatever physical process it is monitoring. Secondly, hardware fails in unpredictable ways that software PRNGs do not.
Fortuna is the state of the art in terms of cryptographically strong PRNGs. It can be fed entropy from one or more external sources (e.g. a hardware RNG) and is resilient in the face of attempted exploits or RNG hardware failure. It's a decent choice for gaming sites, though some might argue it is overkill.
Pokerroom.com used to just use Java's SecureRandom (they probably still do, but I couldn't find details on their site). This is mostly good enough, but it does suffer from the degrees of freedom problem.
Most stock RNG implementations (e.g. Mersenne Twister) do not have sufficient degrees of freedom to be able to generate every possible shuffle of a 52-card deck from a given initial state (this is something I tried to explain in a previous blog post).
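The arithmetic behind the degrees-of-freedom problem can be checked in a few lines of Python. The 32-bit and 64-bit seed sizes below are assumptions chosen for illustration, not figures taken from any particular site or library.

```python
import math

# Number of distinct orderings of a 52-card deck.
orderings = math.factorial(52)

# Bits of seed/state needed to be able to reach every ordering.
bits_needed = math.ceil(math.log2(orderings))

for seed_bits in (32, 64):
    reachable = 2 ** seed_bits
    fraction = reachable / orderings
    print(f"{seed_bits}-bit seed reaches at most {fraction:.3e} of all shuffles")

print(bits_needed)  # 226
```

So a generator started from a seed of fewer than 226 bits cannot, even in principle, produce every shuffle.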
EDIT: I've answered mostly in relation to online poker rooms and casinos, but the same considerations apply to physical video poker and video slots machines in real world casinos.
For casino gaming applications, I think the seeding of the algorithm is the most important part, to make sure that all games "booted" up don't run through the same sequence or some small set of predictable sequences. That is, the source of entropy leading to the seed for the starting position is the critical thing. Beyond that, any good quality random number generator where each bit position has a ~50/50 probability of being 1/0 and the period is relatively long would be sufficient. For example, something like the Mersenne twister PRNG has such properties.
Using cryptographically secure random generators only becomes important when the actual output of the random generator can be viewed directly. For example, if you were monitoring each number actually generated by the number generator - after viewing many numbers in the sequence - with a non-cryptographic generator information about that sequence can lead to establishing information about all the internal state of the generator. At this point, if you know what the algorithm looks like, you would be able to predict future numbers and that would be bad. The cryptographic generator prevents that reverse engineering back to the internal state so that predicting future numbers becomes "impossible".
However, in the case of a casino game, you would (or should) have no visibility of the actual numbers being generated under the hood. Each time a random number is generated, say a 32-bit number, that number will be used, for example, mod 52 for a deck shuffling algorithm. Nowhere in that process do you have any idea what numbers were being generated by the algorithm to shuffle that deck. That is, most of the bits of "randomness" are just being thrown out, and even the ones being used are not visible to you. Therefore, there is no way to reverse engineer the state.
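For reference, a deck shuffle of the kind described might look like the Python sketch below. It is a Fisher-Yates shuffle. Instead of reducing a raw 32-bit number mod 52, it uses `secrets.randbelow` to pick each index, which is my substitution and not something the answer prescribes. The function name `shuffle_deck` is made up.

```python
import secrets

def shuffle_deck(deck):
    """Fisher-Yates shuffle drawing each index from the OS random source."""
    cards = list(deck)
    for i in range(len(cards) - 1, 0, -1):
        # randbelow(i + 1) is uniform on 0..i, unlike `raw_32_bit % (i + 1)`,
        # which slightly favours small values when 2**32 is not a
        # multiple of i + 1.
        j = secrets.randbelow(i + 1)
        cards[i], cards[j] = cards[j], cards[i]
    return cards

deck = shuffle_deck(range(52))
print(deck[:5])
```

Only the resulting card order is ever visible to a player; the underlying numbers are not.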
Getting back to a true source of entropy to seed the whole process, that is the hard part. See the Wikipedia entry on entropy for some starting points on techniques.
As an aside, if you did want cryptographically secure random numbers from a "regular" algorithm, a simple approach is to take a few random numbers in sequence, concatenate them together and then run something like MD5 or SHA-1 on them; the result is just as random and also cryptographically secure. That is, you just made your own "secure" random number generator.
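A sketch of that concatenate-and-hash idea in Python, with SHA-256 swapped in for the MD5/SHA-1 named above. The helper name `hashed_output` and the choice of four 32-bit words are assumptions for the example. Note that hashing hides the raw outputs but cannot add entropy, so the result is at most as unpredictable as the generator's seed.

```python
import hashlib
import random

def hashed_output(rng, words=4):
    """Concatenate a few 32-bit PRNG outputs and return their SHA-256 digest."""
    raw = b"".join(rng.getrandbits(32).to_bytes(4, "big") for _ in range(words))
    return hashlib.sha256(raw).digest()

rng = random.Random()        # seeded by Python from the OS by default
block = hashed_output(rng)   # 32 bytes; the raw PRNG words are never exposed
print(block.hex())
```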
We've been using the Protego R210-USB TRNG (and the non-USB version before that) as random seed generators in casino applications, with java.security.SecureRandom on top. We had The Swedish National Laboratory of Forensic Science perform a separate audit of the R210, and it passed without a flaw.
You probably need a cryptographically secure pseudo-random generator. There are a lot of variants. Google "Blum-Blum-Shub", for example.
The security properties of these pseudo-random generators will generally be that, even when the attacker can observe polynomially many outputs from such generators, it won't be feasible to guess the next output with a probability much better than random guessing. Also, it is not feasible to distinguish the output of such generators from truly random bits. The security holds even when all the algorithms and parameters are known by the attacker (except for the secret seed).
The security of the generators is often measured with respect to a security parameter. In the case of BBS, it is the size of the modulus. This is no different from other crypto stuff. For example, RSA is secure only when the key is long enough.
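For the curious, Blum-Blum-Shub itself is only a few lines. The Python sketch below uses the textbook recurrence (square the state modulo M = p*q and emit the low bit). The primes 11 and 23 are toy values I picked so the numbers stay readable. A real modulus would be as long as an RSA key, which is exactly the security parameter mentioned above.

```python
from math import gcd

def bbs_bits(seed, count, p=11, q=23):
    """Yield `count` bits from a toy Blum-Blum-Shub generator.

    p and q must be primes congruent to 3 mod 4; these tiny values are
    for illustration only and offer no security.
    """
    modulus = p * q
    if gcd(seed, modulus) != 1:
        raise ValueError("seed must be coprime to p * q")
    x = (seed * seed) % modulus
    for _ in range(count):
        x = (x * x) % modulus
        yield x & 1  # emit the least significant bit of each state

bits = list(bbs_bits(seed=3, count=16))
print(bits)
```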
Note that, the output of such generators may not be uniform (in fact, can be far away from uniform in statistical sense). But since no one can distinguish the two distributions without infinite computing power, these generators will suffice in most applications that require truly random bits.
Bear in mind, however, that these cryptographically secure pseudo-random generators are usually slow. So if speed is indeed a concern, less rigorous approaches may be more relevant, such as using hash functions, as suggested by Jeff.
Casino slot machines generate random numbers continuously at very high speed and use the most recent result(s) when the user pulls the lever (or hits the button) to spin the reels.
Even a simplistic generator can be used. Even if you knew the algorithm used, you cannot observe where in the sequence it is because nearly all the results are discarded. If somehow you did know where it was in the sequence, you'd have to have millisecond or better timing to take advantage of it.
Modern "mechanical reel" machines use PRNGs and drive the reels with stepper motors to simulate the old style spin-and-brake.
http://en.wikipedia.org/wiki/Slot_machine#Random_number_generators
http://entertainment.howstuffworks.com/slot-machine3.htm
Casinos shouldn't be using Pseudo-random number generators, they should be using hardware ones: http://en.wikipedia.org/wiki/Hardware_random_number_generator
I suppose anything goes for apps and offshore gambling nowadays, but all these other answers are incomplete, at least for Nevada Gaming Control Board licensed machines, which I believe the question is originally about.
The technical specifications for RNGs licensed in Nevada for gaming purposes are laid out in Regulation 14.040(2).
As of May 24, 2012, here is a summary of the rules a RNG must follow:
1. Static seeds cannot be used. You have to seed the RNG using a millisecond time source or other true entropy source which has no external readout anywhere on the machine. (This helps to reduce the incidence of "magic number" attacks)
2. The RNG must continue generating numbers in its sequence at least 100 times per second when the game is not being played. (This helps to avoid timing attacks)
3. RNG outputs cannot be reused; they must be used exactly once if at all and then thrown away.
4. Multi-system cabinets must use a separate RNG and separate seed for each game.
5. Games that use RNGs for helping to choose numbers on behalf of the player (such as Lotto Quick Pick) must use a separate RNG for that process.
6. Games must not roll RNGs until they are actually needed in a game. (i.e. you need to wait until the player chooses to deal or spin before generating RNGs)
7. The RNG must pass a 95% confidence chi-squared test based on 10,000 trials as a part of a system test. It must display a warning if this test fails, and it must disable play if it fails twice in a row.
8. It must remember, and be able to report on, the last 10 test results as described in 7.
9. Every possible game outcome must be generatable by the RNG. As a pessimal example, linear congruential generators don't actually generate every possible output in their range, so they're not very useful for gaming.
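To give a feel for the chi-squared self-test in the rules above, here is a rough Python sketch. The choice of 10 equally likely bins, and therefore the critical value for 9 degrees of freedom, is my assumption for illustration. The regulation's actual test procedure is not reproduced here, and the function names are made up.

```python
import secrets

# 95th percentile of the chi-squared distribution with 9 degrees of freedom.
CRITICAL_95_DF9 = 16.919

def chi_squared(counts):
    """Pearson chi-squared statistic against a uniform expectation."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

def passes_self_test(counts):
    """True if the counts are consistent with uniformity at the 95% level."""
    return chi_squared(counts) <= CRITICAL_95_DF9

# 10,000 trials sorted into 10 equally likely bins.
counts = [0] * 10
for _ in range(10_000):
    counts[secrets.randbelow(10)] += 1

print(chi_squared(counts), passes_self_test(counts))
```

A sound generator is expected to fail a 95% test roughly one time in twenty, so a single failure is not proof that something is broken.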
Additionally, your machine design has to be submitted to the gaming commission and it has to be approved, which is expensive and takes lots of time. There are a few third-party companies that specialize in auditing your new RNG to make sure it's random. Gaming Laboratories publishes an even stricter set of standards than Nevada does. They go into much greater detail about the limitations of hardware RNGs, and Nevada in particular likes to see core RNGs that it's previously approved. This can all get very expensive, which is why many developers prefer to license an existing previously-approved RNG for new game projects.
Here is a fun list of random number generator attacks to keep you up late at night.
For super-nerds only: the source for most USB hardware RNGs is typically an avalanche diode. However, the thermal noise produced by this type of diode is not quantum-random, and it is possible to influence the randomness of an avalanche diode by significantly lowering the temperature.
As a final note, someone above recommended just using a Mersenne Twister for random number generation. This is a Bad Idea unless you are taking additional entropy from some other source. The plain vanilla Mersenne Twister is highly inappropriate for gaming and cryptographic applications, as described by its creator.
If you want to do it properly you have to get physical - ERNIE the UK national savings number picker uses a shot noise in Neon tubes.
I for sure have seen a German gambling machine that was not allowed to be run commercially after a given date, so I suppose it was a PRNG with a looong one-time pad seed list.
Most poker sites use hardware random number generators. They will also modify the output to remove any scaling bias and often use 'pots' of numbers which can be 'stirred' using entropic events (user activity, server I/O events, etc.). Quite often the resultant numbers just index pre-generated decks (starting off as a sorted list of cards).

Could a truly random number be generated using pings to pseudo-randomly selected IP addresses?

The question posed came about during a 2nd Year Comp Science lecture while discussing the impossibility of generating truly random numbers in a deterministic computational device.
This was the only suggestion which didn't depend on non-commodity-class hardware.
Subsequently nobody would put their reputation on the line to argue definitively for or against it.
Anyone care to make a stand for or against. If so, how about a mention as to a possible implementation?
No.
A malicious machine on your network could use ARP spoofing (or a number of other techniques) to intercept your pings and reply to them after certain periods. They would then not only know what your random numbers are, but they would also control them.
Of course there's still the question of how deterministic your local network is, so it might not be as easy as all that in practice. But since you get no benefit from pinging random IPs on the internet, you might just as well draw entropy from ethernet traffic.
Drawing entropy from devices attached to your machine is a well-studied principle, and the pros and cons of various kinds of devices and methods of measuring can be e.g. stolen from the implementation of /dev/random.
[Edit: as a general principle, when working in the fundamentals of security (and the only practical needs for significant quantities of truly random data are security-related) you MUST assume that a fantastically well-resourced, determined attacker will do everything in their power to break your system.
For practical security, you can assume that nobody wants your PGP key that badly, and settle for a trade-off of security against cost. But when inventing algorithms and techniques, you need to give them the strongest security guarantees that they could ever possibly face. Since I can believe that someone, somewhere, might want someone else's private key badly enough to build this bit of kit to defeat your proposal, I can't accept it as an advance over current best practice. AFAIK /dev/random follows fairly close to best practice for generating truly random data on a cheap home PC]
[Another edit: it has been suggested in comments that (1) it is true of any TRNG that the physical process could be influenced, and (2) that security concerns don't apply here anyway.
The answer to (1) is that it's possible on any real hardware to do so much better than ping response times, and gather more entropy faster, that this proposal is a non-solution. In CS terms, it is obvious that you can't generate random numbers on a deterministic machine, which is what provoked the question. But then in CS terms, a machine with an external input stream is non-deterministic by definition, so if we're talking about ping then we aren't talking about deterministic machines. So it makes sense to look at the real inputs that real machines have, and consider them as sources of randomness. No matter what your machine, raw ping times are not high on the list of sources available, so they can be ruled out before worrying about how good the better ones are. Assuming that a network is not subverted is a much bigger (and unnecessary) assumption than assuming that your own hardware is not subverted.
The answer to (2) is philosophical. If you don't mind your random numbers having the property that they can be chosen at whim instead of by chance, then this proposal is OK. But that's not what I understand by the term 'random'. Just because something is inconsistent doesn't mean it's necessarily random.
Finally, to address the implementation details of the proposal as requested: assuming you accept ping times as random, you still can't use the unprocessed ping times as RNG output. You don't know their probability distribution, and they certainly aren't uniformly distributed (which is normally what people want from an RNG).
So, you need to decide how many bits of entropy per ping you are willing to rely on. Entropy is a precisely-defined mathematical property of a random variable which can reasonably be considered a measure of how 'random' it actually is. In practice, you find a lower bound you're happy with. Then hash together a number of inputs, and convert that into a number of bits of output less than or equal to the total relied-upon entropy of the inputs. 'Total' doesn't necessarily mean sum: if the inputs are statistically independent then it is the sum, but this is unlikely to be the case for pings, so part of your entropy estimate will be to account for correlation. The sophisticated big sister of this hashing operation is called an 'entropy collector', and all good OSes have one.
If you're using the data to seed a PRNG, though, and the PRNG can use arbitrarily large seed input, then you don't have to hash because it will do that for you. You still have to estimate entropy if you want to know how 'random' your seed value was - you can use the best PRNG in the world, but its entropy is still limited by the entropy of the seed.]
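The hash-and-credit scheme described in the edit above can be sketched in Python as follows. This is a toy, not how any real OS entropy collector is implemented. The class name `EntropyPool`, the use of SHA-256, the sample ping times and the 2 bits credited per sample are all assumptions for illustration.

```python
import hashlib

class EntropyPool:
    """Toy entropy collector: hash samples in, credit a conservative estimate."""

    def __init__(self):
        self._state = hashlib.sha256()
        self._credited_bits = 0.0

    def add(self, sample: bytes, estimated_bits: float) -> None:
        """Mix a raw sample into the pool and credit its estimated entropy."""
        self._state.update(len(sample).to_bytes(4, "big") + sample)
        self._credited_bits += estimated_bits

    def extract(self, n_bytes: int) -> bytes:
        """Return n_bytes only if that much entropy has been credited."""
        if n_bytes > 32:
            raise ValueError("this sketch emits at most one SHA-256 digest")
        if self._credited_bits < n_bytes * 8:
            raise RuntimeError("not enough entropy credited yet")
        digest = self._state.digest()
        self._credited_bits -= n_bytes * 8
        # Fold the digest back in so the next output differs.
        self._state.update(b"extract" + digest)
        return digest[:n_bytes]

pool = EntropyPool()
for timing_us in (50123, 49877, 51002, 50340):  # made-up ping times, microseconds
    pool.add(timing_us.to_bytes(4, "big"), estimated_bits=2)
# 8 bits are credited at this point: one byte can be drawn, but not two.
```

The important part is the bookkeeping: output is refused once it would exceed the entropy you decided to rely on, however well the hash mixes.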
Random numbers are too important to be left to chance.
Or external influence/manipulation.
Short answer
Using ping timing data by itself would not be truly random, but it can be used as a source of entropy which can then be used to generate truly random data.
Longer version
How random are ping times?
By itself, timing data from network operations (such as ping) would not be uniformly distributed. (And the idea of selecting random hosts is not practical - many will not respond at all, and the differences between hosts can be huge, with gaps between ranges of response time - think satellite connections).
However, while the timing will not be well distributed, there will be some level of randomness in the data. Or to put it another way, a level of information entropy is present. It is a fine idea to feed the timing data into a random number generator to seed it. So what level of entropy is present?
For network timing data of say around 50ms, measured to the nearest 0.1ms, with a spread of values of 2ms, you have about 20 values. Rounding down to the nearest power of 2 (16 = 2^4) you have 4 bits of entropy per timing value. If it is for any kind of secure application (such as generating cryptographic keys) then I would be conservative and say it was only 2 or 3 bits of entropy per reading. (Note that I've done a very rough estimate here, and ignored the possibility of attack).
How to generate truly random data
For true random numbers, you need to send the data into something designed along the lines of /dev/random that will collect the entropy, distributing it within a data store (using some kind of hash function, usually a secure one). At the same time, the entropy estimate is increased. So for a 128 bit AES key, 64 ping timings would be required before the entropy pool had enough entropy.
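The back-of-the-envelope numbers above can be reproduced in a few lines of Python. The variable names are made up, and the figures are the rough ones from this answer, not measurements.

```python
import math

spread_ms = 2.0       # observed spread of ping times
resolution_ms = 0.1   # timer resolution
distinct_values = round(spread_ms / resolution_ms)        # about 20

# Round down to a whole number of bits: 16 = 2**4 values.
optimistic_bits = math.floor(math.log2(distinct_values))  # 4 bits per ping
conservative_bits = 2                                      # the cautious figure

key_bits = 128
pings_needed = math.ceil(key_bits / conservative_bits)     # 64 pings

print(optimistic_bits, pings_needed)
```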
To be more robust, you could then add timing data from the keyboard and mouse usage, hard disk response times, motherboard sensor data (eg temperature), etc. It increases the rate of entropy collection and makes it hard for an attacker to monitor all sources of entropy. And indeed this is what is done with modern systems. The full list of MS Windows entropy sources is listed in the second comment of this post.
More reading
For discussion of the (computer security) attacks on random number generators, and the design of a cryptographically secure random number generator, you could do worse than read the yarrow paper by Bruce Schneier and John Kelsey. (Yarrow is used by BSD and Mac OS X systems).
No.
Unplug the network cable (or /etc/init.d/networking stop) and the entropy basically drops to zero.
Perform a Denial-Of-Service attack on the machine it's pinging and you also get predictable results (the ping-timeout value)
I guess you could. A couple things to watch out for:
Even if pinging random IP addresses, the first few hops (from you to the first real L3 router in the ISP network) will be the same for every packet. This puts a lower bound on the round trip time, even if you ping something in a datacenter in that first Point of Presence. So you have to be careful about normalizing the timing, there is a lower bound on the round trip.
You'd also have to be careful about traffic shaping in the network. A typical leaky bucket implementation in a router releases N bytes every M microseconds, which effectively perturbs your timing into specific timeslots rather than a continuous range of times. So you might need to discard the low order bits of your timestamp.
However I would disagree with the premise that there are not good sources of entropy in commodity hardware. Many x86 chipsets for the last few years have included random number generators. The ones I am familiar with use relatively sensitive ADCs to measure temperature in two different locations on the die, and subtract them. The low order bits of this temperature differential can be shown (via Chi-squared analysis) to be strongly random. As you increase the processing load on the system the overall temperature goes up, but the differential between two areas of the die remains uncorrelated and unpredictable.
The best source of randomness on commodity hardware I've seen, was a guy who removed a filter or something from his webcam, put opaque glue on the lens, and was then able to easily detect individual white pixels from cosmic rays striking the CCD. These are as close to perfectly random as possible, and are protected from external snooping by quantum effects.
Part of a good random number generator is equal probabilities of all numbers as n -> infinity.
So if you are planning to generate random bytes, then with sufficient data from a good RNG, each byte should have an equal probability of being returned. Further, there should be no pattern or predictability (spikes in probability during certain time periods) in which numbers are returned.
I am not too sure with using ping what you would be measuring to get the random variable, is it response time? If so, you can be pretty sure that some response times, or ranges of response times, will be more frequent than others and hence would make a potentially insecure random number generator.
If you want commodity hardware, your sound card should pretty much do it. Just turn up the volume on an analog input and you have a cheap white noise source. Cheap randomness without the need for a network.
The approach of measuring something to generate a random seed appears to be a pretty good one. The O'Reilly book Practical Unix and Internet Security gives a few similar additional methods of determining a random seed, such as asking the user to type a few keystrokes, and then measuring the time between keystrokes. (The book notes that this technique is used by PGP as a source of its randomness.)
I wonder if the current temperature of a system's CPU (measured out to many decimal places) could be a viable component of a random seed. This approach would have the advantage of not needing to access the network (so the random generator wouldn't become unavailable when the network connection goes down).
However, it's probably not likely that a CPU's internal sensor could accurately measure the CPU temperature out to enough decimal places to make the value truly viable as a random number seed; at least, not with "commodity-class hardware," as mentioned in the question!
It's not as good as using atmospheric noise but it's still truly random since it depends on the characteristics of the network which is notorious for random non-repeatable behavior.
See Random.org for more on randomness.
Here's an attempt at an implementation:
#ips : list = getIpAddresses();
#rnd = PseudorandomNumberGenerator(0 to (ips.count - 1));
#getTrueRandomNumber() { ping(ips[rnd.nextNumber()]).averageTime }
I would sooner use something like ISAAC as a stronger PRNG before trusting round trip pings as entropy. As others have said, it would just be too easy for someone to not only guess your numbers, but also possibly control them to various degrees.
Other great sources of entropy exist, which others have mentioned. One that was not mentioned (which might not be practical) is sampling noise from the on board audio device.. which is usually going to be a little noisy even if no microphone is connected to it.
I went 9 rounds with trying to come up with a strong (and fast) PRNG for a client/server RPC mechanism I was writing. Both sides had an identical key, consisting of 1024 lines of 32 character ciphers. The client would send AUTH xx, the server would return AUTH yy .. and both sides knew which two lines of the key to use to produce the blowfish secret (+ salt). Server would then send a SHA-256 digest of the entire key (encrypted), client knew it was talking to something that had the correct key .. session continued. Yeah, very weak protection for man in the middle, but a public key was out of the question for how the device was being used.
So, you had a non blocking server that had to handle up to 256 connections .. not only did the PRNG have to be strong, it had to be fast. It wasn't such a hardship to use slower methods to gather entropy in the client, but that could not be afforded in the server.
So, I have to ask regarding your idea .. how practical would it be?
No mathematical computation can produce a random result, but in the "real world" computers don't just crunch numbers. With a little bit of creativity it should be possible to produce random results of the kind where there is no known method of reproducing or predicting exact outcomes.
One of the easiest to implement ideas I've seen which works universally on all systems is to use static from the computers sound card line in/mic port.
Other ideas include thermal noise and low level timing of cache lines. Many modern PCs with TPM chips have encryption quality hardware random number generators already onboard.
My kneejerk reaction to ping (especially if using ICMP) is that you're cheating too blatantly. At that point you might as well whip out a Geiger counter and use background radiation as your random source.
Yes, it's possible, but... the devil's in the details.
If you're going to generate a 32-bit integer, you need to gather >32 bits of entropy (and use a sufficient mixing function to get that entropy spread around, but that's known and doable). The big question is:
how much entropy do ping times have?
The answer to this question depends on all sorts of assumptions about the network and your attack model, and there's different answers in different circumstances.
If attackers are able to totally control ping times, you get 0 bits of entropy per ping, and you can't ever total 32-bits of entropy, no matter how much you mix. If they have less than perfect control over ping times, you'll get some entropy, and (if you don't overestimate the amount of entropy you're gathering) will get perfectly random 32-bit numbers.
YouTube shows a device in action: http://www.youtube.com/watch?v=7n8LNxGbZbs
Something is random if nobody can predict the next state.
Though I can't definitively cite anything for or against it, this implementation has its issues.
Where are these IP addresses coming from? If they are randomly selected, what happens when they do not reply or are late in replying? Does that mean the random number will be slower to appear?
Also, even if you made a visual graph of 100,000 results and calculated that there are few or no correlations between the numbers, that does not mean it is truly random. As explained by Dilbert :)
It doesn't strike me as a good source of randomness.
What metric would you use -- the obvious one is response time, but the range of values you can reasonably expect is small: a few tens of milliseconds to a few thousand. The response times themselves will follow a bell curve and not be randomly distributed across any interval (how would you choose the interval?) so you would have to try and select a few 'random' bits from the numbers.
The LSB might give you a random bit stream, but you would have to consider clock granularity issues - maybe due to how interrupts work you would always get multiples of 2ms on some systems.
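A quick way to see the granularity problem is to count how often the least significant bit is 1. This is just a sketch: `lsbOnesFraction` and `quantize` are names I made up, and `quantize` is a crude model of a coarse clock, not how any particular OS behaves.

```cpp
#include <cstddef>
#include <vector>

// Fraction of samples whose least significant bit is 1.
// A usable bit source should be close to 0.5; a result of 0.0 or 1.0 means
// the LSB carries no information at all.
double lsbOnesFraction(const std::vector<unsigned long>& timesMs) {
    if (timesMs.empty()) return 0.0;
    std::size_t ones = 0;
    for (std::size_t i = 0; i < timesMs.size(); ++i)
        ones += timesMs[i] & 1u;
    return static_cast<double>(ones) / timesMs.size();
}

// Hypothetical model of a coarse clock: round a time down to a multiple of
// `granularityMs` (which must be non-zero).
unsigned long quantize(unsigned long timeMs, unsigned long granularityMs) {
    return (timeMs / granularityMs) * granularityMs;
}
```

With timings of 13, 42, 57 and 88 ms the fraction is 0.5. Run the same timings through `quantize(t, 2)` and it drops to 0.0, i.e. the "random" bit is always 0. A fraction near 0.5 is necessary but nowhere near sufficient: an alternating 0,1,0,1 stream also scores 0.5.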
There are probably much better 'interesting' ways of getting random bits -- maybe google for a random word, grab the first page and choose the Nth bit from the page.
Eh, I find that this kind of question leads into discussions about the meaning of 'truly random' pretty quickly.
I think that measuring pings would yield decent-quality random bits, but at an insufficient rate to be of much use (unless you were willing to do some serious DDOSing).
And I don't see that it would be any more random than measuring analogue/mechanical properties of the computer, or the behaviour of the meatbag operating it.
(edit) On a practical note, this approach opens you up to the possibility of someone on your network manipulating your 'random' number generator.
It seems to me that true randomness is ineffable - there is no way to know whether a sequence is random, since by definition it can contain anything no matter how improbable. Guaranteeing a particular distribution pattern reduces the randomness. The word "pattern" is a bit of a giveaway.
I MADE U A RANDOM NUMBER
BUT I EATED IT
Randomness is not a binary property -- it's a value between 0 and 1 that describes how difficult it is to predict the next value in a stream.
Asking "how random can my values be if I base them on pings?" is actually asking "how random are pings?". You can estimate that by gathering a large enough set of data (1 million pings, for example) and mapping their distribution curve and behavior over time. If the distribution is flat and the behavior is difficult to predict, the data seems more random. A bumpier distribution or more predictable behavior suggests lower randomness.
You should also consider the sample resolution. I could imagine the results being rounded in some way to a millisecond, so with pings you could have integer values between 0 and 500. That's not a lot of resolution.
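One way to put a number on "flat vs. bumpy" is the Shannon entropy of the observed distribution. The function below is a minimal sketch (the name `shannonEntropyBits` is mine), and it only looks at the histogram, not at behavior over time, so it can only flag data as bad, never certify it as good.

```cpp
#include <cmath>
#include <cstddef>
#include <map>
#include <vector>

// Shannon entropy, in bits per sample, of the empirical distribution of
// `samples` (e.g. ping times rounded to whole milliseconds).
double shannonEntropyBits(const std::vector<int>& samples) {
    if (samples.empty()) return 0.0;

    // Histogram: how often each distinct value occurs.
    std::map<int, std::size_t> counts;
    for (std::size_t i = 0; i < samples.size(); ++i)
        ++counts[samples[i]];

    const double n = static_cast<double>(samples.size());
    double entropy = 0.0;
    for (std::map<int, std::size_t>::const_iterator it = counts.begin();
         it != counts.end(); ++it) {
        const double p = it->second / n;
        entropy -= p * std::log2(p);
    }
    return entropy;
}
```

For integer values between 0 and 500 the best possible result is log2(501), about 8.97 bits per ping, and that is only reached if all 501 values are equally likely. A bumpy distribution scores lower, and a sorted or otherwise predictable sequence scores the same as a shuffled one, which is why the behavior over time has to be checked separately.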
On the practical side, I would recommend against it, since pings can be predicted and manipulated, further reducing their randomness.
Generally, I suggest against "rolling your own" randomness generators, encryption methods and hashing algorithms. As fun as it seems, it's mostly a lot of very intimidating math.
As to how to build a really good entropy generator -- I think that's probably going to have to be a sealed box that outputs some sort of result of interactions on an atomic or sub-atomic level. I mean, if you're using a source of entropy that the enemy can easily read too, he only needs to find out your algorithm. Any form of connection is a possible attack vector, so you should place the source of entropy as close to the service that consumes it as possible.
You can use the XKCD method:
I have some code that creates random numbers with traceroute. I also have a program that does it using ping. I did it over a year ago for a class project. All it does is run traceroute on an address and take the least significant digit of each ms time. It works pretty well at getting random numbers, but I really don't know how close it is to true random.
Here is a list of 8 numbers that I got when I ran it.
455298558263758292242406192
506117668905625112192115962
805206848215780261837105742
095116658289968138760389050
465024754117025737211084163
995116659108459780006127281
814216734206691405380713492
124216749135482109975241865
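#include <cstdio>
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

int main()
{
    // Run traceroute and capture its output in a temporary file.
    // ">" rather than ">>" so leftovers from an earlier run are not reused.
    std::system("traceroute -w 5 www.google.com > trace.txt");

    // Read the output as whitespace-separated tokens.
    std::vector<std::string> tracer;
    {
        std::ifstream in("trace.txt");
        std::string token;
        while (in >> token)
            tracer.push_back(token);
    }
    std::remove("trace.txt");

    // traceroute prints each round-trip time as "<value> ms", so the token
    // before every "ms" is a timing. Starting at 1 avoids reading tracer[-1].
    std::vector<std::string> numbers;
    for (std::size_t i = 1; i < tracer.size(); ++i)
    {
        if (tracer[i] == "ms")
            numbers.push_back(tracer[i - 1]);
    }

    // Keep only the least significant digit of each timing.
    std::string digits;
    for (std::size_t i = 0; i < numbers.size(); ++i)
    {
        const std::string& value = numbers[i];
        if (!value.empty())
            digits += value[value.size() - 1];
    }

    std::cout << digits << std::endl;
    return 0;
}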
Very simply, since networks obey prescribed rules, the results are not random.
The webcam idea sounds (slightly) reasonable. Linux people often recommend simply using the random noise from a soundcard which has no mic attached.
Here is my suggestion:
1- Choose a bunch of websites that are as far away from your location as possible, e.g. if you are in the US, try websites whose servers are in Malaysia, China, Russia, India, etc. Servers with high traffic are better.
2- During times of high internet traffic in your country (in my country that is around 7 to 11 pm), ping those websites many, many times. Take each ping result (use only the integer value) and calculate it modulo 2 (i.e. from each ping operation you get one bit: either 0 or 1).
3- Repeat the process for several days, recording the results.
4- Collect all the bits you got from all your pings (you will probably get hundreds of thousands of bits) and choose your bits from them. (Maybe you want to choose your bits using some data from the same method mentioned above :) )
BE CAREFUL: in your code you should check for timeouts, etc.
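The bit-collecting part of steps 2 to 4 might look something like this. It's a sketch of the mechanics only: `packPingBits` and `kTimedOut` are names I invented, the ping times are assumed to have been measured already and truncated to whole milliseconds, and nothing here makes the resulting bits any more trustworthy than the pings themselves.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical convention: a negative value marks a ping that timed out.
const int kTimedOut = -1;

// Take one bit (value modulo 2) from each successful ping, skip timeouts,
// and pack the bits into bytes, most significant bit first. Leftover bits
// that do not fill a whole byte are dropped.
std::vector<unsigned char> packPingBits(const std::vector<int>& pingMs) {
    std::vector<unsigned char> bytes;
    unsigned char current = 0;
    int filled = 0;
    for (std::size_t i = 0; i < pingMs.size(); ++i) {
        if (pingMs[i] < 0) continue;  // timeout: contributes no bit
        current = static_cast<unsigned char>((current << 1) | (pingMs[i] % 2));
        if (++filled == 8) {
            bytes.push_back(current);
            current = 0;
            filled = 0;
        }
    }
    return bytes;
}
```

For example, the pings 181, 240, (timeout), 199, 202, 305, 310, 177, 263 give the bits 1,0,1,0,1,0,1,1, which pack into the single byte 0xAB. Skipping a timeout instead of recording it as 0 matters, otherwise an unreachable server would feed a steady stream of zeros into the output.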

Resources