I have searched for pseudo-RNG algorithms but all I can find seem to generate the next number by using the previous result as seed. Is there a way to generate them non-recursively?
The scenario where I need this is during OpenCL concurrent programming, each thread/pixel needs an independent RNG. I tried to seed them using BIG_NUMBER + work_id, but the result has a strong visual pattern in it. I tried several different RNG algorithms and all have this problem. Apparently they only guarantee the numbers are independent if you generate recursively, but not when you use sequential numbers as seed.
So my question is: Can I generate an array of random numbers, from an array of sequential numbers, independently and constant time for each number? Or is it mathematically impossible?
As the solution to my openCL problem, I can just pre-generate a huge array of random numbers recursively first and store in GPU memory, then use them later using seed as index. But I'm curious about the question above, because it seems very possible just by doing bunch of overflow and cutoffs, according to my very simple understanding of chaos theory.
Can I generate an array of random numbers, from an array of sequential numbers, independently and constant time for each number? Or is it mathematically impossible?
Sure you can - use block cipher in countnig mode. It is generally known as Counter based RNG, first widely used one was Fortuna RNG
Related
I am confused between these three functions and I was wondering for some explanation. If I set the range how do I make the range exclusive or inclusive? Are the ranges inclusive or exclusive if I don't specify the range?
In addition to the answer from #dave_59, there are other important differences:
i) $random returns a signed 32-bit integer; $urandom and $urandom_range return unsigned 32-bit integers.
ii) The random number generator for $random is specified in IEEE Std 1800-2012. With the same seed you will get exactly the same sequence of random numbers in any SystemVerilog simulator. That is not the case for $urandom and $urandom_range, where the design of the random number generator is up to the EDA vendor.
iii) Each thread has its own random number generator for $urandom and $urandom_range, whereas there is only one random number generator for $random shared between all threads (ie only one for the entire simulation). This is really important, because having separate random number generators for each thread helps you simulation improve a property called random stability. Suppose you are using a random number generator to generate random stimulus. Suppose you find a bug and fix it. This could easily change the order in which threads (ie initial and always blocks) are executed. If that change changed the order in which random numbers were generated then you would never know whether the bug had gone away because you'd fixed it or because the stimulus has changed. If you have a random number generator for each thread then your testbench is far less vulnerable to such an effect - you can be far more sure that the bug has disappeared because you fixed it. That property is called random stability.
So, As #dave_59 says, you should only be using $urandom and $urandom_range.
You should only be using $urandom and $urandom_range. These two functions provide better quality random numbers and better seed initialization and stability than $random. The range specified by $urandom_range is always inclusive.
Although $random generates the exact same sequence of random numbers for each call, it is extremely difficult to keep the same call ordering as soon as any change is made to the design or testbench. Even more difficult when multiple threads are concurrently generating random numbers.
I've been thinking about this as a thought experiment to try and understand some hashing concepts. Consider the requirement for a say 128 bit hash function (i.e., its output is exactly 128 bits in length).
A. You might look at something like MD5. So you input your data to be hashed, and out pops a 128 bit number.
B. Alternatively, you find a magical pseudo random number generator (PRNG). Some sort of Frankenstein version of the Twister. It seeds itself from all of your input data to be hashed, and has an internal state size >> 128 bits. You then generate 128 pseudo random bits as output.
It seems to me that both A and B effectively produce an output that is determined solely by the input data. Are these two approaches therefore equivalent?
Supplemental:
Some feed back has suggested that there might be a security in-equivalence with my scenario. If the pseudo random number generator were to be something like Java's SecureRandom (which uses SHA-1), seeded from the input data, then might A <=> B?
If you seed a PRNG with your input data and then extract 128 bits of random data from it, then you effectively leave the hashing to the PRNG seed function, and the size of the hash that it generates will be the size of the PRNG state buffer.
However, if the state of the PRNG is larger than the 128 bits you extract as a hash, then there's a risk that some of the input data used for the seed won't have any effect on the bits of the PRNG state that you extract. This makes it a really bad hash, so you don't want to do that.
PRNG seed functions are typically very weak hashes, because hashing is not their business. They're almost certainly insecure (which you did not ask about), and separate from that they're usually quite weak at avalanching. A strong hash typically tries to ensure that every bit of input has a fair chance of affecting every bit of output. Insecure hashes typically don't worry that they'll fail at this if the input data is too short, but a PRNG seed will often make no effort at all.
Cryptographic hash functions are designed to make it hard to create input that generates a specific hash; and/or more it hard to create two inputs that generate the same hash.
If something is designed as a random number generating algorithm, then this was not one of the requirements for the design. So if something is "just" a random number generator, there is no guarantee that it satisfies these important constraints on a cryptographic hashcode. So in that sense, they are not equivalent.
Of course there may be random number generating algorithms that were also designed as cryptographic hashing algorithms, and in that case (if the implementation did a good job at satisfying the requirements) they may be equivalent.
Is it possible to reverse a pseudo random number generator?
For example, take an array of generated numbers and get the original seed.
If so, how would this be implemented?
This is absolutely possible - you just have to create a PRNG which suits your purposes. It depends on exactly what you need to accomplish - I'd be happy to offer more advice if you describe your situation in more detail.
For general background, here are some resources for inverting a Linear Congruential Generator:
Reversible pseudo-random sequence generator
pseudo random distribution which guarantees all possible permutations of value sequence - C++
And here are some for inverting the mersenne twister:
http://www.randombit.net/bitbashing/2009/07/21/inverting_mt19937_tempering.html
http://b10l.com/reversing-the-mersenne-twister-rng-temper-function/
In general, no. It should be possible for most generators if you have the full array of numbers. If you don't have all of the numbers or know which numbers you have (do you have the 12th or the 300th?), you can't figure it out at all, because you wouldn't know where to stop.
You would have to know the details of the generator. Decoding a linear congruential generator is going to be different from doing so for a counter-based PRNG, which is going to be different from the Mersenne twister, which is going to be different with a Fibonacci generator. Plus you would probably need to know the parameters of the generator. If you had all of that AND the equation to generate a number is invertible, then it is possible. As to how, it really depends on the PRNG.
Use the language Janus a time-reversible language for doing reversible computing.
You could probably do something like create a program that does this (pseudo-code):
x = seed
x = my_Janus_prng(x)
x = reversible_modulus_op(x, N) + offset
Janus has the ability to give to you a program that takes the output number and whatever other data it needs to invert everything, and give you the program that ends with x = seed.
I don't know all the details about Janus or how you could do this, but just thought I would mention it.
Clearly, what you want to do is probably a better idea because if the RNG is not an injective function, then what should it map back to etc.
So you want to write a Janus program that outputs an array. The input to the Janus inverted program would then take an array (ideally).
I'm looking for a determenistic psuedo random generator that takes two inputs and always returns the same output. I'm looking for things like uniform distribution, unpredictable as possible, and doesn't repeat for a long long time. Ideally the function doesn't rely on previous values. The reason that is a problem is I'm generating terrain data for an extremely large procedurely generated world and can't afford to store previous values.
Any help is appreciated.
i think what you're looking for is perlin noise - it's a way of generating "random" values in 2d (typically) that look like terrain / clouds / etc.
note that this doesn't have much to do with cryptography etc, but a "real" random number source is probably not what you want for synthetic terrain (it looks too noisy/spikey).
there's a good article on perlin noise here.
the implementation of perlin noise does use a source of random numbers, but typically you can use whatever is present on your system (starting with a known seed if you want to reproduce it later).
Is the problem deciding on a PRNG algorithm to use or an algorithm that accepts 2 inputs?
If it's the former, why not use the built in random class - such as Random class in .NET - since it strives for uniform distribution and long cycles. Also, given the same seed it will generate the same sequence of numbers.
If it's the latter, what you can do is map the 2 inputs to a single ouput and use that as a seed to your random algorithm. You can define a simple hash function that takes a string and calculates an integer from it:
s[0] + s[1]^1 + s[2]^2 + ... s[n]^n = seed
Combination of two inputs (by concatenating each other, provided the inputs are binary integers) into one seed will do, for a PRNG, such as Mersenne Twister.
This question is NOT about how to use any language to generate a random number between any interval. It is about generating either 0 or 1.
I understand that many random generator algorithm manipulate the very basic random(0 or 1) function and take seed from users and use an algorithm to generate various random numbers as needed.
The question is that how the CPU generate either 0 or 1? If I throw a coin, I can generate head or tailer. That's because I physically throw a coin and let the nature decide. But how does CPU do it? There must be an action that the CPU does (like throwing a coin) to get either 0 or 1 randomly, right?
Could anyone tell me about it?
Thanks
(This has several facets and thus several algorithms. Keep in mind that there are many different forms of randomness used for different purposes, but I understand your question in the way that you are interested in actual randomness used for cryptography.)
The fundamental problem here is that computers are (mostly) deterministic machines. Given the same input in the same state they always yield the same result. However, there are a few ways of actually gathering entropy:
User input. Since users bring outside input into the system you can take that to derive some bits from that. Similar to how you could use radioactive decay or line noise.
Network activity. Again, an outside source of stuff.
Generally interrupts (which kinda include the first two).
As alluded to in the first item, noise from peripherals, such as audio input or a webcam can be used.
There is dedicated hardware that can generate a few hundred MiB of randomness per second. Usually they give you random numbers directly instead of their internal entropy, though.
How exactly you derive bits from that is up to you but you could use time between events, or actual content from the events, etc. – generally eliminating bias from entropy sources isn't easy or trivial and a lot of thought and algorithmic work goes into that (in the case of the aforementioned special hardware this is all done in hardware and the code using it doesn't need to care about it).
Once you have a pool of actually random bits you can just use them as random numbers (/dev/random on Linux does that). But this has downsides, since there is usually little actual entropy and possibly a higher demand for random numbers. So you can invent algorithms to “stretch” that initial randomness in a manner that makes it still impossible or at least very difficult to predict anything about following numbers (/dev/urandom on Linux or both /dev/random and /dev/urandom on FreeBSD do that). Fortuna and Yarrow are so-called cryptographically secure pseudo-random number generators and designed with that in mind. You still have a very good guarantee about the quality of random numbers you generate, but have many more before your entropy pool runs out.
In any case, the CPU itself cannot give you a random 0 or 1. There's a lot more involved and this usually includes the complete computer system or special hardware built for that purpose.
There is also a second class of computational randomness: Plain vanilla pseudo-random number generators (PRNGs). What I said earlier about determinism – this is the embodiment of it. Given the same so-called seed a PRNG will yield the exact same sequence of numbers every time¹. While this sounds idiotic it has practical benefits.
Suppose you run a simulation involving lots of random numbers, maybe to simulate interaction between molecules or atoms that involve certain probabilities and unpredictable behaviour. In science you want results anyone can independently verify, given the same setup and procedure (or, with computing, the same algorithms). If you used actual randomness the only option you have would be to save every single random number used to make sure others can replicate the results independently.
But with a PRNG all you need to save is the seed and remember what algorithm you used. Others can then get the exact same sequence of pseudo-random numbers independently. Very nice property to have :-)
Footnotes
¹ This even includes the CSPRNGs mentioned above, but they are designed to be used in a special way that includes regular re-seeding with entropy to overcome that problem.
A CPU can only generate a uniform random number, U(0,1), which happens to range from 0 to 1. So mathematically, it would be defined as a random variable U in the range [0,1]. Examples of random draws of a U(0,1) random number in the range 0 to 1 would be 0.28100002, 0.34522, 0.7921, etc. The probability of any value between 0 and 1 is equal, i.e., they are equiprobable.
You can generate binary random variates that are either 0 or 1 by setting a random draw of U(0,1) to a 0 if U(0,1)<=0.5 and 1 if U(0,1)>0.5, since in theory there will be an equal number of random draws of U(0,1) below 0.5 and above 0.5.