I am using the following shuffling algorithm in my code to shuffle a list. But I would like to know what sort of distribution is assumed in random.shuffle().
import random
random.shuffle(x)
where x is a list.
I read somewhere that the random functions generally use a uniform distribution, but I could not find any clear information about random.shuffle on the documentation page for the random module.
Does anyone know?
John Coleman wrote:
It uses the Fisher-Yates shuffle. Whether or not this is explicitly said in the documentation, it is both the obvious choice and clearly implemented in the source.
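For illustration, here is a minimal sketch of a Fisher-Yates shuffle. It is not CPython's actual source; the function name `fisher_yates_shuffle` and the `rng` parameter are made up for this example. As long as each index is drawn uniformly, every permutation of the list is equally likely, up to the limits of the underlying generator.

```python
import random

def fisher_yates_shuffle(items, rng=random):
    """Shuffle items in place and return the same list.

    rng is anything with a randrange method (hypothetical parameter,
    defaults to the random module itself).
    """
    # Walk from the last position down to position 1.
    for i in range(len(items) - 1, 0, -1):
        j = rng.randrange(i + 1)  # uniform index in 0..i inclusive
        items[i], items[j] = items[j], items[i]
    return items

x = list(range(10))
fisher_yates_shuffle(x, random.Random(42))
print(x)
```

The result always contains the same elements as the input, only in a different order.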
Related
Is it possible to reverse a pseudo random number generator?
For example, take an array of generated numbers and get the original seed.
If so, how would this be implemented?
This is absolutely possible - you just have to create a PRNG which suits your purposes. It depends on exactly what you need to accomplish - I'd be happy to offer more advice if you describe your situation in more detail.
For general background, here are some resources for inverting a Linear Congruential Generator:
Reversible pseudo-random sequence generator
pseudo random distribution which guarantees all possible permutations of value sequence - C++
And here are some for inverting the Mersenne Twister:
http://www.randombit.net/bitbashing/2009/07/21/inverting_mt19937_tempering.html
http://b10l.com/reversing-the-mersenne-twister-rng-temper-function/
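As a rough sketch of what "inverting the tempering" means: MT19937 passes each state word through a fixed sequence of shift-and-XOR steps before returning it, and each step can be undone. The function names below (`temper`, `untemper`, and the two helpers) are made up for this example; the shift amounts and masks are the standard published MT19937 constants. This only recovers individual state words from outputs, not the original seed.

```python
MASK32 = 0xFFFFFFFF

def temper(y):
    """MT19937 output tempering applied to one 32-bit state word."""
    y ^= y >> 11
    y ^= (y << 7) & 0x9D2C5680
    y ^= (y << 15) & 0xEFC60000
    y ^= y >> 18
    return y & MASK32

def _undo_right(y, shift):
    # Inverts y = x ^ (x >> shift): the top `shift` bits of y are already
    # correct, and every pass recovers `shift` more bits.
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ (x >> shift)
    return x

def _undo_left(y, shift, mask):
    # Inverts y = x ^ ((x << shift) & mask), working up from the low bits.
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ ((x << shift) & mask)
    return x & MASK32

def untemper(y):
    """Undo the tempering steps in reverse order."""
    y = _undo_right(y, 18)
    y = _undo_left(y, 15, 0xEFC60000)
    y = _undo_left(y, 7, 0x9D2C5680)
    y = _undo_right(y, 11)
    return y

word = 0x12345678
print(untemper(temper(word)) == word)
```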
In general, no. It should be possible for most generators if you have the full array of numbers. If you don't have all of the numbers or know which numbers you have (do you have the 12th or the 300th?), you can't figure it out at all, because you wouldn't know where to stop.
You would have to know the details of the generator. Decoding a linear congruential generator is going to be different from doing so for a counter-based PRNG, which is going to be different from the Mersenne Twister, which is going to be different from a Fibonacci generator. Plus you would probably need to know the parameters of the generator. If you had all of that AND the equation to generate a number is invertible, then it is possible. As to how, it really depends on the PRNG.
Use Janus, a time-reversible language designed for reversible computing.
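To make the "known parameters and invertible equation" case concrete, here is a small sketch for a linear congruential generator. The constants A, C and M are just example values picked for illustration, and `lcg_next` / `lcg_prev` are hypothetical names. It assumes you know all three parameters and see the full state as output, which is the easy case.

```python
# Example LCG parameters, chosen only for illustration.
A, C, M = 1664525, 1013904223, 2**32

def lcg_next(x):
    """One forward step: x -> (A*x + C) mod M."""
    return (A * x + C) % M

# A is odd, so it is coprime to 2**32 and has a modular inverse.
# Three-argument pow with exponent -1 needs Python 3.8 or newer.
A_INV = pow(A, -1, M)

def lcg_prev(x):
    """One backward step: undo the addition, then the multiplication."""
    return (A_INV * (x - C)) % M

seed = 12345
x = seed
for _ in range(5):
    x = lcg_next(x)   # run the generator forward
for _ in range(5):
    x = lcg_prev(x)   # walk the same steps backward
print(x == seed)
```

If the generator only exposes part of its state (for example the high bits), this direct inversion no longer applies.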
You could probably do something like create a program that does this (pseudo-code):
x = seed
x = my_Janus_prng(x)
x = reversible_modulus_op(x, N) + offset
Janus can take that program and give you its inverse: a program that takes the output number, plus whatever other data it needs to invert everything, and ends with x = seed.
I don't know all the details about Janus or how you could do this, but just thought I would mention it.
Working from an array of outputs, as you suggest, is probably the better idea: if the RNG is not an injective function, a single output has no unique value to map back to.
So you want to write a Janus program that outputs an array. The inverted Janus program would then (ideally) take an array as its input.
I want to generate a sequence of random numbers that will be used to pick tiles for a "maze". Each maze will have an id and I want to use that id as a seed to a pseudo random function. That way I can generate the same maze over and over given its maze id. Preferably I do not want to use a built-in pseudo random function in a language, since I do not have control over the algorithm and it could change from platform to platform. As such, I would like to know:
How should I go about implementing my own pseudo random function?
Is it even feasible to generate platform independent pseudo random numbers?
Yes, it is possible.
Here is an example of such an algorithm (and its use) for noise generation.
Those particular random functions (Noise1, Noise2, Noise3, ..) use input parameters and calculate the pseudo random values from there.
Their output range is from 0.0 to 1.0.
And there are many more out there (as mentioned in the comments).
UPDATE 2019
Looking back at this answer, a better-suited choice would be the Mersenne Twister mentioned below. Or you could use any implementation of xorshift.
The Mersenne Twister may be a good pick for this. As you can see from the pseudocode on Wikipedia, you can seed the RNG with whatever you prefer to produce identical values for any instance with that seed. In your case, the maze ID or the hash of the maze ID.
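As a sketch of how small a self-contained generator can be, here is a 32-bit xorshift using the common 13/17/5 shift triple. The class name `XorShift32`, the `pick` helper and the fallback seed constant are made up for this example. Because it uses only integer shifts and XORs on a masked 32-bit value, it should give the same sequence on any platform.

```python
MASK32 = 0xFFFFFFFF

class XorShift32:
    """Tiny xorshift generator with a 32-bit state (illustrative only)."""

    def __init__(self, seed):
        # The state must never be zero, or the generator gets stuck at 0.
        # The fallback constant is arbitrary.
        self.state = (seed & MASK32) or 0x9E3779B9

    def next_u32(self):
        x = self.state
        x ^= (x << 13) & MASK32
        x ^= x >> 17
        x ^= (x << 5) & MASK32
        self.state = x
        return x

    def pick(self, n):
        """Index in 0..n-1 (simple modulo, so very slightly biased)."""
        return self.next_u32() % n

maze_id = 1234                      # hypothetical maze id used as the seed
rng = XorShift32(maze_id)
tiles = [rng.pick(4) for _ in range(16)]
print(tiles)
```

Small neighbouring seeds can produce similar-looking first outputs with a generator this simple, so it may be worth hashing the maze id first or discarding the first few values.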
If you are using Python, you can use the random module by putting import random at the beginning of your program. Then, to use it, write:
var = random.randint(1000, 9999)
This gives var a 4-digit number that can be used as its id.
If you are using another language, there is likely a similar module.
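To tie this back to the question, here is a minimal sketch that seeds a private generator with the maze id, so the same id always produces the same tiles. The function name `tiles_for_maze` and its parameters are hypothetical. Note that the exact sequence from methods such as `randrange` is not guaranteed to stay the same across Python versions, which is the portability concern the question raises.

```python
import random

def tiles_for_maze(maze_id, count, tile_kinds=4):
    """Return `count` tile indices that depend only on maze_id."""
    rng = random.Random(maze_id)  # private generator; global state untouched
    return [rng.randrange(tile_kinds) for _ in range(count)]

first = tiles_for_maze(1234, 10)
second = tiles_for_maze(1234, 10)
print(first == second)
```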
I'm looking for a deterministic pseudo random generator that takes two inputs and always returns the same output. I'm looking for things like uniform distribution, being as unpredictable as possible, and not repeating for a long, long time. Ideally the function doesn't rely on previous values. The reason that is a problem is that I'm generating terrain data for an extremely large procedurally generated world and can't afford to store previous values.
Any help is appreciated.
i think what you're looking for is perlin noise - it's a way of generating "random" values in 2d (typically) that look like terrain / clouds / etc.
note that this doesn't have much to do with cryptography etc, but a "real" random number source is probably not what you want for synthetic terrain (it looks too noisy/spikey).
there's a good article on perlin noise here.
the implementation of perlin noise does use a source of random numbers, but typically you can use whatever is present on your system (starting with a known seed if you want to reproduce it later).
Is the problem deciding on a PRNG algorithm to use or an algorithm that accepts 2 inputs?
If it's the former, why not use the built-in random class, such as the Random class in .NET, since it strives for uniform distribution and long cycles. Also, given the same seed it will generate the same sequence of numbers.
If it's the latter, what you can do is map the 2 inputs to a single output and use that as a seed to your random algorithm. You can define a simple hash function that takes a string and calculates an integer from it:
s[0] + s[1]^1 + s[2]^2 + ... s[n]^n = seed
Combining the two inputs into one seed (by concatenating them, provided the inputs are binary integers) will do for a PRNG such as the Mersenne Twister.
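One way to sketch the "map two inputs to one value" idea without keeping any generator state is to hash the packed inputs directly. This swaps in SHA-256 from the standard library rather than the simple string hash above; the function name `noise2` and the `world_seed` parameter are made up for this example, and the inputs are assumed to fit in signed 64-bit integers.

```python
import hashlib
import struct

def noise2(x, y, world_seed=0):
    """Deterministic value in [0, 1) that depends only on the inputs."""
    # Pack the three integers into a fixed 24-byte little-endian layout,
    # so (3, 4) and (4, 3) produce different bytes.
    data = struct.pack("<qqq", world_seed, x, y)
    digest = hashlib.sha256(data).digest()
    n = int.from_bytes(digest[:8], "little")
    # Keep the top 53 bits so the float result is always below 1.0.
    return (n >> 11) / 2**53

print(noise2(3, 4), noise2(4, 3), noise2(3, 4))
```

Nothing needs to be stored between calls, at the cost of one hash computation per value.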
This picture from Wikipedia has a nice example of the sort of functions I'd ideally like to generate:
Right now I'm using the Irwin-Hall distribution, which is more or less a polynomial approximation of the Gaussian distribution. Basically, you call a uniform random number generator x times and take the average. The more iterations, the closer it gets to a Gaussian distribution.
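For reference, the averaging approach described above might look like this in Python. The function name `approx_gaussian` and its parameters are made up for this sketch. The result always stays inside [0, 1] and is centred on 0.5, which is why the mean cannot be moved without changing the method.

```python
import random

def approx_gaussian(n=12, rng=random):
    """Average of n uniform samples: bell-shaped, always within [0, 1]."""
    return sum(rng.random() for _ in range(n)) / n

rng = random.Random(0)
samples = [approx_gaussian(12, rng) for _ in range(10000)]
print(min(samples), max(samples), sum(samples) / len(samples))
```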
It's pretty nice; however I'd like to be able to have one where I can vary the mean. For example, let's say I wanted a number between the range 0 and 10, but around 7. Like, the mean (if I repeated this function multiple times) would turn out to be 7, but the actual range is 0-10.
Is there one I should look up, or should I work on doing some fancy maths with standard Gaussian distributions?
I see a contradiction in your question. On the one hand you want a normal distribution, which is symmetrical by its nature; on the other hand you want a range that sits asymmetrically around the mean value.
I suspect you should look at other distributions whose density functions look like a bell curve but are asymmetrical, such as the log distribution or the beta distribution.
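As a sketch of the beta-distribution suggestion: a Beta(alpha, beta) variate lies in [0, 1] and has mean alpha / (alpha + beta), so scaling Beta(7, 3) by 10 gives values in 0 to 10 with mean 7. The function name and the particular shape parameters are assumptions picked for this example; larger alpha and beta with the same ratio give a narrower peak.

```python
import random

def skewed_0_to_10(rng=random, alpha=7.0, beta=3.0):
    """Value in [0, 10] whose long-run mean is 10 * alpha / (alpha + beta)."""
    return 10.0 * rng.betavariate(alpha, beta)

rng = random.Random(0)
samples = [skewed_0_to_10(rng) for _ in range(20000)]
print(sum(samples) / len(samples))
```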
Look into generating normal random variates. You can generate pairs of normal random variates X = N(0,1) and transform them into ANY normal random variate Y = N(m,s) (Y = m + s*X).
Sounds like the Truncated Normal distribution is just what the doctor ordered. It is not "computationally simple" per se, but easy to implement if you have an existing implementation of a normal distribution.
You can just generate the distribution with the mean you want, standard deviation you want, and the two ends wherever you want. You'll have to do some work beforehand to compute the mean and standard deviation of the underlying (non-truncated) normal distribution to get the mean for the TN that you want, but you can use the formulae in that article. Also note that you can adjust the variance as well using this method :)
I have Java code (based on the Commons Math framework) for both an accurate (slower) and quick (less accurate) implementation of this distribution, with PDF, CDF, and sampling.
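This is not the Java code mentioned above, only a rough Python sketch of the simplest way to sample a truncated normal: draw from the underlying normal and reject anything outside the range. The function name and parameters are hypothetical. As noted, `mu` and `sigma` describe the underlying normal, so the mean of the truncated result will generally differ from `mu`.

```python
import random

def truncated_normal(mu, sigma, low, high, rng=random):
    """Rejection sampling: retry until the draw lands inside [low, high]."""
    while True:
        v = rng.gauss(mu, sigma)
        if low <= v <= high:
            return v

rng = random.Random(0)
samples = [truncated_normal(7.0, 2.0, 0.0, 10.0, rng) for _ in range(10000)]
print(min(samples), max(samples), sum(samples) / len(samples))
```

Rejection sampling is fine when most of the underlying normal falls inside the range; it gets slow if the range sits far out in a tail.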
I need a random number generation algorithm that generates a random number for a specific input, but generates the same number every time it gets the same input. Is this kind of algorithm available on the internet, or do I have to build one? If it exists and anyone knows of it, please let me know. (C, C++, Java, C# or any pseudocode would help a lot.)
Thanks in advance.
You may want to look at the built in Java class Random. The description fits what you want.
Usually the standard implementation of random number generator depends on seed value.
You can use standard random with seed value set to some hash function of your input.
C# example:
string input = "Foo";
Random rnd = new Random(input.GetHashCode());
int random = rnd.Next();
I would use a hash function like SHA or MD5; this will generate the same output for a given input every time.
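A minimal Python sketch of the hash approach, using SHA-256 from the standard library. The function name `number_for` and the range parameters are made up for this example. Python's built-in `hash()` is avoided here because string hashes are randomized per process by default, so it would not give the same number on every run.

```python
import hashlib

def number_for(text, low, high):
    """Map text to an integer in [low, high]; same text, same number."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    n = int.from_bytes(digest, "big")
    return low + n % (high - low + 1)

print(number_for("Foo", 1, 100), number_for("Foo", 1, 100))
```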
An example to generate a hash in java is here.
The Mersenne Twister algorithm is a good predictable random number generator. There are implementations in most languages.
How about..
public int getRandomNumber()
{
// decided by a roll of a dice. Can't get fairer than that!
return 4;
}
Or did you want a random number each time?
:-)
Some code like this should work for you:
MIN_VALUE + ((MAX_VALUE - MIN_VALUE +1) * RANDOM_INPUT / (MAX_VALUE + 1))
MIN_VALUE - Lower Bound
MAX_VALUE - Upper Bound
RANDOM_INPUT - Input Number
All pseudo-random number generators (which is what most RNGs on computers are) will generate the same sequence of numbers from a starting input, the seed. So you can use whatever RNG is available in your programming language of choice.
Given that you want one sample from a given seed, I'd steer clear of Mersenne Twister and other complex RNGs that have good statistical properties, since you don't need them. You could use a simple LCG, or you could use a hash function like MD5. One problem with an LCG is that, for a small seed, the next value often falls in the same region because the modulo never kicks in, so if your input value is typically small I'd use MD5, for example.
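To illustrate the small-seed problem, here is a sketch with a multiplicative LCG using the well-known MINSTD parameters (multiplier 16807, modulus 2**31 - 1); the function name `lcg_once` is made up. For small seeds the product stays below the modulus, so the outputs are simply multiples of 16807 clustered at the low end of the range.

```python
A, M = 16807, 2**31 - 1  # MINSTD parameters, used here only as an example

def lcg_once(seed):
    """A single LCG step, used as a one-shot 'random' value for the seed."""
    return (A * seed) % M

for s in (1, 2, 3):
    # Print each value and where it falls as a fraction of the full range.
    print(s, lcg_once(s), lcg_once(s) / M)
```

A hash of the input spreads neighbouring seeds across the whole range, which is why it suits this use better.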