I needed to generate a poisson distribution in excel and found a method (Inverse Transform Method)
did it in excel and then in sas (just for fun, so I do not need a quick answer) to compare with the ranpoi sas function.
Here my code (which works):
data Poisson(keep=mean Poisson PoissonSas);
mean=0.2;
confronta=exp(-mean);
do obs=1 to 100;
found=0;
Poisson=0;
ranuni=1;
do until(found=1);
ranuni=ranuni*ranuni(12547);
if ranuni<confronta then found=1;
else Poisson=Poisson+1;
end;
PoissonSas=ranpoi(012584,mean);
output;
end;
run;
proc means data=Poisson(drop=mean);run;
So I initialized the seed in both random functions to replicate results.
The strange thing is that I get different results depending on whether I submit the data step with both methods or only one of them (commenting the other), but the same results over and over for each type of submission.
I expected the same results always! Why this is not so?
(I am using sas 9.3)
Thanks!
It looks like SAS is interleaving the calls to the PRNGs as a single stream. Pseudo-random numbers are a sequence of values that are actually deterministic. If you seed and use the sequence in one algorithm, you'll get the same results every time for that algorithm. If you use the sequence alternating between two or more algorithms, the set of algorithms will always yield the same set of results (which seems to be the case for you), but the results for a given algorithm will be different because some of the underlying PRNs it was drawing before are now being used by the other algorithms. This is at the core of the synchronization requirement when using so-called variance reduction techniques based on common random numbers. In general, if you want identical results the solution is to have multiple instances of your PRNG, one for each "source" of randomness in your program, and to seed the multiple sources independently of each other but identically across runs. It looks like you tried to do this, but SAS doesn't behave the way you think it does. According to their documentation, it appears that they produce a single PRN stream based on the first seed entry in your code! This is a subset of one of their examples:
/* This DATA step calls the RANUNI and the RANNOR functions */
/* and produces a single stream of random numbers based on */
/* a seed value of 7. */
data d;
d = ranuni (7); f = ' '; output;
d = ranuni (8); f = ' '; output;
d = rannor (9); f = 'n'; output;
/* they actually have more... */
run;
By the way, your Poisson algorithm is not generally regarded as an inverse transform algorithm. Inversion is 1-to-1, i.e., a single input uniform produces a single random variable. The loop you're performing is actually doing acceptance/rejection, and you use a variable number of uniforms to come up with each Poisson value.
PJS's answer is essentially correct, but a few clarifications.
SAS does indeed use a single seed when you do it the way you did; all of what I'd call 'primitive' random functions work off of one PRNG stream, and only the first seed matters (and only matters the first time it's encountered).
However, RANPOI is a little different - probably because of how SAS creates poissons. It's not made clear in the documentation, but it appears that it uses up two random numbers (not sure if it's always two, or just coincidence). See the following test:
data test;
U=ranuni(7);
P=ranpoi(8,100);
put u= p=;
run;
data test2;
p=ranpoi(8,100);
u=ranuni(7);
put u= p=;
run;
data test3;
u=ranuni(8);
p=ranuni(7);
put u= p=;=
run;
data test4;
u=ranuni(7);
p=ranuni(8);
put u= p=;
run;
data test5;
do _t = 1 to 5;
u=ranuni(8);
put u=;
end;
run;
Now, in test4, we see the first two ranuni's when starting with seed 7, and indeed the first one matches the first one from test. However, test3 has the first two starting with seed 8, and the second one does not match the one from test2! test5 shows that in fact the third matches, meaning ranpoi in test2 used up 2 numbers from the stream.
In any event, if you want to change the seed midstream, you have two options.
One is to use CALL RANPOI (and CALL RANUNI), which allow you to store the seed in a variable. Two is to use RAND function, which works with CALL STREAMINIT to set seeds whenever you want to. The RAND function is considered 'better' than the more primitive RANPOI and such - it uses a better PRNG algorithm.
Related
There is a filter in WEKA in the preprocess tab named Randomized. The definition of this filter is Randomly shuffles the order of instances passed through it. The filter has a parameter called randomseed which is by default set as 42.
I found some definition of randomseed sunch as A random seed (or seed state, or just seed) is a number (or vector) used to initialize a pseudorandom number generator.
Seed function is used to save the state of a random function, so that it can generate same random numbers on multiple executions of the code on the same machine or on different machines (for a specific seed value). The seed value is the previous value number generated by the generator.
The number "42" was apparently chosen as a tribute to the "Hitch-hiker's Guide" books by Douglas Adams, as it was supposedly the answer to the great question of "Life, the universe, and everything" as calculated by a computer (named "Deep Thought") created specifically to solve it.
All those answers on the internet made me more confused.
I cannot understand what randomseed will do with random shuffle? Thus this means, 42 instances will take from the beginning of the instances and shuffle. Then again 42 instances will be taken and shuffled and the process will continue till the end?
The seed value simply initializes the java.util.Random object that generates the sequence of pseudo random numbers used for randomization.
A different seed value will result in a different sequence, therefore resulting in a different randomization of rows in your dataset.
The Randomize filter initializes such a Random object and then calls the randomize method of the weka.core.Instances object it currently holds. The code of the randomize method (at time of writing this is Weka 3.9.6) looks like this:
public void randomize(Random random) {
for (int j = numInstances() - 1; j > 0; j--) {
swap(j, random.nextInt(j + 1));
}
}
All options of option-handling classes in Weka have a default value. In case of the Randomize filter, the seed value has the default value of 42. It could have been anything, but someone was a fan of the Hitchhiker's Guide to the Galaxy and chose that value.
I'm looking to use a specific set of seeds for the intrinsic function RANDOM_NUMBER (a PRNG). What I've read so far is that the seed value can be set via calling RANDOM_SEED, specifically RANDOM_SEED(GET = array). My confusion is how to (if it's possible) set a specific value for the algorithm, for instance in the RAND, RANDU, or RANDM algorithms one can specify their own seed directly. I'm not sure how to set the seed, as the get function seems to take an array. If it takes an array, does it always pull the seed value from a specific index in the array?
Basically, is there a way to set a specific single seed value? If so, would someone be able to write-it out?
As a side note - I'm attempting to set my seed because allegedly one of the other PRNGs I mentioned only works well with "large odd numbers" according to my professor, so I decided that I may as well control this when comparing the PRNG's.
First, the RANDOM_SEED is only used for controlimg the seed of RANDOM_NUMBER(). If you use any other random number generator subroutine or function, it will not affect them at all or if yes then in some compiler specific way which you must find in the manual. But most probably it does not affect them.
Second, you should not care at all whether the seed array contains 1, 4 or 42 integers, it doesn't matter because the PRNG uses the bits from the whole array in some unspecified custom way. For example you may be generating 64 bit real numbers, but the seed array is made of 32 bit integers. You cannot simply say which integer from the seed array does what. You can view the whole seed array as one big integer number that is cut into smaller pieces if you want.
Regarding your professors advice, who knows what he meant, but the seed is probably set by some procedure of that particular generator, and not by the standard RANDOM_SEED, so you must read the documentation for that generator.
And how to use a specific seed in RANDOM_SEED? It was described on this site several times,jkust search for RANDOM_SEED in the top right search field, really. But it is simple, once you know the size of the array, size it to any non-trivial numbers (you need enough non-zero bits) you want and use put=. That's really all, just don't think about individual values in the array, the whole array is one piece of data together.
This question is about a comment in this question
Recommended way to initialize srand? The first comment says that srand() should be called only ONCE in an application. Why is it so?
That depends on what you are trying to achieve.
Randomization is performed as a function that has a starting value, namely the seed.
So, for the same seed, you will always get the same sequence of values.
If you try to set the seed every time you need a random value, and the seed is the same number, you will always get the same "random" value.
Seed is usually taken from the current time, which are the seconds, as in time(NULL), so if you always set the seed before taking the random number, you will get the same number as long as you call the srand/rand combo multiple times in the same second.
To avoid this problem, srand is set only once per application, because it is doubtful that two of the application instances will be initialized in the same second, so each instance will then have a different sequence of random numbers.
However, there is a slight possibility that you will run your app (especially if it's a short one, or a command line tool or something like that) many times in a second, then you will have to resort to some other way of choosing a seed (unless the same sequence in different application instances is ok by you). But like I said, that depends on your application context of usage.
Also, you may want to try to increase the precision to microseconds (minimizing the chance of the same seed), requires (sys/time.h):
struct timeval t1;
gettimeofday(&t1, NULL);
srand(t1.tv_usec * t1.tv_sec);
Random numbers are actually pseudo random. A seed is set first, from which each call of rand gets a random number, and modifies the internal state and this new state is used in the next rand call to get another number. Because a certain formula is used to generate these "random numbers" therefore setting a certain value of seed after every call to rand will return the same number from the call. For example srand (1234); rand (); will return the same value. Initializing once the initial state with the seed value will generate enough random numbers as you do not set the internal state with srand, thus making the numbers more probable to be random.
Generally we use the time (NULL) returned seconds value when initializing the seed value. Say the srand (time (NULL)); is in a loop. Then loop can iterate more than once in one second, therefore the number of times the loop iterates inside the loop in a second rand call in the loop will return the same "random number", which is not desired. Initializing it once at program start will set the seed once, and each time rand is called, a new number is generated and the internal state is modified, so the next call rand returns a number which is random enough.
For example this code from http://linux.die.net/man/3/rand:
static unsigned long next = 1;
/* RAND_MAX assumed to be 32767 */
int myrand(void) {
next = next * 1103515245 + 12345;
return((unsigned)(next/65536) % 32768);
}
void mysrand(unsigned seed) {
next = seed;
}
The internal state next is declared as global. Each myrand call will modify the internal state and update it, and return a random number. Every call of myrand will have a different next value therefore the the method will return the different numbers every call.
Look at the mysrand implementation; it simply sets the seed value you pass to next. Therefore if you set the next value the same everytime before calling rand it will return the same random value, because of the identical formula applied on it, which is not desirable, as the function is made to be random.
But depending on your needs you can set the seed to some certain value to generate the same "random sequence" each run, say for some benchmark or others.
Short answer: calling srand() is not like "rolling the dice" for the random number generator. Nor is it like shuffling a deck of cards. If anything, it's more like just cutting a deck of cards.
Think of it like this. rand() deals from a big deck of cards, and every time you call it, all it does is pick the next card off the top of the deck, give you the value, and return that card to the bottom of the deck. (Yes, that means the "random" sequence will repeat after a while. It's a very big deck, though: typically 4,294,967,296 cards.)
Furthermore, every time your program runs, a brand-new pack of cards is bought from the game shop, and every brand-new pack of cards always has the same sequence. So unless you do something special, every time your program runs, it will get exactly the same "random" numbers back from rand().
Now, you might say, "Okay, so how do I shuffle the deck?" And the answer -- at least as far as rand and srand are concerned -- is that there is no way of shuffling the deck.
So what does srand do? Based on the analogy I've been building here, calling srand(n) is basically like saying, "cut the deck n cards from the top". But wait, one more thing: it's actually start with another brand-new deck and cut it n cards from the top.
So if you call srand(n), rand(), srand(n), rand(), ..., with the same n every time, you won't just get a not-very-random sequence, you'll actually get the same number back from rand() every time. (Probably not the same number you handed to srand, but the same number back from rand over and over.)
So the best you can do is to cut the deck once, that is, call srand() once, at the beginning of your program, with an n that's reasonably random, so that you'll start at a different random place in the big deck each time your program runs. With rand(), that really is the best you can do.
[P.S. Yes, I know, in real life, when you buy a brand-new deck of cards it's typically in order, not in random order. For the analogy here to work, I'm imagining that each deck you buy from the game shop is in a seemingly random order, but the exact same seemingly-random order as every other deck of cards you buy from that same shop. Sort of like the identically shuffled decks of cards they use in bridge tournaments.]
Addendum: For a very cute demonstration of the fact that for a given PRNG algorithm and a given seed value, you always get the same sequence, see this question (which is about Java, not C, but anyway).
The reason is that srand() sets the initial state of the random generator, and all the values that generator produces are only "random enough" if you don't touch the state yourself in between.
For example you could do:
int getRandomValue()
{
srand(time(0));
return rand();
}
and then if you call that function repeatedly so that time() returns the same values in adjacent calls you just get the same value generated - that's by design.
A simpler solution for using srand() for generating different seeds for application instances run at the same second is as seen.
srand(time(NULL)-getpid());
This method makes your seed very close to random as there is no way to guess at what time your thread started and the pid will be different also.
srand seeds the pseudorandom number generator. If you call it more than once, you will reseed the RNG. And if you call it with the same argument, it will restart the same sequence.
To prove it, if you do something simple like this, you will see the same number printed 100 times:
#include <stdlib.h>
#include <stdio.h>
int main() {
for(int i = 0; i != 100; ++i) {
srand(0);
printf("%d\n", rand());
}
}
It seems that every time rand() runs, it will set a new seed for the next rand().
If srand() runs multiple times, the problem is if the two running happen in one second (the time(NULL) does not change), the next rand() will be the same as the rand() right after the previous srand().
I am using the rgsl library in Rust that wraps functions from the C GSL math libraries. I was using a random number generator function, but I am always getting the same exact value whenever I generate a new random number. I imagine that the number should vary upon each run of the function. Is there something that I am missing? Do I need to set a new random seed each time or such?
extern crate rgsl;
use rgsl::Rng;
fn main() {
rgsl::RngType::env_setup();
let t = rgsl::rng::default();
let r = Rng::new(&t).unwrap()
let val = rgsl::randist::binomial::binomial(&r, 0.01f64, 1u32);
print!("{}",val);
}
The value I keep getting is 1, which seems really high considering the probability of obtaining a 1 is 0.01.
The documentation for env_setup explains everything you need to know:
This function reads the environment variables GSL_RNG_TYPE and GSL_RNG_SEED and uses their values to set the corresponding library variables gsl_rng_default and gsl_rng_default_seed
If you don’t specify a generator for GSL_RNG_TYPE then gsl_rng_mt19937 is used as the default. The initial value of gsl_rng_default_seed is zero.
(Emphasis mine)
Like all software random number generators, this is really an algorithm that produces pseudo random numbers. The algorithm and the initial seed uniquely identify a sequence of these numbers. Since the seed is always the same, the first (and second, third, ...) number in the sequence will always be the same.
So if I want to generate a new series of random numbers, then I need to change the seed each time. However, if I use the rng to generate a set of random seeds, then I will get the same seeds each time.
That's correct.
Other languages don't seem to have this constraint, meaning that the seed can be manually set if desired, but is otherwise is random.
A classical way to do this is to seed your RNG with the current time. This produces an "acceptable" seed for many cases. You can also get access to true random data from the operating system and use that as a seed or mix it in to produce more random data.
Is there no way to do this in Rust?
This is a very different question. If you just want a random number generator in Rust, use the rand crate. This uses techniques like I described above.
You could even do something crazy like using random values from the rand crate to seed your other random number generator. I just assumed that there is some important reason you are using that crate instead of rand.
I came across an article about Car remote entry system at http://auto.howstuffworks.com/remote-entry2.htm In the third bullet, author says,
Both the transmitter and the receiver use the same pseudo-random number generator. When the transmitter sends a 40-bit code, it uses the pseudo-random number generator to pick a new code, which it stores in memory. On the other end, when the receiver receives a valid code, it uses the same pseudo-random number generator to pick a new one. In this way, the transmitter and the receiver are synchronized. The receiver only opens the door if it receives the code it expects.
Is it possible to have two PRNG functions producing same random numbers at the same time?
In PRNG functions, the output of the function is dependent on a 'seed' value, such that the same output will be provided from successive calls given the same seed value. So, yes.
An example (using C#) would be something like:
// Provide the same seed value for both generators:
System.Random r1 = new System.Random(1);
System.Random r2 = new System.Random(1);
// Will output 'True'
Console.WriteLine(r1.Next() == r2.Next());
This is all of course dependent on the random number generator using some sort of deterministic formula to generate its values. If you use a so-called 'true random' number generator that uses properties of entropy or noise in its generation, then it would be very difficult to produce the same values given some input, unless you're able to duplicate the entropic state for both calls into the function - which, of course, would defeat the purpose of using such a generator...
In the case of remote keyless entry systems, they very likely use a PRNG function that is deterministic in order to take advantage of this feature. There are many ICs that provide this sort of functionality to produce random numbers for electronic circuits.
Edit: upon request, here is an example of a non-deterministic random number generator which doesn't rely upon a specified seed value: Quantum Random Number Generator. Of course, as freespace points out in the comments, this is not a pseudorandom number generator, since it generates truly random numbers.
Most PRNGs have an internal state in the form of a seed, which they use to generate their next values. The internal logic goes something like this:
nextNumber = function(seed);
seed = nextNumber;
So every time you generate a new number, the seed is updated. If you give two PRNGs that use the same algorithm the same seed, function(seed) is going to evaluate to the same number (given that they are deterministic, which most are).
Applied to your question directly: the transmitter picks a code, and uses it as a seed. The receiver, after receiving it, uses this to seed its generator. Now the two are aligned, and they will generate the same values.
As Erik and Claudiu have said, ad long as you seed your PRNG with the same value you'll end up with the same output.
An example can be seen when using AES (or any other encryption algorithm) as the basis of your PRNG. As long as you keep using an inputs that match on both device (transmitter and receiver) then the outputs will also match.