I am searching for the proper PRNG or similar algorithm for my use-case which is a smart contract lottery. Let's say I want to draw a random winner from a list of ticket IDs, my intuition suggests that there may be a PRNG like Xorshift or MT that can be suitably random with a true random seed such that I can generate a set number of derived seeds from a publicly known seed while avoiding the issue of users guessing the next or Nth seed generated in order to anticipate the winning ticket. What algorithm is best fit for this purpose?
Related
I'm trying to know what number the computer will randomly choose. Will I need a specific algorithm? Or do I need artificial intelligence?
You can't predict a true random number, by definition, as it is random. Pseudorandom numbers can be predicted if you know both the algorithm being used and the seed number being provided.
https://www.howtogeek.com/183051/htg-explains-how-computers-generate-random-numbers/
Here's a good article on the different methods of generating random numbers.
I am currently trying two create two specialized versions of the linear congruence generator (pesudo-random number generator) algorithm in my program (and I set the seed to the result of the algorithm every time I generate a random number). I would think that this would make both random number generators more random, but it seems like altering one linear congruence generator implementation ruins the other whenever I'm able to get one of them to be random. Does sharing the seed actually make the program more random in any way, or do both generators behave like one generator when a seed is shared?
We want to generate a uniform random number from the interval [0, 1].
Let's first generate k random booleans (for example by rand()<0.5) and decide according to these on what subinterval [m*2^{-k}, (m+1)*2^{-k}] the number will fall. Then we use one rand() to get the final output as m*2^{-k} + rand()*2^{-k}.
Let's assume we have arbitrary precision.
Will a random number generated this way be 'more random' than the usual rand()?
PS. I guess the subinterval picking amounts to just choosing the binary representation of the output 0. b_1 b_2 b_3... one digit b_i at a time and the final step is adding the representation of rand() to the end of the output.
It depends on the definition of "more random". If you use more random generators, it means more random state, and it means that cycle length will be greater. But cycle length is just one property of random generators. Cycle length of 2^64 usually OK for almost any purpose (the only exception I know is that if you need a lot of different, long sequences, like for some kind of simulation).
However, if you combine two bad random generators, they don't necessarily become better, you have to analyze it. But there are generators, which do work this way. For example, KISS is an example for this: it combines 3, not-too-good generators, and the result is a good generator.
For card shuffling, you'll need a cryptographic RNG. Even a very good, but not cryptographic RNG is inadequate for this purpose. For example, Mersenne Twister, which is a good RNG, is not suitable for secure card shuffling! It is because observing output numbers, it is possible to figure out its internal state, so shuffle result can be predicted.
This can help, but only if you use a different pseudorandom generator for the first and last bits. (It doesn't have to be a different pseudorandom algorithm, just a different seed.)
If you use the same generator, then you will still only be able to construct 2^n different shuffles, where n is the number of bits in the random generator's state.
If you have two generators, each with n bits of state, then you can produce up to a total of 2^(2n) different shuffles.
Tinkering with a random number generator, as you are doing by using only one bit of random space and then calling iteratively, usually weakens its random properties. All RNGs fail some statistical tests for randomness, but you are more likely to get find that a noticeable cycle crops up if you start making many calls and combining them.
I would like to apply a permutation test to a sequence with 4,000,000 elements. To my knowledge, it is infeasible due to a number of possible permutations being ridiculously large (no RNG will generate uniformly distributed values in range {1 ... 4000000!}). I've heard of pseudorandom permutations though, and it sounds like something I need, but I can't comprehend if it's actually a proper replacement for random shuffle in my case.
If you are running a permutation test I presume that you want to generate a random sample from the set of all possible permutations, so that you can test some statistic calculated on the real data against the distribution of statistics calculated on the permuted data.
Algorithms for generating random permutations, such as those described at http://en.wikipedia.org/wiki/Random_permutation, typically use many random numbers, so there is no requirement for any single step of the generation process to need numbers as large as 4000000!. The only worry would be that, since the seed used to generate the random numbers is typically much smaller than 4000000!, not all permutations are possible.
There are other statistical tests which consume very large quantities of pseudo-random numbers (e.g. MCMC), so I wouldn't worry about this if you are using a random number generator which is commonly used for statistical tests. If you are worried about this, you could repeat the test with a cryptographically secure random number generator, such as http://docs.oracle.com/javase/6/docs/api/java/security/SecureRandom.html. This will be slower, so you might need to reduce the number of permutations tested, but it is very unlikely that it has any characteristic which would stand out far enough to affect your test results, because any such characteristic would be a security weakness - it would mean that, given a large quantity of random numbers already generated, you would have a slightly better than random chance of guessing the next number correctly.
Is it possible to generate two different psuedorandom numbers on two separate program runs without using time as the seed? i.e. using the same seed on both runs, is it possible to get two different numbers?
In general, it is not possible to get different pseudorandom numbers using the same seed.
Pseudorandom numbers are, by definition, not truly random numbers and therefore are not composed from sources of entropy. Or, if the numbers do contain some entropy input, the input is not enough to cause the sequence to qualify as statistically "random." (An example of a property that such a sequence should have is runs of 1-bits n-bits-long with probability of 2^(-n), among many other properties of statistical randomness. The definition of statistical randomness becomes more sophisticated (in a sense, more "actual" or close to nature) as mathematics around randomness improves. This is another way of saying that, at any given time, the definitions of statistical randomness are about to become out-dated or obsolete.)
In any case, the vast majority of pseudorandom number generators are, in fact, completely deterministic.
The canonical1 example of a pseudorandom number generator is a linear feedback shift register (LFSR). The LFSR can be implemented as a digital logic circuit containing a register which holds N bits, some gates numbering M, much less than N (e.g., M=1, M=2), usually these are XOR gates, which "feed back" into the register's bits at certain "tap bits." There is a lot about this on the web.
Given the same seed input, the LFSR will always generate the same sequence.
It is possible, using Walsh-Hadamard matrices, or otherwise called "M matrices", additionally called "sequency transform", to sample the output of an LFSR and determine that the sequence is, in fact, from an LFSR and also the structure of its gates and taps, as well as the current register content. From this information all sequence values are known, and it is possible to reverse out the possible seed values which were used as input. For these reasons, LFSRs are not suitable for security purposes such as random tokens for authentication.
By canonical, I am refering to Don Knuth's use of the LFSR as an example, as well as the timeless tradition which has ensued therefrom.
Not sure if you want to generate 2 different random numbers from same seed - or avoid it! But, if you really do want that, then similar to LFSRs, LCGs (Linear Congruential Generators) are often used to generate deterministic psuedo random numbers. You can 'easily' create 2 simple LCGs using different constants, which will generate 2 different psuedo random numbers for the same seed.