About the Mersenne Twister generator's period

I have read that the Mersenne Twister generator has a period of 2¹⁹⁹³⁷ - 1, but I'm confused about how that can be possible. I have seen this implementation of the Mersenne Twister algorithm, and in the first comment it clearly says that it produces values in the range 0 to 2³² - 1. Therefore, after it has produced 2³² - 1 different random numbers, it must necessarily come back to the starting point (the seed), so the period can be at most 2³² - 1.
Also (and tell me if I'm wrong, please), a computer can't hold the number 2¹⁹⁹³⁷ - 1 ≈ 4.3×10⁶⁰⁰¹, at least not in a single block of memory. What am I missing here?

Your confusion stems from thinking that the output number and the internal state of a PRNG have to be the same thing.
Some very old PRNGs did work that way, such as Linear Congruential Generators: in those generators, the current output was fed back in as the state for the next step.
However, most PRNGs, including the Mersenne Twister, work from a much larger internal state, which they update and use to generate a 32-bit number (for the purposes of this answer, it doesn't really matter in which order those two steps happen).
In fact, the Mersenne Twister stores 624 32-bit values, which is 624 × 32 = 19,968 bits, enough state to support the very long period you are wondering about. The values are handled separately (as unsigned 32-bit integers), not treated as one giant number in a single calculation. The 32-bit random number you get from the output is derived from this state, but does not by itself determine the next number.
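To make the separation concrete, here is a compact Python transcription of the standard MT19937 reference algorithm. Note that the state is a 624-word array, updated in bulk by the twist step, while each output is just a "tempered" read of one word:

```python
class MT19937:
    def __init__(self, seed):
        # 624 x 32-bit words of state: 19,968 bits, not one 32-bit value
        self.mt = [0] * 624
        self.index = 624  # force a twist on the first call
        self.mt[0] = seed & 0xFFFFFFFF
        for i in range(1, 624):
            prev = self.mt[i - 1]
            self.mt[i] = (1812433253 * (prev ^ (prev >> 30)) + i) & 0xFFFFFFFF

    def _twist(self):
        # advance the entire state; no output is required to do this
        for i in range(624):
            y = (self.mt[i] & 0x80000000) | (self.mt[(i + 1) % 624] & 0x7FFFFFFF)
            self.mt[i] = self.mt[(i + 397) % 624] ^ (y >> 1)
            if y & 1:
                self.mt[i] ^= 0x9908B0DF
        self.index = 0

    def next_u32(self):
        # each 32-bit output is a tempered read of a single state word
        if self.index >= 624:
            self._twist()
        y = self.mt[self.index]
        self.index += 1
        y ^= y >> 11
        y ^= (y << 7) & 0x9D2C5688
        y ^= (y << 15) & 0xEFC60000
        return y ^ (y >> 18)
```

Two different states can easily produce the same 32-bit output, which is why seeing a repeated output tells you nothing about the period.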

You are wrong here:
Therefore, after it has produced 2³² - 1 different random numbers, it
will necessarily come back to the starting point (the seed)...
It's true that the next number can coincide with one of the numbers already generated, but the internal state of the random number generator will not be the same. (Nobody said that every number in the range 0 to 2³² - 1 will have been generated after 2³² - 1 steps.) So there is no bijection between the generated random number and the internal state of the generator. The random number can be calculated from the state, but you don't even have to do that: you can step the internal state without producing a random number at all.
And of course, the computer doesn't store the whole number sequence; it calculates each random number from the internal state. Consider a sequence like 1, -1, 1, -1, ...: you can generate the Nth element without storing N elements.
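The same point in miniature:

```python
def nth(n):
    # Nth element of 1, -1, 1, -1, ... computed from the index alone;
    # no part of the sequence is ever stored
    return 1 if n % 2 == 0 else -1
```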

Related

Shuffle sequential numbers without a buffer

I am looking for a shuffle algorithm to shuffle a set of sequential numbers without buffering. Another way to state this is that I’m looking for a random sequence of unique numbers that have a given period.
Your typical Fisher–Yates shuffle needs to hold all of the elements it is going to shuffle, so that isn't going to work.
A Linear-Feedback Shift Register (LFSR) does what I want, but only works for periods that are powers-of-two less two. Here is an example of using a 4-bit LFSR to shuffle the numbers 1-14:
Input:   1   2   3   4   5   6   7   8   9  10  11  12  13  14
Output:  8  12  14   7   4  10   5  11   6   3   2   1   9  13
The first row is the input, and the second row the output. What's nice is that the state is very small: just the current index. You can start at any index and get a different run of numbers (starting at 1 yields 8, 12, 14; starting at 9 yields 6, 3, 2), although the sequence is always the same (5 is always followed by 11). If I want a different sequence, I can pick a different generator polynomial.
The limitations of the LFSR are that the periods are always a power of two less two (the min and max always stay in place, thus unshuffled) and there are not enough generator polynomials to allow every possible random sequence.
A block cipher algorithm would work. Every key produces a uniquely shuffled set of numbers. However, all block ciphers (that I know of) have power-of-two block sizes, and usually a fixed or limited choice of block sizes. A block cipher with an arbitrary non-binary block size would be perfect, if such a thing exists.
There are a couple of projects I have that could benefit from such an algorithm. One is for small embedded micros that need to produce a shuffled sequence of numbers with a period larger than the memory they have available (think Arduino Uno needing to shuffle 1 to 100,000).
Does such an algorithm exist? If not, what things might I search for to help me develop such an algorithm? Or is this simply not possible?
Edit 2022-01-30
I have received a lot of good feedback and I need to better explain what I am searching for.
In addition to the Arduino example, where memory is an issue, there is also the shuffle of a large number of records (billions to trillions). The desire is to have a shuffle applied to these records without needing a buffer to hold the shuffle order array, or the time needed to build that array.
I do not need an algorithm that could produce every possible permutation, but a large number of permutations. Something like a typical block cipher in counter mode where each key produces a unique sequence of values.
A Linear Congruential Generator using coefficients to produce the desired sequence period will only produce a single sequence. This is the same problem for a Linear Feedback Shift Register.
Format-Preserving Encryption (FPE), such as AES FFX, shows promise and is where I am currently focusing my attention. Additional feedback welcome.
It is certainly not possible to produce an algorithm which could potentially generate every possible sequence of length N with less than N(log₂N - 1.45) bits of state, because there are N! possible sequences and each state can generate exactly one sequence. If your hypothetical Arduino application could produce every possible sequence of 100,000 numbers, it would require at least 1,516,705 bits of state, a bit more than 185 KiB, which is probably more memory than you want to devote to the problem [Note 1].
That's also a lot more memory than you would need for the shuffle buffer; that's because the PRNG driving the shuffle algorithm doesn't have enough state to come close to generating every possible sequence either. It can't generate more distinct sequences than it has distinct states.
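Note 1's figure is easy to reproduce (the helper name below is mine):

```python
import math

def min_state_bits(n):
    # bits needed to index all n! permutations: log2(n!) via lgamma,
    # as described in Note 1
    return math.lgamma(n + 1) / math.log(2)

print(round(min_state_bits(100_000)))  # -> 1516705
```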
So you have to make some compromise :-)
One simple approach is to start with some parametrisable generator which can produce non-repeating sequences for a large variety of block sizes. Then you choose a block size which is at least as large as your target range but not "too much" larger, say, less than twice as large. Then you select a subrange of the block and start generating numbers: if a generated number falls inside the subrange, you return its offset; if not, you throw it away and generate another. If the generator's range is less than twice the desired range, you will throw away fewer than half of the generated values, so producing the next element in the sequence is amortised O(1). In theory, it might take a long time to generate an individual value, but that's not very likely, and if you use a not-very-good PRNG like a linear congruential generator, you can make it very unlikely indeed by restricting the possible generator parameters.
For LCGs you have a couple of possibilities. You could use a power-of-two modulus, with an odd offset and a multiplier which is 5 mod 8 (and not too far from the square root of the block size), or you could use a prime modulus with an almost arbitrary offset and multiplier. Using a prime modulus is computationally more expensive, but the deficiencies of LCGs are less apparent. Since you don't need to handle arbitrary primes, you can preselect a geometrically-spaced sample and precompute the constants for the efficient division-by-multiplication algorithm for each one.
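A minimal sketch of the power-of-two variant, assuming illustrative (untuned) parameter choices:

```python
def shuffled_range(n, c=12345, seed=0):
    # Visit 0..n-1 in a pseudo-random order with O(1) state: run a
    # full-period LCG over the smallest power-of-two block >= n and
    # reject out-of-range values, as described above.
    m = 1 << (n - 1).bit_length()     # power-of-two block size >= n
    a = (int(m ** 0.5) // 8) * 8 + 5  # multiplier: 5 mod 8, near sqrt(m)
    c |= 1                            # odd increment => full period
    x = seed % m
    for _ in range(m):                # one period visits every state once
        x = (a * x + c) % m
        if x < n:                     # rejection step
            yield x

print(list(shuffled_range(14)))       # some permutation of 0..13
```

Because m is less than 2n, fewer than half of the iterations are rejected, which is where the amortised O(1) claim comes from.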
Since you're free to use any subrange of the generator's range, you have an additional potential parameter: the offset of the start of the subrange. (Or even offsets, since the subrange doesn't need to be contiguous.) You can also increase the apparent randomness by doing any bijective transformation (XOR/rotates are good, if you're using a power-of-two block size.)
Depending on your application, there are known algorithms to produce block ciphers for subword bit lengths [Note 2], which gives you another possible way to increase randomness and/or add some more bits to the generator state.
Notes
The approximation for the minimum number of state bits comes directly from Stirling's approximation for N!, but I computed the exact number of bits using the commonly available lgamma function.
With about 30 seconds of googling, I found this paper on researchgate.net; I'm far from knowledgeable enough in crypto to offer an opinion, but it looks credible; also, there are references to other algorithms in its footnotes.

1024-bit pseudo-random generator in Verilog for FPGA

I want to generate random vectors of length 1024 in Verilog. I have looked at certain implementations like Tausworthe generators and Mersenne Twisters.
Most Mersenne Twisters have 32-bit or 64-bit outputs. I want to simulate an error pattern of 1024 bits where each bit is set with some probability p. So I generate a 32-bit uniformly distributed random number using the Mersenne Twister; this number lies in the range 0 to 2^32 - 1. I then set a bit of my 1024-bit vector to 1 if the generated 32-bit value is less than p·(2^32 - 1), and to 0 otherwise. Basically, each 32-bit number is used to generate one bit of the 1024-bit vector according to the desired probability distribution.
The above method implies that I need 1024 clock cycles to generate each 1024-bit vector. Is there any other way which allows me to do this more quickly? I understand that I could use several instances of the Mersenne Twister in parallel with different seed values, but I was afraid that those numbers would not be truly random and that there would be collisions. Is there something I am doing wrong or missing? I would really appreciate your help.
Okay, so I read a bit about Mersenne Twisters in general on Wikipedia. I admit I didn't fully understand all of it, but I got this: given a seed value (to initialise the array), the module generates 32-bit random numbers.
Now, from your description above, it takes one cycle to compute one random number.
So your problem basically boils down to mathematics rather than being about Verilog as such. I will try to explain the math of it as best I can.
You have a 32-bit uniformly distributed random number, so the probability of any one bit being high or low is exactly (well, close to, since it's pseudo-random) 0.5.
Let's forget that this is a pseudo-random generator, because that is the best you are going to get (so let's consider it our ideal).
Even if we generate 5 numbers one after the other, each one is still uniformly distributed and independent of the others. So if we concatenate these five numbers, we get a 160-bit uniformly random number.
If it's still not clear, consider it this way.
I'm gonna break the problem down. Let's say we have a 4-bit random number generator (RNG), and we require 16 bit random numbers.
Each output of the RNG would be a hex digit with a uniform probability distribution. So the probability of getting some particular digit (say... A) is 1/16. Now I want to make a 4 digit Hex number (say... 0xA019).
Probability of getting A as the Most Significant digit = 1/16
Probability of getting 0 as digit number 2 = 1/16
Probability of getting 1 as digit number 3 = 1/16
Probability of getting 9 as the Least Significant digit = 1/16
So the probability of getting 0xA019 is 1/2¹⁶. In fact, the probability of getting any particular four-digit hex number is exactly the same. Now extend the same logic to a base-2³² number system with 32-digit numbers as the required output, and you have your solution.
So we see that we can make do with just 32 repetitions of the Mersenne Twister to get the 1024-bit output (that would take 32 cycles, still somewhat slow). What you could also do is synthesise 32 twisters in parallel (that would give you the output in one stroke, but would be very heavy on the FPGA in terms of area and power constraints).
The best way to go about this would be to aim for some middle ground (maybe 4 parallel twisters running for 8 cycles). This is really a question of the end application of the module and the power and timing constraints that you need for that application.
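The concatenation step, sketched in software terms (CPython's random module happens to be an MT19937, which makes for a convenient stand-in):

```python
import random

rng = random.Random(42)   # CPython's Random is the Mersenne Twister (MT19937)

def rand1024():
    # concatenate 32 independent 32-bit draws into one uniform 1024-bit value
    x = 0
    for _ in range(32):
        x = (x << 32) | rng.getrandbits(32)
    return x

assert rand1024() < 1 << 1024
```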
As for different seed values: most PRNGs have provision for an input seed precisely so you can vary the output stream, and from what I read about the Mersenne Twister, it is the same.
Hope that answers your question.

Generating non-colliding random numbers from the combination of 2 numbers in a set?

I have a set of 64-bit unsigned integers with length >= 2. I pick 2 random integers, a, b from that set. I apply a deterministic operation to combine a and b into different 64-bit unsigned integers, c_1, c_2, c_3, etc. I add those c_ns to the set. I repeat that process.
What procedure can I use to guarantee that a new c will practically never collide with an existing value in the set, even after millions of steps?
Since you're generating multiple 64-bit values from a pair of 64-bit numbers, I would suggest that you select two numbers at random and use them to initialize a 64-bit xorshift random number generator with 128 bits of state. See https://en.wikipedia.org/wiki/Xorshift#xorshift.2B for an example.
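For illustration, a direct Python transcription of the xorshift128+ step from the linked article, seeded with the two numbers picked from the set:

```python
MASK64 = (1 << 64) - 1

def xorshift128plus(a, b):
    # generator seeded with two 64-bit values (must not both be zero)
    s0, s1 = a & MASK64, b & MASK64
    while True:
        x, y = s0, s1
        s0 = y
        x = (x ^ (x << 23)) & MASK64
        s1 = x ^ y ^ (x >> 17) ^ (y >> 26)
        yield (s1 + y) & MASK64

gen = xorshift128plus(0x0123456789ABCDEF, 0x0FEDCBA987654321)
c1, c2, c3 = next(gen), next(gen), next(gen)
```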
However, it's rather difficult to predict the collision probability when you're using multiple random number generators. With a single PRNG, the rule of thumb is that you'll have a 50% chance of a collision after generating the square root of the range. For example, if you were generating 32-bit random numbers, your collision probability reaches 50% after about 70,000 numbers generated. Square root of 2^32 is 65,536.
With a single 64-bit PRNG, you could generate more than a billion random numbers without too much worry about collisions. In your case, you're picking two numbers from a potentially small pool, then initializing a PRNG and generating a relatively small number of values that you add back to the pool. I don't know how to calculate the collision probability in that case.
Note, however, that whatever the probability of collision, the possibility of collision always exists. That "one in a billion" chance does in fact occur: on average once every billion times you run the program. You're much better off saving your output numbers in a hash set or other data structure that won't allow you to store duplicates.
I think the best you can do, absent any other constraints, is to use a pseudo-random function that maps two 64-bit integers to a 64-bit integer. Depending on whether the order of a and b matters for your problem or not (i.e. whether (3, 5) should map to something other than (5, 3)), you shouldn't or should sort them first.
The natural choice for a pseudo-random function that maps a larger input to a smaller output is a hash function. You can select any hash function that produces an output of at least 64 bits and truncate it. (My favourite in this case would be SipHash with an arbitrary fixed key; it is fast and has public-domain implementations in many languages, but you can use whatever is available.)
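A sketch of that idea; Python's standard library doesn't expose SipHash directly, so keyed BLAKE2b stands in for it here (a substitution, not the answer's exact choice):

```python
import hashlib
import struct

def combine(a, b, order_matters=True):
    # map two unsigned 64-bit ints to one 64-bit int by truncating a
    # keyed hash; sort first if (a, b) and (b, a) should map identically
    if not order_matters:
        a, b = min(a, b), max(a, b)
    h = hashlib.blake2b(struct.pack("<QQ", a, b),
                        digest_size=8, key=b"arbitrary fixed key")
    return int.from_bytes(h.digest(), "little")
```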
The expected amount of numbers you can generate before you get a collision is determined by the birthday bound, as you are essentially selecting values at random. The linked article contains a table for the probabilities for 64-bit values. As an example, if you generate about 6 million entries, you have a collision probability of one in a million.
I don't think it is possible to beat this approach in the general case, as you could encode an arbitrary amount of information in the sequence of elements you combine while the amount of information in the output value is fixed to 64-bit. Thus you have to consider collisions, and a random function spreads out the probability evenly among all possible sequences.

Multiple independent pseudo random number generation in hardware (Verilog or VHDL)

I need pseudo random numbers generated for hardware (either in VHDL or Verilog) that meet the following criteria.
- Each number is 1-bit (doesn't have to be, but that would complicate things more)
- The N pseudo random numbers cannot be correlated with each other.
- The N pseudo random numbers need to be generated at the same time (every clock edge).
I understand that the following will not work:
- Using N different seeds for a given polynomial - they will simply be shifted versions of each other
- Using N different polynomials for a given length LFSR - not practical since N can be as large as 64, and I don't know what length LFSR would give 64 different tap combinations; too huge, if possible at all
If using LFSRs, the lengths do not need to be identical. For a small N, say 4, I thought about using four different mutually coprime lengths (to maximize the combined period), e.g., 15, 17, 19, 23, but again, for a large N it gets very messy. Let's say something on the order of 2^16 gives sufficient length for an LFSR.
Is there an elegant way of handling this problem? By elegant, I mean not having to code N different unique modules (15, 17, 19, 23 above as an example). Using N different instances of Mersenne Twister, with different seeds? I do not have unlimited amount of hardware resources (FF, LUT, BRAM), but for the sake of this discussion it's probably best to ignore resource issues.
Thank you in advance.
One option is to use a cryptographic hash. These are typically wide (64-256 bits), and good hashes have the property that a single-bit input change propagates to all output bits in an unpredictable fashion. Run an incrementing counter into the hash, and start the counter at a random value.
The GHASH used in AES-GCM is hardware-friendly and can generate new output values every clock.
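The shape of the idea, sketched in Python with SHA-256 standing in for a hardware-friendly hash like GHASH:

```python
import hashlib
import secrets

def parallel_bits(n):
    # yield n uncorrelated pseudo-random bits per step by hashing an
    # incrementing counter that starts at a random value (n <= 256 here)
    ctr = secrets.randbits(64)
    while True:
        digest = hashlib.sha256(ctr.to_bytes(16, "big")).digest()
        word = int.from_bytes(digest, "big")
        yield [(word >> i) & 1 for i in range(n)]
        ctr += 1

stream = parallel_bits(64)
print(next(stream))   # 64 fresh bits, one "clock edge"
```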

Given a true random number generator which outputs either a 1 or 0 per call, how do you use this to pick a number from an arbitrary range?

If I have a true random number generator (TRNG) which can give me either a 0 or a 1 each time I call it, then it is trivial to generate any number in a range whose length is a power of 2. For example, if I wanted to generate a random number between 0 and 63, I would simply poll the TRNG 6 times, for a maximum value of 111111 and a minimum value of 000000. The problem is when I want a number in a range whose length is not a power of 2. Say I wanted to simulate the roll of a die: I would need a range between 1 and 6, with equal weighting. Clearly, three bits would be enough to store the result, but polling the TRNG 3 times introduces two erroneous values. We could simply map those onto valid faces, but then some sides of the die would have different odds of being rolled than others.
My question is how one most effectively deals with this.
The easiest way to get a perfectly accurate result is by rejection sampling. For example, generate a random value from 1 to 8 (3 bits), rejecting and generating a new value (3 new bits) whenever you get a 7 or 8. Do this in a loop.
You can get arbitrarily close to accurate just by generating a large number of bits, doing the mod 6, and living with the bias. In cases like 32-bit values mod 6, the bias will be so small that it will be almost impossible to detect, even after simulating millions of rolls.
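Both suggestions in a few lines, with secrets.randbits standing in for the TRNG:

```python
import secrets

def roll_exact():
    # rejection sampling: 3 bits give a value in 1..8, retry on 7 or 8
    while True:
        r = secrets.randbits(3) + 1
        if r <= 6:
            return r

def roll_biased():
    # 32 bits mod 6: bias on the order of 10^-9, usually undetectable
    return secrets.randbits(32) % 6 + 1
```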
If you want a number in the range 0 .. R - 1, pick the least n such that R ≤ 2^n. Then generate a random number r in the range 0 .. 2^n - 1 using your method. If r ≥ R, discard it and generate again. The probability that a single attempt fails in this manner is at most 1/2, so you will get a number in your desired range within two attempts on average. This method is unbiased and does not impair the randomness of the result in any fashion.
As you've observed, you can repeatedly double the range of possible random values by concatenating bits, but if you start with an integer number of bits (like zero), you cannot obtain any range with prime factors other than two.
There are several ways out, none of which is ideal:
1. Simply produce the first reachable range which is larger than what you need, and discard results and start again whenever the random value falls outside the desired range.
2. Produce a very large range, distribute it as evenly as possible amongst your desired outputs, and overlook the small bias that you get.
3. Produce a very large range, distribute what you can evenly amongst your desired outputs, and if you hit one of the (proportionally few) values which fall outside the evenly distributed set, discard the result and start again.
4. As with 3, but recycle the parts of the value that you did not convert into a result.
The first option isn't always a good idea. Numbers 2 and 3 are pretty common. If your random bits are cheap then 3 is normally the fastest solution with a fairly small chance of repeating often.
For the last option: suppose you have built a random value r in [0,31], and from it you need to produce a result x in [0,5]. Values of r in [0,29] can be mapped to the required output without any bias using mod 6, while values in [30,31] have to be dropped on the floor to avoid bias.
In the former case, you produce a valid result x, but there's some randomness left over: which of the ranges [0,5], [6,11], etc. the value r fell into (five possibilities in this case). You can use this leftover to start building your new r for the next random value you'll need to produce.
In the latter case, you don't get any x and will have to try again, but you don't have to throw away all of r: the specific value picked from the illegal range [30,31] is left over and free to be used as a starting value for your next r (two possible values).
The random range you have from that point on needn't be a power of two. That doesn't mean it'll magically reach the range you need at the time, but it does mean you can minimise what you throw away.
The larger you make r, the more bits you may need to throw away if it overflows, but the smaller the chances of that happening. Adding one bit halves your risk but increases the cost only linearly, so it's best to use the largest r you can handle.
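Option 4 plus rejection is, in spirit, Lumbroso's "Fast Dice Roller"; a Python sketch, with secrets.randbits standing in for the TRNG:

```python
import secrets

def fast_dice_roller(n, bit=lambda: secrets.randbits(1)):
    # uniform integer in [0, n) from single random bits, recycling the
    # leftover entropy of rejected values instead of discarding it
    v, c = 1, 0                        # invariant: c is uniform in [0, v)
    while True:
        v, c = 2 * v, 2 * c + bit()    # append one random bit
        if v >= n:
            if c < n:
                return c               # success: c is uniform in [0, n)
            v, c = v - n, c - n        # recycle the out-of-range remainder

print(fast_dice_roller(6) + 1)         # an unbiased die roll
```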
