When should I write my own random number algorithm instead of using a stock math function? - algorithm

So I am taking a scripting test in Lua, and I am given this question:
Create an algorithm to generate a deck of cards, 1-52. Shuffle the deck of cards (do not use something like array.randomize() ). Then hand out 5 cards to two different players. Being that each card must be dealt to a different player at a time.
Typically I would do something like this to get a random number
local newDeck = {} --assume this array has all 52 cards in a playing deck
math.randomseed( os.time() )
local card = math.random(#newDeck)
...but it seems that the question is specifically asking that I do NOT use a stock math function.
(do not use something like array.randomize())
What would be the advantage to that? I can't imagine that the player of such a game would even notice a difference between random and pseudo random.

If only it were that simple. Most random number generators that are part of a language are linear congruential generators, meaning that the next term J, say, is related to the previous one I by
J = (aI + b) mod c
Where a, b, c are constants.
This means that it is possible to decipher the sequence from a single-digit number of terms! (It's a set of simultaneous equations with a bit of trickery to handle the modulus.)
I'd say that an astute player is bound to notice the pseudo random nature of your sequence and may even game the system by unpicking your generator. You need to use a more sophisticated scheme. (Early attempts include Park-Miller and Bays-Durham; fairly well-known approaches).
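To make the "simultaneous equations" remark concrete, here is a minimal Python sketch (Python 3.8+ for the modular inverse via `pow`). It assumes the attacker already knows the modulus c and sees three consecutive outputs; the constants and function names are illustrative only and are not taken from any particular language's generator. With an unknown modulus more terms are needed.

```python
def lcg(a, b, c, seed):
    """Yield successive terms of J = (a*I + b) mod c, starting after seed."""
    x = seed
    while True:
        x = (a * x + b) % c
        yield x


def recover_multiplier_and_increment(x0, x1, x2, c):
    """Solve for a and b from three consecutive outputs, given the modulus c.

    Subtracting x1 = a*x0 + b from x2 = a*x1 + b (mod c) gives
    x2 - x1 = a * (x1 - x0) (mod c), so a follows from a modular inverse.
    Assumes (x1 - x0) is invertible mod c, which holds for a prime c
    whenever the two outputs differ.
    """
    a = (x2 - x1) * pow((x1 - x0) % c, -1, c) % c
    b = (x1 - a * x0) % c
    return a, b


# Illustrative constants only (an assumption, not a specific library's LCG).
A, B, C = 48271, 12345, 2**31 - 1
gen = lcg(A, B, C, seed=42)
x0, x1, x2 = next(gen), next(gen), next(gen)

a, b = recover_multiplier_and_increment(x0, x1, x2, C)
predicted = (a * x2 + b) % C   # the observer's prediction of the next term
actual = next(gen)
```

Once a and b are known, every later term can be predicted from the last one seen.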

I believe you are welcome to use the built-in random number generator to get random numbers, but prohibited from using any built-in array shufflers that may exist. How can you use an RNG so that each card is equally likely to be in each position?

You could just write something that draws a random card and puts it in the shuffled deck:
function shuf(tab)
  local new = {}
  for k = 1, #tab do
    -- remove a random remaining card (this empties tab) and append it
    new[#new + 1] = table.remove(tab, math.random(#tab))
  end
  return new
end
This approach makes sure you have no duplicates.
I don't really think using a different RNG would matter that much unless you're doing cryptography, or something else that really matters.
Interpreting the question: just don't use a library function written for doing this. But there is a difference between a shuffler and a random number generator, since the latter can return the same value twice while the former can't.
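For comparison, here is one way the whole exercise could look in Python: a Fisher-Yates shuffle that uses the built-in RNG only to pick indices, followed by dealing one card at a time to alternating players. This is a sketch under that reading of the question, not the expected test answer; the function names are made up for illustration.

```python
import random


def fisher_yates_shuffle(deck, rng=random):
    """Shuffle deck in place; every permutation is equally likely."""
    for i in range(len(deck) - 1, 0, -1):
        j = rng.randint(0, i)          # 0 <= j <= i, inclusive on both ends
        deck[i], deck[j] = deck[j], deck[i]


def deal(deck, players=2, cards_each=5):
    """Deal one card at a time, alternating between players."""
    hands = [[] for _ in range(players)]
    for n in range(players * cards_each):
        hands[n % players].append(deck.pop())
    return hands


deck = list(range(1, 53))              # cards numbered 1-52
fisher_yates_shuffle(deck)
hands = deal(deck)
```

Swapping in place avoids the repeated `table.remove` shifting of the draw-and-append approach, so the shuffle runs in linear time.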

Related

Is it possible to reverse a pseudo random number generator?

Is it possible to reverse a pseudo random number generator?
For example, take an array of generated numbers and get the original seed.
If so, how would this be implemented?
This is absolutely possible - you just have to create a PRNG which suits your purposes. It depends on exactly what you need to accomplish - I'd be happy to offer more advice if you describe your situation in more detail.
For general background, here are some resources for inverting a Linear Congruential Generator:
Reversible pseudo-random sequence generator
pseudo random distribution which guarantees all possible permutations of value sequence - C++
And here are some for inverting the Mersenne Twister:
http://www.randombit.net/bitbashing/2009/07/21/inverting_mt19937_tempering.html
http://b10l.com/reversing-the-mersenne-twister-rng-temper-function/
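As a rough illustration of what those two links cover, the sketch below applies and then undoes the MT19937 output tempering step on a single 32-bit word. The shift amounts and masks are the standard MT19937 parameters; the helper names are my own. Recovering the full generator state would additionally require 624 consecutive 32-bit outputs, which this sketch does not attempt.

```python
def temper(y):
    """MT19937 output tempering applied to one 32-bit state word."""
    y ^= y >> 11
    y ^= (y << 7) & 0x9D2C5680
    y ^= (y << 15) & 0xEFC60000
    y ^= y >> 18
    return y & 0xFFFFFFFF


def undo_right_shift_xor(y, shift):
    """Invert y = x ^ (x >> shift); each pass recovers `shift` more bits."""
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ (x >> shift)
    return x


def undo_left_shift_xor(y, shift, mask):
    """Invert y = x ^ ((x << shift) & mask), working up from the low bits."""
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ ((x << shift) & mask)
    return x & 0xFFFFFFFF


def untemper(y):
    """Undo the four tempering steps in reverse order."""
    y = undo_right_shift_xor(y, 18)
    y = undo_left_shift_xor(y, 15, 0xEFC60000)
    y = undo_left_shift_xor(y, 7, 0x9D2C5680)
    y = undo_right_shift_xor(y, 11)
    return y


word = 0xDEADBEEF
round_trip = untemper(temper(word))    # equals word again
```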
In general, no. It should be possible for most generators if you have the full array of numbers. If you don't have all of the numbers or know which numbers you have (do you have the 12th or the 300th?), you can't figure it out at all, because you wouldn't know where to stop.
You would have to know the details of the generator. Decoding a linear congruential generator is going to be different from doing so for a counter-based PRNG, which is going to be different from the Mersenne Twister, which is going to be different from a Fibonacci generator. Plus you would probably need to know the parameters of the generator. If you had all of that AND the equation to generate a number is invertible, then it is possible. As to how, it really depends on the PRNG.
Use Janus, a time-reversible language designed for reversible computing.
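For the simplest case, a linear congruential generator with known parameters, stepping backwards is a one-line formula as long as the multiplier is invertible modulo c. The sketch below assumes the raw state is what gets output (many real generators discard bits, which makes this harder) and uses illustrative constants; it needs Python 3.8+ for `pow(a, -1, c)`.

```python
def lcg_next(x, a, b, c):
    """One forward step: x -> (a*x + b) mod c."""
    return (a * x + b) % c


def lcg_prev(x, a, b, c):
    """One backward step; requires gcd(a, c) == 1 so a has an inverse."""
    return (x - b) * pow(a, -1, c) % c


# Illustrative parameters (assumption): odd multiplier, power-of-two modulus.
A, B, C = 1103515245, 12345, 2**31
seed = 2024

state = seed
for _ in range(10):                    # run the generator forward 10 steps
    state = lcg_next(state, A, B, C)

recovered = state
for _ in range(10):                    # walk the same 10 steps back
    recovered = lcg_prev(recovered, A, B, C)
```

As the answer notes, you also need to know how many steps back to go, otherwise there is no way to tell which earlier state was the seed.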
You could probably do something like create a program that does this (pseudo-code):
x = seed
x = my_Janus_prng(x)
x = reversible_modulus_op(x, N) + offset
Janus can take a program like this and give you its inverse: a program that takes the output number, plus whatever other data it needs to invert everything, and ends with x = seed.
I don't know all the details about Janus or how you could do this, but just thought I would mention it.
Inverting a whole array of outputs, as you propose, is probably the better idea: if the RNG is not an injective function, a single output has no unique value to map back to.
So you want to write a Janus program that outputs an array. The input to the Janus inverted program would then take an array (ideally).

Generate Random Numbers non-algorithmically

I am looking for a satisfying solution of how to generate a random number.
I looked at this, this, this and this.
But am looking for something else.
Most of the posts mention using the pseudo-random function R[n+1] = (a * R[n] + b) % m, or some other mathematical functions.
But weirdly I am not looking for these; I want some non-algorithmic answer. Precisely, an "Interview" answer. Something easy to understand, not to make the interviewer feel that I mugged up a method :) .
For an interview question, a common answer might be to look at the intervals between keystrokes (ask the user to type something), disc seek times or input from a disconnected source -- that will give you thermal electrons from inside your MIC socket or whatever.
LavaRnd uses a digital camera with the lens cap on, which is a version of the last.
Some operating systems allow indirect access to some of this random input, usually through a secure random function; slower but more secure than the usual RNG.
Depending on what job the interview is for, you can talk about testing the raw data to check for entropy, and concentrating the entropy by using a cryptographic hash function like SHA-256.
There are also specialised, and expensive, hardware cards which use various quantum effects to generate true random numbers.
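A small Python sketch of the "gather raw data, then concentrate the entropy with SHA-256" idea follows. It uses clock-read jitter as a stand-in for keystroke or disk timings, which is an assumption made purely so the example runs unattended; how much real entropy such samples carry depends heavily on the machine, and the function names are hypothetical. The last line shows the operating system's secure source, which is normally the practical choice.

```python
import hashlib
import os
import struct
import time


def collect_timing_samples(count=256):
    """Record gaps between successive clock reads, in nanoseconds."""
    samples = []
    last = time.perf_counter_ns()
    for _ in range(count):
        now = time.perf_counter_ns()
        samples.append(now - last)
        last = now
    return samples


def condense(samples):
    """Concentrate whatever entropy the samples hold into 32 bytes."""
    raw = struct.pack('>%dQ' % len(samples), *samples)
    return hashlib.sha256(raw).digest()


seed_bytes = condense(collect_timing_samples())
seed_int = int.from_bytes(seed_bytes, 'big')

# The OS-provided secure source mentioned above, for comparison.
os_bytes = os.urandom(32)
```

Hashing does not create entropy; it only spreads whatever was in the raw samples evenly over the output bits, which is why testing the raw data first matters.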
Take the system time, add a seed, and take the result modulo the upper limit. If the upper limit is less than 0, multiply it by -1 and later subtract the max from the result. Not very strong, but does it meet your requirement?
If you have a UI and only need a couple of random numbers, you can ask the user to move the mouse around, enter a few seeds, or enter a few words, and use those as seeds.

How to implement a pseudo random function

I want to generate a sequence of random numbers that will be used to pick tiles for a "maze". Each maze will have an id and I want to use that id as a seed to a pseudo random function. That way I can generate the same maze over and over given its maze id. Preferably I do not want to use a built-in pseudo random function in a language since I do not have control over the algorithm and it could change from platform to platform. As such, I would like to know:
How should I go about implementing my own pseudo random function?
Is it even feasible to generate platform independent pseudo random numbers?
Yes, it is possible.
Here is an example of such an algorithm (and its use) for noise generation.
Those particular random functions (Noise1, Noise2, Noise3, ..) use input parameters and calculate the pseudo random values from there.
Their output range is from 0.0 to 1.0.
And there are many more out there (Like mentioned in the comments).
UPDATE 2019
Looking back at this answer, a better suited choice would be the below-mentioned Mersenne Twister. Or you could find any implementation of xorshift.
The Mersenne Twister may be a good pick for this. As you can see from the pseudocode on Wikipedia, you can seed the RNG with whatever you prefer to produce identical values for any instance with that seed. In your case, the maze ID or the hash of the maze ID.
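Since xorshift is mentioned as an option, here is a minimal sketch of a hand-written, platform-independent generator seeded by the maze id. It uses Marsaglia's 32-bit xorshift with the 13/17/5 shift triple and only integer operations, so the same id should give the same tiles anywhere. The `maze_tiles` helper, the tile count, and the nonzero fallback seed are assumptions for illustration.

```python
def xorshift32(seed):
    """Yield 32-bit values from xorshift32 (shift triple 13, 17, 5)."""
    state = seed & 0xFFFFFFFF
    if state == 0:
        state = 0x9E3779B9             # arbitrary nonzero fallback: 0 is a fixed point
    while True:
        state ^= (state << 13) & 0xFFFFFFFF
        state ^= state >> 17
        state ^= (state << 5) & 0xFFFFFFFF
        yield state


def maze_tiles(maze_id, width, height, tile_kinds=4):
    """Build a height x width grid of tile indices determined by maze_id."""
    rng = xorshift32(maze_id)
    return [[next(rng) % tile_kinds for _ in range(width)]
            for _ in range(height)]


maze_a = maze_tiles(7, width=5, height=3)
maze_b = maze_tiles(7, width=5, height=3)   # same id, same maze
```

Nearby ids such as 1 and 2 start from very similar states, so hashing the id before seeding, as suggested above, is a sensible extra step.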
If you are using Python, you can use the random module by typing import random at the beginning.
Then, to use it, you type:
var = random.randint(1000, 9999)
This gives var a 4-digit number that can be used as its id.
If you are using another language, there is likely a similar module

When to stop the looping in random number generators?

I'm not sure StackOverflow is the right place to ask this question, because this question is half-programming and half-mathematics. And also really sorry if my question is stupid ^_^
I'm studying Monte Carlo simulations via the "Monte Carlo Methods" book. One of the first things I must learn about is the Random Number Generator. The basic algorithm of an RNG is:
1. Initialize: Draw the seed S0 from the distribution µ on S. Set t = 1.
2. Transition: Set St = f(St−1).
3. Output: Set Ut = g(St).
4. Repeat: Set t = t+ 1 and return to Step 2.
(µ is a probability distribution on the finite set of states S, the input is S0 and the random number we desire is the output Ut)
It is not hard to understand, but the problem here is that I don't see where the random factor lies in the number of repeats. How can we decide when to stop the loop of the RNG? All the examples I have read that implement an RNG loop 100 times, and they return the same values for a specific seed. It is not random at all >_<
Can someone explain what I'm missing here? Any help will be appreciated. Thanks everyone
You can't get a true sequence of random numbers on a computer, without specialized hardware. (Such specialized hardware performs the equivalent of an initial roll of the dice using physics to provide the randomness. Electronic ones often use the electronic noise of specialized diodes at constant temperatures; others use radioactive decay events.)
Without that specialized hardware, what you can generate are pseudorandom numbers which, as you've observed, always generate the same sequence of numbers for the same initial seed. For simple applications, you can often get away with generating an initial seed from the time of invocation, which is effectively random.
And when I say "simple applications," I am excluding cryptography. (Not just that, but especially that.)
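The four steps in the question map directly onto a short Python generator, sketched here under some assumptions: f is the Park-Miller "minimal standard" transition (multiplier 16807, modulus 2^31 - 1) and g scales the state into (0, 1). The loop never "stops" on its own; the caller simply takes as many numbers as it needs. Seeding from the clock is shown only as the simple-application case described above.

```python
import itertools
import time

M = 2**31 - 1


def f(state):
    """Transition: Park-Miller minimal standard step."""
    return (16807 * state) % M


def g(state):
    """Output: map the integer state to a float strictly between 0 and 1."""
    return state / M


def rng(seed):
    """Steps 1-4 from the question: seed once, then transition and output forever."""
    state = seed
    while True:
        state = f(state)
        yield g(state)


first = list(itertools.islice(rng(12345), 5))
again = list(itertools.islice(rng(12345), 5))    # same seed, same five numbers

clock_seed = time.time_ns() % M or 1             # avoid the all-zero state
varied = list(itertools.islice(rng(clock_seed), 5))
```

The randomness a user perceives comes from the choice of seed, not from how many times the loop runs.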
Sometimes when you are trying to debug a simulation, you actually want to have a reproducible stream of "random" numbers, so you might specifically set a stream to start with a specific seed.
For instance in the answer Creating a facet_wrap plot with ggplot2 with different annotations in each plot rcs starts the answer by creating a reproducible set of data using the R code
set.seed(1)
df <- data.frame(x=rnorm(300), y=rnorm(300), cl=gl(3,100)) # create test data
before going on to demonstrate how to answer the actual question.

A function where small changes in input always result in large changes in output

I would like an algorithm for a function that takes n integers and returns one integer. For small changes in the input, the resulting integer should vary greatly. Even though I've taken a number of courses in math, I have not used that knowledge very much and now I need some help...
An important property of this function should be that if it is used with coordinate pairs as input and the result is plotted (as a grayscale value for example) on an image, any repeating patterns should only be visible if the image is very big.
I have experimented with various algorithms for pseudo-random numbers with little success and finally it struck me that md5 almost meets my criteria, except that it is not for numbers (at least not from what I know). That resulted in something like this Python prototype (for n = 2, it could easily be changed to take a list of integers of course):
import hashlib
def uniqnum(x, y):
    # hashlib needs bytes in Python 3, so encode the string first
    return int(hashlib.md5((str(x) + ',' + str(y)).encode()).hexdigest()[-6:], 16)
But obviously it feels wrong to go over strings when both input and output are integers. What would be a good replacement for this implementation (in pseudo-code, python, or whatever language)?
A "hash" is the solution created to solve exactly the problem you are describing. See Wikipedia's article.
Any hash function you use will be nice; hash functions tend to be judged based on these criteria:
The degree to which they prevent collisions (two separate inputs producing the same output) -- a by-product of this is the degree to which the function minimizes outputs that may never be reached from any input.
The uniformity of the distribution of its outputs given a uniformly distributed set of inputs.
The degree to which small changes in the input create large changes in the output.
(see perfect hash function)
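The third criterion (small input changes causing large output changes) can be checked empirically. The sketch below hashes coordinate pairs to a 32-bit integer with SHA-1, then counts how many output bits flip when x changes by 1; for a well-mixing hash the average should sit near half the bits, that is, around 16 of 32. The `hash32` helper is an assumption for illustration, not a function from the question.

```python
import hashlib
import struct


def hash32(*ints):
    """Hash signed 32-bit integers to one unsigned 32-bit integer via SHA-1."""
    data = struct.pack('>%di' % len(ints), *ints)
    return struct.unpack('>I', hashlib.sha1(data).digest()[:4])[0]


def flipped_bits(a, b):
    """Count the bit positions in which a and b differ."""
    return bin(a ^ b).count('1')


# Change x by 1, keep y fixed, and measure how many output bits change.
samples = [flipped_bits(hash32(x, 7), hash32(x + 1, 7)) for x in range(500)]
average = sum(samples) / len(samples)
```

The same loop can be pointed at any candidate function to compare how well it scrambles neighbouring inputs.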
Given how hard it is to create a hash function that maximizes all of these criteria, why not just use one of the most commonly used and relied-on existing hash functions there already are?
From what it seems, turning integers into strings almost seems like another layer of encryption! (which is good for your purposes, I'd assume)
However, your question asks for hash functions that deal specifically with numbers, so here we go.
Hash functions that work over the integers
If you want to borrow already-existing algorithms, you may want to dabble in pseudo-random number generators
One simple one is the middle square method:
Take an n-digit number
Square it
Chop off the outer digits and keep the middle digits, as many as in your original.
i.e.,
1111 => 01234321 => 2343
so, 1111 would be "hashed" to 2343, in the middle square method.
This way isn't that effective, but for a small number of hashes, this has very low collision rates, a uniform distribution, and great chaos-potential (small changes => big changes). But if you have many values, time to look for something else...
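The three steps of the middle square method fit in a few lines of Python. This sketch assumes an even digit count and pads the square with leading zeros to twice that length; the function name is made up for illustration.

```python
def middle_square(n, digits=4):
    """Square n, zero-pad to 2*digits, and keep the middle `digits` digits."""
    squared = str(n * n).zfill(2 * digits)
    start = digits // 2
    return int(squared[start:start + digits])


hashed = middle_square(1111)    # 1111 * 1111 = 1234321 -> "01234321" -> 2343
stuck = middle_square(2500)     # 2500 * 2500 = 6250000 -> "06250000" -> 2500
```

The second call shows why the method breaks down for larger sets of values: 2500 maps to itself, and inputs such as 0 collapse in the same way.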
The grand-daddy of all feasibly efficient and simple random number generators is the Mersenne Twister (http://en.wikipedia.org/wiki/Mersenne_twister). In fact, an implementation is probably out there for every programming language imaginable. Your hash "input" is something that will be called a "seed" in their terminology.
In conclusion
Nothing wrong with string-based hash functions
If you want to stick with the integers and be fancy, try using your number as a seed for a pseudo-random number generator.
Hashing fits your requirements perfectly. If you really don't want to use strings, find a Hash library that will take numbers or binary data. But using strings here looks OK to me.
Bob Jenkins' mix function is a classic choice, at least when n=3.
As others point out, hash functions do exactly what you want. Hashes take bytes - not character strings - and return bytes, and converting between integers and bytes is, of course, simple. Here's an example python function that works on 32 bit integers, and outputs a 32 bit integer:
import hashlib
import struct
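def intsha1(ints):
    # pack the integers as big-endian signed 32-bit values
    data = struct.pack('>%di' % len(ints), *ints)
    digest = hashlib.sha1(data).digest()
    # unpack returns a tuple, so take its single element
    return struct.unpack('>i', digest[:4])[0]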
It can, of course, be easily adapted to work with different length inputs and outputs.
Have a look at this; maybe it will inspire you:
Chaotic system
In chaotic dynamics, small changes vary results greatly.
An x-bit block cipher takes a number and effectively converts it to another number. You could combine (sum/mult?) your input numbers and encipher them, or iteratively encipher each number, similar to CBC or another chained mode. Google 'format preserving encryption'. It is possible to create a 32-bit block cipher (not widely 'available') and use this to create a 'hashed' output. The main difference between a hash and encryption is that a hash is irreversible.
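To show what a 32-bit block "cipher" could look like, here is a toy Feistel network in Python that maps every 32-bit integer to another 32-bit integer and back. It is a sketch for illustration only: the round function (truncated SHA-256), the key, and the round count are all assumptions, and nothing here is meant to be cryptographically secure.

```python
import hashlib


def _round(half, key, i):
    """Toy round function: 16 bits derived from key, round index and half."""
    data = key + bytes([i]) + half.to_bytes(2, 'big')
    return int.from_bytes(hashlib.sha256(data).digest()[:2], 'big')


def feistel32(value, key=b'demo-key', rounds=4):
    """Permute a 32-bit integer with a balanced Feistel network."""
    left, right = value >> 16, value & 0xFFFF
    for i in range(rounds):
        left, right = right, left ^ _round(right, key, i)
    return (left << 16) | right


def feistel32_inverse(value, key=b'demo-key', rounds=4):
    """Run the rounds backwards to recover the original integer."""
    left, right = value >> 16, value & 0xFFFF
    for i in reversed(range(rounds)):
        left, right = right ^ _round(left, key, i), left
    return (left << 16) | right


scrambled = feistel32(1234)
restored = feistel32_inverse(scrambled)    # 1234 again
```

Because the mapping is a permutation, distinct inputs can never collide, which is the practical difference from a hash that the answer points out.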

Resources