For a game that I'm making, where each solar system has x and y coordinates, I'd like to use the coordinates to randomly generate the features of that solar system. The easiest way to do this seems to be to seed a random number generator with two seeds, the x and y coordinates. Is there any way to get one reliable seed from the two seeds, or is there a good PRNG that takes two seeds and produces long periods?
EDIT: I'm aware of binary operations between the two numbers, but I'm trying to find the method that will lead to the fewest collisions. Addition and multiplication will easily result in collisions. But what about XOR?
Why not just combine the numbers in a meaningful way to generate your seed? For example, you could add them, which could be unique enough, or perhaps stack them using a shift or a little multiplication, for example:
seed = (x << 32) + y
seed1 ^ seed2
(where ^ is the bitwise XOR operator)
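A minimal Python sketch of the two combining strategies mentioned above. It assumes the coordinates fit in 32 bits each, and the helper names `combine_seed` and `system_rng` are made up for illustration. It also shows why XOR alone collides: it is symmetric, so (1, 2) and (2, 1) produce the same seed.

```python
import random

def combine_seed(x: int, y: int) -> int:
    """Pack two coordinates into one 64-bit seed.

    Assumption: each coordinate fits in 32 bits; masking maps negative
    values into the same 32-bit range.
    """
    return ((x & 0xFFFFFFFF) << 32) | (y & 0xFFFFFFFF)

def system_rng(x: int, y: int) -> random.Random:
    """Return a generator that is always the same for the same (x, y)."""
    return random.Random(combine_seed(x, y))

# Packing keeps (1, 2) and (2, 1) apart; XOR does not.
packed_collide = combine_seed(1, 2) == combine_seed(2, 1)
xor_collide = (1 ^ 2) == (2 ^ 1)
print(packed_collide, xor_collide)

# The same coordinates always reproduce the same features.
print(system_rng(3, 7).random() == system_rng(3, 7).random())
```

Packing is collision-free only while both coordinates stay inside the assumed 32-bit range.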
A simple Fibonacci PRNG uses 2 seeds, one of which should be odd. This generator uses a modulus which is a power of 10. The period is long and invariable, being 1.5 times the modulus; thus for modulus 1000000 or 10^6 the period is 1,500,000.
The simple pseudocode is:
Input "Enter power for 10^n modulus"; m
Modulus& = 10 ^ m
Input "Enter # of iterations"; n
Input "Enter seed #1"; a
Input "Enter seed #2"; b
For i = 1 To n
    c = a + b
    If c >= Modulus& Then c = c - Modulus&
    Print c
    a = b
    b = c
Next
This generator is very fast and gives an excellent uniform distribution.
Hope this helps.
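For comparison, here is a rough Python sketch of the additive Fibonacci generator described in this answer. The function name and argument order are my own, and it assumes both seeds are smaller than the modulus.

```python
def fib_prng(seed_a, seed_b, power, count):
    """Return `count` outputs of x[n] = (x[n-1] + x[n-2]) mod 10**power.

    Assumption: 0 <= seed_a, seed_b < 10**power.
    """
    modulus = 10 ** power
    a, b = seed_a, seed_b
    outputs = []
    for _ in range(count):
        c = (a + b) % modulus
        a, b = b, c
        outputs.append(c)
    return outputs

print(fib_prng(1, 1, 6, 5))

# With modulus 10^3 and seeds (0, 1) the state is back at (0, 1)
# after 1500 steps, i.e. 1.5 times the modulus.
run = fib_prng(0, 1, 3, 1500)
print(run[-2:])
```

Consecutive outputs of a generator like this are strongly related (each is the sum of the previous two), so it is probably better suited to decoration than to anything statistical.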
Why not use some kind of super simple Fibonacci arithmetic or something like it to produce coordinates directly in base 10? Use the two starting numbers as the seeds. It won't produce random numbers suitable for Monte Carlo or anything like that, but they should be all right for a game. I'm not a programmer or a mathematician and have never tried to code anything, so I couldn't do it for you.
edit - something like f1 = some seed then f2 = some seed and G = (sqrt(5) + 1) / 2....
then some kind of loop. Xn = Xn-1 + Xn-2 mod(G) mod(1) (should produce a decimal between 0 and 1) and then multiply by whatever and take the least significant digits
and perhaps to prevent decay for as long as the numbers need to be produced...
an initial reseeding point at which f1 and f2 will be reseeded based on the generator's own output, which will prevent the sequence of numbers from being described by a closed expression, so...
if counter = initial reseeding point, f1 = Xn and f2 = Xn - something, and the reseeding point is set to ceiling(Xn * some multiplier).
so its period should end when identical values for Xn and Xn - something are re-fed into f1 and f2, which shouldn't happen for at least whatever bit length you are using for the numbers.
.... I mean, that's my best guess...
Is there a reason you want to use the co-ordinates? For example, do you always want a system generated at the same coordinate to always be identical to any other system generated at that particular co-ordinate?
I would suggest using the more classical method of just seeding with the current time and using the results of that to continue generating your pseudo-randomness.
If you're adamant about using the coordinates, I would suggest concatenation (As I believe someone else suggested). At least then you're guaranteed to avoid collisions, assuming that you don't have two systems at the same co-ords.
I use one of George Marsaglia's PRNGs:
http://www.math.uni-bielefeld.de/~sillke/ALGORITHMS/random/marsaglia-c
It explicitly relies on two seeds, so it might be just what you are looking for.
I am working on a requirement where a function f will use a string s as a seed and generate n strings y0..yn. I can easily do this, but I also want the inverse, i.e. f^-1(yi) of any generated string should give me back s.
y0 = f(s) # first time I call f(s) it gives me y0
y1 = f(s) # second time I call f(s) it gives me y1
...
yi = f(s) # ith time I call f(s) it gives me yi
and so on.
The inverse function:
s = f^-1(yi)
How can I find the functions f and f^-1? The other constraint is that these strings cannot be too long, say 20-25 characters at most.
Any suggestions, please?
OK, this will get too channel-coding specific if I cover it in full breadth here, but:
These are mathematical concepts, so let's map strings to numbers and look at them algebraically:
Your 20-character string space, assuming we're just using the 128 common ASCII characters, has (2^7)^20 = 2^140 elements. That's quite a lot of elements.
However, communication technology has a method called scrambling which is a reversible process of mingling the bits in a sequence in a way that spreads the per-bit energy over the whole sequence. That leads to pretty randomly looking bit streams. It's typically implemented using feedback shift registers.
It's possible to find a 2^140-state LFSR that fulfills your scrambling needs, and you can interpret the output of a multiplicative scrambler as the next element in your sequence.
However, please be aware that your problem is a hard one, which I hope I've illustrated sufficiently -- getting something that has good random properties is a harsh thing, and I can't recommend implementing something like that yourself -- it's going to make problems as soon as you need to rely on mathematical properties of your pseudorandom string.
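To make the idea of a reversible scrambler a bit more concrete, here is a toy Python sketch of a multiplicative (self-synchronizing) scrambler and its descrambler. The tap positions (3, 5), the all-zero initial state and the 7-bit ASCII helpers are arbitrary choices for illustration, nowhere near the 2^140-state register discussed above, and the sketch only demonstrates reversibility, not good random properties.

```python
def scramble(bits, taps=(3, 5)):
    """Each output bit is the input bit XOR earlier *output* bits."""
    state = [0] * max(taps)          # previous outputs, most recent first
    out = []
    for b in bits:
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        s = b ^ feedback
        out.append(s)
        state = [s] + state[:-1]
    return out

def descramble(bits, taps=(3, 5)):
    """Undo scramble(): XOR each received bit with earlier *received* bits."""
    state = [0] * max(taps)          # previous received bits, most recent first
    out = []
    for s in bits:
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        out.append(s ^ feedback)
        state = [s] + state[:-1]
    return out

def to_bits(text):
    """7 bits per character, assuming plain ASCII input."""
    return [(ord(ch) >> i) & 1 for ch in text for i in range(6, -1, -1)]

def from_bits(bits):
    chars = []
    for i in range(0, len(bits), 7):
        value = 0
        for bit in bits[i:i + 7]:
            value = (value << 1) | bit
        chars.append(chr(value))
    return "".join(chars)

original = to_bits("seed")
mixed = scramble(original)
print(from_bits(descramble(mixed)))
```

Because the descrambler sees exactly the bits the scrambler produced, both registers hold the same state at every step, which is what makes the process invertible.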
I understand 3D hyperplanes can represent numbers generated by linear congruential generator. But I don't get how it determines the location for each number or point. Especially in a 3D cube? I mean, doesn't a point have to have X, Y, and Z values to be in there?! What if one of the numbers generated is "8"? It's just "8"... how would I know XYZ for that? (I hope you know what I'm talking about... couldn't post an image, sorry :/)
Suppose you generate batches of three pseudo-random numbers in a sequence from your linear congruential generator and use the first number in each batch as the x-dimension, the next as the y-dimension and the last as the z-dimension. You can then plot each batch of three pseudo-random numbers as a point in an x-y-z cube. A similar argument goes for generating batches of n (n > 3) numbers, except you'll plot them in a hypercube.
Assume that you are generating each of those pseudo-random numbers with b bits. There are then 2^(nb) possible points that would have to be generated to fill the (hyper)cube (which will be a very large number, for any typical value of b). However, if the generator has a period of less than 2^(nb) (which will almost always be the case for practical purposes), it won't fill all the available spaces in the cube (or hypercube, if n > 3). It will only fill some of the spaces.
What's more, the filled spaces may be located in planes (or hyperplanes, if n > 3) passing through the (hyper)cube, with spaces in-between the (hyper)planes that represent numbers that the generator will never produce because it repeats its cycle without ever producing such a number. This occurs because the pseudo-random numbers are serially correlated. You can see this behaviour at any dimensionality but the number of (hyper)planes on which the pseudo-random numbers are located reduces as the dimensionality n increases, so the behaviour becomes much more obvious as n gets larger.
This can be a particular problem when using the generated pseudo-random numbers as input to a simulation, because the simulation can then produce output that is more an artefact of the imperfections of the pseudo-random numbers than a consequence of the simulated model.
The Wikipedia article on Linear congruential generator is excellent.
(EDITED TO ADD AN EXAMPLE)
Here is a linear congruential generator (with very poor parameters selected deliberately) implemented in Python. Pseudo-random numbers with an even index are assigned to x values and those with an odd index are assigned to y values.
import matplotlib.pyplot as plt

def lcg(X, a, c, m):
    return (a * X + c) % m

x = []
y = []
X = 0
for i in range(1000):
    X = lcg(X, 43, 5, 256)
    if i % 2 == 0:
        x.append(X)
    else:
        y.append(X)

plt.scatter(x, y)
plt.show()
This script produces the following output:
You can see that the resulting (x,y) pairs are all found on a small number of straight lines and pairs that appear in-between the lines can never be produced by the generator. The same thing can be done in three or more dimensions to see how generators with better parameters than I've used here still produce outputs that sit on lines, planes or hyperplanes in 2, 3, or n-dimensional space.
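If you'd rather see the gaps as a number than as a plot, the following sketch counts how many distinct pairs of consecutive outputs the same toy generator can ever produce. The helper name `distinct_pairs` and the step count are my own choices, and it looks at every overlapping pair, so the pairs plotted above are a subset of these.

```python
def lcg(state, a, c, m):
    return (a * state + c) % m

def distinct_pairs(a, c, m, seed=0, steps=10_000):
    """Collect every (current, next) pair the generator visits."""
    pairs = set()
    prev = seed
    for _ in range(steps):
        nxt = lcg(prev, a, c, m)
        pairs.add((prev, nxt))
        prev = nxt
    return pairs

pairs = distinct_pairs(43, 5, 256)
print(len(pairs), "of", 256 * 256, "possible (x, y) pairs ever occur")
```

Since the next value is fully determined by the current one, there can never be more than m distinct pairs out of m^2 grid positions, no matter how long the generator runs.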
Given an integer range R = [a, b] (where a >= 0 and b <= 100), a bias integer n in R, and some deviation d, what formula can I use to skew a random number generator towards n?
So for example if I had the numbers 1 through 10 inclusively and I don't specify a bias number, then I should in theory have equal chances of randomly drawing one of them.
But if I do give a specific bias number (say, 3), then the number generator should be drawing 3 more frequently than the other numbers.
And if I specify a deviation of, say, 2 in addition to the bias number, then the number generator should be drawing from 1 through 5 more frequently than 6 through 10.
What algorithm can I use to achieve this?
I'm using Ruby if it makes it any easier/harder.
i think the simplest route is to sample from a normal (aka gaussian) distribution with the properties you want, and then transform the result:
generate a normal value with given mean and sd
round to nearest integer
if outside given range (normal can generate values over the entire range from -infinity to +infinity), discard and repeat
if you need to generate a normal from a uniform the simplest transform is "box-muller".
there are some details you may need to worry about. in particular, box muller is limited in range (it doesn't generate extremely unlikely values, ever). so if you give a very narrow range then you will never get the full range of values. other transforms are not as limited - i'd suggest using whatever ruby provides (look for "normal" or "gaussian").
also, be careful to round the value. 2.6 to 3.4 should all become 3, for example. if you simply discard the decimal (so 3.0 to 3.999 become 3) you will be biased.
if you're really concerned with efficiency, and don't want to discard values, you can simply invent something. one way to cheat is to mix a uniform variate with the bias value (so 9/10 times generate the uniform, 1/10 times return 3, say). in some cases, where you only care about average of the sample, that can be sufficient.
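here is a rough sketch of the three steps above (normal draw, round, discard if out of range). it's in python rather than ruby, the function name and parameters are made up for illustration, and it leans on the standard library's `random.gauss` instead of a hand-written box-muller.

```python
import random

def biased_int(lo, hi, bias, sd, rng=random):
    """Draw an integer in [lo, hi], clustered around `bias`.

    Assumptions: lo <= bias <= hi and sd > 0, so the loop ends quickly.
    """
    while True:
        value = round(rng.gauss(bias, sd))   # nearest integer, not truncation
        if lo <= value <= hi:
            return value                      # otherwise discard and repeat

rng = random.Random(0)                        # fixed seed, only for repeatability
samples = [biased_int(1, 10, 3, 2, rng) for _ in range(1000)]
print(samples.count(3), samples.count(10))
```

with a bias of 3 and a deviation of 2, most draws land in 1 through 5 and values near 10 are rare.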
For the first part, "But if I do give a specific bias number (say, 3), then the number generator should be drawing 3 more frequently than the other numbers.", a very easy solution:
import random

def randBias(a, b, biasedNum=None, bias=0):
    # Draw from [a, b + bias]; any draw above b is mapped to the biased number.
    x = random.randint(a, b + bias)
    if x <= b:
        return x
    else:
        return biasedNum
For the second part, I would say it depends on the task. In a case where you need to generate a billion random numbers from the same distribution, I would calculate the probability of the numbers explicitly and use a weighted random number generator (see Random weighted choice).
If you want a unimodal distribution (where the bias is concentrated in one particular value of your range of numbers, for example 3, as you state), then the answer provided by andrew cooke is good, mostly because it allows you to fine-tune the deviation very accurately.
If, however, you wish to have several biases, for instance a trimodal distribution with the numbers a, (a+b)/2 and b more frequent than the others, then you would do well to implement weighted random selection.
A simple algorithm for this was given in a recent question on StackOverflow; its complexity is linear. Using such an algorithm, you would simply maintain a list, initially containing {a, a+1, a+2,..., b-1, b} (so of size b-a+1), and when you want to add a bias towards X, you would add several copies of X to the list, depending on how much you want to bias. Then you pick a random item from the list.
If you want something more efficient, the most efficient method is called the "Alias method", which was implemented very clearly in Python by Denis Bzowy; once your array has been preprocessed, it runs in constant time (but that means that you can't update the biases anymore once you've done the preprocessing, or you would have to reprocess the table).
The downside with both techniques is that, unlike with the Gaussian distribution, biasing towards X will not also bias somewhat towards X-1 and X+1. To simulate this effect you would have to do something such as
def addBias(x, L):
    # concatList is assumed to return the concatenation of its two list arguments
    L = concatList(L, [x, x, x, x, x])
    L = concatList(L, [x+2])
    L = concatList(L, [x+1, x+1])
    L = concatList(L, [x-1, x-1, x-1])
    L = concatList(L, [x-2])
    return L
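The same effect can be had without physically duplicating list items, by keeping one weight per value. The sketch below is only an illustration: `build_weights` and `draw` are hypothetical names, the default bumps mirror the copy counts used in addBias above, and it relies on the standard library's `random.choices`.

```python
import random

def build_weights(lo, hi, bias, bumps=None):
    """Start every value in [lo, hi] at weight 1, then add extra weight
    around `bias`. `bumps` maps an offset from `bias` to the extra weight."""
    if bumps is None:
        bumps = {0: 5, 1: 2, 2: 1, -1: 3, -2: 1}   # same counts as addBias
    weights = {value: 1 for value in range(lo, hi + 1)}
    for offset, extra in bumps.items():
        target = bias + offset
        if lo <= target <= hi:                      # ignore bumps outside the range
            weights[target] += extra
    return weights

def draw(weights, rng=random):
    """Pick one value with probability proportional to its weight."""
    values = list(weights)
    return rng.choices(values, weights=[weights[v] for v in values], k=1)[0]

weights = build_weights(1, 10, 3)
print(weights)
print(draw(weights))
```

Updating a bias is then just a matter of changing one entry in the weights table before the next draw.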
Out of pure interest, I'm curious how to create pi sequentially, so that instead of the number being produced after the process has finished, the digits are displayed as they are generated. If this is possible, then the number could produce itself, and I could implement garbage collection on previously seen digits, thus creating an infinite series. The outcome would just be a digit generated every second, following the digits of pi.
Here's what I've found sifting through the internet:
This is the popular computer-friendly algorithm, the Machin-like algorithm:
def arccot(x, unity)
  xpow = unity / x
  n = 1
  sign = 1
  sum = 0
  loop do
    term = xpow / n
    break if term == 0
    sum += sign * (xpow/n)
    xpow /= x*x
    n += 2
    sign = -sign
  end
  sum
end

def calc_pi(digits = 10000)
  fudge = 10
  unity = 10**(digits+fudge)
  pi = 4*(4*arccot(5, unity) - arccot(239, unity))
  pi / (10**fudge)
end

digits = (ARGV[0] || 10000).to_i
p calc_pi(digits)
To expand on "Moron's" answer: What the Bailey-Borwein-Plouffe formula does for you is that it lets you compute binary (or equivalently hex) digits of pi without computing all of the digits before it. This formula was used to compute the quadrillionth bit of pi ten years ago. It's a 0. (I'm sure that you were on the edge of your seat to find out.)
This is not the same thing as a low-memory, dynamic algorithm to compute the bits or digits of pi, which I think is what you mean by "sequentially". I don't think that anyone knows how to do that in base 10 or in base 2, although the BBP algorithm can be viewed as a partial solution.
Well, some of the iterative formulas for pi are also sort of like a sequential algorithm, in the sense that there is an iteration that produces more digits with each round. However, that's also only a partial solution, because typically the number of digits doubles or triples with each step. So you'd wait with a lot of digits for a while, and then whoosh, a lot more digits come quickly.
In fact, I don't know if there is any low-memory, efficient algorithm to produce digits of any standard irrational number. Even for e, you'd think that the standard infinite series is an efficient formula and that it's low-memory. But it only looks low memory at the beginning, and actually there are also faster algorithms to compute many digits of e.
Perhaps you can work with hexadecimal? David Bailey, Peter Borwein and Simon Plouffe discovered a formula for the nth digit after the decimal, in the hexadecimal expansion of pi.
The formula is:
pi = sum(k = 0 to infinity, (1/16^k) * (4/(8k+1) - 2/(8k+4) - 1/(8k+5) - 1/(8k+6)))
(source: sciencenews.org)
You can read more about it here: http://www.andrews.edu/~calkins/physics/Miracle.pdf
The question of whether such a formula exists for base 10 is still open.
More info: http://www.sciencenews.org/sn_arc98/2_28_98/mathland.htm
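As a rough illustration of how the Bailey-Borwein-Plouffe formula yields a single hexadecimal digit without the ones before it, here is a small Python sketch. The function names are my own, and because it uses ordinary floating point it should only be trusted for fairly small positions.

```python
def _bbp_series(j, n):
    """Fractional part of sum over k >= 0 of 16^(n-k) / (8k + j)."""
    total = 0.0
    for k in range(n + 1):
        denom = 8 * k + j
        # Modular exponentiation keeps only the part that affects the fraction.
        total = (total + pow(16, n - k, denom) / denom) % 1.0
    k = n + 1
    while True:
        term = 16.0 ** (n - k) / (8 * k + j)
        if term < 1e-17:
            break
        total += term
        k += 1
    return total % 1.0

def pi_hex_digit(position):
    """Hex digit of pi at `position` places after the point (position >= 1)."""
    n = position - 1
    frac = (4 * _bbp_series(1, n) - 2 * _bbp_series(4, n)
            - _bbp_series(5, n) - _bbp_series(6, n)) % 1.0
    return "%x" % int(frac * 16)

first_digits = "".join(pi_hex_digit(i) for i in range(1, 9))
print(first_digits)   # pi in hex starts 3.243f6a88...
```

Each call still does work proportional to the position, so this jumps to a digit rather than streaming digits one after another.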
I have been tossing around a conceptual idea for a machine (as in a Turing machine) and I'm wondering if any work has been done on this or related topics.
The idea is a machine that takes an entropy stream and gives out random symbols in any range without losing any entropy.
I'll grant that this is a far from rigorous description, so I'll give an example: Say I have a generator of random symbols in the range of 1 to n and I want to be able to ask for symbols in any given range, first 1 to 12 and then 1 to 1234. (To keep it practicable I'll only consider deterministic machines where, given the same input stream and requests, it will always give the same output.) One necessary constraint is that the output contain at least as much entropy as the input. However, the constraint I'm most interested in is that the machine only reads in as much entropy as it spits out.
E.g. if asked for tokens in the range of 1 to S1, S2, S3, ... Sm it would only consume ceiling(sum(i = 1 to m, log(Si))/log(n)) input tokens.
This question asks about how to do this conversion while satisfying the first constraint but does very badly on the second.
Okay, I'm still not sure that I'm following what you want. It sounds like you want a function
f: I → O
where the inputs are a strongly random (uniform distribution etc) sequence of symbols on an alphabet I={1..n}. (So a series of random natural numbers ≤ n.) The outputs are another sequence on O={1..m} and you want that sequence to have as much entropy as the inputs.
Okay, if I've got this right, first off, if m < n, you can't. If m < n then lg m < lg n, so the entropy of the set of output symbols is smaller.
If m ≥ n, then you can do it trivially by just selecting the ith element of {1..m}. Entropy will be the same, since the number of possible output symbols is the same. They aren't going to be "random" in the sense of being uniformly distributed over the whole set {1..m}, though, because necessarily (pigeonhole principle) some symbols won't be selected at all.
If, on the other hand, you'd be satisfied with having a random sequence on {1..m}, then you can do it by selecting an appropriate pseudorandom number generator using your input from the random source as a seed.
My current pass at it:
By adding the following restriction: you know in advance what the sequence of ranges is {S1, S2, S3, ..., Sn}, then using base translation with a non-constant base might work:
Find Sp = S1 * S2 * S3 * ... * Sn
Extract m = ceiling(log(Sp)/log(n)) terms from the input {R1, R2, R3, ..., Rm}
Find X = R1 + R2*n + R3*n^2 + ... + Rm*n^(m-1)
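Rewrite X as O1 + S1*O2 + S1*S2*O3 + ... + S1*S2*...*S(n-1)*On + Sp*x, where 0 <= Oi < Si (add 1 to each Oi to get a value in the range 1 to Si) and x is whatever is left over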
This might be reformable into a solution that works for one value at a time by pushing x back into the input stream. However, I can't convince myself that even the form with the output ranges known in advance is sound, so...
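For what it's worth, the steps above can be sketched in Python roughly as follows. Everything here is an assumption layered on the outline: the function name is made up, input symbols are taken as 0 to n-1 rather than 1 to n, and the leftover is simply returned instead of being pushed back into the stream. Note that unless n^m happens to be a multiple of Sp, the outputs will not be exactly uniform.

```python
def mixed_radix_convert(inputs, n, ranges):
    """Turn base-n input symbols into one output per requested range.

    inputs: symbols in 0..n-1 (assumed uniform); ranges: [S1, S2, ...].
    Returns (outputs in 1..Si, leftover, number of inputs consumed).
    """
    product = 1
    for size in ranges:
        product *= size

    # Smallest m with n**m >= product, found with integers to avoid log rounding.
    consumed, capacity = 0, 1
    while capacity < product:
        capacity *= n
        consumed += 1
    if len(inputs) < consumed:
        raise ValueError("not enough input symbols")

    # X = R1 + R2*n + R3*n^2 + ...
    x = 0
    for i, symbol in enumerate(inputs[:consumed]):
        x += symbol * n ** i

    # Peel off one mixed-radix digit per requested range.
    outputs = []
    for size in ranges:
        x, digit = divmod(x, size)
        outputs.append(digit + 1)
    return outputs, x, consumed

result = mixed_radix_convert([1, 0, 1, 1], 2, [3, 4])
print(result)   # ([2, 1], 1, 4)
```

In the usage line, four bits encode X = 13, which splits into digit 1 for the range of 3, digit 0 for the range of 4, and a leftover of 1.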