There are similar questions, but most of them are too language-specific. I'm looking for a general solution. Given some way to produce k random bytes and a number n, I need to produce a random number in the range 1...n (inclusive).
What I've come up with so far:
To determine the number of bytes needed to represent n, calculate
f(n) := ceiling(ln(n) / (8*ln(2))) = ceiling(0.180337 * ln(n))
Get a random number in range 1...2^(8*f(n)) for 0-indexed bytes b[i]:
r:=0
for i=0 to k-1:
r = r + b[i] * 2^(8*i)
end for
To scale to 1...n without bias:
R(n,r) := ceiling(n * (r / 256^f(n)))
But I'm not sure whether this creates a bias or some subtle off-by-one error. Could you check whether this is sound and/or make suggestions for improvements? Is this the right way to do this?
In answers, please assume that there are no modular bit-twiddling operations available, but you can assume arbitrary-precision arithmetic. (I'm programming in Scheme.)
Edit: There is definitely something wrong with my approach, because in my tests rolling a die yielded a few cases of 0! But where is the error?
This is similar to what you'd do if you wanted to generate a number from 1 to n from a random floating-point number r with 0 <= r < 1:
result = floor(r * n) + 1
If you have arbitrary-precision arithmetic, you can compute r by dividing your k-byte integer by 256^k, which is the maximum value expressible in k bytes, + 1. That keeps r strictly below 1, so the result never exceeds n, and taking the floor (rather than the ceiling) and adding 1 means the result is never 0.
So if you have 4 bytes 87 6F BD 4A, and n = 200:
floor((0x876FBD4A / 0x100000000) * 200) + 1 = 106
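The scaling above can be sketched with exact integer arithmetic, which avoids floating-point rounding entirely. This is only an illustration: the function name scale_bytes_to_range and the big-endian byte order are my assumptions, not something from the question. Note that the result is exactly uniform only when n divides 256^k; otherwise some outputs are hit by one more input value than others, a bias that shrinks as k grows.

```python
def scale_bytes_to_range(random_bytes: bytes, n: int) -> int:
    """Map k random bytes to an integer in 1..n (hypothetical helper)."""
    if n < 1 or len(random_bytes) == 0:
        raise ValueError("need n >= 1 and at least one byte")
    k = len(random_bytes)
    # Interpret the bytes as one integer r with 0 <= r < 256**k.
    r = int.from_bytes(random_bytes, "big")
    # floor(r * n / 256**k) lies in 0..n-1; adding 1 shifts it to 1..n.
    return (r * n) // (256 ** k) + 1


# The worked example from the answer: bytes 87 6F BD 4A with n = 200.
print(scale_bytes_to_range(bytes([0x87, 0x6F, 0xBD, 0x4A]), 200))
```

Because r is at most 256^k - 1, the quotient can never reach n, so neither 0 nor n + 1 can come out.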
Suppose all I have is a routine that generates 0 and 1 randomly with equal probability. How can I use it to produce a random number between 1 and n? I can't use any other random function; I need to use my routine to achieve the goal. Any pointers would be helpful.
1. Let m = ceil(log2(n)); that is, m is the number of bits needed to represent every integer from 0 to n - 1.
2. Generate a random bitstring of length m, and interpret it as a nonnegative integer k.
3. If k >= n, go back to step 2. Otherwise your random number is k + 1.
This is a form of rejection sampling that will give you a uniform distribution on the random integer k + 1 in the range [1, n].
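Here is a minimal Python sketch of that rejection loop. The function random_bit is a stand-in I made up for the fair 0/1 routine from the question (it just wraps the standard library here), and random_1_to_n is likewise a hypothetical name.

```python
import random


def random_bit() -> int:
    """Stand-in for the question's fair 0/1 routine (an assumption)."""
    return random.getrandbits(1)


def random_1_to_n(n: int) -> int:
    """Return a uniform integer in 1..n using only random_bit()."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # (n - 1).bit_length() equals ceil(log2(n)) for every n >= 1.
    m = (n - 1).bit_length()
    while True:
        k = 0
        for _ in range(m):
            k = (k << 1) | random_bit()  # append one random bit
        if k < n:  # accept; otherwise draw a fresh bitstring
            return k + 1


print([random_1_to_n(6) for _ in range(10)])
```

Since 2^m < 2n, each attempt is accepted with probability greater than 1/2, so the expected number of attempts is below 2.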
Simply use the formula
rand()*(n-1)+1
A pseudo-random number generator is a function of no arguments that returns, when called repeatedly, a sequence of values that appears to be random and uniformly distributed over a range {0,...,N - 1}, where N is typically 2^k and k is the number of bits in a computer word (e.g., 2^32 or 2^64 for many computers). A lagged Fibonacci generator for the range {0,...,N - 1} returns the values x_n = (x_(n-r) + x_(n-s)) mod N, where r and s are integer constants of the algorithm (0 < r < s) and the initial "seed" values x_0, x_1,...,x_(s-1) are determined in some other way. The values r = 5 and s = 17 are recommended because they result in a sequence that does not repeat for a very long time. Explain how to represent a lagged Fibonacci generator using list abstract data types. What representation would be most appropriate?
Pseudo-random number generators have state, which in the case of your example is a list of the 17 most recent values.
You can implement this using a simple array and an insert method. Every time you call insert, each element is shifted by one position: the oldest element is discarded and the newly generated random number is inserted at the other end.
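As a rough sketch of that idea, a fixed-length queue does the shifting for you. The class name LaggedFibonacci and its parameters are my own choices for illustration, and the seed values in the usage line are arbitrary; a real generator needs a properly chosen seed.

```python
from collections import deque


class LaggedFibonacci:
    """Sketch of x_n = (x_(n-r) + x_(n-s)) mod N with a queue of s values."""

    def __init__(self, seed, r=5, s=17, modulus=2**32):
        if not 0 < r < s or len(seed) != s:
            raise ValueError("need 0 < r < s and exactly s seed values")
        self.r = r
        self.modulus = modulus
        # Oldest value x_(n-s) sits at index 0, newest x_(n-1) at index -1.
        self.state = deque(seed, maxlen=s)

    def next(self):
        value = (self.state[-self.r] + self.state[0]) % self.modulus
        # Appending to a full deque with maxlen drops the oldest element.
        self.state.append(value)
        return value


gen = LaggedFibonacci(seed=list(range(17)))
print([gen.next() for _ in range(5)])
```

A circular buffer with an index would work just as well; the queue is simply the shortest way to express "discard the oldest, insert the newest".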
Let's say I have N integers, where N can get huge, but each int is guaranteed to be between 0 and some cap M, where M fits easily in a signed 32-bit field.
If I want to compute the average of these N integers, I can't always just sum and divide them all in the same signed 32-bit space - the numerator carries a risk of overflow if N is too large. One solution to this problem is to just use 64-bit fields for the computation, to hold for larger N, but this solution doesn't scale - If M were a large 64-bit integer instead, the same problem would arise.
Does anyone know of an algorithm (preferably O(N)) that can compute the average of a list of positive integers in the same bit-space? Without doing something cheap like using two integers to simulate a larger one.
Supposing you know N in advance, you can keep two variables: one is the sum so far divided by N (the answer), and the other is the remainder of that division.
For example, in C++:
int ans = 0, remainder = 0;
for (int i = 0; i < N; i++) {
    remainder += input[i];  // update remainder so far
    ans += remainder / N;   // move what we can from remainder into ans
    remainder %= N;         // calculate what's left of remainder
}
At the end of the loop, the answer is found in ans, with a remainder in remainder (if you need a rounding method other than truncation).
This example works as long as M + N fits in a 32-bit int, since remainder is always less than N before an input of at most M is added to it.
Note that this should work for positive and negative integers, because in C++, the / operator is the division operator, and % is actually a remainder operator (not really a modulo operator).
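For anyone who wants to trace the quotient/remainder idea, here is the same loop as a small Python sketch. Python integers never overflow, so this only illustrates the bookkeeping; the function name bounded_average is made up, and I am assuming non-negative inputs as in the question.

```python
def bounded_average(values):
    """Return (quotient, remainder) of sum(values) divided by len(values).

    No intermediate value exceeds max(values) + len(values) - 1.
    """
    n = len(values)
    if n == 0:
        raise ValueError("cannot average an empty list")
    quotient, remainder = 0, 0
    for v in values:
        remainder += v               # at most (n - 1) + max input
        quotient += remainder // n   # move whole multiples of n across
        remainder %= n               # keep only what is left over
    return quotient, remainder


# 1 + 2 + 3 + 4 = 10 = 2 * 4 + 2
print(bounded_average([1, 2, 3, 4]))
```

The exact average is quotient + remainder / n, so the remainder tells you how to round.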
You can calculate a running average. If you have the average A of N elements and you add another element E, the new average is (A*N+E)/(N+1). Because division distributes over addition, this is equivalent to (A*N)/(N+1) + E/(N+1). And since A*N may overflow, you can regroup the first term as A*(N/(N+1)). Note that N/(N+1) and E/(N+1) must be computed with real (floating-point or rational) division; with truncating integer division, N/(N+1) is always 0.
So the algorithm is:
n = 0
avg = 0
for each i in list
    avg = avg*(n/(n+1)) + i/(n+1)
    n = n+1
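A direct Python transcription of that loop, as a sketch only. It uses floating-point division, so the result carries rounding error that can grow with the length of the list; the name running_average is my own.

```python
def running_average(values):
    """Incrementally average values without ever forming their full sum."""
    avg = 0.0
    n = 0
    for v in values:
        # True division: n / (n + 1) is a fraction in [0, 1).
        avg = avg * (n / (n + 1)) + v / (n + 1)
        n += 1
    return avg


print(running_average([1, 2, 3, 4]))
```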
I want to get N random numbers whose sum is a value.
For example, let's suppose I want 5 random numbers that sum to 1.
Then, a valid possibility is:
0.2 0.2 0.2 0.2 0.2
Another possibility is:
0.8 0.1 0.03 0.03 0.04
And so on. I need this to create the membership matrix for Fuzzy C-means.
Short Answer:
Just generate N random numbers, compute their sum, divide each one by
the sum and multiply by M.
Longer Answer:
The above solution does not yield a uniform distribution which might be an issue depending on what these random numbers are used for.
Another method proposed by Matti Virkkunen:
Generate N-1 random numbers between 0 and 1, add the numbers 0 and 1
themselves to the list, sort them, and take the differences of
adjacent numbers.
This yields a uniform distribution as is explained here
Generate N-1 random numbers between 0 and 1, add the numbers 0 and 1 themselves to the list, sort them, and take the differences of adjacent numbers.
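A short Python sketch of this cut-point method, assuming the standard random module as the source of uniform numbers. The name random_parts and the total parameter (for sums other than 1) are my additions.

```python
import random


def random_parts(n, total=1.0, rng=random):
    """Split `total` into n non-negative parts via sorted cut points."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # n - 1 uniform cut points in [0, 1), then the endpoints 0 and 1.
    cuts = sorted(rng.random() for _ in range(n - 1))
    points = [0.0] + cuts + [1.0]
    # Gaps between adjacent points sum to 1; scale them to the target.
    return [(b - a) * total for a, b in zip(points, points[1:])]


parts = random_parts(5)
print(parts, sum(parts))
```

The printed sum can differ from the target by a tiny floating-point rounding error.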
I think it is worth noting that the currently accepted answer does not give a uniform distribution:
"Just generate N random numbers,
compute their sum, divide each one by
the sum"
To see this let's look at the case N=2 and M=1. This is a trivial case, since we can generate a list [x,1-x], by choosing x uniformly in the range (0,1).
The proposed solution generates a pair [x/(x+y), y/(x+y)] where x and y are uniform in (0,1). To analyze this we choose some z such that 0 < z < 0.5 and compute the probability that the first element is smaller than z. This probability should be z if the distribution were uniform. However, we get
Prob(x/(x+y) < z) = Prob(x < z(x+y)) = Prob(x(1-z) < zy) = Prob(x < y(z/(1-z))) = z/(2-2z).
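If you'd rather check that formula empirically, a quick simulation along these lines should agree with it. This is just a sanity-check sketch; the function name, trial count and seed are arbitrary choices of mine.

```python
import random


def estimate_prob_first_below(z, trials=200_000, seed=12345):
    """Estimate Prob(x/(x+y) < z) for independent uniform x and y."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x, y = rng.random(), rng.random()
        # Same event as x/(x+y) < z, written without a division.
        if x < z * (x + y):
            hits += 1
    return hits / trials


z = 0.25
# Simulated probability next to the closed form z/(2-2z).
print(estimate_prob_first_below(z), z / (2 - 2 * z))
```

For z = 0.25 the closed form gives about 0.167 rather than 0.25, which is the non-uniformity being pointed out.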
I did some quick calculations, and it appears that the only solution so far that results in a uniform distribution was proposed by Matti Virkkunen:
"Generate N-1 random numbers between 0 and 1, add the numbers 0 and 1 themselves to the list, sort them, and take the differences of adjacent numbers."
Unfortunately, a number of the answers here are incorrect if you'd like uniformly random numbers. The easiest (and fastest in many languages) solution that guarantees uniformly random numbers is just
# This is Python, but most languages support the Dirichlet.
import numpy as np
np.random.dirichlet(np.ones(n))*m
where n is the number of random numbers you want to generate and m is the sum of the resulting array. This approach produces positive values and is particularly useful for generating valid probabilities that sum to 1 (let m = 1).
To generate N positive numbers that sum to a positive number M at random, where each possible combination is equally likely:
Generate N exponentially-distributed random variates. One way to generate such a number can be written as—
number = -ln(1.0 - RNDU())
where ln(x) is the natural logarithm of x and RNDU() is a method that returns a uniform random variate greater than 0 and less than 1. Note that generating the N variates with a uniform distribution is not ideal because a biased distribution of random variate combinations will result. However, the implementation given above has several problems, such as being ill-conditioned at large values because of the distribution's right-sided tail, especially when the implementation involves floating-point arithmetic. Another implementation is given in another answer.
Divide the numbers generated this way by their sum.
Multiply each number by M.
The result is N numbers whose sum is approximately equal to M (I say "approximately" because of rounding error). See also the Wikipedia article Dirichlet distribution.
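A minimal Python sketch of these three steps, using the inverse-CDF formula quoted above. The standard random module stands in for RNDU() (an assumption on my part; it can return exactly 0, which is harmless here), and exponential_parts is a made-up name.

```python
import math
import random


def exponential_parts(n, total, rng=random):
    """Return n non-negative floats summing (approximately) to total."""
    if n < 1 or total <= 0:
        raise ValueError("need n >= 1 and total > 0")
    # Step 1: n exponential variates; 1 - u lies in (0, 1], so log is safe.
    draws = [-math.log(1.0 - rng.random()) for _ in range(n)]
    # Steps 2 and 3: divide by the sum, then multiply by the target.
    s = sum(draws)
    return [d / s * total for d in draws]


parts = exponential_parts(5, 1.0)
print(parts, sum(parts))
```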
This problem is also equivalent to the problem of generating random variates uniformly from an N-dimensional unit simplex.
However, for better accuracy (compared to the alternative of using floating-point numbers, which often occurs in practice), you should consider generating n random integers that sum to an integer m * x, and treating those integers as the numerators to n rational numbers with denominator x (and will thus sum to m assuming m is an integer). You can choose x to be a large number such as 2^32 or 2^64 or some other number with the desired precision. If x is 1 and m is an integer, this solves the problem of generating random integers that sum to m.
The following pseudocode shows two methods for generating n uniform random integers with a given positive sum, in random order. (The algorithm for this was presented in Smith and Tromble, "Sampling Uniformly from the Unit Simplex", 2004.) In the pseudocode below—
the method PositiveIntegersWithSum returns n integers greater than 0 that sum to m, in random order,
the method IntegersWithSum returns n integers 0 or greater that sum to m, in random order, and
Sort(list) sorts the items in list in ascending order (note that sort algorithms are outside the scope of this answer).
METHOD PositiveIntegersWithSum(n, m)
    if n <= 0 or m <= 0: return error
    ls = [0]
    ret = NewList()
    while size(ls) < n
        c = RNDINTEXCRANGE(1, m)
        found = false
        for j in 1...size(ls)
            if ls[j] == c
                found = true
                break
            end
        end
        if found == false: AddItem(ls, c)
    end
    Sort(ls)
    AddItem(ls, m)
    for i in 1...size(ls): AddItem(ret, ls[i] - ls[i - 1])
    return ret
END METHOD
METHOD IntegersWithSum(n, m)
    if n <= 0 or m <= 0: return error
    ret = PositiveIntegersWithSum(n, m + n)
    for i in 0...size(ret): ret[i] = ret[i] - 1
    return ret
END METHOD
Here, RNDINTEXCRANGE(a, b) returns a uniform random integer in the interval [a, b).
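For comparison, here is how the same two methods might look in Python. This is a sketch rather than a line-by-line port: it uses random.sample to pick the distinct cut points in one call instead of the retry loop, and it additionally rejects m < n, a case in which the retry loop above could never find enough distinct values. The snake_case names are mine.

```python
import random


def positive_integers_with_sum(n, m, rng=random):
    """Return n integers, each >= 1, that sum to m."""
    if n <= 0 or m < n:
        raise ValueError("need n >= 1 and m >= n")
    # n - 1 distinct cut points drawn from 1..m-1, in ascending order.
    cuts = sorted(rng.sample(range(1, m), n - 1))
    points = [0] + cuts + [m]
    # Adjacent differences are positive because the cuts are distinct.
    return [b - a for a, b in zip(points, points[1:])]


def integers_with_sum(n, m, rng=random):
    """Return n integers, each >= 0, that sum to m."""
    return [p - 1 for p in positive_integers_with_sum(n, m + n, rng)]


print(positive_integers_with_sum(5, 20))
print(integers_with_sum(4, 3))
```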
In Java:
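// Requires: import java.util.Random;
private static double[] randSum(int n, double m) {
    Random rand = new Random();
    double[] randNums = new double[n];
    double sum = 0;
    // Draw n uniform values in [0, 1) and accumulate their sum
    for (int i = 0; i < randNums.length; i++) {
        randNums[i] = rand.nextDouble();
        sum += randNums[i];
    }
    // Rescale so that the values sum to m
    for (int i = 0; i < randNums.length; i++) {
        randNums[i] = randNums[i] / sum * m;
    }
    return randNums;
}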
Generate N-1 random numbers.
Compute the sum of said numbers.
Add the desired sum minus the computed sum to the set as the Nth number.
You now have N random numbers, and their sum is the desired sum.
Just generate N random numbers, compute their sum, divide each one by
the sum.
Expanding on Guillaume's accepted answer, here's a Java function that does exactly that.
public static double[] getRandDistArray(int n, double m)
{
    double randArray[] = new double[n];
    double sum = 0;
    // Generate n random numbers
    for (int i = 0; i < randArray.length; i++)
    {
        randArray[i] = Math.random();
        sum += randArray[i];
    }
    // Normalize sum to m
    for (int i = 0; i < randArray.length; i++)
    {
        randArray[i] /= sum;
        randArray[i] *= m;
    }
    return randArray;
}
In a test run, getRandDistArray(5, 1.0) returned the following:
[0.38106150346121903, 0.18099632814238079, 0.17275044310377025, 0.01732932296660358, 0.24786240232602647]
You're a little slim on constraints. Lots and lots of procedures will work.
For example, are numbers normally distributed? Uniform?
I'll assume that all the numbers must be positive and uniformly distributed around the mean, M/N.
Try this.
mean = M/N.
Generate N-1 values between 0 and 2*mean. Each one can be derived from a standard uniform number u between 0 and 1: the random value is 2*u*mean, which lies in the appropriate range.
Compute the sum of the N-1 values.
The remaining value is M-sum.
If the remaining value does not fit the constraints (0 to 2*mean) repeat the procedure.
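A small Python sketch of that retry procedure, under my reading of it: each of the first N-1 values is uniform on [0, 2*mean], and the last one is whatever is needed to reach M. The name parts_around_mean is hypothetical, and I am assuming M > 0.

```python
import random


def parts_around_mean(n, total, rng=random):
    """Return n values in [0, 2*mean] summing to total, by retrying."""
    if n < 1 or total <= 0:
        raise ValueError("need n >= 1 and total > 0")
    mean = total / n
    while True:
        # N-1 values, each uniform between 0 and 2*mean.
        values = [2 * rng.random() * mean for _ in range(n - 1)]
        last = total - sum(values)
        # Accept only if the remaining value fits the same constraints.
        if 0 <= last <= 2 * mean:
            return values + [last]


parts = parts_around_mean(5, 10.0)
print(parts, sum(parts))
```

Note that the last value is determined by the others, so its distribution is not the same as theirs.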