How to generate random number with a kind of required probability? - algorithm

In my game a user should be able to specify a probability when one or another random scene should be shown up with two integers A_x and B_x. Say, when A_x = 3, B_x = 6 a scene B should be shown in general 2 times frequently than scene A.
Are there any read-to-use formulas? Could you please point me out on them?
The first imagined idea of mine is smth. like saving the previously generated scene id and count it accordingly to the probability criterias A_x and B_x; but it looks silly.

With just two alternatives, you can work out the probability of A as A_x / (A_x + B_x) = 1/3. If you have a random number generator returning numbers uniformly distributed between 0 and 1 with a call such as rr.nextDouble() then something like the following should work.
if (rr.nextDouble() <= probA)
{
show A
}
else
{
show B
}
This generates A if the random number generator generates something <= probA, which should happen with probability probA.

Related

Get X random points in a fixed grid without repetition

I'm looking for a way of getting X points in a fixed sized grid of let's say M by N, where the points are not returned multiple times and all points have a similar chance of getting chosen and the amount of points returned is always X.
I had the idea of looping over all the grid points and giving each point a random chance of X/(N*M) yet I felt like that it would give more priority to the first points in the grid. Also this didn't meet the requirement of always returning X amount of points.
Also I could go with a way of using increments with a prime number to get kind of a shuffle without repeat functionality, but I'd rather have it behave more random than that.
Essentially, you need to keep track of the points you already chose, and make use of a random number generator to get a pseudo-uniformly distributed answer. Each "choice" should be independent of the previous one.
With your first idea, you're right, the first ones would have more chance of getting picked. Consider a one-dimensional array with two elements. With the strategy you mention, the chance of getting the first one is:
P[x=0] = 1/2 = 0.5
The chance of getting the second one is the chance of NOT getting the first one 0.5, times 1/2:
P[x=1] = 1/2 * 1/2 = 0.25
You don't mention which programming language you're using, so I'll assume you have at your disposal random number generator rand() which results in a random float in the range [0, 1), a Hashmap (or similar) data structure, and a Point data structure. I'll further assume that a point in the grid can be any floating point x,y, where 0 <= x < M and 0 <= y < N. (If this is a NxM array, then the same applies, but in integers, and up to (M-1,N-1)).
Hashmap points = new Hashmap();
Point p;
while (items.size() < X) {
p = new Point(rand()*M, rand()*N);
if (!points.containsKey(p)) {
items.add(p, 1);
}
}
Note: Two Point objects of equal x and y should be themselves considered equal and generate equal hash codes, etc.

Combining two unique numbers (order doesn't matter) to create a unique number

I'm creating a website in which 2 items out of a possible 117 are chosen and compared in different ways. I need a way to assign each of these matchups a unique number so they can be easily stored in a database and what not. I've seen pairing functions, but I cannot find one in which order doesn't matter. For example, I want the unique number for 2 and 17 to be the same as 17 and 2. Is there an equation that will satisfy this?
It depends on what programming language you are using.
In Java for example it would be quite easy, because same seed is producing the same random number sequence. So you could simple use the sum of both random numbers
Long seed = 2L + 17L;
Long seed2 = 17L+2L;
Random random = new Random(seed);
Random random2 = new Random(seed2);
Boolean b = (random.nextLong() == random2.nextLong()) //true
However, this would also return the same value for 1+18, 0+19 and so on - whatever sums up to 19.
So, to get really unique numbers "per pair" you would need to shift one of them. IE, with 117 entries, you could multiply the SMALLER (or larger) by 1000:
Long seed = 2L * 1000 + 17L;
....
Then you have a unique random number for 2,17 and 17,2 - but 19,0 or 0,19 would produce a DIFFERENT random number.
ps.: if it should ALWAYS return the same for 2,19 - the result is not really a random number, isn't it?
I know that the question dates from 2014, but I still wanted to add the following answer.
You could use the product of prime numbers to do this. For example, if your pair is (2,4), then you could use the product of the 2nd prime number (=3) and the 4th prime number (=7) as your id (= 3*7 = 21).
In order to do this though with your 117 possible combinations, you would need to pre-calculate all first 117 prime numbers and store them for example in an array or a hash table and then do something like (in JavaScript):
var primes = [2,3,5,7,...];
var a = 2;
var b = 17;
var id = primes[a-1]*primes[b-1];
Note that if you want to decode them as well, things are going to be more difficult since you would need to calculate the prime factorization of your id.

How do I get an unbiased random sample from a really huge data set?

For an application I'm working on, I need to sample a small set of values from a very large data set, on the order of few hundred taken from about 60 trillion (and growing).
Usually I use the technique of seeing if a uniform random number r (0..1) is less than S/T, where S is the number of sample items I still need, and T is the number of items in the set that I haven't considered yet.
However, with this new data, I don't have time to roll the die for each value; there are too many. Instead, I want to generate a random number of entries to "skip", pick the value at the next position, and repeat. That way I can just roll the die and access the list S times. (S is the size of the sample I want.)
I'm hoping there's a straightforward way to do that and create an unbiased sample, along the lines of the S/T test.
To be honest, approximately unbiased would be OK.
This is related (more or less a follow-on) to this persons question:
https://math.stackexchange.com/questions/350041/simple-random-sample-without-replacement
One more side question... the person who showed first showed this to me called it the "mailman's algorithm", but I'm not sure if he was pulling my leg. Is that right?
How about this:
precompute S random numbers from 0 to the size of your dataset.
order your numbers, low to high
store the difference between consecutive numbers as the skip size
iterate though the large dataset using the skip size above.
...The assumption being the order you collect the samples doesn't matter
So I thought about it, and got some help from http://math.stackexchange.com
It boils down to this:
If I picked n items randomly all at once, where would the first one land? That is, min({r_1 ... r_n}). A helpful fellow at math.stackexchange boiled it down to this equation:
x = 1 - (1 - r) ** (1 / n)
that is, the distribution would be 1 minus (1 - r) to the nth power. Then solve for x. Pretty easy.
If I generate a uniform random number and plug it in for r, this is distributed the same as min({r_1 ... r_n}) -- the same way that the lowest item would fall. Voila! I've just simulated picking the first item as if I had randomly selected all n.
So I skip over that many items in the list, pick that one, and then....
Repeat until n is 0
That way, if I have a big database (like Mongo), I can skip, find_one, skip, find_one, etc. Until I have all the items I need.
The only problem I'm having is that my implementation favors the first and last element in the list. But I can live with that.
In Python 2.7, my implementation looks like:
def skip(n):
"""
Produce a random number with the same distribution as
min({r_0, ... r_n}) to see where the next smallest one is
"""
r = numpy.random.uniform()
return 1.0 - (1.0 - r) ** (1.0 / n)
def sample(T, n):
"""
Take n items from a list of size T
"""
t = T
i = 0
while t > 0 and n > 0:
s = skip(n) * (t - n + 1)
i += s
yield int(i) % T
i += 1
t -= s + 1
n -= 1
if __name__ == '__main__':
t = [0] * 100
for c in xrange(10000):
for i in sample(len(t), 10):
t[i] += 1 # this is where we would read value i
pprint.pprint(t)

Finding if a Valid Rummikub Solution exist from selected Tiles

I'm currently making a rummikub "game" so that when I make a Mistake I am able to get the board back easily to what it was before.
Currently in need to find out for a given number of tiles if a solution exists and what the solution is.
Example
Elements Selected are
{Red1, Red1, Blue1, Black1, Orange1, Orange1}
And a solution would be
{Red1, Blue1, Orange1} and {Red1, Black1, Orange1}
I can currently determine which groups and runs are possible ({Red1, Blue1, Black1, Orange1} being a valid group that wouldn't appear in the valid solution). I need a solution that can go the next step and tell me which of the groups/runs can exist together and make sure each tile is used once. If no such solution exist, it needs to be able to report that. The solution does not have to be optimal (ie use the smallest number of groups/run), it just has to be valid.
Finding the highest scoring solution for a given set of tiles is a tough problem. It has already been the subject of many papers. The most recent approach is ILP (integer linear programming).
Basically, it means you define a function you wish to maximize, and put constraints on it.
For instance:
make a list of all unique tiles (53 in a standard game)
make a list of all possible valid sets (1174)
Define:
yi = number of times tile i is chosen in optimal solution
sij = indicates wether tile i is in set j
ti = number of times tile i can be found on table
ri = number of times tile i can be found on rack
G = y1 + y2 + ... y52 + y53
Then it follows that:
sum[ xj*sij, j=1...1174 ] = ti + yi (i=1...53)
yi<=ri (i=1...53)
0 <= xj <= 2 (j=1...1174)
0 <= yi <= 2 (i=1...53)
The solution will be given by the coefficients of xj (telling you exactly what sets to include in your solution). The function to maximize is of course G.
This approach is able to solve a given situation rather fast (1 second for 20 tiles).
If you have a method of determining what groups and runs are possible, why don't you modify your algorithm to remove tiles that you have already used, and make the function recursive. Here is some pseudo-code:
array PickTiles(TileArray) {
GroupsArray = all possible groups/runs;
foreach Group in GroupsArray {
newTileArray = TileArray;
remove Group from newTileArray;
if(newTileArray.length() == 0) {
return array(Group);
}
result = PickTiles(newTileArray);
if(result.length() > 0) {
return result.append(array(Group));
}
}
return array();
}

Random number generator that fills an interval

How would you implement a random number generator that, given an interval, (randomly) generates all numbers in that interval, without any repetition?
It should consume as little time and memory as possible.
Example in a just-invented C#-ruby-ish pseudocode:
interval = new Interval(0,9)
rg = new RandomGenerator(interval);
count = interval.Count // equals 10
count.times.do{
print rg.GetNext() + " "
}
This should output something like :
1 4 3 2 7 5 0 9 8 6
Fill an array with the interval, and then shuffle it.
The standard way to shuffle an array of N elements is to pick a random number between 0 and N-1 (say R), and swap item[R] with item[N]. Then subtract one from N, and repeat until you reach N =1.
This has come up before. Try using a linear feedback shift register.
One suggestion, but it's memory intensive:
The generator builds a list of all numbers in the interval, then shuffles it.
A very efficient way to shuffle an array of numbers where each index is unique comes from image processing and is used when applying techniques like pixel-dissolve.
Basically you start with an ordered 2D array and then shift columns and rows. Those permutations are by the way easy to implement, you can even have one exact method that will yield the resulting value at x,y after n permutations.
The basic technique, described on a 3x3 grid:
1) Start with an ordered list, each number may exist only once
0 1 2
3 4 5
6 7 8
2) Pick a row/column you want to shuffle, advance it one step. In this case, i am shifting the second row one to the right.
0 1 2
5 3 4
6 7 8
3) Pick a row/column you want to shuffle... I suffle the second column one down.
0 7 2
5 1 4
6 3 8
4) Pick ... For instance, first row, one to the left.
2 0 7
5 1 4
6 3 8
You can repeat those steps as often as you want. You can always do this kind of transformation also on a 1D array. So your result would be now [2, 0, 7, 5, 1, 4, 6, 3, 8].
An occasionally useful alternative to the shuffle approach is to use a subscriptable set container. At each step, choose a random number 0 <= n < count. Extract the nth item from the set.
The main problem is that typical containers can't handle this efficiently. I have used it with bit-vectors, but it only works well if the largest possible member is reasonably small, due to the linear scanning of the bitvector needed to find the nth set bit.
99% of the time, the best approach is to shuffle as others have suggested.
EDIT
I missed the fact that a simple array is a good "set" data structure - don't ask me why, I've used it before. The "trick" is that you don't care whether the items in the array are sorted or not. At each step, you choose one randomly and extract it. To fill the empty slot (without having to shift an average half of your items one step down) you just move the current end item into the empty slot in constant time, then reduce the size of the array by one.
For example...
class remaining_items_queue
{
private:
std::vector<int> m_Items;
public:
...
bool Extract (int &p_Item); // return false if items already exhausted
};
bool remaining_items_queue::Extract (int &p_Item)
{
if (m_Items.size () == 0) return false;
int l_Random = Random_Num (m_Items.size ());
// Random_Num written to give 0 <= result < parameter
p_Item = m_Items [l_Random];
m_Items [l_Random] = m_Items.back ();
m_Items.pop_back ();
}
The trick is to get a random number generator that gives (with a reasonably even distribution) numbers in the range 0 to n-1 where n is potentially different each time. Most standard random generators give a fixed range. Although the following DOESN'T give an even distribution, it is often good enough...
int Random_Num (int p)
{
return (std::rand () % p);
}
std::rand returns random values in the range 0 <= x < RAND_MAX, where RAND_MAX is implementation defined.
Take all numbers in the interval, put them to list/array
Shuffle the list/array
Loop over the list/array
One way is to generate an ordered list (0-9) in your example.
Then use the random function to select an item from the list. Remove the item from the original list and add it to the tail of new one.
The process is finished when the original list is empty.
Output the new list.
You can use a linear congruential generator with parameters chosen randomly but so that it generates the full period. You need to be careful, because the quality of the random numbers may be bad, depending on the parameters.

Resources