design a random(5) using random(7) - random

Given a random number generator random(7) which can generate number 1,2,3,4,5,6,7 in equal probability(i.e., the probability of each number occurs is 1/7). Now we want to design a random(5) which can generate 1,2,3,4,5 in equal probability(1/5).
There is one way: every time we run random(7), only return when it generates 1-5. If it is 6 or 7, run again until it is 1-5.
I am a little confused. The first question is:
How to proof the probability of each number occurs is 1/5 in mathematical way?
For example, assume probability of returned number 1 is P(1). If B means 'the selected number is in 1-5' and A means 'select 1', then according to conditional probability, P(1) = P(A|B) = P(AB) / P(B). Obviously P(B) is 5/7. But if P(1)=1/5, P(AB) should be 1/7, why? I think P(A)=1/7. Is there anywhere wrong?
The second question is, this method will run until random(7) not return 6 or 7. What if it runs for a long time not returning 1-5? I know the chance is very very small but is there any way to prevent it?
Thanks!

The answer to your first question is given by basic conditional probability:
Let X be the random(7) number then for any k in {1,2,3,4,5}:
P(X = k | X <= 5) = P(X = k)/P(X <= 5) = (1/7)/(5/7) = 1/5
This follows from the observation that the intersection of the 2 events X = k and X <= 5 is simply X = k.
The number of trials until the first success (where a success is getting a number <= 5) is a geometric random variable with p = 5/7. The expected number of trials is 1/p = 7/5 = 1.4. You will get a success sooner rather than later in this set up. As #PeterWalser said in his answer, the chance of not quickly getting a number in the range 1-5 is vanishingly small.
For fun you can write a short script to investigate it. Here is one in Python:
from random import randint
from collections import Counter
def trials_needed():
num = randint(1,7)
trial = 1
while num > 5:
num = randint(1,7)
trial += 1
return trial
counts = Counter(trials_needed() for i in range(10**6))
for c,i in counts.items(): print(c,":",i)
Output from a typical run:
1 : 714212
2 : 204141
3 : 58340
4 : 16515
5 : 4814
6 : 1456
7 : 347
8 : 133
9 : 28
10 : 10
11 : 4
Over 99% of the time less than 5 trials are needed. More than 10 trials is extremely rare.

The probability to roll a number n(1..5) with a rnd(7) are 1/7 in each roll.
The chance to get such a number in the first roll are 5/7, or: in 2/7 of all first-roll cases, you need to roll again.
This results in a series, when examining the probability that a certain number n(1..5) is rolled:
p(n) = 1/7 + 2/7 * (1/7 + 2/7 * (1/7 + 2/7 * (...)))
This series evaluates to 1/5, and that's the expected probability to roll a specific number n(1..5).
Second question: there is a chance that you need to roll eternally. The probability to have a result in x rolls is 1-(2/7)^x, this quickly approaches 1, so you're very likely to get a result in just a few rolls, but without guarantee. The probability to still not have a result in a large number of rolls becomes smaller than the probability C'thulu swallowing the planet in the next 5 minues, so it's not necessary to build in some prevention. If you absolutely must, then return 1 after n internal rolls, this only slightly skews the distribution of the random numbers produced.

Related

Minimum number of operations to make A and B equal simultaneously

Given two non-negative integers A and B, find the minimum number of operations to make them equal simultaneously. In one operation, you can:
either change A to 2*A
or change B to 2*B
or change both A and B to A-1, B-1
For example: A = 7, B = 25
Sequence of operations would be:
6 24
12 24
24 24
We cannot make them equal in less than 3 operations
I was asked this coding question in a test a week ago. Cannot think of a solution, it is stuck in my head.The input A and B were somewhat over 10^12 so it is clear that I cannot use a loop else it will exceed time limit.
A slow but working solution:
If they are equal, stop.
If one of them is 0, stop with failure (there is no solution if negative numbers are not allowed).
While both are larger than 1, decrease both.
Now the smaller is 1, the other is larger.
While the smaller has a shorter binary representation, double the smaller.
Continue at step 1.
In step 4, the maximum decreases. In step 5, the absolute difference decreases. Thus eventually the algorithm terminates.
This should give the optimal solution. We have to compare a few different ways and take the best solution.
One working solution is to double the smaller number as many times as it stays below the larger number (can be zero times). Then calculate the difference between the double of the (possibly multiple times) doubled smaller number and the larger number. And decrease the numbers as many times. Then double the smaller number one more time. [If the numbers are equal from the beginning, the solution is trivial instead.] This gives an upper bound of the steps.
Now try out the following optimizations:
2a) Choose a number n between 0 and up to the number of steps of the best solution so far.
2b) Choose one number as A and one number as B (two possibilities).
2c) Now count the applied steps of the following procedure.
Double A n times.
Calculate the smallest power of 2 (=m), with which B * 2^m >= A. m should be at least 1.
Calculate the difference of A with the product from step 4 in a mixed base (correct term?) system with each digit having a positional value of 2^(n+1)-1, which is from the least significant right digit to the left: 1, 3, 7, 15, 31, 63, ... From all possible representations the number must have the smallest crosssum, e.g. 100 for 7 is correct, 021 not. Sidenote: For the least checksum there will mostly be digits 0 and 1 and at most one digit 2, no other digits. There will never be a digit 1 right of a 2.)
Represent the number as m digits by filling the left positions with zero. If the number does not fit, go back to step 2 for another selection.
Take the most significant not processed digit from step 6 and do as many decreasing steps.
Double B.
Repeat from 7. with the next digit; if there are no more digits left, the numbers are equal.
If the number of steps is less than the best solution so far, choose this as the proposed solution.
Go back to step 2 for another selection.
After doing all selections from 2 we should have the optimal solution with the minimum number of steps.
The following examples are from an earlier version of the answer, where A is always the larger number and n=0, so we test only one selection.
Example 17 and 65
Power of 2: 2^2=4; 4x17=68
Difference: 68-65=3
3 = 010=10 in base 7/3/1
Start => 17/65
Decrease. Double. => 32/64
Double. => 64/64
Example 18 and 67
Power of 2: 2^2=4; 4x18=72
Difference: 72-67=5
5 = 012=12 in base 7/3/1
Start => 18/67
Decrease. Double. => 34/66
Decrease. Decrease. Double. => 64/64
Example 10 and 137
Power of 2: 2^4=16; 16*10=160
Difference: 160-137=23
23 = 1101 in base 15/7/3/1
Start => 10/137
Decrease. Double. => 18/136
Decrease. Double. => 34/135
Double. => 68/135
Decrease. Double. => 134/134
Here's a breadth-first search that does return the correct answer but may not be an optimal method of finding it. Maybe it can help others detect a pattern.
JavasScript code:
function f(a, b) {
const q = [[a, b, [a, b]]];
while (true){
const [x, y, path] = q.shift();
if (x == y) {
return path;
}
if (x > 0 && y > 0) {
q.push([x-1, y-1, path.concat([x-1, y-1])]);
}
q.push([2*x, y, path.concat([2*x, y])]);
q.push([x, 2*y, path.concat([x, 2*y])]);
}
return [];
}
function showPath(path) {
let out1 = "";
let out2 = "";
for (let i = 0; i < path.length; i += 2) {
const s1 = path[i].toString(2);
const s2 = path[i+1].toString(2);
const len = Math.max(s1.length, s2.length);
out1 += s1.padStart(len, "0");
out2 += s2.padStart(len, "0");
if (i < path.length - 2) {
out1 += " --> ";
out2 += " --> ";
}
}
console.log(out1);
console.log(out2);
}
showPath(f(89, 7));

four-square sum representation for integers upto N

Lagrange's four-square theorem proves that any natural number can be written as the sum of four square numbers. What I need is to find any one way to write a natural number x as sum of four square numbers for all 0 <= x <= N for any given upper limit N.
What I have done so far is find two-square sum representation for all the numbers <= N for which it is possible to find one, and saved them in an array called two_square_div. Then I used a greedy approach like following:
last_two_square_sum = 0
for num in 0..N
if num can be written as sum of two square
last_two_square_sum = num
other_last_two_square_sum = num - last_two_square_sum
four_square_div[num] = (two_square_div[last_two_square_sum], two_square_div[other_last_two_square_sum]
But this approach does not work for numbers like 23, for which last_two_square_sum = 20 other_last_two_square_sum = 3. But 3 can not be written as sum of two squares so this method fails.
So could anybody provide a correct O(N) solution or any helpful hint? Thank you.
Your algorithm should make more than one attempt (if it already does, then the exit condition must be improved).
23 can be written as 3 + 20, yes; but 3 is not a decomposable of order two and can't lead to a solution.
So you go on: next you try 4 + 19, and this time it's 19 that is rejected. Next you try 5, so 23-5 is 18, and 5 is 12 + 22 while 18 is 32 + 32.
(Of course this is not O(N) at all).
It is not clear to me how you arrive at 20 and not accept previous solutions; try posting the whole of the code.
Also, try asking on Math StackExchange.

Fast algorithm to optimize a sequence of arithmetic expression

EDIT: clarified description of problem
Is there a fast algorithm solving following problem?
And, is also for extendend version of this problem
that is replaced natural numbers to Z/(2^n Z)?(This problem was too complex to add more quesion in one place, IMO.)
Problem:
For a given set of natural numbers like {7, 20, 17, 100}, required algorithm
returns the shortest sequence of additions, mutliplications and powers compute
all of given numbers.
Each item of sequence are (correct) equation that matches following pattern:
<number> = <number> <op> <number>
where <number> is a natual number, <op> is one of {+, *, ^}.
In the sequence, each operand of <op> should be one of
1
numbers which are already appeared in the left-hand-side of equal.
Example:
Input: {7, 20, 17, 100}
Output:
2 = 1 + 1
3 = 1 + 2
6 = 2 * 3
7 = 1 + 6
10 = 3 + 7
17 = 7 + 10
20 = 2 * 10
100 = 10 ^ 2
I wrote backtracking algorithm in Haskell.
it works for small input like above, but my real query is
randomly distributed ~30 numbers in [0,255].
for real query, following code takes 2~10 minutes in my PC.
(Actual code,
very simple test)
My current (Pseudo)code:
-- generate set of sets required to compute n.
-- operater (+) on set is set union.
requiredNumbers 0 = { {} }
requiredNumbers 1 = { {} }
requiredNumbers n =
{ {j, k} | j^k == n, j >= 2, k >= 2 }
+ { {j, k} | j*k == n, j >= 2, k >= 2 }
+ { {j, k} | j+k == n, j >= 1, k >= 1 }
-- remember the smallest set of "computed" number
bestSet := {i | 1 <= i <= largeNumber}
-- backtracking algorithm
-- from: input
-- to: accumulator of "already computed" number
closure from to =
if (from is empty)
if (|bestSet| > |to|)
bestSet := to
return
else if (|from| + |to| >= |bestSet|)
-- cut branch
return
else
m := min(from)
from' := deleteMin(from)
foreach (req in (requiredNumbers m))
closure (from' + (req - to)) (to + {m})
-- recoverEquation is a function converts set of number to set of equation.
-- it can be done easily.
output = recoverEquation (closure input {})
Additional Note:
Answers like
There isn't a fast algorithm, because...
There is a heuristic algorithm, it is...
are also welcomed. Now I'm feeling that there is no fast and exact algorithm...
Answer #1 can be used as a heuristic, I think.
What if you worked backwards from the highest number in a sorted input, checking if/how to utilize the smaller numbers (and numbers that are being introduced) in its construction?
For example, although this may not guarantee the shortest sequence...
input: {7, 20, 17, 100}
(100) = (20) * 5 =>
(7) = 5 + 2 =>
(17) = 10 + (7) =>
(20) = 10 * 2 =>
10 = 5 * 2 =>
5 = 3 + 2 =>
3 = 2 + 1 =>
2 = 1 + 1
What I recommend is to transform it into some kind of graph shortest path algorithm.
For each number, you compute (and store) the shortest path of operations. Technically one step is enough: For each number you can store the operation and the two operands (left and right, because power operation is not commutative), and also the weight ("nodes")
Initially you register 1 with the weight of zero
Every time you register a new number, you have to generate all calculations with that number (all additions, multiplications, powers) with all already-registered numbers. ("edges")
Filter for the calculations: it the result of the calculation is already registered, you shouldn't store that, because there is an easier way to get to that number
Store only 1 operation for the commutative ones (1+2=2+1)
Prefilter the power operation because that may even cause overflow
You have to order this list to the shortest sum path (weight of the edge). Weight = (weight of operand1) + (weight of operand2) + (1, which is the weight of the operation)
You can exclude all resulting numbers which are greater than the maximum number that we have to find (e.g. if we found 100 already, anything greater that 20 can be excluded) - this can be refined so that you can check the members of the operations also.
If you hit one of your target numbers, then you found the shortest way of calculating one of your target numbers, you have to restart the generations:
Recalculate the maximum of the target numbers
Go back on the paths of the currently found number, set their weight to 0 (they will be given from now on, because their cost is already paid)
Recalculate the weight for the operations in the generation list, because the source operand weight may have been changed (this results reordering at the end) - here you can exclude those where either operand is greater than the new maximum
If all the numbers are hit, then the search is over
You can build your expression using the "backlinks" (operation, left and right operands) for each of your target numbers.
The main point is that we always keep our eye on the target function, which is that the total number of operation must be the minimum possible. In order to get this, we always calculate the shortest path to a certain number, then considering that number (and all the other numbers on the way) as given numbers, then extending our search to the remaining targets.
Theoretically, this algorithm processes (registers) each numbers only once. Applying the proper filters cuts the unnecessary branches, so nothing is calculated twice (except the weights of the in-queue elements)

How to implement Random(a,b) with only Random(0,1)? [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
how to get uniformed random between a, b by a known uniformed random function RANDOM(0,1)
In the book of Introduction to algorithms, there is an excise:
Describe an implementation of the procedure Random(a, b) that only makes calls to Random(0,1). What is the expected running time of your procedure, as a function of a and b? The probability of the result of Random(a,b) should be pure uniformly distributed, as Random(0,1)
For the Random function, the results are integers between a and b, inclusively. For e.g., Random(0,1) generates either 0 or 1; Random(a, b) generates a, a+1, a+2, ..., b
My solution is like this:
for i = 1 to b-a
r = a + Random(0,1)
return r
the running time is T=b-a
Is this correct? Are the results of my solutions uniformly distributed?
Thanks
What if my new solution is like this:
r = a
for i = 1 to b - a //including b-a
r += Random(0,1)
return r
If it is not correct, why r += Random(0,1) makes r not uniformly distributed?
Others have explained why your solution doesn't work. Here's the correct solution:
1) Find the smallest number, p, such that 2^p > b-a.
2) Perform the following algorithm:
r=0
for i = 1 to p
r = 2*r + Random(0,1)
3) If r is greater than b-a, go to step 2.
4) Your result is r+a
So let's try Random(1,3).
So b-a is 2.
2^1 = 2, so p will have to be 2 so that 2^p is greater than 2.
So we'll loop two times. Let's try all possible outputs:
00 -> r=0, 0 is not > 2, so we output 0+1 or 1.
01 -> r=1, 1 is not > 2, so we output 1+1 or 2.
10 -> r=2, 2 is not > 2, so we output 2+1 or 3.
11 -> r=3, 3 is > 2, so we repeat.
So 1/4 of the time, we output 1. 1/4 of the time we output 2. 1/4 of the time we output 3. And 1/4 of the time we have to repeat the algorithm a second time. Looks good.
Note that if you have to do this a lot, two optimizations are handy:
1) If you use the same range a lot, have a class that computes p once so you don't have to compute it each time.
2) Many CPUs have fast ways to perform step 1 that aren't exposed in high-level languages. For example, x86 CPUs have the BSR instruction.
No, it's not correct, that method will concentrate around (a+b)/2. It's a binomial distribution.
Are you sure that Random(0,1) produces integers? it would make more sense if it produced floating point values between 0 and 1. Then the solution would be an affine transformation, running time independent of a and b.
An idea I just had, in case it's about integer values: use bisection. At each step, you have a range low-high. If Random(0,1) returns 0, the next range is low-(low+high)/2, else (low+high)/2-high.
Details and complexity left to you, since it's homework.
That should create (approximately) a uniform distribution.
Edit: approximately is the important word there. Uniform if b-a+1 is a power of 2, not too far off if it's close, but not good enough generally. Ah, well it was a spontaneous idea, can't get them all right.
No, your solution isn't correct. This sum'll have binomial distribution.
However, you can generate a pure random sequence of 0, 1 and treat it as a binary number.
repeat
result = a
steps = ceiling(log(b - a))
for i = 0 to steps
result += (2 ^ i) * Random(0, 1)
until result <= b
KennyTM: my bad.
I read the other answers. For fun, here is another way to find the random number:
Allocate an array with b-a elements.
Set all the values to 1.
Iterate through the array. For each nonzero element, flip the coin, as it were. If it is came up 0, set the element to 0.
Whenever, after a complete iteration, you only have 1 element remaining, you have your random number: a+i where i is the index of the nonzero element (assuming we start indexing on 0). All numbers are then equally likely. (You would have to deal with the case where it's a tie, but I leave that as an exercise for you.)
This would have O(infinity) ... :)
On average, though, half the numbers would be eliminated, so it would have an average case running time of log_2 (b-a).
First of all I assume you are actually accumulating the result, not adding 0 or 1 to a on each step.
Using some probabilites you can prove that your solution is not uniformly distibuted. The chance that the resulting value r is (a+b)/2 is greatest. For instance if a is 0 and b is 7, the chance that you get a value 4 is (combination 4 of 7) divided by 2 raised to the power 7. The reason for that is that no matter which 4 out of the 7 values are 1 the result will still be 4.
The running time you estimate is correct.
Your solution's pseudocode should look like:
r=a
for i = 0 to b-a
r+=Random(0,1)
return r
As for uniform distribution, assuming that the random implementation this random number generator is based on is perfectly uniform the odds of getting 0 or 1 are 50%. Therefore getting the number you want is the result of that choice made over and over again.
So for a=1, b=5, there are 5 choices made.
The odds of getting 1 involves 5 decisions, all 0, the odds of that are 0.5^5 = 3.125%
The odds of getting 5 involves 5 decisions, all 1, the odds of that are 0.5^5 = 3.125%
As you can see from this, the distribution is not uniform -- the odds of any number should be 20%.
In the algorithm you created, it is really not equally distributed.
The result "r" will always be either "a" or "a+1". It will never go beyond that.
It should look something like this:
r=0;
for i=0 to b-a
r = a + r + Random(0,1)
return r;
By including "r" into your computation, you are including the "randomness" of all the previous "for" loop runs.

loaded die algorithm

I want an algorithm to simulate this loaded die:
the probabilities are:
1: 1/18
2: 5/18
3: 1/18
4: 5/18
5: 1/18
6: 5/18
It favors even numbers.
My idea is to calculate in matlab the possibility of the above.
I can do it with 1/6 (normal die), but I am having difficulties applying it for a loaded die.
One way: generate two random numbers: first one is from 0 to 5 (0: odd, 1 - 5: even), which is used to determine even or odd. Then generate a second between 0 and 2, which determines exact number within its category. For example, if the first number is 3 (which says even) and second is 2 (which says the third chunk, 1-2 is a chunk, 3-4 is another chunk and 5-6 is the last chunk), the the result is 6.
Another way: generate a random number between 0 and 17, then you can simply / 6 and % 6 and use those two numbers to decide. For example, if /6 gives you 0, then the choice is between 1 and 2, then if % 6 == 0, the choice lands on 1, otherwise lands on 2.
In matlab:
ceil(rand*3)*2-(rand>(5/6))
The generic solution:
Use roulette wheel selection
n = generate number between 0 and sum( probabilities )
s = 0;
i = 0;
while s <= n do
i = i + 1;
s = s + probability of element i;
done
After the loop is done i will be the number of the chosen element. This works for any kind of skewed probability distribution, even when you have weights instead of a probability and want to skip normalizing.
In the concise language of J,
>:3(<([++:#])|)?18

Resources