Expected number of moves needed to turn on all the n bulbs by tossing a biased coin with p% probability? - probability

There are n bulbs. Initially, each bulb is turned off. In each move you select one bulb uniformly at random (the probability of selecting each bulb is the same). If the bulb is already turned on, you do nothing. If the bulb is turned off, you must toss a coin. If it's heads, you turn the bulb on; if it's tails, the bulb remains off.
To make the problem even more boring, the coin is not a fair coin: the chance of landing tails is p%.
What is the expected number of moves needed to turn on all the bulbs?
I want to know the algorithm or solution process, since n is a variable in this problem.

Naive approach: Monte Carlo simulation
Description
(You did not give any information on what kind of solution you want, so I present what is probably the simplest but still powerful approach: simulation.)
The Monte Carlo method lets us simulate this stochastic process and observe the number of steps needed. A single run is only a noisy estimate, so we repeat it many times; as the number of runs grows, the estimate converges to the theoretical value.
Code
Here is some simple python-based code:
import random
import matplotlib.pyplot as plt  # just for plotting

n = 10
p = 0.7          # probability that a toss lets us turn the selected bulb on
n_samples = 1000000

def run():
    states = [0 for i in range(n)]
    steps = 0
    while 0 in states:
        index = random.randint(0, n-1)
        if random.random() < p:  # non-fair coin
            states[index] = 1
        steps += 1
    return steps

avg = 0.0
samples = []
for sample in range(n_samples):
    steps = run()
    avg += steps
    samples.append(steps)

print(avg / n_samples)
plt.hist(samples)
plt.show()
Code output
41.853233
Math approach: Absorbing Markov Chain
Description
As the probabilities describing the state changes of the light bulbs depend only on the current state, the Markov assumption holds and we can use Markov chains to obtain the average number of steps needed.
Because the final state self-loops forever and is reached with probability 1 given enough steps, this is an absorbing Markov chain.
As all light bulbs behave identically, we don't need to model every combination of activated bulbs as a separate state. We can reduce the chain to the much simpler: 0 bulbs on -> 1 bulb on -> ... -> n bulbs on (plus the self-loops, of course).
The discrete-time, discrete-state-space absorbing Markov chain allows a simple and powerful calculation of the desired value.
Some theory is explained on Wikipedia, which is also the source of the formulas used in the following code.
Code
Again some python:
import numpy as np

N = 10
P = 0.7   # probability that a toss turns the selected (off) bulb on

""" Build transition matrix """
trans_mat = np.zeros((N+1, N+1))
for source_state in range(N):
    prob_hitting_next = ((N - source_state) / float(N)) * P
    inverse = 1.0 - prob_hitting_next
    trans_mat[source_state, source_state] = inverse
    trans_mat[source_state, source_state+1] = prob_hitting_next
trans_mat[N, N] = 1.0

""" Will look like this:
[[ 0.3   0.7   0.    0.    0.    0.    0.    0.    0.    0.    0.  ]
 [ 0.    0.37  0.63  0.    0.    0.    0.    0.    0.    0.    0.  ]
 [ 0.    0.    0.44  0.56  0.    0.    0.    0.    0.    0.    0.  ]
 [ 0.    0.    0.    0.51  0.49  0.    0.    0.    0.    0.    0.  ]
 [ 0.    0.    0.    0.    0.58  0.42  0.    0.    0.    0.    0.  ]
 [ 0.    0.    0.    0.    0.    0.65  0.35  0.    0.    0.    0.  ]
 [ 0.    0.    0.    0.    0.    0.    0.72  0.28  0.    0.    0.  ]
 [ 0.    0.    0.    0.    0.    0.    0.    0.79  0.21  0.    0.  ]
 [ 0.    0.    0.    0.    0.    0.    0.    0.    0.86  0.14  0.  ]
 [ 0.    0.    0.    0.    0.    0.    0.    0.    0.    0.93  0.07]
 [ 0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    1.  ]]
"""

""" Q_sub:  the sub-matrix of trans_mat without
            the rows and columns of any absorbing states
    N_fund: fundamental matrix
    t:      expected number of steps before being absorbed, for each start state
"""
Q_sub = trans_mat[:N, :N]
N_fund = np.linalg.inv(np.eye(N) - Q_sub)
t = np.dot(N_fund, np.ones(N))

print(t)
print(t[0])  # this is the value we want
Code output
[ 41.84240363 40.4138322 38.82653061 37.04081633
35. 32.61904762 29.76190476 26.19047619 21.42857143 14.28571429]
41.8424036281 # less than 1 second calculation time!
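Sanity check (my own addition, not part of the original answers): because the chain only ever moves from k bulbs on to k+1 bulbs on, the waiting time in state k is geometric with success probability ((n-k)/n)*p, so the expectation has a coupon-collector-style closed form:

E[steps] = sum over k = 0..n-1 of n / ((n-k)*p) = (n/p) * (1 + 1/2 + ... + 1/n)

For n = 10 and p = 0.7 this is (10/0.7) * H_10 ≈ 41.8424, matching both the simulation and the Markov chain result. In Python:

n, p = 10, 0.7
print(sum(n / ((n - k) * p) for k in range(n)))   # ≈ 41.8424036281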

Related

Better than brute force algorithms for a coin-flipping game

I have a problem and I feel like there should be a well-known algorithm for solving it that's better than just brute force, but I can't think of one, so I'm asking here.
The problem is as follows: given n sorted (from low to high) lists containing m probabilities, choose one index for each list such that the sum of the chosen indexes is less than m. Then, for each list, we flip a coin, where the chance of it landing heads is equal to the probability at the chosen index for that list. Maximize the chance of the coin landing heads at least once.
Are there any algorithms for solving this problem that are better than just brute force?
This problem seems most similar to the knapsack problem, except the value of the items in the knapsack isn't merely the sum of the items in the knapsack. (Written in Python, instead of sum(p for p in chosen_probabilities) it's 1 - math.prod([1 - p for p in chosen_probabilities]).) And there are restrictions on what items you can add given what items are already in the knapsack. For example, if the index = 3 item for a particular list is already in the knapsack, then adding the item with index = 2 for that same list isn't allowed (since you can only pick one index for each list). So there are certain items that can and can't be added to the knapsack based on what items are already in it.
Linear optimization won't work because the values in the lists don't increase linearly, the final coin probability isn't linear with respect to the chosen probabilities, and our constraint is on the sum of the indexes, rather than the values in the lists themselves. As David has pointed out, linear optimization will work if you use binary variables to pick out the indexes and a logarithm to deal with the non-linearity.
EDIT:
I've found that explaining the motivation behind this problem can be helpful for understanding it. Imagine you have 10 seconds to solve a problem, and three different ways to solve it. You have models of how likely it is that each method will solve the problem, given how many seconds you try that method for, but if you switch methods, you lose all progress on the one you were previously trying. What methods should you try and for how long?
Maximizing 1 - math.prod([1 - p for p in chosen_probabilities]) is equivalent to minimizing math.prod([1 - p for p in chosen_probabilities]), which is equivalent to minimizing the log of this objective, which is a linear function of 0-1 indicator variables, so you could do an integer programming formulation this way.
I can't promise that this will be much better than brute force. The problem is that math.log(1 - p) is well approximated by -p when p is close to zero. My intuition is that for nontrivial instances it will be qualitatively similar to using integer programming to solve subset sum, which doesn't go particularly well.
If you're willing to settle for a bicriteria approximation scheme (get an answer such that the sum of the chosen indexes is less than m, that is at least as good as the best answer summing to less than (1 − ε) m) then you can round up the probability to multiples of ε and use dynamic programming to get an algorithm that runs in time polynomial in n, m, 1/ε.
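Not the ε-rounding scheme itself, but to make the dynamic-programming direction concrete: when m is small enough to iterate over index budgets directly, the same log objective can be minimized exactly with a DP over (list, index budget used). The sketch below and its names (best_indices, budget) are my own, not from the answer:

import math

def best_indices(data, budget):
    # data[i][j]: success probability of list i at index j (rows sorted low to high)
    # budget:     maximum allowed sum of chosen indexes
    # maximizes sum of -log(1 - p), i.e. minimizes prod(1 - p)
    n, m = len(data), len(data[0])
    NEG = float('-inf')
    dp = [0.0] + [NEG] * budget          # dp[t]: best objective using budget exactly t
    choice = []                          # choice[i][t]: index picked for list i at budget t
    for i in range(n):
        new_dp = [NEG] * (budget + 1)
        pick = [None] * (budget + 1)
        for t in range(budget + 1):
            for j in range(min(m, t + 1)):           # spend j of the budget on list i
                if dp[t - j] == NEG:
                    continue
                p = data[i][j]
                gain = float('inf') if p >= 1 else -math.log(1.0 - p)
                if dp[t - j] + gain > new_dp[t]:
                    new_dp[t] = dp[t - j] + gain
                    pick[t] = j
        dp, choice = new_dp, choice + [pick]
    t = max(range(budget + 1), key=lambda s: dp[s])  # best reachable budget
    picks = []
    for i in reversed(range(n)):                     # backtrack the chosen indexes
        picks.append(choice[i][t])
        t -= choice[i][t]
    return picks[::-1]

# with the motivating example below: best_indices(data, m - 1) -> [3, 3, 4]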
Here is working code for David Eisenstat's solution.
To understand the implementation, I think it helps to go through the math first.
As a reminder, there are n lists, each with m options. (In the motivating example at the bottom of the question, each list represents a method for solving the problem, and you are given m-1 seconds to solve the problem. Each list is such that list[index] gives the chance of solving the problem with that method if the method is run for index seconds.)
We let the lists be stored in a matrix called d (named data in the code), where each row in the matrix is a list. (And thus each column represents an index, or, if following the motivating example, an amount of time.)
The probability of the coin landing heads, given that we chose index j*_i for list i, is computed as

1 - prod_i (1 - d[i][j*_i])
We would like to maximize this.
(To explain the stats behind this equation: we're computing 1 minus the probability that the coin never lands on heads. The probability that the coin never lands on heads is the probability that each flip doesn't land on heads. The probability that a single flip doesn't land on heads is just 1 minus the probability that it does land on heads. And the probability that it does land on heads is the number we've chosen, d[i][j*]. Thus, the total probability that all the flips land on tails is just the product of the probabilities that each one lands on tails. And then the probability that the coin lands on heads is just 1 minus the probability that all the flips land on tails.)
Which, as David pointed out, is the same as minimizing:

prod_i (1 - d[i][j*_i])

Which is the same as minimizing:

log( prod_i (1 - d[i][j*_i]) )

Which is equivalent to:

sum_i log(1 - d[i][j*_i])

Then, since this is a linear sum, we can turn it into an integer program.
We'll be minimizing:

sum_i sum_j x[i][j] * log(1 - d[i][j])
This lets the computer choose the indexes by allowing it to create an n by m matrix of 1s and 0s called x where the 1s pick out particular indexes. We'll then define rules so that it doesn't pick out invalid sets of indexes.
The first rule is that you have to pick out exactly one index for each list:

sum_j x[i][j] = 1   for every list i

The second rule is the constraint that the chosen indexes must sum to less than m:

sum_i sum_j ( j * x[i][j] ) <= m - 1
And that's it! Then we can just tell the computer to minimize that sum according to those rules. It will spit out an x matrix with a single 1 on each row to tell us which index it has picked for the list on that row.
In code (using the motivating example), this is implemented as:
'''
Requirements:
cvxopt==1.2.6
cvxpy==1.1.10
ecos==2.0.7.post1
numpy==1.20.1
osqp==0.6.2.post0
qdldl==0.1.5.post0
scipy==1.6.1
scs==2.1.2
'''
import math
import cvxpy as cp
import numpy as np
# number of methods
n = 3
# if you have 10 seconds, there are 11 options for each method (0 seconds, 1 second, ..., 10 seconds)
m = 11
# method A has 30% chance of working if run for at least 3 seconds
# equivalent to [0, 0, 0, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3]
A_list = [0, 0, 0] + [0.3] * (m - 3)
# method B has 30% chance of working if run for at least 3 seconds
# equivalent to [0, 0, 0, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3]
B_list = [0, 0, 0] + [0.3] * (m - 3)
# method C has 40% chance of working if run for 4 seconds, 30% otherwise
# equivalent to [0.3, 0.3, 0.3, 0.3, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4]
C_list = [0.3, 0.3, 0.3, 0.3] + [0.4] * (m - 4)
data = [A_list, B_list, C_list]
# do the logarithm
log_data = []
for row in data:
    log_row = []
    for col in row:
        # deal with domain exception
        if col == 1:
            new_col = float('-inf')
        else:
            new_col = math.log(1 - col)
        log_row.append(new_col)
    log_data.append(log_row)
log_data = np.array(log_data)
x = cp.Variable((n, m), boolean=True)
objective = cp.Minimize(cp.sum(cp.multiply(log_data, x)))
# the current solver doesn't work with equalities, so each equality must be split into two inequalities.
# see https://github.com/cvxgrp/cvxpy/issues/1112
one_choice_per_method_constraint = [cp.sum(x[i]) <= 1 for i in range(n)] + [cp.sum(x[i]) >= 1 for i in range(n)]
# constrain the solution to not use more time than is allowed
# note that the time allowed is (m - 1), not m, because time is 1-indexed and the lists are 0-indexed
js = np.tile(np.array(list(range(m))), (n, 1))
time_constraint = [cp.sum(cp.multiply(js, x)) <= m - 1, cp.sum(cp.multiply(js, x)) >= m - 1]
constraints = one_choice_per_method_constraint + time_constraint
prob = cp.Problem(objective, constraints)
result = prob.solve()
def compute_probability(data, choices):
    # compute 1 - ((1 - p1) * (1 - p2) * ...)
    return 1 - np.prod(np.add(1, -np.multiply(data, choices)))
print("Choices:")
print(x.value)
'''
Choices:
[[0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]]
'''
print("Chance of success:")
print(compute_probability(data, x.value))
'''
Chance of success:
0.7060000000000001
'''
And there we have it! The computer has correctly determined that running method A for 3 seconds, method B for 3 seconds, and method C for 4 seconds is optimal. (Remember that the x matrix is 0-indexed, while the times are 1-indexed.)
Thank you, David, for the suggestion!

What is the difference between these two tensors and why?

What is the difference in dimension or rank between the first two results shown? Why am I able to add those two (matrices/vectors)? This may sound like a naive question, but I am trying hard to understand how addition between tensors/matrices works. Thank you.
(I also wanted to know why I can add the last two results. Aren't they two different sized matrices?)
import tensorflow as tf
import numpy as np
W = tf.Variable(tf.zeros([784, 10]))
x = tf.Variable(tf.zeros([2,784]))
z = tf.matmul(x,W)
Y = tf.Variable([4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 14.0])
x = tf.Variable(tf.zeros([2,10]))
model = tf.initialize_all_variables()
with tf.Session() as session:
    session.run(model)
    print(session.run(z))
    print(session.run(Y))
    print(session.run(x))
Result:
[[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]
[ 4. 5. 6. 7. 8. 9. 10. 11. 12. 14.]
[[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]
I don't see any addition, only multiplication.
All you are doing with the Y variable is printing out the tensor itself which contains the floating values you specified.
With z, you are multiplying these two tensors together. The general formula for the dimensions of a matrix product is MxN * NxP = MxP (the inner dimensions must match; M is the row count of the first matrix and P the column count of the second). So for z, you have a 2x784 tensor multiplied by a 784x10 tensor. This (by the general dimension formula) gives you a tensor with dimensions 2x10.
If you meant that you could do addition after the fact with Y and z, it is because libraries like tensorflow apply broadcasting: a tensor is stretched along missing or size-1 dimensions so the shapes become compatible. Here Y has shape (10,), so it is repeated across the two rows of z, which has shape (2, 10). So if you did Y + z you would get
[[ 4. 5. 6. 7. 8. 9. 10. 11. 12. 14.]
[ 4. 5. 6. 7. 8. 9. 10. 11. 12. 14.]]
because of the broadcasting being applied across the rows of z.
EDIT: I just realized that you may have asked for the difference in the arithmetic sense :) Because of broadcasting, z - Y would be
[[ -4. -5. -6. -7. -8. -9. -10. -11. -12. -14.]
[ -4. -5. -6. -7. -8. -9. -10. -11. -12. -14.]]
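A quick way to see the same broadcasting rule outside of a TensorFlow session (a toy sketch using NumPy, whose broadcasting rules TensorFlow follows; the shapes are taken from the example above):

import numpy as np

z = np.zeros((2, 10))                                       # same shape as the matmul result
Y = np.array([4., 5., 6., 7., 8., 9., 10., 11., 12., 14.])  # shape (10,)
print((Y + z).shape)   # (2, 10): Y is repeated across both rows of z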

How to program an unfair or biased coin flip in Ruby?

I need to make a coin flip that obeys a certain probability of outcome. For example, one coin flip with a 67% chance of coming out heads, another with an 83% chance of coming out tails, etc.
I managed to get the result I'm after by populating an array with 100 true and false values in the equivalent distribution, then picking one item at random. What is a more elegant way to go about it?
rand < 0.67
rand < 0.83
will give true with probabilities of 67% and 83%, respectively, because a uniformly selected random number x with 0 <= x < 1 (such as returned by Kernel#rand) is 67% likely to land in the segment 0 <= x < 0.67.
Random#rand(max) (and Kernel#rand(max)):
When max is an Integer [greater than or equal to 1], rand returns a random integer greater than or equal to zero and less than max.
So:
p = rand(100)
return p < 83 # ie. true for heads
In principle this is just as "exact" as the array-based distribution method.

algorithm to map values to unit scale with logarithmic spacing

I'm looking for an algorithm to map values to a unit scale with logarithmic spacing. The scale ranges from 0 to 1. Incoming values would be in the range of 0 to 10000.
0 maps to 0, 1 maps to .2, 10 maps to .4, 100 maps to .6
1000 maps to .8, 10000 maps to 1.0
Help/pointers would be appreciated.
If you are literally looking "to map values to a unit scale with logarithmic spacing", and with f(0)=0, then your example values are wrong.
However, you can do this with f(x) = log(1+x)/log(1+max)
So with max = 10000, we have:
f(0)=0
f(1)=0.0753
f(2)=0.1193
f(10)=0.2603
f(100)=0.5010
f(1000)=0.7501
f(10000)=1
which on a log scale makes sense: if 1 is near 0 and 10000 maps to 1, then 100, which has the average number of zeros of those two numbers, should be around 0.5. You really don't want to start considering log(0) as an option.
However, as soon as your minimum value is not 0 anymore (even if it is very, very small, as long as it's non-zero), you can do a more reasonable interpolation:
f(x) = (log(x) - log(min)) / (log(max) - log(min))
which is the same as user3246191's comment under his answer:
f(x) = log(x/min) / log(max/min)
Since all values returned by f in this post are ratios of logarithms, you can take the logarithm in any base you please. I would recommend the native one for your programming language (i.e. if log10(x) is defined as ln(x)/ln(10), take ln(x) instead).
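A small Python sketch of the two interpolations above (the function names and the use of math.log1p are my own choices):

import math

def unit_log_scale(x, max_value=10000):
    # f(x) = log(1 + x) / log(1 + max): maps 0 -> 0 and max -> 1
    return math.log1p(x) / math.log1p(max_value)

def unit_log_scale_minmax(x, min_value, max_value):
    # f(x) = log(x / min) / log(max / min): only valid for x, min > 0
    return math.log(x / min_value) / math.log(max_value / min_value)

print(unit_log_scale(100))     # ~0.501
print(unit_log_scale(10000))   # 1.0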
It is not really clear what transform you are trying to apply. From what you describe, it seems a candidate function would be
f(x) = 0.2(1+ log(x)/log(10))
which satisfies f(1) = 0.2, f(10) = 0.4, f(100) = 0.6, f(1000) = 0.8, f(10000) = 1
but on the other hand f(0.1) = 0 and f(0) = -infty.
Of course it is possible to modify f so that f(0) = 0 but this will be somewhat arbitrary and your question is not really well formulated then.

How to implement Random(a,b) with only Random(0,1)? [duplicate]

Possible Duplicate:
how to get uniformed random between a, b by a known uniformed random function RANDOM(0,1)
In the book Introduction to Algorithms, there is an exercise:
Describe an implementation of the procedure Random(a, b) that only makes calls to Random(0,1). What is the expected running time of your procedure, as a function of a and b? The result of Random(a,b) should be purely uniformly distributed, like Random(0,1).
For the Random function, the results are integers between a and b, inclusive. For example, Random(0,1) generates either 0 or 1; Random(a, b) generates a, a+1, a+2, ..., b
My solution is like this:
for i = 1 to b-a
    r = a + Random(0,1)
return r
The running time is T = b - a.
Is this correct? Are the results of my solutions uniformly distributed?
Thanks
What if my new solution is like this:
r = a
for i = 1 to b - a  // including b-a
    r += Random(0,1)
return r
If it is not correct, why does r += Random(0,1) make r not uniformly distributed?
Others have explained why your solution doesn't work. Here's the correct solution:
1) Find the smallest number, p, such that 2^p > b-a.
2) Perform the following algorithm:
r = 0
for i = 1 to p
    r = 2*r + Random(0,1)
3) If r is greater than b-a, go to step 2.
4) Your result is r+a
So let's try Random(1,3).
So b-a is 2.
2^1 = 2, so p will have to be 2 so that 2^p is greater than 2.
So we'll loop two times. Let's try all possible outputs:
00 -> r=0, 0 is not > 2, so we output 0+1 or 1.
01 -> r=1, 1 is not > 2, so we output 1+1 or 2.
10 -> r=2, 2 is not > 2, so we output 2+1 or 3.
11 -> r=3, 3 is > 2, so we repeat.
So 1/4 of the time, we output 1. 1/4 of the time we output 2. 1/4 of the time we output 3. And 1/4 of the time we have to repeat the algorithm a second time. Looks good.
Note that if you have to do this a lot, two optimizations are handy:
1) If you use the same range a lot, have a class that computes p once so you don't have to compute it each time.
2) Many CPUs have fast ways to perform step 1 that aren't exposed in high-level languages. For example, x86 CPUs have the BSR instruction.
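Here is a short Python sketch of that rejection-sampling idea (my own translation of the steps above, with random.randint(0, 1) standing in for Random(0,1)):

import random
from collections import Counter

def random_01():
    return random.randint(0, 1)           # plays the role of Random(0,1)

def random_ab(a, b):
    span = b - a                          # we need a uniform r in 0..span
    p = 1
    while 2 ** p <= span:                 # smallest p with 2^p > b - a
        p += 1
    while True:
        r = 0
        for _ in range(p):                # build a p-bit random number
            r = 2 * r + random_01()
        if r <= span:                     # reject out-of-range values, so the result is uniform
            return a + r

print(Counter(random_ab(1, 3) for _ in range(30000)))   # roughly 10000 each of 1, 2, 3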
No, it's not correct, that method will concentrate around (a+b)/2. It's a binomial distribution.
Are you sure that Random(0,1) produces integers? It would make more sense if it produced floating point values between 0 and 1. Then the solution would be an affine transformation, with running time independent of a and b.
An idea I just had, in case it's about integer values: use bisection. At each step, you have a range low-high. If Random(0,1) returns 0, the next range is low-(low+high)/2, else (low+high)/2-high.
Details and complexity left to you, since it's homework.
That should create (approximately) a uniform distribution.
Edit: approximately is the important word there. Uniform if b-a+1 is a power of 2, not too far off if it's close, but not good enough generally. Ah, well it was a spontaneous idea, can't get them all right.
No, your solution isn't correct; the sum will have a binomial distribution.
However, you can generate a random sequence of 0s and 1s and treat it as a binary number.
repeat
    result = a
    steps = ceiling(log(b - a))
    for i = 0 to steps
        result += (2 ^ i) * Random(0, 1)
until result <= b
KennyTM: my bad.
I read the other answers. For fun, here is another way to find the random number:
Allocate an array with b-a+1 elements.
Set all the values to 1.
Iterate through the array. For each nonzero element, flip the coin, as it were. If it came up 0, set the element to 0.
Whenever, after a complete iteration, you only have 1 element remaining, you have your random number: a+i where i is the index of the nonzero element (assuming we start indexing on 0). All numbers are then equally likely. (You would have to deal with the case where it's a tie, but I leave that as an exercise for you.)
This would have O(infinity) ... :)
On average, though, half the numbers would be eliminated, so it would have an average case running time of log_2 (b-a).
First of all, I assume you are actually accumulating the result, not adding 0 or 1 to a on each step.
Using some probabilities you can prove that your solution is not uniformly distributed. The chance that the resulting value r is (a+b)/2 is greatest. For instance, if a is 0 and b is 7, the chance that you get the value 4 is (7 choose 4) divided by 2 raised to the power 7. The reason for that is that no matter which 4 out of the 7 values are 1, the result will still be 4.
The running time you estimate is correct.
Your solution's pseudocode should look like:
r = a
for i = 0 to b-a
    r += Random(0,1)
return r
As for uniform distribution: assuming that the underlying random number generator is perfectly uniform, the odds of getting 0 or 1 on each call are 50%. Therefore getting the number you want is the result of that choice being made over and over again.
So for a=1, b=5, there are 5 choices made.
The odds of getting 1 involve 5 decisions, all 0; the odds of that are 0.5^5 = 3.125%.
The odds of getting 5 involve 5 decisions, all 1; the odds of that are 0.5^5 = 3.125%.
As you can see from this, the distribution is not uniform -- the odds of any number should be 20%.
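A quick empirical check of this point (my own snippet, simulating the accumulating approach with b - a coin flips for a = 1, b = 5):

import random
from collections import Counter

def accumulate(a, b):
    r = a
    for _ in range(b - a):
        r += random.randint(0, 1)
    return r

print(Counter(accumulate(1, 5) for _ in range(100000)))
# roughly {1: 6250, 2: 25000, 3: 37500, 4: 25000, 5: 6250}:
# a binomial shape centred on (a + b) / 2, not a uniform 20% each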
In the algorithm you created, it is really not equally distributed.
The result "r" will always be either "a" or "a+1". It will never go beyond that.
It should look something like this:
r = 0
for i = 0 to b-a
    r = a + r + Random(0,1)
return r
By including "r" into your computation, you are including the "randomness" of all the previous "for" loop runs.
