discrete universal distribution probability

if x is an integer randomly generated from a discrete universal distribution on (-2^53, 2^53], that is, we randomly (not in a rigorous mathematical sense) choose an integer in (-2^53, 2^53], what is the probability P(x + 1 === x)?
Tips from Probability 101: a discrete uniform distribution X on {1, 2, 3, 4, 5, 6} has P(X = i) = 1/6, i = 1, 2, …, 6.

Assuming I understood the question, you're asking what's the probability that a freshly sampled number is equal to the previous one. Well, that probability is 2^-54 for a discrete uniform distribution: the interval (-2^53, 2^53] contains 2^54 integers, each drawn with equal probability.
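A quick Python sketch of where that number comes from, and of the floating-point reading of x + 1 === x (assuming IEEE 754 doubles, the format JavaScript's === operates on; Python floats are the same format):

# The interval (-2**53, 2**53] contains 2**54 integers, so a uniform draw
# hits any particular value with probability 2**-54.
print(2 ** -54)  # 5.551115123125783e-17

# Among doubles in that range, x + 1 == x holds for exactly one integer,
# x = 2**53, because 2**53 + 1 is not representable and rounds back down.
print(float(2 ** 53) + 1 == float(2 ** 53))          # True
print(float(2 ** 53 - 1) + 1 == float(2 ** 53 - 1))  # False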


Optimization Algorithm for large search space

Problem:
Find a combination of 48 numbers (x) ranging from 1 to 6 that maximises an equation (y). The equation comprises 48 distinct functions that are unknown and each take in one number.
max: y = f1(x1) + f2(x2) + ... + f48(x48)
where: x = {1:6}
example: x = [6, 1, 4, ..., 4] => y = 167
My first idea was to solve this using brute force; however, the search space is very large: 6^48. Does anyone know of an algorithm or clever programming tricks I could use?
The search space is not that large at all.
y is the sum of 48 distinct functions, each depending on its own variable, so you can maximize each one of them independently. There are 6 possibilities for each f_i, so in total you need to check 6 * 48 = 288 cases to brute force it.
Start with some base answer like x = [1, ..., 1]. Find the optimal value for x_1, then x_2, etc.
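A minimal sketch of that coordinate-by-coordinate search in Python (y is treated as a black box, as in the question; the example objective at the bottom is made up):

def optimize(y, n=48, options=range(1, 7)):
    # Because y is a sum of per-coordinate terms, optimizing one coordinate
    # at a time (holding the rest fixed) is globally optimal: 6 * 48 = 288
    # evaluations of y in total.
    x = [1] * n  # base answer [1, ..., 1]
    for i in range(n):
        x[i] = max(options, key=lambda v: y(x[:i] + [v] + x[i + 1:]))
    return x

# Example with a made-up separable objective:
y = lambda xs: sum(-(v - (i % 6 + 1)) ** 2 for i, v in enumerate(xs))
best = optimize(y)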

uniform random distribution at the bit level

I would like to understand how a uniform random distribution works at the bit level.
For example, in Fortran, random_number gives a uniform distribution on [0, 1). Real numbers have a mantissa and an exponent, so I wonder whether all possible numbers (at the bit level) can be produced. In that case, considered at the bit level, they would not all have the same probability of being chosen. Or, alternatively, not all representable numbers are used, and the ones that are used are equally spaced, the spacing being the largest gap between two consecutive representable numbers in the range (i.e., with exponent 0, the gap between the value with all mantissa bits set and the value with all mantissa bits set except the last).
Are there any links that explain this?
In principle, it's easy.
A uniform random variable in (0, 1) is distributed as:
b0/2 + b1/4 + b2/8 + ...,
where the b_i are unbiased random bits (zeros and ones).
This is a very old insight, dating at least from von Neumann (1951, "Various techniques used in connection with random digits").
Thus, in principle, all that's needed is to generate a steady sequence of unbiased random bits.
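For instance, truncating that expansion after 53 bits gives the usual double-precision construction. A sketch in Python (what Fortran's random_number does internally is implementation-dependent; this only illustrates the principle):

import random

def uniform53():
    # b0/2 + b1/4 + ... + b52/2**53, i.e. 53 unbiased bits: every multiple
    # of 2**-53 in [0, 1) is produced with equal probability 2**-53.
    return random.getrandbits(53) / (1 << 53)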
But generating a "uniform" floating-point number in the interval (0, 1) is non-trivial by comparison. See the following, for example:
Random floating point double in Inclusive Range
To respond to your comment:
In theory, a uniform distribution on (0, 1) is the same as one on [0, 1), (0, 1], or [0, 1]: the values 0 and 1 occur with probability zero, as does any particular number in (0, 1). However, a "uniform" floating-point number on (0, 1) is not the same as one on [0, 1), (0, 1], or [0, 1], since 0 and 1 may occur with positive probability depending on whether the interval contains them. In effect, "throwing away" zeros and ones from a "uniform" floating-point number on [0, 1] is the best that can be done to get a "uniform" floating-point number on (0, 1).
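A sketch of that "throwing away" step, assuming a source that is uniform on [0, 1) such as the uniform53 above:

def uniform_open(gen=uniform53):
    # Resample until the value is nonzero, turning [0, 1) into (0, 1).
    while True:
        u = gen()
        if u != 0.0:
            return u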

Better than brute force algorithms for a coin-flipping game

I have a problem and I feel like there should be a well-known algorithm for solving it that's better than just brute force, but I can't think of one, so I'm asking here.
The problem is as follows: given n sorted (from low to high) lists containing m probabilities, choose one index for each list such that the sum of the chosen indexes is less than m. Then, for each list, we flip a coin, where the chance of it landing heads is equal to the probability at the chosen index for that list. Maximize the chance of the coin landing heads at least once.
Are there any algorithms for solving this problem that are better than just brute force?
This problem seems most similar to the knapsack problem, except the value of the items in the knapsack isn't merely a sum of the items. (Written in Python, instead of sum(p for p in chosen_probabilities) it's 1 - math.prod([1 - p for p in chosen_probabilities]).) And there are restrictions on what items you can add given what items are already in the knapsack. For example, if the index = 3 item for a particular list is already in the knapsack, then adding the index = 2 item for that same list isn't allowed (since you can only pick one index per list). So certain items can and can't be added to the knapsack based on what is already in it.
Linear optimization won't work because the values in the lists don't increase linearly, the final coin probability isn't linear with respect to the chosen probabilities, and our constraint is on the sum of the indexes, rather than the values in the lists themselves. As David has pointed out, linear optimization will work if you use binary variables to pick out the indexes and a logarithm to deal with the non-linearity.
EDIT:
I've found that explaining the motivation behind this problem can be helpful for understanding it. Imagine you have 10 seconds to solve a problem, and three different ways to solve it. You have models of how likely it is that each method will solve the problem, given how many seconds you try that method for, but if you switch methods, you lose all progress on the one you were previously trying. What methods should you try and for how long?
Maximizing 1 - math.prod([1 - p for p in chosen_probabilities]) is equivalent to minimizing math.prod([1 - p for p in chosen_probabilities]), which is equivalent to minimizing the log of this objective, which is a linear function of 0-1 indicator variables, so you could do an integer programming formulation this way.
I can't promise that this will be much better than brute force. The problem is that math.log(1 - p) is well approximated by -p when p is close to zero. My intuition is that for nontrivial instances it will be qualitatively similar to using integer programming to solve subset sum, which doesn't go particularly well.
If you're willing to settle for a bicriteria approximation scheme (get an answer whose chosen indexes sum to less than m and that is at least as good as the best answer whose indexes sum to less than (1 − ε)m), then you can round the probabilities up to multiples of ε and use dynamic programming to get an algorithm that runs in time polynomial in n, m, 1/ε.
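For intuition, here is a dynamic program in that spirit, written in Python against the problem as stated (a sketch, not David's rounding scheme: it uses the integer index budget directly as the DP state, which is enough for modest m):

import math

def best_heads_probability(lists, budget):
    # best[b] = smallest achievable product of (1 - p) over the lists
    # processed so far, spending at most b index units in total.
    best = [1.0] * (budget + 1)
    for row in lists:
        new_best = [math.inf] * (budget + 1)
        for b in range(budget + 1):
            for j in range(min(b, len(row) - 1) + 1):
                new_best[b] = min(new_best[b], (1 - row[j]) * best[b - j])
        best = new_best
    return 1 - best[budget]

# On the motivating example in the implementation below (budget m - 1 = 10),
# best_heads_probability([A_list, B_list, C_list], 10) gives about 0.706.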
Here is working code for David Eisenstat's solution.
To understand the implementation, I think it helps to go through the math first.
As a reminder, there are n lists, each with m options. (In the motivating example at the bottom of the question, each list represents a method for solving the problem, and you are given m-1 seconds to solve the problem. Each list is such that list[index] gives the chance of solving the problem with that method if the method is run for index seconds.)
We let the lists be stored in a matrix called d (named data in the code), where each row in the matrix is a list. (And thus each column represents an index, or, if following the motivating example, an amount of time.)
The probability of the coin landing heads, given that we chose index j*_i for list i, is computed as:
1 - (1 - d[1][j*_1]) * (1 - d[2][j*_2]) * ... * (1 - d[n][j*_n])
We would like to maximize this.
(To explain the stats behind this equation: we're computing 1 minus the probability that the coin never lands on heads. The probability that no flip lands on heads is the product, over the flips, of the probability that each one doesn't. The probability that a single flip doesn't land on heads is just 1 minus the probability that it does, and the probability that it does is the number we've chosen, d[i][j*_i]. Thus the total probability that all the flips land on tails is the product of the individual tail probabilities, and the probability of at least one head is 1 minus that product.)
Which, as David pointed out, is the same as minimizing:
(1 - d[1][j*_1]) * (1 - d[2][j*_2]) * ... * (1 - d[n][j*_n])
Which is the same as minimizing the log of this product:
log((1 - d[1][j*_1]) * ... * (1 - d[n][j*_n]))
Which is equivalent to:
log(1 - d[1][j*_1]) + log(1 - d[2][j*_2]) + ... + log(1 - d[n][j*_n])
Then, since this is a linear sum, we can turn it into an integer program. We'll be minimizing:
the sum over all i and j of log(1 - d[i][j]) * x[i][j]
This lets the computer choose the indexes by allowing it to create an n by m matrix of 1s and 0s called x where the 1s pick out particular indexes. We'll then define rules so that it doesn't pick out invalid sets of indexes.
The first rule is that you have to pick out exactly one index for each list:
for each i: the sum over j of x[i][j] = 1
The second rule is that you have to respect the constraint that the chosen indexes sum to less than m:
the sum over all i and j of j * x[i][j] <= m - 1
And that's it! Then we can just tell the computer to minimize that sum according to those rules. It will spit out an x matrix with a single 1 on each row to tell us which index it has picked for the list on that row.
In code (using the motivating example), this is implemented as:
'''
Requirements:
cvxopt==1.2.6
cvxpy==1.1.10
ecos==2.0.7.post1
numpy==1.20.1
osqp==0.6.2.post0
qdldl==0.1.5.post0
scipy==1.6.1
scs==2.1.2
'''
import math
import cvxpy as cp
import numpy as np
# number of methods
n = 3
# if you have 10 seconds, there are 11 options for each method (0 seconds, 1 second, ..., 10 seconds)
m = 11
# method A has 30% chance of working if run for at least 3 seconds
# equivalent to [0, 0, 0, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3]
A_list = [0, 0, 0] + [0.3] * (m - 3)
# method B has 30% chance of working if run for at least 3 seconds
# equivalent to [0, 0, 0, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3]
B_list = [0, 0, 0] + [0.3] * (m - 3)
# method C has 40% chance of working if run for 4 seconds, 30% otherwise
# equivalent to [0.3, 0.3, 0.3, 0.3, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4]
C_list = [0.3, 0.3, 0.3, 0.3] + [0.4] * (m - 4)
data = [A_list, B_list, C_list]
# do the logarithm
log_data = []
for row in data:
    log_row = []
    for col in row:
        # deal with domain exception: log(0) is undefined
        if col == 1:
            new_col = float('-inf')
        else:
            new_col = math.log(1 - col)
        log_row.append(new_col)
    log_data.append(log_row)
log_data = np.array(log_data)
x = cp.Variable((n, m), boolean=True)
objective = cp.Minimize(cp.sum(cp.multiply(log_data, x)))
# the current solver doesn't work with equalities, so each equality must be split into two inequalities.
# see https://github.com/cvxgrp/cvxpy/issues/1112
one_choice_per_method_constraint = [cp.sum(x[i]) <= 1 for i in range(n)] + [cp.sum(x[i]) >= 1 for i in range(n)]
# constrain the solution to not use more time than is allowed
# note that the time allowed is (m - 1), not m: index j stands for j seconds,
# and the indexes run from 0 to m - 1
js = np.tile(np.array(list(range(m))), (n, 1))
time_constraint = [cp.sum(cp.multiply(js, x)) <= m - 1]
constraints = one_choice_per_method_constraint + time_constraint
prob = cp.Problem(objective, constraints)
result = prob.solve()
def compute_probability(data, choices):
    # compute 1 - ((1 - p1) * (1 - p2) * ...)
    return 1 - np.prod(np.add(1, -np.multiply(data, choices)))
print("Choices:")
print(x.value)
'''
Choices:
[[0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]]
'''
print("Chance of success:")
print(compute_probability(data, x.value))
'''
Chance of success:
0.7060000000000001
'''
And there we have it! The computer has correctly determined that running method A for 3 seconds, method B for 3 seconds, and method C for 4 seconds is optimal. (Remember that column j of the x matrix corresponds to j seconds.)
Thank you, David, for the suggestion!

Randomly select N unique elements from a list, given a probability for each

I've run into a problem: I have a list or array (IList) of elements that have a field (float Fitness). I need to efficiently choose N random unique elements depending on this variable: the bigger it is, the more likely the element is to be chosen.
I searched on the internet, but the algorithms I found were rather unreliable.
The answer stated here seems to give a bigger probability to elements at the beginning, which I need to make sure to avoid.
-Edit-
For example, I need to choose from objects with the values [-5, -3, 0, 1, 2.5] (negative values included).
The basic algorithm is to sum the values, fix an order for the items, draw a point from [0, sum(values)), and see which item's "window" it intersects.
For the values [0.1, 0.2, 0.3] the "windows" [0-0.1, 0.1-0.3, 0.3-0.6] will look like this:
1 23 456
|-|--|---|
|-*--*---|
And you draw a point [0-0.6] and see what window it hit on the axis.
Pseudo-python for this:
original_values = {val1, val2, ... valn}
# list is to order them; order doesn't matter outside this context.
values = list(original_values)
limit = sum(values)
draw = random() * limit
while True:
    candidate = values.pop()
    if candidate > draw:
        return candidate
    draw -= candidate
So what shall those numbers represent?
Does 2.5 mean that the probability of being chosen is twice as high as for 1.25? Well, the negative values don't fit into that scheme.
I guess fitness means something like -5: very ill, 2.5: very fit. We have a range of 7.5 and can repeatedly pick a random element, if we know how many candidates there are and have access by index.
Then take a random number between -5 and 2.5 and see if it is lower than or equal to the candidate's fitness. If so, the candidate is picked; otherwise, repeat from step 1. I would generate a new survival threshold each round, because if we drew a 2.5 but no candidate with that fitness remained, we would search forever.
The range of fitnesses has to be known for this, too.
fitness:   -5  -3   0   1  2.5
rand -5:    x   x   x   x   x
    -2.5:   -   -   x   x   x
       0:   -   -   x   x   x
     2.5:   -   -   -   -   x
If every candidate is to be tested each round, and the -5 guy is to have a chance to survive, you have to stretch the interval of random numbers a bit to give him one, for instance to -6 to 3.
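A minimal sketch of that rejection loop in Python (the stretched bounds -6 and 3 and the function name are illustrative, not from the question):

import random

def pick_one(fitnesses, lo=-6.0, hi=3.0):
    # Draw a fresh survival threshold each round, then a random candidate;
    # accept the candidate if its fitness reaches the threshold.
    while True:
        threshold = random.uniform(lo, hi)
        i = random.randrange(len(fitnesses))
        if fitnesses[i] >= threshold:
            return i

# pick_one([-5, -3, 0, 1, 2.5]) returns an index, favouring fitter elements;
# to choose N unique elements, remove the picked element and repeat.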

Maximum non-segment sum

We have a list / array of numbers (positives and negatives are all possible).
A segment is defined as a contiguous subsequence of the numbers. For example, if the array is [1;-2;3;4;5], then [1;-2;3] and [-2;3;4] are segments. All numbers in a segment must be contiguous.
A non-segment is any subsequence of the array that is not a segment. Contiguous numbers may appear in a non-segment, but at least two of the chosen numbers must not be contiguous.
For example, [1;3;4] is a non-segment, and [1;-2;3;5] is also a non-segment because 3 and 5 are not contiguous (there is a 4 between them in the original array).
The question is what is the non-segment having the maximum sum?
Note
Numbers can be a mix of positives and negatives
It is not the problem of http://algorithmsbyme.wordpress.com/2012/07/29/amazon-interview-question-maximum-possible-sum-with-non-consecutive-numbers/ or Maximum sum of non consecutive elements. In those problems, no numbers can be contiguous and all numbers are positive.
This is problem 11 in the book Pearls of Functional Algorithm Design, which says there is a linear-time way to solve it.
But I can't understand or find a linear way, so I'm trying my luck here.
Here's a solution better suited to the functional programming idiom. One can imagine a four-state finite automaton that accepts strings having two non-adjacent 1s.
      0         1         0         0,1
     ___       ___       ___        ___
     v /   1   v /   0   v /   1    v /
---> (q0) ---> (q1) ---> (q2) ---> ((q3))
What the Haskell program below does is essentially to scan the numbers one at a time and remember the maximum values that can be made via choices that, when interpreted as 0s and 1s, put the automaton in state q1 (segmentEndingHere), state q2 (segmentNotEndingHere), or state q3 (nonSegment). This technique is a sledgehammer that works on many of these problems about optimization on a sequence.
import Control.Monad (liftM)

maximumNonSegmentSum :: (Num a, Ord a) => [a] -> Maybe a
maximumNonSegmentSum = mnss Nothing Nothing Nothing
  where
    (^+) :: (Num a) => a -> Maybe a -> Maybe a
    (^+) = liftM . (+)
    mnss :: (Num a, Ord a) => Maybe a -> Maybe a -> Maybe a -> [a] -> Maybe a
    mnss segmentEndingHere segmentNotEndingHere nonSegment xs
      = case xs of
          [] -> nonSegment
          x : xs'
            -> mnss ((x ^+ segmentEndingHere) `max` Just x)
                    (segmentNotEndingHere `max` segmentEndingHere)
                    (((x `max` 0) ^+ nonSegment) `max` (x ^+ segmentNotEndingHere))
                    xs'
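For example, maximumNonSegmentSum [1, -2, 3, 4, 5] evaluates to Just 13 (take 1, 3, 4, and 5, skipping the -2), and it returns Nothing on inputs too short to contain any non-segment.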
There are 2 possibilities:
There exist at most two non-negative numbers, and, if two exist, they are neighbouring.
In this case we pick the largest pair of non-neighbouring numbers. This can be done in linear time: find the largest number, sum it with the largest number not neighbouring it, and also try the sum of its two neighbouring numbers.
Example:
Input: [-5, -10, -6, -2, -1, -2, -10]
The largest number is -1, so we sum -1 and the largest non-neighbouring number (-5), which gives -6. Then we also try -2 and -2, giving -4. So the largest non-segment sum is -4.
There exist at least two non-neighbouring non-negative numbers.
We pick all positive numbers. If the largest number is zero (i.e. there are no positive numbers), pick all the zeros instead.
If all the picked numbers are consecutive, try to:
Exclude the smallest one that's not at one of the ends.
Include the largest (i.e. closest to 0) non-positive number that's not neighbouring the picked numbers (if such a number is 0, this is the best option).
In turn, exclude each number at the ends of the picked sequence and include the non-positive number next to it (do this only if there is a number next to it).
Pick the option giving the largest sum.
Clearly all of this can happen in linear time.
Example:
Input: [-5, -1, 5, 7, 9, 11, -1, -10]
So first we pick all positive numbers: 5, 7, 9, 11. But they're consecutive.
So we try to exclude the smallest non-end number (7),
giving us sum(5, 9, 11) = 25.
Then we try to include the largest non-neighbouring negative number (-5),
giving us sum(-5, 5, 7, 9, 11) = 27.
Then we try to exclude the left edge (5) and include the number next to it (-1),
giving us sum(-1, 7, 9, 11) = 26.
Then we try to exclude the right edge (11) and include the number next to it (-1),
giving us sum(-1, 5, 7, 9) = 20.
Clearly the maximum sum is 27.
Note how changing a single value could make any one of these options the maximum sum; thus all of the options need to be checked.
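A brute-force reference implementation is handy for checking this case analysis on small inputs (a sketch in Python; exponential time, so only for testing):

from itertools import combinations

def max_non_segment_sum_bruteforce(xs):
    # Try every subset of at least two indexes, skipping those that form
    # a segment (a subset is a segment iff it is a contiguous run).
    best = None
    for k in range(2, len(xs) + 1):
        for idx in combinations(range(len(xs)), k):
            if idx[-1] - idx[0] + 1 == k:
                continue  # contiguous run of indexes: a segment, not allowed
            s = sum(xs[i] for i in idx)
            if best is None or s > best:
                best = s
    return best

# max_non_segment_sum_bruteforce([-5, -10, -6, -2, -1, -2, -10]) == -4
# max_non_segment_sum_bruteforce([-5, -1, 5, 7, 9, 11, -1, -10]) == 27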
Take all of the positive numbers. If they form a segment, check if you can add in something not adjacent to it. Also check if you can pick something off in the middle. Also check if you can pick off an end and put in the number next to it. It's not too hard to prove that one of these leads to the best non-segment.
