I'm looking for an algorithm to map values to a unit scale with logarithmic spacing. The scale ranges from 0 to 1. Incoming values would be in the range of 0 to 10000.
0 maps to 0, 1 maps to .2, 10 maps to .4, 100 maps to .6
1000 maps to .8, 10000 maps to 1.0
Help/pointers would be appreciated.
If you are literally looking "to map values to a unit scale with logarithmic spacing", and with f(0)=0, then your example values are wrong.
However, you can do this with f(x) = log(1+x)/log(1+max)
So with max=10000, we have:
f(0)=0
f(1)=0.0753
f(2)=0.1193
f(10)=0.2603
f(100)=0.5010
f(1000)=0.7501
f(10000)=1
which on a log scale makes sense: if 1 is near 0 and 10000 is 1, then 100, which has the average number of zeroes of 1 and 10000, should be around 0.5. You really don't want to start considering log(0) as an option.
However, as soon as your minimum value is not 0 anymore (even if it is very small, as long as it is non-zero), you can do a more reasonable interpolation:
f(x) = (log(x) - log(min)) / (log(max) - log(min))
which is the same as user3246191's comment under his answer:
f(x) = log(x/min) / log(max/min)
Since all values returned by f in this post are ratios of logarithms, you can take the logarithm in any base you please. I would recommend the natural one for your programming language (i.e., if log10(x) is defined as ln(x)/ln(10), take ln(x) instead).
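To make this concrete, here is a small Python sketch of the two formulas above (the function name and keyword defaults are just illustrative):

import math

def unit_log_scale(x, xmin=0.0, xmax=10000.0):
    # log(1+x) variant when the minimum is 0, plain log interpolation otherwise
    if xmin == 0.0:
        return math.log1p(x) / math.log1p(xmax)
    return math.log(x / xmin) / math.log(xmax / xmin)

print(unit_log_scale(100))  # ~0.5010, matching the table above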
It is not really clear what transform you are trying to apply. From what you describe, it seems that a potential function would be
f(x) = 0.2(1+ log(x)/log(10))
which satisfies f(1) = 0.2, f(10) = 0.4, f(100) = 0.6, f(1000) = 0.8, f(10000) = 1
but on the other hand f(0.1) = 0 and f(0) = -infty.
Of course it is possible to modify f so that f(0) = 0 but this will be somewhat arbitrary and your question is not really well formulated then.
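For illustration, a minimal Python sketch of this formula, with the kind of arbitrary special case for 0 mentioned above:

import math

def f(x):
    # 0.2 * (1 + log10(x)): maps 1 -> 0.2, 10 -> 0.4, ..., 10000 -> 1.0
    if x <= 0:
        return 0.0  # arbitrary choice, since log(0) is undefined
    return 0.2 * (1 + math.log10(x))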
I have a problem and I feel like there should be a well-known algorithm for solving it that's better than just brute force, but I can't think of one, so I'm asking here.
The problem is as follows: given n sorted (from low to high) lists containing m probabilities, choose one index for each list such that the sum of the chosen indexes is less than m. Then, for each list, we flip a coin, where the chance of it landing heads is equal to the probability at the chosen index for that list. Maximize the chance of the coin landing heads at least once.
Are there any algorithms for solving this problem that are better than just brute force?
This problem seems most similar to the knapsack problem, except the value of the items in the knapsack isn't merely a sum of the items in the knapsack. (Written in Python, instead of sum(p for p in chosen_probabilities) it's 1 - math.prod([1 - p for p in chosen_probabilities]).) Also, there are restrictions on which items you can add given what items are already in the knapsack. For example, if the index = 3 item for a particular list is already in the knapsack, then adding the item with index = 2 for that same list isn't allowed (since you can only pick one index for each list). So there are certain items that can and can't be added to the knapsack based on what items are already in it.
Linear optimization won't work because the values in the lists don't increase linearly, the final coin probability isn't linear with respect to the chosen probabilities, and our constraint is on the sum of the indexes, rather than the values in the lists themselves. As David has pointed out, linear optimization will work if you use binary variables to pick out the indexes and a logarithm to deal with the non-linearity.
EDIT:
I've found that explaining the motivation behind this problem can be helpful for understanding it. Imagine you have 10 seconds to solve a problem, and three different ways to solve it. You have models of how likely it is that each method will solve the problem, given how many seconds you try that method for, but if you switch methods, you lose all progress on the one you were previously trying. What methods should you try and for how long?
Maximizing 1 - math.prod([1 - p for p in chosen_probabilities]) is equivalent to minimizing math.prod([1 - p for p in chosen_probabilities]), which is equivalent to minimizing the log of this objective, which is a linear function of 0-1 indicator variables, so you could do an integer programming formulation this way.
I can't promise that this will be much better than brute force. The problem is that math.log(1 - p) is well approximated by -p when p is close to zero. My intuition is that for nontrivial instances it will be qualitatively similar to using integer programming to solve subset sum, which doesn't go particularly well.
If you're willing to settle for a bicriteria approximation scheme (get an answer such that the sum of the chosen indexes is less than m, that is at least as good as the best answer summing to less than (1 − ε) m) then you can round up the probability to multiples of ε and use dynamic programming to get an algorithm that runs in time polynomial in n, m, 1/ε.
Here is working code for David Eisenstat's solution.
To understand the implementation, I think it helps to go through the math first.
As a reminder, there are n lists, each with m options. (In the motivating example at the bottom of the question, each list represents a method for solving the problem, and you are given m-1 seconds to solve the problem. Each list is such that list[index] gives the chance of solving the problem with that method if the method is run for index seconds.)
We let the lists be stored in a matrix called d (named data in the code), where each row in the matrix is a list. (And thus each column represents an index, or, if following the motivating example, an amount of time.)
The probability of the coin landing heads at least once, given that we chose index j* for list i, is computed as

1 - prod((1 - d[i][j*]) for i in 1..n)

We would like to maximize this.
(To explain the stats behind this equation, we're computing 1 minus the probability that the coin never lands on heads. The probability that the coin never lands on heads is the probability that every flip doesn't land on heads. The probability that a single flip doesn't land on heads is just 1 minus the probability that it does land on heads. And the probability it does land on heads is the number we've chosen, d[i][j*]. Thus, the total probability that all the flips land on tails is just the product of the probabilities that each one lands on tails. And then the probability that the coin lands on heads at least once is just 1 minus the probability that all the flips land on tails.)
Which, as David pointed out, is the same as minimizing:

prod((1 - d[i][j*]) for i in 1..n)
Which is the same as minimizing:

log(prod((1 - d[i][j*]) for i in 1..n))
Which is equivalent to:

sum(log(1 - d[i][j*]) for i in 1..n)
Then, since this is a linear sum, we can turn it into an integer program.
We'll be minimizing:

sum(x[i][j] * log(1 - d[i][j]) for all i, j)
This lets the computer choose the indexes by allowing it to create an n by m matrix of 1s and 0s called x where the 1s pick out particular indexes. We'll then define rules so that it doesn't pick out invalid sets of indexes.
The first rule is that you have to pick out exactly one index for each list:

sum(x[i][j] for j in 0..m-1) == 1, for each list i
The second rule is that you have to respect the constraint that the chosen indexes must sum to less than m (i.e., to at most m - 1):

sum(j * x[i][j] over all i and j) <= m - 1
And that's it! Then we can just tell the computer to minimize that sum according to those rules. It will spit out an x matrix with a single 1 on each row to tell us which index it has picked for the list on that row.
In code (using the motivating example), this is implemented as:
'''
Requirements:
cvxopt==1.2.6
cvxpy==1.1.10
ecos==2.0.7.post1
numpy==1.20.1
osqp==0.6.2.post0
qdldl==0.1.5.post0
scipy==1.6.1
scs==2.1.2
'''
import math
import cvxpy as cp
import numpy as np
# number of methods
n = 3
# if you have 10 seconds, there are 11 options for each method (0 seconds, 1 second, ..., 10 seconds)
m = 11
# method A has 30% chance of working if run for at least 3 seconds
# equivalent to [0, 0, 0, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3]
A_list = [0, 0, 0] + [0.3] * (m - 3)
# method B has 30% chance of working if run for at least 3 seconds
# equivalent to [0, 0, 0, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3]
B_list = [0, 0, 0] + [0.3] * (m - 3)
# method C has 40% chance of working if run for 4 seconds, 30% otherwise
# equivalent to [0.3, 0.3, 0.3, 0.3, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4]
C_list = [0.3, 0.3, 0.3, 0.3] + [0.4] * (m - 4)
data = [A_list, B_list, C_list]
# do the logarithm
log_data = []
for row in data:
    log_row = []
    for col in row:
        # deal with domain exception
        if col == 1:
            new_col = float('-inf')
        else:
            new_col = math.log(1 - col)
        log_row.append(new_col)
    log_data.append(log_row)
log_data = np.array(log_data)
x = cp.Variable((n, m), boolean=True)
objective = cp.Minimize(cp.sum(cp.multiply(log_data, x)))
# the current solver doesn't work with equalities, so each equality must be split into two inequalities.
# see https://github.com/cvxgrp/cvxpy/issues/1112
one_choice_per_method_constraint = [cp.sum(x[i]) <= 1 for i in range(n)] + [cp.sum(x[i]) >= 1 for i in range(n)]
# constrain the solution to not use more time than is allowed
# note that the time allowed is (m - 1), not m, because time is 1-indexed and the lists are 0-indexed
js = np.tile(np.array(list(range(m))), (n, 1))
time_constraint = [cp.sum(cp.multiply(js, x)) <= m - 1, cp.sum(cp.multiply(js, x)) >= m - 1]
constraints = one_choice_per_method_constraint + time_constraint
prob = cp.Problem(objective, constraints)
result = prob.solve()
def compute_probability(data, choices):
    # compute 1 - ((1 - p1) * (1 - p2) * ...)
    return 1 - np.prod(np.add(1, -np.multiply(data, choices)))
print("Choices:")
print(x.value)
'''
Choices:
[[0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]]
'''
print("Chance of success:")
print(compute_probability(data, x.value))
'''
Chance of success:
0.7060000000000001
'''
And there we have it! The computer has correctly determined that running method A for 3 seconds, method B for 3 seconds, and method C for 4 seconds is optimal. (Remember that the x matrix is 0-indexed, while the times are 1-indexed.)
Thank you, David, for the suggestion!
Given a set of 2D points (black dots in the picture) I need to choose two lines to somehow represent those points. I probably need to minimize the sum, over all points, of [distance from the point to the closer of the two lines]^2, although if any other metric makes this easier, that is fine too.
The obvious but inefficient approach is to run least squares over all 2^n partitions. A practical approach is probably iterative improvement, maybe starting with a random partition.
Is there any research on algorithms to handle this problem?
I think this can be formulated as an explicit optimization problem:
min sum(j, r1(j)^2 + r2(j)^2) (quadratic)
subject to
r1(j) = (y(j) - a0 - a1*x(j)) * δ(j) (quadratic)
r2(j) = (y(j) - b0 - b1*x(j)) * (1-δ(j)) (quadratic)
δ(j) ∈ {0,1}
We do the assignment of points to a line and the regression (minimization of the sum of the squared residuals) at the same time.
This is a non-convex quadratically constrained mixed-integer quadratic programming problem. Solvers that can handle this include Gurobi, Baron, Couenne, Antigone.
We can reformulate this a bit:
min sum(j, r(j)^2) (convex quadratic)
subject to
r(j) = y(j) - a0 - a1*x(j) + s1(j) (one of these will be relaxed)
r(j) = y(j) - b0 - b1*x(j) + s2(j) (all linear)
-U*δ(j) <= s1(j) <= U*δ(j)
-U*(1-δ(j)) <= s2(j) <= U*(1-δ(j))
δ(j) ∈ {0,1}
s1(j),s2(j) ∈ [-U,U]
U = 1000 (reasonable bound, constant)
This makes it a straight convex MIQP model. This will allow more solvers to be used (e.g. Cplex) and it is much easier to solve. Some other formulations are here. Some of the models mentioned do not need the bounds I had to use in the above big-M formulation. It is noted these models deliver proven optimal solutions (for the non-convex models this would require a global solver; the convex models are easier and don't need this). Furthermore, instead of a least squares objective, we can also form an L1 objective. In the latter case we end up with a linear MIP model.
A small test confirms this works:
This problem has 50 points, and needed 1.25 seconds using Cplex's MIQP solver on a slow laptop. This may be a case of a statistical problem where MIP/MIQP methods have something to offer.
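For reference, here is a minimal sketch of the convex big-M formulation above in cvxpy (which another answer on this page already uses). The synthetic data, variable names, and solver choice are my own assumptions, not the model actually benchmarked; a mixed-integer-capable solver such as CPLEX or Gurobi must be available behind cvxpy.

import cvxpy as cp
import numpy as np

# synthetic data: points scattered around two lines
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 30)
on_first = rng.random(30) < 0.5
y = np.where(on_first, 1.0 + 0.5 * x, 8.0 - 0.7 * x) + rng.normal(0, 0.2, 30)

U = 1000  # big-M bound, as in the formulation above
a = cp.Variable(2)  # a[0] + a[1]*x, line 1
b = cp.Variable(2)  # b[0] + b[1]*x, line 2
delta = cp.Variable(30, boolean=True)  # delta = 1 assigns a point to line 2
s1 = cp.Variable(30)
s2 = cp.Variable(30)
r = cp.Variable(30)

constraints = [
    r == y - a[0] - a[1] * x + s1,  # relaxed when delta = 1
    r == y - b[0] - b[1] * x + s2,  # relaxed when delta = 0
    s1 <= U * delta, s1 >= -U * delta,
    s2 <= U * (1 - delta), s2 >= -U * (1 - delta),
]
prob = cp.Problem(cp.Minimize(cp.sum_squares(r)), constraints)
prob.solve()  # needs a mixed-integer QP/SOCP solver, e.g. CPLEX or Gurobi
print(a.value, b.value, delta.value)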
Two related concepts from the literature come to mind.
First, my intuition is that there should be a way to interpret this problem as estimating the parameters of a mixture model. I haven't worked out the details; parameter estimation for such models typically uses expectation-maximization, so I can just describe how that would work here: initialize a partition into two parts, then alternately run a regression on each part and reassign points based on their distance to the new regression lines.
Second, assuming that the input is relatively clean, you should be able to get a good initial partition using RANSAC. For some small k, take two disjoint random samples of k points and fit lines through them, then assign every other point to the nearer line. For a (100 − x)% chance of success you'll want to repeat this about ln(100/x) × 2^(2k−1) times and take the best one.
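To make the first idea concrete, here is a minimal Python sketch of that alternating scheme (the function name and the use of numpy's polyfit are my own choices, not anything prescribed above); like any local search it can get stuck, so restarting from several random partitions helps:

import numpy as np

def fit_two_lines(x, y, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    assign = rng.integers(0, 2, len(x))   # random initial partition
    lines = np.zeros((2, 2))              # each row: (slope, intercept)
    for _ in range(iters):
        # refit each line on its currently assigned points
        for k in range(2):
            if np.sum(assign == k) >= 2:
                lines[k] = np.polyfit(x[assign == k], y[assign == k], 1)
        # reassign each point to the line with the smaller residual
        resid = np.abs(y[:, None] - (x[:, None] * lines[:, 0] + lines[:, 1]))
        new_assign = np.argmin(resid, axis=1)
        if np.array_equal(new_assign, assign):
            break
        assign = new_assign
    return lines, assign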
In OPL CPLEX, if I start with the curve fitting example from Model Building:
Let me add a few points to the .dat first
n=24;
x = [1,2,3,4,5,0.0, 0.5, 1.0, 1.5, 1.9, 2.5, 3.0, 3.5, 4.0, 4.5,
5.0, 5.5, 6.0, 6.6, 7.0, 7.6, 8.5, 9.0, 10.0];
y = [10,20,30,40,50,1.0, 0.9, 0.7, 1.5, 2.0, 2.4, 3.2, 2.0, 2.7, 3.5,
1.0, 4.0, 3.6, 2.7, 5.7, 4.6, 6.0, 6.8, 7.3];
then the .mod, using MIP and absolute value for the distance,
execute
{
   cplex.tilim=10;
}
int n=...;
range points=1..n;
float x[points]=...;
float y[points]=...;
int nbLines=2;
range lines=1..nbLines;
dvar float a[lines];
dvar float b[lines];
// distance between a point and a line
dvar float dist[points][lines];
// minimal distance to the closest line
dvar float dist2[points];
dvar float+ obj;
minimize obj;
subject to
{
   obj==sum(i in points) dist2[i];
   forall(i in points,j in lines) dist[i][j]==abs(b[j]*x[i]+a[j]-y[i]);
   forall(i in points) dist2[i]==min(l in lines) dist[i][l];
}
// which line for each point ?
int whichline[p in points]=first({l | l in lines : dist2[p]==dist[p][l]});
execute
{
   writeln("b = ",b);
   writeln("a = ",a);
   writeln("which line = ",whichline);
}
gives
b = [0.6375 10]
a = [0.58125 0]
which line = [2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]
With a quadratic objective, after some reformulation:
int n=...;
range points=1..n;
float x[points]=...;
float y[points]=...;
int nbLines=2;
range lines=1..nbLines;
dvar float a[lines];
dvar float b[lines];
dvar float distance[points][lines];
dvar boolean which[points]; // 1 means line 1, 0 means line 2
minimize sum(i in points,j in lines) distance[i][j]^2;
subject to
{
   forall(i in points,j in 0..1) (which[i]==j) => (distance[i][2-j]==b[2-j]*x[i]+a[2-j]-y[i]);
}
execute
{
   writeln("b = ",b);
   writeln("a = ",a);
   writeln("which = ",which);
}
gives
b = [0.61077 10]
a = [0.42613 0]
which = [0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]
Suppose I provide you with random seeds between 0 and 1, but after some observations you find out that my seeds are not distributed properly and most of them are less than 0.5. Would you still be able to use this source, by applying an algorithm that makes the seeds more uniformly distributed?
If yes, please provide me with necessary sources.
It really depends on how the numbers are distributed in the interval [0...1]. In general, you need the CDF (cumulative distribution function) to map some arbitrary distribution on [0...1] into the uniform one on [0...1]. But for some particular cases you can do a simpler transformation. The code below (in Python) first constructs a simple unfair RNG which generates 60% of its numbers below 0.5 and 40% above.
import random
def unfairRng():
    q = random.random()
    if q < 0.6:  # result is skewed toward [0...0.5] interval
        return 0.5*random.random()
    return 0.5 + 0.5*random.random()
random.seed(312345)
nof_trials = 100000
h = [0, 0]
for k in range(0, nof_trials):
    q = unfairRng()
    h[0 if q < 0.5 else 1] += 1
print(h)
I then count the numbers above and below 0.5, and the output on my machine is
[60086, 39914]
which is quite close to the 60/40 split I described.
Ok, let's "fix" the RNG by taking numbers from unfairRng and alternating between returning the value itself and, the next time, returning 1 - value. Again, in Python:
def fairRng():
    if fairRng.even == 0:
        fairRng.even = 1
        return unfairRng()
    else:
        fairRng.even = 0
        return 1.0 - unfairRng()
fairRng.even = 0
h = [0, 0]
for k in range(0, nof_trials):
    q = fairRng()
    h[0 if q < 0.5 else 1] += 1
print(h)
Again, counting the histogram, the result is
[49917, 50083]
which "fixes" the unfair RNG and makes it fair, at least with respect to the 0.5 split.
Getting a fair coin flip out of an unfair coin is done by flipping it twice and, if the results differ, using the first result; otherwise, discarding the pair and trying again.
This gives a flip with exactly 50/50 odds, but it is not guaranteed to finish within any fixed number of flips.
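A minimal Python sketch of that trick (the biased source and its 30% bias are just stand-ins for illustration):

import random

def biased_flip(p_heads=0.3):
    # stand-in for an unfair coin: True ("heads") with probability p_heads
    return random.random() < p_heads

def fair_flip():
    # von Neumann trick: flip twice; if the results differ, use the first,
    # otherwise discard the pair and try again
    while True:
        a, b = biased_flip(), biased_flip()
        if a != b:
            return a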
Random number sequences generated by any algorithm will have no more entropy ("randomness") than the seeds themselves. For instance, if each seed has an entropy of only 1 bit for every 64 bits, they can each be transformed, at least in theory, to a 1 bit random number with full entropy. However, measuring the entropy of those seeds is nontrivial (entropy estimation). Moreover, not every algorithm is suitable in all cases for extracting the entropy of random seeds (entropy extraction, randomness extraction).
I'm searching for an algorithm (no matter what programming language, maybe pseudo-code?) where you get random numbers with different probabilities.
For example:
A random generator which simulates a die where the chance for a '6'
is 50% and for the other 5 numbers it's 10%.
The algorithm should be scalable, because this is my exact problem:
I have an array (or database) of elements from which I want to
select one random element. But each element should have a different
probability of being selected. My idea is that every element gets a
number, and this number divided by the sum of all numbers gives the
chance for that element to be randomly selected.
Does anybody know a good programming language (or library) for this problem?
The best solution would be a good SQL query which delivers one random entry,
but I would also be happy with any hint or attempt in another programming language.
A simple algorithm to achieve it is:
Create an auxiliary array where sum[i] = p1 + p2 + ... + pi. This is done only once.
When you draw a number, draw r with uniform distribution over [0, sum[n]), and binary search for the first entry of sum that is higher than r.
It is easy to see that the probability for r to lie in a certain range [sum[i-1], sum[i]) is indeed sum[i] - sum[i-1] = pi.
(In the above, we regard sum[0] = 0, for completeness.)
For your die example:
You have:
p1=p2=....=p5 = 0.1
p6 = 0.5
First, calculate sum array:
sum[1] = 0.1
sum[2] = 0.2
sum[3] = 0.3
sum[4] = 0.4
sum[5] = 0.5
sum[6] = 1
Then, each time you need to draw a number: draw a random number r in [0,1), and pick the element i for which sum[i] is the first entry greater than r, for example:
r1 = 0.45 -> element = 5
r2 = 0.8  -> element = 6
r3 = 0.1  -> element = 2
r4 = 0.09 -> element = 1
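A minimal Python sketch of this approach for the die example, using bisect for the binary search (the helper names are just illustrative):

import bisect
import random

probs = [0.1, 0.1, 0.1, 0.1, 0.1, 0.5]   # p1..p6

# build the cumulative-sum array once
sums = []
total = 0.0
for p in probs:
    total += p
    sums.append(total)

def draw():
    # uniform r in [0, sum[n]); pick the first cumulative entry greater than r
    r = random.random() * total
    return bisect.bisect_right(sums, r) + 1   # 1-based element index

counts = [0] * len(probs)
for _ in range(100000):
    counts[draw() - 1] += 1
print(counts)   # roughly [10000, 10000, 10000, 10000, 10000, 50000]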
An alternative answer. Your example was in percentages, so set up an array with 100 slots. A 6 is 50%, so put 6 in 50 of the slots. 1 to 5 are at 10% each, so put 1 in 10 slots, 2 in 10 slots etc. until you have filled all 100 slots in the array. Now pick one of the slots at random using a uniform distribution in [0, 99] or [1, 100] depending on the language you are using.
The contents of the selected array slot will give you the distribution you want.
ETA: On second thoughts, you don't actually need the array, just use cumulative probabilities to emulate the array:
r = rand(100) // In range 0 -> 99 inclusive.
if (r < 50) return 6; // Up to 50% returns a 6.
if (r < 60) return 1; // Between 50% and 60% returns a 1.
if (r < 70) return 2; // Between 60% and 70% returns a 2.
etc.
You already know what numbers are in what slots, so just use cumulative probabilities to pick a virtual slot: 50; 50 + 10; 50 + 10 + 10; ...
Be careful of edge cases and whether your RNG is 0 -> 99 or 1 -> 100.
I'm looking for a decent, elegant method of calculating this simple logic.
Right now I can't think of one; it's making my head spin.
I am required to do some action only 15% of the time.
I'm used to "50% of the time" where I just mod the milliseconds of the current time and see if it's odd or even, but I don't think that's elegant.
How would I elegantly calculate "15% of the time"? Random number generator maybe?
Pseudo-code or any language are welcome.
Hope this is not subjective, since I'm looking for the "smartest" short-hand method of doing that.
Thanks.
Solution 1 (double)
get a random double between 0 and 1 (whatever language you use, there must be such a function)
do the action only if it is smaller than 0.15
Solution 2 (int)
You can also achieve this by creating a random int and checking whether it is divisible by 6 or 7. UPDATE: This is not optimal.
You can produce a random number between 0 and 99, and check if it's less than 15:
if (rnd.Next(100) < 15) ...
You can also reduce the numbers, as 15/100 is the same as 3/20:
if (rnd.Next(20) < 3) ...
Random number generator would give you the best randomness. Generate a random between 0 and 1, test for < 0.15.
Using the time like that isn't truly random, as it's influenced by processing time. If a task takes less than 1 millisecond to run, then the next random choice will be the same one.
That said, if you do want to use the millisecond-based method, do milliseconds % 20 < 3.
Just use a PRNG. As always, it's a performance vs. accuracy trade-off. I think rolling your own based directly on the time is a waste of time (pun intended). You'll probably get biasing effects even worse than a run-of-the-mill linear congruential generator.
In Java, I would use nextInt:
myRNG.nextInt(100) < 15
Or (mostly) equivalently:
myRNG.nextInt(20) < 3
There are ways to get a random integer in other languages too (multiple ways actually, depending on how accurate it has to be).
Using modulo arithmetic you can easily do something every Xth run, like so
(modulo 6 will give you roughly 15%, since 1/6 ≈ 16.7%):
if (microtime() % 6 === 0) do it
Another option:
if (rand(0, 1) < 0.15) do it
boolean array[100] = {true: first 15, false: rest};
shuffle(array);
while (array.size > 0)
{
    element = array.pop_front(); // pop the first element of the array
    if (element == true)
        do_action();
    else
        do_something_else();
}
// rebuild, reshuffle, and redo the whole thing when no elements are left.
Here's one approach that combines randomness and a guarantee that eventually you get a positive outcome in a predictable range:
Have a target (15 in your case), a counter (initialized to 0), and a flag (initialized to false).
Accept a request.
If the counter is 15, reset the counter and the flag.
If the flag is true, increment the counter and return a negative outcome.
Get a random true or false based on one of the methods described in other answers, but use a probability of 1/(15-counter).
Increment counter
If result is true, set flag to true and return a positive outcome. Else return a negative outcome.
Accept next request
This means that the first request has a probability of 1/15 of returning positive, but by the 15th request, if no positive result has been returned yet, there's a probability of 1/1 of a positive result.
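Here is a minimal Python sketch of that scheme (the class and method names are just illustrative); note that the counter advances on every request so each window of 15 resets correctly:

import random

class WindowedChance:
    def __init__(self, target=15):
        self.target = target
        self.counter = 0
        self.flag = False

    def request(self):
        if self.counter == self.target:   # start a new window of `target` requests
            self.counter = 0
            self.flag = False
        if self.flag:                     # already fired in this window
            self.counter += 1
            return False
        # fire with probability 1/(target - counter)
        fired = random.random() < 1.0 / (self.target - self.counter)
        self.counter += 1
        self.flag = fired
        return fired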
This quote is from a great article about how to use a random number generator:
Note: Do NOT use
y = rand() % M;
as this focuses on the lower bits of rand(). For linear congruential random number generators, which rand() often is, the lower bytes are much less random than the higher bytes. In fact the lowest bit cycles between 0 and 1. Thus rand() may cycle between even and odd (try it out). Note rand() does not have to be a linear congruential random number generator. It's perfectly permissible for it to be something better which does not have this problem.
and it contains formulas and pseudo-code for
r = [0,1) = {r: 0 <= r < 1} real
x = [0,M) = {x: 0 <= x < M} real
y = [0,M) = {y: 0 <= y < M} integer
z = [1,M] = {z: 1 <= z <= M} integer
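Following the article's recommendation (derive the integer y in [0, M) by scaling the real r in [0, 1) rather than taking rand() % M), the 15% check can be written, e.g. in Python, as the sketch below. Python's own generator doesn't suffer from the low-bit problem, so this is purely illustrative:

import random

def do_action_15_percent():
    # y = int(M * r) with M = 100: a uniform integer in [0, 100) built from
    # a real r in [0, 1), rather than relying on the low bits of a modulo
    y = int(100 * random.random())
    return y < 15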