Using the Sobol-Sequence to calculate pi in Julia - random

One can approximate pi by generating random points on a square and counting how many of them fall inside an inscribed circle.
using LinearAlgebra  # provides norm

function picircle(n)
    N = 2n + 1
    x = range(-1, 1, length=N)  # evenly spaced in [-1, 1]
    y = rand(N)                 # uniform in [0, 1)
    center = (0, 0)
    radius = 1
    n_in_circle = 0
    for i in 1:N
        if norm((x[i], y[i]) .- center) < radius
            n_in_circle += 1
        end
    end
    println(4 * n_in_circle / N)
end
picircle(1000)
3.1424287856071964
However, I would like to use the quasi-Monte Carlo method. Instead of pseudorandom numbers, I want to use numbers from a Sobol sequence. I know how to generate them, but I am not sure how to use them in my code.
using Sobol
s = SobolSeq(2) # Creates a Sobol sequence in 2 dimensions

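See the README.md of Sobol.jl to learn how to iterate over a SobolSeq. The gist is that next!(s) returns the next point of the sequence: a vector of n coordinates for an n-dimensional sequence. Each 2-dimensional point lies in [0,1]^2, so the fraction of points with hypot(x, y) < 1 estimates the area of a quarter circle, pi/4.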
julia> using Sobol
julia> s = SobolSeq(2)
2-dimensional Sobol sequence on [0,1]^2
julia> N = 10_000_000
10000000
julia> 4 * count(hypot(next!(s)...) < 1 for _ in 1:N) / N
3.1415952
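For readers without Sobol.jl at hand, the same quasi-Monte Carlo estimate can be sketched in plain Python. This is only an illustration under stated assumptions: the names sobol_2d and estimate_pi are made up here, the generator handles exactly two dimensions (using the direction numbers of the degree-1 primitive polynomial x + 1 for the second coordinate), and it skips the origin. It is not a port of Sobol.jl.

```python
import math

BITS = 32  # resolution of the fixed-point coordinates

def sobol_2d(n_points):
    """Yield n_points of an unscrambled 2-D Sobol sequence (origin skipped)."""
    # Dimension 1: m_k = 1 for all k (van der Corput sequence in base 2).
    # Dimension 2: m_1 = 1 and m_k = 2*m_{k-1} XOR m_{k-1} (polynomial x + 1).
    m2 = [1]
    for _ in range(1, BITS):
        m2.append((2 * m2[-1]) ^ m2[-1])
    v1 = [1 << (BITS - 1 - k) for k in range(BITS)]
    v2 = [m << (BITS - 1 - k) for k, m in enumerate(m2)]
    x = y = 0
    scale = 2.0 ** -BITS
    for n in range(n_points):
        # Gray-code update: use the direction number at the lowest zero bit of n.
        c = 0
        while (n >> c) & 1:
            c += 1
        x ^= v1[c]
        y ^= v2[c]
        yield x * scale, y * scale

def estimate_pi(n_points):
    """Quarter-circle estimate: 4 * (fraction of points with x^2 + y^2 < 1)."""
    inside = sum(1 for x, y in sobol_2d(n_points) if math.hypot(x, y) < 1)
    return 4 * inside / n_points

print(estimate_pi(2 ** 14))
```

Using a power of two for the number of points keeps the sequence balanced, which is the usual recommendation for Sobol points.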

Related

Halton sequence extension

I am trying to fill an area defined by 2 intervals [a,b] x [c,d] with points uniformly distributed and I am implementing the Halton sequence. I am using the following code (which generates subunitary numbers).
The number I is input.
The number H is output.
for i = 1:N
    H = 0
    half = 1 / 2
    I = rand() % MATLAB rand()
    do while ( I is not zero )
        digit = mod ( I, 2 )
        H = H + digit * half
        I = ( I - digit ) / 2
        half = half / 2
    end
    x(i) = H
end
For the x-axis I use base 2 and for the y-axis I use base 3.
Because I divide by 2 and 3, I seem to be unable to fill the whole [0,1] x [0,1] space. I need to fill [0,1] x [0,1], but I actually fill only about [0,0.5] x [0,0.35]. And when I try to extend the algorithm to [a,b] x [c,d], I get points in [a,b-0.5] x [c,d-1].
What can I do to fill the correct full intervals?
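A likely cause, judging from the loop above: I is set to rand(), a fraction in (0,1), so mod(I, 2) returns I itself and the loop ends after one pass with H = I/2 (or I/3 in base 3), which matches the observed [0,0.5] x [0,0.35] region. The Halton construction instead reverses the digits of the integer point index i. A minimal Python sketch under that assumption follows; the names radical_inverse and halton_rect are illustrative and not from the question.

```python
def radical_inverse(index, base):
    """Mirror the base-`base` digits of a non-negative integer about the radix point."""
    result = 0.0
    frac = 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * frac
        frac /= base
    return result

def halton_rect(n, a, b, c, d):
    """First n Halton points (bases 2 and 3), mapped affinely to [a,b] x [c,d]."""
    points = []
    for i in range(1, n + 1):       # the INPUT is the integer index, not rand()
        u = radical_inverse(i, 2)   # in [0, 1)
        v = radical_inverse(i, 3)   # in [0, 1)
        points.append((a + (b - a) * u, c + (d - c) * v))
    return points

for p in halton_rect(5, 0.0, 1.0, 0.0, 1.0):
    print(p)
```

The affine map a + (b - a) * u is applied only after the unit-interval value has been computed, so the base used for the digits does not shrink the target interval.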

Implementing a simple Big Bang-Big Crunch (BB-BC) in MATLAB

I want to implement a simple BB-BC in MATLAB, but there is a problem.
Here is the code to generate the initial population:
pop = zeros(N,m);
for j = 1:m
    % formula used to generate random number between a and b
    % a + (b-a) .* rand(N,1)
    pop(:,j) = const(j,1) + (const(j,2) - const(j,1)) .* rand(N,1);
end
const is an m x 2 matrix that holds the constraints (lower and upper bounds) for the control variables, and m is the number of control variables. This generates a random initial population.
Here is the code to compute the center of mass in each iteration:
sum = zeros(1,m);
sum_f = 0;
for i = 1:N
    f = fitness(new_pop(i,:));
    %keyboard
    sum = sum + (1 / f) * new_pop(i,:);
    %keyboard
    sum_f = sum_f + 1/f;
    %keyboard
end
CM = sum / sum_f;
new_pop holds the newly generated population at each iteration and is initialized with pop.
CM is a 1 x m matrix.
fitness is a function that gives the fitness value of each particle in a generation. The lower the fitness, the better the particle.
Here is the code to generate the new population in each iteration:
for i = 1:N
    new_pop(i,:) = CM + rand(1) * alpha1 / (n_itr+1) .* ( const(:,2)' - const(:,1)');
end
alpha1 is 0.9.
The problem is that when I run the code for 100 iterations, the fitness just keeps decreasing and becomes negative. That shouldn't happen at all, because all particles are in the search space and CM should be there too, but it goes way beyond the limits.
For example, if these are the limits (m = 4):
const = [1 10;
1 9;
0 5;
1 4];
then running yields this CM:
57.6955 -2.7598 15.3098 20.8473
which is beyond all limits.
I tried limiting CM in my code, but then it just sticks at the upper boundaries, which in this example gives CM =
10 9 5 4
I am confused. Is there something wrong in my implementation, or have I misunderstood something about BB-BC?
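Two things in the posted code could plausibly explain the drift, though neither is confirmed by the thread. First, rand(1) lies in (0,1), so every new candidate is shifted upward from CM and never downward; BB-BC as commonly described draws that factor from a zero-mean normal distribution. Second, the 1/f weights only give a proper weighted average while f stays strictly positive; once some fitness values are negative, CM can leave the box. The Python sketch below illustrates one iteration scheme under those assumptions. The fitness function, the clamping step, and all names are hypothetical.

```python
import random

def center_of_mass(pop, fitness):
    """Average of the candidates weighted by 1/f; assumes every f > 0."""
    weights = [1.0 / fitness(p) for p in pop]
    total = sum(weights)
    dims = len(pop[0])
    return [sum(w * p[j] for w, p in zip(weights, pop)) / total for j in range(dims)]

def big_bang(cm, bounds, n, alpha, k, rng=random):
    """Scatter n candidates around cm; the spread shrinks like 1/k."""
    new_pop = []
    for _ in range(n):
        cand = []
        for c, (lo, hi) in zip(cm, bounds):
            r = rng.gauss(0.0, 1.0)             # zero mean: steps go up and down
            x = c + r * alpha * (hi - lo) / k
            cand.append(min(max(x, lo), hi))    # clamp each candidate to the box
        new_pop.append(cand)
    return new_pop

bounds = [(1, 10), (1, 9), (0, 5), (1, 4)]      # limits from the question

def fitness(p):
    # Hypothetical objective, shifted by 1 so it is always strictly positive.
    return 1.0 + sum((x - 2.0) ** 2 for x in p)

def run(iterations=100, n=30, alpha=0.9, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n)]
    cm = center_of_mass(pop, fitness)
    for k in range(1, iterations + 1):
        pop = big_bang(cm, bounds, n, alpha, k, rng)
        cm = center_of_mass(pop, fitness)
    return cm

print(run())
```

With positive weights and clamped candidates, CM is a convex combination of points inside the box, so it cannot leave the limits. If the real objective can be negative, it would need to be shifted or transformed before being used in the 1/f weights.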

Generate random numbers within a range with different probabilities

How can I generate a random number between A = 1 and B = 10 where each number has a different probability?
Example: number / probability
1 - 20%
2 - 20%
3 - 10%
4 - 5%
5 - 5%
...and so on.
I'm aware of some hard-coded workarounds which unfortunately are of no use with larger ranges, for example A = 1000 and B = 100000.
Assume we have a Rand() method which returns a random number R, 0 < R < 1. Can anyone post a code sample with a proper way of doing this? Preferably in C# / Java / ActionScript.
Build an array of 100 integers and populate it with 20 1's, 20 2's, 10 3's, 5 4's, 5 5's, etc. Then just randomly pick an item from the array.
int[] numbers = new int[100];
// populate the first 20 with the value '1'
for (int i = 0; i < 20; ++i)
{
    numbers[i] = 1;
}
// populate the rest of the array as desired.
// To get an item:
// Since your Rand() function returns 0 < R < 1
int ix = (int)(Rand() * 100);
int num = numbers[ix];
This works well if the number of items is reasonably small and your precision isn't too strict. That is, if you wanted 4.375% 7's, then you'd need a much larger array.
There is an elegant algorithm attributed by Knuth to A. J. Walker (Electronics Letters 10, 8 (1974), 127-128; ACM Trans. Math Software 3 (1977), 253-256).
The idea is that if you have a total of k * n balls of n different colors, then it is possible to distribute the balls in n containers such that container no. i contains balls of color i and at most one other color. The proof is by induction on n. For the induction step pick the color with the least number of balls.
In your example n = 10. Multiply the probabilities with a suitable m such that they are all integers. So, maybe m = 100 and you have 20 balls of color 0, 20 balls of color 1, 10 balls of color 2, 5 balls of color 3, etc. So, k = 10.
Now generate a table of dimension n with each entry holding a probability (the ratio of balls of color i to the container size k) and the other color.
To generate a random ball, generate a random floating-point number r in the range [0, n). Let i be the integer part (floor of r) and x the excess (r – i).
if (x < table[i].probability) output i
else output table[i].other
The algorithm has the advantage that for each random ball you only make a single comparison.
Let me work out an example (same as Knuth).
Consider simulating throwing a pair of dice.
So P(2) = 1/36, P(3) = 2/36, P(4) = 3/36, P(5) = 4/36, P(6) = 5/36, P(7) = 6/36, P(8) = 5/36, P(9) = 4/36, P(10) = 3/36, P(11) = 2/36, P(12) = 1/36.
Multiply by 36 * 11 to get 396 balls, 11 of color 2, 22 of color 3, 33 of color 4, …, 11 of color 12.
We have k = 396 / 11 = 36.
Table[2] = (11/36, color 4)
Table[12] = (11/36, color 10)
Table[3] = (22/36, color 5)
Table[11] = (22/36, color 5)
Table[4] = (8/36, color 9)
Table[10] = (8/36, color 6)
Table[5] = (16/36, color 6)
Table[9] = (16/36, color 8)
Table[6] = (7/36, color 8)
Table[8] = (6/36, color 7)
Table[7] = (36/36, color 7)
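The table construction and the single-comparison lookup can be sketched in Python as follows. This is an illustration, not the construction from the induction proof above: it builds the table with the small/large worklist approach often credited to Vose, and the names build_alias_table and sample are made up here. Outcomes are 0-based indices, so for the dice example the sum is the index plus 2.

```python
import random

def build_alias_table(probs):
    """Return (prob, alias) so that slot i yields i with prob[i], else alias[i]."""
    n = len(probs)
    scaled = [p * n for p in probs]      # average slot weight becomes 1
    prob = [1.0] * n
    alias = list(range(n))
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    while small and large:
        s = small.pop()
        l = large.pop()
        prob[s] = scaled[s]              # slot s keeps its own share ...
        alias[s] = l                     # ... and is topped up from l
        scaled[l] -= 1.0 - scaled[s]
        (small if scaled[l] < 1.0 else large).append(l)
    return prob, alias

def sample(prob, alias, rng=random):
    """Draw one outcome with a single comparison, as in the text."""
    r = rng.random() * len(prob)
    i = int(r)                           # slot index
    x = r - i                            # excess in [0, 1)
    return i if x < prob[i] else alias[i]

dice = [c / 36 for c in (1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1)]
prob, alias = build_alias_table(dice)
print([sample(prob, alias) + 2 for _ in range(10)])
```

Different construction orders give different tables for the same distribution, so the entries need not match the ones listed above; only the resulting probabilities have to agree.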
Assuming that you have a function p(n) that gives you the desired probability for a random number:
r = rand() // a random number between 0 and 1
for i in A to B do
    if r < p(i)
        return i
    r = r - p(i)
done
A faster way is to create an array of (B - A) * 100 elements and populate it with numbers from A to B such that the ratio of the number of times each item occurs in the array to the size of the array is its probability. You can then generate a uniform random number to get an index into the array and read your random number directly from it.
Map your uniform random results to the required outputs according to the probabilities.
E.g., for your example:
If `0 <= Rand() <= 0.2`: result = 1.
If `0.2 < Rand() <= 0.4`: result = 2.
If `0.4 < Rand() <= 0.5`: result = 3.
If `0.5 < Rand() <= 0.55`: result = 4.
If `0.55 < Rand() <= 0.6`: result = 5.
...
Here's an implementation of Knuth's Algorithm. As discussed in some of the answers, it works by
1) creating a table of summed frequencies
2) generating a random integer
3) rounding it with the ceiling function
4) finding the "summed" range within which the random number falls and outputting the original array entity based on it
Inverse Transform
In probability speak, a cumulative distribution function F(x) returns the probability that any randomly drawn value, call it X, is <= some given value x. For instance, if I did F(4) in this case, I would get .55, because the running sum of probabilities in your example is {.2, .4, .5, .55, .6, .65, ....}. I.e. the probability of randomly getting a value less than or equal to 4 is .55. However, what I actually want to know is the inverse of the cumulative probability function, call it F_inv. I want to know what the x value is, given the cumulative probability. I want to pass in F_inv(.55) and get back 4. That is why this is called the inverse transform method.
So, in the inverse transform method, we are basically trying to find the interval in the cumulative distribution in which a random Uniform (0,1) number falls. This works out to the algorithm that perreal and icepack posted. Here is another way to state it in terms of the cumulative distribution function
Generate a random number U
for x in A .. B
if U <= F(x) then return x
Note that it might be more efficient to have the loop go from B to A and check if U >= F(x) if the smaller probabilities come at the beginning of the distribution
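The inverse transform loop can be sketched in Python with a precomputed cumulative table and a binary search in place of the linear scan. The question only gives the first five probabilities, so the remaining 40% is split evenly over 6 to 10 here purely as an assumption; make_sampler and draw are illustrative names.

```python
import bisect
import itertools
import random

def make_sampler(values, probs):
    """Return draw(u): the smallest value x with u <= F(x)."""
    cdf = list(itertools.accumulate(probs))    # running sums F(x)
    def draw(u=None):
        if u is None:
            u = random.random()
        i = bisect.bisect_left(cdf, u)         # first index with cdf[i] >= u
        return values[min(i, len(values) - 1)] # guard against rounding in cdf[-1]
    return draw

values = list(range(1, 11))
# First five entries are from the question; the last five are assumed.
probs = [0.20, 0.20, 0.10, 0.05, 0.05, 0.08, 0.08, 0.08, 0.08, 0.08]
draw = make_sampler(values, probs)
print([draw() for _ in range(10)])
```

The table costs O(n) to build once, and each draw then costs O(log n), which matters for large ranges such as A = 1000 and B = 100000.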

Algorithm to express elements of a matrix as a vector

Statement of Problem:
I have an array M with m rows and n columns. The array M is filled with non-zero elements.
I also have a vector t with n elements, and a vector omega with m elements.
The elements of t correspond to the columns of matrix M.
The elements of omega correspond to the rows of matrix M.
Goal of Algorithm:
Define chi as the multiplication of vector t and omega. I need to obtain a 1D vector a, where each element of a is a function of chi.
Each element of chi is unique (i.e. every element is different).
Using mathematics notation, this can be expressed as a(chi)
Each element of vector a corresponds to an element or elements of M.
Matlab code:
Here is a code snippet showing how the vectors t and omega are generated. The matrix M is pre-existing.
[m,n] = size(M);
t = linspace(0,5,n);
omega = linspace(0,628,m);
Conceptual Diagram:
This appears to be a type of integration (if this is the right word for it) along constant chi.
Reference:
Link to reference
The algorithm is not explicitly stated in the reference. I only wish that this algorithm was described in a manner reminiscent of computer science textbooks!
Looking at Figure 11.5, the matrix M is Figure 11.5(a). The goal is to find an algorithm to convert Figure 11.5(a) into 11.5(b).
It appears that the algorithm is a type of integration (averaging, perhaps?) along constant chi.
It appears to me that reshape is the matlab function you need to use. As noted in the link:
B = reshape(A,siz) returns an n-dimensional array with the same elements as A, but reshaped to siz, a vector representing the dimensions of the reshaped array.
That is, build the m-by-n product P of the vectors ω and t (with the row vectors produced by linspace, P = omega' * t) and flatten it with A = reshape(P, m*n, 1). (I don't have MATLAB here, or I would run a test to confirm that the product is the right way around.) Note that reshape expects at least two dimensions after the matrix parameter, so reshape(P, m*n) with a single number is not expected to work; reshape(P, [], 1) or P(:) should give the same column vector.
You should add pseudocode or a link to the algorithm you want to implement. From what I could understand, I have developed the following code anyway:
M = [1 2 3 4; 5 6 7 8; 9 10 11 12]' % easy test M matrix
a = reshape(M, prod(size(M)), 1) % convert M to vector 'a' with reshape command
[m,n] = size(M); % Your sample code
t = linspace(0,5,n); % Your sample code
omega = linspace(0,628,m); % Your sample code
for i=1:length(t)
    for j=1:length(omega) % Access a(chi) in the desired order
        chi = length(omega)*(i-1)+j;
        t(i) % related t value
        omega(j) % related omega value
        a(chi) % related a(chi) value
    end
end
As you can see, I also think that the reshape() function is the solution to your problems. I hope that this code helps,
The basic idea is to use two separate loops. The outer loop is over the chi variable values, whereas the inner loop is over the i variable values. Referring to the above diagram in the original question, the i variable corresponds to the x-axis (time), and the j variable corresponds to the y-axis (frequency). Assuming that the chi, i, and j variables can take on any real number, bilinear interpolation is then used to find an amplitude corresponding to an element in matrix M. The integration is just an averaging over elements of M.
The following code snippet provides an overview of the basic algorithm to express elements of a matrix as a vector using the spectral collapsing from 2D to 1D. I can't find any reference for this, but it is a solution that works for me.
% Amp = amplitude vector corresponding to Figure 11.5(b) in book reference
% M = matrix corresponding to the absolute value of the complex Gabor transform
% matrix in Figure 11.5(a) in book reference
% Nchi = number of chi in chi vector
% prod = product of timestep and frequency step
% dt = time step
% domega = frequency step
% omega_max = maximum angular frequency
% i = time array element along x-axis
% j = frequency array element along y-axis
% current_i = current time array element in loop
% current_j = current frequency array element in loop
% Nivar = number of i variables
% ivar = i variable vector
% calculate for chi = 0, which only occurs when
% t = 0 and omega = 0, at i = 1
av0 = mean( M(1,:) );
av1 = mean( M(2:end,1) );
av2 = mean( [av0 av1] );
Amp(1) = av2;
% av_val_sum holds the running sum of the values being averaged
av_val_sum = 0;
% loop for rest of chi
for ccnt = 2:Nchi % 2:Nchi
    av_val_sum = 0; % reset av_val_sum
    current_chi = chi( ccnt ); % current value of chi
    % loop over i vector
    for icnt = 1:Nivar % 1:Nivar
        current_i = ivar( icnt );
        current_j = (current_chi / (prod * (current_i - 1))) + 1;
        current_t = dt * (current_i - 1);
        current_omega = domega * (current_j - 1);
        % values out of range
        if(current_omega > omega_max)
            continue;
        end
        % use bilinear interpolation to find an amplitude
        % at current_t and current_omega from matrix M
        % f_x_y is the bilinear interpolated amplitude
        % Insert bilinear interpolation code here
        % add to running sum
        av_val_sum = av_val_sum + f_x_y;
    end % icnt loop
    % compute the average over all i
    av = av_val_sum / Nivar;
    % assign the average to Amp
    Amp(ccnt) = av;
end % ccnt loop
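The bilinear interpolation step left as a placeholder in the snippet above could look roughly like this in Python. It is a sketch under assumptions: M is a list of rows with at least two rows and two columns, positions are 0-based fractional indices (the MATLAB snippet uses 1-based current_i and current_j), and the function name bilinear is illustrative.

```python
def bilinear(M, r, c):
    """Interpolate M at fractional row position r and column position c.

    Assumes 0 <= r <= rows - 1 and 0 <= c <= cols - 1.
    """
    rows, cols = len(M), len(M[0])
    r0 = min(int(r), rows - 2)      # upper-left corner of the enclosing cell
    c0 = min(int(c), cols - 2)
    fr = r - r0                     # fractional offsets inside the cell
    fc = c - c0
    top = M[r0][c0] * (1 - fc) + M[r0][c0 + 1] * fc
    bottom = M[r0 + 1][c0] * (1 - fc) + M[r0 + 1][c0 + 1] * fc
    return top * (1 - fr) + bottom * fr

# A grid that is linear in both indices is reproduced exactly.
M = [[2 * i + 3 * j for j in range(4)] for i in range(3)]
print(bilinear(M, 0.5, 1.25))
```

Clamping r0 and c0 to the last full cell lets the function accept positions that sit exactly on the last row or column.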

Randomly Generate a set of numbers of n length totaling x

I'm working on a project for fun and I need an algorithm to do as follows:
Generate a list of numbers of Length n which add up to x
I would settle for list of integers, but ideally, I would like to be left with a set of floating point numbers.
I would be very surprised if this problem wasn't heavily studied, but I'm not sure what to look for.
I've tackled similar problems in the past, but this one is decidedly different in nature. Before I've generated different combinations of a list of numbers that will add up to x. I'm sure that I could simply bruteforce this problem but that hardly seems like the ideal solution.
Anyone have any idea what this may be called, or how to approach it? Thanks all!
Edit: To clarify, I mean that the list should be length N while the numbers themselves can be of any size.
edit2: Sorry for my improper use of 'set', I was using it as a catch all term for a list or an array. I understand that it was causing confusion, my apologies.
This is how to do it in Python
import random

def random_values_with_prescribed_sum(n, total):
    x = [random.random() for i in range(n)]
    k = total / sum(x)
    return [v * k for v in x]
Basically you pick n random numbers, compute their sum and compute a scale factor so that the sum will be what you want it to be.
Note that this approach will not produce "uniform" slices, i.e. the distribution you get will tend to be more "egalitarian" than it would be if it were picked at random among all distributions with the given sum.
To see the reason you can just picture what the algorithm does in the case of two numbers with a prescribed sum (e.g. 1):
The point P is a generic point obtained by picking two random numbers, and it is uniform inside the square [0,1]x[0,1]. The point Q is obtained by scaling P so that the sum is 1. As is clear from the picture, points close to the center of the segment from (1,0) to (0,1) have a higher probability; for example, the exact center (0.5, 0.5) is reached by projecting any point on the diagonal (0,0)-(1,1), while the point (0, 1) is reached only from points on the side (0,0)-(0,1). The diagonal length is sqrt(2)=1.4142..., while the square side is only 1.0.
Actually, you need to generate a partition of x into n parts. This is usually done in the following way: a partition of x into n non-negative parts can be represented by reserving n + x - 1 free places, putting n - 1 borders in arbitrary places, and stones in the rest. The stone groups add up to x, so the number of possible partitions is the binomial coefficient C(n + x - 1, n - 1).
So your algorithm could be as follows: choose an arbitrary (n - 1)-subset of an (n + x - 1)-set; it uniquely determines a partition of x into n parts.
In Knuth's TAOCP, section 3.4.2 discusses random sampling. See Algorithm S there.
Algorithm S (choose n arbitrary records from a total of N):
1) Set t = 0, m = 0.
2) Generate u, random and uniformly distributed on (0, 1).
3) If (N - t)*u >= n - m, skip the t-th record and increase t by 1; otherwise include the t-th record in the sample and increase m and t by 1.
4) If m < n, return to step 2; otherwise the algorithm is finished.
The solution for non-integers is algorithmically trivial: you just select arbitrary n numbers that don't sum up to 0, and norm them by their sum.
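For the integer case, the borders-and-stones idea can be sketched in Python as below. The sketch uses the standard form with n - 1 borders among n + x - 1 places, and it relies on random.sample to pick the border positions rather than implementing Algorithm S by hand; the function name random_composition is illustrative.

```python
import random

def random_composition(x, n, rng=random):
    """Split a non-negative integer x into n non-negative integer parts."""
    places = n + x - 1
    bars = sorted(rng.sample(range(places), n - 1))   # border positions
    parts = []
    prev = -1
    for b in bars:
        parts.append(b - prev - 1)    # stones between consecutive borders
        prev = b
    parts.append(places - prev - 1)   # stones after the last border
    return parts

print(random_composition(15, 4))
```

Every choice of border positions is equally likely, so every ordered split of x into n non-negative integers is produced with the same probability.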
If you want to sample uniformly in the region of N-1-dimensional space defined by x1 + x2 + ... + xN = x, then you're looking at a special case of sampling from a Dirichlet distribution. The sampling procedure is a little more involved than generating uniform deviates for the xi. Here's one way to do it, in Python:
import random
xs = [random.gammavariate(1, 1) for _ in range(N)]
s = sum(xs)  # compute the normalizing sum once
xs = [x * v / s for v in xs]
If you don't care too much about the sampling properties of your results, you can just generate uniform deviates and correct their sum afterwards.
Here is a version of the above algorithm in Javascript
function getRandomArbitrary(min, max) {
    return Math.random() * (max - min) + min;
}
function getRandomArray(min, max, n) {
    var arr = [];
    for (var i = 0, l = n; i < l; i++) {
        arr.push(getRandomArbitrary(min, max));
    }
    return arr;
}
function randomValuesPrescribedSum(min, max, n, total) {
    var arr = getRandomArray(min, max, n);
    var sum = arr.reduce(function(pv, cv) { return pv + cv; }, 0);
    var k = total/sum;
    var delays = arr.map(function(x) { return k*x; });
    return delays;
}
You can call it with
var myarray = randomValuesPrescribedSum(0,1,3,3);
And then check it with
var sum = myarray.reduce(function(pv, cv) { return pv + cv;},0);
This code does a reasonable job. I think it produces a different distribution than 6502's answer, but I am not sure which is better or more natural. Certainly his code is clearer/nicer.
import random

def parts(total_sum, num_parts):
    points = [random.random() for i in range(num_parts-1)]
    points.append(0)
    points.append(1)
    points.sort()
    ret = []
    for i in range(1, len(points)):
        ret.append((points[i] - points[i-1]) * total_sum)
    return ret

def test(total_sum, num_parts):
    ans = parts(total_sum, num_parts)
    assert abs(sum(ans) - total_sum) < 1e-7
    print(ans)

test(5.5, 3)
test(10, 1)
test(10, 5)
In python:
a: create a list of (random #'s 0 to 1) times total; append 0 and total to the list
b: sort the list, measure the distance between each element
c: round the list elements
import random
import time

TOTAL = 15
PARTS = 4
PLACES = 3

def random_sum_split(parts, total, places):
    a = [0, total] + [random.random()*total for i in range(parts-1)]
    a.sort()
    b = [(a[i] - a[i-1]) for i in range(1, (parts+1))]
    if places is None:
        return b
    else:
        b.pop()
        c = [round(x, places) for x in b]
        c.append(round(total-sum(c), places))
        return c

def tick():
    if info.tick == 1:
        start = time.time()
        alpha = random_sum_split(PARTS, TOTAL, PLACES)
        end = time.time()
        log('alpha: %s' % alpha)
        log('total: %.7f' % sum(alpha))
        log('parts: %s' % PARTS)
        log('places: %s' % PLACES)
        log('elapsed: %.7f' % (end-start))
yields:
[2014-06-13 01:00:00] alpha: [0.154, 3.617, 6.075, 5.154]
[2014-06-13 01:00:00] total: 15.0000000
[2014-06-13 01:00:00] parts: 4
[2014-06-13 01:00:00] places: 3
[2014-06-13 01:00:00] elapsed: 0.0005839
To the best of my knowledge, this distribution is uniform.

Resources