I am attempting to generate random numbers that follow an upside-down Gaussian distribution, shifted up so that it still lies in the range (0, 1). I need to do this with as few special functions as possible, and I can only use a flat random number generator.
I am able to generate according to a Gaussian by putting the flat random numbers through the inverse Gaussian CDF. This works and gives me the Gaussian distribution I would expect. In Python, it looks like this:
from scipy import special
import numpy as np

def InverseCDF(x, mu, sigma):
    # Gaussian quantile function; note the sqrt(2): Phi^{-1}(p) = mu + sigma*sqrt(2)*erfinv(2p - 1)
    return mu + sigma * np.sqrt(2) * special.erfinv(2*x - 1)
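For instance, a quick sanity check (the mu and sigma values here are arbitrary):

u = np.random.uniform(0.0, 1.0, size=100000)   # flat random numbers
samples = InverseCDF(u, mu=0.5, sigma=0.1)     # a histogram of these looks Gaussian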
Now I am trying to generate a distribution that follows 1 - e^(-x^2). I believe the inverse CDF of this function is the same as for the regular Gaussian, but with the argument of the inverse error function now 2*x + 1. So it would look like this:
def InverseCDF(x, mu, sigma):
    return mu + sigma * np.sqrt(2) * special.erfinv(2*x + 1)
The problem here is that erfinv is only defined on (-1, 1), and the argument is now greater than 1. I have tried scaling and flipping this in all sorts of ways, putting negatives almost everywhere I can, and I can never seem to generate a histogram that follows an upside-down Gaussian. In most cases I actually get back a regular Gaussian distribution.
Any idea what I'm doing wrong, or any tips on how to generate this upside-down gaussian? Thanks in advance for any help.
OK, with x between 0 and 1, I get this for the cdf:
-(sqrt(%pi)*(sqrt(2)*sigma*erf((sqrt(2)*x-sqrt(2)*mu)/(2*sigma))
+sqrt(2)*erf(mu/(sqrt(2)*sigma))*sigma)
-2*x)
/(sqrt(%pi)*(sqrt(2)*erf((sqrt(2)*mu-sqrt(2))/(2*sigma))
-sqrt(2)*erf(mu/(sqrt(2)*sigma)))*sigma
+2)
Maybe some algebra will make it possible to figure out a formula for the inverse; if not, a numerical root search will work. It will probably be simpler for specific values of mu and sigma.
I did that with Maxima (http://maxima.sourceforge.net), by constructing the pdf and integrating it. Plotting the expression above yields a plausible picture.
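If inverting that CDF algebraically gets messy, another option that only needs the flat generator is rejection sampling: propose x uniformly on (0, 1) and accept it with probability 1 - exp(-(x - mu)^2/(2*sigma^2)), which is already bounded by 1. A minimal sketch (the mu and sigma defaults are placeholders):

import numpy as np

def sample_upside_down_gaussian(n, mu=0.5, sigma=0.2, rng=None):
    # Rejection sampling: propose uniformly on (0, 1), accept with probability
    # equal to the unnormalized target density 1 - exp(-(x - mu)^2 / (2 sigma^2)) <= 1.
    rng = np.random.default_rng(rng)
    samples = []
    while len(samples) < n:
        x = rng.uniform(0.0, 1.0)
        accept = 1.0 - np.exp(-(x - mu) ** 2 / (2.0 * sigma ** 2))
        if rng.uniform(0.0, 1.0) < accept:
            samples.append(x)
    return np.array(samples)

A histogram of the output dips to zero at x = mu and rises toward the interval edges, i.e. the upside-down bell shape.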
I was working on my final assignment, and I used the Box-Muller Gaussian distribution method to generate random numbers in Unity.
I am very confused about the Gaussian distribution function in the pseudocode that I found in one of the journals.
Box-Muller algorithm pseudocode (Sukajaya et al., 2012):
a. Generate uniform random numbers u, v in the range [-1, 1]
b. Calculate s = u² + v²
c. Repeat steps a-b until 0 < s < 1
d. Find the normal random numbers `z0 = u·√((-2 ln s)/s)` and `z1 = v·√((-2 ln s)/s)`
I think the pseudocode only describes Box-Muller itself, and that the Gaussian distribution function is only used for displaying diagrams of the randomized numbers.
The Box-Muller algorithm does not contain a direct implementation of the Gaussian density formula. Instead, it produces outcomes which (cumulatively) follow that density. The results z0 and z1 produced by the algorithm are two independent Gaussian random values. If you iterate the algorithm hundreds or thousands of times and build a histogram of all the z values, it will start looking like the bell-shaped curve of a Gaussian distribution. The math behind it is beyond the scope of a StackOverflow post, so I'm going to advise that you just push the "I believe!" button, or see the Wikipedia article if you want more explanation and links to various original sources.
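For reference, here is a runnable Python sketch of that polar-method pseudocode (just an illustration; a Unity script would be in C#, but the steps map over directly):

import math
import random

def box_muller_polar():
    # Marsaglia's polar form of Box-Muller: returns two independent standard normal values.
    while True:
        u = random.uniform(-1.0, 1.0)
        v = random.uniform(-1.0, 1.0)
        s = u * u + v * v
        if 0.0 < s < 1.0:                # reject points outside the unit circle (and the origin)
            break
    factor = math.sqrt(-2.0 * math.log(s) / s)
    return u * factor, v * factor

Collecting many such values and plotting their histogram shows the bell curve described above.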
I'm not sure what you mean when you say "the Gaussian Distribution function is only for displaying diagrams of randomized numbers." The Gaussian is one of the most important modeling distributions out there because sums of values from all other distributions with finite variance will converge to the Gaussian in distribution. That means if you're studying averages (which are built from sums) or aggregates of lots of little errors, the Gaussian distribution does a great job of characterizing the results.
I'm writing an algorithm for sclera detection on grayscale images and I found a formula that I cannot explain how it works. Here is the paper segment I'm trying to use:
Here it says that I should use the HSL information of the image and calculate 3 thresholds for the 3 components, which I later use for thresholding. The problem is that I cannot make any sense of the notation arg{t|min| ...}, as it is not explained at all in the paper.
I worked out how the sum is computed, and that I should end up with a constant after evaluating it, but I cannot find anywhere what this previously mentioned operator does with the constant obtained from the sum.
I tried to search for the meaning of the arg notation, but this wikipedia page doesn't seem to give me any answers: https://en.wikipedia.org/wiki/Argument_(complex_analysis)
Here they say that the result of the operation is the angle of a complex number; however, I don't have any complex numbers, so if I treat a real number as complex, its angle will always be 0.
Can anyone explain what this operation should do?
arg in this case means the argument of the function that gives the minimum value:
e.g.
m = arg{min f(x)}
is the x value for which the function f achieves its minimum value.
It's standard notation in image classification and related fields; see, for example, https://en.wikipedia.org/wiki/Maximum_a_posteriori_estimation
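As a toy illustration in Python (the quadratic cost and the candidate grid here are made up for demonstration, not taken from the paper):

import numpy as np

def f(t):
    return (t - 0.3) ** 2                        # hypothetical cost; the paper's sum would go here

candidates = np.linspace(0.0, 1.0, 101)          # candidate threshold values t
best_t = candidates[np.argmin(f(candidates))]    # arg min: the t that gives the smallest f(t)
print(best_t)                                    # prints (approximately) 0.3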
I am trying to apply the random projections method to a very sparse dataset. I have found papers and tutorials about the Johnson-Lindenstrauss method, but every one of them is full of equations that give me no meaningful explanation. For example, this document on Johnson-Lindenstrauss
Unfortunately, from this document I can get no idea of the implementation steps of the algorithm. It's a long shot, but is there anyone who can give me the plain-English version or very simple pseudocode of the algorithm? Or where can I start to dig into these equations? Any suggestions?
For example, here is what I understand of the algorithm from reading this paper concerning Johnson-Lindenstrauss:
Assume we have an AxB matrix where A is the number of samples and B is the number of dimensions, e.g. 100x5000. And I want to reduce its dimensionality to 500, which will produce a 100x500 matrix.
As far as I understand: first, I need to construct a Bx500 (here 5000x500) matrix and fill its entries randomly with +1 and -1 (each with 50% probability).
Edit:
Okay, I think I am starting to get it. So we have a matrix A which is mxn. We want to reduce it to E, which is mxk.
What we need to do is construct a matrix R of dimension nxk and fill it with 0, -1, or +1, with probabilities 2/3, 1/6, and 1/6 respectively.
After constructing this R, we simply do the matrix multiplication AxR to find our reduced matrix E. But we don't need a full matrix multiplication: if an element of R is 0, we can skip that calculation entirely; if it is +1, we just add the corresponding column, and if it is -1, we subtract it. So we use summation rather than multiplication to find E, and that is what makes this method very fast.
It turned out to be a very neat algorithm, although I felt too stupid to grasp the idea at first.
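Here is a compact numpy sketch of that construction (the sqrt(3/k) factor is the usual normalization for this 0/±1 scheme; the answer below discusses the related 1/sqrt(k) scaling):

import numpy as np

def sparse_random_projection(A, k, rng=None):
    # A is m x n (samples x features); returns the m x k reduced matrix E = A R, suitably scaled.
    rng = np.random.default_rng(rng)
    n = A.shape[1]
    R = rng.choice([0.0, 1.0, -1.0], size=(n, k), p=[2/3, 1/6, 1/6])
    return (A @ R) * np.sqrt(3.0 / k)   # scaling roughly preserves pairwise distances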
You have the idea right. However, as I understand random projection, the rows of your matrix R should have unit length. I believe that's approximately what normalizing by 1/sqrt(k) is for: to normalize away the fact that they're not unit vectors.
It isn't a projection, but it's nearly one; R's rows aren't orthonormal, but within a much higher-dimensional space they very nearly are. In fact, the dot product of any two of those vectors you choose will be pretty close to 0. This is why it is generally a good approximation of actually finding a proper basis for the projection.
The mapping from high-dimensional data A to low-dimensional data E is given in the statement of theorem 1.1 in the latter paper - it is simply a scalar multiplication followed by a matrix multiplication. The data vectors are the rows of the matrices A and E. As the author points out in section 7.1, you don't need to use a full matrix multiplication algorithm.
If your dataset is sparse, then sparse random projections will not work well.
You have a few options here:
Option A:
Step 1. Apply a structured dense random projection (the so-called fast Hadamard transform is typically used). This is a special projection which is very fast to compute but otherwise has the properties of a normal dense random projection.
Step 2. Apply a sparse projection to the "densified" data (sparse random projections are useful for dense data only).
Option B:
Apply SVD to the sparse data. If the data is sparse but has some structure, SVD is better. Random projection preserves the distances between all points; SVD better preserves the distances between dense regions, which in practice is more meaningful. People also use random projections to compute the SVD of huge datasets. Random projection gives you efficiency, but not necessarily the best quality of embedding in a low dimension.
If your data has no structure, then use random projections.
Option C:
For data points for which SVD has little error, use SVD; for the rest of the points use Random Projection
Option D:
Use a random projection based on the data points themselves.
This makes it very easy to understand what is going on. It looks something like this:
import numpy as np

def data_based_random_projection(X, k, sample_size=20, rng=None):
    # X is n x d (n data points, d features); returns an n x k matrix of projection scores.
    rng = np.random.default_rng(rng)
    n, d = X.shape
    scores = np.zeros((n, k))
    for j in range(k):                                     # generate k random projection vectors
        randomized_combination = np.zeros(d)               # feature vector of zeros
        sample_point_ids = rng.choice(n, size=min(sample_size, n), replace=False)
        for point_id in sample_point_ids:
            random_sign = rng.choice([-1.0, 1.0])          # +1/-1 with prob. 1/2
            randomized_combination += random_sign * X[point_id]   # vector operation
        randomized_combination /= np.linalg.norm(randomized_combination)   # normalize the combination
        # Note: the usual random projection instead uses a vector of +/-1 entries
        # (set a fraction to 0 if you want it sparse), normalized by its length.
        scores[:, j] = X @ randomized_combination          # project every data point onto it
    return scores
If you are still looking to solve this problem, write a message here, I can give you more pseudocode.
The way to think about it is that a random projection is just a random pattern, and the dot product (i.e. projecting the data point onto the pattern) gives you their overlap. So if two data points overlap with many random patterns in the same way, those points are similar. Random projections therefore preserve similarity while using less space, but they also add random fluctuations to the pairwise similarities. What the JL theorem tells you is that to keep the fluctuations at about eps = 0.1 you need roughly log(n)/eps² = 100·log(n) dimensions.
Good Luck!
An R package to perform random projection using the Johnson-Lindenstrauss lemma:
RandPro
I'm writing a program that simulates various random walks (with differing distributions). At each timestep, I need randomly generated, two dimensional step distances and angles from the distribution of the random walk. I'm hoping someone can check my understanding of how to generate these random numbers.
As I understand it I can use Inverse Transform Sampling as follows:
If f(x) is the pdf of our random walk, which has a non-uniform distribution, and y is a random number from a uniform distribution,
then if we let f(x) = y and solve to find x, we have a random number from the non-uniform distribution.
Is this a feasible solution?
Not quite. The function that needs to be inverted is not f(x), the pdf, but F(x) = P(X <= x) = ∫_{-∞}^{x} f(t) dt, the cdf. The good thing is that F is monotone, so it actually has a unique inverse (unlike f).
There are multiple other ways of generating random numbers according to a given distribution. For example, if the cdf F is difficult to compute or to invert, rejection sampling can be a good option if f is easy to compute.
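As a concrete example with a CDF that inverts in closed form (the exponential here is just an illustration, not your walk's actual step distribution):

import numpy as np

def sample_exponential(n, lam=2.0, rng=None):
    # Exponential(lam) has CDF F(x) = 1 - exp(-lam*x), so F^{-1}(u) = -ln(1 - u)/lam.
    rng = np.random.default_rng(rng)
    u = rng.uniform(0.0, 1.0, size=n)    # uniform draws on (0, 1)
    return -np.log(1.0 - u) / lam        # push them through the inverse CDF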
You are close, but not quite. Every probability density function (pdf) has a corresponding cumulative distribution function (cdf). An important property of the CDF is that its values always lie between 0 and 1. Because it is relatively easy to draw a random number between 0 and 1, we can use that to work our way backwards to the distribution. So changing the word pdf to CDF in your question makes the statement correct.
As an aside, for this to make sense computationally you need an easy-to-calculate inverse of the CDF. One way to do this is to fit a polynomial approximation to the CDF and find the inverse of that function. There are more advanced techniques for simulating probability distributions with messy distributions. See this book chapter for the details.
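A related, even simpler numerical route is to tabulate the CDF on a grid and invert by interpolation. A rough sketch (the pdf callable and the [0, 1] support are assumptions for illustration):

import numpy as np

def make_numerical_inverse_cdf(pdf, grid_size=10000):
    # Tabulate the (unnormalized) CDF of a pdf supported on [0, 1], then invert by interpolation.
    x = np.linspace(0.0, 1.0, grid_size)
    cdf = np.cumsum(pdf(x))
    cdf /= cdf[-1]                           # normalize so the CDF ends at 1
    return lambda u: np.interp(u, cdf, x)    # map a uniform draw u back to x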
From a paper I'm reading right now:
...
S(t+1, k) = S(t, k) + ... + C*∆
...
∆ is a standard random variable with mean 0 and variance 1.
...
How to generate this series of random values with this mean and variance? If someone has links to a C or C++ library I would be glad but I wouldn't mind implementing it myself if someone tells me how to do it :)
Do you have any restrictions on the distribution of Δ? If not, you can just use a uniform distribution on [-sqrt(3), sqrt(3)]. The reason this works is that for a uniform distribution on [a, b] the variance is (b-a)²/12, so here it is (2·sqrt(3))²/12 = 1, and the mean is 0 by symmetry.
You can use the Box-Muller transform.
Suppose U1 and U2 are independent random variables that are uniformly distributed in the interval (0, 1]. Let
Z0 = sqrt(-2 ln U1) · cos(2π U2)
and
Z1 = sqrt(-2 ln U1) · sin(2π U2).
Then Z0 and Z1 are independent random variables with a standard normal distribution (mean 0, standard deviation 1).
Waffles is a mature, stable C++ library that you can use. In particular, the noise function in the *waffles_generate* module will do what you want.
Aside from center and spread (mean and sd), you also need to know the probability distribution that the random numbers are drawn from. If the paper you are reading doesn't say anything about this, and there's no other reasonable inference supported by context, then the author is probably referring to a normal (Gaussian) distribution, because that's the most common, and because mean and sd are the two parameters needed to completely specify a normal distribution. Many distributions are not specified this way; e.g., a Gamma distribution needs shape and scale (or shape and rate), a Logistic needs location and scale, etc.
If all you want is a mean of 0 and a variance of 1, the simplest approach is probably this. Do you have a uniform random number generator unif() that gives you numbers between 0 and 1? If you want the numbers to be very close to normally distributed, you can just add up 12 uniform(0,1) numbers and subtract 6. If you want exactly a normal distribution, you can use the Box-Muller transform, as Mark suggested, if you don't mind throwing in a log, a sine, and a cosine.
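A minimal sketch of that sum-of-twelve-uniforms trick (in Python for brevity; it translates line for line to C or C++):

import random

def approx_standard_normal():
    # The sum of 12 Uniform(0, 1) draws has mean 6 and variance 12 * (1/12) = 1,
    # so subtracting 6 gives an approximately standard normal value.
    return sum(random.random() for _ in range(12)) - 6.0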