MATLAB 1-D Pseudo Wigner Distribution - algorithm

I'm trying to implement the 1-D Pseudo Wigner Distribution. The distribution has (up to a normalization factor) this expression:
W[n,k] = Σ_m z[n+m] z*[n-m] e^(-i 2π m k / N)
where n and k represent time and frequency, m is the shifting parameter, z[n] is a vector and z* is its complex conjugate. The author of this article says that it can be interpreted as the Discrete Fourier Transform of the product z[n+m]z*[n-m].
I would like to know the smartest and fastest way to implement this (in particular the product z[n+m]z*[n-m]; once I have that I can apply the fft), because I need to use this expression a lot of times. Any help is greatly appreciated!
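One simple, vectorized way to get all the products at once is to index the signal with shifted index arrays and then take the FFT along the lag axis. Below is a minimal sketch in Python/NumPy (the same idea translates directly to MATLAB); the window length N, the periodic wrap-around at the signal edges, the lag ordering and the missing normalization factor are assumptions you will want to match to the article's exact definition:

```python
import numpy as np

def pseudo_wigner_1d(z, N):
    """Sketch of a 1-D pseudo Wigner distribution with window length N.

    For every time index n it forms the product z[n+m] * conj(z[n-m]) for
    all lags m at once, then takes a DFT along the lag axis.
    """
    z = np.asarray(z, dtype=complex)
    L = len(z)
    n = np.arange(L)[:, None]                                   # shape (L, 1)
    m = np.concatenate([np.arange(0, N // 2),
                        np.arange(-N // 2, 0)])[None, :]        # lags in FFT order
    prod = z[(n + m) % L] * np.conj(z[(n - m) % L])             # shape (L, N)
    return np.fft.fft(prod, axis=1)                             # DFT over m

z = np.exp(1j * 0.02 * np.arange(256) ** 2)    # chirp-like test signal
W = pseudo_wigner_1d(z, N=64)                  # rows: time n, columns: frequency k
```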

Related

Evolving a matrix using a genetic algorithm

I recently discovered genetic algorithms, and after doing a little research I can't find any example of how to evolve structures more complex than a vector or a string.
Let's say that I'm using a covariance matrix for a certain computation (to compute a Mahalanobis distance, for example) and I want to look for a better matrix to do the job and minimize a certain criterion. Are there any classic examples of how to evolve the matrix, and which crossover operators should I use?
Thanks!
Any structure of fixed size and shape that is made of numbers (or any other elements) can be rewritten to a 1-D vector and back. You can then use any operator you like which works on vectors.
If you want to work with matrices (or any other structures) directly, you can always design your own operators, but a matrix basically is a vector, just written in a different way. For the matrix case there are a number of possible crossover operators:
Swap rows/columns (between the parents)
Swap submatrices (generalization of the above)
Continuous-space crossover methods like BLX-alpha, PCX, arithmetic crossover... These are all designed for vectors, but you can just treat the matrix as a vector (it's really not that different).
Mutation is probably going to be more or less identical to the vector case - you just mutate the elements (or some of them).
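As a minimal sketch (assuming both parents are NumPy matrices of the same shape), the "treat it as a vector" crossover and a row-swap crossover could look like this; the one-point cut and the 50% row mask are just illustrative choices:

```python
import numpy as np

def flatten_crossover(parent_a, parent_b, rng):
    """Treat the matrices as flat vectors and do one-point crossover."""
    a, b = parent_a.ravel(), parent_b.ravel()
    cut = rng.integers(1, a.size)                  # random cut point
    return np.concatenate([a[:cut], b[cut:]]).reshape(parent_a.shape)

def row_swap_crossover(parent_a, parent_b, rng):
    """Child inherits each row from parent A or parent B at random."""
    mask = rng.random(parent_a.shape[0]) < 0.5
    return np.where(mask[:, None], parent_a, parent_b)

rng = np.random.default_rng(0)
p1, p2 = rng.normal(size=(4, 4)), rng.normal(size=(4, 4))
child = flatten_crossover(p1, p2, rng)
```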

How to compute Discrete Fourier Transform?

I've been trying to find some resources to help me better understand the DFT and how to compute it, but to no avail. So I need help understanding the DFT and its computation with complex numbers.
Basically, I'm just looking for examples on how to compute DFT with an explanation on how it was computed because in the end, I'm looking to create an algorithm to compute it.
I assume 1D DFT/IDFT ...
All DFTs use this formula:
X(k) = Σ(n=0..N-1) x(n).e^(-i.2π.k.n/N)
where:
X(k) is transformed sample value (complex domain)
x(n) is input data sample value (real or complex domain)
N is number of samples/values in your dataset
This whole thing is usually multiplied by a normalization constant c. As you can see, for a single output value you need N computations, so for all N samples it is O(N^2), which is slow.
Here is my Real<->Complex domain DFT/IDFT in C++; there you can also find hints on how to compute a 2D transform with 1D transforms and how to compute an N-point DCT/IDCT with an N-point DFT/IDFT.
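To make the formula concrete, here is a direct O(N^2) evaluation of it as a sketch in Python/NumPy (with the normalization constant c left out, which matches the forward-transform convention of np.fft.fft):

```python
import numpy as np

def naive_dft(x):
    """Direct evaluation of X(k) = sum_n x(n) * e^(-i*2*pi*k*n/N)."""
    x = np.asarray(x, dtype=complex)
    N = len(x)
    n = np.arange(N)
    X = np.zeros(N, dtype=complex)
    for k in range(N):                                            # N output values...
        X[k] = np.sum(x * np.exp(-2j * np.pi * k * n / N))        # ...N terms each
    return X

x = np.random.rand(8)
print(np.allclose(naive_dft(x), np.fft.fft(x)))   # True
```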
Fast algorithms
There are fast algorithms out there based on splitting the sum into its even- and odd-indexed parts (which gives 2x N/2 sums), which by itself is still O(N) per single value, but the two halves are the same equation up to a constant tweak (the twiddle factor), so one half can be computed from the other directly. This leads to O(N/2) per single value; if you apply this recursively you get O(log(N)) per single value, so the whole thing becomes O(N.log(N)), which is awesome, but it also adds this restriction:
All DFFTs need the input dataset size to be a power of two !!!
so that it can be recursively split. Zero-padding to the nearest bigger power of 2 is used for invalid dataset sizes (in audio tech sometimes even a phase shift is used). Look here:
my Complex->Complex domain DFT/DFFT in C++
some hints on constructing FFT-like algorithms
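As a sketch of the even/odd split described above (again in Python/NumPy for brevity), a minimal recursive radix-2 FFT looks like this; it assumes the input length is a power of two and omits the normalization constant:

```python
import numpy as np

def fft_radix2(x):
    x = np.asarray(x, dtype=complex)
    N = len(x)
    if N == 1:                              # recursion stopping condition
        return x
    even = fft_radix2(x[0::2])              # DFT of even-indexed samples
    odd = fft_radix2(x[1::2])               # DFT of odd-indexed samples
    twiddle = np.exp(-2j * np.pi * np.arange(N // 2) / N)
    # the two output halves are the same sums +/- the twiddled odd part
    return np.concatenate([even + twiddle * odd, even - twiddle * odd])

x = np.random.rand(16)
print(np.allclose(fft_radix2(x), np.fft.fft(x)))   # True
```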
Complex numbers
c = a + i*b
c is complex number
a is its real part (Re)
b is its imaginary part (Im)
i*i=-1 is imaginary unit
so the computation looks like this:
addition:
c0+c1=(a0+i.b0)+(a1+i.b1)=(a0+a1)+i.(b0+b1)
multiplication:
c0*c1=(a0+i.b0)*(a1+i.b1)
=a0.a1+i.a0.b1+i.b0.a1+i.i.b0.b1
=(a0.a1-b0.b1)+i.(a0.b1+b0.a1)
polar form
a = r.cos(θ)
b = r.sin(θ)
r = sqrt(a.a + b.b)
θ = atan2(b,a)
a+i.b = r|θ
sqrt
sqrt(r|θ) = (+/-)sqrt(r)|(θ/2)
sqrt(r.(cos(θ)+i.sin(θ))) = (+/-)sqrt(r).(cos(θ/2)+i.sin(θ/2))
real -> complex conversion:
complex = real+i.0
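For reference, here are the same operations using Python's built-in complex type and the cmath module (Python writes the imaginary unit as j instead of i):

```python
import cmath

c0, c1 = 3 + 4j, 1 - 2j
print(c0 + c1)               # addition:       (a0+a1) + i.(b0+b1)
print(c0 * c1)               # multiplication: (a0.a1-b0.b1) + i.(a0.b1+b0.a1)
r, theta = cmath.polar(c0)   # r = sqrt(a.a+b.b), theta = atan2(b,a)
print(cmath.sqrt(c0))        # principal square root: sqrt(r) | theta/2
print(complex(5.0, 0.0))     # real -> complex conversion: real + i.0
```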
[notes]
do not forget that you need to output the transformed data to a different array (not in place)
the normalization constant on the FFT recursion is tricky (usually something like /=log2(N); it also depends on the recursion stopping condition)
do not forget to stop the recursion if N=1 or 2 ...
beware FPU can overflow on big datasets (N is big)
here are some insights into DFT/DFFT
here is a 2D FFT and wrapping example
usually Euler's formula is used to compute e^(i.x)=cos(x)+i.sin(x)
here, in How do I obtain the frequencies of each value in an FFT?, you can find how to obtain the Nyquist frequencies
[edit1] Also I strongly recommend watching this amazing video (which I just found):
But what is the Fourier Transform? A visual introduction
It describes the (D)FT in a geometric representation. I would change some minor stuff in it, but it is still amazingly simple to understand.

Excel Polynomial Curve-Fitting Algorithm

What is the algorithm that Excel uses to calculate a 2nd-order polynomial regression (curve fitting)? Is there sample code or pseudo-code available?
I found a solution that returns the same formula that Excel gives:
Put together an augmented matrix of values used in a Least-Squares Parabola. See the sum equations in http://www.efunda.com/math/leastsquares/lstsqr2dcurve.cfm
Use Gaussian elimination to solve the matrix. Here is C# code that will do that http://www.codeproject.com/Tips/388179/Linear-Equation-Solver-Gaussian-Elimination-Csharp
After running that, the left-over values in the matrix (M) will equal the coefficients given in Excel.
Maybe I can find the R^2 somehow, but I don't need it for my purposes.
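As a sketch of those two steps in Python/NumPy rather than C#: build the 3x3 normal-equation system from the sums of powers of x and solve it (np.linalg.solve plays the role of the Gaussian-elimination step). The sample data here is made up just to check the coefficients:

```python
import numpy as np

def quadratic_fit(x, y):
    """Least-squares parabola y ~ c0 + c1*x + c2*x^2 via the normal equations."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    S = [np.sum(x ** k) for k in range(5)]        # sums of x^0 .. x^4
    T = [np.sum(y * x ** k) for k in range(3)]    # sums of y*x^0 .. y*x^2
    A = np.array([[S[0], S[1], S[2]],
                  [S[1], S[2], S[3]],
                  [S[2], S[3], S[4]]])
    return np.linalg.solve(A, np.array(T))        # [c0, c1, c2]

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1.5 + 0.5 * x - 2.0 * x ** 2
print(quadratic_fit(x, y))    # ~ [ 1.5  0.5 -2. ]
```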
The polynomial trendlines in charts use least squares based on a QR decomposition method like the LINEST worksheet function ( http://support.microsoft.com/kb/828533 ). A second order or quadratic trend for given (x,y) data could be calculated using =LINEST(y,x^{1,2}).
You can call worksheet formulas from C# using the Worksheet.Evaluate method.
It depends, because there are a lot of ways to do such a thing depending on the data you supply and how important it is to have the curve pass through those points.
I'm guessing that you have many more points than you do coefficients in the polynomial (e.g. more than three points for a 2nd order curve).
If that's true, then the best you can do is least square fitting, which calculates the coefficients that minimize the mean square error between all the points and the resulting curve.
Since this is second order, my recommendation would be to just create the damn second-order terms and do a linear regression.
E.g. if you are doing z ~ second_order(x,y), it is equivalent to doing z ~ first_order(x, y, x^2, y^2, x*y).
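A minimal illustration of that idea for a single predictor (made-up data, Python/NumPy): add the squared term as an extra column of the design matrix and let an ordinary linear least-squares solver do the rest.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1.5 + 0.5 * x - 2.0 * x ** 2
X = np.column_stack([np.ones_like(x), x, x ** 2])   # columns: 1, x, x^2
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)      # ordinary linear regression
print(coeffs)    # ~ [ 1.5  0.5 -2. ]
```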

Random projection algorithm pseudo code

I am trying to apply the Random Projection method to a very sparse dataset. I found papers and tutorials about the Johnson-Lindenstrauss method, but every one of them is full of equations that give me no meaningful explanation. For example, this document on Johnson-Lindenstrauss.
Unfortunately, from this document I can get no idea of the implementation steps of the algorithm. It's a long shot, but is there anyone who can tell me the plain-English version or very simple pseudocode of the algorithm? Or where can I start digging into these equations? Any suggestions?
For example, here is what I understand of the algorithm from reading this paper concerning Johnson-Lindenstrauss:
Assume we have an AxB matrix, where A is the number of samples and B is the number of dimensions, e.g. 100x5000, and I want to reduce its dimension to 500, which will produce a 100x500 matrix.
As far as I understand: first, I need to construct a 100x500 matrix and fill the entries randomly with +1 and -1 (with a 50% probability).
Edit:
Okay, I think I started to get it. So we have a matrix A which is mxn. We want to reduce it to E which is mxk.
What we need to do is construct a matrix R of dimension nxk and fill it with 0, -1 or +1, with probabilities 2/3, 1/6 and 1/6 respectively.
After constructing this R, we simply do the matrix multiplication AxR to find our reduced matrix E. But we don't need a full matrix multiplication, because if an element of R is 0 we can skip that calculation entirely; if it is +1 we just add the corresponding entry, and if it is -1 we just subtract it. So we only need summation rather than multiplication to find E, and that is what makes this method very fast.
It turned out to be a very neat algorithm, although I felt too stupid to get the idea.
You have the idea right. However, as I understand random projection, the rows of your matrix R should have unit length. I believe that's approximately what the normalization by 1/sqrt(k) is for: to compensate for the fact that they're not unit vectors.
It isn't a projection, but it's nearly one: R's rows aren't orthonormal, but within a much higher-dimensional space they very nearly are. In fact, the dot product of any two of those vectors will be pretty close to 0. This is why it is generally a good approximation of actually finding a proper basis for the projection.
The mapping from high-dimensional data A to low-dimensional data E is given in the statement of theorem 1.1 in the latter paper - it is simply a scalar multiplication followed by a matrix multiplication. The data vectors are the rows of the matrices A and E. As the author points out in section 7.1, you don't need to use a full matrix multiplication algorithm.
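Putting the pieces together, here is a minimal Python/NumPy sketch of that mapping. The sqrt(3/k) scaling is one common convention for the sparse +1/0/-1 scheme (the dense +/-1 scheme would use 1/sqrt(k)), and a real implementation would exploit the zeros in R instead of doing the full matrix multiplication:

```python
import numpy as np

def sparse_random_projection(A, k, rng=None):
    """Project the rows of A (m x n) down to k dimensions."""
    rng = np.random.default_rng() if rng is None else rng
    n = A.shape[1]
    # R filled with +1, 0, -1 with probabilities 1/6, 2/3, 1/6.
    R = rng.choice([1.0, 0.0, -1.0], size=(n, k), p=[1/6, 2/3, 1/6])
    return np.sqrt(3.0 / k) * (A @ R)       # E is m x k

A = np.random.rand(100, 5000)               # 100 samples, 5000 dimensions
E = sparse_random_projection(A, k=500)
print(E.shape)                              # (100, 500)
```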
If your dataset is sparse, then sparse random projections will not work well.
You have a few options here:
Option A:
Step 1: apply a structured dense random projection (a so-called fast Hadamard transform is typically used). This is a special projection which is very fast to compute, but otherwise has the properties of a normal dense random projection.
Step 2: apply a sparse projection to the "densified" data (sparse random projections are useful for dense data only).
Option B:
Apply SVD to the sparse data. If the data is sparse but has some structure, SVD is better. Random projection preserves the distances between all points; SVD better preserves the distances between dense regions - in practice this is more meaningful. People also use random projections to compute the SVD of huge datasets. Random projection gives you efficiency, but not necessarily the best quality of embedding in a low dimension.
If your data has no structure, then use random projections.
Option C:
For data points for which SVD has little error, use SVD; for the rest of the points use Random Projection
Option D:
Use a random projection based on the data points themselves.
This makes it very easy to understand what is going on. It looks something like this (written here as a runnable Python/NumPy sketch; the number of points sampled per projection vector is an illustrative choice):
import numpy as np

def data_based_random_projection(X, k, sample_size=20, rng=None):
    # X: (n, d) data matrix - n data points with d features; k: new dimension
    rng = np.random.default_rng() if rng is None else rng
    n, d = X.shape
    scores = np.zeros((n, k))                 # n by k matrix of projected points
    for j in range(k):                        # generate k random projection vectors
        randomized_combination = np.zeros(d)  # feature vector of zeros
        sample_point_ids = rng.choice(n, size=min(sample_size, n), replace=False)
        for point_id in sample_point_ids:
            random_sign = rng.choice([1.0, -1.0])                # +1/-1 with prob. 1/2
            randomized_combination += random_sign * X[point_id]  # vector operation
        randomized_combination /= np.linalg.norm(randomized_combination)
        # note: the normal random projection instead uses a vector of d independent
        # +/-1 entries (randomly set a fraction to 0 if you want it sparse;
        # also good to normalize by length)
        # project the data points onto this random feature:
        for point_id in range(n):
            scores[point_id, j] = X[point_id] @ randomized_combination
    return scores
If you are still looking to solve this problem, write a message here, I can give you more pseudocode.
The way to think about it is that a random projection is just a random pattern, and the dot product (i.e. projecting the data point onto the pattern) gives you the overlap between them. So if two data points overlap with many random patterns, those points are similar. Therefore, random projections preserve similarity while using less space, but they also add random fluctuations to the pairwise similarities. What the JL theorem tells you is that to keep the fluctuations around 0.1 (eps), you need about 100*log(n) dimensions.
Good Luck!
An R package to perform random projection using the Johnson-Lindenstrauss lemma:
RandPro

Pseudorandom Number Generation with Specific Non-Uniform Distributions

I'm writing a program that simulates various random walks (with differing distributions). At each timestep, I need randomly generated, two dimensional step distances and angles from the distribution of the random walk. I'm hoping someone can check my understanding of how to generate these random numbers.
As I understand it I can use Inverse Transform Sampling as follows:
If f(x) is the pdf of our random walk that has a non-uniform distribution, and y is a random number from a uniform distribution.
Then, if we let f(x) = y and solve for x, we have a random number from the non-uniform distribution.
Is this a feasible solution?
Not quite. The function that needs to be inverted is not f(x), the pdf, but F(x) = P(X ≤ x) = ∫_{-∞}^{x} f(t) dt, the cdf. The good thing is that F is monotone, so it actually has a unique inverse (unlike f).
There are multiple other ways of generating random numbers according to a given distribution. For example, if the cdf F is difficult to compute or to invert, rejection sampling can be a good option if f is easy to compute.
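As a tiny illustration of rejection sampling (with a made-up target: the triangular pdf f(x) = 2x on [0, 1], a uniform proposal and bound M = 2):

```python
import numpy as np

rng = np.random.default_rng(0)

def rejection_sample(n):
    """Draw n samples from f(x) = 2x on [0, 1] using a uniform proposal."""
    samples = []
    while len(samples) < n:
        x = rng.random()                  # proposal from Uniform(0, 1)
        u = rng.random()
        if u < (2.0 * x) / 2.0:           # accept with probability f(x) / (M * g(x))
            samples.append(x)
    return np.array(samples)

print(rejection_sample(100_000).mean())   # should be close to E[X] = 2/3
```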
You are close, but not quite. Every probability density function (pdf) has a corresponding cumulative distribution function (cdf). An important property of the CDF is that its values are always between 0 and 1. Because it is relatively easy to draw a random number between 0 and 1, we can use that to work our way backwards to the distribution. So changing the word pdf to CDF in your question makes the statement correct.
As an aside, for this to make sense computationally you need an easy-to-calculate inverse of the CDF. One way to do this is to fit a polynomial approximation to the CDF and invert that function. There are more advanced techniques for simulating messier probability distributions. See this book chapter for the details.
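Here is a minimal sketch of inverse transform sampling for a distribution whose CDF has a closed-form inverse (the exponential distribution, chosen purely as an example):

```python
import numpy as np

# CDF: F(x) = 1 - exp(-lam*x), so F^{-1}(y) = -ln(1 - y) / lam.
rng = np.random.default_rng(0)
lam = 2.0
y = rng.random(100_000)          # uniform random numbers in [0, 1)
x = -np.log(1.0 - y) / lam       # exponential samples via the inverse CDF
print(x.mean())                  # should be close to 1/lam = 0.5
```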
