How exactly do you compute the Fast Fourier Transform? - algorithm

I've been reading a lot about the Fast Fourier Transform and am trying to understand the low-level aspect of it. Unfortunately, Google and Wikipedia are not helping much at all, and I have about five different algorithm books open that aren't helping much either.
I'm trying to find the FFT of something simple like the vector [1,0,0,0]. Sure, I could just plug it into Matlab, but that won't help me understand what's going on underneath. Also, when I say I want to find the FFT of a vector, is that the same as saying I want to find the DFT of that vector, just with a more efficient algorithm?

You're right, "the" Fast Fourier transform is just a name for any algorithm that computes the discrete Fourier transform in O(n log n) time, and there are several such algorithms.
Here's the simplest explanation of the DFT and FFT as I think of them, and also examples for small N, which may help. (Note that there are alternative interpretations, and other algorithms.)
Discrete Fourier transform
Given N numbers f_0, f_1, f_2, …, f_{N-1}, the DFT gives a different set of N numbers.
Specifically: let ω be a primitive N-th root of 1 (either in the complex numbers or in some finite field), which means that ω^N = 1 but no smaller positive power of ω is 1. You can think of the f_k as the coefficients of a polynomial P(x) = ∑ f_k·x^k. The N new numbers F_0, F_1, …, F_{N-1} that the DFT gives are the results of evaluating this polynomial at the powers of ω. That is, for each n from 0 to N-1, the new number F_n is P(ω^n) = ∑_{k=0}^{N-1} f_k·ω^{nk}.
[The reason for choosing ω is that the inverse DFT has a nice form, very similar to the DFT itself.]
Note that finding these F's naively takes O(N^2) operations. But we can exploit the special structure that comes from the ω's we chose, and that allows us to do it in O(N log N). Any such algorithm is called the fast Fourier transform.
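For concreteness, here is a minimal Python sketch of that naive O(N^2) evaluation. The function name naive_dft and the sign convention ω = exp(-2πi/N) (which matches what most FFT libraries use) are my choices, not part of the definition:

```python
import cmath

def naive_dft(f):
    """Directly evaluate F_n = P(w^n) = sum_k f[k] * w^(n*k) for n = 0..N-1.

    This is the O(N^2) definition of the DFT, not a fast algorithm.
    """
    N = len(f)
    w = cmath.exp(-2j * cmath.pi / N)  # a primitive N-th root of 1
    return [sum(f[k] * w ** (n * k) for k in range(N)) for n in range(N)]
```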
Fast Fourier Transform
So here's one way of doing the FFT (radix-2, decimation in time). I'll replace N with 2N to simplify notation. We have f_0, f_1, f_2, …, f_{2N-1}, and we want to compute P(ω^0), P(ω^1), …, P(ω^{2N-1}), where ω is a primitive 2N-th root of 1. Split P into its even- and odd-indexed coefficients, so that we can write
P(x) = Q(x^2) + x·R(x^2), with
Q(y) = f_0 + f_2·y + … + f_{2N-2}·y^{N-1}
R(y) = f_1 + f_3·y + … + f_{2N-1}·y^{N-1}
Now here's the beauty of the thing. Since ω^{2N} = 1 and ω^N = -1, the value at ω^{k+N} is very simply related to the value at ω^k:
P(ω^k) = Q(ω^{2k}) + ω^k·R(ω^{2k}) and P(ω^{k+N}) = Q(ω^{2k}) - ω^k·R(ω^{2k}). So the evaluations of Q and R at the N points (ω^2)^0, …, (ω^2)^{N-1} are enough, and note that ω^2 is itself a primitive N-th root of 1.
This means that your original problem, evaluating the 2N-coefficient polynomial P at the 2N points ω^0 to ω^{2N-1}, has been reduced to two problems of evaluating the N-coefficient polynomials Q and R at the N points (ω^2)^0 to (ω^2)^{N-1}, plus O(N) extra multiplications and additions. So the running time satisfies T(2N) = 2T(N) + O(N) and all that, which gives T(N) = O(N log N).
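Here is a minimal recursive sketch of exactly this even/odd recursion. It assumes the length is a power of two and uses ω = exp(-2πi/N); the function name is mine:

```python
import cmath

def fft(f):
    """Radix-2 decimation-in-time FFT of a list whose length is a power of two."""
    N = len(f)
    if N == 1:
        return list(f)
    Q = fft(f[0::2])  # DFT of the even-indexed coefficients (length N/2)
    R = fft(f[1::2])  # DFT of the odd-indexed coefficients (length N/2)
    out = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(-2j * cmath.pi * k / N) * R[k]  # w^k * R((w^2)^k)
        out[k] = Q[k] + t             # P(w^k)
        out[k + N // 2] = Q[k] - t    # P(w^(k + N/2))
    return out
```

Each level of the recursion does O(N) work and halves the problem size, which is exactly where T(2N) = 2T(N) + O(N) comes from.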
Examples of DFT
Note that other definitions of the DFT include normalization factors of 1/N or 1/√N; the convention above uses none.
For N=2, ω=-1, and the Fourier transform of (a,b) is (a+b, a-b).
For N=3, ω is a primitive complex cube root of 1, and the Fourier transform of (a,b,c) is (a+b+c, a+b·ω+c·ω^2, a+b·ω^2+c·ω). (The last entry uses ω^4 = ω, since ω^3 = 1.)
For N=4, ω=i, and the Fourier transform of (a,b,c,d) is (a+b+c+d, a+bi-c-di, a-b+c-d, a-bi-c+di). In particular, for the example in your question: the DFT of (1,0,0,0) gives (1,1,1,1), which is perhaps not very illuminating.
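To tie this back to the question, any FFT library returns these same numbers. A quick check with NumPy (which uses ω = exp(-2πi/N), i.e. ω = -i for N = 4, so the second and fourth outputs come out swapped relative to the ω = i convention above):

```python
import numpy as np

print(np.fft.fft([1, 0, 0, 0]))  # [1.+0.j 1.+0.j 1.+0.j 1.+0.j]
print(np.fft.fft([1, 2, 3, 4]))  # [10.+0.j -2.+2.j -2.+0.j -2.-2.j]
```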

The FFT is just an efficient implementation of the DFT. The results should be identical for both, but in general the FFT will be much faster. Make sure you understand how the DFT works first, since it is much simpler and much easier to grasp.
When you understand the DFT, then move on to the FFT. Note that although the general principle is the same, there are many different implementations and variations of the FFT, e.g. decimation-in-time vs. decimation-in-frequency, radix-2 vs. other radices and mixed radix, complex-to-complex vs. real-to-complex, etc.
A good practical book to read on the subject is The Fast Fourier Transform and Its Applications by E. O. Brigham.

Yes, the FFT is merely an efficient algorithm for computing the DFT. Understanding the FFT itself might take some time unless you've already studied complex numbers and the continuous Fourier transform; but it is basically a change of basis to a basis built from periodic functions (complex exponentials).
(If you want to learn more about Fourier analysis, I recommend the book Fourier Analysis and Its Applications by Gerald B. Folland)

I'm also new to Fourier transforms and I found this online book very helpful:
The Scientist and Engineer's Guide to Digital Signal Processing
The link takes you to the chapter on the Discrete Fourier Transform. It explains the differences between the various Fourier transforms, when you'd use each one, and gives pseudocode showing how to calculate the Discrete Fourier Transform.

If you want a plain-English explanation of the DFT (and a bit of the FFT as well) instead of academic gobbledygook, you should read this: http://blogs.zynaptiq.com/bernsee/dft-a-pied/
I couldn't have explained it better myself.

Related

Fast solution of linear equations starting from approximate solution

In a problem I'm working on, I need to solve Ax=b where A is an n x n square matrix (typically n is a few thousand), and b and x are vectors of size n. The catch is that this must be done many (billions of) times, where A and b change only very slightly between successive solves.
Is there a way to reuse an existing approximate solution for x (or perhaps the inverse of A) from the previous calculation instead of solving the equations from scratch?
I'd also be interested in a way to get x to within some defined accuracy (e.g. error in any element of x < 0.001), rather than an exact solution (again, reusing the previous calculations).
You could use the Sherman–Morrison formula to incrementally update the inverse of matrix A.
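For instance, if the change from one step to the next can be expressed as a rank-one update A_new = A + u·v^T (an assumption; your actual perturbation may need several such updates), a NumPy sketch of the update looks like this:

```python
import numpy as np

def sherman_morrison_update(A_inv, u, v):
    """Given A_inv = inv(A), return inv(A + outer(u, v)) in O(n^2) work.

    Only valid when 1 + v @ A_inv @ u is not (close to) zero.
    """
    Au = A_inv @ u                 # A^-1 u
    vA = v @ A_inv                 # v^T A^-1
    return A_inv - np.outer(Au, vA) / (1.0 + v @ Au)

# Usage sketch: keep A_inv from the previous step, update it cheaply,
# then the new solution is simply x = A_inv_new @ b_new.
```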
To speed up the matrix multiplications, you could use a suitable matrix multiplication algorithm or a library tuned for high-performance computing. Classic matrix multiplication has complexity O(n^3); Strassen-type algorithms achieve O(n^2.81) and better.
A similar question without a real answer was asked here.

'Squaring a polynomial' versus 'Multiplying the polynomial with itself using FFT'

The fastest known algorithms for polynomial multiplication are based on the Fast Fourier Transform (FFT).
In the special case of multiplying a polynomial by itself, I am interested in knowing whether any squaring algorithm performs better than FFT-based multiplication. I couldn't find any resource that deals with this aspect.
Number Theoretic Transform (NTT) is faster than FFT
Why? Because it uses only integer modular arithmetic over a suitable ring instead of floating-point complex numbers, while the relevant properties stay the same (the NTT is just a form of the DFT over a finite ring). So if your polynomial coefficients are integers, use the NTT, which is faster; if they are floating-point, you need the FFT.
FFT-based squaring is faster than FFT-based multiplication of the polynomial by itself
Why? Because squaring needs just 1x NTT/FFT and 1x iNTT/iFFT, while multiplying needs 2x NTT/FFT and 1x iNTT/iFFT, so you save one transform; the rest is the same.
For small enough polynomials, plain schoolbook squaring without the FFT is fastest.
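To make the transform count concrete, here is a hedged NumPy sketch of FFT-based squaring (one forward transform, one inverse) versus general multiplication (two forward transforms, one inverse). Real coefficients are assumed and the function names are mine:

```python
import numpy as np

def poly_square_fft(p):
    """Square a real-coefficient polynomial: 1x forward FFT + 1x inverse FFT."""
    n = 2 * len(p) - 1                    # number of coefficients of p^2
    size = 1 << (n - 1).bit_length()      # pad to a power of two
    P = np.fft.rfft(p, size)              # the single forward transform
    return np.fft.irfft(P * P, size)[:n]  # pointwise square, then inverse transform

def poly_mul_fft(a, b):
    """General multiplication needs two forward transforms."""
    n = len(a) + len(b) - 1
    size = 1 << (n - 1).bit_length()
    return np.fft.irfft(np.fft.rfft(a, size) * np.fft.rfft(b, size), size)[:n]

# With integer coefficients, round the results afterwards (or use an NTT to
# avoid floating-point error altogether).
```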
For more info see:
Fast bignum square computation
It's not the same problem, but it is very similar: bignum data words play the same role as your polynomial coefficients, so most of it applies to your problem too.

Algorithms bounds and functions describing algorithm

OK, so basically I need to understand how I can compare functions so that I can find big O, big Theta, and big Omega for the algorithms in a program.
My mathematics background is not very strong, but I have the foundations down.
My question is:
Is there a mathematical way to find where two functions intersect, so that from some point n onward one dominates the other?
For example, if I have the functions
2n^2 and 64n·log(n) [with log to base 2],
how can I find at which values of n, 2n^2 will upper-bound (hope I used the correct term here) 64n·log(n)? And how do I apply this to any other pair of functions?
Or is it just simply guesswork?
You just need to find the points where the two functions intersect. So set them equal to each other and solve for n, or just plot them and get a rough idea of where they cross, so you know at roughly what data size you should switch between the two algorithms. Wolfram Alpha has useful equation-solving and plotting tools too, just in case you are a bit rusty on the math.
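For the concrete pair in the question, setting 2n^2 = 64·n·log2(n) and dividing by 2n gives n = 32·log2(n), which you can solve numerically (or notice that n = 256 works exactly, since log2(256) = 8). A quick sketch; the helper name crossover is mine:

```python
import math

# Solve n = 32*log2(n) for the large crossover by bisection on g(n) = n - 32*log2(n).
def crossover(lo=2.0, hi=1e6):
    g = lambda n: n - 32 * math.log2(n)
    while hi - lo > 1e-9:
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if g(mid) < 0 else (lo, mid)
    return lo

print(crossover())  # ~256.0: beyond n = 256, 2n^2 grows faster than 64*n*log2(n)
```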
Big O, Big Theta, and Big Omega are about general growth patterns of algorithms. Each algorithm can be implemented in an infinite number of ways, and each implementation has its own specific execution time, which is what an observed crossover actually reflects. You can compare the execution-time-per-input-size of two implementations of an algorithm, but not of an algorithm by itself. Big O, Big Theta, and Big Omega say next to nothing about the concrete execution time of an algorithm, so you cannot really pin down a specific point at which one algorithm becomes faster than another. We can discuss crossover points in theory, but not in detail, because the detail only makes sense for implementations, not for algorithms in the abstract.
It's also important to note that Big O (and its relatives) have no constant factors, unlike the functions you wrote.
http://en.wikipedia.org/wiki/Big_O_notation
You can divide out an n on both sides, along with a common factor of 2, and then you just have to solve
32·log(n) <= c·n for all n >= n_0.
Let's say c = 32, because then it reduces to log(n) <= n (which looks nice), and n_0 can be 1.
But this sort of thing has already been done. Unless you are in a CS class, you can just say that O(n log n) is a subset of O(n^2) without proving it; it's well known anyway.
For big-O, big-Theta, and big-Omega you don't have to find the intersection point; you just have to prove that one function eventually dominates the other (up to constant factors). One useful tool for this is L'Hopital's rule: take the ratio of the two running-time functions and compute its limit as n goes to infinity. If that limit has the indeterminate form infinity/infinity, L'Hopital's rule says it equals the limit of the ratio of the derivatives, which is often easier to evaluate. For example, this shows that log(n) = O(n^c) for any c > 0.
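If you want to sanity-check such a limit symbolically, SymPy will evaluate it for you. A small sketch with the concrete instance c = 1/2:

```python
import sympy as sp

n = sp.symbols('n', positive=True)
print(sp.limit(sp.log(n) / sp.sqrt(n), n, sp.oo))  # 0, hence log(n) = O(n^(1/2))
```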

computing derivatives of discrete periodic data

I have an array y[x], x=0,1,2,...,10^6 describing a periodic signal with y(10^6)=y(0), and I want to compute its derivative dy/dx with a fast method.
I tried the spectral difference method, namely
dy/dx = inverse_fourier_transform( i*k * fourier_transform(y)[k] )    (1)
and the result is different from (y[x+1]-y[x-1])/2, i.e. the value suggested by the finite difference method.
Which of the two is more accurate, and which is faster? Are there other comparable methods?
Below is an effort to understand the difference between the results:
If one expands both the sum for the fourier_transform and the sum for the inverse_fourier_transform in (1), one can express dy/dx as a linear combination of the y[x] with coefficients a[x]. I computed these coefficients and they seem to fall off like 1/n (as the length of the array goes to infinity), where n is the distance from the point at which the derivative is evaluated. Compared to the finite-difference method, which uses only the two neighbouring points, the spectral derivative is highly non-local... Am I correct about this, and if so, how should I understand it?
If you are sampling the signal above the Nyquist rate, then the Fourier method gives you an exact answer, because your data completely describe the signal (assuming no noise).
The finite difference (y[x+1]-y[x-1])/2 is a second-order approximation, so it is not exact. Still, if you plot the two, they should show the same basic trends; if they look completely different, then you probably have an error somewhere.
However, a fast Fourier transform is O(n·log(n)) while finite differences are O(n), so the latter is faster (but not so much faster that it should automatically be preferred).
The Fourier approach is non-local in the sense that it reconstructs the whole signal exactly (and so uses all wavelengths), which is why every point contributes to the derivative at every other point.
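A hedged NumPy sketch comparing the two on a smooth periodic test signal (grid spacing 1, as in the question; the test function is my choice):

```python
import numpy as np

N = 1024
x = np.arange(N)
y = np.sin(2 * np.pi * 5 * x / N)  # smooth, periodic, well below the Nyquist frequency

# Spectral derivative: dy/dx = ifft( i*k * fft(y) ), eq. (1), with i*k built from fftfreq.
ik = 2j * np.pi * np.fft.fftfreq(N, d=1.0)
dy_spectral = np.fft.ifft(ik * np.fft.fft(y)).real

# Central finite difference, wrapping around the ends because the signal is periodic.
dy_fd = (np.roll(y, -1) - np.roll(y, 1)) / 2.0

exact = (2 * np.pi * 5 / N) * np.cos(2 * np.pi * 5 * x / N)
print(np.abs(dy_spectral - exact).max())  # round-off level: essentially exact for band-limited data
print(np.abs(dy_fd - exact).max())        # noticeably larger: the O(h^2) truncation error
```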

Efficient algorithm for finding largest eigenpair of small general complex matrix

I am looking for an efficient algorithm to find the largest eigenpair of a small, general (non-square, non-sparse, non-symmetric), complex matrix A of size m x n. By small I mean that m and n are typically between 4 and 64, and usually around 16, but with m not equal to n.
This problem is straightforward to solve with the general LAPACK SVD routines, i.e. gesvd or gesdd. However, as I am solving millions of these problems and only require the largest eigenpair, I am looking for a more efficient algorithm. Additionally, in my application the eigenvectors will generally be similar across cases. This led me to investigate Arnoldi-iteration-based methods, but I have found neither a good library nor an algorithm that applies to my small, general, complex matrix. Is there an appropriate algorithm and/or library?
Rayleigh quotient iteration has cubic convergence, but it requires an LU or QR factorization of your (shifted) matrix at each step, so you may also want to implement the plain power method and see how the two compare.
http://en.wikipedia.org/wiki/Rayleigh_quotient_iteration
Following #rchilton's comment, you can apply this to A* A.
The idea behind looking for the largest eigenpair is analogous to taking a large power of the matrix, as the lower-frequency modes get damped out during the iteration. The Lanczos algorithm is one of a few such algorithms that rely on the so-called Ritz eigenvectors during the decomposition. From Wikipedia:
The Lanczos algorithm is an iterative algorithm ... that is an adaptation of power methods to find eigenvalues and eigenvectors of a square matrix or the singular value decomposition of a rectangular matrix. It is particularly useful for finding decompositions of very large sparse matrices. In latent semantic indexing, for instance, matrices relating millions of documents to hundreds of thousands of terms must be reduced to singular-value form.
The technique works even if the system is not sparse, but if it is large and dense it has the advantage that it doesn't all have to be stored in memory at the same time.
How does it work?
The power method for finding the largest eigenvalue of a matrix A can be summarized by noting that if x_{0} is a random vector and x_{n+1} = A x_{n}, then in the large-n limit, x_{n} / ||x_{n}|| approaches the normalized eigenvector corresponding to the largest (in magnitude) eigenvalue.
Non-square matrices?
Noting that your system is not a square matrix, I'm pretty sure that the SVD problem can be decomposed into separate linear algebra problems where the Lanczos algorithm would apply. A good place to ask such questions would be over at https://math.stackexchange.com/.
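Putting the remarks above together (the power method applied to A* A, warm-started because your eigenvectors are similar across problems), a minimal NumPy sketch for the dominant singular pair might look like the following. The function name, tolerance, and iteration cap are my choices, and this is a sketch rather than a tuned replacement for LAPACK:

```python
import numpy as np

def dominant_singular_pair(A, v0=None, tol=1e-10, max_iter=500):
    """Power iteration on A^H A. Returns (sigma, u, v) with A @ v ~= sigma * u."""
    m, n = A.shape
    if v0 is None:
        v = np.random.randn(n) + 1j * np.random.randn(n)
    else:
        v = v0.astype(complex)          # warm start from a previous, similar problem
    v /= np.linalg.norm(v)
    for _ in range(max_iter):
        w = A.conj().T @ (A @ v)        # one application of A^H A
        w /= np.linalg.norm(w)
        if np.linalg.norm(w - v) < tol:
            v = w
            break
        v = w
    Av = A @ v
    sigma = np.linalg.norm(Av)
    return sigma, Av / sigma, v
```

Because the dominant eigenvalue of A^H A is real and positive, the iterate's phase is stable and the simple ||w - v|| convergence check works.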
