How to compute frequency of data using FFT? - algorithm

I want to know the frequency of data. I had a little bit idea that it can be done using FFT, but I am not sure how to do it. Once I passed the entire data to FFT, then it is giving me 2 peaks, but how can I get the frequency?
Thanks a lot in advance.

Here's what you're probably looking for:
When you talk about computing the frequency of a signal, you probably aren't so interested in the component sine waves. This is what the FFT gives you. For example, if you sum sin(2*pi*10x)+sin(2*pi*15x)+sin(2*pi*20x)+sin(2*pi*25x), you probably want to detect the "frequency" as 5 (take a look at the graph of this function). However, the FFT of this signal will detect the magnitude of 0 for the frequency 5.
What you are probably more interested in is the periodicity of the signal. That is, the interval at which the signal becomes most like itself. So most likely what you want is the autocorrelation. Look it up. This will essentially give you a measure of how self-similar the signal is to itself after being shifted over by a certain amount. So if you find a peak in the autocorrelation, that would indicate that the signal matches up well with itself when shifted over that amount. There's a lot of cool math behind it, look it up if you are interested, but if you just want it to work, just do this:
Window the signal, using a smooth window (a cosine will do. The window should be at least twice as large as the largest period you want to detect. 3 times as large will give better results). (see http://zone.ni.com/devzone/cda/tut/p/id/4844 if you are confused).
Take the FFT (however, make sure the FFT size is twice as big as the window, with the second half being padded with zeroes. If the FFT size is only the size of the window, you will effectively be taking the circular autocorrelation, which is not what you want. see https://en.wikipedia.org/wiki/Discrete_Fourier_transform#Circular_convolution_theorem_and_cross-correlation_theorem )
Replace all coefficients of the FFT with their square value (real^2+imag^2). This is effectively taking the autocorrelation.
Take the iFFT
Find the largest peak in the iFFT. This is the strongest periodicity of the waveform. You can actually be a little more clever in which peak you pick, but for most purposes this should be enough. To find the frequency, you just take f=1/T.

Suppose x[n] = cos(2*pi*f0*n/fs) where f0 is the frequency of your sinusoid in Hertz, n=0:N-1, and fs is the sampling rate of x in samples per second.
Let X = fft(x). Both x and X have length N. Suppose X has two peaks at n0 and N-n0.
Then the sinusoid frequency is f0 = fs*n0/N Hertz.
Example: fs = 8000 samples per second, N = 16000 samples. Therefore, x lasts two seconds long.
Suppose X = fft(x) has peaks at 2000 and 14000 (=16000-2000). Therefore, f0 = 8000*2000/16000 = 1000 Hz.

If you have a signal with one frequency (for instance:
y = sin(2 pi f t)
With:
y time signal
f the central frequency
t time
Then you'll get two peaks, one at a frequency corresponding to f, and one at a frequency corresponding to -f.
So, to get to a frequency, can discard the negative frequency part. It is located after the positive frequency part. Furthermore, the first element in the array is a dc-offset, so the frequency is 0. (Beware that this offset is usually much more than 0, so the other frequency components might get dwarved by it.)
In code: (I've written it in python, but it should be equally simple in c#):
import numpy as np
from pylab import *
x = np.random.rand(100) # create 100 random numbers of which we want the fourier transform
x = x - mean(x) # make sure the average is zero, so we don't get a huge DC offset.
dt = 0.1 #[s] 1/the sampling rate
fftx = np.fft.fft(x) # the frequency transformed part
# now discard anything that we do not need..
fftx = fftx[range(int(len(fftx)/2))]
# now create the frequency axis: it runs from 0 to the sampling rate /2
freq_fftx = np.linspace(0,2/dt,len(fftx))
# and plot a power spectrum
plot(freq_fftx,abs(fftx)**2)
show()
Now the frequency is located at the largest peak.

If you are looking at the magnitude results from an FFT of the type most common used, then a strong sinusoidal frequency component of real data will show up in two places, once in the bottom half, plus its complex conjugate mirror image in the top half. Those two peaks both represent the same spectral peak and same frequency (for strictly real data). If the FFT result bin numbers start at 0 (zero), then the frequency of the sinusoidal component represented by the bin in the bottom half of the FFT result is most likely.
Frequency_of_Peak = Data_Sample_Rate * Bin_number_of_Peak / Length_of_FFT ;
Make sure to work out your proper units within the above equation (to get units of cycles per second, per fortnight, per kiloparsec, etc.)
Note that unless the wavelength of the data is an exact integer submultiple of the FFT length, the actual peak will be between bins, thus distributing energy among multiple nearby FFT result bins. So you may have to interpolate to better estimate the frequency peak. Common interpolation methods to find a more precise frequency estimate are 3-point parabolic and Sinc convolution (which is nearly the same as using a zero-padded longer FFT).

Assuming you use a discrete Fourier transform to look at frequencies, then you have to be careful about how to interpret the normalized frequencies back into physical ones (i.e. Hz).
According to the FFTW tutorial on how to calculate the power spectrum of a signal:
#include <rfftw.h>
...
{
fftw_real in[N], out[N], power_spectrum[N/2+1];
rfftw_plan p;
int k;
...
p = rfftw_create_plan(N, FFTW_REAL_TO_COMPLEX, FFTW_ESTIMATE);
...
rfftw_one(p, in, out);
power_spectrum[0] = out[0]*out[0]; /* DC component */
for (k = 1; k < (N+1)/2; ++k) /* (k < N/2 rounded up) */
power_spectrum[k] = out[k]*out[k] + out[N-k]*out[N-k];
if (N % 2 == 0) /* N is even */
power_spectrum[N/2] = out[N/2]*out[N/2]; /* Nyquist freq. */
...
rfftw_destroy_plan(p);
}
Note it handles data lengths that are not even. Note particularly if the data length is given, FFTW will give you a "bin" corresponding to the Nyquist frequency (sample rate divided by 2). Otherwise, you don't get it (i.e. the last bin is just below Nyquist).
A MATLAB example is similar, but they are choosing the length of 1000 (an even number) for the example:
N = length(x);
xdft = fft(x);
xdft = xdft(1:N/2+1);
psdx = (1/(Fs*N)).*abs(xdft).^2;
psdx(2:end-1) = 2*psdx(2:end-1);
freq = 0:Fs/length(x):Fs/2;
In general, it can be implementation (of the DFT) dependent. You should create a test pure sine wave at a known frequency and then make sure the calculation gives the same number.

Frequency = speed/wavelength.
Wavelength is the distance between the two peaks.

Related

How to calculate the length of cable on a winch given the rotations of the drum

I have a cable winch system that I would like to know how much cable is left given the number of rotations that have occurred and vice versa. This system will run on a low-cost microcontroller with low computational resources and should be able to update quickly, long for/while loop iterations are not ideal.
The inputs are cable diameter, inner drum diameter, inner drum width, and drum rotations. The output should be the length of the cable on the drum.
At first, I was calculating the maximum number of wraps of cable per layer based on cable diameter and inner drum width, I could then use this to calculate the length of cable per layer. The issue comes when I calculate the total length as I have to loop through each layer, a costly operation (could be 100's of layers).
My next approach was to precalculate a table with each layer, then perform a 3-5 degree polynomial regression down to an easy to calculate formula.
This appears to work for the most part, however, there are slight inaccuracies at the low and high end (0 rotations could be + or - a few units of cable length). The real issue comes when I try and reverse the function to get the current rotations of the drum given the length. So far, my reversed formula does not seem to equal the forward formula (I am reversing X and Y before calculating the polynomial).
I have looked high and low and cannot seem to find any formulas for cable length to rotations that do not use recursion or loops. I can't figure out how to reverse my polynomial function to get the reverse value without losing precision. If anyone happens to have an insight/ideas or can help guide me in the right direction that would be most helpful. Please see my attempts below.
// Units are not important
CableLength = 15000
CableDiameter = 5
DrumWidth = 50
DrumDiameter = 5
CurrentRotations = 0
CurrentLength = 0
CurrentLayer = 0
PolyRotations = Array
PolyLengths = Array
PolyLayers = Array
WrapsPerLayer = DrumWidth / CableDiameter
While CurrentLength < CableLength // Calcuate layer length for each layer up to cable length
CableStackHeight = CableDiameter * CurrentLayer
DrumDiameterAtLayer = DrumDiameter + (CableStackHeight * 2) // Assumes cables stack vertically
WrapDiameter = DrumDiameterAtLayer + CableDiameter // Center point of cable
WrapLength = WrapDiameter * PI
LayerLength = WrapLength * WrapsPerLayer
CurrentRotations += WrapsPerLayer // 1 Rotation per wrap
CurrentLength += LayerLength
CurrentLayer++
PolyRotations.Push(CurrentRotations)
PolyLengths.Push(CurrentLength)
PolyLayers.Push(CurrentLayer)
End
// Using 5 degree polynomials, any lower = very low precision
PolyLengthToRotation = CreatePolynomial(PolyLengths, PolyRotations, 5) // 5 Degrees
PolyRotationToLength = CreatePolynomial(PolyRotations, PolyLengths, 5) // 5 Degrees
// 40 Rotations should equal about 3141.593 units
RealRotation = 40
RealLength = 3141.593
CalculatedLength = EvaluatePolynomial(RealRotation,PolyRotationToLength)
CalculatedRotations = EvaluatePolynomial(RealLength,PolyLengthToRotation)
// CalculatedLength = 3141.593 // Good
// CalculatedRotations = 41.069 // No good
// CalculatedRotations != RealRotation // These should equal
// 0 Rotations should equal 0 length
RealRotation = 0
RealLength = 0
CalculatedLength = EvaluatePolynomial(RealRotation,PolyRotationToLength)
CalculatedRotations = EvaluatePolynomial(RealLength,PolyLengthToRotation)
// CalculatedLength = 1.172421e-9 // Very close
// CalculatedRotations = 1.947, // No good
// CalculatedRotations != RealRotation // These should equal
Side note: I have a "spool factor" parameter to calibrate for the actual cable spooling efficiency that is not shown here. (cable is not guaranteed to lay mathematically perfect)
#Bathsheba May have meant cable, but a table is a valid option (also experimental numbers are probably more interesting in the real world).
A bit slow, but you could always do it manually. There's only 40 rotations (though optionally for better experimental results, repeat 3 times and take the average...). Reel it completely in. Then do a rotation (depending on the diameter of your drum, half rotation). Measure and mark how far it spooled out (tape), record it. Repeat for the next 39 rotations. You now have a lookup table you can find the length in O(log N) via binary search (by sorting the data) and a bit of interpolation (IE: 1.5 rotations is about half way between 1 and 2 rotations).
You can also use this to derived your own experimental data. Do the same thing, but with a cable half as thin (perhaps proportional to the ratio of the inner diameter and the cable radius?). What effect does it have on the numbers? How about twice or half the diameter? Math says circumference is linear (2πr), so half the radius, half the amount per rotation. Might be easier to adjust the table data.
The gist is that it may be easier for you to have a real world reference for your numbers rather than relying purely on an abstract mathematically model (not to say the model would be wrong, but cables don't exactly always wind up perfectly, who knows perhaps you can find a quirk about your winch that would have lead to errors in a pure mathematical approach). Who knows might be able to derive the formula yourself :) with a fudge factor for the real world even lol.

Real-time peak detection in noisy sinusoidal time-series

I have been attempting to detect peaks in sinusoidal time-series data in real time, however I've had no success thus far. I cannot seem to find a real-time algorithm that works to detect peaks in sinusoidal signals with a reasonable level of accuracy. I either get no peaks detected, or I get a zillion points along the sine wave being detected as peaks.
What is a good real-time algorithm for input signals that resemble a sine wave, and may contain some random noise?
As a simple test case, consider a stationary, sine wave that is always the same frequency and amplitude. (The exact frequency and amplitude don't matter; I have arbitrarily chosen a frequency of 60 Hz, an amplitude of +/− 1 unit, at a sampling rate of 8 KS/s.) The following MATLAB code will generate such a sinusoidal signal:
dt = 1/8000;
t = (0:dt:(1-dt)/4)';
x = sin(2*pi*60*t);
Using the algorithm developed and published by Jean-Paul, I either get no peaks detected (left) or a zillion "peaks" detected (right):
I've tried just about every combination of values for these 3 parameters that I could think of, following the "rules of thumb" that Jean-Paul gives, but I have so far been unable to get my expected result.
I found an alternative algorithm, developed and published by Eli Billauer, that does give me the results that I want—e.g.:
Even though Eli Billauer's algorithm is much simpler and does tend to reliably produce the results that I want, it is not suitable for real-time applications.
As another example of a signal that I'd like to apply such an algorithm to, consider the test case given by Eli Billauer for his own algorithm:
t = 0:0.001:10;
x = 0.3*sin(t) + sin(1.3*t) + 0.9*sin(4.2*t) + 0.02*randn(1, 10001);
This is a more unusual (less uniform/regular) signal, with a varying frequency and amplitude, but still generally sinusoidal. The peaks are plainly obvious to the eye when plotted, but hard to identify with an algorithm.
What is a good real-time algorithm to correctly identify the peaks in a sinusoidal input signal? I am not really an expert when it comes to signal processing, so it would be helpful to get some rules of thumb that consider sinusoidal inputs. Or, perhaps I need to modify e.g. Jean-Paul's algorithm itself in order to work properly on sinusoidal signals. If that's the case, what modifications would be required, and how would I go about making these?
Case 1: sinusoid without noise
If your sinusoid does not contain any noise, you can use a very classic signal processing technique: taking the first derivative and detecting when it is equal to zero.
For example:
function signal = derivesignal( d )
% Identify signal
signal = zeros(size(d));
for i=2:length(d)
if d(i-1) > 0 && d(i) <= 0
signal(i) = +1; % peak detected
elseif d(i-1) < 0 && d(i) >= 0
signal(i) = -1; % trough detected
end
end
end
Using your example data:
% Generate data
dt = 1/8000;
t = (0:dt:(1-dt)/4)';
y = sin(2*pi*60*t);
% Add some trends
y(1:1000) = y(1:1000) + 0.001*(1:1000)';
y(1001:2000) = y(1001:2000) - 0.002*(1:1000)';
% Approximate first derivative (delta y / delta x)
d = [0; diff(y)];
% Identify signal
signal = derivesignal(d);
% Plot result
figure(1); clf; set(gcf,'Position',[0 0 677 600])
subplot(4,1,1); hold on;
title('Data');
plot(t,y);
subplot(4,1,2); hold on;
title('First derivative');
area(d);
ylim([-0.05, 0.05]);
subplot(4,1,3); hold on;
title('Signal (-1 for trough, +1 for peak)');
plot(t,signal); ylim([-1.5 1.5]);
subplot(4,1,4); hold on;
title('Signals marked on data');
markers = abs(signal) > 0;
plot(t,y); scatter(t(markers),y(markers),30,'or','MarkerFaceColor','red');
This yields:
This method will work extremely well for any type of sinusoid, with the only requirement that the input signal contains no noise.
Case 2: sinusoid with noise
As soon as your input signal contains noise, the derivative method will fail. For example:
% Generate data
dt = 1/8000;
t = (0:dt:(1-dt)/4)';
y = sin(2*pi*60*t);
% Add some trends
y(1:1000) = y(1:1000) + 0.001*(1:1000)';
y(1001:2000) = y(1001:2000) - 0.002*(1:1000)';
% Add some noise
y = y + 0.2.*randn(2000,1);
Will now generate this result because first differences amplify noise:
Now there are many ways to deal with noise, and the most standard way is to apply a moving average filter. One disadvantage of moving averages is that they are slow to adapt to new information, such that signals may be identified after they have occurred (moving averages have a lag).
Another very typical approach is to use Fourier Analysis to identify all the frequencies in your input data, disregard all low-amplitude and high-frequency sinusoids, and use the remaining sinusoid as a filter. The remaining sinusoid will be (largely) cleansed from the noise and you can then use first-differencing again to determine the peaks and troughs (or for a single sine wave you know the peaks and troughs happen at 1/4 and 3/4 pi of the phase). I suggest you pick up any signal processing theory book to learn more about this technique. Matlab also has some educational material about this.
If you want to use this algorithm in hardware, I would suggest you also take a look at WFLC (Weighted Fourier Linear Combiner) with e.g. 1 oscillator or PLL (Phase-Locked Loop) that can estimate the phase of a noisy wave without doing a full Fast Fourier Transform. You can find a Matlab algorithm for a phase-locked loop on Wikipedia.
I will suggest a slightly more sophisticated approach here that will identify the peaks and troughs in real-time: fitting a sine wave function to your data using moving least squares minimization with initial estimates from Fourier analysis.
Here is my function to do that:
function [result, peaks, troughs] = fitsine(y, t, eps)
% Fast fourier-transform
f = fft(y);
l = length(y);
p2 = abs(f/l);
p1 = p2(1:ceil(l/2+1));
p1(2:end-1) = 2*p1(2:end-1);
freq = (1/mean(diff(t)))*(0:ceil(l/2))/l;
% Find maximum amplitude and frequency
maxPeak = p1 == max(p1(2:end)); % disregard 0 frequency!
maxAmplitude = p1(maxPeak); % find maximum amplitude
maxFrequency = freq(maxPeak); % find maximum frequency
% Initialize guesses
p = [];
p(1) = mean(y); % vertical shift
p(2) = maxAmplitude; % amplitude estimate
p(3) = maxFrequency; % phase estimate
p(4) = 0; % phase shift (no guess)
p(5) = 0; % trend (no guess)
% Create model
f = #(p) p(1) + p(2)*sin( p(3)*2*pi*t+p(4) ) + p(5)*t;
ferror = #(p) sum((f(p) - y).^2);
% Nonlinear least squares
% If you have the Optimization toolbox, use [lsqcurvefit] instead!
options = optimset('MaxFunEvals',50000,'MaxIter',50000,'TolFun',1e-25);
[param,fval,exitflag,output] = fminsearch(ferror,p,options);
% Calculate result
result = f(param);
% Find peaks
peaks = abs(sin(param(3)*2*pi*t+param(4)) - 1) < eps;
% Find troughs
troughs = abs(sin(param(3)*2*pi*t+param(4)) + 1) < eps;
end
As you can see, I first perform a Fourier transform to find initial estimates of the amplitude and frequency of the data. I then fit a sinusoid to the data using the model a + b sin(ct + d) + et. The fitted values represent a sine wave of which I know that +1 and -1 are the peaks and troughs, respectively. I can therefore identify these values as the signals.
This works very well for sinusoids with (slowly changing) trends and general (white) noise:
% Generate data
dt = 1/8000;
t = (0:dt:(1-dt)/4)';
y = sin(2*pi*60*t);
% Add some trends
y(1:1000) = y(1:1000) + 0.001*(1:1000)';
y(1001:2000) = y(1001:2000) - 0.002*(1:1000)';
% Add some noise
y = y + 0.2.*randn(2000,1);
% Loop through data (moving window) and fit sine wave
window = 250; % How many data points to consider
interval = 10; % How often to estimate
result = nan(size(y));
signal = zeros(size(y));
for i = window+1:interval:length(y)
data = y(i-window:i); % Get data window
period = t(i-window:i); % Get time window
[output, peaks, troughs] = fitsine(data,period,0.01);
result(i-interval:i) = output(end-interval:end);
signal(i-interval:i) = peaks(end-interval:end) - troughs(end-interval:end);
end
% Plot result
figure(1); clf; set(gcf,'Position',[0 0 677 600])
subplot(4,1,1); hold on;
title('Data');
plot(t,y); xlim([0 max(t)]); ylim([-4 4]);
subplot(4,1,2); hold on;
title('Model fit');
plot(t,result,'-k'); xlim([0 max(t)]); ylim([-4 4]);
subplot(4,1,3); hold on;
title('Signal (-1 for trough, +1 for peak)');
plot(t,signal,'r','LineWidth',2); ylim([-1.5 1.5]);
subplot(4,1,4); hold on;
title('Signals marked on data');
markers = abs(signal) > 0;
plot(t,y,'-','Color',[0.1 0.1 0.1]);
scatter(t(markers),result(markers),30,'or','MarkerFaceColor','red');
xlim([0 max(t)]); ylim([-4 4]);
Main advantages of this approach are:
You have an actual model of your data, so you can predict signals in the future before they happen! (e.g. fix the model and calculate the result by inputting future time periods)
You don't need to estimate the model every period (see parameter interval in the code)
The disadvantage is that you need to select a lookback window, but you will have this problem with any method that you use for real-time detection.
Video demonstration
Data is the input data, Model fit is the fitted sine wave to the data (see code), Signal indicates the peaks and troughs and Signals marked on data gives an impression of how accurate the algorithm is. Note: watch the model fit adjust itself to the trend in the middle of the graph!
That should get you started. There are also a lot of excellent books on signal detection theory (just google that term), which will go much further into these types of techniques. Good luck!
Consider using findpeaks, it is fast, which may be important for realtime. You should filter high-frequency noise to improve accuracy. here I smooth the data with a moving window.
t = 0:0.001:10;
x = 0.3*sin(t) + sin(1.3*t) + 0.9*sin(4.2*t) + 0.02*randn(1, 10001);
[~,iPeak0] = findpeaks(movmean(x,100),'MinPeakProminence',0.5);
You can time the process (0.0015sec)
f0 = #() findpeaks(movmean(x,100),'MinPeakProminence',0.5)
disp(timeit(f0,2))
To compare, processing the slope is only a bit faster (0.00013sec), but findpeaks have many useful options, such as minimum interval between peaks etc.
iPeaks1 = derivePeaks(x);
f1 = #() derivePeaks(x)
disp(timeit(f1,1))
Where derivePeaks is:
function iPeak1 = derivePeaks(x)
xSmooth = movmean(x,100);
goingUp = find(diff(movmean(xSmooth,100)) > 0);
iPeak1 = unique(goingUp([1,find(diff(goingUp) > 100),end]));
iPeak1(iPeak1 == 1 | iPeak1 == length(iPeak1)) = [];
end

Algorithm to smooth a curve while keeping the area under it constant

Consider a discrete curve defined by the points (x1,y1), (x2,y2), (x3,y3), ... ,(xn,yn)
Define a constant SUM = y1+y2+y3+...+yn. Say we change the value of some k number of y points (increase or decrease) such that the total sum of these changed points is less than or equal to the constant SUM.
What would be the best possible manner to adjust the other y points given the following two conditions:
The total sum of the y points (y1'+y2'+...+yn') should remain constant ie, SUM.
The curve should retain as much of its original shape as possible.
A simple solution would be to define some delta as follows:
delta = (ym1' + ym2' + ym3' + ... + ymk') - (ym1 + ym2 + ym3 + ... + ymk')
and to distribute this delta over the rest of the points equally. Here ym1' is the value of the modified point after modification and ym1 is the value of the modified point before modification to give delta as the total difference in modification.
However this would not ensure a totally smoothed curve as area near changed points would appear ragged. Does a better solution/algorithm exist for the this problem?
I've used the following approach, though it is a bit OTT.
Consider adding d[i] to y[i], to get s[i], the smoothed value.
We seek to minimise
S = Sum{ 1<=i<N-1 | sqr( s[i+1]-2*s[i]+s[i-1] } + f*Sum{ 0<=i<N | sqr( d[i])}
The first term is a sum of the squares of (an approximate) second derivative of the curve, and the second term penalises moving away from the original. f is a (positive) constant. A little algebra recasts this as
S = sqr( ||A*d - b||)
where the matrix A has a nice structure, and indeed A'*A is penta-diagonal, which means that the normal equations (ie d = Inv(A'*A)*A'*b) can be solved efficiently. Note that d is computed directly, there is no need to initialise it.
Given the solution d to this problem we can compute the solution d^ to the same problem but with the constraint One'*d = 0 (where One is the vector of all ones) like this
d^ = d - (One'*d/Q) * e
e = Inv(A'*A)*One
Q = One'*e
What value to use for f? Well a simple approach is to try out this procedure on sample curves for various fs and pick a value that looks good. Another approach is to pick a estimate of smoothness, for example the rms of the second derivative, and then a value that should attain, and then search for an f that gives that value. As a general rule, the bigger f is the less smooth the smoothed curve will be.
Some motivation for all this. The aim is to find a 'smooth' curve 'close' to a given one. For this we need a measure of smoothness (the first term in S) and a measure of closeness (the second term. Why these measures? Well, each are easy to compute, and each are quadratic in the variables (the d[]); this will mean that the problem becomes an instance of linear least squares for which there are efficient algorithms available. Moreover each term in each sum depends on nearby values of the variables, which will in turn mean that the 'inverse covariance' (A'*A) will have a banded structure and so the least squares problem can be solved efficiently. Why introduce f? Well, if we didn't have f (or set it to 0) we could minimise S by setting d[i] = -y[i], getting a perfectly smooth curve s[] = 0, which has nothing to do with the y curve. On the other hand if f is gigantic, then to minimise s we should concentrate on the second term, and set d[i] = 0, and our 'smoothed' curve is just the original. So it's reasonable to suppose that as we vary f, the corresponding solutions will vary between being very smooth but far from y (small f) and being close to y but a bit rough (large f).
It's often said that the normal equations, whose use I advocate here, are a bad way to solve least squares problems, and this is generally true. However with 'nice' banded systems -- like the one here -- the loss of stability through using the normal equations is not so great, while the gain in speed is so great. I've used this approach to smooth curves with many thousands of points in a reasonable time.
To see what A is, consider the case where we had 4 points. Then our expression for S comes down to:
sqr( s[2] - 2*s[1] + s[0]) + sqr( s[3] - 2*s[2] + s[1]) + f*(d[0]*d[0] + .. + d[3]*d[3]).
If we substitute s[i] = y[i] + d[i] in this we get, for example,
s[2] - 2*s[1] + s[0] = d[2]-2*d[1]+d[0] + y[2]-2*y[1]+y[0]
and so we see that for this to be sqr( ||A*d-b||) we should take
A = ( 1 -2 1 0)
( 0 1 -2 1)
( f 0 0 0)
( 0 f 0 0)
( 0 0 f 0)
( 0 0 0 f)
and
b = ( -(y[2]-2*y[1]+y[0]))
( -(y[3]-2*y[2]+y[1]))
( 0 )
( 0 )
( 0 )
( 0 )
In an implementation, though, you probably wouldn't want to form A and b, as they are only going to be used to form the normal equation terms, A'*A and A'*b. It would be simpler to accumulate these directly.
This is a constrained optimization problem. The functional to be minimized is the integrated difference of the original curve and the modified curve. The constraints are the area under the curve and the new locations of the modified points. It is not easy to write such codes on your own. It is better to use some open source optimization codes, like this one: ool.
what about to keep the same dynamic range?
compute original min0,max0 y-values
smooth y-values
compute new min1,max1 y-values
linear interpolate all values to match original min max y
y=min1+(y-min1)*(max0-min0)/(max1-min1)
that is it
Not sure for the area but this should keep the shape much closer to original one. I got this Idea right now while reading your question and now I face similar problem so I try to code it and try right now anyway +1 for the getting me this Idea :)
You can adapt this and combine with the area
So before this compute the area and apply #1..#4 and after that compute new area. Then multiply all values by old_area/new_area ratio. If you have also negative values and not computing absolute area then you have to handle positive and negative areas separately and find multiplication ration to best fit original area for booth at once.
[edit1] some results for constant dynamic range
As you can see the shape is slightly shifting to the left. Each image is after applying few hundreds smooth operations. I am thinking of subdivision to local min max intervals to improve this ...
[edit2] have finished the filter for mine own purposes
void advanced_smooth(double *p,int n)
{
int i,j,i0,i1;
double a0,a1,b0,b1,dp,w0,w1;
double *p0,*p1,*w; int *q;
if (n<3) return;
p0=new double[n<<2]; if (p0==NULL) return;
p1=p0+n;
w =p1+n;
q =(int*)((double*)(w+n));
// compute original min,max
for (a0=p[0],i=0;i<n;i++) if (a0>p[i]) a0=p[i];
for (a1=p[0],i=0;i<n;i++) if (a1<p[i]) a1=p[i];
for (i=0;i<n;i++) p0[i]=p[i]; // store original values for range restoration
// compute local min max positions to p1[]
dp=0.01*(a1-a0); // min delta treshold
// compute first derivation
p1[0]=0.0; for (i=1;i<n;i++) p1[i]=p[i]-p[i-1];
for (i=1;i<n-1;i++) // eliminate glitches
if (p1[i]*p1[i-1]<0.0)
if (p1[i]*p1[i+1]<0.0)
if (fabs(p1[i])<=dp)
p1[i]=0.5*(p1[i-1]+p1[i+1]);
for (i0=1;i0;) // remove zeros from derivation
for (i0=0,i=0;i<n;i++)
if (fabs(p1[i])<dp)
{
if ((i> 0)&&(fabs(p1[i-1])>=dp)) { i0=1; p1[i]=p1[i-1]; }
else if ((i<n-1)&&(fabs(p1[i+1])>=dp)) { i0=1; p1[i]=p1[i+1]; }
}
// find local min,max to q[]
q[n-2]=0; q[n-1]=0; for (i=1;i<n-1;i++) if (p1[i]*p1[i-1]<0.0) q[i-1]=1; else q[i-1]=0;
for (i=0;i<n;i++) // set sign as +max,-min
if ((q[i])&&(p1[i]<-dp)) q[i]=-q[i]; // this shifts smooth curve to the left !!!
// compute weights
for (i0=0,i1=1;i1<n;i0=i1,i1++) // loop through all local min,max intervals
{
for (;(!q[i1])&&(i1<n-1);i1++); // <i0,i1>
b0=0.5*(p[i0]+p[i1]);
b1=fabs(p[i1]-p[i0]);
if (b1>=1e-6)
for (b1=0.35/b1,i=i0;i<=i1;i++) // compute weights bigger near local min max
w[i]=0.8+(fabs(p[i]-b0)*b1);
}
// smooth few times
for (j=0;j<5;j++)
{
for (i=0;i<n ;i++) p1[i]=p[i]; // store data to avoid shifting by using half filtered data
for (i=1;i<n-1;i++) // FIR smooth filter
{
w0=w[i];
w1=(1.0-w0)*0.5;
p[i]=(w1*p1[i-1])+(w0*p1[i])+(w1*p1[i+1]);
}
for (i=1;i<n-1;i++) // avoid local min,max shifting too much
{
if (q[i]>0) // local max
{
if (p[i]<p[i-1]) p[i]=p[i-1]; // can not be lower then neigbours
if (p[i]<p[i+1]) p[i]=p[i+1];
}
if (q[i]<0) // local min
{
if (p[i]>p[i-1]) p[i]=p[i-1]; // can not be higher then neigbours
if (p[i]>p[i+1]) p[i]=p[i+1];
}
}
}
for (i0=0,i1=1;i1<n;i0=i1,i1++) // loop through all local min,max intervals
{
for (;(!q[i1])&&(i1<n-1);i1++); // <i0,i1>
// restore original local min,max
a0=p0[i0]; b0=p[i0];
a1=p0[i1]; b1=p[i1];
if (a0>a1)
{
dp=a0; a0=a1; a1=dp;
dp=b0; b0=b1; b1=dp;
}
b1-=b0;
if (b1>=1e-6)
for (dp=(a1-a0)/b1,i=i0;i<=i1;i++)
p[i]=a0+((p[i]-b0)*dp);
}
delete[] p0;
}
so p[n] is the input/output data. There are few things that can be tweaked like:
weights computation (constants 0.8 and 0.35 means weights are <0.8,0.8+0.35/2>)
number of smooth passes (now 5 in the for loop)
the bigger the weight the less the filtering 1.0 means no change
The main Idea behind is:
find local extremes
compute weights for smoothing
so near local extremes are almost none change of the output
smooth
repair dynamic range per each interval between all local extremes
[Notes]
I did also try to restore the area but that is incompatible with mine task because it distorts the shape a lot. So if you really need the area then focus on that and not on the shape. The smoothing causes signal to shrink mostly so after area restoration the shape rise on magnitude.
Actual filter state has none markable side shifting of shape (which was the main goal for me). Some images for more bumpy signal (the original filter was extremly poor on this):
As you can see no visible signal shape shifting. The local extremes has tendency to create sharp spikes after very heavy smoothing but that was expected
Hope it helps ...

equal power crossfade in Audio Unit?

This is actually more of a theoretical question, but here it goes:
I'm developing an effect audio unit and it needs an equal power crossfade between dry and wet signals.
But I'm confused about the right way to do the mapping function from the linear fader to the scaling factor (gain) for the signal amplitudes of dry and wet streams.
Basically, I'ev seen it done with cos / sin functions or square roots... essentially approximating logarithmic curves. But if our perception of amplitude is logarithmic to start with, shouldn't these curves mapping the fader position to an amplitude actually be exponential?
This is what I mean:
Assumptions:
signal[i] means the ith sample in a signal.
each sample is a float ranging [-1, 1] for amplitudes between [0,1].
our GUI control is an NSSlider ranging from [0,1], so it is in
principle linear.
fader is a variable with the value of the NSSlider.
First Observation:
We perceive amplitude in a logarithmic way. So if we have a linear fader and merely adjust a signal's amplitude by doing: signal[i] * fader what we are perceiving (hearing, regardless of the math) is something along the lines of:
This is the so-called crappy fader-effect: we go from silence to a drastic volume increase across the leftmost segment in the slider and past the middle the volume doesn't seem to get that louder.
So to do the fader "right", we instead either express it in a dB scale and then, as far as the signal is concerned, do: signal[i] * 10^(fader/20) or, if we were to keep or fader units in [0,1], we can do :signal[i] * (.001*10^(3*fader))
Either way, our new mapping from the NSSlider to the fader variable which we'll use for multiplying in our code, looks like this now:
Which is what we actually want, because since we perceive amplitude logarithmically, we are essentially mapping from linear (NSSLider range 0-1) to exponential and feeding this exponential output to our logarithmic perception. And it turns out that : log(10^x)=x so we end up perceiving the amplitude change in a linear (aka correct) way.
Great.
Now, my thought is that an equal-power crossfade between two signals (in this case a dry / wet horizontal NSSlider to mix together the input to the AU and the processed output from it) is essentially the same only that with one slider acting on both hypothetical signals dry[i] and wet[i].
So If my slider ranges from 0 to 100 and dry is full-left and wet is full-right), I'd end up with code along the lines of:
Float32 outputSample, wetSample, drySample = <assume proper initialization>
Float32 mixLevel = .01 * GetParameter(kParameterTypeMixLevel);
Float32 wetPowerLevel = .001 * pow(10, (mixLevel*3));
Float32 dryPowerLevel = .001 * pow(10, ((-3*mixLevel)+1));
outputSample = (wetSample * wetPowerLevel) + (drySample * dryPowerLevel);
The graph of which would be:
And same as before, because we perceive amplitude logarithmically, this exponential mapping should actually make it where we hear the crossfade as linear.
However, I've seen implementations of the crossfade using approximations to log curves. Meaning, instead:
But wouldn't these curves actually emphasize our logarithmic perception of amplitude?
The "equal power" crossfade you're thinking of has to do with keeping the total output power of your mix constant as you fade from wet to dry. Keeping total power constant serves as a reasonable approximation to keeping total perceived loudness constant (which in reality can be fairly complicated).
If you are crossfading between two uncorrelated signals of equal power, you can maintain a constant output power during the crossfade by using any two functions whose squared values sum to 1. A common example of this is the set of functions
g1(k) = ( 0.5 + 0.5*cos(pi*k) )^.5
g2(k) = ( 0.5 - 0.5*cos(pi*k) )^.5,
where 0 <= k <= 1 (note that g1(k)^2 + g2(k)^2 = 1 is satisfied, as mentioned). Here's a proof that this results in a constant power crossfade for uncorrelated signals:
Say we have two signals x1(t) and x2(t) with equal powers E[ x1(t)^2 ] = E[ x2(t)^2 ] = Px, which are also uncorrelated ( E[ x1(t)*x2(t) ] = 0 ). Note that any set of gain functions satisfying the previous condition will have that g2(k) = (1 - g1(k)^2)^.5. Now, forming the sum y(t) = g1(k)*x1(t) + g2(k)*x2(t), we have that:
E[ y(t)^2 ] = E[ (g1(k) * x1(t))^2 + 2*g1(k)*(1 - g1(k)^2)^.5 * x1(t) * x2(t) + (1 - g1(k)^2) * x2(t)^2 ]
= g1(k)^2 * E[ x1(t)^2 ] + 2*g1(k)*(1 - g1(k)^2)^.5 * E[ x1(t)*x2(t) ] + (1 - g1(k)^2) * E[ x2(t)^2 ]
= g1(k)^2 * Px + 0 + (1 - g1(k)^2) * Px = Px,
where we have used that g1(k) and g2(k) are deterministic and can thus be pulled outside the expectation operator E[ ], and that E[ x1(t)*x2(t) ] = 0 by definition because x1(t) and x2(t) are assumed to be uncorrelated. This means that no matter where we are in the crossfade (whatever k we choose) our output will still have the same power, Px, and thus hopefully equal perceived loudness.
Note that for completely correlated signals, you can achieve constant output power by doing a "linear" fade - using and two functions that sum to one ( g1(k) + g2(k) = 1 ). When mixing signals that are somewhat correlated, gain functions between those two would theoretically be appropriate.
What you're thinking of when you say
And same as before, because we perceive amplitude logarithmically,
this exponential mapping should actually make it where we hear the
crossfade as linear.
is that one signal should perceptually decrease in loudness as a linear function of slider position (k), while the other signal should perceptually increase in loudness as a linear function of slider position, when applying your derived crossfade. While your derivation of that seems pretty spot on, unfortunately that may not the best way to blend your dry and wet signals in terms of consistency - often, maintaining equal output loudness, regardless of slider position, is the better thing to shoot for. In any case, it might be worth trying a couple different functions to see what is most usable and consistent.

Confusion with FFT algorithm

I am trying to understand the FFT algorithm and so far I think that I understand the main concept behind it. However I am confused as to the difference between 'framesize' and 'window'.
Based on my understanding, it seems that they are redundant with each other? For example, I present as input a block of samples with a framesize of 1024. So I have byte[1024] presented as input.
What then is the purpose of the windowing function? Since initially, I thought the purpose of the windowing function is to select the block of samples from the original data.
Thanks!
What then is the purpose of the windowing function?
It's to deal with so-called "spectral leakage": the FFT assumes an infinite series that repeats the given sample frame over and over again. If you have a sine wave that is an integral number of cycles within the sample frame, then all is good, and the FFT gives you a nice narrow peak at the proper frequency. But if you have a sine wave that is not an integral number of cycles, there's a discontinuity between the last and first sample, and the FFT gives you false harmonics.
Windowing functions lower the amplitudes at the beginning and the end of the sample frame, to reduce the harmonics caused by this discontinuity.
some diagrams from a National Instruments webpage on windowing:
integral # of cycles:
non-integer # of cycles:
for additional information:
http://www.tmworld.com/article/322450-Windowing_Functions_Improve_FFT_Results_Part_I.php
http://zone.ni.com/reference/en-XX/help/371361B-01/lvanlsconcepts/char_smoothing_windows/
http://www.physik.uni-wuerzburg.de/~praktiku/Anleitung/Fremde/ANO14.pdf
A rectangular window of length M has frequency response of sin(ω*M/2)/sin(ω/2), which is zero when ω = 2*π*k/M, for k ≠ 0. For a DFT of length N, where ω = 2*π*n/N, there are nulls at n = k * N/M. The ratio N/M isn't necessarily an integer. For example, if N = 40, and M = 32, then there are nulls at multiples of 1.25, but only the integer multiples will appear in the DFT, which is bins 5, 10, 15, and 20 in this case.
Here's a plot of the 1024-point DFT of a 32-point rectangular window:
M = 32
N = 1024
w = ones(M)
W = rfft(w, N)
K = N/M
nulls = abs(W[K::K])
plot(abs(W))
plot(r_[K:N/2+1:K], nulls, 'ro')
xticks(r_[:512:64])
grid(); axis('tight')
Note the nulls at every N/M = 32 bins. If N=M (i.e. the window length equals the DFT length), then there are nulls at all bins except at n = 0.
When you multiply a window by a signal, the corresponding operation in the frequency domain is the circular convolution of the window's spectrum with the signal's spectrum. For example, the DTFT of a sinusoid is a weighted delta function (i.e. an impulse with infinite height, infinitesimal extension, and finite area) located at the positive and negative frequency of the sinusoid. Convolving a spectrum with a delta function just shifts it to the location of the delta and scales it by the delta's weight. Therefore when you multiply a window by a sinusoid in the sample domain, the window's frequency response is scaled and shifted to the frequency of the sinusoid.
There are a couple of scenarios to examine regarding the length of a rectangular window. First let's look at the case where the window length is an integer multiple of the sinusoid's period, e.g. a 32-sample rectangular window of a cosine with a period of 32/8 = 4 samples:
x1 = cos(2*pi*8*r_[:32]/32) # ω0 = 8π/16, bin 8/32 * 1024 = 256
X1 = rfft(x1 * w, 1024)
plot(abs(X1))
xticks(r_[:513:64])
grid(); axis('tight')
As before, there are nulls at multiples of N/M = 32. But the window's spectrum has been shifted to bin 256 of the sinusoid and scaled by its magnitude, which is 0.5 split between the positive frequency and the negative frequency (I'm only plotting positive frequencies). If the DFT length had been 32, the nulls would line up at every bin, prompting the appearance that there's no leakage. But that misleading appearance is only a function of the DFT length. If you pad the windowed signal with zeros (as above), you'll get to see the sinc-like response at frequencies between the nulls.
Now let's look at a case where the window length is not an integer multiple of the sinusoid's period, e.g. a cosine with an angular frequency of 7.5π/16 (the period is 64 samples):
x2 = cos(2*pi*15*r_[:32]/64) # ω0 = 7.5π/16, bin 15/64 * 1024 = 240
X2 = rfft(x2 * w, 1024)
plot(abs(X2))
xticks(r_[-16:513:64])
grid(); axis('tight')
The center bin location is no longer at an integer multiple of 32, but shifted by a half down to bin 240. So let's see what the corresponding 32-point DFT would look like (inferring a 32-point rectangular window). I'll compute and plot the 32-point DFT of x2[n] and also superimpose a 32x decimated copy of the 1024-point DFT:
X2_32 = rfft(x2, 32)
X2_sample = X2[::32]
stem(r_[:17],abs(X2_32))
plot(abs(X2_sample), 'rs') # red squares
grid(); axis([0,16,0,11])
As you can see in the previous plot, the nulls are no longer aligned at multiples of 32, so the magnitude of the 32-point DFT is non-zero at each bin. In the 32 point DFT, the window's nulls are still spaced every N/M = 32/32 = 1 bin, but since ω0 = 7.5π/16, the center is at 'bin' 7.5, which puts the nulls at 0.5, 1.5, etc, so they're not present in the 32-point DFT.
The general message is that spectral leakage of a windowed signal is always present but can be masked in the DFT if the signal specrtum, window length, and DFT length come together in just the right way to line up the nulls. Beyond that you should just ignore these DFT artifacts and concentrate on the DTFT of your signal (i.e. pad with zeros to sample the DTFT at higher resolution so you can clearly examine the leakage).
Spectral leakage caused by convolving with a window's spectrum will always be there, which is why the art of crafting particularly shaped windows is so important. The spectrum of each window type has been tailored for a specific task, such as dynamic range or sensitivity.
Here's an example comparing the output of a rectangular window vs a Hamming window:
from pylab import *
import wave
fs = 44100
M = 4096
N = 16384
# load a sample of guitar playing an open string 6
# with a fundamental frequency of 82.4 Hz
g = fromstring(wave.open('dist_gtr_6.wav').readframes(-1),
dtype='int16')
L = len(g)/4
g_t = g[L:L+M]
g_t = g_t / float64(max(abs(g_t)))
# compute the response with rectangular vs Hamming window
g_rect = rfft(g_t, N)
g_hamm = rfft(g_t * hamming(M), N)
def make_plot():
fmax = int(82.4 * 4.5 / fs * N) # 4 harmonics
subplot(211); title('Rectangular Window')
plot(abs(g_rect[:fmax])); grid(); axis('tight')
subplot(212); title('Hamming Window')
plot(abs(g_hamm[:fmax])); grid(); axis('tight')
if __name__ == "__main__":
make_plot()
If you don't modify the sample values, and select the same length of data as the FFT length, this is equivalent to using a rectangular window, in which case the frame and the window are identical. However multiplying your input data by a rectangular window in the time domain is the same as convolving the input signal's spectrum with a Sinc function in the frequency domain, which will spread any spectral peaks for frequencies which are not exactly periodic in the FFT aperture across the entire spectrum.
Non-rectangular windows are often used so the the resulting FFT spectrum is convolved with something a bit more "focused" than a Sinc function.
You can also use a rectangular window that is a different size than the FFT length or aperture. In the case of a shorter data window, the FFT frame can be zero padded, which can result in an smoother looking interpolated FFT result spectrum. You can even use a rectangular window that is longer that the length of the FFT by wrapping data around the FFT aperture in a summed circular manner for some interesting effects with the frequency resolution.
ADDED due to a request:
Multiplying by a window in the time domain produces the same result as convolving with the transform of that window in the frequency domain.
In general, a narrower time domain window with produce a wider looking frequency domain transform. This is the reason that zero-padding produces a smoother frequency plot. The narrower time domain window produces a wider Sinc with fatter and smoother curves in relation to the frame width than would a window the full width of the FFT frame, thus making the interpolated frequency results look smoother than an non-zero padded FFT of the same frame length.
The converse is also true to some extent. A wider rectangular window will produce a narrower Sinc, with the nulls closer to the peak. Thus you might be able to use a carefully chosen wider window to produce a narrower looking Sinc to null a frequency closer to a bin of interest than 1 frequency bin away. How do you use a wider window? Wrap the data around and sum, which is identical to using FT basis vectors that are not truncated to 1 FFT frame in length. However, since when doing this the FFT result vector is shorter than the data, this is a lossy process which will introduce artifacts, and introduce some new novel aliasing. But it will give you a sharper frequency selection peak at each bin, and notch filters that can be placed less than 1 bin away, say halfway between bins, etc.

Resources