I have been attempting to detect peaks in sinusoidal time-series data in real time, however I've had no success thus far. I cannot seem to find a real-time algorithm that works to detect peaks in sinusoidal signals with a reasonable level of accuracy. I either get no peaks detected, or I get a zillion points along the sine wave being detected as peaks.
What is a good real-time algorithm for input signals that resemble a sine wave, and may contain some random noise?
As a simple test case, consider a stationary, sine wave that is always the same frequency and amplitude. (The exact frequency and amplitude don't matter; I have arbitrarily chosen a frequency of 60 Hz, an amplitude of +/− 1 unit, at a sampling rate of 8 KS/s.) The following MATLAB code will generate such a sinusoidal signal:
dt = 1/8000;
t = (0:dt:(1-dt)/4)';
x = sin(2*pi*60*t);
Using the algorithm developed and published by Jean-Paul, I either get no peaks detected (left) or a zillion "peaks" detected (right):
I've tried just about every combination of values for these 3 parameters that I could think of, following the "rules of thumb" that Jean-Paul gives, but I have so far been unable to get my expected result.
I found an alternative algorithm, developed and published by Eli Billauer, that does give me the results that I want—e.g.:
Even though Eli Billauer's algorithm is much simpler and does tend to reliably produce the results that I want, it is not suitable for real-time applications.
As another example of a signal that I'd like to apply such an algorithm to, consider the test case given by Eli Billauer for his own algorithm:
t = 0:0.001:10;
x = 0.3*sin(t) + sin(1.3*t) + 0.9*sin(4.2*t) + 0.02*randn(1, 10001);
This is a more unusual (less uniform/regular) signal, with a varying frequency and amplitude, but still generally sinusoidal. The peaks are plainly obvious to the eye when plotted, but hard to identify with an algorithm.
What is a good real-time algorithm to correctly identify the peaks in a sinusoidal input signal? I am not really an expert when it comes to signal processing, so it would be helpful to get some rules of thumb that consider sinusoidal inputs. Or, perhaps I need to modify e.g. Jean-Paul's algorithm itself in order to work properly on sinusoidal signals. If that's the case, what modifications would be required, and how would I go about making these?
Case 1: sinusoid without noise
If your sinusoid does not contain any noise, you can use a very classic signal processing technique: taking the first derivative and detecting when it is equal to zero.
For example:
function signal = derivesignal( d )
% Identify signal
signal = zeros(size(d));
for i=2:length(d)
if d(i-1) > 0 && d(i) <= 0
signal(i) = +1; % peak detected
elseif d(i-1) < 0 && d(i) >= 0
signal(i) = -1; % trough detected
end
end
end
Using your example data:
% Generate data
dt = 1/8000;
t = (0:dt:(1-dt)/4)';
y = sin(2*pi*60*t);
% Add some trends
y(1:1000) = y(1:1000) + 0.001*(1:1000)';
y(1001:2000) = y(1001:2000) - 0.002*(1:1000)';
% Approximate first derivative (delta y / delta x)
d = [0; diff(y)];
% Identify signal
signal = derivesignal(d);
% Plot result
figure(1); clf; set(gcf,'Position',[0 0 677 600])
subplot(4,1,1); hold on;
title('Data');
plot(t,y);
subplot(4,1,2); hold on;
title('First derivative');
area(d);
ylim([-0.05, 0.05]);
subplot(4,1,3); hold on;
title('Signal (-1 for trough, +1 for peak)');
plot(t,signal); ylim([-1.5 1.5]);
subplot(4,1,4); hold on;
title('Signals marked on data');
markers = abs(signal) > 0;
plot(t,y); scatter(t(markers),y(markers),30,'or','MarkerFaceColor','red');
This yields:
This method will work extremely well for any type of sinusoid, with the only requirement that the input signal contains no noise.
Case 2: sinusoid with noise
As soon as your input signal contains noise, the derivative method will fail. For example:
% Generate data
dt = 1/8000;
t = (0:dt:(1-dt)/4)';
y = sin(2*pi*60*t);
% Add some trends
y(1:1000) = y(1:1000) + 0.001*(1:1000)';
y(1001:2000) = y(1001:2000) - 0.002*(1:1000)';
% Add some noise
y = y + 0.2.*randn(2000,1);
Will now generate this result because first differences amplify noise:
Now there are many ways to deal with noise, and the most standard way is to apply a moving average filter. One disadvantage of moving averages is that they are slow to adapt to new information, such that signals may be identified after they have occurred (moving averages have a lag).
Another very typical approach is to use Fourier Analysis to identify all the frequencies in your input data, disregard all low-amplitude and high-frequency sinusoids, and use the remaining sinusoid as a filter. The remaining sinusoid will be (largely) cleansed from the noise and you can then use first-differencing again to determine the peaks and troughs (or for a single sine wave you know the peaks and troughs happen at 1/4 and 3/4 pi of the phase). I suggest you pick up any signal processing theory book to learn more about this technique. Matlab also has some educational material about this.
If you want to use this algorithm in hardware, I would suggest you also take a look at WFLC (Weighted Fourier Linear Combiner) with e.g. 1 oscillator or PLL (Phase-Locked Loop) that can estimate the phase of a noisy wave without doing a full Fast Fourier Transform. You can find a Matlab algorithm for a phase-locked loop on Wikipedia.
I will suggest a slightly more sophisticated approach here that will identify the peaks and troughs in real-time: fitting a sine wave function to your data using moving least squares minimization with initial estimates from Fourier analysis.
Here is my function to do that:
function [result, peaks, troughs] = fitsine(y, t, eps)
% Fast fourier-transform
f = fft(y);
l = length(y);
p2 = abs(f/l);
p1 = p2(1:ceil(l/2+1));
p1(2:end-1) = 2*p1(2:end-1);
freq = (1/mean(diff(t)))*(0:ceil(l/2))/l;
% Find maximum amplitude and frequency
maxPeak = p1 == max(p1(2:end)); % disregard 0 frequency!
maxAmplitude = p1(maxPeak); % find maximum amplitude
maxFrequency = freq(maxPeak); % find maximum frequency
% Initialize guesses
p = [];
p(1) = mean(y); % vertical shift
p(2) = maxAmplitude; % amplitude estimate
p(3) = maxFrequency; % phase estimate
p(4) = 0; % phase shift (no guess)
p(5) = 0; % trend (no guess)
% Create model
f = #(p) p(1) + p(2)*sin( p(3)*2*pi*t+p(4) ) + p(5)*t;
ferror = #(p) sum((f(p) - y).^2);
% Nonlinear least squares
% If you have the Optimization toolbox, use [lsqcurvefit] instead!
options = optimset('MaxFunEvals',50000,'MaxIter',50000,'TolFun',1e-25);
[param,fval,exitflag,output] = fminsearch(ferror,p,options);
% Calculate result
result = f(param);
% Find peaks
peaks = abs(sin(param(3)*2*pi*t+param(4)) - 1) < eps;
% Find troughs
troughs = abs(sin(param(3)*2*pi*t+param(4)) + 1) < eps;
end
As you can see, I first perform a Fourier transform to find initial estimates of the amplitude and frequency of the data. I then fit a sinusoid to the data using the model a + b sin(ct + d) + et. The fitted values represent a sine wave of which I know that +1 and -1 are the peaks and troughs, respectively. I can therefore identify these values as the signals.
This works very well for sinusoids with (slowly changing) trends and general (white) noise:
% Generate data
dt = 1/8000;
t = (0:dt:(1-dt)/4)';
y = sin(2*pi*60*t);
% Add some trends
y(1:1000) = y(1:1000) + 0.001*(1:1000)';
y(1001:2000) = y(1001:2000) - 0.002*(1:1000)';
% Add some noise
y = y + 0.2.*randn(2000,1);
% Loop through data (moving window) and fit sine wave
window = 250; % How many data points to consider
interval = 10; % How often to estimate
result = nan(size(y));
signal = zeros(size(y));
for i = window+1:interval:length(y)
data = y(i-window:i); % Get data window
period = t(i-window:i); % Get time window
[output, peaks, troughs] = fitsine(data,period,0.01);
result(i-interval:i) = output(end-interval:end);
signal(i-interval:i) = peaks(end-interval:end) - troughs(end-interval:end);
end
% Plot result
figure(1); clf; set(gcf,'Position',[0 0 677 600])
subplot(4,1,1); hold on;
title('Data');
plot(t,y); xlim([0 max(t)]); ylim([-4 4]);
subplot(4,1,2); hold on;
title('Model fit');
plot(t,result,'-k'); xlim([0 max(t)]); ylim([-4 4]);
subplot(4,1,3); hold on;
title('Signal (-1 for trough, +1 for peak)');
plot(t,signal,'r','LineWidth',2); ylim([-1.5 1.5]);
subplot(4,1,4); hold on;
title('Signals marked on data');
markers = abs(signal) > 0;
plot(t,y,'-','Color',[0.1 0.1 0.1]);
scatter(t(markers),result(markers),30,'or','MarkerFaceColor','red');
xlim([0 max(t)]); ylim([-4 4]);
Main advantages of this approach are:
You have an actual model of your data, so you can predict signals in the future before they happen! (e.g. fix the model and calculate the result by inputting future time periods)
You don't need to estimate the model every period (see parameter interval in the code)
The disadvantage is that you need to select a lookback window, but you will have this problem with any method that you use for real-time detection.
Video demonstration
Data is the input data, Model fit is the fitted sine wave to the data (see code), Signal indicates the peaks and troughs and Signals marked on data gives an impression of how accurate the algorithm is. Note: watch the model fit adjust itself to the trend in the middle of the graph!
That should get you started. There are also a lot of excellent books on signal detection theory (just google that term), which will go much further into these types of techniques. Good luck!
Consider using findpeaks, it is fast, which may be important for realtime. You should filter high-frequency noise to improve accuracy. here I smooth the data with a moving window.
t = 0:0.001:10;
x = 0.3*sin(t) + sin(1.3*t) + 0.9*sin(4.2*t) + 0.02*randn(1, 10001);
[~,iPeak0] = findpeaks(movmean(x,100),'MinPeakProminence',0.5);
You can time the process (0.0015sec)
f0 = #() findpeaks(movmean(x,100),'MinPeakProminence',0.5)
disp(timeit(f0,2))
To compare, processing the slope is only a bit faster (0.00013sec), but findpeaks have many useful options, such as minimum interval between peaks etc.
iPeaks1 = derivePeaks(x);
f1 = #() derivePeaks(x)
disp(timeit(f1,1))
Where derivePeaks is:
function iPeak1 = derivePeaks(x)
xSmooth = movmean(x,100);
goingUp = find(diff(movmean(xSmooth,100)) > 0);
iPeak1 = unique(goingUp([1,find(diff(goingUp) > 100),end]));
iPeak1(iPeak1 == 1 | iPeak1 == length(iPeak1)) = [];
end
I need an algorithm (preferable in a Pascal-like language, but it the end doesn't really matter) that will make the "signal" (actually a series of data points) in left look like the one in right.
Signal origin:
The signal is generated by a machine. Oversimplifying the explanation, the machine is measuring the density of a liquid flowing through a transparent tube. So, the signal is nothing similar to an electrical signal (audio/radio frequency).
The data points could look like this: [1, 2, 1, 3, 4, 5, 4, 3, 2, 1, 13, 14, 15, 18, 23, 19, 17, 15, 15, 15, 14, 11, 9, 4, 1, 1, 2, 2, 1, 2]
What I need:
I need to accurately detect the 'peaks'. For this, I already have a piece of code but it is not working on poor signals as the one shown in the image below.
I think we can see this, as a signal that was accidentally passed through a low-pass filter, and now I want to restore it.
Notes:
There are 4 signals, but they are separated so they can be analyzed individually. So, it is enough to think about how to process just one of them.
After a peak, if the signal is not coming down fast enough we can consider that there are multiple peaks (you can best see that in the 'red' signal at the end of the series).
The advantage is that the whole series is available (so the signal is not in real time, it is already stored on file)!
[Edit by Spektre] I extracted Red sample points from the image
float f0[]={ 73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,72,71,69,68,66,64,62,58,54,49,41,33,25,17,13,15,21,30,39,47,54,59,62,64,66,67,68,69,70,71,71,72,72,72,71,71,70,69,68,67,66,65,63,62,60,56,51,45,37,29,22,18,18,22,28,33,35,36,35,32,26,20,15,12,15,20,26,31,35,37,36,34,30,25,22,22,27,33,41,48,55,60,63,66,67,68,69,70,71,72,72,73,73,73,73,73,73,72,71,70,69,67,65,63,60,55,49,40,30,21,13, 7,10,17,27,36,45,52,56,59,60,61,62,62,62,62,61,61,59,57,53,47,40,32,24,18,15,18,23,28,32,33,31,28,23,16,10, 6, 8,13,20,27,31,32,31,28,22,15,10, 6,10,16,23,30,34,36,36,34,29,24,20,19,24,30,37,44,51,56,59,61,62,63,64,64,64,65,64,64,62,60,57,53,48,43,38,36,39,43,49,54,59,63,66,68,69,70,71,72,72,73,73,73,73,73,73,73,73,73,73,73,73,73,73 };
float f1[]={ 55,58,60,62,64,66,67,68,68,69,69,70,71,72,72,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,73,72,72,72,72,72,72,72,72,72,72,72,72,72,72,72,73,73,73,72,72,72,72,71,71,71,71,71,71,70,70,69,68,67,66,64,63,61,60,59,57,55,52,49,46,43,40,37,35,34,33,32,32,33,34,36,37,39,41,43,45,47,50,52,55,57,59,60,61,61,62,62,62,62,61,61,60,58,57,55,53,51,49,48,46,44,42,40,38,35,32,30,29,28,27,27,26,26,26,25,25,24,23,23,23,24,24,25,25,26,26,26,27,28,29,31,33,35,38,40,41,43,44,46,48,50,53,55,57,59,60,61,62,63,64,64,65,65,64,63,61,59,57,54,52,50,47,45,42,39,37,34,32,31,30,30,30,31,32,34,36,37,39,40,41,42,43,44,44,44,44,43,42,41,40,38,36,34,32,30,28,26,25,24,23,22,21,20,18,17,17,17,17,18,18,18,18,18,18,18,18,18,18,19,19,19,19,20,20,21,23,24,25,26,26,26,27,28,29,31,34,36,37,38,40,41,43,45,47,48,49,50,51,51,51,50,49,49,48,48,47,47,47,47,47,47,48,60 };
const int n=sizeof(f0)/sizeof(f0[0]);
All the values need to be transformed:
f0[i] = 73.0-f0[i];
f1[i] = 73.0-f1[i];
To offset back from image... f0 is the original Red signal and f1 is the distorted Yellow one.
This is the closest I get to with first order FIR filter:
The upper half is plot of used FIR filter weights (editable by mouse so the FIR is hand drawed for fast weights find...). Bellow are the signal plots:
Red original (Ideal) signal f0
Dark green measured signal f1
Light Green FIR filtered Ideal signal f2
The FIR filter is just convolution where zero offset element is the last and here the weight values:
float fir[35] = { 0.0007506932, 0.0007506932, 0.0007506932, 0.0007506932, 0.0007506932, 0.0007506932, 0.0007506932, 0.0007506932, 0.0007506932, 0.003784107, 0.007575874, 0.01060929, 0.01591776, 0.02198459, 0.03032647, 0.04170178, 0.05686884, 0.06445237, 0.06900249, 0.07203591, 0.07203591, 0.0705192, 0.06900249, 0.06672744, 0.06217732, 0.05611049, 0.04928531, 0.04170178, 0.03335989, 0.02653471, 0.02046788, 0.01515941, 0.009850933, 0.005300813, 0.0007506932 };
So either there is also some higher degree of FIR or the weights need to be tweaked a bit more. Anyway this should be enough for the deconvolution and or fitting ... btw the FIR filter is done as follows:
const int fir_n=35; // size (exposure time) [samples]
const int fir_x=fir_n-1; // zero offset element [samples]
int x,y,i,j,ii;
float a,f2[n];
for (i=0;i<n;i++)
for (f2[i]=0.0,ii=i-fir_x,j=0;j<fir_n;j++,ii++)
if ((ii>=0)&&(ii<n)) f2[i]+=fir[j]*f0[ii];
In my opinion, you should start with a pretty simple system identification and consecutive signal reconstruction. Also, I would recommend to implement your algorithm first in a mathematical prototyping tool like Matlab (commerical license) or Octave (free → https://www.gnu.org/software/octave/download.html). These tools provide an ease of signal processing no programming language like Pascal or Java could ever offer, no matter what library you use. After you successfully designed your algorithm with Matlab or Octave, then think about how to implement it with Pascal.
Lets assume the behaviour of the tube can be characterized by a linear time-invariant system (e.g. a linear lowpass filter). This is by no means guaranteed but a worthwhile approach (at least until it fails:) ). Following the same approach for non-linear and/or time-variant systems becomes pretty involved and you will need professional help to do this I would imagine.
If I understood your description correctly, you have access to both the input and output signals of the tube. If I am wrong and you don’t know the input signal, you might be able to apply some calibration signal first, whose characteristics you know, and record the output signal. Knowing input and output signals is a prerequisite for the following approach. Without both signals you can not approximate the impulse response h of the tube. After calculating an approximation of h, we can design an inverse filer called ge and eventually reconstruct the input from the output signal.
Here is the signal flow of a input signal x[n] passing your tube described by h producing the output signal y[n]. Taking y[n] and applying an inverse filtering operation described by ge we obtain xr[n]
x[n] →| h | → y[n] → | ge | → xr[n]
Take an input vector x of length N and a corresponding vector of y of the same length.
Now you express the output y as a convolution of an input convolution matrix X (see the code below for its implementation) with the unknown impulse response of your system, i.e.
y = X * h
with the vector and matrix sizes y = N x 1, X = N x N and h = N x 1
You can calculate a least-squares approximation of the impulse response he, by calculating
he = inv(X'*X)*X' * y
where X' describes the transpose and inv() the matrix inverse of X. he represents a column vector of the identified impulse response of your tube which we obtained via a 1-D deconvolution. You can check how well the identification worked by calculating the output of your estimated system,
ye = X * he
and by comparing ye and y. Now, we try to reconstruct x from y and he. The reconstructed input vector xr is calculated by
xr = Ge * y
where Ge = inv(He) and He is the N x N convolution matrix of he.
Here is some Octave code. Copy both functions in their own dedicated file (reconstruct.m and getConvolutionMatrix.m) and type "reconstruct.m" into the Octave command line to examine the outputs of the example. Please note the sample code works only for vector of odd length (N is odd). Play around with the size N of your vectors. This might help the approximation accuracy.
function [Ge] = reconstruct ()
x = [1 2 3]'; # input signal
# h = [1 3 2]'; # unknown impulse response
y = [5 11 13]'; # output signal
y = y + 0.001*randn(length(y),1) # add noise to output signal
Xm = getConvolutionMatrix(x)
Xps = inv(Xm'*Xm)*Xm';
he = Xps * y
He = getConvolutionMatrix(he);
Ge = inv(He);
# reconstructed impulse signal
xr = Ge*y
endfunction
function [mH] = getConvolutionMatrix(h)
h = h(:)';
hlen = length(h);
Nc = (hlen-1)/2;
mH= zeros(hlen, hlen);
hp = [zeros(1,Nc) h zeros(1,Nc)];
for c=1:hlen
for r=1:hlen
mH(r,c) = hp(r+hlen-c);
end
end
endfunction