Convert a time-domain signal into a frequency-domain signal

I have represented the acceleration data of 3 axes (x, y, and z) in the time domain, as shown in the graph.
I would like to extract some measurements (e.g. mean, energy, entropy, and correlation) from the acceleration data in the frequency domain. Therefore, I applied the FFT in order to convert the time-domain signal into a frequency-domain signal.
xx = fft(x_Segments{1});
plot(xx)
yy = fft(y_Segments{1});
plot(yy)
zz = fft(z_Segments{1});
plot(zz)
However, the resulting graphs make no sense, which is not what I expected at all. This is an example of the frequency-domain signal of the x-axis.
Note: x_Segments, y_Segments, and z_Segments contain the data of the X, Y, and Z axes in the time domain, respectively.
As we can see, the graph doesn't make sense. Are there any steps I should follow before using the FFT function to get the expected frequency-domain signal?
I really appreciate your help. Thanks.
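For reference, the usual recipe is to remove the DC offset (gravity shows up as a huge spike at 0 Hz), take the FFT, and plot the magnitude against a real frequency axis, rather than passing the raw complex FFT output straight to plot. A minimal sketch in Python/NumPy (the same steps apply to MATLAB's fft; the sampling rate fs is an assumed parameter):

import numpy as np
import matplotlib.pyplot as plt

def plot_magnitude_spectrum(x, fs):
    """Plot the one-sided magnitude spectrum of a real signal x sampled at fs Hz."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                          # remove the DC offset (gravity component)
    n = len(x)
    spectrum = np.fft.rfft(x)                 # FFT of a real signal: bins 0 .. fs/2
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)    # matching frequency axis in Hz
    plt.plot(freqs, np.abs(spectrum) / n)     # plot the magnitude, not the raw complex values
    plt.xlabel("Frequency (Hz)")
    plt.ylabel("Magnitude")
    plt.show()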

Related

power sampling in GNU Radio

I am using GNU Radio with a HackRF. I need to pick out, for each frequency, readings at a selected decibel level (or above a certain decibel threshold) and save the frequency/dB pairs to a file.
Trying to solve this, I decided to recreate the "QT GUI Frequency Sink" algorithm in an Embedded Python Block, but unfortunately I lack the knowledge of how to convert complex64 data to an FFT frequency/amplitude signal. I have been hitting a wall for several months and would be glad of any advice.
First off: the HackRF is not a calibrated measurement device. You need a calibrated device to know, at any given gain, sampling rate, and RF frequency, what a digital power corresponds to in physical units. No exceptions.
but unfortunately I lack the knowledge of how to convert complex64 data to an FFT frequency/amplitude signal.
You literally just apply an FFT of the length of your choosing, then calculate the magnitude of each complex value in the result. There's nothing more to it. (If the fact that the FFT is an algorithm that takes vectors of complex numbers and produces vectors of complex numbers of the same size confuses you, you don't have a programming problem but a math-basics problem!)
In GNU Radio, you can ask for your signal to come in multiples of any given length, so getting vectors of input samples to transform is trivial Python slicing.
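To make that concrete, here is a minimal sketch (a reconstruction, not the actual QT GUI Frequency Sink code) of turning one frame of complex64 samples into frequency/amplitude pairs; samp_rate, center_freq, and the relative-dB scaling are assumptions:

import numpy as np

def frame_to_spectrum(samples, samp_rate, center_freq, fft_len=1024):
    """Turn one frame of complex64 samples into (frequency in Hz, amplitude in dB) arrays."""
    frame = samples[:fft_len] * np.hanning(fft_len)       # window to reduce spectral leakage
    spectrum = np.fft.fftshift(np.fft.fft(frame))         # shift so DC sits in the middle
    amp_db = 20.0 * np.log10(np.abs(spectrum) + 1e-12)    # uncalibrated, i.e. relative dB
    freqs = center_freq + np.fft.fftshift(np.fft.fftfreq(fft_len, d=1.0 / samp_rate))
    return freqs, amp_db

Frequencies whose amplitude exceeds your threshold can then be selected with a boolean mask and written to a file, e.g. with np.savetxt.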

Understanding Bayesian updating: application to distance sensors

I would need to measure the actual distance D of a wall using a distance sensor (both the sensor and the wall are fixed, so D is constant in time). From Bayesian updating it follows that P(D|X) ∝ P(X|D)P(D), where P(D) is the prior, P(D|X) is the posterior, and P(X|D) is the likelihood of the measurement X given D, whose distribution is estimated from a sample of 50 values recorded with the distance sensor.
I have the following question: after replacing the prior with the posterior, can I repeat the update by re-using the same sensor? That would mean P(D'|X') ∝ P(X'|D')P(D'), with P(D') = P(D|X).
My answer would be no. Indeed, as the sensor is the same, the two measurements X' and X are governed by the same random variable. I am not sure whether my understanding of Bayes' theorem is correct, but I would imagine that a second update could be done only if I were using a different sensor, whose measurement is totally independent of the measurement of the first sensor. However, if the two sensors were based on the same technology and shared the same manufacturing process, I guess their readings (X' and X) would be somewhat correlated, and again I wouldn't know how to proceed in that case.
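For reference, the textbook sequential-update step assumes the measurements are conditionally independent given D, i.e. P(X'|D, X) = P(X'|D). Under that assumption, P(D|X, X') ∝ P(X'|D) P(X|D) P(D), which is exactly the "replace the prior with the posterior" recipe; whether two readings from the same sensor satisfy that conditional independence is the crux of the question.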

Produce a similarity ranking for a set of time series signals

Given a set of 24-hour signals with hourly data points representing energy consumption patterns, how can I give each one a similarity score? The peaks vary in their height, width, and placement in the signal. The aim is to rank the signals such that those with closer scores are more similar than those with more distant scores (like how age or income works); i.e., the gap between the scores of two similar signals should be smaller than the gap between the scores of two dissimilar ones.
Finding the correlation of each signal with a base case was not adequate, since a signal with a high peak in the morning and a low one in the afternoon would be classified as similar to a signal with peaks in the opposite pattern. Correlating the returns (hour-to-hour differences) was also not suitable. The same issue arose when using the RMSE between each signal and a base case.
After some thought, I attempted to find the peaks of a signal and then score a peak in the following way:
public double score() {
    // start, max, and end are the start, peak, and end times of the peak;
    // startHeight, maxHeight, and endHeight are the signal values at those times.
    int b1 = max - start;                 // width of the rising flank
    int b2 = end - max;                   // width of the falling flank
    double h1 = maxHeight - startHeight;  // rise of the leading edge
    double h2 = maxHeight - endHeight;    // drop of the trailing edge
    double a1 = 0.5 * h1 * b1;            // triangle area of the rising flank
    double a2 = 0.5 * h2 * b2;            // triangle area of the falling flank
    // Note: the areas a1/a2 are computed but not yet used by the score.
    return Math.sqrt(Math.pow(h1, 2) + Math.pow(h2, 2));
}
Where start, max, and end represent the start, max, and end times of the peak, respectively.
I think this could be a working method; however, I'm having difficulty finding the peaks themselves. All the methods I've tried have some flaws.
I have tried the method in this post: Peak signal detection in realtime timeseries data
Some peaks were detected as starting too early. Since some peaks can persist for several hours, I tried making the lag longer. However, if the lag was too long, peaks beginning before time = lag were missed.
I also tried using the standard deviation of the gradient as a signal that a peak was beginning: if the gradient at a given point exceeds factor * stdev(all gradients), a peak is beginning (factor was 0.6).
This failed when certain signals had one very steep peak in the evening and a shallower one in the morning (or vice versa): the stdev of the gradient would be too high, and the algorithm missed the shallower peak. If I made the factor low enough to pick up the shallow peak as well, false peaks were detected.
Inspired by the method in the post above, I tried using a moving stdev of the gradient. However, this algorithm still misses some peaks.
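For concreteness, a minimal Python sketch of the gradient-threshold idea described above (a reconstruction, not the exact code used; window and factor are assumptions):

import numpy as np

def peak_start_indices(y, window=6, factor=0.6):
    """Flag indices where the gradient exceeds factor times a moving stdev of the gradient."""
    g = np.gradient(np.asarray(y, dtype=float))   # hour-to-hour slope estimate
    starts = []
    for i in range(window, len(g)):
        local_std = np.std(g[i - window:i])       # stdev of recent gradients only
        if local_std > 0 and g[i] > factor * local_std:
            starts.append(i)                      # candidate peak onset
    return starts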

Algorithm for smoothing wifi signal strength

When measuring the strength of a Wifi signal between two static points, the measurement constantly fluctuates due to environmental factors.
What is a good algorithm to smooth out small fluctuations and detect significant changes? An exponential moving average?
Some sort of low pass filtering usually works for things like this:
y[i] = alpha * x[i] + (1-alpha) * y[i-1]
where alpha is chosen based on the amount of smoothing desired. x contains the raw input samples and y contains the filtered result.
The exponential moving average is a good way of estimating the current true value of the signal; as you can see above, it has popped up under a number of disguises with a number of different justifications.
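In code, the filter is one line per sample; a minimal Python sketch (alpha = 0.2 is just an example value):

def ema(samples, alpha=0.2):
    """Exponential moving average: y[i] = alpha*x[i] + (1-alpha)*y[i-1]."""
    smoothed = []
    prev = samples[0]       # seed the filter with the first sample
    for x in samples:
        prev = alpha * x + (1 - alpha) * prev
        smoothed.append(prev)
    return smoothed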
The problem of detecting significant changes is slightly different, and has been studied as part of statistical quality control. One simple tool for this is CUSUM: http://en.wikipedia.org/wiki/CUSUM. The Wikipedia page tells you enough to implement it, but not how to set W in S[n+1] = min(0, S[n] + X[n] - W), or what value of S[n] means that it has detected something. You could search further than I have, look in texts such as "Introduction to Statistical Quality Control" by Montgomery, or just grab lots of data and see what works in real life.
I would start by setting W to the average of (a) the typical long-term signal strength when everything is OK and (b) the first long-term signal strength that should make you actually do something, and then plot the results on historical data to see whether it looks sane and, if so, what value of S[n] should trigger action. (X[n] is, of course, the raw measured signal strength.)
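A minimal Python sketch of that one-sided CUSUM, assuming drops in signal strength are what matter (W and the alarm level are the tuning knobs discussed above):

def cusum_drops(samples, W, alarm_level):
    """One-sided CUSUM: S[n+1] = min(0, S[n] + X[n] - W); alarm when S falls below -alarm_level."""
    s = 0.0
    alarms = []
    for n, x in enumerate(samples):
        s = min(0.0, s + x - W)
        if s < -alarm_level:
            alarms.append(n)   # significant sustained drop detected at sample n
            s = 0.0            # reset after raising an alarm
    return alarms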

Peak detection of measured signal

We use a data acquisition card to take readings from a device that increases its signal to a peak and then falls back to near the original value. To find the peak value we currently search the array for the highest reading and use the index to determine the timing of the peak value which is used in our calculations.
This works well if the highest value is the peak we are looking for but if the device is not working correctly we can see a second peak which can be higher than the initial peak. We take 10 readings a second from 16 devices over a 90 second period.
My initial thought is to cycle through the readings, checking whether the previous and next points are lower than the current one, and to construct an array of peaks. Maybe we should be looking at an average of a number of points on either side of the current position, to allow for noise in the system. Is this the best way to proceed, or are there better techniques?
We do use LabVIEW, and I have checked the LAVA forums, where there are a number of interesting examples. This is part of our test software, and we are trying to avoid using too many non-standard VI libraries, so I was hoping for feedback on the process/algorithms involved rather than specific code.
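For what it's worth, a minimal Python sketch of the neighbour-comparison idea with a small averaging window (the window size is an assumption):

def find_candidate_peaks(readings, half_window=2):
    """Smooth each point with its neighbours, then keep indices that beat both neighbours."""
    n = len(readings)
    smoothed = []
    for i in range(n):
        lo, hi = max(0, i - half_window), min(n, i + half_window + 1)
        smoothed.append(sum(readings[lo:hi]) / (hi - lo))
    return [i for i in range(1, n - 1)
            if smoothed[i] > smoothed[i - 1] and smoothed[i] > smoothed[i + 1]]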
There are lots and lots of classic peak detection methods, any of which might work. You'll have to see what, in particular, bounds the quality of your data. Here are basic descriptions:
1. Between any two points in your data, (x(0), y(0)) and (x(n), y(n)), add up |y(i + 1) - y(i)| for 0 <= i < n and call this T ("travel"), and set R ("rise") to y(n) - y(0) + k for suitably small k. T/R > 1 indicates a peak. This works OK if large travel due to noise is unlikely, or if noise distributes symmetrically around a base curve shape. For your application, accept the earliest peak with a score above a given threshold, or analyze the curve of travel-per-rise values for more interesting properties.
2. Use matched filters to score similarity to a standard peak shape (essentially, use a normalized dot product against some shape to get a cosine metric of similarity).
3. Deconvolve against a standard peak shape and check for high values (though I often find method 2 to be less sensitive to noise for simple instrumentation output).
4. Smooth the data and check for triplets of equally spaced points where, if x0 < x1 < x2, y1 > 0.5 * (y0 + y2), or check Euclidean distances like this: D((x0, y0), (x1, y1)) + D((x1, y1), (x2, y2)) > D((x0, y0), (x2, y2)), which relies on the triangle inequality. Using simple ratios will again provide you with a scoring mechanism.
5. Fit a very simple 2-Gaussian mixture model to your data (for example, Numerical Recipes has a nice ready-made chunk of code). Take the earlier peak. This will deal correctly with overlapping peaks.
6. Find the best match in the data to a simple Gaussian, Cauchy, Poisson, or what-have-you curve. Evaluate this curve over a broad range and subtract it from a copy of the data after noting its peak location. Repeat. Take the earliest peak whose model parameters (standard deviation probably, but some applications might care about kurtosis or other features) meet some criterion. Watch out for artifacts left behind when peaks are subtracted from the data.
Best match might be determined by the kind of match scoring suggested in #2 above.
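A minimal Python sketch of the travel-per-rise score from method 1 (k is the small constant mentioned there; windowing and thresholding are left to the caller):

def travel_per_rise(y, k=1e-6):
    """Method 1: total absolute travel divided by net rise; values well above 1 suggest a peak."""
    travel = sum(abs(y[i + 1] - y[i]) for i in range(len(y) - 1))
    rise = y[-1] - y[0] + k    # k keeps the ratio defined on flat segments
    return travel / rise

Scanning this over sliding windows and accepting the earliest window whose score exceeds a threshold matches the "accept the earliest peak" suggestion above.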
I've done what you're doing before: finding peaks in DNA sequence data, finding peaks in derivatives estimated from measured curves, and finding peaks in histograms.
I encourage you to attend carefully to proper baselining. Wiener filtering or other filtering or simple histogram analysis is often an easy way to baseline in the presence of noise.
Finally, if your data is typically noisy and you're getting data off the card as unreferenced single-ended output (or even referenced, just not differential), and if you're averaging lots of observations into each data point, try sorting those observations and throwing away the first and last quartile and averaging what remains. There are a host of such outlier elimination tactics that can be really useful.
You could try signal averaging, i.e. for each point, average the value with the surrounding 3 or more points. If the noise blips are huge, then even this may not help.
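E.g., assuming NumPy, a centred moving average:

import numpy as np

def signal_average(y, width=3):
    """Replace each point with the mean of a width-sized window around it."""
    kernel = np.ones(width) / width
    return np.convolve(y, kernel, mode="same")   # "same" keeps the original length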
I realise that this was language-agnostic, but guessing that you are using LabVIEW, there are lots of pre-packaged signal processing VIs that come with LabVIEW that you can use to do smoothing and noise reduction. The NI forums are a great place to get more specialised help on this sort of thing.
This problem has been studied in some detail.
There is a set of very up-to-date implementations in the TSpectrum* classes of ROOT (a nuclear/particle-physics analysis tool). The code works on one- to three-dimensional data.
The ROOT source code is available, so you can grab this implementation if you want.
From the TSpectrum class documentation:
The algorithms used in this class have been published in the following references:
[1] M. Morhac et al.: Background elimination methods for multidimensional coincidence gamma-ray spectra. Nuclear Instruments and Methods in Physics Research A 401 (1997) 113-132.
[2] M. Morhac et al.: Efficient one- and two-dimensional Gold deconvolution and its application to gamma-ray spectra decomposition. Nuclear Instruments and Methods in Physics Research A 401 (1997) 385-408.
[3] M. Morhac et al.: Identification of peaks in multidimensional coincidence gamma-ray spectra. Nuclear Instruments and Methods in Physics Research A 443 (2000) 108-125.
The papers are linked from the class documentation for those of you who don't have a NIM online subscription.
The short version of what is done is that the histogram is flattened to eliminate noise, and then local maxima are detected by brute force in the flattened histogram.
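A crude Python sketch of that flatten-then-scan idea (not the actual TSpectrum code; the background width is an assumption):

import numpy as np

def flatten_and_find_maxima(y, background_width=51):
    """Estimate a slowly varying background, subtract it, then scan for local maxima."""
    y = np.asarray(y, dtype=float)
    kernel = np.ones(background_width) / background_width
    background = np.convolve(y, kernel, mode="same")   # heavy smoothing approximates the background
    flattened = y - background
    return [i for i in range(1, len(y) - 1)
            if flattened[i] > flattened[i - 1] and flattened[i] > flattened[i + 1]]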
I would like to contribute to this thread an algorithm that I have developed myself:
It is based on the principle of dispersion: if a new datapoint is a given x number of standard deviations away from some moving mean, the algorithm signals (also called z-score). The algorithm is very robust because it constructs a separate moving mean and deviation, such that signals do not corrupt the threshold. Future signals are therefore identified with approximately the same accuracy, regardless of the amount of previous signals. The algorithm takes 3 inputs: lag = the lag of the moving window, threshold = the z-score at which the algorithm signals and influence = the influence (between 0 and 1) of new signals on the mean and standard deviation. For example, a lag of 5 will use the last 5 observations to smooth the data. A threshold of 3.5 will signal if a datapoint is 3.5 standard deviations away from the moving mean. And an influence of 0.5 gives signals half of the influence that normal datapoints have. Likewise, an influence of 0 ignores signals completely for recalculating the new threshold: an influence of 0 is therefore the most robust option.
It works as follows:
Pseudocode
# Let y be a vector of timeseries data of at least length lag+2
# Let mean() be a function that calculates the mean
# Let std() be a function that calculates the standard deviation
# Let absolute() be the absolute value function
# Settings (the ones below are examples: choose what is best for your data)
set lag to 5; # lag 5 for the smoothing functions
set threshold to 3.5; # 3.5 standard deviations for signal
set influence to 0.5; # between 0 and 1, where 1 is normal influence, 0.5 is half
# Initialise variables
set signals to vector 0,...,0 of length of y; # Initialise signal results
set filteredY to y(1,...,lag); # Initialise filtered series
set avgFilter to null; # Initialise average filter
set stdFilter to null; # Initialise std. filter
set avgFilter(lag) to mean(y(1,...,lag)); # Initialise first value
set stdFilter(lag) to std(y(1,...,lag)); # Initialise first value
for i=lag+1,...,t do
if absolute(y(i) - avgFilter(i-1)) > threshold*stdFilter(i-1) then
if y(i) > avgFilter(i-1)
set signals(i) to +1; # Positive signal
else
set signals(i) to -1; # Negative signal
end
# Adjust the filters
set filteredY(i) to influence*y(i) + (1-influence)*filteredY(i-1);
set avgFilter(i) to mean(filteredY(i-lag,i),lag);
set stdFilter(i) to std(filteredY(i-lag,i),lag);
else
set signals(i) to 0; # No signal
# Adjust the filters
set filteredY(i) to y(i);
set avgFilter(i) to mean(filteredY(i-lag,i),lag);
set stdFilter(i) to std(filteredY(i-lag,i),lag);
end
end
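A direct Python/NumPy transcription of the pseudocode above (implementations in many other languages are collected in the original answer):

import numpy as np

def thresholding_algo(y, lag, threshold, influence):
    """Signal +1/-1/0 per point: z-score against a lagging, signal-damped mean and stdev."""
    y = np.asarray(y, dtype=float)
    signals = np.zeros(len(y))
    filtered_y = y.copy()
    avg_filter = np.zeros(len(y))
    std_filter = np.zeros(len(y))
    avg_filter[lag - 1] = np.mean(y[:lag])
    std_filter[lag - 1] = np.std(y[:lag])
    for i in range(lag, len(y)):
        if abs(y[i] - avg_filter[i - 1]) > threshold * std_filter[i - 1]:
            signals[i] = 1 if y[i] > avg_filter[i - 1] else -1
            # Damp the influence of signals on the running mean/stdev
            filtered_y[i] = influence * y[i] + (1 - influence) * filtered_y[i - 1]
        else:
            signals[i] = 0
            filtered_y[i] = y[i]
        avg_filter[i] = np.mean(filtered_y[i - lag + 1:i + 1])
        std_filter[i] = np.std(filtered_y[i - lag + 1:i + 1])
    return signals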
For more information and a demo, see the original answer.
This method is basically from David Marr's book "Vision":
Gaussian-blur your signal with the expected width of your peaks. This gets rid of noise spikes, and your phase data is undamaged.
Then edge-detect (a Laplacian of Gaussian will do). The edges you find will be the edges of features (like peaks).
Look between edges for peaks, sort the peaks by size, and you're done.
I have used variations on this, and they work very well.
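A rough Python sketch of those steps using SciPy (sigma, the expected peak width, is an assumption):

import numpy as np
from scipy.ndimage import gaussian_filter1d, gaussian_laplace

def marr_style_peaks(y, sigma=3.0):
    """Blur at the expected peak width, edge-detect with a LoG, then look between edges."""
    y = np.asarray(y, dtype=float)
    smoothed = gaussian_filter1d(y, sigma)             # Gaussian blur kills noise spikes
    log = gaussian_laplace(y, sigma)                   # Laplacian of Gaussian edge detector
    edges = [i for i in range(1, len(log)) if log[i - 1] * log[i] < 0]   # zero crossings
    peaks = [a + int(np.argmax(smoothed[a:b]))         # highest point between adjacent edges
             for a, b in zip(edges, edges[1:]) if b - a > 1]
    return sorted(peaks, key=lambda i: smoothed[i], reverse=True)   # sort peaks by size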
I think you want to cross-correlate your signal with an expected, exemplar signal. But it has been such a long time since I studied signal processing, and even then I didn't take much notice.
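In NumPy terms (the exemplar is an assumed template of the expected peak shape), that idea looks like:

import numpy as np

def best_match_offset(signal, exemplar):
    """Slide the exemplar along the signal; the lag with the highest score is the best match."""
    s = np.asarray(signal, dtype=float)
    e = np.asarray(exemplar, dtype=float)
    scores = np.correlate(s - s.mean(), e - e.mean(), mode="valid")
    return int(np.argmax(scores))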
I don't know very much about instrumentation, so this might be totally impractical, but then again it might be a helpful different direction. If you know how the readings can fail, and there is a certain interval between peaks given such failures, why not do gradient descent at each interval? If the descent brings you back to an area you've searched before, you can abandon it. Depending upon the shape of the sampled surface, this might also help you find peaks faster than search.
Is there a qualitative difference between the desired peak and the unwanted second peak? If both peaks are "sharp" -- i.e. short in time duration -- when looking at the signal in the frequency domain (by doing FFT) you'll get energy at most bands. But if the "good" peak reliably has energy present at frequencies not existing in the "bad" peak, or vice versa, you may be able to automatically differentiate them that way.
You could apply some standard-deviation logic and only take notice of peaks over x%.
