How do I estimate SNR from a single audio file containing speech?
I know of two methods:
log power histogram percentile difference (aka "NIST quick method"), described here: http://labrosa.ee.columbia.edu/~dpwe/tmp/nist/doc/stnr.txt
10*log10( (S-N)/N ), where
S = sum{x[i]^2 * e[i]}
N = sum{x[i]^2 * (1-e[i])}
e[i] some sort of voice activity detection (speech/non-speech indicator)
Are there any better methods that do not require stereo data (or both a clean and a noisy version of the data)? I would also like to avoid the "second method" described in the NIST document (see 1.), which makes strong assumptions about the distributions.
Most of the energy of human speech lies between roughly 300 Hz and 3 kHz, which is the band that (old) telephone systems transmit. Speech never occupies all of these frequencies at the same time, which is why a frequency analysis can find the noise floor without any reference or voice activity detection e[i]:
Compute FFT with a frequency resolution of ~ 10 - 20 Hz.
With a samplerate of 48 kHz you would use an FFT length of samplerate/resolution = 4800 samples, which should then get rounded to the nearest power of 2, which is 4096.
Identify the necessary bins which hold the results from 300 - 3000 Hz.
The bin index k holds the result for frequency k*samplerate/FFT_length. For the above 48 kHz input and FFT length 4096 this is k(300 Hz) = 300 * 4096 / 48000 = 25.6 ~= 26 and k(3000 Hz) = 3000 * 4096 / 48000 = 256.
Calculate the energy in each necessary bin: E[k] = FFT[k].re ^2 + FFT[k].im ^2. Where the real and imaginary parts are written depends on your FFT algorithm.
N = min{ E[k=26..256] } * number_of_bins (=256-26+1)
S = sum{ E[k=26..256] }
SNR = (S-N)/N. The level is 10*log10(SNR)
As the SNR varies over time, go back to step 1 with some new samples - probably with some overlap
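The steps above can be sketched in Python with NumPy. This is only an illustration under stated assumptions: the function name estimate_snr_db and its parameters are made up for this example, the bin range is computed from the frequencies rather than hard-coded, and no window is applied because the description does not mention one.

```python
import numpy as np

def estimate_snr_db(frame, fs, f_lo=300.0, f_hi=3000.0):
    """Noise-floor SNR estimate for one frame (hypothetical helper)."""
    n_fft = len(frame)
    spectrum = np.fft.rfft(frame)
    # Energy per bin: re^2 + im^2
    energy = spectrum.real ** 2 + spectrum.imag ** 2
    # Bin k holds frequency k * fs / n_fft
    k_lo = int(round(f_lo * n_fft / fs))
    k_hi = int(round(f_hi * n_fft / fs))
    band = energy[k_lo:k_hi + 1]
    # Noise: the weakest bin, assumed to be present in every bin of the band
    noise = band.min() * band.size
    # Signal plus noise: total energy in the band
    total = band.sum()
    # Note: this divides by zero if the weakest bin is exactly 0
    return 10.0 * np.log10((total - noise) / noise)

# Example frame: a tone in the speech band plus white noise (synthetic data)
fs = 48000
n_fft = 4096
t = np.arange(n_fft) / fs
rng = np.random.default_rng(0)
frame = np.sin(2 * np.pi * 1000.0 * t) + 0.05 * rng.standard_normal(n_fft)
print(estimate_snr_db(frame, fs))
```

Because the noise floor is taken from the single weakest bin, the estimate tends to come out optimistic for noise that fluctuates from bin to bin; averaging over several overlapping frames, as suggested above, should help.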
I have a figure with Bit Error Rate on the x axis and Data Rate on the y axis, and I want to find the minimum Bit Error Rate and the maximum Data Rate in it. That is, the figure contains 18 points and I want to find the optimal one, but I cannot work out how to do it.
I believe what you are asking is how to find the optimal ratio between BitErrorRate and DataRate. In that case you need to calculate the ratio of BitErrorRate per DataRate and then find the min or max of that, depending on whether you are looking for fewer or more BitErrorRate per DataRate. Assuming the BitErrorRate and DataRate are saved in arrays, you could use code like this:
% Give some random numbers to illustrate functionality
BitErrorRate = [2 7 3 5 8 1];
DataRate = [1 3 4 2 6 5];
% Find the ratio of BitErrorRate per DataRate
ErrorRatio = BitErrorRate ./ DataRate;
% Optimal ratio with minimal errors per data rate and corresponding index
[MinError, MinIndex] = min(ErrorRatio);
% Print results in console
disp(['Optimal error rate ratio is ' num2str(ErrorRatio(MinIndex)) ...
' BitErrorRate per DataRate with a Bit Error Rate of ' ...
num2str(BitErrorRate(MinIndex)) ' and a Data Rate of ' ...
num2str(DataRate(MinIndex)) '.']);
% Sort in ascending order according to the top row, which is DataRate
SortedForDataRate = sortrows([DataRate;BitErrorRate;ErrorRatio]')';
fig = figure(1);
subplot(3,1,1)
plot(SortedForDataRate(1,:)) % row 1: DataRate
title('DataRate')
subplot(3,1,2)
plot(SortedForDataRate(2,:)) % row 2: BitErrorRate
title('BitErrorRate')
subplot(3,1,3)
plot(SortedForDataRate(3,:)) % row 3: ErrorRatio
title('ErrorRatio')
I have a cable winch system that I would like to know how much cable is left given the number of rotations that have occurred and vice versa. This system will run on a low-cost microcontroller with low computational resources and should be able to update quickly, long for/while loop iterations are not ideal.
The inputs are cable diameter, inner drum diameter, inner drum width, and drum rotations. The output should be the length of the cable on the drum.
At first, I was calculating the maximum number of wraps of cable per layer based on cable diameter and inner drum width, I could then use this to calculate the length of cable per layer. The issue comes when I calculate the total length as I have to loop through each layer, a costly operation (could be 100's of layers).
My next approach was to precalculate a table with each layer, then perform a 3-5 degree polynomial regression down to an easy to calculate formula.
This appears to work for the most part, however, there are slight inaccuracies at the low and high end (0 rotations could be + or - a few units of cable length). The real issue comes when I try and reverse the function to get the current rotations of the drum given the length. So far, my reversed formula does not seem to equal the forward formula (I am reversing X and Y before calculating the polynomial).
I have looked high and low and cannot seem to find any formulas for cable length to rotations that do not use recursion or loops. I can't figure out how to reverse my polynomial function to get the reverse value without losing precision. If anyone happens to have an insight/ideas or can help guide me in the right direction that would be most helpful. Please see my attempts below.
// Units are not important
CableLength = 15000
CableDiameter = 5
DrumWidth = 50
DrumDiameter = 5
CurrentRotations = 0
CurrentLength = 0
CurrentLayer = 0
PolyRotations = Array
PolyLengths = Array
PolyLayers = Array
WrapsPerLayer = DrumWidth / CableDiameter
While CurrentLength < CableLength // Calculate layer length for each layer up to cable length
CableStackHeight = CableDiameter * CurrentLayer
DrumDiameterAtLayer = DrumDiameter + (CableStackHeight * 2) // Assumes cables stack vertically
WrapDiameter = DrumDiameterAtLayer + CableDiameter // Center point of cable
WrapLength = WrapDiameter * PI
LayerLength = WrapLength * WrapsPerLayer
CurrentRotations += WrapsPerLayer // 1 Rotation per wrap
CurrentLength += LayerLength
CurrentLayer++
PolyRotations.Push(CurrentRotations)
PolyLengths.Push(CurrentLength)
PolyLayers.Push(CurrentLayer)
End
// Using 5 degree polynomials, any lower = very low precision
PolyLengthToRotation = CreatePolynomial(PolyLengths, PolyRotations, 5) // 5 Degrees
PolyRotationToLength = CreatePolynomial(PolyRotations, PolyLengths, 5) // 5 Degrees
// 40 Rotations should equal about 3141.593 units
RealRotation = 40
RealLength = 3141.593
CalculatedLength = EvaluatePolynomial(RealRotation,PolyRotationToLength)
CalculatedRotations = EvaluatePolynomial(RealLength,PolyLengthToRotation)
// CalculatedLength = 3141.593 // Good
// CalculatedRotations = 41.069 // No good
// CalculatedRotations != RealRotation // These should equal
// 0 Rotations should equal 0 length
RealRotation = 0
RealLength = 0
CalculatedLength = EvaluatePolynomial(RealRotation,PolyRotationToLength)
CalculatedRotations = EvaluatePolynomial(RealLength,PolyLengthToRotation)
// CalculatedLength = 1.172421e-9 // Very close
// CalculatedRotations = 1.947 // No good
// CalculatedRotations != RealRotation // These should equal
Side note: I have a "spool factor" parameter to calibrate for the actual cable spooling efficiency that is not shown here. (cable is not guaranteed to lay mathematically perfect)
@Bathsheba may have meant cable, but a table is a valid option (experimental numbers are also probably more interesting in the real world).
A bit slow, but you could always do it manually. There are only 40 rotations (though optionally, for better experimental results, repeat 3 times and take the average...). Reel it completely in. Then do a rotation (or, depending on the diameter of your drum, half a rotation). Measure and mark how far it spooled out (tape), and record it. Repeat for the next 39 rotations. You now have a lookup table in which you can find the length in O(log N) via binary search (by sorting the data) plus a bit of interpolation (i.e. 1.5 rotations is about halfway between 1 and 2 rotations).
You can also use this to derive your own experimental data. Do the same thing, but with a cable half as thick (perhaps proportional to the ratio of the inner diameter and the cable radius?). What effect does it have on the numbers? How about twice or half the diameter? Math says circumference is linear (2πr), so half the radius, half the amount per rotation. It might be easier to adjust the table data.
The gist is that it may be easier for you to have a real-world reference for your numbers rather than relying purely on an abstract mathematical model (not to say the model would be wrong, but cables don't always wind up perfectly; perhaps you will find a quirk of your winch that would have led to errors in a purely mathematical approach). Who knows, you might be able to derive the formula yourself :) with a fudge factor for the real world, even.
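As a complement to a measured table: under the question's own idealized model (wraps stacking vertically, wrap k of diameter DrumDiameter + CableDiameter*(2k+1)), the per-layer sum has a closed form, because the first n layers add up to PI * WrapsPerLayer * (DrumDiameter*n + CableDiameter*n^2). That gives a loop-free forward formula and an exact inverse via the quadratic formula. The sketch below is in Python for readability; the function names are made up for this example, and a real winch would still need the spool factor mentioned in the question.

```python
import math

def length_from_rotations(rotations, cable_d, drum_d, drum_w):
    """Cable length wound after `rotations` turns (idealized vertical stacking)."""
    wraps = drum_w / cable_d                 # wraps per layer, as in the question
    n = math.floor(rotations / wraps)        # completed layers
    partial = rotations - n * wraps          # turns on the current layer
    # Sum over the n completed layers: pi * wraps * (D*n + d*n^2)
    full_len = math.pi * wraps * (drum_d * n + cable_d * n * n)
    # Length of one wrap on the current layer
    wrap_len = math.pi * (drum_d + cable_d * (2 * n + 1))
    return full_len + partial * wrap_len

def rotations_from_length(length, cable_d, drum_d, drum_w):
    """Inverse of length_from_rotations, again without a loop."""
    wraps = drum_w / cable_d
    q = length / (math.pi * wraps)
    # Solve d*n^2 + D*n = q for n and keep the whole layers
    n = math.floor((-drum_d + math.sqrt(drum_d ** 2 + 4 * cable_d * q)) / (2 * cable_d))
    full_len = math.pi * wraps * (drum_d * n + cable_d * n * n)
    wrap_len = math.pi * (drum_d + cable_d * (2 * n + 1))
    # If rounding puts n one layer off at a boundary, the remainder term
    # compensates, because the length is continuous across layers.
    return n * wraps + (length - full_len) / wrap_len

# Values from the question: diameter 5, drum diameter 5, drum width 50
print(length_from_rotations(40, 5, 5, 50))
print(rotations_from_length(3141.593, 5, 5, 50))
```

Each call costs one square root and a handful of multiplications, which should be affordable even on a small microcontroller, and the two functions are exact inverses of each other up to rounding.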
This is actually more of a theoretical question, but here it goes:
I'm developing an effect audio unit and it needs an equal power crossfade between dry and wet signals.
But I'm confused about the right way to do the mapping function from the linear fader to the scaling factor (gain) for the signal amplitudes of dry and wet streams.
Basically, I've seen it done with cos / sin functions or square roots... essentially approximating logarithmic curves. But if our perception of amplitude is logarithmic to start with, shouldn't these curves mapping the fader position to an amplitude actually be exponential?
This is what I mean:
Assumptions:
signal[i] means the ith sample in a signal.
each sample is a float ranging [-1, 1] for amplitudes between [0,1].
our GUI control is an NSSlider ranging from [0,1], so it is in
principle linear.
fader is a variable with the value of the NSSlider.
First Observation:
We perceive amplitude in a logarithmic way. So if we have a linear fader and merely adjust a signal's amplitude by doing: signal[i] * fader what we are perceiving (hearing, regardless of the math) is something along the lines of:
This is the so-called crappy fader-effect: we go from silence to a drastic volume increase across the leftmost segment in the slider and past the middle the volume doesn't seem to get that louder.
So to do the fader "right", we instead either express it in a dB scale and then, as far as the signal is concerned, do: signal[i] * 10^(fader/20) or, if we were to keep our fader units in [0,1], we can do: signal[i] * (.001*10^(3*fader))
Either way, our new mapping from the NSSlider to the fader variable which we'll use for multiplying in our code, looks like this now:
Which is what we actually want: since we perceive amplitude logarithmically, we are essentially mapping from linear (NSSlider range 0-1) to exponential and feeding this exponential output to our logarithmic perception. And it turns out that log(10^x)=x, so we end up perceiving the amplitude change in a linear (aka correct) way.
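A quick numerical check of that mapping, as a sketch in Python rather than AU code (fader_to_gain and range_db are names invented for this illustration): with a 60 dB range it reproduces .001*10^(3*fader), and equal slider steps come out as equal steps in dB.

```python
import math

def fader_to_gain(fader, range_db=60.0):
    """Map a linear slider position in [0, 1] to an amplitude gain.

    With range_db = 60 this equals 0.001 * 10 ** (3 * fader).
    """
    return 10.0 ** (range_db * (fader - 1.0) / 20.0)

for fader in (0.0, 0.25, 0.5, 0.75, 1.0):
    gain = fader_to_gain(fader)
    # The level in dB rises by the same amount for every slider step
    print(fader, gain, 20.0 * math.log10(gain))
```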
Great.
Now, my thought is that an equal-power crossfade between two signals (in this case a dry / wet horizontal NSSlider to mix together the input to the AU and the processed output from it) is essentially the same only that with one slider acting on both hypothetical signals dry[i] and wet[i].
So if my slider ranges from 0 to 100 (and dry is full-left and wet is full-right), I'd end up with code along the lines of:
Float32 outputSample, wetSample, drySample = <assume proper initialization>
Float32 mixLevel = .01 * GetParameter(kParameterTypeMixLevel);
Float32 wetPowerLevel = .001 * pow(10, (mixLevel*3));
Float32 dryPowerLevel = .001 * pow(10, (3 * (1 - mixLevel))); // mirror image of the wet curve: 1 at full-left, .001 at full-right
outputSample = (wetSample * wetPowerLevel) + (drySample * dryPowerLevel);
The graph of which would be:
And same as before, because we perceive amplitude logarithmically, this exponential mapping should actually make it where we hear the crossfade as linear.
However, I've seen implementations of the crossfade using approximations to log curves. Meaning, instead:
But wouldn't these curves actually emphasize our logarithmic perception of amplitude?
The "equal power" crossfade you're thinking of has to do with keeping the total output power of your mix constant as you fade from wet to dry. Keeping total power constant serves as a reasonable approximation to keeping total perceived loudness constant (which in reality can be fairly complicated).
If you are crossfading between two uncorrelated signals of equal power, you can maintain a constant output power during the crossfade by using any two functions whose squared values sum to 1. A common example of this is the set of functions
g1(k) = ( 0.5 + 0.5*cos(pi*k) )^.5
g2(k) = ( 0.5 - 0.5*cos(pi*k) )^.5,
where 0 <= k <= 1 (note that g1(k)^2 + g2(k)^2 = 1 is satisfied, as mentioned). Here's a proof that this results in a constant power crossfade for uncorrelated signals:
Say we have two signals x1(t) and x2(t) with equal powers E[ x1(t)^2 ] = E[ x2(t)^2 ] = Px, which are also uncorrelated ( E[ x1(t)*x2(t) ] = 0 ). Note that any set of gain functions satisfying the previous condition will have that g2(k) = (1 - g1(k)^2)^.5. Now, forming the sum y(t) = g1(k)*x1(t) + g2(k)*x2(t), we have that:
E[ y(t)^2 ] = E[ (g1(k) * x1(t))^2 + 2*g1(k)*(1 - g1(k)^2)^.5 * x1(t) * x2(t) + (1 - g1(k)^2) * x2(t)^2 ]
= g1(k)^2 * E[ x1(t)^2 ] + 2*g1(k)*(1 - g1(k)^2)^.5 * E[ x1(t)*x2(t) ] + (1 - g1(k)^2) * E[ x2(t)^2 ]
= g1(k)^2 * Px + 0 + (1 - g1(k)^2) * Px = Px,
where we have used that g1(k) and g2(k) are deterministic and can thus be pulled outside the expectation operator E[ ], and that E[ x1(t)*x2(t) ] = 0 by definition because x1(t) and x2(t) are assumed to be uncorrelated. This means that no matter where we are in the crossfade (whatever k we choose) our output will still have the same power, Px, and thus hopefully equal perceived loudness.
Note that for completely correlated signals, you can achieve constant output power by doing a "linear" fade - using any two functions that sum to one ( g1(k) + g2(k) = 1 ). When mixing signals that are somewhat correlated, gain functions between those two would theoretically be appropriate.
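To make the claim concrete, here is a small numerical sketch in Python/NumPy. It uses two independent white-noise signals as stand-ins for uncorrelated dry and wet signals (an assumption for the demo; real dry/wet signals are usually at least somewhat correlated), and the helper name equal_power_gains is made up for this example.

```python
import numpy as np

def equal_power_gains(k):
    """Gain pair g1(k), g2(k) from above; g1 fades out and g2 fades in."""
    g1 = np.sqrt(0.5 + 0.5 * np.cos(np.pi * k))
    g2 = np.sqrt(0.5 - 0.5 * np.cos(np.pi * k))
    return g1, g2

rng = np.random.default_rng(0)
dry = rng.standard_normal(200_000)   # unit power
wet = rng.standard_normal(200_000)   # unit power, independent of dry

positions = np.linspace(0.0, 1.0, 5)
powers = []
for k in positions:
    g1, g2 = equal_power_gains(k)
    powers.append(np.mean((g1 * dry + g2 * wet) ** 2))
powers = np.array(powers)
print(powers)        # stays close to 1 at every slider position

# For comparison: a linear crossfade of the same signals at the midpoint
linear_mid = np.mean((0.5 * dry + 0.5 * wet) ** 2)
print(linear_mid)    # close to 0.5, i.e. a dip of about 3 dB
```

The dip in the linear case is the reason the equal-power curves are preferred for uncorrelated material.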
What you're thinking of when you say
And same as before, because we perceive amplitude logarithmically,
this exponential mapping should actually make it where we hear the
crossfade as linear.
is that one signal should perceptually decrease in loudness as a linear function of slider position (k), while the other signal should perceptually increase in loudness as a linear function of slider position, when applying your derived crossfade. While your derivation of that seems pretty spot on, unfortunately that may not be the best way to blend your dry and wet signals in terms of consistency - often, maintaining equal output loudness, regardless of slider position, is the better thing to shoot for. In any case, it might be worth trying a couple of different functions to see what is most usable and consistent.
I am trying to understand the FFT algorithm and so far I think that I understand the main concept behind it. However I am confused as to the difference between 'framesize' and 'window'.
Based on my understanding, it seems that they are redundant with each other? For example, I present as input a block of samples with a framesize of 1024. So I have byte[1024] presented as input.
What then is the purpose of the windowing function? Since initially, I thought the purpose of the windowing function is to select the block of samples from the original data.
Thanks!
What then is the purpose of the windowing function?
It's to deal with so-called "spectral leakage": the FFT assumes an infinite series that repeats the given sample frame over and over again. If you have a sine wave that is an integral number of cycles within the sample frame, then all is good, and the FFT gives you a nice narrow peak at the proper frequency. But if you have a sine wave that is not an integral number of cycles, there's a discontinuity between the last and first sample, and the FFT gives you false harmonics.
Windowing functions lower the amplitudes at the beginning and the end of the sample frame, to reduce the harmonics caused by this discontinuity.
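A small Python/NumPy sketch of this effect, under assumptions chosen only for illustration (a 64-sample frame, a sine with 8 or 8.5 cycles per frame, and a Hann window as the example windowing function). The helper far_leakage is made up for this example and measures the fraction of spectral energy more than 3 bins away from the expected peak.

```python
import numpy as np

N = 64
n = np.arange(N)

def far_leakage(x, k0, guard=3):
    """Fraction of spectral energy more than `guard` bins away from bin k0."""
    energy = np.abs(np.fft.rfft(x)) ** 2
    near = energy[k0 - guard:k0 + guard + 1].sum()
    return (energy.sum() - near) / energy.sum()

whole = np.sin(2 * np.pi * 8.0 * n / N)     # integral number of cycles
partial = np.sin(2 * np.pi * 8.5 * n / N)   # non-integer number of cycles

leak_whole = far_leakage(whole, 8)
leak_partial = far_leakage(partial, 8)
leak_windowed = far_leakage(partial * np.hanning(N), 8)

print(leak_whole)     # essentially zero: one clean peak
print(leak_partial)   # a noticeable share of the energy is smeared far away
print(leak_windowed)  # the window pulls most of it back towards the peak
```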
some diagrams from a National Instruments webpage on windowing:
integral # of cycles:
non-integer # of cycles:
for additional information:
http://www.tmworld.com/article/322450-Windowing_Functions_Improve_FFT_Results_Part_I.php
http://zone.ni.com/reference/en-XX/help/371361B-01/lvanlsconcepts/char_smoothing_windows/
http://www.physik.uni-wuerzburg.de/~praktiku/Anleitung/Fremde/ANO14.pdf
A rectangular window of length M has frequency response of sin(ω*M/2)/sin(ω/2), which is zero when ω = 2*π*k/M, for k ≠ 0. For a DFT of length N, where ω = 2*π*n/N, there are nulls at n = k * N/M. The ratio N/M isn't necessarily an integer. For example, if N = 40, and M = 32, then there are nulls at multiples of 1.25, but only the integer multiples will appear in the DFT, which is bins 5, 10, 15, and 20 in this case.
Here's a plot of the 1024-point DFT of a 32-point rectangular window:
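from pylab import *  # NumPy and Matplotlib names (ones, rfft, r_, plot, ...)
M = 32    # window length
N = 1024  # DFT length
w = ones(M)
W = rfft(w, N)
K = N // M  # null spacing in bins; integer division so it can be used in slices
nulls = abs(W[K::K])
plot(abs(W))
plot(r_[K:N//2+1:K], nulls, 'ro')  # mark the nulls with red circles
xticks(r_[:512:64])
grid(); axis('tight')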
Note the nulls at every N/M = 32 bins. If N=M (i.e. the window length equals the DFT length), then there are nulls at all bins except at n = 0.
When you multiply a window by a signal, the corresponding operation in the frequency domain is the circular convolution of the window's spectrum with the signal's spectrum. For example, the DTFT of a sinusoid is a pair of weighted delta functions (i.e. impulses with infinite height, infinitesimal extension, and finite area) located at the positive and negative frequency of the sinusoid. Convolving a spectrum with a delta function just shifts it to the location of the delta and scales it by the delta's weight. Therefore when you multiply a window by a sinusoid in the sample domain, the window's frequency response is scaled and shifted to the frequency of the sinusoid.
There are a couple of scenarios to examine regarding the length of a rectangular window. First let's look at the case where the window length is an integer multiple of the sinusoid's period, e.g. a 32-sample rectangular window of a cosine with a period of 32/8 = 4 samples:
x1 = cos(2*pi*8*r_[:32]/32) # ω0 = 8π/16, bin 8/32 * 1024 = 256
X1 = rfft(x1 * w, 1024)
plot(abs(X1))
xticks(r_[:513:64])
grid(); axis('tight')
As before, there are nulls at multiples of N/M = 32. But the window's spectrum has been shifted to bin 256 of the sinusoid and scaled by its magnitude, which is 0.5 split between the positive frequency and the negative frequency (I'm only plotting positive frequencies). If the DFT length had been 32, the nulls would line up at every bin, prompting the appearance that there's no leakage. But that misleading appearance is only a function of the DFT length. If you pad the windowed signal with zeros (as above), you'll get to see the sinc-like response at frequencies between the nulls.
Now let's look at a case where the window length is not an integer multiple of the sinusoid's period, e.g. a cosine with an angular frequency of 7.5π/16 (the sequence only repeats every 64 samples, after 15 cycles, so the 32-sample window holds 7.5 cycles):
x2 = cos(2*pi*15*r_[:32]/64) # ω0 = 7.5π/16, bin 15/64 * 1024 = 240
X2 = rfft(x2 * w, 1024)
plot(abs(X2))
xticks(r_[-16:513:64])
grid(); axis('tight')
The center bin location is no longer at an integer multiple of 32, but shifted down by half that spacing to bin 240. So let's see what the corresponding 32-point DFT would look like (implying a 32-point rectangular window). I'll compute and plot the 32-point DFT of x2[n] and also superimpose a 32x decimated copy of the 1024-point DFT:
X2_32 = rfft(x2, 32)
X2_sample = X2[::32]
stem(r_[:17],abs(X2_32))
plot(abs(X2_sample), 'rs') # red squares
grid(); axis([0,16,0,11])
As you can see in the previous plot, the nulls are no longer aligned at multiples of 32, so the magnitude of the 32-point DFT is non-zero at each bin. In the 32 point DFT, the window's nulls are still spaced every N/M = 32/32 = 1 bin, but since ω0 = 7.5π/16, the center is at 'bin' 7.5, which puts the nulls at 0.5, 1.5, etc, so they're not present in the 32-point DFT.
The general message is that spectral leakage of a windowed signal is always present but can be masked in the DFT if the signal spectrum, window length, and DFT length come together in just the right way to line up the nulls. Beyond that you should just ignore these DFT artifacts and concentrate on the DTFT of your signal (i.e. pad with zeros to sample the DTFT at higher resolution so you can clearly examine the leakage).
Spectral leakage caused by convolving with a window's spectrum will always be there, which is why the art of crafting particularly shaped windows is so important. The spectrum of each window type has been tailored for a specific task, such as dynamic range or sensitivity.
Here's an example comparing the output of a rectangular window vs a Hamming window:
from pylab import *
import wave
fs = 44100
M = 4096
N = 16384
# load a sample of guitar playing an open string 6
# with a fundamental frequency of 82.4 Hz
g = frombuffer(wave.open('dist_gtr_6.wav').readframes(-1),
               dtype='int16')
L = len(g) // 4  # integer index a quarter of the way into the recording
g_t = g[L:L+M]
g_t = g_t / float64(max(abs(g_t)))
# compute the response with rectangular vs Hamming window
g_rect = rfft(g_t, N)
g_hamm = rfft(g_t * hamming(M), N)
def make_plot():
    fmax = int(82.4 * 4.5 / fs * N) # 4 harmonics
    subplot(211); title('Rectangular Window')
    plot(abs(g_rect[:fmax])); grid(); axis('tight')
    subplot(212); title('Hamming Window')
    plot(abs(g_hamm[:fmax])); grid(); axis('tight')
    show()  # needed when not running in an interactive session
if __name__ == "__main__":
    make_plot()
If you don't modify the sample values, and select the same length of data as the FFT length, this is equivalent to using a rectangular window, in which case the frame and the window are identical. However multiplying your input data by a rectangular window in the time domain is the same as convolving the input signal's spectrum with a Sinc function in the frequency domain, which will spread any spectral peaks for frequencies which are not exactly periodic in the FFT aperture across the entire spectrum.
Non-rectangular windows are often used so that the resulting FFT spectrum is convolved with something a bit more "focused" than a Sinc function.
You can also use a rectangular window that is a different size than the FFT length or aperture. In the case of a shorter data window, the FFT frame can be zero padded, which can result in a smoother looking interpolated FFT result spectrum. You can even use a rectangular window that is longer than the length of the FFT by wrapping data around the FFT aperture in a summed circular manner for some interesting effects with the frequency resolution.
ADDED due to a request:
Multiplying by a window in the time domain produces the same result as convolving with the transform of that window in the frequency domain.
In general, a narrower time domain window will produce a wider looking frequency domain transform. This is the reason that zero-padding produces a smoother frequency plot. The narrower time domain window produces a wider Sinc with fatter and smoother curves in relation to the frame width than would a window the full width of the FFT frame, thus making the interpolated frequency results look smoother than a non-zero-padded FFT of the same frame length.
The converse is also true to some extent. A wider rectangular window will produce a narrower Sinc, with the nulls closer to the peak. Thus you might be able to use a carefully chosen wider window to produce a narrower looking Sinc to null a frequency closer to a bin of interest than 1 frequency bin away. How do you use a wider window? Wrap the data around and sum, which is identical to using FT basis vectors that are not truncated to 1 FFT frame in length. However, since the FFT result vector is then shorter than the data, this is a lossy process which will introduce artifacts, including some novel aliasing. But it will give you a sharper frequency selection peak at each bin, and notch filters that can be placed less than 1 bin away, say halfway between bins, etc.
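Both directions can be checked numerically. The Python/NumPy sketch below uses random data and made-up variable names purely for illustration: wrapping a 2N-sample block around an N-point aperture and summing gives exactly every second bin of the 2N-point FFT, while zero-padding an N-sample block to 2N keeps the original bins and interpolates new ones in between.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 16
x = rng.standard_normal(2 * N)   # data twice as long as the FFT aperture

# Wider window: wrap the data around the N-point aperture and sum
folded = x[:N] + x[N:]
wrap_matches = np.allclose(np.fft.fft(folded), np.fft.fft(x)[::2])

# Narrower window: N samples zero-padded to a 2N-point FFT
short = x[:N]
padded_spectrum = np.fft.fft(short, 2 * N)
pad_matches = np.allclose(padded_spectrum[::2], np.fft.fft(short))

print(wrap_matches, pad_matches)
```

The even bins of the padded spectrum are the original N-point result; the odd bins are the interpolated values that make the plot look smoother.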
I want to know the frequency of my data. I have a rough idea that it can be done using an FFT, but I am not sure how to do it. When I pass the entire data to the FFT, it gives me 2 peaks, but how can I get the frequency from them?
Thanks a lot in advance.
Here's what you're probably looking for:
When you talk about computing the frequency of a signal, you probably aren't so interested in the component sine waves. This is what the FFT gives you. For example, if you sum sin(2*pi*10x)+sin(2*pi*15x)+sin(2*pi*20x)+sin(2*pi*25x), you probably want to detect the "frequency" as 5 (take a look at the graph of this function). However, the FFT of this signal will detect the magnitude of 0 for the frequency 5.
What you are probably more interested in is the periodicity of the signal. That is, the interval at which the signal becomes most like itself. So most likely what you want is the autocorrelation. Look it up. This will essentially give you a measure of how similar the signal is to itself after being shifted over by a certain amount. So if you find a peak in the autocorrelation, that would indicate that the signal matches up well with itself when shifted over that amount. There's a lot of cool math behind it, look it up if you are interested, but if you just want it to work, just do this:
Window the signal, using a smooth window (a cosine will do). The window should be at least twice as large as the largest period you want to detect; 3 times as large will give better results (see http://zone.ni.com/devzone/cda/tut/p/id/4844 if you are confused).
Take the FFT (however, make sure the FFT size is twice as big as the window, with the second half being padded with zeroes. If the FFT size is only the size of the window, you will effectively be taking the circular autocorrelation, which is not what you want. see https://en.wikipedia.org/wiki/Discrete_Fourier_transform#Circular_convolution_theorem_and_cross-correlation_theorem )
Replace all coefficients of the FFT with their squared magnitude (real^2+imag^2). This is effectively taking the autocorrelation.
Take the iFFT
Find the largest peak in the iFFT, ignoring the peak at zero shift (a signal always matches itself perfectly when it is not shifted at all). This is the strongest periodicity of the waveform. You can actually be a little more clever in which peak you pick, but for most purposes this should be enough. To find the frequency, you just take f=1/T.
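Here is a rough sketch of those steps in Python/NumPy, applied to the sum-of-sines example from above. A few things are assumptions of this sketch rather than part of the recipe: the function name estimate_period, the Hann window as the "smooth window", removing the mean first, and skipping the zero-shift peak by starting the search at the first negative autocorrelation value.

```python
import numpy as np

def estimate_period(x):
    """Return the lag (in samples) of the strongest periodicity, or None."""
    n = len(x)
    # Step 1: smooth window (mean removed first so a DC offset does not dominate)
    windowed = (x - np.mean(x)) * np.hanning(n)
    # Step 2: FFT of twice the window length, second half zero-padded
    spectrum = np.fft.rfft(windowed, 2 * n)
    # Step 3: squared magnitude of every coefficient
    power = spectrum.real ** 2 + spectrum.imag ** 2
    # Step 4: inverse FFT gives the autocorrelation; keep lags up to n/2
    acf = np.fft.irfft(power)[:n // 2]
    # Step 5: largest peak after the zero-shift lobe
    negative = np.flatnonzero(acf < 0)
    if negative.size == 0:
        return None
    start = negative[0]
    return int(start + np.argmax(acf[start:]))

fs = 1000.0                       # sampling rate in Hz (chosen for the demo)
t = np.arange(2000) / fs          # 2 seconds, i.e. 10 periods of the result
x = sum(np.sin(2 * np.pi * f * t) for f in (10.0, 15.0, 20.0, 25.0))

period_samples = estimate_period(x)
frequency = fs / period_samples   # f = 1/T
print(period_samples, frequency)
```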
Suppose x[n] = cos(2*pi*f0*n/fs) where f0 is the frequency of your sinusoid in Hertz, n=0:N-1, and fs is the sampling rate of x in samples per second.
Let X = fft(x). Both x and X have length N. Suppose X has two peaks at n0 and N-n0.
Then the sinusoid frequency is f0 = fs*n0/N Hertz.
Example: fs = 8000 samples per second, N = 16000 samples. Therefore, x lasts two seconds long.
Suppose X = fft(x) has peaks at 2000 and 14000 (=16000-2000). Therefore, f0 = 8000*2000/16000 = 1000 Hz.
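The same example can be reproduced with NumPy; this is just a sketch of the arithmetic above using the numbers from the example (variable names follow the text, `mirror` is added for the second peak).

```python
import numpy as np

fs = 8000          # samples per second
N = 16000          # two seconds of data
f0 = 1000.0        # true frequency in Hz

n = np.arange(N)
x = np.cos(2 * np.pi * f0 * n / fs)
X = np.fft.fft(x)

# Peak in the lower half of the spectrum and its mirror image in the upper half
n0 = int(np.argmax(np.abs(X[:N // 2])))
mirror = int(np.argmax(np.abs(X[N // 2:]))) + N // 2

f_est = fs * n0 / N
print(n0, mirror, f_est)
```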
If you have a signal with one frequency, for instance:
y = sin(2 pi f t)
With:
y time signal
f the central frequency
t time
Then you'll get two peaks, one at a frequency corresponding to f, and one at a frequency corresponding to -f.
So, to get to a frequency, you can discard the negative frequency part. It is located after the positive frequency part. Furthermore, the first element in the array is the DC offset, so its frequency is 0. (Beware that this offset is usually much larger than 0, so the other frequency components might get dwarfed by it.)
In code: (I've written it in python, but it should be equally simple in c#):
import numpy as np
from pylab import *
x = np.random.rand(100) # create 100 random numbers of which we want the fourier transform
x = x - mean(x) # make sure the average is zero, so we don't get a huge DC offset.
dt = 0.1 #[s] 1/the sampling rate
fftx = np.fft.fft(x) # the frequency transformed part
# now discard the negative-frequency half, which we do not need
fftx = fftx[:len(fftx)//2]
# now create the frequency axis: it runs from 0 to just below half the sampling rate, 1/(2*dt)
freq_fftx = np.fft.fftfreq(len(x), d=dt)[:len(fftx)]
# and plot a power spectrum
plot(freq_fftx,abs(fftx)**2)
show()
Now the frequency is located at the largest peak.
If you are looking at the magnitude results from an FFT of the type most commonly used, then a strong sinusoidal frequency component of real data will show up in two places, once in the bottom half, plus its complex conjugate mirror image in the top half. Those two peaks both represent the same spectral peak and same frequency (for strictly real data). If the FFT result bin numbers start at 0 (zero), then the frequency of the sinusoidal component represented by the bin in the bottom half of the FFT result is most likely:
Frequency_of_Peak = Data_Sample_Rate * Bin_number_of_Peak / Length_of_FFT ;
Make sure to work out your proper units within the above equation (to get units of cycles per second, per fortnight, per kiloparsec, etc.)
Note that unless the wavelength of the data is an exact integer submultiple of the FFT length, the actual peak will be between bins, thus distributing energy among multiple nearby FFT result bins. So you may have to interpolate to better estimate the frequency peak. Common interpolation methods to find a more precise frequency estimate are 3-point parabolic and Sinc convolution (which is nearly the same as using a zero-padded longer FFT).
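As an illustration of the 3-point parabolic method, here is a Python/NumPy sketch. The details are assumptions of this sketch, not the only way to do it: a Hann window is applied first, the parabola is fitted to the log magnitudes of the peak bin and its two neighbours, and the function name is made up.

```python
import numpy as np

def parabolic_peak_frequency(x, fs):
    """Estimate the strongest frequency with 3-point parabolic interpolation."""
    n = len(x)
    mag = np.abs(np.fft.rfft(x * np.hanning(n)))
    # Largest bin, excluding the edges so both neighbours exist
    k = int(np.argmax(mag[1:-1])) + 1
    a, b, c = np.log(mag[k - 1:k + 2])
    # Vertex of the parabola through (-1, a), (0, b), (1, c)
    delta = 0.5 * (a - c) / (a - 2.0 * b + c)
    return fs * (k + delta) / n

fs = 1000.0
n = 1024
f_true = 123.4                      # falls between two bins
t = np.arange(n) / fs
x = np.sin(2 * np.pi * f_true * t)

coarse = fs * np.argmax(np.abs(np.fft.rfft(x))) / n   # nearest-bin estimate
refined = parabolic_peak_frequency(x, fs)
print(coarse, refined)
```

The bin spacing here is fs/n, a little under 1 Hz, so the nearest-bin estimate can be off by up to half of that; the interpolated value should land much closer to the true frequency.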
Assuming you use a discrete Fourier transform to look at frequencies, then you have to be careful about how to interpret the normalized frequencies back into physical ones (i.e. Hz).
According to the FFTW tutorial on how to calculate the power spectrum of a signal:
#include <rfftw.h>
...
{
fftw_real in[N], out[N], power_spectrum[N/2+1];
rfftw_plan p;
int k;
...
p = rfftw_create_plan(N, FFTW_REAL_TO_COMPLEX, FFTW_ESTIMATE);
...
rfftw_one(p, in, out);
power_spectrum[0] = out[0]*out[0]; /* DC component */
for (k = 1; k < (N+1)/2; ++k) /* (k < N/2 rounded up) */
power_spectrum[k] = out[k]*out[k] + out[N-k]*out[N-k];
if (N % 2 == 0) /* N is even */
power_spectrum[N/2] = out[N/2]*out[N/2]; /* Nyquist freq. */
...
rfftw_destroy_plan(p);
}
Note that it handles data lengths that are not even. Note in particular that if the data length is even, FFTW will give you a "bin" corresponding to the Nyquist frequency (sample rate divided by 2). Otherwise, you don't get it (i.e. the last bin is just below Nyquist).
A MATLAB example is similar, but they are choosing the length of 1000 (an even number) for the example:
N = length(x);
xdft = fft(x);
xdft = xdft(1:N/2+1);
psdx = (1/(Fs*N)).*abs(xdft).^2;
psdx(2:end-1) = 2*psdx(2:end-1);
freq = 0:Fs/length(x):Fs/2;
In general, it can be implementation (of the DFT) dependent. You should create a test pure sine wave at a known frequency and then make sure the calculation gives the same number.
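Such a sanity check might look like the following Python/NumPy sketch (the sampling rate, the test frequency and the helper name peak_frequency are arbitrary choices for the illustration). It also shows the even/odd length difference in the last bin.

```python
import numpy as np

def peak_frequency(x, fs):
    """Frequency in Hz of the largest bin of a real-input FFT."""
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return freqs[np.argmax(spectrum)]

fs = 1000.0
f_known = 50.0
t = np.arange(1000) / fs
measured = peak_frequency(np.sin(2 * np.pi * f_known * t), fs)
print(measured)   # should agree with f_known

# Last bin for an even and an odd data length
last_even = np.fft.rfftfreq(1000, d=1.0 / fs)[-1]   # the Nyquist frequency
last_odd = np.fft.rfftfreq(999, d=1.0 / fs)[-1]     # just below Nyquist
print(last_even, last_odd)
```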
Frequency = speed/wavelength.
Wavelength is the distance between the two peaks.