Extending the Goertzel algorithm to 24 kHz, 32 kHz and 48 kHz in Python

I'm learning to implement the Goertzel algorithm to detect DTMF tones in recorded wave files. I got an implementation in Python from here. It supports audio sampled at 8 kHz and 16 kHz, and I would like to extend it to support audio files sampled at 24 kHz, 32 kHz and 48 kHz.
From the code I got from the link above, I see that the author has set the following precondition parameters/constants:
self.MAX_BINS = 8
if pfreq == 16000:
    self.GOERTZEL_N = 210
    self.SAMPLING_RATE = 16000
else:
    self.GOERTZEL_N = 92
    self.SAMPLING_RATE = 8000
According to this article, before one can do the actual Goertzel, two of the preliminary calculations are:
Decide on the sampling rate.
Choose the block size, N
So the author has clearly set the block size to 210 for 16 kHz inputs and 92 for 8 kHz inputs. Now I would like to understand:
How did the author arrive at these block sizes?
What would the block sizes be for 24 kHz, 32 kHz and 48 kHz input?

The block size determines the frequency resolution/selectivity and the time it takes to gather a block of samples.
The bandwidth of your detector is about Fs/N, and of course the time it takes to gather a block is N/Fs.
For equivalent performance, you should keep the ratio between Fs and N roughly the same, so that both of those measurements remain unchanged.
It is also important, though, to adjust your block size to be as close as possible to a multiple of the wave lengths you want to detect. The Goertzel algorithm is basically a quick way to calculate a few selected DFT bins, and this adjustment puts the frequencies you want to see near the center of those bins.
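To make that concrete, here is a minimal single-bin Goertzel in the textbook form (my own sketch in Python, not the code you linked); it returns the power of the DFT bin nearest a chosen frequency over one block of samples:

import math

def goertzel_power(samples, freq, fs):
    """Squared magnitude of the DFT bin nearest `freq`, Goertzel style."""
    n = len(samples)
    k = round(n * freq / fs)           # nearest integer bin index
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    # standard Goertzel magnitude-squared of the selected bin
    return s_prev**2 + s_prev2**2 - coeff * s_prev * s_prev2

Since k = round(N * f / Fs), the bin actually detected sits at k * Fs / N; that is why N is tuned so the DTMF tones land close to integer k.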
Optimization of the block size according to the last point is probably why Fs/N is not exactly the same in the code you have for the 8 kHz and 16 kHz sampling rates.
You could redo this optimization for the other sampling rates you want to support, but really, performance will be equivalent to what you already have if you just use N = 210 * Fs / 16000.
You can find a detailed description of the block size choice here: http://www.telfor.rs/telfor2006/Radovi/10_S_18.pdf
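For the rates you ask about, the simple scaling rule gives the following (a sketch; the helper name is mine):

def block_size(fs, ref_n=210, ref_fs=16000):
    """Scale the 16 kHz reference block size to another sampling rate."""
    return round(ref_n * fs / ref_fs)

for fs in (8000, 16000, 24000, 32000, 48000):
    n = block_size(fs)
    print(f"Fs={fs:5d} Hz  N={n:3d}  "
          f"bandwidth={fs/n:5.1f} Hz  block={1000*n/fs:5.2f} ms")

Note that this yields N = 105 at 8 kHz rather than the 92 in your code; the difference is presumably the bin-centering optimization described above.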

Related

USRP N320 recording the edges

When I record a signal with the USRP N320 SDR, it has some problems at the edges of the spectrum. For example, when I choose a sample rate of 50 Msps, the first 2 MHz and the last 2 MHz of the spectrum give wrong results: when the device sees a pulse at the edges, it decreases the power and shifts the frequency a little. But the inner 46 MHz of bandwidth works perfectly.
Sample rate: 50 Msps, Properly working bandwidth: 46 MHz
Sample rate: 100 Msps, Properly working bandwidth: 90 MHz
Sample rate: 200 Msps, Properly working bandwidth: 180 MHz
I tried to filter the edges with a bandpass filter, but that gives the OOOOOO problem, even if I choose the 50 Msps sample rate. Normally, though, I can record successfully without the bandpass filter when I choose a sample rate of 200 Msps.
Is there a solution to record the edges correctly, or to filter them without dropping samples?
First off:
I tried to filter the edges with bandpass filter but it does give the OOOOOO problem
means that your computer isn't fast enough to apply the filter to the data stream. That might mean one of two things: you've designed a filter that's too long and could be shorter while still doing what you want, or what you want to do requires a filter of that length, and you will need to find a faster PC (hard) or use a faster filter implementation (did you try the FFT filters?).
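If the filter really does need that many taps, an FFT-based implementation is the usual way out. A sketch with SciPy (the filter specs and signal are made up, and this is offline block processing rather than a streaming flowgraph, just to show the cost difference):

import numpy as np
from scipy.signal import firwin, lfilter, oaconvolve

taps = firwin(2001, [0.02, 0.46], pass_zero=False)  # a long band-pass FIR
x = np.random.randn(1_000_000)                      # stand-in for samples

y_direct = lfilter(taps, 1.0, x)       # direct form: cost ~ len(x) * len(taps)
y_fft = oaconvolve(x, taps)[:len(x)]   # overlap-add FFT: far cheaper per sample

assert np.allclose(y_direct, y_fft)    # same output, up to float rounding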
For example, when I choose sample rate 50 Msps, 2 MHz of the start of the spectrum and 2 MHz of the end of the spectrum, gives the wrong results.
This is not surprising! Remember that anything with an ADC needs an anti-aliasing filter on the analog side, and these can't be arbitrarily sharp. So, necessarily, the spectrum at the edge of your band gets a bit attenuated, and there's a bit of aliasing there. The attenuation you could counteract by throwing an equalizing filter at it on your PC, which would necessarily be more compute-intense than what is happening on the USRP; but the aliasing of the lowest frequencies onto the highest, and vice versa, due to the finite steepness of the analog anti-aliasing filter, you cannot repair. That's the signal processing blues for any kind of acquisition device.
There's one trick though, which the USRP uses: when your requested sampling rate is lower than the ADC's sampling rate, the USRP can internally apply a (better!) digital filter to select that target sampling rate as bandwidth, and decimate to that.
Thus, depending on the ADC rate to output sampling rate relationship (in UHD, the ADC rate is called "master clock rate", MCR), there's further digital filtering and decimation going on in the digital logic inside the N320. These filters also can't be infinitely sharp – and you might see that.
Generally, you'd want the decimation between the MCR and the sampling rate you've requested to be an even number, and not too large. I don't have the N320's digital signal processing architecture in my head right now, but I bet using a decimation that's a multiple of 4 or even 8 is a good move – you get to use the nicer half-band filters then.
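A quick way to sanity-check your rate choices (my own sketch; 200 MHz is one of the N320's selectable master clock rates and is assumed here):

mcr = 200e6                              # assumed master clock rate
for rate in (50e6, 40e6, 25e6, 12.5e6):
    decim = mcr / rate                   # decimation the DSP chain must do
    print(f"{rate/1e6:6.2f} Msps -> decimation {decim:g} "
          f"({'even' if decim % 2 == 0 else 'odd'})")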
Modern UHD also has the filter API, with which you can work with these digital filters manually; this rarely is what you really want to do here, though.

What's the most compression that we can hope for on a file that contains 1000 bits (Huffman algorithm)?

How much can a file that contains 1000 bits, where 1 appears with 10% probability and 0 with 90% probability, be compressed with a Huffman code?
Maybe a factor of two.
But only if you do not include the overhead of sending the description of the Huffman code along with the data. For 1000 bits, that overhead will dominate the problem, and determine your maximum compression ratio. I find that for that small of a sample, 125 bytes, general-purpose compressors get it down to only around 100 to 120 bytes, due to the overhead.
A custom Huffman code built just for this, applied to bytes of such a stream, gives a factor of 2.10, assuming the other side already knows the code. The best you could hope for is the entropy, e.g. with an arithmetic code, which gives a factor of 2.13.
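The entropy bound quoted here is quick to verify:

from math import log2

p = 0.1                                   # probability of a 1 bit
h = -p * log2(p) - (1 - p) * log2(1 - p)  # entropy, bits per input bit
print(f"H = {h:.4f} bits/bit -> best ratio = {1/h:.2f}")  # ~0.469 -> 2.13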

Defining minimum cache size for input data with specific frequency and frame rate

Task: my requirement is to find the minimum cache size to process the frames; the details are as below.
Please help with the calculations:
Consider a camera capturing 2K resolution video at 30 FPS; the format is NV12.
Consider the requirement of a pre-processing step on a DSP, which needs to be applied to the captured video by streaming the data directly into the cache of the DSP.
Q. If the DSP is running at 800 MHz and the processing step needs 10 DSP cycles to process 64 bytes of data, what is the minimum data size that needs to be streamed into the cache so that 30 FPS is met?
Q. Can anyone also comment on whether a DSP running at 800 MHz will suffice for pre-processing 4K video at 30 FPS?
Q. Can anyone please help me understand the minimum cache size required in such scenarios? I googled about caches but nowhere found how the size relates to the frequency.
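A rough throughput check frames the problem (my own numbers: I assume "2K" means 1920 × 1080 and use NV12's 1.5 bytes per pixel; the question leaves both open):

frame_bytes = 1920 * 1080 * 1.5      # NV12: full-res Y + half-res interleaved UV
camera_rate = frame_bytes * 30       # bytes per second produced at 30 FPS

dsp_rate = 800e6 * 64 / 10           # 800 MHz, 64 bytes per 10 cycles

print(f"camera: {camera_rate/1e6:.1f} MB/s, DSP: {dsp_rate/1e6:.0f} MB/s")
# camera: 93.3 MB/s, DSP: 5120 MB/s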

Calculating CPU Performance in MIPS

I was taking an exam earlier and memorized the questions that I didn't know how to answer but somehow got correct (the online exam, using an electronic classroom (eclass), was multiple choice; it was coded so each of us was given random questions in random order, with the answer choices shuffled as well).
Anyway, back to my questions...
1.)
There is a CPU with a clock frequency of 1 GHz. When the instructions consist of two
types as shown in the table below, what is the performance in MIPS of the CPU?
Instruction      Execution time (clocks)   Frequency of appearance (%)
Instruction 1    10                        60
Instruction 2    15                        40
Answer: 125
2.)
There is a hard disk drive with specifications shown below. When a record of 15
Kbytes is processed, which of the following is the average access time in milliseconds?
Here, the record is stored in one track.
[Specifications]
Capacity: 25 Kbytes/track
Rotation speed: 2,400 revolutions/minute
Average seek time: 10 milliseconds
Answer: 37.5
3.)
Assume a magnetic disk has a rotational speed of 5,000 rpm, and an average seek time of 20 ms. The recording capacity of one track on this disk is 15,000 bytes. What is the average access time (in milliseconds) required in order to transfer one 4,000-byte block of data?
Answer: 29.2
4.)
When a color image is stored in video memory at a tonal resolution of 24 bits per pixel,
approximately how many megabytes (MB) are required to display the image on the
screen with a resolution of 1024 × 768 pixels? Here, 1 MB is 10^6 bytes.
Answer: 18.9
5.)
When a microprocessor works at a clock speed of 200 MHz and the average CPI
(“cycles per instruction” or “clocks per instruction”) is 4, how long does it take to
execute one instruction on average?
Answer: 20 nanoseconds
I don't expect someone to answer everything (they are indeed already answered), but I am wondering and want to know how those answers were arrived at. It's not enough for me to know the answer; I've tried solving them myself, trial-and-error style, to arrive at those numbers, but it seems to take minutes to hours, so I need some professional help...
1.)
n = 1/f = 1 / 1 GHz = 1 ns.
n*10 * 0.6 + n*15 * 0.4 = 12 ns average instruction time, so 1 / 12 ns ≈ 83.3 MIPS.
2.)3.)
I don't get these, honestly.
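That said, the standard textbook decomposition – average seek time, plus half a revolution of rotational latency, plus a transfer time proportional to the fraction of the track read – does reproduce the quoted answers, assuming that model is the intended one:

def access_ms(seek_ms, rpm, track_bytes, record_bytes):
    rev_ms = 60_000 / rpm            # one revolution, in milliseconds
    latency = rev_ms / 2             # average rotational delay
    transfer = rev_ms * record_bytes / track_bytes
    return seek_ms + latency + transfer

print(access_ms(10, 2400, 25_000, 15_000))  # question 2 -> 37.5
print(access_ms(20, 5000, 15_000, 4_000))   # question 3 -> 29.2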
4.)
Here, 1 MB is 10^6 bytes.
3 Bytes * 1024 * 768 = 2359296 Bytes = 2.36 MB
But often these 24 bits are packed into 32 bits b/c of the memory layout (word width), so often it will be 4 Bytes*1024*768 = 3145728 Bytes = 3.15 MB.
5)
CPI / f = 4 / 200 MHz = 20 ns.

How to calculate the frames per second (fps) performance of a video decoder?

How do we get the performance of a video decoder in terms of how many frames it can decode per second? I know the following parameters are used to arrive at fps, but I'm not able to relate them in a formula which gives the exact answer:
the seconds taken to decode a video sequence, the total number of frames in the encoded video sequence, the clock rate of the hardware/processor which executes the code, and the million cycles per second (MCPS) of the decoder.
How are MCPS and fps related?
Given the calculation of Byron, I think it should be more along the lines of:
A file to be encoded consists of N frames
and takes T seconds to encode on a processor which can do X MCPS;
then I would say the encoder uses (T*X)/N MC (million cycles) per frame.
Given that the frame rate is F (for instance, 25 frames a second),
the above value times F gives the MCPS used by the encoder.
If this is lower than the MCPS of your processor, you can encode in real time (or faster).
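Numerically, with hypothetical values plugged into the reasoning above:

X = 2500.0   # processor capability, million cycles per second (2.5 GHz)
T = 5.0      # seconds taken to encode the file
N = 250      # frames in the file
F = 25       # target playback frame rate

mc_per_frame = T * X / N           # million cycles spent per frame
needed_mcps = mc_per_frame * F     # MCPS needed to keep up with F fps
print(needed_mcps, needed_mcps <= X)   # 1250.0 True -> real-time capable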
When a codec quotes an MCPS number, it is for a specific hardware configuration.
Million Cycles Per Second. This parameter describes the performance of any software on a given processor. For example, when we say a codec takes 100 MCPS on a given processor, it means that it consumes 100 Million cycles of the processor every second. Reference
Also, some video is encoded better by different codecs. Different video streams will have different performance characteristics based on the type of video encoded. There are codecs that encode anime very well and fast, but do horribly on DVD movies. There are many parameters to consider.
The best way to determine the performance of a specific algorithm is to run it on the same hardware against the type of streams you think you will be encoding. You should do multiple runs with different video and average.
That said, for a specific stream on a specific piece of hardware, the math is relatively simple:
If it takes a 2.5 GHz processor 5 seconds to encode a file, the MCPS for that encoder is 2500/5, or 500 MCPS.
There is also a peak MCPS number, where peak MCPS can be defined as:
...Peak MCPS [quoted here] is the maximum average MCPS calculated over a sliding window of 4 pictures. The actual MCPS number may vary within a +/- 5% range.
Reference
