Create a repeating DTMF tone to play with AVAudioPlayer - core-audio

Hi, I'm trying to create a repeating DTMF tone so I can play it with AVAudioPlayer. Currently,
when I loop it in audio editing software such as Audacity, there is always a glitch or change in tone at the point where it repeats. Is there some particular length I need to make it to avoid this? I initially created a one-second DTMF tone in Audacity, but it does not repeat smoothly.

It is very hard to make a hand-trimmed loop repeat smoothly.
For the loop point to be inaudible, the buffer must contain a whole number of periods of both frequencies and a whole number of samples.
For example, if you combine 770 and 1336 Hz, the periods are 1000/770 ≈ 1.299 ms and 1000/1336 ≈ 0.749 ms.
Now bring in your sample rate. Let it be 44100. One period would be of length:
44100/770 ≈ 57.27 samples
and
44100/1336 ≈ 33.01 samples
Neither is a whole number of samples, so an arbitrary cut almost always lands mid-period and clicks at the loop point. The least common multiple of the two periods is lcm(1/770, 1/1336) = 1/gcd(770, 1336) = 0.5 seconds, which at 44100 Hz is exactly 22050 samples (385 and 668 full periods respectively). A loop cut precisely on that boundary, with both sines starting at zero phase, can repeat seamlessly, but an audio editor rarely gives you that precision.
So, rather than trimming by hand, you can either: create the sine waves on the fly and mix them, or create sine-wave samples for the base frequencies once and then mix them or play them simultaneously.
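A minimal sketch of generating such a seamless loop file, assuming Python with numpy (the file name and amplitudes are illustrative):

```python
# Generate a 0.5 s DTMF buffer that loops seamlessly: 0.5 s is the
# combined period of 770 + 1336 Hz and is exactly 22050 samples at 44100 Hz.
import wave
import numpy as np

SAMPLE_RATE = 44100
F_LOW, F_HIGH = 770.0, 1336.0

n = int(SAMPLE_RATE * 0.5)                    # 22050 samples
t = np.arange(n) / SAMPLE_RATE
tone = 0.5 * np.sin(2 * np.pi * F_LOW * t) + 0.5 * np.sin(2 * np.pi * F_HIGH * t)

# Convert to 16-bit PCM and write a mono WAV; AVAudioPlayer can then
# repeat it indefinitely with numberOfLoops = -1.
pcm = (tone * 32767).astype(np.int16)
with wave.open("dtmf_loop.wav", "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(SAMPLE_RATE)
    w.writeframes(pcm.tobytes())
```

Because both sines start at zero phase and the buffer ends exactly on a full combined period, sample 22050 would equal sample 0, so the wrap-around is seamless.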

What is a good algorithm to detect silence over a variety of recording environments?

My app processes samples from microphone audio streams. The task I'm asking about: programmatically make a good guess at what ranges of the audio stream samples should be considered signal versus noise. The "signal", in this case, is human speech.
The audio I'm getting from users comes from recording environments that I can't control or know much about. These users may be speaking into a professional microphone from a treated space or into a crummy laptop mic in their living room. Very noisy environments with excessive background noise, e.g. a busy restaurant, are outside of what I need to accommodate.
It would make the analysis simpler, but I don't want to request the user to record a room noise sample within my app. And I don't want the user to manually designate a range of audio as silence.
If I'm just looking at recorded audio within a DAW, it's simple and intuitive to spot where silence (nobody talking) is within the waveform: there are stretches of relatively flat horizontal lines that sit closer to negative infinity dB (absolute silence) than any other flat lines.
I have no problems using various sound APIs to access samples. I've omitted technical specifics because I'm just asking about a suitable algorithm, and even if an existing library were available, I'd prefer my own implementation.
What algorithm is good for classifying sample ranges as silence for the requirements I've described?
Here is one simple algorithm that has its pros and cons:
Perform the noise-floor calculation described below once. Or, to respond continuously to changing conditions while recording, repeat the calculation at some reasonably responsive interval, such as once per second.
Let K = a constant ratio of noise to signal, e.g. 10%, expected in worst cases.
Let A = the sample with the highest amplitude from the microphone audio stream.
Noise floor is A * K
Once the noise floor has been calculated, you can apply it to any range of the audio stream samples to classify values above the noise floor as signal and below the noise floor as noise.
With the above algorithm, samples are assumed to be stored in a typical computer audio format, e.g. PCM, where 0 is silence and a negative/positive sample value is air pressure creating sound. Negative samples can be negated to positive values when evaluating them.
K could be something like 10%. It's the noise/signal ratio expected in one of the poorest recording environments you'd like to support; analyzing test recordings will show what the ratio should be. The higher the ratio, the more noise the algorithm will misclassify as signal. (A short sketch of the heuristic follows the pros and cons below.)
Pros:
Easy to implement.
Computationally inexpensive. O(n) for a single pass over a sample array to find the highest peak value.
Cons:
It depends on the samples used to calculate the noise floor actually containing signal (speech). So there has to be some way, outside of this algorithm, of knowing that the samples contain signal.
Any loud noise that isn't speech but has a higher amplitude, e.g. a hand clap, can raise the noise floor above the speech level, causing speech to be misclassified as noise.
The K value is a fudge factor. There are more direct ways to detect the actual noise floor from the samples.
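A minimal sketch of the heuristic, assuming Python with numpy and raw 16-bit mono PCM (the file name and K are illustrative):

```python
import numpy as np

K = 0.10  # worst-case noise/signal ratio; tune against test recordings

def noise_floor(samples: np.ndarray) -> float:
    # A = highest absolute amplitude in the window; a single O(n) pass.
    # Cast up before abs() so -32768 doesn't overflow int16.
    return float(np.abs(samples.astype(np.int32)).max()) * K

def is_signal(samples: np.ndarray, floor: float) -> np.ndarray:
    # True where a sample counts as signal, False where it's noise.
    return np.abs(samples.astype(np.int32)) > floor

samples = np.fromfile("speech.raw", dtype=np.int16)  # hypothetical input
mask = is_signal(samples, noise_floor(samples))
```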

How to calculate the number of frames extracted at a certain FPS?

I have a bunch of videos in a folder at different fps. I need to calculate the number of frames that can be extracted at 0.5 fps.
It's kind of a simple mathematical problem; I need help with the formula.
Thanks
If the duration of the video is t seconds (and assuming that's when ffmpeg terminates), extracting at 0.5 fps should ideally return you about t × 0.5 = t/2 frames, i.e. one frame every two seconds.
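A hedged sketch in Python: it reads the duration with ffprobe (assumed to be installed) and computes the expected count; the clip name is illustrative.

```python
import math
import subprocess

def expected_frames(path: str, rate: float = 0.5) -> int:
    # Ask ffprobe for the container duration in seconds.
    out = subprocess.check_output([
        "ffprobe", "-v", "error",
        "-show_entries", "format=duration",
        "-of", "default=noprint_wrappers=1:nokey=1",
        path,
    ])
    duration = float(out.decode().strip())
    # One frame every 1/rate seconds, so roughly duration * rate frames;
    # ffmpeg may emit a partial extra frame, hence ceil.
    return math.ceil(duration * rate)

print(expected_frames("clip.mp4"))
```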

Reducing one frequency in a song

How would I take a song input and output the same song without certain frequencies?
Based on my research so far, the song should be broken into chunks, each chunk FFT'd, the target frequencies reduced, each chunk inverse-FFT'd, and the chunks stitched back together. However, I am unsure whether this is the right approach to take, and if so, how I would convert from the audio to the FFT input (what seems to be a vector/matrix), how to reduce the target frequencies, how to convert back from the FFT output to audio, and how to restitch.
For background, my grandpa loves music. However, recently he cannot listen to it, as he has become hypersensitive to certain frequencies within songs. I'm a high school student who has some background coding, and am just getting into algorithmic work and thus have very little experience using these algorithms. Please excuse me if these are basic questions; any pointers would be helpful.
EDIT: So far, I've understood the basic premise of the FFT (through the basic 3Blue1Brown YouTube videos and the like) and that it is available through scipy/numpy, and I've figured out how to convert from YouTube to 0.25-second chunks in WAV format.
Your approach is right.
On your sub-questions:
from the audio to FFT input - assign the audio samples to the real part of a complex signal, with the imaginary part zero (a real-input FFT such as rfft does this for you)
how to reduce the target frequencies - multiply the FFT bins near the target frequency by a smooth function (to diminish artifacts) that is zero at that frequency and rises back to 1.0 some bins away
how to convert back - just take the inverse FFT (don't forget the scale multiplier, typically 1/N, if your library doesn't apply it) and copy the real part into the audio channel
Also consider using a simpler digital filter: a band-stop or notch filter.
Arbitrarily found examples: example1, example2
(Calculating the filter parameters may require some understanding of DSP.)
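A minimal sketch of the FFT-mask approach in Python with numpy/scipy (which the question already mentions); for brevity it filters the whole file in one FFT rather than in 0.25 s chunks, and target_hz/width_hz are illustrative:

```python
import numpy as np
from scipy.io import wavfile

rate, audio = wavfile.read("song.wav")        # assumes 16-bit mono
spectrum = np.fft.rfft(audio)                 # real-input FFT
freqs = np.fft.rfftfreq(len(audio), d=1.0 / rate)

target_hz, width_hz = 4000.0, 200.0           # frequency to reduce, notch width
# Smooth raised-cosine mask: 0 at the target frequency, back to 1.0 at
# width_hz away, to limit the ringing a hard cutoff would cause.
dist = np.abs(freqs - target_hz)
mask = np.where(dist < width_hz, 0.5 - 0.5 * np.cos(np.pi * dist / width_hz), 1.0)

# numpy's irfft already applies the 1/N scaling mentioned above.
filtered = np.fft.irfft(spectrum * mask, n=len(audio))
wavfile.write("song_filtered.wav", rate, filtered.astype(np.int16))
```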

Canceling noise given an LPCM array at 44100 samples per second

I have an array of LPCM data at 44100 samples per second. Actually, I have two channels' worth of data.
Every 11.61 milliseconds I get around 512 samples.
Now I want to follow the directions on How to cancel noise from audio
However, that explanation assumes the input is a sinusoidal wave.
Should I convert my LPCM to sinusoidal waves to cancel the noise? That is, am I required to run FFT on the LPCM in order to apply this technique?
And if so, how do I convert the resulting waveforms back to LPCM so they can be played?
If you need to convert back from the frequency domain to the time domain, you can just use the inverse Fourier transform. Otherwise, there are plenty of other noise-reduction algorithms.
Where do you want to cancel the noise? To cancel noise at the detector (where you have the microphone), you simply need to invert the signal (swap the + and - cables), match the amplitude, and shape the frequency response to correct for your equipment. To cancel noise at some other point on the line joining source and microphone, you also need to add a delay (if cancelling further away) or somehow "advance" the sound (if cancelling between source and microphone). If cancelling off-axis, things get more complicated (and you need both signals).
In the complicated cases (off-axis, or before the microphone) you need some kind of more advanced signal processing. One way is to use FFTs, but it may be more efficient to find approximations that use digital filters.
I would guess that Bose headphones and the like use signal inversion, amplification, and some fairly simple frequency shaping, plus perhaps some kind of feedback detection (to avoid deafening people if it all goes wrong).
Update: here is a really good paper on how such headphones work. They use various approaches, including training a filter on white noise, and it's more complex than I guessed above.
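A toy sketch of the simplest case above (inversion at the detector), assuming the signals are time-aligned numpy arrays; a real system also has to match amplitude and compensate for delay and equipment frequency response:

```python
import numpy as np

rate = 44100
t = np.arange(rate) / rate                 # one second of audio
noise = 0.3 * np.sin(2 * np.pi * 100 * t)  # illustrative 100 Hz hum

anti_noise = -noise                        # invert the signal (swap + and -)
residual = noise + anti_noise              # ~zero if perfectly matched
print(np.max(np.abs(residual)))            # 0.0 in this ideal case
```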

Finding the frequency of a sine wave from the microphone

I am looking for a way to get the frequency of a sine wave from a tape recorder plugged into the microphone socket on a Windows PC. It's for a small project I'm working on to see whether I can store data on audio tape, so I'll be writing frequencies to the tape and reading them back.
Thanks
A simple way to estimate the frequency of a sine wave is to do a spectrum analysis and look for the "loudest" frequency (roughly):
take one chunk of audio (for example 256 samples) from the audio file, or from the audio input
window the audio chunk ^
compute its power spectrum (using an FFT algorithm ^^)
look for the dominant frequency, which should be the frequency of the sine wave
repeat while you still have audio data
I expect it to work well with simple tones.
^ see http://en.wikipedia.org/wiki/Window_function
^^ there are plenty of FFT implementations available, for example http://www.fftw.org/
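A minimal sketch of the steps above, assuming Python with numpy (chunk size, window, and test tone are illustrative):

```python
import numpy as np

def dominant_frequency(chunk: np.ndarray, rate: int) -> float:
    windowed = chunk * np.hanning(len(chunk))      # window the chunk
    power = np.abs(np.fft.rfft(windowed)) ** 2     # power spectrum
    freqs = np.fft.rfftfreq(len(chunk), d=1.0 / rate)
    return float(freqs[np.argmax(power)])          # dominant frequency

rate = 44100
t = np.arange(4096) / rate
tone = np.sin(2 * np.pi * 1000.0 * t)              # 1 kHz test tone
print(dominant_frequency(tone, rate))              # ~1000 Hz, bin-limited
```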
If there's only one sine wave present at any given time, you can count how many times per second the signal changes from positive to negative (in other words, crosses zero), or the other way around, and that count gives you the frequency. Or you can measure the time between consecutive zero crossings. This is a pretty simple and cheap solution.
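A hedged sketch of the zero-crossing count, assuming a clean single sine in a numpy float array:

```python
import numpy as np

def zero_crossing_frequency(samples: np.ndarray, rate: int) -> float:
    # Count positive-to-negative crossings: one per cycle.
    down = np.count_nonzero((samples[:-1] > 0) & (samples[1:] <= 0))
    return down * rate / len(samples)   # crossings per second = Hz

rate = 44100
t = np.arange(rate) / rate              # one second of audio
print(zero_crossing_frequency(np.sin(2 * np.pi * 440.0 * t), rate))  # ~440.0
```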
