Adding an effect Audio Unit onto my current setup - core-audio

I am creating a toy synth project for my iPhone where I can drag my finger around and the frequency and volume change based on the x & y coordinates. It works beautifully, sounds great, and the color even changes based on the tone and pitch of the sound. Yahoo. But now I am trying to add effects to this sound and I have reached some sort of confusion.
Currently, I am not using an AUGraph. I am simply calling upon the RemoteIO unit and assigning it a render callback function in which I feed it a continuous stream of sample values to form a sine wave, and I hear a clear 440.00 Hz sine wave play out of my iPhone 6+, which is very nice.
But if I want to add reverb as a second component here, I'm not sure what to do, because isn't the output unit the "last" unit before the audio hardware? How can I set up another audio unit called reverbUnit and connect it to my current RemoteIO? It doesn't even make sense. One needs three units here: the first to generate the sine wave, the second to apply the reverb, and then the third to push to the hardware.
What am I missing? Can I tack on reverb just by using the RemoteIO by itself?

Yes, the best way would be to use a graph.
The RemoteIO unit actually uses a pull architecture (not push). The render callback is where you provide your input samples (your sine wave data). It calls back every X milliseconds and asks you to copy samples into the buffer it supplies. So it pulls your data. You are NOT constructing a buffer and "pushing" it into the audio system on your own terms; rather, you copy data in as it requests more (pull).
So if you want to add more audio units, you need to connect them with a graph. The RemoteIO unit would be the last one in the chain. A reverb unit would be added before the RemoteIO. So it would look like this:
[ Reverb ] - [ RemoteIO (output element) ]
Your reverb output goes to the RemoteIO input. When the RemoteIO needs samples, it PULLs from the reverb unit. The graph connection takes care of the RemoteIO passing the pull on to the reverb unit, which in turn triggers the callback for your reverb unit. So you now write your samples in the reverb's input callback.
Here's what happens:
1. The hardware says "gimme some samples."
2. It invokes your RemoteIO render callback.
3. Your RemoteIO invokes your reverb input callback (via the graph connection).
4. You provide samples to the reverb input callback.
The graph makes it nice because you can just "connect" things together and add/remove things in the signal chain. It just keeps pulling through the chain and you eventually provide data to the first unit in the chain.
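For reference, the input callback you end up attaching to the reverb node is an ordinary AURenderCallback that fills the buffers it is handed. Here is a minimal sketch (the SineState type and names are made up for illustration, and it assumes a non-interleaved Float32 stream format):

#include <AudioToolbox/AudioToolbox.h>
#include <math.h>

typedef struct {
    double phase;       // current oscillator phase
    double frequency;   // e.g. 440.0
    double sampleRate;  // e.g. 44100.0
} SineState;

// Pulled by the reverb unit (via the graph) whenever it needs more input.
static OSStatus SineRenderCallback(void *inRefCon,
                                   AudioUnitRenderActionFlags *ioActionFlags,
                                   const AudioTimeStamp *inTimeStamp,
                                   UInt32 inBusNumber,
                                   UInt32 inNumberFrames,
                                   AudioBufferList *ioData)
{
    SineState *state = (SineState *)inRefCon;
    double phaseIncrement = 2.0 * M_PI * state->frequency / state->sampleRate;

    // Non-interleaved: one buffer per channel, each holding inNumberFrames floats.
    for (UInt32 buf = 0; buf < ioData->mNumberBuffers; buf++) {
        Float32 *out = (Float32 *)ioData->mBuffers[buf].mData;
        double phase = state->phase;
        for (UInt32 frame = 0; frame < inNumberFrames; frame++) {
            out[frame] = (Float32)sin(phase);
            phase += phaseIncrement;
        }
    }
    state->phase = fmod(state->phase + phaseIncrement * inNumberFrames, 2.0 * M_PI);
    return noErr;
}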
If you have never made a graph before, be sure to examine the return code of EVERY SINGLE STEP. All of these functions return an OSStatus error code:
AUGraphOpen, AUGraphNodeInfo, AUGraphConnectNodeInput, AUGraphInitialize, AudioUnitSetProperty, AUGraphStart, etc.
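A rough sketch of the graph construction itself might look like this (it assumes Apple's Reverb2 effect and the SineRenderCallback sketched above; error handling is reduced to a simple check of every OSStatus):

#include <AudioToolbox/AudioToolbox.h>
#include <stdio.h>
#include <stdlib.h>

static void Check(OSStatus err, const char *op)
{
    if (err != noErr) {
        fprintf(stderr, "%s failed (%d)\n", op, (int)err);
        exit(1);
    }
}

void BuildReverbGraph(AUGraph *outGraph, SineState *sineState)
{
    AUGraph graph;
    Check(NewAUGraph(&graph), "NewAUGraph");

    AudioComponentDescription ioDesc = {0}, reverbDesc = {0};
    ioDesc.componentType             = kAudioUnitType_Output;
    ioDesc.componentSubType          = kAudioUnitSubType_RemoteIO;
    ioDesc.componentManufacturer     = kAudioUnitManufacturer_Apple;
    reverbDesc.componentType         = kAudioUnitType_Effect;
    reverbDesc.componentSubType      = kAudioUnitSubType_Reverb2;
    reverbDesc.componentManufacturer = kAudioUnitManufacturer_Apple;

    AUNode ioNode, reverbNode;
    Check(AUGraphAddNode(graph, &ioDesc, &ioNode), "AUGraphAddNode (RemoteIO)");
    Check(AUGraphAddNode(graph, &reverbDesc, &reverbNode), "AUGraphAddNode (Reverb)");
    Check(AUGraphOpen(graph), "AUGraphOpen");

    // Reverb output bus 0 -> RemoteIO output element (bus 0).
    Check(AUGraphConnectNodeInput(graph, reverbNode, 0, ioNode, 0),
          "AUGraphConnectNodeInput");

    // The sine generator now feeds the reverb's input, not the RemoteIO's.
    AURenderCallbackStruct cb = { SineRenderCallback, sineState };
    Check(AUGraphSetNodeInputCallback(graph, reverbNode, 0, &cb),
          "AUGraphSetNodeInputCallback");

    Check(AUGraphInitialize(graph), "AUGraphInitialize");
    Check(AUGraphStart(graph), "AUGraphStart");
    *outGraph = graph;
}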
After you initialize your graph, you can display it to the console using CAShow(_audioGraph) and get some output like:
Member Nodes:
node 1: 'auou' 'rioc' 'appl', instance 0x7a141060 O I
node 2: 'aumx' 'mcmx' 'appl', instance 0x7a021810 O I
node 3: 'aufx' 'rvb2' 'appl', instance 0x7a0a84a0 O I
node 4: 'aufc' 'splt' 'appl', instance 0x7a025b90 O I
node 5: 'aufc' 'conv' 'appl', instance 0x7a24b9e0 O I
node 6: 'augn' 'afpl' 'appl', instance 0x7a22a220 O
Connections:
node 2 bus 0 => node 3 bus 0 [ 2 ch, 44100 Hz, 'lpcm' (0x00000029) 32-bit little-endian float, deinterleaved]
node 3 bus 0 => node 1 bus 0 [ 2 ch, 44100 Hz, 'lpcm' (0x00000029) 32-bit little-endian float, deinterleaved]
node 4 bus 0 => node 2 bus 0 [ 2 ch, 44100 Hz, 'lpcm' (0x00000029) 32-bit little-endian float, deinterleaved]
node 5 bus 0 => node 4 bus 0 [ 2 ch, 44100 Hz, 'lpcm' (0x00000029) 32-bit little-endian float, deinterleaved]
Input Callbacks:
{0x6ccf0, 0x7a13da00} => node 5 bus 0 [2 ch, 44100 Hz]
CurrentState:
mLastUpdateError=0, eventsToProcess=F, isInitialized=T, isRunning=F

Related

Synchronisation for audio decoders

There's the following setup (it's basically a pair of TWS earbuds and a smartphone):
2 audio sink devices (the buds), both connected to the same source device. One of these devices is primary (and is responsible for handling the connection), the other is secondary (and simply sniffs the data).
The source device transmits a stream of encoded data, and the sink devices need to decode and play it in sync with each other. The problem is that there's a considerable delay between the two receivers (~5 ms at 300 kbps, ~10 ms at 600 kbps and at 900 kbps).
The synchronisation mechanism that is already implemented simply doesn't seem to work, so my only option appears to be implementing another one.
It's possible to send messages between the buds (but because this uses the same radio interface as the sink-to-source communication, only a small number of bytes at relatively long intervals can be transferred, i.e. 48 bytes per 300 ms, maybe a few times more, but probably not by much) and to control the decoder library.
I tried the following simple algorithm: every 50 milliseconds the secondary sends the primary a message containing the number of packets it has decoded. The primary receives it and updates the state of its decoder accordingly: it only decodes if the difference between the number of frames it has already decoded and the count received from the peer is between 0 and 100 (every frame is 2.(6) ms), and the cycle continues.
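To make that concrete, the gating logic described above boils down to something like this (a sketch only; all names are made up and the message handling is simplified):

#include <stdbool.h>
#include <stdint.h>

#define MAX_LEAD_FRAMES 100          /* each frame is roughly 2.67 ms */

static uint32_t local_decoded = 0;   /* frames decoded so far on the primary */
static uint32_t peer_decoded  = 0;   /* last count reported by the secondary */

/* Called when the secondary's periodic (~every 50 ms) report arrives. */
void on_peer_report(uint32_t decoded_count)
{
    peer_decoded = decoded_count;
}

/* Called before decoding each incoming frame on the primary. */
bool should_decode_next_frame(void)
{
    int32_t lead = (int32_t)(local_decoded - peer_decoded);
    return lead >= 0 && lead <= MAX_LEAD_FRAMES;
}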
This actually only makes things worse: the latency is now about 200 ms or even higher.
Is there something that could be done to improve my synchronisation method, or would I be better off using something else? If so, what would be best in such a case? Fixing the already existing implementation would probably be the best way, but it seems to be closed-source, so I cannot modify it.

STM32F411: I need to send a lot of data over USB at high speed

I'm using an STM32F411 with the USB CDC library, and the maximum speed for this library is ~1 Mb/s.
I'm creating a project where I have 8 microphones connected to the ADC lines (this part works fine). I need a 16-bit signal, so I'm increasing the accuracy by summing the first 16 readings from one line (the ADC gives only a 12-bit signal). In my project I need 96k 16-bit samples per second for one line, so that's 0.768M samples for all 8 lines. This data needs 12,000 Kb of space, but the STM32 has only 128 Kb of SRAM, so I decided to send about 120 chunks of roughly 100 Kb each per second.
The conclusion is that I need ~11.72 Mb/s to send this.
The problem is that I'm unable to do that because CDC USB limits me to ~1 Mb/s.
The question is how to increase the USB speed to 12 Mb/s on the STM32F4. I need some pointers or a library.
Or should I maybe set up an "audio device" in CubeMX?
If the small "b" in your question means bytes, the answer is: it is not possible, as your micro has a Full Speed USB peripheral whose maximum speed is 12 Mbits per second.
If it means bits, your assumption of a 1 Mb (bit) limit is wrong. But you will not reach 12 Mbit of payload transfer either.
You may try to write your own class (only if "b" means bits), but I'm afraid you will not find a ready-made library. You will also need to write the device driver for the host computer.
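To put numbers on that (using the figures from the question; the USB rate here is the nominal bus rate, not an achievable payload rate):

#include <stdio.h>

int main(void)
{
    const double lines       = 8;
    const double sample_rate = 96000;   /* samples per second per line */
    const double bits        = 16;

    double required_bps = lines * sample_rate * bits;  /* 12,288,000 bit/s */
    double usb_fs_bps   = 12e6;                        /* Full Speed bus rate */

    printf("required : %.2f Mbit/s (%.2f Mibit/s)\n",
           required_bps / 1e6, required_bps / (1024.0 * 1024.0));
    printf("USB FS   : %.0f Mbit/s raw; usable payload is considerably less\n",
           usb_fs_bps / 1e6);
    return 0;
}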

av_interleaved_write_frame returns 0 but no data is written

I use ffmpeg to stream encoded AAC data, and I use av_interleaved_write_frame() to write each frame.
The return value is 0, which means success according to the documentation:
Write a packet to an output media file ensuring correct interleaving.
The packet must contain one audio or video frame. If the packets are already correctly interleaved, the application should call av_write_frame() instead as it is slightly faster. It is also important to keep in mind that completely non-interleaved input will need huge amounts of memory to interleave with this, so it is preferable to interleave at the demuxer level.
Parameters
s media file handle
pkt The packet containing the data to be written. pkt->buf must be set to a valid AVBufferRef describing the packet data. Libavformat takes ownership of this reference and will unref it when it sees fit. The caller must not access the data through this reference after this function returns. This can be NULL (at any time, not just at the end), to flush the interleaving queues. Packet's stream_index field must be set to the index of the corresponding stream in s.streams. It is very strongly recommended that timing information (pts, dts duration) is set to correct values.
Returns
0 on success, a negative AVERROR on error.
However, I found that no data was written.
What did I miss? How do I solve it?
av_interleaved_write_frame() may hold data in memory before it writes it out. Interleaving is the process of taking multiple streams (one audio stream and one video stream, for example) and serializing them in a monotonic order. So if you write an audio frame, it will be kept in memory until you write a video frame that comes 'later'. Once a later video frame is written, the audio frame can be flushed. This way streams can be processed at different speeds or in different threads, but the output is still monotonic. If you are only writing one stream (one AAC stream, no video), then use av_write_frame() as suggested.
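If that is the situation, the usual fix is to flush the interleaving queue at end of stream; a minimal sketch (assuming a fully set-up AVFormatContext whose packets already carry correct stream_index, pts and dts):

#include <libavformat/avformat.h>

int finish_output(AVFormatContext *fmt_ctx)
{
    int ret;

    /* A NULL packet flushes whatever is still held in the interleaving queue. */
    ret = av_interleaved_write_frame(fmt_ctx, NULL);
    if (ret < 0)
        return ret;

    /* Writes the container trailer; must be called exactly once at the end. */
    return av_write_trailer(fmt_ctx);
}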

MME Audio Output Buffer Size

I am currently playing around with outputting FP32 samples via the old MME API (the waveOutXxx functions). The problem I've bumped into is that if I provide a buffer length that does not evenly divide the sample rate, audible clicks appear in the audio stream; when recorded, it looks like some of the samples are lost (I'm generating a sine wave for the test). Currently I am using the "magic" value of 2205 samples per buffer for a 44100 Hz sample rate.
The question is: does anybody know the reason for these dropouts, and is there some magic formula that provides a way to compute the "proper" buffer size?
Safe alignment of data buffers is given by the nBlockAlign value of the WAVEFORMATEX structure:
Software must process a multiple of nBlockAlign bytes of data at a time. Data written to and read from a device must always start at the beginning of a block. For example, it is illegal to start playback of PCM data in the middle of a sample (that is, on a non-block-aligned boundary).
For PCM formats this is the number of bytes for a single sample across all channels. Non-PCM formats have their own alignment requirements, often equal to the length of a format-specific block, e.g. 20 ms.
Back when waveOutXxx was the primary API for audio, carrying over unaligned bytes was an unreasonable burden on the API and unneeded performance overhead. Nowadays this API is a compatibility layer on top of other audio APIs, and I suppose that unaligned trailing bytes are simply stripped so that the rest of the content can still play, rather than the whole buffer being rejected over a small, non-fatal inaccuracy on the caller's part.
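As a concrete illustration, a buffer size that is always block-aligned can be derived from the format itself; a small sketch, assuming 32-bit float stereo at 44100 Hz (names are made up):

#include <windows.h>
#include <mmreg.h>

static WAVEFORMATEX MakeFloatStereoFormat(void)
{
    WAVEFORMATEX wfx = {0};
    wfx.wFormatTag      = WAVE_FORMAT_IEEE_FLOAT;
    wfx.nChannels       = 2;
    wfx.nSamplesPerSec  = 44100;
    wfx.wBitsPerSample  = 32;
    wfx.nBlockAlign     = wfx.nChannels * wfx.wBitsPerSample / 8;  /* 8 bytes per frame */
    wfx.nAvgBytesPerSec = wfx.nSamplesPerSec * wfx.nBlockAlign;
    return wfx;
}

/* Returns a buffer size in bytes for roughly `ms` milliseconds that is a
   whole number of frames, i.e. a multiple of nBlockAlign. */
static DWORD BlockAlignedBufferBytes(const WAVEFORMATEX *wfx, DWORD ms)
{
    DWORD frames = wfx->nSamplesPerSec * ms / 1000;
    return frames * wfx->nBlockAlign;
}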
If you fill the audio buffer with sine samples and play it looped, it will very easily click unless the buffer length is a whole number of periods of the frequency, as you said... the audible click is in fact a discontinuity in the wave. A more advanced technique is to fill the buffer dynamically: set a callback notification for when the buffer pointer advances and fill the buffer with the appropriate data at the appropriate offset. I would use a much larger buffer, as 2205 samples is too short to get an async notification, calculate the data, and write the buffer, all while playing; but that depends on CPU power.

DirectShow push sources, syncing and timestamping

I have a filter graph that takes raw audio and video input and then uses the ASF Writer to encode them to a WMV file.
I've written two custom push source filters to provide the input to the graph. The audio filter just uses WASAPI in loopback mode to capture the audio and send the data downstream. The video filter takes raw RGB frames and sends them downstream.
For both the audio and video frames, I have the performance counter value for the time the frames were captured.
Question 1: If I want to properly timestamp the video and audio, do I need to create a custom reference clock that uses the performance counter or is there a better way for me to sync the two inputs, i.e. calculate the stream time?
The video input is captured from a Direct3D buffer somewhere else and I cannot guarantee the framerate, so it behaves like a live source. I always know the start time of a frame, of course, but how do I know the end time?
For instance, let's say the video filter ideally wants to run at 25 FPS, but due to latency and so on, frame 1 starts perfectly at the 1/25th mark but frame 2 starts later than the expected 2/25th mark. That means there's now a gap in the graph since the end time of frame 1 doesn't match the start time of frame 2.
Question 2: Will the downstream filters know what to do with the delay between frame 1 and 2, or do I manually have to decrease the length of frame 2?
One option is to omit time stamps, but this might result in filters failing to process the data. Another option is to use the System Reference Clock to generate time stamps; either way, this is preferable to directly using the performance counter as a time stamp source.
Yes, you need to time stamp the video and audio in order to keep them in sync; this is the only way to tell that the data is actually attributed to the same time.
Video samples effectively don't have a duration: you can omit the stop time or set it equal to the start time, and a gap between one video frame's stop time and the next frame's start time has no consequences.
Renderers are free to choose whether to respect time stamps or not; with audio, you will of course want a smooth stream without gaps in the time stamps.

Resources