This is a follow-up to a previous question:
OSX CoreAudio: Getting inNumberFrames in advance - on initialization?
I am trying to figure out which AudioUnit API, if any, can be used to set inNumberFrames or the preferred I/O buffer duration of an input callback for a single HAL audio component instance in OS X (not a plug-in!).
While I understand there is comprehensive documentation on how this can be achieved in iOS by means of the AVAudioSession API, I can neither figure out nor find documentation on setting these values in OS X, with any API.
The web is full of expert yet conflicting statements, ranging from "There is an Audio Unit API to request a sample rate and a preferred buffer duration..." to "You can definitely get the number of frames, but only for the current callback call...".
Is there a way of at least getting (and adapting to) the inNumberFrames or the audio buffer length offered by the system for the input-selected sample rates in OS X? For example, for 44.1 kHz and its multiples (this seems to work partly), as well as for 48 kHz and its multiples (this doesn't seem to work at all, and I don't know what trick would allow adapting the buffer length to these values)? Here's the console printout:
Available 7 Sample Rates
Available Sample Rate value : 8000.000000
Available Sample Rate value : 16000.000000
Available Sample Rate value : 32000.000000
Available Sample Rate value : 44100.000000
Available Sample Rate value : 48000.000000
Available Sample Rate value : 88200.000000
Available Sample Rate value : 96000.000000
.mSampleRate = 48000.00
.mFormatID = 1819304813
.mBytesPerPacket = 8
.mFramesPerPacket = 1
.mBytesPerFrame = 8
.mChannelsPerFrame = 2
.mBitsPerChannel = 32
.mFormatFlags = 9
_mFormatHumanReadable = kAudioFormatFlagIsFloat
kAudioFormatFlagIsPacked
kLinearPCMFormatFlagIsFloat
kLinearPCMFormatFlagIsPacked
kLinearPCMFormatFlagsSampleFractionShift
kAppleLosslessFormatFlag_16BitSourceData
kAppleLosslessFormatFlag_24BitSourceData
expectedInNumberFrames = 512
Couldn't render in current context (Error -10863)
The expected inNumberFrames is read from the system:
UInt32 expectedInNumberFrames = 0;
UInt32 propSize = sizeof(UInt32);
// Ask the AUHAL instance for the device's current I/O buffer size, in frames.
AudioUnitGetProperty(gInputUnitComponentInstance,
                     kAudioDevicePropertyBufferFrameSize,
                     kAudioUnitScope_Global,
                     0,
                     &expectedInNumberFrames,
                     &propSize);
Thanks in advance for pointing me in the right direction!
See this Apple Technical Note: https://developer.apple.com/library/mac/technotes/tn2321/_index.html#//apple_ref/doc/uid/DTS40013499-CH1-THE_I_O_BUFFER_SIZE
See the OS X example code in this technical note for GetIOBufferFrameSizeRange(), GetCurrentIOBufferFrameSize(), and SetCurrentIOBufferFrameSize().
Note that there is an API property returning an allowed range, and an error return on the property setter. Also note the various Mac power saving modes may change the buffer size while an app is running, so the actual buffer size, inNumberFrames, may not stay constant, or even be known until the Audio Unit starts running.
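For reference, here is a minimal sketch (in C, not the technote's code verbatim) of what GetIOBufferFrameSizeRange(), GetCurrentIOBufferFrameSize(), and SetCurrentIOBufferFrameSize() boil down to. It assumes deviceID holds the AudioDeviceID of the input device attached to your AUHAL instance; every OSStatus should be checked in real code.
#include <CoreAudio/CoreAudio.h>

static void ConfigureIOBufferSize(AudioDeviceID deviceID)
{
    // 1. Query the allowed range of I/O buffer sizes, in frames.
    AudioObjectPropertyAddress addr = {
        kAudioDevicePropertyBufferFrameSizeRange,
        kAudioObjectPropertyScopeGlobal,
        kAudioObjectPropertyElementMaster
    };
    AudioValueRange range = { 0, 0 };
    UInt32 size = sizeof(range);
    AudioObjectGetPropertyData(deviceID, &addr, 0, NULL, &size, &range);
    // range.mMinimum .. range.mMaximum are the frame counts the device accepts.

    // 2. Read the current I/O buffer size.
    addr.mSelector = kAudioDevicePropertyBufferFrameSize;
    UInt32 currentFrames = 0;
    size = sizeof(currentFrames);
    AudioObjectGetPropertyData(deviceID, &addr, 0, NULL, &size, &currentFrames);

    // 3. Request a new I/O buffer size; it must lie inside the range above, and
    //    the HAL may still adjust it later, so keep reading inNumberFrames
    //    defensively in the render callback.
    UInt32 requestedFrames = 512;
    AudioObjectSetPropertyData(deviceID, &addr, 0, NULL,
                               sizeof(requestedFrames), &requestedFrames);
}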
If you get unusual buffer sizes (not a power of 2), it may be that the audio hardware in a particular Apple product model has a fixed or limited range of sample rates. In that case, if the app requests a sample rate not supported by the actual codec chips on the circuit board, OS software resamples the audio, and the buffers sent to the audio unit callbacks are resized accordingly.
I have been working on my ANC project. For this I have two microphone inputs and one loudspeaker output, but initially I am using a single microphone and dspStreamingPassthrough to pass the microphone input to the loudspeaker. Here is my code:
% Initialization
numIterations = 500;

% Construct sources (for all inputs)
src1 = dsp.AudioRecorder('DeviceName','Mikrofon (USB-Audiogerät)','NumChannels',1);

% Construct sinks (for all outputs)
sink1_1 = dsp.SpectrumAnalyzer('SampleRate',44100, ...
    'PlotAsTwoSidedSpectrum',false, ...
    'ShowLegend',true);
sink1_2 = dsp.TimeScope('BufferLength',44100, ...
    'SampleRate',44100, ...
    'TimeSpan',1, ...
    'ShowLegend',true, ...
    'ShowGrid',true, ...
    'YLimits',[-0.5 0.5]);
sink1_3 = dsp.AudioPlayer('BufferSizeSource','Property','BufferSize',1024, ...
    'QueueDuration',0,'OutputNumUnderrunSamples',true);
sink1_3.DeviceName = 'Lautsprecher (USB-Audiogerät)';

% Stream processing loop
clear dspStreamingPassthrough;
for i = 1:numIterations
    % Sources
    in1 = step(src1);
    % User Algorithm
    out1 = dspStreamingPassthrough(in1);
    % Sinks (step the player once per iteration; it returns the underrun count)
    nUnderrun = step(sink1_3,out1);
    step(sink1_1,out1);
    step(sink1_2,out1);
end

% Clean up
release(src1);
release(sink1_1);
release(sink1_2);
release(sink1_3);
I am using the Windows DirectSound audio driver (I cannot use the ASIO driver because I cannot access individual audio device names!). Right now I have an audio latency of 1.2 seconds, i.e. if I say "hello" into the microphone, the speaker says "hello" 1.2 seconds later (and this is without any processing of the audio input data, just dspStreamingPassthrough). How can I reduce this enormous delay?
For my project with a 1 metre long pipe (air duct), I should be able to process the data in 1.7 ms or less! I have tried the lowest 'BufferSize' and lowest 'QueueDuration' possible.
What other parameters can influence the speed of this process? Is it possible with MATLAB or not?
PS: sorry for posting the whole code. I am using a cheap sound card (7 euros).
DirectSound has much higher latency than ASIO; it is not designed for low-latency applications. DSP System Toolbox does not support WASAPI yet.
The latency performance of these objects was greatly improved starting in R2015a. I am not sure which version you are running, but try to upgrade to R2015a or later.
As for tuning your latency, try the following:
* Set the Queue Duration property to 0 seconds for both player and recorder.
* For the recorder, match the SamplesPerFrame and BufferSize properties.
* For the player, ensure that the size of the data matches the BufferSize property.
The BufferSize property is the size at which the sound card operates.
If you get drops, increase the BufferSize value. There can be many reasons for the drops:
* The algorithm you are running takes longer than BufferSize/SampleRate seconds per frame.
* The sound card is not able to operate at this BufferSize. Some sound cards allow you to modify this when using ASIO.
* Limitation of the player/recorder objects.
Hope this helps.
Dinesh
I need a step-by-step walkthrough on how to use AudioConverterFillComplexBuffer and its callback. No, don't tell me to read the Apple docs. I do everything they say and the conversion always fails. No, don't tell me to go look for examples of AudioConverterFillComplexBuffer and its callback in use - I've duplicated about a dozen such examples, both line for line and modified, and the conversion always fails. No, there isn't any problem with the input data. No, it isn't an endian issue. No, the problem isn't my version of OS X.
The problem is that I don't understand how AudioConverterFillComplexBuffer works, so I don't know what I'm doing wrong. And nothing out there is helping me understand, because it seems like nobody on Earth really understands how AudioConverterFillComplexBuffer works either - from the people who actually use it (I spy cargo-cult programming in their code) to even the authors of Learning Core Audio and/or Apple itself (http://stackoverflow.com/questions/13604612/core-audio-how-can-one-packet-one-byte-when-clearly-one-packet-4-bytes).
This isn't just a problem for me, it's a problem for anybody who wants to program high-performance audio on the Mac platform. Threadbare documentation that's apparently wrong, and examples that don't work, are no fun.
Once again, to be clear: I NEED A STEP-BY-STEP WALKTHROUGH ON HOW TO USE AudioConverterFillComplexBuffer plus its callback, and so does the entire Mac developer community.
This is a very old question, but I think it is still relevant. I've spent a few days fighting this and have finally achieved a successful conversion. I'm certainly no expert, but I'll outline my understanding of how it works. Note that I'm using Swift, which I'm also just learning.
Here are the main function arguments:
inAudioConverter: AudioConverterRef: This one is simple enough, just pass in a previously created AudioConverterRef.
inInputDataProc: AudioConverterComplexInputDataProc: The very complex callback. We'll come back to this.
inInputDataProcUserData: UnsafeMutableRawPointer?: This is a reference to whatever data you may need to provide to the callback function. It is important because, even in Swift, the callback can't inherit context. E.g. you may need to access an AudioFileID or keep track of the number of packets read so far.
ioOutputDataPacketSize: UnsafeMutablePointer<UInt32>: This one is a little misleading. The name implies it's the packet size but reading the documentation we learn it's the total number of packets expected for the output format. You can calculate this as outPacketCount = frameCount / outStreamDescription.mFramesPerPacket.
outOutputData: UnsafeMutablePointer<AudioBufferList>: This is an audio buffer list which you need to have already initialized with enough space to hold the expected output data. The size can be calculated as byteSize = outPacketCount * outMaxPacketSize.
outPacketDescription: UnsafeMutablePointer<AudioStreamPacketDescription>?: This is optional. If you need packet descriptions, pass in a block of memory the size of outPacketCount * sizeof(AudioStreamPacketDescription).
As the converter runs, it will repeatedly call the callback function to request more data to convert. The main job of the callback is simply to read the requested number of packets from the source data. The converter will then convert the packets to the output format and fill the output buffer. Here are the arguments for the callback:
inAudioConverter: AudioConverterRef: The audio converter again. You probably won't need to use this.
ioNumberDataPackets: UnsafeMutablePointer<UInt32>: The number of packets to read. After reading, you must set this to the number of packets actually read (which may be less than the number requested if we reached the end).
ioData: UnsafeMutablePointer<AudioBufferList>: An AudioBufferList which is already configured except for the actual data. You need to initialise ioData.mBuffers.mData with enough capacity to hold the expected number of packets, i.e. ioNumberDataPackets * inMaxPacketSize. Set the value of ioData.mBuffers.mDataByteSize to match.
outDataPacketDescription: UnsafeMutablePointer<UnsafeMutablePointer<AudioStreamPacketDescription>?>?: Depending on the formats used, the converter may need to keep track of packet descriptions. You need to initialise this with enough capacity to hold the expected number of packet descriptions.
inUserData: UnsafeMutableRawPointer?: The user data that you provided to the converter.
So, to start you need to:
Have sufficient information about your input and output data, namely the number of frames and maximum packet sizes.
Initialise an AudioBufferList with sufficient capacity to hold the output data.
Call AudioConverterFillComplexBuffer.
And on each run of the callback you need to:
Initialise ioData with sufficient capacity to store ioNumberDataPackets of source data.
Initialise outDataPacketDescription with sufficient capacity to store ioNumberDataPackets of AudioStreamPacketDescriptions.
Fill the buffer with source packets.
Write the packet descriptions.
Set ioNumberDataPackets to the number of packets actually read.
return noErr if successful.
Here's an example where I read the data from an AudioFileID:
var converter: AudioConverterRef?
// User data holds an AudioFileID, the input max packet size, and a running count of packets read
var uData = (fRef, maxPacketSize, UnsafeMutablePointer<Int64>.allocate(capacity: 1))
err = AudioConverterNew(&inStreamDesc, &outStreamDesc, &converter)
err = AudioConverterFillComplexBuffer(converter!, { _, ioNumberDataPackets, ioData, outDataPacketDescription, inUserData in
    // Recover the user-data tuple passed in below
    let uData = inUserData!.load(as: (AudioFileID, UInt32, UnsafeMutablePointer<Int64>).self)
    // Allocate a buffer for the source data, sized from the max packet size stored in the user data
    ioData.pointee.mBuffers.mDataByteSize = uData.1
    ioData.pointee.mBuffers.mData = UnsafeMutableRawPointer.allocate(byteCount: Int(uData.1), alignment: 1)
    // Provide space for the packet descriptions the converter may need
    outDataPacketDescription?.pointee = UnsafeMutablePointer<AudioStreamPacketDescription>.allocate(capacity: Int(ioNumberDataPackets.pointee))
    // Read the packets from the file; AudioFileReadPacketData updates
    // ioNumberDataPackets to the number of packets actually read
    let err = AudioFileReadPacketData(uData.0, false, &ioData.pointee.mBuffers.mDataByteSize, outDataPacketDescription?.pointee, uData.2.pointee, ioNumberDataPackets, ioData.pointee.mBuffers.mData)
    // Advance the running packet count
    uData.2.pointee += Int64(ioNumberDataPackets.pointee)
    return err
}, &uData, &numPackets, &bufferList, nil)
Again, I'm no expert, this is just what I've learned by trial and error.
The documentation for Audio Queue Services under OS 10.6 now includes a pitch parameter:
kAudioQueueParam_Pitch
The number of cents to pitch-shift the audio queue’s playback, in the range -2400 through 2400 cents (where 1200 cents corresponds to one musical octave).
This parameter is usable only if the time/pitch processor is enabled.
Other sections of the same document still say that volume is the only available parameter, and I can't find any reference to the time/pitch processor mentioned above.
Does anyone know what this refers to? Directly writing a value to the parameter has no effect on playback (although no error is thrown), whereas writing the volume parameter does work.
Frustrating as usual with no support from Apple.
This was only available on OS X until iOS 7; if you look at AudioQueue.h you'll find it is conditionally available on iOS only from iOS 7 onwards. [Note: on re-reading I see you were referring to OS X, not iOS, but hopefully the following is cross-platform.]
Also, you need to enable the queue for time/pitch processing before setting the time/pitch algorithm, and only the Spectral algorithm supports pitch (all of them support rate):
result = AudioQueueNewOutput(&(pAqData->mDataFormat), aqHandleOutputBuffer, pAqData,
                             0, kCFRunLoopCommonModes, 0, &(pAqData->mQueue));

// Enable the time/pitch processor
UInt32 trueValue = 1;
AudioQueueSetProperty(pAqData->mQueue, kAudioQueueProperty_EnableTimePitch,
                      &trueValue, sizeof(trueValue));

// Select the Spectral algorithm, which supports both rate and pitch
UInt32 timePitchAlgorithm = kAudioQueueTimePitchAlgorithm_Spectral;
AudioQueueSetProperty(pAqData->mQueue, kAudioQueueProperty_TimePitchAlgorithm,
                      &timePitchAlgorithm, sizeof(timePitchAlgorithm));
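Once time/pitch is enabled and the Spectral algorithm is selected, the pitch itself is set with AudioQueueSetParameter; a minimal sketch (the +1200 cents value is just an example):
// Shift playback up one octave; kAudioQueueParam_Pitch accepts -2400..2400 cents.
AudioQueueSetParameter(pAqData->mQueue, kAudioQueueParam_Pitch, 1200.0f);
// The playback rate can be adjusted the same way via kAudioQueueParam_PlayRate.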
I am trying to get the volume of the audio heard by an input device using Core-Audio.
So far I have used AudioDeviceAddIOProc and AudioDeviceStart with my AudioDeviceIOProc function to get the input data in the form of an AudioBufferList containing AudioBuffers.
How do I get the volume of the data from the AudioBuffer? Or am I going about this completely the wrong way?
As of Mac OS X 10.5, the APIs you mentioned are deprecated, so you should read Technical Note TN2223.
Anyway, assuming that you are getting buffers of linear PCM sample data in 32-bit float format, you just need to write a for loop that takes fabs() of each sample and keeps the maximum of all those values. Then you can save that value or convert it to decibels.
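As a rough illustration (untested, and assuming the device delivers linear PCM in 32-bit float), an IOProc registered with the non-deprecated AudioDeviceCreateIOProcID could compute the peak and convert it to decibels like this:
#include <CoreAudio/CoreAudio.h>
#include <math.h>
#include <stdio.h>

// Scan the float samples of every input buffer, keep the peak absolute value,
// and convert it to dBFS. Error handling is omitted, and printing from a
// real-time IOProc is only for illustration.
static OSStatus MyIOProc(AudioObjectID inDevice,
                         const AudioTimeStamp *inNow,
                         const AudioBufferList *inInputData,
                         const AudioTimeStamp *inInputTime,
                         AudioBufferList *outOutputData,
                         const AudioTimeStamp *inOutputTime,
                         void *inClientData)
{
    Float32 peak = 0.0f;
    for (UInt32 b = 0; b < inInputData->mNumberBuffers; ++b) {
        const Float32 *samples = (const Float32 *)inInputData->mBuffers[b].mData;
        UInt32 count = inInputData->mBuffers[b].mDataByteSize / sizeof(Float32);
        for (UInt32 i = 0; i < count; ++i) {
            Float32 a = fabsf(samples[i]);
            if (a > peak) peak = a;
        }
    }
    Float32 dB = 20.0f * log10f(peak > 0.0f ? peak : 1e-9f); // avoid log10(0)
    printf("peak: %f (%.1f dBFS)\n", peak, dB);
    return noErr;
}

// Registration replaces the deprecated AudioDeviceAddIOProc/AudioDeviceStart pair:
//   AudioDeviceIOProcID procID = NULL;
//   AudioDeviceCreateIOProcID(deviceID, MyIOProc, NULL, &procID);
//   AudioDeviceStart(deviceID, procID);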
I'm now creating an application using speech recognition. To check the audio power coming in through the microphone, I wrote a method as follows:
- (void)checkPower:(AudioQueueRef)queue {
    // Assumes kAudioQueueProperty_EnableLevelMetering has already been set to 1 on this queue.
    AudioQueueLevelMeterState audioLevels; // one state per channel; mono assumed here
    UInt32 expectedSize = sizeof(AudioQueueLevelMeterState);
    AudioQueueGetProperty(queue,
                          kAudioQueueProperty_CurrentLevelMeter,
                          &audioLevels,
                          &expectedSize);
    NSLog(@"average:%f peak:%f", audioLevels.mAveragePower, audioLevels.mPeakPower);
}
I found that sometimes mAveragePower was larger than mPeakPower, and when mAveragePower was 1.0 (in other words, when the average power was at its maximum), mPeakPower was lower than 1.0.
I think that, generally, this result should be impossible.
Please let me know if you have any information about sound power in Core Audio.
Thanks.
I think mPeakPower means the channel power at the CURRENT moment and mAveragePower the average channel power over the WHOLE recording time; if that's right, your result is possible.
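Whatever the exact semantics, note that the meter only returns meaningful values once level metering has been enabled on the queue. A minimal sketch (mono assumed, error checks omitted):
// Enable level metering once, before querying it.
UInt32 meteringOn = 1;
AudioQueueSetProperty(queue, kAudioQueueProperty_EnableLevelMetering,
                      &meteringOn, sizeof(meteringOn));

// kAudioQueueProperty_CurrentLevelMeter returns one AudioQueueLevelMeterState
// per channel; mAveragePower and mPeakPower are linear values in the 0..1 range,
// while kAudioQueueProperty_CurrentLevelMeterDB returns the same data in decibels.
AudioQueueLevelMeterState levels[1];
UInt32 size = sizeof(levels);
AudioQueueGetProperty(queue, kAudioQueueProperty_CurrentLevelMeter, levels, &size);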