I'm working on an application that plays sounds through the default audio device on a Mac. I want to change the output sampling rate and bit depth of the default output device but it always gives me a kAudioUnitErr_PropertyNotWritable error code.
Here is my test code:
AudioStreamBasicDescription streamFormat;
AudioStreamBasicDescription newStreamFormat;
newStreamFormat.mSampleRate = 96000; // the sample rate of the audio stream
newStreamFormat.mFormatID = kAudioFormatLinearPCM; // the specific encoding type of audio stream
newStreamFormat.mFormatFlags = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked; //kAudioFormatFlagsNativeEndian | kAudioFormatFlagIsNonMixable;
newStreamFormat.mFramesPerPacket = 1;
newStreamFormat.mChannelsPerFrame = 1;
newStreamFormat.mBitsPerChannel = 24;
newStreamFormat.mBytesPerPacket = 3; // 24 bits x 1 channel = 3 bytes
newStreamFormat.mBytesPerFrame = 3;
UInt32 size = sizeof(AudioStreamBasicDescription);
result = AudioUnitGetProperty(myUnit, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Output, 0, &streamFormat, &size);
result = AudioOutputUnitStop(myUnit);
result = AudioUnitUninitialize(myUnit);
result = AudioUnitSetProperty(myUnit, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Output, 0, &newStreamFormat, size);
result = AudioUnitInitialize(myUnit);
result = AudioOutputUnitStart(myUnit);
result = AudioUnitGetProperty(myUnit, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Output, 0, &streamFormat, &size);
result = AudioUnitGetProperty(myUnit, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Input, 0, &streamFormat, &size);
When I make the call to set the stream format on kAudioUnitScope_Input I don't get any error, but when I set it on kAudioUnitScope_Output it fails with the property-not-writable error.
It must be possible to do this programmatically (Audio MIDI Setup does it), but despite a lot of searching I haven't been able to find a solution.
I did find this post that implies that setting the input sampling rate of the device will update the output as well. I tried this, but when I read back the property, the output doesn't match what I set on the input.
I'm pretty sure it's not the output AudioUnit's job to configure devices. It's more of an intermediary between clients and audio devices. Which means AudioUnitSetProperty() is the wrong API for the job.
So if you want to configure the device, try setting kAudioDevicePropertyNominalSampleRate on it using the AudioObjectSetPropertyData() function.
Then, unless you want a gratuitous rate conversion, you probably want to make sure your audio unit input format matches the new device sample rate by doing what you're already doing: calling AudioUnitSetProperty() on the input (data going into the audio unit) scope.
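When you fill in the format for the audio unit's input scope, the byte-count fields have to agree with the bit depth and channel count. The following is only a sketch of that arithmetic for packed, interleaved linear PCM. PCMLayout and both function names are hypothetical stand-ins (not CoreAudio types) so that the snippet builds without the framework; the fields mirror the size-related members of AudioStreamBasicDescription.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the size-related fields of
   AudioStreamBasicDescription, so the sketch builds without CoreAudio. */
typedef struct {
    double   sampleRate;
    uint32_t bytesPerPacket;
    uint32_t framesPerPacket;
    uint32_t bytesPerFrame;
    uint32_t channelsPerFrame;
    uint32_t bitsPerChannel;
} PCMLayout;

/* Fill the byte counts for packed, interleaved linear PCM. */
static PCMLayout make_packed_pcm_layout(double sampleRate,
                                        uint32_t channels,
                                        uint32_t bitsPerChannel)
{
    assert(channels > 0 && bitsPerChannel > 0 && bitsPerChannel % 8 == 0);
    PCMLayout l;
    l.sampleRate       = sampleRate;
    l.channelsPerFrame = channels;
    l.bitsPerChannel   = bitsPerChannel;
    l.framesPerPacket  = 1; /* uncompressed PCM: one frame per packet */
    l.bytesPerFrame    = channels * (bitsPerChannel / 8);
    l.bytesPerPacket   = l.bytesPerFrame * l.framesPerPacket;
    return l;
}

/* Returns 1 when the byte counts agree with bit depth and channel count. */
static int pcm_layout_is_consistent(const PCMLayout *l)
{
    return l->framesPerPacket == 1
        && l->bytesPerFrame == l->channelsPerFrame * (l->bitsPerChannel / 8)
        && l->bytesPerPacket == l->bytesPerFrame;
}
```

For a 24-bit mono stream at 96 kHz, make_packed_pcm_layout(96000.0, 1, 24) gives 3 bytes per frame and 3 bytes per packet.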
Related
I am using ffmpeg to acquire audio from .mov files. Looking over my settings, I am not sample-rate converting the audio buffers I generate, so that is unlikely to account for the issue I am having. Regardless of the sample rate I set on my Built-in Output, audio files at 44.1 kHz play back at the correct rate. If I play back a 48 kHz file, it plays slower (at about 91% of the normal rate), which indicates that the true rate is 44.1 kHz. I can change my Built-in Output to 44.1, 48, or 96 kHz and the same thing happens. I change my default output rate using the Audio MIDI Setup app, then verify the sample rate using AudioUnitGetProperty on my outputAudioUnit. It matches the sample rate shown in Audio MIDI Setup.
Thoughts? I am including my audio graph code
CheckError(NewAUGraph(&fp.graph), "Couldn't create a new AUGraph");
// the varispeed node has an input callback
// the varispeed node feeds an output node which is running
// at the sample rate of the system default output
AUNode outputNode;
AudioComponentDescription outputcd = [self defaultOutputComponent];
CheckError(AUGraphAddNode(fp.graph, &outputcd, &outputNode),
"AUGraphAddNode[kAudioUnitSubType_DefaultOutput] failed");
AUNode varispeedNode;
AudioComponentDescription varispeedcd = [self variSpeedComponent];
CheckError(AUGraphAddNode(fp.graph, &varispeedcd, &varispeedNode),
"AUGraphAddNode[kAudioUnitSubType_Varispeed] failed");
CheckError(AUGraphOpen(fp.graph),
"Couldn't Open AudioGraph");
CheckError(AUGraphNodeInfo(fp.graph, outputNode, NULL, &fp.outputAudioUnit),
"Couldn't Retrieve output node");
CheckError(AUGraphNodeInfo(fp.graph, varispeedNode, NULL, &fp.variSpeedAudioUnit),
"Couldn't Retrieve Varispeed Audio Unit");
AURenderCallbackStruct input;
input.inputProc = CBufferProviderCallback;
input.inputProcRefCon = &playerStruct;
CheckError(AudioUnitSetProperty(fp.variSpeedAudioUnit,
kAudioUnitProperty_SetRenderCallback,
kAudioUnitScope_Input,
0,
&input,
sizeof(input)),
"AudioUnitSetProperty failed");
CheckError(AUGraphConnectNodeInput(fp.graph, varispeedNode, 0, outputNode, 0),
"Couldn't Connect varispeed to output");
CheckError(AUGraphInitialize(fp.graph),
"Couldn't Initialize AUGraph");
// check output sample rate
Float64 outputSampleRate = 48000.0;
UInt32 sizeOfFloat64 = sizeof(Float64);
outputSampleRate = 0.0;
CheckError(AudioUnitGetProperty(fp.outputAudioUnit,
kAudioUnitProperty_SampleRate,
kAudioUnitScope_Global,
0,
&outputSampleRate,
&sizeOfFloat64),
"Couldn't get output sampleRate");
I solved the issue. When building the audio graph, you need to specify the input sample rate of the varispeed audio unit before you connect it to the output node inside the AUGraph. See the example code at
https://developer.apple.com/library/content/samplecode/CAPlayThrough/Listings/ReadMe_txt.html
CheckError(NewAUGraph(&fp.graph), "BuildGraphError");
AUNode outputNode;
AudioComponentDescription outputcd = [self defaultOutputComponent];
CheckError(AUGraphAddNode(fp.graph, &outputcd, &outputNode),
"AUGraphAddNode[kAudioUnitSubType_DefaultOutput] failed");
AUNode varispeedNode;
AudioComponentDescription varispeedcd = [self variSpeedComponent];
CheckError(AUGraphAddNode(fp.graph, &varispeedcd, &varispeedNode),
"AUGraphAddNode[kAudioUnitSubType_Varispeed] failed");
CheckError(AUGraphOpen(fp.graph),
"Couldn't Open AudioGraph");
CheckError(AUGraphNodeInfo(fp.graph, outputNode, NULL, &fp.outputAudioUnit),
"Couldn't Retrieve File Audio Unit");
CheckError(AUGraphNodeInfo(fp.graph, varispeedNode, NULL, &fp.variSpeedAudioUnit),
"Couldn't Retrieve Varispeed Audio Unit");
AURenderCallbackStruct input;
input.inputProc = CBufferProviderCallback;
input.inputProcRefCon = &playerStruct;
CheckError(AudioUnitSetProperty(fp.variSpeedAudioUnit,
kAudioUnitProperty_SetRenderCallback,
kAudioUnitScope_Input,
0,
&input,
sizeof(input)),
"AudioUnitSetProperty failed");
//you have to set the varispeed rate before you connect it
//see CAPlayThrough
AudioStreamBasicDescription asbd = {0};
UInt32 size;
Boolean outWritable;
//Gets the size of the Stream Format Property and if it is writable
OSStatus result = AudioUnitGetPropertyInfo(fp.variSpeedAudioUnit,
kAudioUnitProperty_StreamFormat,
kAudioUnitScope_Output,
0,
&size,
&outWritable);
//Get the current stream format of the output
result = AudioUnitGetProperty (fp.variSpeedAudioUnit,
kAudioUnitProperty_StreamFormat,
kAudioUnitScope_Output,
0,
&asbd,
&size);
asbd.mSampleRate = targetSampleRate;
//Set the varispeed input format: same as its output format, but at the source sample rate
result = AudioUnitSetProperty (fp.variSpeedAudioUnit,
kAudioUnitProperty_StreamFormat,
kAudioUnitScope_Input,
0,
&asbd,
size);
printf("AudioUnitSetProperty result %d %d\n", result, noErr);
CheckError(AUGraphConnectNodeInput(fp.graph, varispeedNode, 0, outputNode, 0),
"Couldn't Connect varispeed to output");
CheckError(AUGraphInitialize(fp.graph),
"Couldn't Initialize AUGraph");
Float64 outputSampleRate = 48000.0;
UInt32 sizeOfFloat64 = sizeof(Float64);
outputSampleRate = 0.0;
CheckError(AudioUnitGetProperty(fp.outputAudioUnit,
kAudioUnitProperty_SampleRate,
kAudioUnitScope_Global,
0,
&outputSampleRate,
&sizeOfFloat64),
"Couldn't get output sampleRate");
NSLog(@"Output Sample Rate of the ->%f", outputSampleRate);
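The 91% figure from the question is consistent with 48 kHz samples being consumed at 44.1 kHz without any rate conversion. A small arithmetic sketch of that relationship follows; playback_speed_ratio is a hypothetical helper name used only for illustration.

```c
#include <assert.h>

/* Speed factor heard when samples authored at source_rate are consumed
   at assumed_rate with no rate conversion in between. */
static double playback_speed_ratio(double source_rate, double assumed_rate)
{
    assert(source_rate > 0.0 && assumed_rate > 0.0);
    return assumed_rate / source_rate;
}
```

playback_speed_ratio(48000.0, 44100.0) is 0.91875, that is, roughly 91% of normal speed. Once the varispeed input format carries the real source rate, the ratio the unit has to correct for is known and the file plays at the right speed.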
I need access to audio data from the microphone on a MacBook. I have an example program for recording microphone data, based on the one in "Learning Core Audio". When I run this program and break in the callback routine, I see the inBuffer pointer and the mAudioData pointer. However, I am having a heck of a time making sense of the data. I've tried casting the void* mAudioData pointer to SInt16, to SInt32 and to float, and tried a number of endian conversions, all with nonsense-looking results. What I need to know definitively is the number format of the data in the buffer. The example does work: it writes microphone data to a file which I can play, so I know that real audio is being recorded.
AudioStreamBasicDescription recordFormat;
memset(&recordFormat,0,sizeof(recordFormat));
//recordFormat.mFormatID = kAudioFormatMPEG4AAC;
recordFormat.mFormatID = kAudioFormatLinearPCM;
recordFormat.mChannelsPerFrame = 2;
recordFormat.mBitsPerChannel = 16;
recordFormat.mBytesPerPacket = recordFormat.mBytesPerFrame = recordFormat.mChannelsPerFrame * sizeof(SInt16);
recordFormat.mFramesPerPacket = 1;
MyGetDefaultInputDeviceSampleRate(&recordFormat.mSampleRate);
UInt32 propSize = sizeof(recordFormat);
CheckError(AudioFormatGetProperty(kAudioFormatProperty_FormatInfo,
0,
NULL,
&propSize,
&recordFormat),
"AudioFormatProperty failed");
//set up queue
AudioQueueRef queue = {0};
CheckError(AudioQueueNewInput(&recordFormat,
MyAQInputCallback,
&recorder,
NULL,
kCFRunLoopCommonModes,
0,
&queue),
"AudioQueueNewInput failed");
UInt32 size = sizeof(recordFormat);
CheckError(AudioQueueGetProperty(queue,
kAudioConverterCurrentOutputStreamDescription,
&recordFormat,
&size), "Couldn't get queue's format");
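The format actually delivered to the callback is whatever the queue reports in recordFormat after the AudioQueueGetProperty call above, in particular mFormatFlags, mBitsPerChannel and mChannelsPerFrame, so those are the fields to inspect before casting mAudioData. The following is a minimal sketch, assuming the reported format is packed, interleaved, signed 16-bit integer PCM. The FLAG_ constants are local stand-ins that mirror the bit positions of kAudioFormatFlagIsFloat, kAudioFormatFlagIsBigEndian and kAudioFormatFlagIsSignedInteger, and read_s16_sample is a hypothetical helper, not a Core Audio call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins mirroring the bit positions of the CoreAudio format flags. */
enum {
    FLAG_IS_FLOAT          = 1u << 0,
    FLAG_IS_BIG_ENDIAN     = 1u << 1,
    FLAG_IS_SIGNED_INTEGER = 1u << 2
};

/* Read the sample for (frame, channel) from interleaved 16-bit integer PCM.
   Works byte by byte, so it does not depend on the host's endianness. */
static int16_t read_s16_sample(const uint8_t *data, size_t frame,
                               unsigned channel, unsigned channels,
                               uint32_t flags)
{
    assert(channel < channels);
    /* This sketch only handles signed integer data, not float. */
    assert((flags & FLAG_IS_SIGNED_INTEGER) && !(flags & FLAG_IS_FLOAT));

    const uint8_t *p = data + (frame * channels + channel) * 2;
    uint16_t raw = (flags & FLAG_IS_BIG_ENDIAN)
        ? (uint16_t)((p[0] << 8) | p[1])   /* most significant byte first */
        : (uint16_t)((p[1] << 8) | p[0]);  /* least significant byte first */
    return (int16_t)raw;
}
```

In the callback this would be called with inBuffer->mAudioData cast to const uint8_t *, and with the frame count derived from mAudioDataByteSize divided by mBytesPerFrame. If the reported flags include the float bit, or the bit depth is not 16, a different reader is needed.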
I have a strange problem in my C/C++ FFmpeg transcoder, which takes an input MP4 (varying input codecs) and produces an output MP4 (x264 baseline and AAC LC @ 44100 sample rate with libfdk_aac):
The resulting mp4 video has fine images (x264) and the audio (AAC LC) works fine as well, but is only played until exactly the half of the video.
The audio is not slowed down, not stretched and doesn't stutter. It just stops right in the middle of the video.
One hint may be that the input file has a sample rate of 22050 and 22050/44100 is 0.5, but I really don't get why this would make the sound just stop after half the time. I'd expect such an error leading to sound being at the wrong speed. Everything works just fine if I don't try to enforce 44100 and instead just use the incoming sample_rate.
Another guess would be that the pts calculation doesn't work. But the audio sounds just fine (until it stops) and I do exactly the same for the video part, where it works flawlessly. "Exactly", as in the same code, but "audio"-variables replaced with "video"-variables.
FFmpeg reports no errors during the whole process. I also flush the decoders, the encoders and the interleaved writer after all the packets have been read from the input. It works well for the video, so I doubt there is much wrong with my general approach.
Here are the functions of my code (stripped off the error handling & other class stuff):
AudioCodecContext Setup
outContext->_audioCodec = avcodec_find_encoder(outContext->_audioTargetCodecID);
outContext->_audioStream =
avformat_new_stream(outContext->_formatContext, outContext->_audioCodec);
outContext->_audioCodecContext = outContext->_audioStream->codec;
outContext->_audioCodecContext->channels = 2;
outContext->_audioCodecContext->channel_layout = av_get_default_channel_layout(2);
outContext->_audioCodecContext->sample_rate = 44100;
outContext->_audioCodecContext->sample_fmt = outContext->_audioCodec->sample_fmts[0];
outContext->_audioCodecContext->bit_rate = 128000;
outContext->_audioCodecContext->strict_std_compliance = FF_COMPLIANCE_EXPERIMENTAL;
outContext->_audioCodecContext->time_base =
(AVRational){1, outContext->_audioCodecContext->sample_rate};
outContext->_audioStream->time_base = (AVRational){1, outContext->_audioCodecContext->sample_rate};
int retVal = avcodec_open2(outContext->_audioCodecContext, outContext->_audioCodec, NULL);
Resampler Setup
outContext->_audioResamplerContext =
swr_alloc_set_opts( NULL, outContext->_audioCodecContext->channel_layout,
outContext->_audioCodecContext->sample_fmt,
outContext->_audioCodecContext->sample_rate,
_inputContext._audioCodecContext->channel_layout,
_inputContext._audioCodecContext->sample_fmt,
_inputContext._audioCodecContext->sample_rate,
0, NULL);
int retVal = swr_init(outContext->_audioResamplerContext);
Decoding
decodedBytes = avcodec_decode_audio4( _inputContext._audioCodecContext,
_inputContext._audioTempFrame,
&p_gotAudioFrame, &_inputContext._currentPacket);
Converting (only if decoding produced a frame, of course)
int retVal = swr_convert( outContext->_audioResamplerContext,
outContext->_audioConvertedFrame->data,
outContext->_audioConvertedFrame->nb_samples,
(const uint8_t**)_inputContext._audioTempFrame->data,
_inputContext._audioTempFrame->nb_samples);
Encoding (only if decoding produced a frame, of course)
outContext->_audioConvertedFrame->pts =
av_frame_get_best_effort_timestamp(_inputContext._audioTempFrame);
// Init the new packet
av_init_packet(&outContext->_audioPacket);
outContext->_audioPacket.data = NULL;
outContext->_audioPacket.size = 0;
// Encode
int retVal = avcodec_encode_audio2( outContext->_audioCodecContext,
&outContext->_audioPacket,
outContext->_audioConvertedFrame,
&p_gotPacket);
// Set pts/dts time stamps for writing interleaved
av_packet_rescale_ts( &outContext->_audioPacket,
outContext->_audioCodecContext->time_base,
outContext->_audioStream->time_base);
outContext->_audioPacket.stream_index = outContext->_audioStream->index;
Writing (only if encoding produced a packet, of course)
int retVal = av_interleaved_write_frame(outContext->_formatContext, &outContext->_audioPacket);
I am quite out of ideas about what would cause such a behaviour.
So, I finally managed to figure things out myself.
The problem was indeed in the difference of the sample_rate.
You'd assume that a call to swr_convert() would give you all the samples you need for converting the audio frame when called like I did.
Of course, that would be too easy.
Instead, you need to call swr_convert (potentially) multiple times per frame and buffer its output, if required. Then you need to grab a single frame from the buffer and that is what you will have to encode.
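To see why one convert call per decoded frame does not line up with one encoder frame, the sample arithmetic can be sketched independently of FFmpeg. Both helpers below are hypothetical names for illustration. The first mirrors the "(delay + input samples) rescaled and rounded up" calculation, and the frame size of 1024 used in the usage note is an assumption (the real value comes from the encoder's frame_size).

```c
#include <assert.h>
#include <stdint.h>

/* Upper bound on the samples one resampler call can produce:
   (delayed + input) * out_rate / in_rate, rounded up. */
static int64_t resampled_sample_count(int64_t delayed_samples,
                                      int64_t in_samples,
                                      int in_rate, int out_rate)
{
    assert(in_rate > 0 && out_rate > 0);
    assert(delayed_samples >= 0 && in_samples >= 0);
    int64_t total = delayed_samples + in_samples;
    return (total * out_rate + in_rate - 1) / in_rate;
}

/* Number of complete encoder frames currently held in the buffer. */
static int64_t encoder_frames_ready(int64_t buffered_samples, int frame_size)
{
    assert(frame_size > 0 && buffered_samples >= 0);
    return buffered_samples / frame_size;
}
```

With 1024 input samples going from 22050 Hz to 44100 Hz, resampled_sample_count(0, 1024, 22050, 44100) is 2048, which is two encoder frames of 1024 samples. Going from 48000 Hz to 44100 Hz the same input yields at most 941 samples, which is less than one frame, so the buffer has to carry samples over between calls.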
Here is my new convertAudioFrame function:
// Calculate number of output samples
int numOutputSamples = av_rescale_rnd(
swr_get_delay(outContext->_audioResamplerContext, _inputContext._audioCodecContext->sample_rate)
+ _inputContext._audioTempFrame->nb_samples,
outContext->_audioCodecContext->sample_rate,
_inputContext._audioCodecContext->sample_rate,
AV_ROUND_UP);
if (numOutputSamples == 0)
{
return;
}
uint8_t* tempSamples;
av_samples_alloc( &tempSamples, NULL,
outContext->_audioCodecContext->channels, numOutputSamples,
outContext->_audioCodecContext->sample_fmt, 0);
int retVal = swr_convert( outContext->_audioResamplerContext,
&tempSamples,
numOutputSamples,
(const uint8_t**)_inputContext._audioTempFrame->data,
_inputContext._audioTempFrame->nb_samples);
// Write to audio fifo
if (retVal > 0)
{
retVal = av_audio_fifo_write(outContext->_audioFifo, (void**)&tempSamples, retVal);
}
av_freep(&tempSamples);
// Get a frame from audio fifo
int samplesAvailable = av_audio_fifo_size(outContext->_audioFifo);
if (samplesAvailable > 0)
{
retVal = av_audio_fifo_read(outContext->_audioFifo,
(void**)outContext->_audioConvertedFrame->data,
outContext->_audioCodecContext->frame_size);
// We got a frame, so also set its pts
if (retVal > 0)
{
p_gotConvertedFrame = 1;
if (_inputContext._audioTempFrame->pts != AV_NOPTS_VALUE)
{
outContext->_audioConvertedFrame->pts = _inputContext._audioTempFrame->pts;
}
else if (_inputContext._audioTempFrame->pkt_pts != AV_NOPTS_VALUE)
{
outContext->_audioConvertedFrame->pts = _inputContext._audioTempFrame->pkt_pts;
}
}
}
I basically call this function until there are no more frames in the audio FIFO buffer.
So, the audio was only half as long because I encoded only as many frames as I decoded, when I actually needed to encode twice as many frames because the sample_rate is doubled.
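The frame accounting can be checked with a little arithmetic. This is a sketch under stated assumptions: every decoded frame carries the same number of samples, and the encoder frame size is passed in by the caller. It also shows one common way to timestamp frames pulled from a FIFO, a running sample counter in a 1/sample_rate time base, which is an alternative to copying the input frame's pts and is not what the code above does. AudioPtsClock, audio_pts_take and expected_encoder_frames are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Encoder frames needed to cover the whole resampled stream. */
static int64_t expected_encoder_frames(int64_t decoded_frames,
                                       int samples_per_decoded_frame,
                                       int in_rate, int out_rate,
                                       int encoder_frame_size)
{
    assert(in_rate > 0 && out_rate > 0 && encoder_frame_size > 0);
    int64_t in_samples  = decoded_frames * samples_per_decoded_frame;
    int64_t out_samples = (in_samples * out_rate + in_rate - 1) / in_rate;
    return (out_samples + encoder_frame_size - 1) / encoder_frame_size;
}

/* Running timestamp in units of 1/sample_rate. */
typedef struct {
    int64_t next_pts;
} AudioPtsClock;

/* Return the pts for a frame of nb_samples and advance the clock. */
static int64_t audio_pts_take(AudioPtsClock *clock, int nb_samples)
{
    int64_t pts = clock->next_pts;
    clock->next_pts += nb_samples;
    return pts;
}
```

For 100 decoded frames of 1024 samples at 22050 Hz resampled to 44100 Hz, expected_encoder_frames(100, 1024, 22050, 44100, 1024) is 200. Encoding only 100 of them covers half the duration, which matches the symptom.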
When attempting to play AAC-HE content in an mp4 container, the reported sampling rate found in the mp4 container appears to be half of the actual sampling rate.
E.g it appears as 24kHz instead of 48kHz.
Using the FFmpeg AAC decoder, the actual sampling rate can be retrieved by simply decoding an audio packet with avcodec_decode_audio4 and looking at AVCodecContext::sample_rate, which will be updated appropriately. From that it's easy to adapt the output.
With the CoreAudio decoder, I would use an AudioConverterRef, set the input and output AudioStreamBasicDescription, and call AudioConverterFillComplexBuffer.
Since the converter performs all the required internal conversion, including resampling, that works. But it plays the content after resampling it to 24kHz (as that's what the input AudioStreamBasicDescription contains).
Would there be a way to retrieve the actual sampling rate as found by the decoder (rather than the demuxer), in a similar fashion to what one can do with FFmpeg?
Would prefer to avoid losing audio quality if at all possible, and not downmix data
Thanks
Found this:
https://developer.apple.com/library/ios/qa/qa1639/_index.html
It explains how to retrieve the higher-quality stream.
The resulting code is as follows:
AudioStreamBasicDescription inputFormat;
AudioFormatListItem* formatListPtr = NULL;
UInt32 propertySize;
OSStatus rv = noErr;
rv = AudioFileStreamGetPropertyInfo(mStream,
kAudioFileStreamProperty_FormatList,
&propertySize,
NULL);
if (rv == noErr) {
// allocate memory for the format list items
formatListPtr = static_cast<AudioFormatListItem*>(malloc(propertySize));
if (!formatListPtr) {
LOG("Error %d constructing AudioConverter", rv);
mCallback->Error();
return;
}
// get the list of Audio Format List Item's
rv = AudioFileStreamGetProperty(mStream,
kAudioFileStreamProperty_FormatList,
&propertySize,
formatListPtr);
if (rv == noErr) {
UInt32 itemIndex;
UInt32 indexSize = sizeof(itemIndex);
// get the index number of the first playable format -- this index number will be for
// the highest quality layer the platform is capable of playing
rv = AudioFormatGetProperty(kAudioFormatProperty_FirstPlayableFormatFromList,
propertySize,
formatListPtr,
&indexSize,
&itemIndex);
if (rv != noErr) {
free(formatListPtr);
LOG("Error %d retrieving best format for AudioConverter", rv);
return;
}
// copy the format item at index we want returned
inputFormat = formatListPtr[itemIndex].mASBD;
}
free(formatListPtr);
} else {
// Fill in the input format description from the stream.
nsresult rv = AppleUtils::GetProperty(mStream,
kAudioFileStreamProperty_DataFormat,
&inputFormat);
if (NS_FAILED(rv)) {
LOG("Error %d retrieving default format for AudioConverter", rv);
return;
}
}
// Fill in the output format manually.
PodZero(&mOutputFormat);
mOutputFormat.mFormatID = kAudioFormatLinearPCM;
mOutputFormat.mSampleRate = inputFormat.mSampleRate;
mOutputFormat.mChannelsPerFrame = inputFormat.mChannelsPerFrame;
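As a sanity check on the rates involved: with HE-AAC the AAC-LC core is coded at half the output rate and SBR reconstructs the upper band, which is why a container can signal 24kHz for content that decodes to 48kHz. The sketch below is arithmetic only, assuming the usual 1024-sample core frame; sbr_output_rate and frame_duration_us are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Output rate after SBR: twice the core (container-signalled) rate. */
static int sbr_output_rate(int core_rate)
{
    assert(core_rate > 0);
    return core_rate * 2;
}

/* Duration of a frame in microseconds, truncated toward zero. */
static int64_t frame_duration_us(int samples, int rate)
{
    assert(samples >= 0 && rate > 0);
    return (int64_t)samples * 1000000 / rate;
}
```

A 1024-sample core frame at 24000 Hz and the corresponding 2048-sample output frame at 48000 Hz both last about 42.67 ms, so picking the richer format changes the sample rate and the samples per frame, not the duration.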
I'm creating a Mac OS X CoreAudio command-line program with proprietary rendering of some alphanumeric terminal input into a live audio signal, by means of AudioUnits, trying to stay as simple as possible. All works fine up to matching output sample rate.
As a starting point I'm using the Chapter 07 tutorial code of Addison-Wesley's "Learning Core Audio", CH07_AUGraphSineWave.
I initialize the AudioComponent "by the book":
void CreateAndConnectOutputUnit (MyGenerator *generator)
{
AudioComponentDescription theoutput = {0};
theoutput.componentType = kAudioUnitType_Output;
theoutput.componentSubType = kAudioUnitSubType_DefaultOutput;
theoutput.componentManufacturer = kAudioUnitManufacturer_Apple;
AudioComponent comp = AudioComponentFindNext (NULL, &theoutput);
if (comp == NULL) {
printf ("can't get output unit");
exit (-1);
}
CheckError (AudioComponentInstanceNew(comp, &generator->outputUnit),
"Couldn't open component for outputUnit");
AURenderCallbackStruct input;
input.inputProc = MyRenderProc;
input.inputProcRefCon = generator;
CheckError(AudioUnitSetProperty(generator->outputUnit,
kAudioUnitProperty_SetRenderCallback,
kAudioUnitScope_Input,
0,
&input,
sizeof(input)),
"AudioUnitSetProperty failed");
CheckError (AudioUnitInitialize(generator->outputUnit),
"Couldn't initialize output unit");
}
My main problem is that I don't know how to retrieve the output hardware sample rate for the render callback (the inputProc of the AURenderCallbackStruct), since it plays a vital part in the signal-generating process.
I can't afford to hard-code the sample rate into the render callback, even though that would be the easiest way, because a rate mismatch causes the signal to be played at the wrong pitch.
Is there a way of getting the default output's sample rate on such a low-level API?
Is there a way of matching it somehow, without getting overly complicated?
Have I missed something?
Thanks in advance.
Regards,
Tom
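When calling AudioUnitGetProperty, the 6th parameter must be a pointer to a variable that holds the size of your buffer on input and receives the size of the answer on output, so initialize it before the call.
Float64 sampleRate;
UInt32 sampleRateSize = sizeof(sampleRate);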
CheckError(AudioUnitGetProperty(generator->outputUnit,
kAudioUnitProperty_SampleRate,
kAudioUnitScope_Input,
0,
&sampleRate,
&sampleRateSize),
"AudioUnitGetProperty failed");
However, as long as the sample rate has not been set, the function does not return a value (but there is also no error!)
You can however set the sample rate with for instance:
Float64 sampleRate = 48000;
CheckError(AudioUnitSetProperty(generator->outputUnit,
kAudioUnitProperty_SampleRate,
kAudioUnitScope_Input,
0,
&sampleRate,
sizeof(sampleRate)),
"AudioUnitSetProperty failed");
From now on you can also read the value with the Get-call.
This does not answer the question, what the default value is. As far as I know that is always 44100 Hz.
The sample-rate is a property of all AudioUnits - see kAudioUnitProperty_SampleRate (documentation here) - although ultimately it's the IO Unit (RemoteIO on iOS or HAL unit on MacOSX) that drives the sample-rate at the audio interface. This is not available in the call-back structure; you need to read this property with AudioUnitGetProperty() in your initialisation code.
In your case, the following would probably do it:
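Float64 sampleRate;
UInt32 sampleRateSize = sizeof(sampleRate); // in: buffer size, out: bytes written
CheckError(AudioUnitGetProperty(generator->outputUnit,
kAudioUnitProperty_SampleRate,
kAudioUnitScope_Input,
0,
&sampleRate,
&sampleRateSize),
"AudioUnitGetProperty failed");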
If you're targeting iOS, you also need to interact with the Audio Session.
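Once the rate has been read during setup, it can be stored in the struct that is passed as inputProcRefCon, so the render callback never needs a hard-coded value. The following is a minimal sketch of that idea without any Core Audio calls; ToneState, tone_advance and heard_frequency are hypothetical names, and the phase is kept in cycles (0 to 1) so the callback only has to scale it by 2π before calling sin().

```c
#include <assert.h>

/* State a render callback would receive through its refCon pointer. */
typedef struct {
    double sampleRate; /* read once from the output unit during setup */
    double frequency;  /* tone frequency in Hz */
    double phase;      /* current phase in cycles, in [0, 1) */
} ToneState;

/* Advance the phase by one frame and return the phase to render. */
static double tone_advance(ToneState *t)
{
    assert(t->sampleRate > 0.0);
    double current = t->phase;
    t->phase += t->frequency / t->sampleRate; /* cycles per frame */
    if (t->phase >= 1.0) {
        t->phase -= 1.0;
    }
    return current;
}

/* Pitch actually heard when the generator assumes one rate and the
   hardware runs at another. */
static double heard_frequency(double intended_hz,
                              double assumed_rate, double actual_rate)
{
    assert(assumed_rate > 0.0);
    return intended_hz * actual_rate / assumed_rate;
}
```

For example, a 440 Hz tone generated on the assumption of 44100 Hz but played by hardware running at 48000 Hz comes out at about 478.9 Hz, which is the wrong-pitch symptom described in the question.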