audio power on AudioQueue - cocoa

I'm now creating an Application using speech recognition.To check the Audio Power coming in through the microphone,
I wrote a method as follows.
-(void)checkPower(AudioqueRef)queue{
UInt32 expectedSize= sizeof(AudioQueueLevelMeterState);
AudioQueueGetProperty(queue,
kAudioQueueProperty_CurrentLevelMeter,
audioLevels,
expectedSize);
NSLog(#"average:%f peak:%f",audioLevels.mAveragePower,audioLevels.mPeakPower);
}
I found that sometimes mAveragePower was larger than mPeakPower,
and when mAveragePower was 1.0, in other words, averagePower
is regarded as max, mPeakPower was lower than 1.0.
I think that generally this result is inpossible.
please Let me know if you have any information about sound power on CoreAudio.
thanks.

I think mPeakPower means channel power at CURRENT moment & mAveragePower - average channel power for ALL record time, and if it's right your result possible.

Related

OSX: CoreAudio API for setting IO Buffer length?

This is a follow-up to a previous question:
OSX CoreAudio: Getting inNumberFrames in advance - on initialization?
I am trying to figure out what will be the AudioUnit API for possibly setting inNumberFrames or preffered IO buffer duration of an input callback for a single HAL audio component instance in OSX (not a plug-in!).
While I understand there is a comprehensive documentation on how this can be achieved in iOS, by means of AVAudioSession API, I can neither figure out nor find documentation on setting these values in OSX, whichever API.
The web is full of expert, yet conflicting statements ranging from "There is an Audio Unit API to request a sample rate and a preferred buffer duration...", to "You can definitely get the number of frames, but only for the current callback call...".
Is there a way of at least getting (and adapting to) the inNumberFrames or the audio buffer length offerd by the system, for the input-selected sampling rates in OSX? For example, for 44.1k and its multiples (this seems to work partly), as well as for 48k and its multiples (this doesn't seem to work at all, I don't know where's the hack which allows for adapting the buffer lenfth to these values)? Here's the console printout:
Available 7 Sample Rates
Available Sample Rate value : 8000.000000
Available Sample Rate value : 16000.000000
Available Sample Rate value : 32000.000000
Available Sample Rate value : 44100.000000
Available Sample Rate value : 48000.000000
Available Sample Rate value : 88200.000000
Available Sample Rate value : 96000.000000
.mSampleRate = 48000.00
.mFormatID = 1819304813
.mBytesPerPacket = 8
.mFramesPerPacket = 1
.mBytesPerFrame = 8
.mChannelsPerFrame = 2
.mBitsPerChannel = 32
.mFormatFlags = 9
_mFormatHumanReadable = kAudioFormatFlagIsFloat
kAudioFormatFlagIsPacked
kLinearPCMFormatFlagIsFloat
kLinearPCMFormatFlagIsPacked
kLinearPCMFormatFlagsSampleFractionShift
kAppleLosslessFormatFlag_16BitSourceData
kAppleLosslessFormatFlag_24BitSourceData
expectedInNumberFrames = 512
Couldn't render in current context (Error -10863)
The expected inNumberFrames is read from the system:
UInt32 expectedInNumberFrames = 0;
UInt32 propSize = sizeof(UInt32);
AudioUnitGetProperty(gInputUnitComponentInstance,
kAudioDevicePropertyBufferFrameSize,
kAudioUnitScope_Global,
0,
&expectedInNumberFrames,
&propSize);
Thanks in advance for pointing me at the right direction!
See this Apple Technical Note: https://developer.apple.com/library/mac/technotes/tn2321/_index.html#//apple_ref/doc/uid/DTS40013499-CH1-THE_I_O_BUFFER_SIZE
See the OS X example code in this technical note for GetIOBufferFrameSizeRange(), GetCurrentIOBufferFrameSize(), and SetCurrentIOBufferFrameSize().
Note that there is an API property returning an allowed range, and an error return on the property setter. Also note the various Mac power saving modes may change the buffer size while an app is running, so the actual buffer size, inNumberFrames, may not stay constant, or even be known until the Audio Unit starts running.
If you get unusual buffer sizes (not a power of 2), it may be that the actual audio hardware on a particular Apple product model has a fixed or limited range of audio sample rates, and thus OS software is being used to resample and thus resize the buffers being sent to audio unit callbacks depending on that hardware, if the app requests a sample rate not supported by the actual codec chips on the circuit board.

Audiorecorder, Choosing the input channels. Mac and RME

I am trying to simply record sound through an external sound card: RME Fireface 400.
This is the code I am using:
AO = audioplayer(mls_o,fs,16,5); % mls_o is the signal that is played.
AI = audiorecorder(fs,16,2,5); % 2 CHANNELS BUT HOW DO I ASSIGN THEM
play(AO);%playing
recordblocking(AI,1,2);%recording
y_rec=getaudiodata(AI);
delete(AI);% Deleting the objects
delete(AO);
I can only chose the number of channels, but not to address them.
Audiorecorder only supports 2 channels and Fireface have 8 input channels. I have to use the first two analog: ch5 and ch6. However, by default audio recorder only look 1st and 2nd one, which are mic inputs.
Otherwise do you know any other way of doing that?
I don't think you can do this with just the audiorecorder object. Check out the playrec library and pa-wavplay FileExchange submission, either of which should be capable of doing what you need. I believe Psychtoolbox also allows you to do this, but that might be overkill for your needs.

How to use audioConverterFillComplexBuffer and its callback?

I need a step by step walkthrough on how to use audioConverterFillComplexBuffer and its callback. No, don't tell me to read the Apple docs. I do everything they say and the conversion always fails. No, don't tell me to go look for examples of audioConverterFillComplexBuffer and its callback in use - I've duplicated about a dozen such examples both line for line and modified and the conversion always fails. No, there isn't any problem with the input data. No, it isn't an endian issue. No, the problem isn't my version of OS X.
The problem is that I don't understand how audioConverterFillComplexBuffer works, so I don't know what I'm doing wrong. And nothing out there is helping me understand, because it seems like nobody on Earth really understands how audioConverterFillComplexBuffer works, either. From the people who actually use it(I spy cargo cult programming in their code) to even the authors of Learning Core Audio and/or Apple itself(http://stackoverflow.com/questions/13604612/core-audio-how-can-one-packet-one-byte-when-clearly-one-packet-4-bytes).
This isn't just a problem for me, it's a problem for anybody who wants to program high-performance audio on the Mac platform. Threadbare documentation that's apparently wrong and examples that don't work are no fun.
Once again, to be clear: I NEED A STEP BY STEP WALKTHROUGH ON HOW TO USE audioConverterFillComplexBuffer plus its callback and so does the entire Mac developer community.
This is a very old question but I think is still relevant. I've spent a few days fighting this and have finally achieved a successful conversion. I'm certainly no expert but I'll outline my understanding of how it works. Note I'm using Swift, which I'm also just learning.
Here are the main function arguments:
inAudioConverter: AudioConverterRef: This one is simple enough, just pass in a previously created AudioConverterRef.
inInputDataProc: AudioConverterComplexInputDataProc: The very complex callback. We'll come back to this.
inInputDataProcUserData, UnsafeMutableRawPointer?: This is a reference to whatever data you may need to be provided to the callback function. Important because even in swift the callback can't inherit context. E.g. you may need to access an AudioFileID or keep track of the number of packets read so far.
ioOutputDataPacketSize: UnsafeMutablePointer<UInt32>: This one is a little misleading. The name implies it's the packet size but reading the documentation we learn it's the total number of packets expected for the output format. You can calculate this as outPacketCount = frameCount / outStreamDescription.mFramesPerPacket.
outOutputData: UnsafeMutablePointer<AudioBufferList>: This is an audio buffer list which you need to have already initialized with enough space to hold the expected output data. The size can be calculated as byteSize = outPacketCount * outMaxPacketSize.
outPacketDescription: UnsafeMutablePointer<AudioStreamPacketDescription>?: This is optional. If you need packet descriptions, pass in a block of memory the size of outPacketCount * sizeof(AudioStreamPacketDescription).
As the converter runs it will repeatedly call the callback function to request more data to convert. The main job of the callback is simply to read the requested number packets from the source data. The converter will then convert the packets to the output format and fill the output buffer. Here are the arguments for the callback:
inAudioConverter: AudioConverterRef: The audio converter again. You probably won't need to use this.
ioNumberDataPackets: UnsafeMutablePointer<UInt32>: The number of packets to read. After reading, you must set this to the number of packets actually read (which may be less than the number requested if we reached the end).
ioData: UnsafeMutablePointer<AudioBufferList>: An AudioBufferList which is already configured except for the actual data. You need to initialise ioData.mBuffers.mData with enough capacity to hold the expected number of packets, i.e. ioNumberDataPackets * inMaxPacketSize. Set the value of ioData.mBuffers.mDataByteSize to match.
outDataPacketDescription: UnsafeMutablePointer<UnsafeMutablePointer<AudioStreamPacketDescription>?>?: Depending on the formats used, the converter may need to keep track of packet descriptions. You need to initialise this with enough capacity to hold the expected number of packet descriptions.
inUserData: UnsafeMutableRawPointer?: The user data that you provided to the converter.
So, to start you need to:
Have sufficient information about your input and output data, namely the number of frames and maximum packet sizes.
Initialise an AudioBufferList with sufficient capacity to hold the output data.
Call AudioConverterFillComplexBuffer.
And on each run of the callback you need to:
Initialise ioData with sufficient capacity to store ioNumberDataPackets of source data.
Initialise outDataPacketDescription with sufficient capacity to store ioNumberDataPackets of AudioStreamPacketDescriptions.
Fill the buffer with source packets.
Write the packet descriptions.
Set ioNumberDataPackets to the number of packets actually read.
return noErr if successful.
Here's an example where I read the data from an AudioFileID:
var converter: AudioConverterRef?
// User data holds an AudioFileID, input max packet size, and a count of packets read
var uData = (fRef, maxPacketSize, UnsafeMutablePointer<Int64>.allocate(capacity: 1))
err = AudioConverterNew(&inStreamDesc, &outStreamDesc, &converter)
err = AudioConverterFillComplexBuffer(converter!, { _, ioNumberDataPackets, ioData, outDataPacketDescription, inUserData in
let uData = inUserData!.load(as: (AudioFileID, UInt32, UnsafeMutablePointer<Int64>).self)
ioData.pointee.mBuffers.mDataByteSize = uData.1
ioData.pointee.mBuffers.mData = UnsafeMutableRawPointer.allocate(byteCount: Int(uData.1), alignment: 1)
outDataPacketDescription?.pointee = UnsafeMutablePointer<AudioStreamPacketDescription>.allocate(capacity: Int(ioNumberDataPackets.pointee))
let err = AudioFileReadPacketData(uData.0, false, &ioData.pointee.mBuffers.mDataByteSize, outDataPacketDescription?.pointee, uData.2.pointee, ioNumberDataPackets, ioData.pointee.mBuffers.mData)
uData.2.pointee += Int64(ioNumberDataPackets.pointee)
return err
}, &uData, &numPackets, &bufferList, nil)
Again, I'm no expert, this is just what I've learned by trial and error.

audioqueue kAudioQueueParam_Pitch

The documentation for Audio Queue Services under OS 10.6 now includes a pitch parameter:
kAudioQueueParam_Pitch
The number of cents to pitch-shift the audio queue’s playback, in the range -2400through 2400 cents (where 1200 cents corresponds to one musical octave.)
This parameter is usable only if the time/pitch processor is enabled.
Other sections of the same document still say that volume is the only available parameter, and I can't find any reference to the time/pitch processor mentioned above.
Does anyone know what this refers to? Directly writing a value to the parameter has no effect on playback (although no error is thrown). Similarly writing the volume setting does work.
Frustrating as usual with no support from Apple.
This is only available on OSX until iOS 7. If you look at AudioQueue.h you'll find it is conditionally available only on iOS 7. [note: on re-reading I see you were referring to OS X, not iOS, but hopefully the following is cross-platform]
Also, you need to enable the queue for time_pitch before setting the time_pitch algorithm, and only the Spectral algorithm supports pitch (all of them support rate)
result = AudioQueueNewOutput(&(pAqData->mDataFormat), aqHandleOutputBuffer, pAqData,
0, kCFRunLoopCommonModes , 0, &(pAqData->mQueue));
// enable time_pitch
UInt32 trueValue = 1;
AudioQueueSetProperty(pAqData->mQueue, kAudioQueueProperty_EnableTimePitch, &trueValue, sizeof(trueValue));
UInt32 timePitchAlgorithm = kAudioQueueTimePitchAlgorithm_Spectral; // supports rate and pitch
AudioQueueSetProperty(pAqData->mQueue, kAudioQueueProperty_TimePitchAlgorithm, &timePitchAlgorithm, sizeof(timePitchAlgorithm));

Detecting if the microphone is on

Is there a way to programmatically detect whether the microphone is on on Windows?
No, microphones don't tell you whether they're ‘on’ or that a particular sound channel is connected to a microphone device. The best you can do is to read audio data from the input channel you suspect to be a microphone (eg. the Windows default input device/channel), and see if there's any signal on it.
To do that you'd have to remove any DC offset and look for any signal above a reasonable noise floor. (Be generous: many cheap audio input devices are quite noisy even when there is no signal coming in. A mid-band filter/FFT would also be useful to detect only signals in the mid-range of a voice and not low-frequency hum and transient clicks.)
This is not tested in any way, but I would try to read some samples and see if there is any variation. If the mike is on then you should get different values from the ambient sounds. If the mike is off you should get a 0. Again this is just how I imagine things should work - I don't know if they actually work that way.
Due to a happy accident, I may have discovered that yes there is a way to detect the presence of a connected microphone.
If your windows "recording devices" shows "no microphone", then this approach (using the Microsoft Speech API) will work and confirm you have no mic. If windows however thinks you have a mic, this won't disagree.
#include <sapi.h>
#include <sapiddk.h>
#include <sphelper.h>
CComPtr<ISpRecognizer> m_cpEngine;
m_cpEngine.CoCreateInstance(CLSID_SpInprocRecognizer);
CComPtr<ISpObjectToken> pAudioToken;
HRESULT hr = SpGetDefaultTokenFromCategoryId(SPCAT_AUDIOIN, &pAudioToken);
if (FAILED(hr)) ::OutputDebugString("no input, aka microphone, detected");
more specifically, hr will return this result:
SPERR_NOT_FOUND 0x8004503a -2147200966
The requested data item (data key, value, etc.) was not found.

Resources