Detecting if the microphone is on - Windows

Is there a way to programmatically detect, on Windows, whether the microphone is on?

No, microphones don't tell you whether they're ‘on’ or that a particular sound channel is connected to a microphone device. The best you can do is to read audio data from the input channel you suspect to be a microphone (e.g. the Windows default input device/channel), and see if there's any signal on it.
To do that you'd have to remove any DC offset and look for any signal above a reasonable noise floor. (Be generous: many cheap audio input devices are quite noisy even when there is no signal coming in. A mid-band filter/FFT would also be useful to detect only signals in the mid-range of a voice and not low-frequency hum and transient clicks.)
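
Untested, but something like the following sketch of that idea should work, using the classic waveIn API: capture a short burst from the default input, remove the DC offset, and compare the residual RMS against a generous noise floor. The function name and the threshold are illustrative; you'd need to tune the threshold for your hardware.
#include <windows.h>
#include <mmsystem.h>
#include <cmath>
#include <vector>
#pragma comment(lib, "winmm.lib")

// Hypothetical helper: returns true if the default input carries a live signal.
bool LooksLikeLiveMicrophone()
{
    WAVEFORMATEX fmt = {};
    fmt.wFormatTag = WAVE_FORMAT_PCM;
    fmt.nChannels = 1;
    fmt.nSamplesPerSec = 16000;
    fmt.wBitsPerSample = 16;
    fmt.nBlockAlign = fmt.nChannels * fmt.wBitsPerSample / 8;
    fmt.nAvgBytesPerSec = fmt.nSamplesPerSec * fmt.nBlockAlign;

    HWAVEIN hIn = NULL;
    if (waveInOpen(&hIn, WAVE_MAPPER, &fmt, 0, 0, CALLBACK_NULL) != MMSYSERR_NOERROR)
        return false; // no usable input device at all

    std::vector<short> samples(16000); // one second of audio
    WAVEHDR hdr = {};
    hdr.lpData = (LPSTR)samples.data();
    hdr.dwBufferLength = (DWORD)(samples.size() * sizeof(short));
    waveInPrepareHeader(hIn, &hdr, sizeof(hdr));
    waveInAddBuffer(hIn, &hdr, sizeof(hdr));
    waveInStart(hIn);
    while (!(hdr.dwFlags & WHDR_DONE)) Sleep(10); // crude wait for the buffer
    waveInStop(hIn);
    waveInUnprepareHeader(hIn, &hdr, sizeof(hdr));
    waveInClose(hIn);

    // Remove the DC offset, then measure the residual energy (RMS).
    double mean = 0;
    for (short s : samples) mean += s;
    mean /= samples.size();
    double rms = 0;
    for (short s : samples) { double d = s - mean; rms += d * d; }
    rms = std::sqrt(rms / samples.size());
    return rms > 50.0; // generous noise floor; tune for your hardware
}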

This is not tested in any way, but I would try to read some samples and see if there is any variation. If the mic is on, you should get varying values from the ambient sound. If the mic is off, you should get zeros. Again, this is just how I imagine things should work - I don't know if they actually work that way.

Due to a happy accident, I discovered that there is, in fact, a way to detect the presence of a connected microphone.
If your Windows "recording devices" panel shows "no microphone", then this approach (using the Microsoft Speech API) will confirm that you have no mic. If Windows thinks you do have a mic, however, this won't disagree.
#include <atlbase.h> // CComPtr
#include <sapi.h>
#include <sapiddk.h>
#include <sphelper.h>

// Assumes COM has already been initialized (CoInitialize/CoInitializeEx).
CComPtr<ISpRecognizer> m_cpEngine;
m_cpEngine.CoCreateInstance(CLSID_SpInprocRecognizer);
// Ask SAPI for the default audio-input token; this fails when no input exists.
CComPtr<ISpObjectToken> pAudioToken;
HRESULT hr = SpGetDefaultTokenFromCategoryId(SPCAT_AUDIOIN, &pAudioToken);
if (FAILED(hr)) ::OutputDebugStringA("no input, aka microphone, detected");
More specifically, hr will return this result:
SPERR_NOT_FOUND 0x8004503a -2147200966
The requested data item (data key, value, etc.) was not found.

Related

on macOS, can an app disable/suppress all system audio output which is not emitted by itself?

In an app, I'm driving a laser projection device using a connected USB audio interface on macOS.
The laser device takes analog audio as an input.
As a safety feature, it would be great if I could make the audio output from my app the exclusive output, because any other audio from other apps or from the OS itself that is routed to the USB audio interface gets mixed with my laser control audio, which is unwanted and a potential safety hazard.
Is it possible on macOS to make my app's audio output exclusive? I know you can configure AVAudioSession on iOS to achieve this (somewhat - you can duck other apps' audio, but notification sounds will in turn duck your app), but is something like this possible on the Mac? It does not need to be AppStore compatible.
Yes, you can request that CoreAudio gives you exclusive access to an audio output device. This is called hogging the device. If you hogged all of the devices, no other application (including the system) would be able to emit any sound.
Something like this would do the trick for a single device:
#include <CoreAudio/CoreAudio.h>
#include <unistd.h> // getpid
#include <assert.h>

AudioObjectPropertyAddress HOG_MODE_PROPERTY = { kAudioDevicePropertyHogMode, kAudioObjectPropertyScopeGlobal, kAudioObjectPropertyElementMaster };
AudioDeviceID deviceId = // your audio device ID
pid_t hoggingProcess = -1; // -1 means attempt to acquire exclusive access
UInt32 size = sizeof(pid_t);
AudioObjectSetPropertyData(deviceId, &HOG_MODE_PROPERTY, 0, NULL, size, &hoggingProcess);
assert(hoggingProcess == getpid()); // check that you have exclusive access
Hog mode works by setting an AudioObject property called kAudioDevicePropertyHogMode. The value of the property is -1 if the device is not hogged; if it is hogged, the value is the process ID of the hogging process.
If you jump to definition on kAudioDevicePropertyHogMode in Xcode you can read the header doc for the hog mode property. That is the best way to learn about how this property (and pretty much anything and everything else in CoreAudio) works.
For completeness, here's the header doc:
A pid_t indicating the process that currently owns exclusive access to the
AudioDevice or a value of -1 indicating that the device is currently
available to all processes. If the AudioDevice is in a non-mixable mode,
the HAL will automatically take hog mode on behalf of the first process to
start an IOProc.
Note that when setting this property, the value passed in is ignored. If
another process owns exclusive access, that remains unchanged. If the
current process owns exclusive access, it is released and made available to
all processes again. If no process has exclusive access (meaning the current
value is -1), this process gains ownership of exclusive access. On return,
the pid_t pointed to by inPropertyData will contain the new value of the
property.
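
Since hogging every output device is what actually silences other applications, here's a hedged sketch (untested, error handling omitted) of that idea: enumerate all devices via kAudioHardwarePropertyDevices and request hog mode on each. Note this naive version grabs input devices too; filter by stream direction if you only want outputs.
#include <CoreAudio/CoreAudio.h>
#include <vector>

void hogAllDevices()
{
    // Ask the system object for the full device list.
    AudioObjectPropertyAddress devicesAddr = { kAudioHardwarePropertyDevices,
        kAudioObjectPropertyScopeGlobal, kAudioObjectPropertyElementMaster };
    UInt32 size = 0;
    AudioObjectGetPropertyDataSize(kAudioObjectSystemObject, &devicesAddr, 0, NULL, &size);
    std::vector<AudioDeviceID> devices(size / sizeof(AudioDeviceID));
    AudioObjectGetPropertyData(kAudioObjectSystemObject, &devicesAddr, 0, NULL, &size, devices.data());

    // Request exclusive access on each device, as shown above for one device.
    AudioObjectPropertyAddress hogAddr = { kAudioDevicePropertyHogMode,
        kAudioObjectPropertyScopeGlobal, kAudioObjectPropertyElementMaster };
    for (AudioDeviceID dev : devices) {
        pid_t hogger = -1; // -1 requests exclusive access
        AudioObjectSetPropertyData(dev, &hogAddr, 0, NULL, sizeof(hogger), &hogger);
    }
}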

How to properly use MIDIReadProc?

According to Apple's docs:
Because your MIDIReadProc callback is invoked from a separate thread,
be aware of the synchronization issues when using data provided by
this callback.
Does this mean, use @synchronized to do thread blocking for safety?
Or does this literally mean synchronization timing issues may happen?
I am currently trying to read a midi file, and use a MIDIReadProc to trigger the note-on / note-off of a software synth based off of midi events. I need this to be extremely reliable and perfectly in-time. Right now, I am noticing that when I consume these midi events and write the audio to a buffer (all done from the MIDIReadProc), the timing is extremely sloppy and not sounding right at all. So I would like to know, what is the "proper" way to consume midi events from a MIDIReadProc?
Also, is a MIDIReadProc the only option for consuming midi events from a midi file?
Is there another option as far as setting up a virtual endpoint that could be directly consumed by my synthesizer? If so, how does that work exactly?
If you presume a function of this format to be the midiReadProc,
void midiReadProc(const MIDIPacketList *packetList,
                  void *readProcRefCon,
                  void *srcConnRefCon)
{
    const MIDIPacket *packet = &packetList->packet[0];
    UInt32 count = packetList->numPackets;
    for (UInt32 k = 0; k < count; k++) {
        Byte midiStatus  = packet->data[0];
        Byte midiChannel = midiStatus & 0x0F;
        Byte midiCommand = midiStatus >> 4;
        // Parse MIDI messages, extract relevant information and pass it to the
        // controller; the controller must be visible from the midiReadProc.
        packet = MIDIPacketNext(packet); // must advance inside the loop, once per packet
    }
}
the MIDI client has to be declared in the controller; interpreted MIDI events get stored into the controller from the MIDI callback and read by the audioRenderCallback() on each audio render cycle. This way you can minimize timing imprecision to the length of the audio buffer, which you can negotiate during AudioUnit setup to be as short as the system allows for.
A controller can be an @interface myMidiSynthController : NSViewController you define, consisting of a matrix of MIDI channels and a pre-determined maximum polyphony per channel, among other relevant data such as interface elements, phase accumulators for each active voice, AudioComponentInstance, etc... It would be wrong to resize the controller based on the midiReadProc() input. RAM is cheap nowadays.
I'm using such MIDI callbacks for processing live input from MIDI devices. Concerning playback of MIDI files: if you want to process streams or files of arbitrary complexity, you may also run into surprises. The MIDI standard itself has timing features, which work as well as MIDI hardware allows for. Once you read an entire file into memory, you can translate your data into whatever you want and use your own code for controlling sound synthesis.
Please take care not to use any code which would block the audio render thread (i.e. inside audioRenderCallback()), or do memory management on it.
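
To illustrate the handoff described above, here is a minimal sketch: the MIDI callback (producer) pushes parsed events into a fixed-size single-producer/single-consumer ring buffer, and audioRenderCallback() (consumer) drains it once per render cycle, with no locks on either thread. The names (MidiEvent, pushEvent, popEvent) are illustrative, not from any framework.
#include <atomic>

struct MidiEvent { unsigned char status, data1, data2; };

constexpr unsigned kRingSize = 1024; // power of two, so index wrap is a mask
static MidiEvent ring[kRingSize];
static std::atomic<unsigned> head{0}; // written only by the MIDI thread
static std::atomic<unsigned> tail{0}; // written only by the audio thread

// Called from the MIDI thread; never blocks.
static bool pushEvent(const MidiEvent &e) {
    unsigned h = head.load(std::memory_order_relaxed);
    unsigned t = tail.load(std::memory_order_acquire);
    if (h - t == kRingSize) return false; // full: drop rather than block
    ring[h & (kRingSize - 1)] = e;
    head.store(h + 1, std::memory_order_release);
    return true;
}

// Called from the audio render thread; never blocks.
static bool popEvent(MidiEvent &e) {
    unsigned t = tail.load(std::memory_order_relaxed);
    unsigned h = head.load(std::memory_order_acquire);
    if (t == h) return false; // empty
    e = ring[t & (kRingSize - 1)];
    tail.store(t + 1, std::memory_order_release);
    return true;
}
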
You could use AVAudioEngine.musicSequence and prepare your audio unit graph. Then use the MusicSequence API to load your GM file. This way you don't need to do the timing yourself. Note that I have not done this myself so far, but I understand that in theory it should work like this.
After you instantiate your synthesizer audio unit, you attach and connect it to the AVAudioEngine graph.
Does this mean, use @synchronized to do thread blocking for safety?
The opposite of what you've said is true: you should certainly not lock in a realtime thread. The @synchronized directive will block if the resource is already locked. You might consider using lock-free queues for realtime threads. See also Four common mistakes in audio development.
If you have to use CoreMIDI and MIDIReadProc, you can send MIDI commands to the synthesizer audio unit by calling MusicDeviceMIDIEvent right from your callback.
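
Untested, but a minimal sketch of that last suggestion could look like this, assuming synthUnit is your already-initialized synthesizer audio unit passed in via readProcRefCon, and assuming (simplistically) that each packet carries a single 3-byte channel message - real parsing should walk packet->length:
#include <AudioToolbox/AudioToolbox.h>
#include <CoreMIDI/CoreMIDI.h>

void midiReadProc(const MIDIPacketList *packetList,
                  void *readProcRefCon,
                  void *srcConnRefCon)
{
    AudioUnit synthUnit = (AudioUnit)readProcRefCon; // assumption: passed at setup
    const MIDIPacket *packet = &packetList->packet[0];
    for (UInt32 k = 0; k < packetList->numPackets; k++) {
        // Forward each message straight to the synth audio unit.
        MusicDeviceMIDIEvent(synthUnit,
                             packet->data[0], // status
                             packet->data[1], // data1 (e.g. note number)
                             packet->data[2], // data2 (e.g. velocity)
                             0);              // 0 = play immediately
        packet = MIDIPacketNext(packet);
    }
}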

Windows audio waveOutSetVolume cross-connects midiOutSetVolume

I have a program that generates both MIDI and wav audio. I need to be able to control the volume and balance of MIDI and audio separately, and in theory it seems like all I need to do is call
unsigned short left = (unsigned short)(wavvol*wavbal/100.0);
unsigned short right = (unsigned short)(wavvol*(100-wavbal)/100.0);
MMRESULT err = waveOutSetVolume(hWaveOut, left | (right<<16)); // for audio
and
unsigned short left = (unsigned short)(midivol*midibal/100.0);
unsigned short right = (unsigned short)(midivol*(100-midibal)/100.0);
MMRESULT err = midiOutSetVolume(s_hmidiout, left | (right<<16)); // for midi
for MIDI.
The problem is, controlling the MIDI volume sets the wave volume and vice versa; it's like they are glued together inside Windows.
Does anyone know if there is a way to unglue them?
BTW, I'm on Windows 7; I know Microsoft messed up audio in Win7. On XP I had an audio control panel with separate controls for MIDI and wave, but that seems to have gone. I guess they just mix it down internally now and don't expose the separation even at the API level, so maybe I've answered my own question.
Still interested to know if there is a better answer though.
Thanks, Steve
I don't think they can be separated. You could move to the newer IAudioClient interface and use two WASAPI sessions to control the volumes separately - one for wav and one for MIDI. This won't work on anything below Vista though.
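
Untested, but the WASAPI suggestion could look roughly like this: give the wav stream and the MIDI stream each their own session (distinct session GUIDs), then adjust each session's volume independently via ISimpleAudioVolume. Error handling and COM initialization are omitted; the helper name is illustrative.
#include <windows.h>
#include <mmdeviceapi.h>
#include <audioclient.h>

ISimpleAudioVolume* CreateSessionVolume(const GUID* sessionGuid)
{
    IMMDeviceEnumerator* enumerator = nullptr;
    CoCreateInstance(__uuidof(MMDeviceEnumerator), nullptr, CLSCTX_ALL,
                     __uuidof(IMMDeviceEnumerator), (void**)&enumerator);
    IMMDevice* device = nullptr;
    enumerator->GetDefaultAudioEndpoint(eRender, eConsole, &device);
    IAudioClient* client = nullptr;
    device->Activate(__uuidof(IAudioClient), CLSCTX_ALL, nullptr, (void**)&client);
    WAVEFORMATEX* mixFormat = nullptr;
    client->GetMixFormat(&mixFormat);
    // A distinct sessionGuid per stream keeps the two volumes independent.
    client->Initialize(AUDCLNT_SHAREMODE_SHARED, 0, 10000000, 0, mixFormat, sessionGuid);
    ISimpleAudioVolume* volume = nullptr;
    client->GetService(__uuidof(ISimpleAudioVolume), (void**)&volume);
    return volume; // volume->SetMasterVolume(0.5f, NULL) affects only this session
}
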
Alternatively, you could track the volume levels in code, and as long as you don't play back both wav and MIDI at the same time, reset them before playback.

Is it possible to use midiOutLongMsg to play a chord? (Win32 API)

This guy says yes:
http://web.tiscalinet.it/giordy/midi-tech/lowmidi.htm
Same with a really old book from 1998 (Maximum MIDI).
MSDN doesn't mention it.
I'm not getting any sound.
I fill a char buffer with status|note|velocity|status|note|velocity...
Set lpData, dwBufferLength, and dwFlags of a MIDIHDR struct
call midiOutPrepareHeader (MMSYSERR_NOERROR)
call midiOutLongMsg (MMSYSERR_NOERROR)
Still no sound! Spamming midiOutShortMsg is working but will that work for slower machines? Did they change the functionality?
Thanks.
I'm an idiot! I figured it out: the Microsoft GS Wavetable Synth does NOT support sending multiple short messages in midiOutLongMsg. The MIDI Mapper DOES!
midiOutShortMsg should be plenty fast, even on slow machines. MIDI interfaces themselves (the hardware, that is, though some software will limit itself to the same rate) run at 31,250 baud. This of course ignores any slow code you may have wrapped around your calls to midiOutShortMsg.
Anyway, technically you should also be able to get away with one status byte, if the following notes use the same status byte. So, if you want to do note on/off (using velocity 0 for off) and those notes are on the same channel, you could do this:
status|note|velocity|note|velocity|note|velocity|note|velocity
This is called running status.
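
For illustration, here is a minimal, untested sketch of sending a three-note chord in a single midiOutLongMsg call using running status as just described. Per the fix above, expect this to produce sound through the MIDI Mapper but not the Microsoft GS Wavetable Synth; error handling is omitted and the helper name is illustrative.
#include <windows.h>
#include <mmsystem.h>
#pragma comment(lib, "winmm.lib")

void playChord(HMIDIOUT hMidiOut)
{
    // 0x90 = note on, channel 0; the following notes reuse it via running status.
    BYTE chord[] = { 0x90, 60, 100, 64, 100, 67, 100 }; // C major triad
    MIDIHDR hdr = {};
    hdr.lpData = (LPSTR)chord;
    hdr.dwBufferLength = hdr.dwBytesRecorded = sizeof(chord);
    midiOutPrepareHeader(hMidiOut, &hdr, sizeof(hdr));
    midiOutLongMsg(hMidiOut, &hdr, sizeof(hdr));
    while (!(hdr.dwFlags & MHDR_DONE)) Sleep(1); // wait before unpreparing
    midiOutUnprepareHeader(hMidiOut, &hdr, sizeof(hdr));
}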

Multiple mice on OS X

I am developing an OS X application that is supposed to take input from two mice. I want to read the motion of each mouse independently. What would be the best way to do this?
If that is not possible, is there a way to disable/enable either of the mice programmatically?
The HID Class Device Interface is definitely what you need. There are basically two steps:
First you need to find the mouse devices. To do this you need to construct a matching dictionary and then search the IO Registry with it. There is some sample code here, you will need to add some additional elements to the dictionary so you just get the mice instead of the all HID devices on the system. Something like this should do the trick:
// Set up a matching dictionary to search the I/O Registry by class
// name for all HID class devices
CFMutableDictionaryRef hidMatchDictionary = IOServiceMatching(kIOHIDDeviceKey);
// Add key for device usage page - 0x01 for "Generic Desktop"
UInt32 usagePage = 0x01;
CFNumberRef usagePageRef = ::CFNumberCreate( kCFAllocatorDefault, kCFNumberSInt32Type, &usagePage );
::CFDictionarySetValue( hidMatchDictionary, CFSTR( kIOHIDPrimaryUsagePageKey ), usagePageRef );
::CFRelease( usagePageRef );
// Add key for device usage - 0x02 for "Mouse"
UInt32 usage = 0x02;
CFNumberRef usageRef = ::CFNumberCreate( kCFAllocatorDefault, kCFNumberSInt32Type, &usage );
::CFDictionarySetValue( hidMatchDictionary, CFSTR( kIOHIDPrimaryUsageKey ), usageRef );
::CFRelease( usageRef );
You then need to listen to the X/Y/button queues from the devices you found above. This sample code should point you in the right direction. Using the callbacks is much more efficient than polling!
The HID code looks much more complex than it is - it's made rather "wordy" by the CF stuff.
It looks like the HID Manager is what you're looking for.
You're going to want to check out the I/O Kit and HID (Human Interface Device) manager stuff.
The HID Manager is part of I/O Kit, so looking in there might be useful. There are two APIs for HID management: the older API is a bit more painful, and then there's the newer 10.5-and-above API, which is a bit more comfortable.
The important thing to understand is that this probably isn't going to be a quick fix; it may take some significant work to get it running. If you can assume 10.5 or better is installed, using the Leopard API will definitely help.
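
Untested, but a minimal sketch of the Leopard (10.5+) IOHIDManager API could look like this: match all mice, then receive X/Y/button values via a callback, telling the mice apart by their IOHIDDeviceRef. Error handling and CF cleanup are omitted.
#include <IOKit/hid/IOHIDManager.h>

static void handleInput(void *ctx, IOReturn result, void *sender, IOHIDValueRef value)
{
    IOHIDElementRef elem = IOHIDValueGetElement(value);
    IOHIDDeviceRef device = IOHIDElementGetDevice(elem); // identifies which mouse
    CFIndex delta = IOHIDValueGetIntegerValue(value);
    uint32_t usage = IOHIDElementGetUsage(elem); // kHIDUsage_GD_X, _GD_Y, buttons...
    // Dispatch (device, usage, delta) to your own per-mouse state here.
}

int main(void)
{
    IOHIDManagerRef mgr = IOHIDManagerCreate(kCFAllocatorDefault, kIOHIDOptionsTypeNone);
    // Match usage page 0x01 (Generic Desktop), usage 0x02 (Mouse), as above.
    int page = 0x01, usage = 0x02;
    const void *keys[] = { CFSTR(kIOHIDDeviceUsagePageKey), CFSTR(kIOHIDDeviceUsageKey) };
    const void *vals[] = { CFNumberCreate(NULL, kCFNumberIntType, &page),
                           CFNumberCreate(NULL, kCFNumberIntType, &usage) };
    CFDictionaryRef match = CFDictionaryCreate(NULL, keys, vals, 2,
        &kCFTypeDictionaryKeyCallBacks, &kCFTypeDictionaryValueCallBacks);
    IOHIDManagerSetDeviceMatching(mgr, match);
    IOHIDManagerRegisterInputValueCallback(mgr, handleInput, NULL);
    IOHIDManagerScheduleWithRunLoop(mgr, CFRunLoopGetCurrent(), kCFRunLoopDefaultMode);
    IOHIDManagerOpen(mgr, kIOHIDOptionsTypeNone);
    CFRunLoopRun();
    return 0;
}
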
Also, depending on how you accomplish what you're doing, it may be important for you to hide the mouse cursor, as it may still move a lot even though you're receiving the information from both mice. If your application grabs the screen, I'd use CoreGraphics to disable the cursor and just draw my own.
You might also consider finding a wrapper for one of these APIs, an example can be found in this question.
Unless you can force one of the mice to not be dealt with as a mouse, both will continue to control the pointer. However, you can use IOKit to write a custom USB HID driver to allow your app to read from one or both of the mice (although this would probably interfere with using them as normal mice). Building Customized User Client Drivers for USB Devices would be a good place to start for how to interact directly with USB mice.
You could look at the USB/PS-2 device interrupt.
Even if you don't want to write a so-called driver, it could be useful, since all the mice send their data through it.
You could also check this page, which could give some hints: http://multicursor-wm.sourceforge.net/
Maybe a solution for you would be to use a USB->RS-232 converter and read the serial port yourself?
