I listed the devices with waveInGetDevCaps and it shows only my microphone. However, I need to record the speaker audio. Is it possible to record from the devices listed by waveOutGetDevCaps? All the examples I find use waveIn.
I am trying to record the system audio, not the microphone audio.
I have two goals: the first is to record the sound and then run music recognition on it, and the second is to record the screen and the system audio together. Do the DirectShow APIs record audio as well?
Edit: I started on the DirectShow route and I am able to list CLSID_AudioInputDeviceCategory, but I can't find an example of how to record system audio. Does anyone know of one, or can anyone provide one?
Related
I am reading about Audio Units on OSX, but it's not totally clear to me what an Audio Unit is.
I would like to insert custom audio processing in any stream that is being captured from a microphone or played by any application.
Is it possible to implement the custom audio processing as an Audio Unit which is automatically inserted into any capture or render streams on the machine?
If so, are there any good examples in the public domain that I can take a look at?
I'm trying to use various audio sources in DirectShow and I have these capture devices in my system which I think are quite common (provided by chipset drivers):
Realtek HD Audio Line input
Realtek HD Audio Stereo input
Realtek HD Audio Mic input
They look like capture sources, expose an analog input and a 24-bit PCM output, and can connect the output to other filters (renderer etc.).
But the return code from IMediaFilter::Run of the capture filter is ERROR_BAD_COMMAND which does not say much. I tried it in my program and also in GraphStudioNext which did not reveal any extra information.
Is it possible to use these for capture and how?
Update
For instance, I tried this graph with mic input (actually connected and working). In this setup, the graph does not start (ERROR_BAD_COMMAND) but with the other source, it would start.
This is the same device but with different drivers. The one that works is from the category "Audio capture sources"; the one that does not is from "WDM Streaming Capture Devices".
The easiest way to check the device with GraphStudioNext is to build a recording graph with the PCM audio input device itself, the AVI Mux filter and the File Writer filter connected like this (with default media types):
You hit Run and the recording graph produces a non-empty file via the File Writer in the location you were prompted for while building the graph.
--
So now I realized your question is a bit different. You see filters corresponding to your audio input device both under
Audio Capture Sources -- CLSID_AudioInputDeviceCategory
WDM Streaming Capture Devices -- AM_KSCATEGORY_CAPTURE
And the question is that the first filter works and the other does not.
A similar filter from AM_KSCATEGORY_CAPTURE seems to connect into the topology, but an attempt to run it triggers ERROR_BAD_COMMAND.
First of all, these are indeed different filters. Even though the underlying hardware might be the same, the "frontend" filters are different. The wrapper that "works" is the Audio Capture Filter backed by the WDM device. The other one is a generic WDM filter proxy whose behavior is, generally speaking, undefined. That filter is not documented and, I am guessing, it either does not receive sufficient initialization or does not implement the required behavior, so this proxy is not, and is not supposed to be, interchangeable with the Audio Capture Filter.
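As a side note on reading the error itself: ERROR_BAD_COMMAND is Win32 error 22 ("The device does not recognize the command"). The question does not give the raw value returned by IMediaFilter::Run, so the following is only a sketch, assuming the code arrives wrapped the way the HRESULT_FROM_WIN32 macro wraps it. The names `HresultParts`, `SplitHresult` and `HresultFromWin32` are hypothetical helpers written for this illustration, not Windows SDK functions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: the three fields packed into a 32-bit HRESULT.
struct HresultParts {
    bool     failed;    // severity bit (bit 31)
    uint32_t facility;  // bits 16..28; 7 is FACILITY_WIN32
    uint32_t code;      // low 16 bits; the Win32 error number for FACILITY_WIN32
};

// Split an HRESULT into severity, facility and code.
HresultParts SplitHresult(uint32_t hr)
{
    HresultParts p;
    p.failed   = (hr >> 31) != 0;
    p.facility = (hr >> 16) & 0x1FFF;
    p.code     = hr & 0xFFFF;
    return p;
}

// Portable re-statement of what HRESULT_FROM_WIN32 does for a non-zero
// Win32 error code (assumption: the filter wraps its error this way).
uint32_t HresultFromWin32(uint32_t win32Error)
{
    if (win32Error == 0) {
        return 0;  // success stays S_OK
    }
    return (win32Error & 0xFFFFu) | (7u << 16) | 0x80000000u;
}
```

Under that assumption, a failing Run that reports ERROR_BAD_COMMAND would show up as 0x80070016: failure bit set, facility 7, code 22. That confirms the error comes from the driver layer rather than from DirectShow's own VFW_E_* range, but it does not tell you more than that.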
I am looking for a way I can modify an output stream from the microphone.
The idea is to modify the output stream merging two audio streams into single one.
My use case is the following: when a person makes a Skype call, a background song is added to the output stream.
Is there any way to do this on Windows?
If you are talking about manipulating the input that other programs see, this would be fairly difficult to implement: you would have to create a virtual audio device and then have the target program use it. There are existing packages that already provide that functionality, however. A search for "virtual audio cable" or "virtual mixer" might turn up something that works.
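Whichever virtual device you end up with, the merging step itself is simple. Here is a minimal sketch, assuming both streams are already 16-bit PCM at the same sample rate and channel layout. `MixPcm16` and its `musicGain` parameter are hypothetical names for this illustration; delivering the result to Skype is still the virtual device's job and is not shown.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Mix two 16-bit PCM buffers sample by sample.
// musicGain scales the background track (for example 0.0f to 1.0f).
std::vector<int16_t> MixPcm16(const std::vector<int16_t>& voice,
                              const std::vector<int16_t>& music,
                              float musicGain)
{
    std::vector<int16_t> out(std::max(voice.size(), music.size()));
    for (std::size_t i = 0; i < out.size(); ++i) {
        // Treat the shorter buffer as silence once it runs out.
        const int32_t v = i < voice.size() ? voice[i] : 0;
        const int32_t m = i < music.size()
                              ? static_cast<int32_t>(music[i] * musicGain)
                              : 0;
        // Sum in 32 bits, then clamp to the 16-bit range so loud
        // passages saturate instead of wrapping around.
        const int32_t sum = v + m;
        const int32_t clamped =
            std::max<int32_t>(INT16_MIN, std::min<int32_t>(INT16_MAX, sum));
        out[i] = static_cast<int16_t>(clamped);
    }
    return out;
}
```

Summing in a wider type and clamping is the usual way to avoid the harsh artifacts of integer overflow. Keeping `musicGain` well below 1.0 leaves headroom so the voice stays intelligible.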
I am executing VLC from my application to capture and encode from a DirectShow audio capture device. VLC sends the encoded data to my application via STDOUT. I need a way to enumerate DirectShow audio capture devices. Unfortunately, VLC doesn't seem to provide any non-GUI way for this.
While looking for a simple way to get a list of device names, I stumbled on these registry keys where child keys are named after audio capture devices:
HKEY_CURRENT_USER\Software\Microsoft\ActiveMovie\devenum 64-bit\{33D9A762-90C8-11D0-BD43-00A0C911CE86}
HKEY_CURRENT_USER\Software\Microsoft\ActiveMovie\devenum\{33D9A762-90C8-11D0-BD43-00A0C911CE86}
Is this registry location guaranteed to be in the same place for other machines and recent versions of DirectX? Short of implementing a ton of DirectX code, is there some other way to get a list of the DirectShow audio device names? (Possibly through some output of a diagnostic tool.)
The list of DirectShow devices (DirectShow is a core Windows API, no longer a part of DirectX) is provided on request by enumerators that list a specific category, in this case audio input devices, CLSID_AudioInputDeviceCategory. That is the GUID in your registry paths, and the registry does not necessarily contain entries for all devices. Instead, the enumerator builds the list of devices programmatically via the API, combining the available devices of different types.
There is no way to affect the enumeration order in a well-defined, documented way.
The easiest way to enumerate the devices is the Windows SDK GraphEdt.exe tool, or its nicer alternatives GraphStudio/GraphStudioNext. Press Ctrl+F and then select the category:
You can also enumerate devices and their capabilities with EnumerateAudioCaptureFilterCapabilities command line tool (source code), where "Friendly Name" lines list devices in enumeration order:
Moniker Display Name: #device:cm:{33D9A762-90C8-11D0-BD43-00A0C911CE86}\Stereo Mix (Realtek High Defini
Friendly Name: Stereo Mix (Realtek High Defini
Pin: Capture
  Capability Count: 23
  Capability 0:
    AM_MEDIA_TYPE:
      .bFixedSizeSamples: 1
      .bTemporalCompression: 0
      .lSampleSize: 4
      .cbFormat: 18
    WAVEFORMATEX:
      .wFormatTag: 1
      .nChannels: 2
      .nSamplesPerSec: 44100
      .nAvgBytesPerSec: 176400
      .nBlockAlign: 4
      .wBitsPerSample: 16
      .cbSize: 0
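The numbers in the WAVEFORMATEX part of the output are related to each other, which helps when you need to request a specific format yourself. Here is a small sketch of that arithmetic for plain PCM. `PcmFormat` is a hypothetical portable stand-in that only mirrors the field names (it is not the real Windows structure), and `MakePcmFormat` is a helper invented for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in mirroring the WAVEFORMATEX field names from the
// output above; the real structure comes from the Windows headers.
struct PcmFormat {
    uint16_t wFormatTag;       // 1 = WAVE_FORMAT_PCM
    uint16_t nChannels;
    uint32_t nSamplesPerSec;
    uint32_t nAvgBytesPerSec;
    uint16_t nBlockAlign;
    uint16_t wBitsPerSample;
    uint16_t cbSize;           // no extra format bytes for plain PCM
};

// Fill in the derived fields for uncompressed PCM.
PcmFormat MakePcmFormat(uint16_t channels, uint32_t sampleRate,
                        uint16_t bitsPerSample)
{
    PcmFormat f{};
    f.wFormatTag     = 1;
    f.nChannels      = channels;
    f.nSamplesPerSec = sampleRate;
    f.wBitsPerSample = bitsPerSample;
    // One block is one sample for every channel.
    f.nBlockAlign = static_cast<uint16_t>(channels * bitsPerSample / 8);
    // Bytes per second is the block size times the sample rate.
    f.nAvgBytesPerSec = sampleRate * f.nBlockAlign;
    f.cbSize = 0;
    return f;
}
```

For the capability listed above (2 channels, 44100 Hz, 16 bits) this gives nBlockAlign 4 and nAvgBytesPerSec 176400, which matches the tool output. The lSampleSize of 4 in the AM_MEDIA_TYPE part is the same block size.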
To affect the order, such as to place a device of interest at the top of the list, I can only think of API hooking, which is possible but not recommended for wide use because it alters standard system behavior.
I ran a test today with a DirectShow graph I assembled that had a Capture Filter assigned to my VOIP phone at the top of the graph. The app takes the audio from the capture filter and writes a WAV file, as part of the filter graph's operations. Out of curiosity I ran two copies of the program, fully expecting one of them to throw an error complaining that the capture device was "in use". Much to my surprise both copies of the program worked fine and each created its own WAV file of the recorded audio. The audio in both files was smooth and without problem and twins of each other in regards to the contained audio data.
Can I count on all DirectShow capture filters to exhibit the ability to be shared between multiple filter graphs? Or is every device/driver different?
If the filter instances don't internally share any exclusive-access resources (such as hardware, specific TCP ports etc.), you are free to duplicate them within a process, or across multiple processes. Nothing in DirectShow itself requires a specific filter to be active in only a single instance throughout the system. Whether sharing works therefore depends on the device and driver behind the filter.
Important examples include:
USB video capture: a video capture device is normally intended to be used by one application only, so as soon as it is active it is locked and no other application or filter instance can capture from it.
Audio playback: the popular user-mode audio API is a layer on top of the actual playback implementation, and internally a driver mixes audio from multiple audio-enabled applications. When you play audio, there is no exclusive lock involved because the actual device is shared between applications, and system code combines the audio from those applications transparently.