I'm making an application where I will:
Record from the microphone and do some realtime processing on the input
Play an MP3 file (a regular song), but manipulating the output in realtime
Every now and then I'll need to play additional sounds over this song too, but I guess I can do that by simply adding the buffers.
In short, I need to have circular buffers for both recording and playing, and I need to be "feeding" the output buffer every 20 ms or so with the new data that is just about to be played.
I've been looking at DirectSound, and it doesn't seem to help a lot. The reading and writing to the output buffers seem very similar to Win32, the only place where it seems it'd help is in playing the "additional sounds" over the main song.
Should I use DirectSound, or should I go straight to raw Windows APIs?
Is DirectSound going to do anything for me?
Thanks in Advance!
The Directsound API's give you better realtime control. They are also the supported way to use sound in Windows. I was under the impression that the win32 api's were depracated, but I could be wrong on this.
This question is close to yours
https://stackoverflow.com/questions/314522/what-is-the-best-c-sound-api-for-windows
also
Is DirectSound the best audio abstraction layer for Windows?
last but not least, this is what microsoft has to say http://msdn.microsoft.com/en-us/library/dd370784(VS.85).aspx
Neither? :)
The story is that DirectSound is the replacement for waveOut, but DirectSound joined DirectInput as deprecated APIs in Vista and is replaced with WASAPI. DirectSound and waveOut are implemented on top of the User-Space WASAPI in Vista. On XP, waveOut and DirectSound feed to the same kernel level Mixer API.
To consolidate all of these interfaces take a look at something like OpenAL, it's a well supported audio standard along the same lines as OpenGL.
It sounds like you're going to be quite sensitive to latency. It might pay to look at ASIO
I found Harmony Central - Audio Programming. Also read w:DirectSound.
Windows Vista features a completely
re-written audio stack based on the
Universal Audio Architecture. Because
of the architectural changes in the
redesigned audio stack, a direct path
from DirectSound to the audio drivers
does not exist.
Because of Xbox 360 and Microsoft
Windows integration, Microsoft is
actively pushing developers to migrate
new applications to equivalent Xbox
audio APIs such as XAudio and XACT.
OpenAL looks promising.
Related
I saw in this slides that the winmm and directsound in vista is based on wasapi.
Does it means that winmm and directsound actually do their work by calling the functions in WASAPI ?
I fail to find this information after some google search and hope someone here will know.
With a couple of highly specialized exceptions (directks, ASIO, openal), all audio rendering in Windows goes through WASAPI.
This includes audio rendering via winmm and directsound.
I have an old computer vision experiment that uses Video for Windows to grab frames from a camera connected to the PC. It's a hack, it uses VfW to create a preview window, then it does a GetDIBits from the window DC.
I'm finally ready to port this to DirectShow. My understanding was that I could grab frames from a video capture graph by using ISampleGrabber, but now I read that ISampleGrabber is deprecated.
What's the non-deprecated way to grab frames from a video feed? Do I have to implement my own DirectShow filter that does essentially what ISampleGrabber does?
DirectShow is not deprecated; just the DirectShow Editing Services. I would strongly recommend using DirectShow because of the much wider level of support, unless there are specific features of MF that are needed.
There's been no development of DES for some years, but the sample grabber is a widely-used filter that is somewhat independent of DES. I would be happy to recommend that you use it. If there is an issue in future versions of windows, it would not be more than a day or two's work to replace the filter.
G
I think Windows Media Foundation would be your best bet if you are only targeting Vista/Win7, otherwise you can still use DirectShow/SampleGrabber approach, I doubt it will be removed any time soon. Related question here.
I have this garden variety USB video camera, and it came with two mini-apps, one that just lets you see what the camera sees, and one that records to an .avi file.
But what's the API if I want to grab images from the camera in my own C program? I am making the assumptions that it's (1) possible and (2) desirable to make some call and have a 2D array of pixel information filled in.
What I really want to do is tinker with image processing algorithms, and for that I'd really like to get my code around some live data.
EDIT -
Having had a healthy exposure to Linux, I can grasp how (ideally/in theory) you could open() the device, use ioctl() to configure it, and read() the data. And I'm virtually certain that that's not how Windows is going to present the API. Not knowing what function names Windows might use for a video device API, or even if it has one, makes it difficult to look up, at least with the win32 api search capabilities that I have at my disposal.
You'll probably need the DirectShow API, provided that's how the camera operates. If the manufacturer created their own code path, you'll need their API.
Your first step, as pointed out by ChrisBD, is to check if Windows supports your device.
If that is the case you have three possible Windows APIs for capture:
DirectShow
VFW. Has more or less been replaced by DirectShow
MediaFoundation. Is the newest API that is intended to replace DirectShow. AFAIK not fully implemented yet and only available in Vista.
From the three DirectShow is the best choice. However, learning and using DirectShow is not a trivial task. An excellent example can be found here.
Another possibility is to use OpenCV. OpenCV is an image processing library, that you can also use for processing the images. OpenCV has an image capture API that provides a simpler abstraction and is easier to use than the Windows APIs.
The API is the way to go.
A good indication of whether the camera requires a bespoke one or not is to see if it is recognised by a PC without the manufacturer's applications installed. If windows has the drivers built in the you should be able to use the windows APIs to capture the images.
Alternatively if you know what compression codec has been used for the AVI file you could unpack it.
Ideally it would be good if you could capture the video in native (YUV, RGB15 or similar) format as then you can work on compression as well as manipulation.
How would it be possible to capture the audio programmatically? I am implementing an application that streams in real time the desktop on the network. The video part is finished. I need to implement the audio part. I need a way to get PCM data from the sound card to feed to my encoder (implemented using Windows Media Format).
I think the answer is related to the openMixer(), waveInOpen() functions in Win32 API, but I am not sure exactly what should I do.
How to open the necessary channel and how to read PCM data from it?
Thanks in advance.
The new Windows Vista Core Audio APIs have support for this explicitly (called Loopback Recording), so if you can live with a Vista only application this is the way to go.
See the Loopback Recording article on MSDN for instructions on how to do this.
I don't think there is a direct way to do this using the OS - it's a feature that may (or may not) be present on the sound card. Some sound cards have a loopback interface - Creative calls it "What U Hear". You simply select this as the input rather than the microphone, and record from it using the normal waveInOpen() that you already know about.
If the sound card doesn't have this feature then I think you're out of luck other than by doing something crazy like making your own driver. Or you could convince your users to run a cable from the speaker output to the line input :)
I'd like to pull a stream of PCM samples from a Mac's line-in or built-in mic and do a little live analysis (the exact nature doesn't pertain to this question, but it could be an FFT every so often, or some basic statistics on the sample levels, or what have you).
What's a good fit for this? Writing an AudioUnit that just passes the sound through and incidentally hands it off somewhere for analysis? Writing a JACK-aware app and figuring out how to get it to play with the JACK server? Ecasound?
This is a cheesy proof-of-concept hobby project, so simplicity of API is the driving factor (followed by reasonable choice of programming language).
The principal framework for audio development in Mac OS X is Core Audio; it's the basis for all audio I/O. There are layers on top of it like Audio Toolbox, Audio Queue Services, QuickTime, and QTKit that you can use if you want a simplified API for common tasks.
To just pull a stream of samples, you'd probably want to use Audio Queue Services; the AudioQueueNewInput function will set up recording of PCM data and pass it to a callback you supply.
On your Mac there's a set of Core Audio examples in /Developer/Examples/CoreAudio/SimpleSDK that includes a use (AQRecord in AudioQueueTools) of the Audio Queue Services recording APIs.
I think portaudio is what you need.
Reading from the mike from a console app is a 10 line C file (see patests in the portaudio distrib).
Apple provides sample code for reading and writing audio data. Additionally there is a lot of good information in the Audio section of the Apple Developer site.