I'd like to pull a stream of PCM samples from a Mac's line-in or built-in mic and do a little live analysis (the exact nature doesn't pertain to this question, but it could be an FFT every so often, or some basic statistics on the sample levels, or what have you).
What's a good fit for this? Writing an AudioUnit that just passes the sound through and incidentally hands it off somewhere for analysis? Writing a JACK-aware app and figuring out how to get it to play with the JACK server? Ecasound?
This is a cheesy proof-of-concept hobby project, so simplicity of API is the driving factor (followed by reasonable choice of programming language).
The principal framework for audio development in Mac OS X is Core Audio; it's the basis for all audio I/O. There are layers on top of it like Audio Toolbox, Audio Queue Services, QuickTime, and QTKit that you can use if you want a simplified API for common tasks.
To just pull a stream of samples, you'd probably want to use Audio Queue Services; the AudioQueueNewInput function will set up recording of PCM data and pass it to a callback you supply.
On your Mac, there's a set of Core Audio examples in /Developer/Examples/CoreAudio/SimpleSDK that includes an example (AQRecord, in AudioQueueTools) of the Audio Queue Services recording APIs.
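If it helps, here's a minimal sketch of that setup: AudioQueueNewInput hands filled buffers of PCM to a callback where you can run your FFT or level statistics. 16-bit mono at 44.1 kHz is assumed for simplicity, and error handling is omitted:

```c
#include <AudioToolbox/AudioToolbox.h>
#include <CoreFoundation/CoreFoundation.h>

// Called by the queue whenever a buffer of captured PCM is ready.
static void inputCallback(void *userData, AudioQueueRef queue,
                          AudioQueueBufferRef buffer,
                          const AudioTimeStamp *startTime,
                          UInt32 numPackets,
                          const AudioStreamPacketDescription *packetDesc)
{
    // buffer->mAudioData holds numPackets frames of 16-bit PCM; do your
    // analysis here (keep it quick: this runs on the queue's thread).
    AudioQueueEnqueueBuffer(queue, buffer, 0, NULL); // recycle the buffer
}

int main(void)
{
    AudioStreamBasicDescription fmt = {0};
    AudioQueueRef queue;

    fmt.mSampleRate       = 44100.0;
    fmt.mFormatID         = kAudioFormatLinearPCM;
    fmt.mFormatFlags      = kLinearPCMFormatFlagIsSignedInteger
                          | kLinearPCMFormatFlagIsPacked;
    fmt.mChannelsPerFrame = 1;
    fmt.mBitsPerChannel   = 16;
    fmt.mFramesPerPacket  = 1;
    fmt.mBytesPerFrame    = 2;
    fmt.mBytesPerPacket   = 2;

    AudioQueueNewInput(&fmt, inputCallback, NULL, NULL, NULL, 0, &queue);

    // Prime the queue with a few buffers for it to fill.
    for (int i = 0; i < 3; i++) {
        AudioQueueBufferRef buf;
        AudioQueueAllocateBuffer(queue, 8192, &buf);
        AudioQueueEnqueueBuffer(queue, buf, 0, NULL);
    }

    AudioQueueStart(queue, NULL);
    CFRunLoopRunInMode(kCFRunLoopDefaultMode, 10.0, false); // record ~10 s
    AudioQueueStop(queue, true);
    AudioQueueDispose(queue, true);
    return 0;
}
```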
I think PortAudio is what you need.
Reading from the mic in a console app is about a ten-line C file (see the patests in the PortAudio distribution).
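For reference, the pattern the patests use boils down to something like this (the analysis step in the callback is just a placeholder; error checks omitted):

```c
#include "portaudio.h"

// PortAudio calls this from its audio thread with each block of input.
static int recordCallback(const void *input, void *output,
                          unsigned long frameCount,
                          const PaStreamCallbackTimeInfo *timeInfo,
                          PaStreamCallbackFlags statusFlags,
                          void *userData)
{
    const float *in = (const float *)input;
    if (in != NULL) {
        // frameCount mono float samples are in `in`; feed them to your
        // FFT / statistics here (no printf in here: the callback must
        // stay realtime-safe).
    }
    return paContinue;
}

int main(void)
{
    PaStream *stream;

    Pa_Initialize();
    Pa_OpenDefaultStream(&stream, 1 /* in */, 0 /* out */, paFloat32,
                         44100, 256 /* frames per buffer */,
                         recordCallback, NULL);
    Pa_StartStream(stream);
    Pa_Sleep(5000);           // capture for five seconds
    Pa_StopStream(stream);
    Pa_CloseStream(stream);
    Pa_Terminate();
    return 0;
}
```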
Apple provides sample code for reading and writing audio data. Additionally there is a lot of good information in the Audio section of the Apple Developer site.
I have a simple audio-playing app that uses QTMovie for a few of its features. I'm also developing a little Ethernet-enabled board to stream MP3 or PCM data to.
Is there any way of 'grabbing' what QTMovie is outputting, formatting it into an array of bytes, and sending it over Ethernet to a specific IP? Somehow iTunes manages to do this with AirPlay, so there must be some way to do it.
Thanks for any answers!
There are off-the-shelf products like Rogue Amoeba's Airfoil that you might want to look at:
http://www.rogueamoeba.com/airfoil/mac/
But if you really want to get your hands dirty and develop something yourself, it looks like QTMovie just outputs to Core Audio, and you can set which device:
http://developer.apple.com/library/mac/#qa/qa1578/_index.html
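For the "set which device" part, that Q&A boils down to a couple of QuickTime calls. A rough, untested C sketch (deviceUID is the Core Audio UID of your target output; error handling omitted):

```c
#include <QuickTime/QuickTime.h>

// Route an open movie's audio to a specific Core Audio output device.
static void routeMovieToDevice(Movie movie, CFStringRef deviceUID)
{
    QTAudioContextRef audioContext = NULL;

    // Create an audio context bound to the chosen device...
    QTAudioContextCreateForAudioDevice(kCFAllocatorDefault, deviceUID,
                                       NULL, &audioContext);
    // ...and attach it to the movie, replacing the default output.
    SetMovieAudioContext(movie, audioContext);
    QTAudioContextRelease(audioContext);
}
```

(From QTKit you'd get the underlying Movie with -[QTMovie quickTimeMovie].)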
There's a bit of Q&A on the topic of how programs that intercept Core Audio devices do that:
Code sample for capturing audio from a Mac in Cocoa and saving to file?
In Windows XP, you can configure your sound card's properties via the preloaded Windows software. In the recording properties, if "stereo mix" or "wave out" (or something similar) is selected as the recording device, programs that can record audio ("Sound Recorder" in Windows, for example) record a decent-quality wave file of the audio stream. I usually use GoldWave from download.com as an example of a third-party application that does the same.
Well, I've had trouble getting this scenario to happen on Windows Vista or later in a direct no-bullsh*t manner as described above. And it's more than just a Vista+ problem: some sound cards don't have that option at all.
I was just wondering if there is a way to run a Windows-friendly program (VB?) that takes your audio output stream and converts it (in realtime, obviously) to a WAV file with the same default sampling rate as other WAV files.
Ideally, it would be cool if it worked on any operating system, so is it possible to write a web service that "listens" to your audio card like that without making the computer think it's under a virus attack or something?
Possibly related question:
How to save web audio streaming to file ( c++ / java )
I'm only aware of one manufacturer of sound cards that enabled that option (Creative). However, Vista and beyond support a "loopback" mode which gives you effectively the same functionality. You need to use the low-level WASAPI rendering stack, but it should work just fine.
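To give a feel for it, here's a stripped-down loopback-capture sketch against WASAPI (shared mode, polling, no error checks; link ole32, plus uuid for the interface GUIDs on some toolchains):

```c
#define COBJMACROS
#include <windows.h>
#include <mmdeviceapi.h>
#include <audioclient.h>

void captureLoopback(void)
{
    IMMDeviceEnumerator *enumerator = NULL;
    IMMDevice *device = NULL;
    IAudioClient *client = NULL;
    IAudioCaptureClient *capture = NULL;
    WAVEFORMATEX *fmt = NULL;

    CoInitialize(NULL);
    CoCreateInstance(&CLSID_MMDeviceEnumerator, NULL, CLSCTX_ALL,
                     &IID_IMMDeviceEnumerator, (void **)&enumerator);

    // The trick: open the default *render* endpoint for capture.
    IMMDeviceEnumerator_GetDefaultAudioEndpoint(enumerator, eRender,
                                                eConsole, &device);
    IMMDevice_Activate(device, &IID_IAudioClient, CLSCTX_ALL, NULL,
                       (void **)&client);
    IAudioClient_GetMixFormat(client, &fmt);
    IAudioClient_Initialize(client, AUDCLNT_SHAREMODE_SHARED,
                            AUDCLNT_STREAMFLAGS_LOOPBACK,
                            10000000 /* 1 s buffer in 100 ns units */,
                            0, fmt, NULL);
    IAudioClient_GetService(client, &IID_IAudioCaptureClient,
                            (void **)&capture);
    IAudioClient_Start(client);

    // Poll for packets of whatever the card is currently rendering.
    for (;;) {
        UINT32 frames = 0;
        DWORD flags = 0;
        BYTE *data = NULL;

        IAudioCaptureClient_GetNextPacketSize(capture, &frames);
        if (frames == 0) { Sleep(10); continue; }
        IAudioCaptureClient_GetBuffer(capture, &data, &frames, &flags,
                                      NULL, NULL);
        // `data` holds `frames` frames in the mix format: write to WAV here.
        IAudioCaptureClient_ReleaseBuffer(capture, frames);
    }
}
```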
https://github.com/rdp/virtual-audio-output-sniffer provides a DirectShow input device to capture the summed wave out on Vista+.
You could use low-level waveOut API injection and capture what it receives.
I have SkypeMXrecorder, software that does just that: it injects into any exe and 'sniffs' what goes out from it to the sound hardware. But it seems rather complicated to implement...
How would it be possible to capture the audio programmatically? I am implementing an application that streams the desktop over the network in real time. The video part is finished; I need to implement the audio part. I need a way to get PCM data from the sound card to feed to my encoder (implemented using Windows Media Format).
I think the answer is related to the mixerOpen() and waveInOpen() functions in the Win32 API, but I am not sure exactly what I should do.
How do I open the necessary channel, and how do I read PCM data from it?
Thanks in advance.
The new Windows Vista Core Audio APIs support this explicitly (it's called Loopback Recording), so if you can live with a Vista-only application, this is the way to go.
See the Loopback Recording article on MSDN for instructions on how to do this.
I don't think there is a direct way to do this using the OS - it's a feature that may (or may not) be present on the sound card. Some sound cards have a loopback interface - Creative calls it "What U Hear". You simply select this as the input rather than the microphone, and record from it using the normal waveInOpen() that you already know about.
If the sound card doesn't have this feature then I think you're out of luck other than by doing something crazy like making your own driver. Or you could convince your users to run a cable from the speaker output to the line input :)
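If the card does have such an input, recording from it is plain waveIn code. A stripped-down synchronous sketch (one one-second mono buffer, no error checks):

```c
#include <windows.h>
#include <mmsystem.h>   // link with winmm.lib

#define SAMPLE_RATE 44100

int main(void)
{
    static short samples[SAMPLE_RATE];   // one second of 16-bit mono
    HWAVEIN hwi;
    WAVEFORMATEX fmt = {0};
    WAVEHDR hdr = {0};

    fmt.wFormatTag      = WAVE_FORMAT_PCM;
    fmt.nChannels       = 1;
    fmt.nSamplesPerSec  = SAMPLE_RATE;
    fmt.wBitsPerSample  = 16;
    fmt.nBlockAlign     = fmt.nChannels * fmt.wBitsPerSample / 8;
    fmt.nAvgBytesPerSec = fmt.nSamplesPerSec * fmt.nBlockAlign;

    // WAVE_MAPPER records from whatever is selected as the recording
    // source in the mixer, e.g. "What U Hear" if the card offers it.
    waveInOpen(&hwi, WAVE_MAPPER, &fmt, 0, 0, CALLBACK_NULL);

    hdr.lpData         = (LPSTR)samples;
    hdr.dwBufferLength = sizeof(samples);
    waveInPrepareHeader(hwi, &hdr, sizeof(hdr));
    waveInAddBuffer(hwi, &hdr, sizeof(hdr));
    waveInStart(hwi);

    while (!(hdr.dwFlags & WHDR_DONE))   // wait for the buffer to fill
        Sleep(100);

    // `samples` now holds the captured PCM, ready to encode or save.
    waveInUnprepareHeader(hwi, &hdr, sizeof(hdr));
    waveInClose(hwi);
    return 0;
}
```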
I'm making an application where I will:
Record from the microphone and do some realtime processing on the input
Play an MP3 file (a regular song), but manipulating the output in realtime
Every now and then I'll need to play additional sounds over this song too, but I guess I can do that by simply adding the buffers.
In short, I need to have circular buffers for both recording and playing, and I need to be "feeding" the output buffer every 20 ms or so with the new data that is just about to be played.
I've been looking at DirectSound, and it doesn't seem to help much. Reading from and writing to the buffers seems very similar to the plain Win32 APIs; the only place it seems to help is in playing the "additional sounds" over the main song.
Should I use DirectSound, or should I go straight to raw Windows APIs?
Is DirectSound going to do anything for me?
Thanks in Advance!
The DirectSound APIs give you better realtime control. They are also the supported way to use sound in Windows. I was under the impression that the Win32 APIs were deprecated, but I could be wrong on this.
This question is close to yours:
https://stackoverflow.com/questions/314522/what-is-the-best-c-sound-api-for-windows
also
Is DirectSound the best audio abstraction layer for Windows?
Last but not least, this is what Microsoft has to say: http://msdn.microsoft.com/en-us/library/dd370784(VS.85).aspx
Neither? :)
The story is that DirectSound is the replacement for waveOut, but DirectSound joined DirectInput as a deprecated API in Vista and is itself replaced by WASAPI. On Vista, DirectSound and waveOut are implemented on top of the user-space WASAPI; on XP, waveOut and DirectSound feed into the same kernel-level mixer API.
To consolidate all of these interfaces, take a look at something like OpenAL; it's a well-supported audio standard along the same lines as OpenGL.
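For what it's worth, capture in OpenAL is only a handful of calls, and the same code compiles on Windows, Mac, and Linux. An untested sketch grabbing one second of 16-bit mono (error handling omitted; the headers live under OpenAL/ rather than AL/ on the Mac):

```c
#include <AL/al.h>
#include <AL/alc.h>

int main(void)
{
    short samples[44100];      // one second at 44.1 kHz
    ALCint avail = 0;

    // NULL selects the default capture device; the last argument is
    // the size of OpenAL's internal ring buffer, in sample frames.
    ALCdevice *dev = alcCaptureOpenDevice(NULL, 44100,
                                          AL_FORMAT_MONO16, 44100);

    alcCaptureStart(dev);
    while (avail < 44100)      // busy-wait until a second has accumulated
        alcGetIntegerv(dev, ALC_CAPTURE_SAMPLES, 1, &avail);
    alcCaptureSamples(dev, samples, 44100);
    alcCaptureStop(dev);
    alcCaptureCloseDevice(dev);

    // `samples` is now ready for processing or mixing.
    return 0;
}
```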
It sounds like you're going to be quite sensitive to latency, so it might pay to look at ASIO.
I found Harmony Central - Audio Programming. Also read the DirectSound article on Wikipedia:
Windows Vista features a completely re-written audio stack based on the Universal Audio Architecture. Because of the architectural changes in the redesigned audio stack, a direct path from DirectSound to the audio drivers does not exist.

Because of Xbox 360 and Microsoft Windows integration, Microsoft is actively pushing developers to migrate new applications to equivalent Xbox audio APIs such as XAudio and XACT.
OpenAL looks promising.
How do I go about programmatically creating audio streams using Cocoa on the Mac? For example, how would I make a white-noise generator using the core frameworks in a Cocoa app on Mac OS X?
One way is using the CoreAudio DefaultOutputUnit.
You can configure it with parameters such as output sampling rate, resolution, and output sample format. Then you can programmatically create a raw sound wave and provide this to the output unit.
Take a look at this example on your machine at /Developer/Examples/CoreAudio/SimpleSDK/DefaultOutputUnit/
It uses the default output unit to play a programmatically rendered sine wave. Using that as a starting point, you can write a routine to render anything else to the output.
The /Developer/Examples/CoreAudio/ directory also contains tons of other Core Audio examples.
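As a rough illustration, here is an untested white-noise take on that sample, using the newer AudioComponent calls rather than the Component Manager ones the example uses, and assuming the unit's default Float32 stream format (error handling omitted):

```c
#include <AudioUnit/AudioUnit.h>
#include <CoreFoundation/CoreFoundation.h>
#include <stdlib.h>

// Render callback: fill every buffer with random samples in [-1, 1].
static OSStatus renderNoise(void *inRefCon,
                            AudioUnitRenderActionFlags *ioActionFlags,
                            const AudioTimeStamp *inTimeStamp,
                            UInt32 inBusNumber, UInt32 inNumberFrames,
                            AudioBufferList *ioData)
{
    for (UInt32 b = 0; b < ioData->mNumberBuffers; b++) {
        float *out = (float *)ioData->mBuffers[b].mData;
        for (UInt32 i = 0; i < inNumberFrames; i++)
            out[i] = (rand() / (float)RAND_MAX) * 2.0f - 1.0f;
    }
    return noErr;
}

int main(void)
{
    AudioComponentDescription desc = { kAudioUnitType_Output,
                                       kAudioUnitSubType_DefaultOutput,
                                       kAudioUnitManufacturer_Apple, 0, 0 };
    AudioComponent comp = AudioComponentFindNext(NULL, &desc);
    AURenderCallbackStruct cb = { renderNoise, NULL };
    AudioUnit unit;

    AudioComponentInstanceNew(comp, &unit);
    AudioUnitSetProperty(unit, kAudioUnitProperty_SetRenderCallback,
                         kAudioUnitScope_Input, 0, &cb, sizeof(cb));
    AudioUnitInitialize(unit);
    AudioOutputUnitStart(unit);
    CFRunLoopRunInMode(kCFRunLoopDefaultMode, 5.0, false); // 5 s of noise
    AudioOutputUnitStop(unit);
    AudioUnitUninitialize(unit);
    AudioComponentInstanceDispose(unit);
    return 0;
}
```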
Look at Audio Queue Services.