Access to System Audio on Mac - macos

For a some-what small (at least hopefully) project, I am hoping to gain access to the current audio being played through the "main line" (i.e. what is heard through the speakers.) Specifically, I'd like to create a visual equalizer of the audio currently being played. I do not wish to capture or "tamper" with the audio in any way, just run a little analysis on it. That being said, I'd imagine access to such information is not handed out nicely in a high-level API.
I noticed a similar question which is concerned with looking at system sound. The accepted answer points to looking into Soundflower's source code. I am not completely adverse to doing this but I'd like to ensure there isn't a simpler way before I got into it (especially because I have no real audio programming experience, especially at the system level.)
Any input is very much appreciated,
--Sam

There is no simple way to do this on OS X. You really have to do this from a kext, unfortunately.

Related

How to check programmatically if an app is making a certain sound?

I need to check if an app is making a certain sound. This app only produces a single specific sound, so a solution that simply checks if there's any sound whatsoever from the app will also work.
I don't need to find out which app makes a sound or anything like that. I know the app that should produce a sound and I know what sound it's going to be, I simply need to detect the exact time this sound is played.
The only solution I know of is to listen to the audio output for the whole OS and then detect my specific sound with some audio recognition software, but it won't work properly if there's music or a movie playing on the background, so it's not an option.
I need a solution to do it via WinAPI methods. The language isn't very important here - I can use C#, javascript, Python or another language. I just need to find out a general approach on how to extract sound produced by a specific application in Windows 7.
The general approach here is to trace calls from a given process to OS to play audio. These calls are more commonly known as "system calls".
This will show only direct attempts by a process to produce sound.
The only hardest part here is to identify all the system calls, that play sound in windows.
This question has some answers on how to trace system calls on Windows
Have you looked at SO answer on similar topic with a bunch of useful .Net wrappers for IAudioSessionManager2 and related API: Controlling Application's Volume: By Process-ID
I think that general approach of
Finding IAudioSession by process name
Subscribing to its events via IAudioSessionEvents
Listening to the OnStateChanged event
should do it for you.
And don't forget that you should pump Windows messages which might require some explicit code in non-UI applications. In UI applications this is what Application.Run does internally anyway.

Mac OSX Audio keyboard

I am creating an application that will pre-record a user's voice for each letter on the keyboard and when the app is running, if the user calls out '5', the system types 5 to which ever application is capable of accepting the input at that time. I am .NET person and venturing into XCode.
I have done some research and I am pretty sure of using AV Foundation for recording the audio. The question is how to use speech recognition in OSX and use it to identify a particular key on the keyboard...Will highly appreciate any feedback even if it might be general advice for the approach that I should take to tackle this project!
ThankS IN ADVANCE :) !
Let me be clear first. I have never done this before, but i have a general idea of how it is done. You need to Bind a audio file to a certain number/Key. Whenever a user speaks into the mic, you record their voice and upload it to a server, which compares the Audio File from the User to the pre recorded audio file the user made.
Here is a SO Question that talks about Audio Fingerprinting.
How can I Compare 2 Audio Files Programmatically?
You can compare the audio files in PHP/Python, and have it return a value. For example. If audio file a.mp3 (on server) matches to the newRecorded.mp3 the user just recorded, return a.mp3, then just strip the .mp3 and keep the key.
As far as recording sentences and commands, you might be able to do the same. I will continue to do more research on this and help you out as much as i can.
Hopefully this gives you a better idea and easier way of doing things.
Also there is this
https://developer.apple.com/library/mac/documentation/cocoa/reference/ApplicationKit/Classes/NSSpeechRecognizer_Class/Reference/Reference.html
and
https://developer.apple.com/library/mac/documentation/cocoa/conceptual/speech/Articles/RecognizeSpeech.html#//apple_ref/doc/uid/20002081-BCIHEBFH
This could be really helpful and would use built in speech recognition.

Media players that can be programmatically controlled? (Ruby)

I already know that iTunes has an interface that I can control, but the API is a bit opaque and I can't find it documented anywhere. Does anyone know of any good open-source or at least well-working media players that can be programmatically controlled?
In particular, I would like to be able to search a media library for a song by title or artist, and play, pause, resume, stop the song.
Ruby would be nice, because I'm working in it, but C would work too. I could write a wrapper.
Edit: My solution has to work on Windows, as that is the environment I am developing in.
XMMS works on a server / client basis. This means that it is relatively easy to control the playback, and the song queue. I'm not sure how easy is to handle file metadata (song info), but maybe that part can be handled independently.
Check this guide to get an overview of functions you can use.
Back in the day, I used MPD.

Finding programming challenge for a (probably) Qt project with tight time frame (interview level)

What would you suggest would be a good challenge for a programmer to show us her/his skills? I'm thinking of a small demo implementation of a GUI program which would not take too much time to do.
Here are the circumstances: (this should not imply the intention to find programmers here, I think there'd be other forums to do that)
We are planning a project which has a tight time frame but apparently we are short on resources so we want to pull in external developers. The project is targetted to be Qt based (although this is not yet finally set) on the Windows platform. We'd prefer Qt as this allows to use own resources later when features need to be added to the software and we are familiar with the Qt platform.
The project needs to interface with HID USB hardware (writing some data blocks out, reading back the result, within to be guaranteed time frames) and a GUI showing graphs of the analyses.
The main intention however is not to find a Qt programmer (although we would prefer that) but a capable programmer - thus the important part of this question is about the challenge.
Don't ask programmers to write something from scratch as an interview task. It's far too suspect.
Think of the qualities that you want in a developer and then write an application that has all of those things done wrong, and ask them to fix it. For example, if you want an Object Oriented developer, give them an application with the data tables directly bound to the UI and ask them to make it OO - it means they can show you in a few minutes that they have OO skills.
By starting with a sample application that is "fixed up" with all the problems, it makes it really easy to compare the results and it will be a much faster test than if you ask people to write something from scratch.
Don't forget to make the test measurable. Score each thing you are testing as well as how long it takes.

How to implement a voice changer?

I want to write a app which change the microphone input voice and make it like robot or some funny man's voice.It must support send changed voice to all application like IM Software or Game Client. Which technology should I pick up? Windows WaveForm Api? DirectX?
audio driver?
Thank you very much!
There's an MSDN Coding4Fun article that explains how to create a voice changer that operates over Skype, in C# (.NET). The full source code is also hosted as a project on CodePlex. In addition, it should be fairly easy do something else with the audio (as opposed to streaming it via Skype), since the project is based around the NAudio framework, which contains a good level of abstraction. Anyway, it is a reasonably complete (and stable) example - definitely worth checking out in my opinion.
If you want/need to use C++ or some other language for development, then this project should at least give you some ideas about how to go about it. Still, if you can use .NET, then you're in luck I think.
Robot voice is often done with a ring modulator effect, mixing the voice with a sine wave - this is easier. Or use a vocoder effect, modulating the voice onto some other waveform, like rectangle - might be a bit more tricky. Go read up how the effects work, get a program with which you can check out how they sound (Audacity works for the ring modulator, finding and using a vocoder may be a bit harder). Then read how it's done or get a library which will do the processing for you.
You are looking to support VSTi or DXi plugins.
There are tons that also act as vocoders, even for free.
You just need to write the host application.
Take a look here :)
Now that's a neat idea, especially for a mobile app.
I'd probably start off-line by using a .wav file as input to get the effects working the way I wanted. You can use any high level language for this, but you probably want something that will map reasonably well into C/C++.
In terms of a production version, I'd go native and do this in C or C++. You want something fast for real time audio processing & I like to avoid dependencies on things like .net for distribution. (Not that I have anything against .net, it's great for servers and distribution within a company but I'm not so keen on having it as a dependency for shrink wrap software.)
Windows DirectShow would be a tempting option - you could do some interesting effects with multi-media as well if you had the voice morpher implemented as a direct show filter.
What you're looking for is a vocoder. I don't know if any of the technologies listed above has a vocoder effect, but the best chance would be with DirectX.
Try this sample app .I think its useful to you.Link

Resources