Windows Core Audio API

I'm developing a Windows application which needs to get the sound output level of the current audio device. I'm currently doing this using the Windows Core Audio API, specifically the EndpointVolume API (IAudioMeterInformation). The application checks the sound output level every 10 ms and runs its own logic according to the level.
The key feature of the app is to manipulate the sound before it reaches the speakers (so by the time you hear it, it has already been processed). The current solution (using EndpointVolume) kind of does this, but it processes sound which was already played; I would like to process the sound just before it is played.
Would it be better to use the peak meter from the DeviceTopology API instead of the peak meter in the EndpointVolume API?
I am asking this because the application needs to react as fast as possible to the sound output level, so the manipulation won't be noticeable. So I thought that if I used DeviceTopology (which sits before the endpoint device) it would be more responsive and less noticeable.
Is my assumption correct, or am I barking up the wrong tree?
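For reference, here is a minimal C++ sketch of the polling approach described above (default render endpoint, IAudioMeterInformation, a 10 ms interval); error handling is omitted and the loop body is just a placeholder for the app's own logic:

```cpp
// Sketch: poll the peak meter of the default render endpoint every 10 ms.
#include <windows.h>
#include <mmdeviceapi.h>
#include <endpointvolume.h>
#include <cstdio>

int main()
{
    CoInitialize(nullptr);

    IMMDeviceEnumerator* enumerator = nullptr;
    CoCreateInstance(__uuidof(MMDeviceEnumerator), nullptr, CLSCTX_ALL,
                     __uuidof(IMMDeviceEnumerator), (void**)&enumerator);

    IMMDevice* device = nullptr;
    enumerator->GetDefaultAudioEndpoint(eRender, eConsole, &device);

    IAudioMeterInformation* meter = nullptr;
    device->Activate(__uuidof(IAudioMeterInformation), CLSCTX_ALL,
                     nullptr, (void**)&meter);

    for (int i = 0; i < 1000; ++i)   // poll for roughly 10 seconds
    {
        float peak = 0.0f;           // 0.0 .. 1.0, reflects already-rendered audio
        meter->GetPeakValue(&peak);
        printf("peak: %f\n", peak);  // placeholder for the app's own logic
        Sleep(10);                   // the 10 ms interval from the question
    }

    meter->Release();
    device->Release();
    enumerator->Release();
    CoUninitialize();
    return 0;
}
```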

Related

How to check programmatically if an app is making a certain sound?

I need to check if an app is making a certain sound. This app only produces a single specific sound, so a solution that simply checks if there's any sound whatsoever from the app will also work.
I don't need to find out which app makes a sound or anything like that. I know the app that should produce a sound and I know what sound it's going to be, I simply need to detect the exact time this sound is played.
The only solution I know of is to listen to the audio output for the whole OS and then detect my specific sound with some audio-recognition software, but that won't work properly if there's music or a movie playing in the background, so it's not an option.
I need a solution using WinAPI methods. The language isn't very important here: I can use C#, JavaScript, Python or another language. I just need a general approach for extracting the sound produced by a specific application in Windows 7.
The general approach here is to trace calls from the given process to the OS to play audio. These calls are more commonly known as "system calls".
This will show only direct attempts by the process to produce sound.
The hardest part here is to identify all of the system calls that play sound in Windows.
This question has some answers on how to trace system calls on Windows.
Have you looked at the SO answer on a similar topic, with a bunch of useful .NET wrappers for IAudioSessionManager2 and the related API: Controlling Application's Volume: By Process-ID?
I think the general approach of
Finding IAudioSession by process name
Subscribing to its events via IAudioSessionEvents
Listening to the OnStateChanged event
should do it for you (a rough sketch follows below).
And don't forget that you should pump Windows messages, which might require some explicit code in non-UI applications. In UI applications this is what Application.Run does internally anyway.
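A rough C++ sketch of that approach against the raw COM interfaces (the .NET wrappers linked above expose the same objects). The target PID and the body of OnStateChanged are placeholders, and error handling is omitted:

```cpp
// Sketch: find the audio session belonging to one process and watch its state.
#include <windows.h>
#include <mmdeviceapi.h>
#include <audiopolicy.h>
#include <cstdio>

// Minimal IAudioSessionEvents implementation; only OnStateChanged is of interest here.
class SessionEvents : public IAudioSessionEvents
{
    LONG m_ref = 1;
public:
    STDMETHODIMP_(ULONG) AddRef()  { return InterlockedIncrement(&m_ref); }
    STDMETHODIMP_(ULONG) Release() { ULONG r = InterlockedDecrement(&m_ref); if (!r) delete this; return r; }
    STDMETHODIMP QueryInterface(REFIID riid, void** ppv)
    {
        if (riid == __uuidof(IUnknown) || riid == __uuidof(IAudioSessionEvents)) { *ppv = this; AddRef(); return S_OK; }
        *ppv = nullptr; return E_NOINTERFACE;
    }
    STDMETHODIMP OnStateChanged(AudioSessionState state)
    {
        // AudioSessionStateActive means the session currently has a running audio stream.
        printf("session state changed: %d\n", (int)state);   // placeholder for your reaction
        return S_OK;
    }
    // The remaining notifications are not needed for this purpose.
    STDMETHODIMP OnDisplayNameChanged(LPCWSTR, LPCGUID) { return S_OK; }
    STDMETHODIMP OnIconPathChanged(LPCWSTR, LPCGUID) { return S_OK; }
    STDMETHODIMP OnSimpleVolumeChanged(float, BOOL, LPCGUID) { return S_OK; }
    STDMETHODIMP OnChannelVolumeChanged(DWORD, float[], DWORD, LPCGUID) { return S_OK; }
    STDMETHODIMP OnGroupingParamChanged(LPCGUID, LPCGUID) { return S_OK; }
    STDMETHODIMP OnSessionDisconnected(AudioSessionDisconnectReason) { return S_OK; }
};

int main()
{
    const DWORD targetPid = 1234;        // placeholder: PID of the app you want to watch
    CoInitialize(nullptr);

    IMMDeviceEnumerator* enumerator = nullptr;
    CoCreateInstance(__uuidof(MMDeviceEnumerator), nullptr, CLSCTX_ALL,
                     __uuidof(IMMDeviceEnumerator), (void**)&enumerator);
    IMMDevice* device = nullptr;
    enumerator->GetDefaultAudioEndpoint(eRender, eConsole, &device);

    IAudioSessionManager2* manager = nullptr;
    device->Activate(__uuidof(IAudioSessionManager2), CLSCTX_ALL, nullptr, (void**)&manager);

    IAudioSessionEnumerator* sessions = nullptr;
    manager->GetSessionEnumerator(&sessions);

    int count = 0;
    sessions->GetCount(&count);
    for (int i = 0; i < count; ++i)
    {
        IAudioSessionControl* control = nullptr;
        sessions->GetSession(i, &control);

        IAudioSessionControl2* control2 = nullptr;
        control->QueryInterface(__uuidof(IAudioSessionControl2), (void**)&control2);

        DWORD pid = 0;
        control2->GetProcessId(&pid);
        if (pid == targetPid)
        {
            // Subscribe to state changes for this session.
            // (The sketch leaks the handler; a real app keeps it and unregisters later.)
            control->RegisterAudioSessionNotification(new SessionEvents());
        }
        control2->Release();
        control->Release();
    }

    // Keep the process alive (and, in a real app, pump messages) while notifications arrive.
    Sleep(60 * 1000);

    sessions->Release();
    manager->Release();
    device->Release();
    enumerator->Release();
    CoUninitialize();
    return 0;
}
```

Note that the session enumerator only returns sessions that already exist when it is called; to catch sessions created later, you can additionally register an IAudioSessionNotification via IAudioSessionManager2::RegisterSessionNotification.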

SmartEyeglasses and Subtitles - Accessibility

I work for a performing arts institution and have been asked to look into incorporating wearable technology into accessibility for our patrons. I am interested in finding out more information regarding the use of SmartEyeglasses for supertitles (aka, subtitles) in live or pre-recorded performance. Is it possible to program several glasses to show the user(s) the same supertitles at the same time? How does this programming process work? Can several pairs of SmartEyeglasses connect with the same host device?
Any information is very much appreciated. I look forward to hearing from you!
Your question is overly broad and liable to be closed as such, but I'll bite:
The documentation for the SDK is available here: https://developer.sony.com/develop/wearables/smarteyeglass-sdk/api-overview/ - it describes itself as being based on Android's. The content of the wearable display is defined in a "card" (an Android UI concept: https://developer.android.com/training/material/lists-cards.html ) and the software runs locally on the glasses.
Things like subtitles for prerecorded and pre-scripted live performances could be stored using file formats like .srt ( http://www.matroska.org/technical/specs/subtitles/srt.html ) which are easy to work with and already have a large ecosystem around them, such as freely available tools to create them and software libraries to read them.
Building such a system seems simple, then: each performance has an .srt file stored on a webserver somewhere. The user selects the performance somehow, and you'd write software which reads the .srt file and displays text on the Card based on the current timecode, through to the end of the script (a rough sketch of that lookup follows below).
...this approach has the advantage of keeping server-side requirements to a minimum (just a static webserver will do).
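To make the timecode idea concrete, here is a small, illustrative sketch of the .srt parsing and lookup logic. It is shown in C++ purely for brevity; on the glasses the host app would be written against the Android-based SDK, but the logic is the same in any language, and real code would need to be more tolerant of format variations (BOMs, stray blank lines, and so on):

```cpp
// Sketch: parse a simple .srt file and look up the cue for a given elapsed time.
#include <fstream>
#include <string>
#include <vector>
#include <cstdio>

struct Cue { long long startMs, endMs; std::string text; };

// "00:01:02,345" -> milliseconds
long long toMs(const std::string& t)
{
    int h = 0, m = 0, s = 0, ms = 0;
    std::sscanf(t.c_str(), "%d:%d:%d,%d", &h, &m, &s, &ms);
    return ((h * 60LL + m) * 60 + s) * 1000 + ms;
}

std::vector<Cue> loadSrt(const std::string& path)
{
    std::vector<Cue> cues;
    std::ifstream in(path);
    std::string line;
    while (std::getline(in, line))                       // cue index line (ignored)
    {
        std::string timing;
        if (!std::getline(in, timing)) break;            // "start --> end" line
        Cue cue{ toMs(timing.substr(0, 12)), toMs(timing.substr(17, 12)), "" };
        while (std::getline(in, line))                   // text lines until a blank line
        {
            if (!line.empty() && line.back() == '\r') line.pop_back();  // tolerate CRLF
            if (line.empty()) break;
            cue.text += line + "\n";
        }
        cues.push_back(cue);
    }
    return cues;
}

// Called with the elapsed time since the performance started; returns the text
// to put on the Card, or an empty string if nothing should be shown right now.
std::string textAt(const std::vector<Cue>& cues, long long elapsedMs)
{
    for (const Cue& c : cues)
        if (elapsedMs >= c.startMs && elapsedMs <= c.endMs) return c.text;
    return "";
}
```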
If you have more complex requirements, such as live transcribing or support for interruptions and unscripted events, then you'd have to write a custom server which sends "live" subtitles to the glasses, presumably over TCP. This would drain the device's battery life, as the Wi-Fi radio would be active for much longer. An alternative might be to consider Bluetooth, but I don't know how you'd build a system that can handle 100+ simultaneous long-range Bluetooth connections.
A compromise is to use .srt files, but have the glasses poll the server every 30 seconds or so to check for any unscripted events. How you handle this is up to you.
(As an aside, this looks like a fun project - please contact me if you're looking to hire someone to build it :D)
Each phone can host only one SmartEyeglass, so you would need a separate host phone for each SmartEyeglass.

Minimum sound power required at iPhone 5 microphone?

I'm writing an Objective-C iOS app which will react to specific sounds.
What is the minimum sound power that the iPhone (5, 5c, 5s) microphone must sense in order to do audio acquisition?
There is no minimum sound power - you can initiate recording at any time. There are multiple ways to do this depending on exactly what you want to do, but the simplest is through the AV Foundation Framework. If you need more control you can use the Core Audio Framework, which provides a very powerful set of tools but is more complicated. See https://developer.apple.com/library/ios/documentation/Miscellaneous/Conceptual/iPhoneOSTechOverview/MediaLayer/MediaLayer.html#//apple_ref/doc/uid/TP40007898-CH9-SW2.
But maybe this isn't exactly what you're asking? If you're asking whether there is some way to have your app "come alive" and start recording when a certain dB level is reached, there is no way to do that. You need to have the app constantly running and recording, and monitor the sound yourself.

Use an output device as a recording source under Vista/Win7 new sound API?

As I understand it, Vista introduced a completely rearchitectured sound input/output system to the OS. In particular, before Vista there was a single system-wide sound mixer, to which output devices could be connected. For recording, it was possible to retrieve data directly from a recording device or from this mixer.
In Vista and later, as I understand it, there is no longer a system-wide mixer. It is possible, in theory, to route some sounds to one output device and other sounds to a different output device,[1] and this requires separate mixers for each output device.
Now, I have a simple recording application that I would like to update to take advantage of this new API. In particular, I was hoping it would be possible to let the user select one of the output devices as an audio data source. My reasoning is that the OS probably mixes all the inputs into each sound device anyway, and hopefully provides a way to tap into the mixed data.
Is it possible to select an output device as an input into my recording application, and if so, how?
[1] Although I have yet to find any UI that actually lets one do this.
Loopback recording. WASAPI supports this directly: initialize an IAudioClient on the render endpoint with the AUDCLNT_STREAMFLAGS_LOOPBACK flag, and the capture stream will receive a copy of whatever that output device is playing (see the MSDN "Loopback Recording" topic).
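A minimal C++ sketch of that loopback capture path, assuming shared mode and a simple polling loop; error handling and the actual handling of the captured frames are omitted:

```cpp
// Sketch: open the *render* endpoint for capture with the LOOPBACK flag and
// read back whatever it is currently playing.
#include <windows.h>
#include <mmdeviceapi.h>
#include <audioclient.h>

int main()
{
    CoInitialize(nullptr);

    IMMDeviceEnumerator* enumerator = nullptr;
    CoCreateInstance(__uuidof(MMDeviceEnumerator), nullptr, CLSCTX_ALL,
                     __uuidof(IMMDeviceEnumerator), (void**)&enumerator);

    IMMDevice* device = nullptr;                       // an output (render) device
    enumerator->GetDefaultAudioEndpoint(eRender, eConsole, &device);

    IAudioClient* client = nullptr;
    device->Activate(__uuidof(IAudioClient), CLSCTX_ALL, nullptr, (void**)&client);

    WAVEFORMATEX* format = nullptr;
    client->GetMixFormat(&format);                     // the device's mix format

    // The LOOPBACK flag turns this capture stream into a tap on the render mix.
    client->Initialize(AUDCLNT_SHAREMODE_SHARED, AUDCLNT_STREAMFLAGS_LOOPBACK,
                       10000000 /* 1 s buffer, in 100-ns units */, 0, format, nullptr);

    IAudioCaptureClient* capture = nullptr;
    client->GetService(__uuidof(IAudioCaptureClient), (void**)&capture);
    client->Start();

    for (int i = 0; i < 1000; ++i)                     // capture for roughly 10 seconds
    {
        UINT32 packetFrames = 0;
        capture->GetNextPacketSize(&packetFrames);
        while (packetFrames != 0)
        {
            BYTE* data = nullptr;
            UINT32 frames = 0;
            DWORD flags = 0;
            capture->GetBuffer(&data, &frames, &flags, nullptr, nullptr);
            // ... write 'frames' frames of 'data' to your recording here ...
            capture->ReleaseBuffer(frames);
            capture->GetNextPacketSize(&packetFrames);
        }
        Sleep(10);
    }

    client->Stop();
    capture->Release();
    CoTaskMemFree(format);
    client->Release();
    device->Release();
    enumerator->Release();
    CoUninitialize();
    return 0;
}
```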

OnLive: How does it work?

OnLive is a cloud-computing solution for gaming. It offers streaming of high-end games to any PC, regardless of its hardware. I wonder how it works: sending raw HD-resolution image and audio data seems unlikely. Would relatively simple compression, like JPEG and MP3/Ogg, do the trick?
Have you read this article? Excerpts thereof:
It's essentially the gaming version of cloud computing - everything is computed, rendered and housed online. In its simplest description, your controller inputs are uploaded, a high-end server takes your inputs and plays the game, and then a video stream of the output is sent back to your computer. Think of it as something like Youtube or Hulu for games.
The service works with pretty much any Windows or Mac machine as a small browser plug-in. Optionally, you will also be able to purchase a small device, called the OnLive MicroConsole, that you can hook directly into your TV via HDMI, though if your computer supports video output to your TV, you can just do it that way instead. Of course, you can also just play on your computer's display if you don't want to pipe it out to your living room set.
[...]
OnLive has worked diligently to overcome lag issues. The first step in this was creating a video compression algorithm that was as quick as possible.
It's basically games-over-VNC. Obviously they use video compression; of what sort I'm not sure. The two obvious alternatives would seem to be something fairly computationally lightweight, such as Motion JPEG or even MPEG-2, running on the same server that's running the game, or something more computationally intensive but compact, such as H.264, running on dedicated hardware.
Personally, if I were designing the service, I'd go for the latter: it allows you to have better compression without massively upgrading all your servers, for the cost of a relatively inexpensive codec chip. Because the video stream is smaller, you can attract people whose connections would have been marginal or too slow with a poorer codec.
This is what I understood: it is a thin-client-based gaming solution. Unlike gaming consoles such as the Wii, Xbox or PlayStation, no CPU/GPU or other processing is needed at the player's side. The game is streamed from a monster server via the internet, rather like a high-fidelity terminal session (RDP/Remote Desktop) but with HD graphics. Controls (inputs) are sent to the server and graphics are sent back. It can be played on a Mac or PC via a web browser add-in, or on a TV with a small unit that connects to the server. It requires a 5 Mbps connection for HD and 1.5 Mbps for SD. Almost all game titles will be available or ported to this platform. No need to buy a console or a game, and no need for a high-end gaming PC… just a broadband connection (which, of course, should be high end).
I think that they are using something like an HDMI H.264 video encoder in order to stream video directly from an HDMI audio/video output.
Something like this HDMI encoder or this H.264 realtime encoder.
You can also use a frame-grabber card like this: http://www.epiphan.com/products/frame-grabbers/vga2ethernet/
There is also one more solution now. If you have a recent Nvidia graphics card, you can get the benefits of hardware-accelerated capture without the extra hardware. It's called "GameStream". You can buy one of the Nvidia devices supporting the protocol, or you can download an open source app called "Moonlight": http://moonlight-stream.com
