Windows API to subscribe to VoIP activity, as "Sound -> Communications" does?

Situation: in the Windows "Control Panel", you can open the "Sound" applet and switch to the "Communications" tab. There, you can configure by what percentage the OS should reduce all other sounds when an incoming VoIP call is ringing (so that you don't miss the call).
Question: is there any API that allows a developer to subscribe and react to such events too? (Say, to auto-pause your game, to set a "do not disturb" status in your messenger app for the duration of the call, or to do any other smart thing that improves the user experience.)
Note: I'm looking for OS-wide API, not "SDK for VoIP app X only".

It turns out that the Microsoft term for this is Custom Ducking Behavior. The seemingly-odd name is explained by the Wikipedia page on ducking:
Ducking is an audio effect commonly used in radio and pop music,
especially dance music. In ducking, the level of one audio signal is
reduced by the presence of another signal. In radio this can typically
be achieved by lowering (ducking) the volume of a secondary audio
track when the primary track starts, and lifting the volume again when
the primary track is finished. A typical use of this effect in a daily
radio production routine is for creating a voice-over: a foreign
language original sound is dubbed (and ducked) by a professional
speaker reading the translation. Ducking becomes active as soon as the
translation starts.
According to MSDN, the APIs you need to implement custom ducking behavior are COM-based. In summary:
MMDevice API for multimedia device enumeration and selection.
WASAPI for accessing the communications capture and render device, stream management operations, and handling ducking events.
WAVE APIs for accessing the communications device and capturing audio input.
Code samples to implement the functionality you want are available at the respective MSDN pages.
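As a rough orientation, the notification part of the list above might look like the sketch below. It is an untested outline, not the MSDN sample: CallActivityTracker, DuckNotification and registerForDucking are hypothetical names introduced here, while IAudioVolumeDuckNotification and IAudioSessionManager2::RegisterDuckNotification are the documented ducking interfaces. It assumes COM is already initialized on the calling thread and that a null session ID is acceptable for your use case. Note also that these callbacks fire when a communications stream opens or closes, which is only an approximation of "a call is ringing".

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <utility>

// Portable part: collapses duck/unduck callbacks into "call active" changes.
// CallActivityTracker is a hypothetical helper, not part of any Windows API.
class CallActivityTracker {
public:
    explicit CallActivityTracker(std::function<void(bool)> onChange)
        : onChange_(std::move(onChange)) {
        assert(onChange_ && "a callback is required");
    }
    void duck()   { if (!active_.exchange(true)) onChange_(true); }
    void unduck() { if (active_.exchange(false)) onChange_(false); }
    bool callActive() const { return active_.load(); }
private:
    std::function<void(bool)> onChange_;
    std::atomic<bool> active_{false};
};

#ifdef _WIN32
#include <windows.h>
#include <mmdeviceapi.h>
#include <audiopolicy.h>

// COM sink that forwards ducking events to the tracker.
class DuckNotification final : public IAudioVolumeDuckNotification {
public:
    explicit DuckNotification(CallActivityTracker& tracker) : tracker_(tracker) {}

    ULONG STDMETHODCALLTYPE AddRef() override { return ++refs_; }
    ULONG STDMETHODCALLTYPE Release() override {
        const ULONG left = --refs_;
        if (left == 0) delete this;
        return left;
    }
    HRESULT STDMETHODCALLTYPE QueryInterface(REFIID riid, void** out) override {
        if (riid == __uuidof(IUnknown) ||
            riid == __uuidof(IAudioVolumeDuckNotification)) {
            *out = static_cast<IAudioVolumeDuckNotification*>(this);
            AddRef();
            return S_OK;
        }
        *out = nullptr;
        return E_NOINTERFACE;
    }
    // A communications stream opened: other audio is about to be ducked.
    HRESULT STDMETHODCALLTYPE OnVolumeDuckNotification(LPCWSTR, UINT32) override {
        tracker_.duck();
        return S_OK;
    }
    // The communications stream closed.
    HRESULT STDMETHODCALLTYPE OnVolumeUnduckNotification(LPCWSTR) override {
        tracker_.unduck();
        return S_OK;
    }
private:
    std::atomic<ULONG> refs_{1};
    CallActivityTracker& tracker_;
};

// Registers `sink` on the default render endpoint. On success the caller owns
// *outManager and should later call UnregisterDuckNotification and Release.
HRESULT registerForDucking(IAudioVolumeDuckNotification* sink,
                           IAudioSessionManager2** outManager) {
    IMMDeviceEnumerator* enumerator = nullptr;
    HRESULT hr = CoCreateInstance(__uuidof(MMDeviceEnumerator), nullptr, CLSCTX_ALL,
                                  __uuidof(IMMDeviceEnumerator),
                                  reinterpret_cast<void**>(&enumerator));
    if (FAILED(hr)) return hr;
    IMMDevice* device = nullptr;
    hr = enumerator->GetDefaultAudioEndpoint(eRender, eConsole, &device);
    enumerator->Release();
    if (FAILED(hr)) return hr;
    hr = device->Activate(__uuidof(IAudioSessionManager2), CLSCTX_ALL, nullptr,
                          reinterpret_cast<void**>(outManager));
    device->Release();
    if (FAILED(hr)) return hr;
    // Assumption: a null session ID rather than the app's own session ID.
    hr = (*outManager)->RegisterDuckNotification(nullptr, sink);
    if (FAILED(hr)) { (*outManager)->Release(); *outManager = nullptr; }
    return hr;
}
#endif
```

The tracker is kept separate from the COM glue so that the reaction logic (pause the game, set a status) can be exercised without Windows; the callback receives true when ducking starts and false when it ends.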

Related

How can I capture microphone data and route it to a virtual microphone device?

Recently, I wanted to get my hands dirty with Core Audio, so I started working on a simple desktop app that applies effects (e.g. echo) to the microphone data in real time, so that the processed data can be used in communication apps (e.g. Skype, Zoom, etc.).
To do that, I figured that I have to create a virtual microphone, to be able to send the processed data (with the effects applied) to communication apps. For example, the user will need to select this new (virtual) microphone as the Input Device in a Zoom call so that the other users in the call can hear her processed voice.
My main concern is that I need to find a way to "route" the voice data captured from the physical microphone (e.g. the built-in mic) to the virtual microphone. I've spent some time reading the book "Learning Core Audio" by Adamson and Avila, and in Chapter 8 the authors explain how to write an app that a) uses an AUHAL to capture data from the system's default input device and b) then sends the data to the system's default output using an AUGraph. So, following this example, I figured that I also need to create an app that captures the microphone data, and only while it's running.
So, what I've done so far:
I've created the virtual microphone, for which I followed the NullAudio driver example from Apple.
I've created the app that captures the microphone data.
For both of the above "modules" I'm certain that they work as expected independently, since I've tested them in various ways. The only missing piece now is how to "connect" the physical mic to the virtual mic. I need to connect the output of the physical microphone to the input of the virtual microphone.
So, my questions are:
Is this something trivial that can be achieved using the AUGraph approach, as described in the book? Should I just find the correct way to configure the graph in order to achieve this connection between the two devices?
The only related thread I found is this, where the author states that the routing is done by
sending this audio data to driver via socket connection So other apps that request audio from out virtual mic in fact get this audio from user-space application that listen for mic at the same time (so it should be active)
but I'm not quite sure how to even start implementing something like that.
The whole process I went through to capture data from the microphone seems quite long, and I was wondering whether there's a better way to do this. The book seems to be from 2012, with some corrections made in 2014. Has Core Audio changed dramatically since then, so that this can now be achieved with just a few lines of code?
I think you'll get more results by searching for the term "play through" instead of "routing".
The Adamson / Avila book has an ideal play-through example that, unfortunately for you, only works when both input and output are handled by the same device (e.g. the built-in hardware on most Mac laptops and iPhone/iPad devices).
Note that there is another audio device concept called "playthru" (see kAudioDevicePropertyPlayThru and related properties) which seems to be a form of routing internal to a single device. I wish it were a property that let you set a forwarding device, but alas, no.
Some informal documentation on this: https://lists.apple.com/archives/coreaudio-api/2005/Aug/msg00250.html
I've never tried it, but you should be able to connect input to output on an AUGraph like this. AUGraph is, however, deprecated in favour of AVAudioEngine, which, last time I checked, did not handle non-default input/output devices well.
I instead manually copy buffers from the input device to the output device via a ring buffer (TPCircularBuffer works well). The devil is in the detail, and much of the work is deciding on what properties you want and their consequences. Some common and conflicting example properties:
minimal lag
minimal dropouts
no time distortion
In my case, if output is lagging too much behind input, I brutally dump everything bar 1 or 2 buffers. There is some dated Apple sample code called CAPlayThrough which elegantly speeds up the output stream. You should definitely check this out.
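The "dump everything bar 1 or 2 buffers" policy can be illustrated with the small sketch below. It is only an illustration under stated assumptions: PlayThroughQueue and its parameters are hypothetical names, and it swaps in a plain std::deque for a real ring buffer such as TPCircularBuffer, so it is neither lock-free nor allocation-free and should not be used as-is on a real-time audio thread. It trades time continuity (a glitch on overrun) for bounded lag.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <utility>
#include <vector>

using AudioBuffer = std::vector<float>;  // one callback's worth of samples

// Hypothetical play-through queue: input pushes, output pops.
// When output falls too far behind, old audio is dropped to bound the lag.
class PlayThroughQueue {
public:
    PlayThroughQueue(std::size_t maxQueued, std::size_t keepOnOverrun)
        : maxQueued_(maxQueued), keep_(keepOnOverrun) {
        assert(keep_ >= 1 && keep_ <= maxQueued_);
    }

    // Input side: enqueue a captured buffer.
    void push(AudioBuffer buf) {
        queue_.push_back(std::move(buf));
        if (queue_.size() > maxQueued_) {
            // Output is lagging: discard the oldest buffers, keep the newest.
            while (queue_.size() > keep_) {
                queue_.pop_front();
                ++dropped_;
            }
        }
    }

    // Output side: returns false and fills `out` with silence on underrun.
    bool pop(AudioBuffer& out, std::size_t frames) {
        if (queue_.empty()) {
            out.assign(frames, 0.0f);
            return false;
        }
        out = std::move(queue_.front());
        queue_.pop_front();
        return true;
    }

    std::size_t queued() const { return queue_.size(); }
    std::size_t dropped() const { return dropped_; }

private:
    std::size_t maxQueued_;
    std::size_t keep_;
    std::size_t dropped_ = 0;
    std::deque<AudioBuffer> queue_;
};
```

With maxQueued = 4 and keepOnOverrun = 2, pushing a fifth buffer before anything is popped discards the three oldest buffers, so the output resumes at most two buffers behind the input. Smoothly speeding up the output instead, as CAPlayThrough does, avoids the audible jump at the cost of more complexity.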
And if you find a simpler way, please tell me!
Update
I found a simpler way:
create an AVCaptureSession that captures from your mic
add an AVCaptureAudioPreviewOutput that references your virtual device
When routing from microphone to headphones, it sounded like it had a few hundred milliseconds' lag, but if AVCaptureAudioPreviewOutput and your virtual device handle timestamps properly, that lag may not matter.

Wear actions execute very slowly or not at all when phone is in Doze mode

I am building an Android app to control power outlets with a smartphone. The app includes an Android Wear app so people can control their lights right from their wrist.
When the user wants to control a light, I send a String action via the MessageApi from the smartwatch to the smartphone, which receives this action in a WearableListenerService and sends the appropriate network signal to the power outlet/gateway in an AsyncTask.
This works fine as long as the phone has not been idle for too long. However, if the phone sits still on the table for too long and Doze kicks in, Wear actions execute very slowly or sometimes not at all. I guess this is in part intended behavior, but it is not practical in my case, as the user can't wait that long for the lights to turn on when entering a dark room.
I am aware that Doze completely cuts networking for everything except FCM/GCM if you are not on the Doze whitelist. But even when my app is on this whitelist and the networking part works, actions can take a long time to execute on the phone.
So my specific question is:
What's the recommended way to handle this scenario, where an action from a wearable device needs to be carried out over the network by the connected smartphone, which is in Doze mode?
Is there a way to exit Doze briefly so that work triggered by the wearable companion app executes faster?
I know the AlarmManager has a new method that works even in Doze mode, but will this fix the processing delay too? Firing an alarm after receiving a MessageEvent from MessageApi seems like a workaround to me.
Or is an AsyncTask simply the wrong way to handle background networking, and is that where the delay comes from?
There are a few options for handling Doze's effects, as described in Adapting your app to Doze. You may want to consider the following:
If your app requires a persistent connection to the network to receive messages, you should use Google Cloud Messaging (GCM) if possible.
GCM is optimized to work with Doze and App Standby idle modes by means of high-priority GCM messages. GCM high-priority messages let you reliably wake your app to access the network, even if the user’s device is in Doze or the app is in App Standby mode.
To help with scheduling alarms, Android 6.0 (API level 23) introduces two new AlarmManager methods: setAndAllowWhileIdle() and setExactAndAllowWhileIdle(). With these methods, you can set alarms that will fire even if the device is in Doze.
However, please note that with these methods, neither setAndAllowWhileIdle() nor setExactAndAllowWhileIdle() can fire alarms more than once per 9 minutes, per app.
Please go through Optimizing for Doze and App Standby for more detailed information.
In addition to this documentation, the same options for handling Doze are discussed in Diving into Doze Mode for Developers, which might also help.

Get Source Tower Information From SMS at Destination

I'm planning to start an SMS-based application and am currently at the feasibility study stage. In my application, clients send their problem to the server by SMS, and we have to analyse the problem and take reasonable action. We also have to find the client's approximate location from the tower they are connected to. I have read about the silent SMS feature but don't understand it. Does anybody have experience detecting the location of an SMS sender (not on Android or iPhone)? Please help me determine whether it is possible to find the location and, if so, how.
In short, this is not possible.
An SMS message, whether in PDU mode or text mode, does not carry any information that would let you match the source location to the message in any way, shape or form.
With reference to the article you linked to in your opening post, I'm sorry to say that there's so much B$$l S$$t in that post that I can smell it from here.
In all the years I've worked with GSM systems, both as a network maintenance engineer and later as a developer writing software to use these systems, not once have I heard of anything such as an 'LMU' or an 'E-OTD'. In fact, the only acronym that article really got correct was 'BTS'. Oh, and the bit on passing the data over the signalling channel.
As for the silent SMS, well that part actually is true. The special type of SMS they refer to is actually called a Ping-SMS and it exists for exactly the same reason that a regular PING on a TCP/IP network exists, and that's to see if the remote system is alive and responding.
What it's NOT used for is the purpose outlined in the article, and that's for criminal gangs to send it to your phone and find out where you are.
For one, the ONLY people that can correctly send these messages are the telephone operators themselves. That's not to say that it's impossible to send one from a consumer device by directly programming a PDU if you have the necessary equipment and know-how. You could, for instance, pull this stunt off using a normal GSM modem, a batch of AT commands and some serious bit twiddling.
However, since this message would by its very nature have to go through your operator's SMSC, and most operators filter out anything from a subscriber connection that's not deemed regular consumer traffic, there's a high chance this would fail.
You could, if you had an account, also send this message using a web SMS provider that allowed you to directly construct binary messages, but again they are likely to filter out anything not deemed a consumer-grade message.
Finally, if you were to manage to send an SMS to a target device, the target device would not reply with anything anywhere near a chunk of location-based info, cell tower, GPS or otherwise. The reason the operators (and ultimately the law enforcement agencies) know this info is that EVERY handset that's attached to the GSM network MUST register itself in the operator's MSC (mobile switching centre). This registration (known as ratching up) is required by the network so it can track which channels are in use by which device on which towers, so that it knows where to send paging and signalling info.
Because of the way the PING SMS works, it causes the destination device to re-register itself, usually forcing the MSC to do a location update on the handset, which causes a re-registration.
Even then, all you get in the MSC is an identifier of the cell site the device is attached to, so unless you have a database in the organisation of all cell sites along with their exact lat/long co-ordinates, it's really not going to help you all that much.
As for the triangulation aspect, well for that to work you'd need to know at least 2 other transmitters that the device in question can see, and what's more you'd need that device to report that info back to someone inside the network.
Since typically it's only the RIL (radio interface layer) on the device that actually keeps track of which transmitters it can see, and since the AT commands for many consumer-grade GSM modems have the ability to query this information disabled, it's often not easy to get that info without actually hacking the firmware in the device in question.
How does Google do it? Well, quite easily: they have commercial agreements with network providers that pass the details of registered towers to their back-end infrastructure. In the apps themselves, they have ways of getting the 'BSS List' and sending that list back to Google HQ, where it's cross-referenced with the data from the network operator and the info they have in their own very large transmitter database. Finally, all this is mashed together with some insane maths to get an approximate location.
Some GSM modems and some mobile phone handsets do have the required AT commands enabled to allow you to get this information easily, and if you can then match that information to your own database, you can locate the handset you're running from. But being able to send a special SMS to another device and get location info back is just a pipe dream, nothing more. Something like this is only going to work if your target device is already running some custom software that you control, and if your device is running software that someone else controls, then you have bigger problems to worry about.

WP7 how to detect when audio track goes to end?

I'm using a BackgroundAudioPlayer agent in my Windows Phone 7 application. When the track ends, the agent side receives a TrackEnded event, but the UI side does not receive any events.
Also, when I intentionally set the audio track's position to its end and then call Play(), the agent side receives a TrackEnded event (because the track has come to an end), but the UI side does receive Stopped in its PlayStateChanged handler. So weird!
How can I let the UI know that a track has come to an end? Communicating through isolated storage is not my favorite!
From research and a little testing, using Isolated Storage as a middle-man between Background and Foreground instances of the BackgroundAudioPlayer is still the only route for Windows Phone 7. The options are mentioned here (which I know you're aware of)...
http://blogs.msdn.com/b/wpukcoe/archive/2012/02/10/background-audio-in-windows-phone-7-5-part-2.aspx
http://msdn.microsoft.com/en-us/library/windowsphone/develop/hh202944(v=vs.105).aspx
https://stackoverflow.com/a/11419680/247257
This was also confirmed by Peter Torr who said:
For example, the agent may need to tell the foreground “I just started pre-downloading the next track,” or “I updated a database table and you should refresh your state”. Such notifications are impossible to create with Windows Phone OS 7.1; at best you can model them by using polling techniques, but this approach is inefficient and prone to errors.
The only good news is that in the same post, he gives a solution (using named events for IPC) for Windows Phone 8 which is a lot more reliable...
http://blogs.windows.com/windows_phone/b/wpdev/archive/2013/03/27/using-named-events-to-coordinate-foreground-apps-and-background-agents.aspx

Windows Phone 7 - Events triggered on phone-call-connect and phone-call-disconnect

I'm writing an application for Windows Mobile 7 which requires information about "when a voice call was placed" and "when a voice call was hung up or disconnected". Are there any APIs or events/triggers that can give me this information?
The current SDK doesn't offer this capability - generally, you cannot keep track of user activity (like calls) outside the application due to a sandboxed environment that by default doesn't offer any system process hooks.
While you can't get any information about a specific phone call, if your application is running you can be informed when a call is received (and ended) by using the Obscured and Unobscured events (exposed by PhoneApplicationFrame).
Please note that this will be triggered when ANY piece of UI chrome covers the page. In addition to incoming call notifications, this will also include incoming SMS notifications, alarms, etc.
These events are an important part of the application lifecycle for some types of apps (typically games) but are often overlooked.

Resources