I'm looking for the best approach to implement voice commands in a Xamarin app.
Here are my requirements:
I don't need to launch my app by voice. Instead, my users will launch the app through touch (so, when the app is not running, no voice recognition is needed by my app)
My app is a client/server app and it will always be online (the backend will run on Azure)
My app will be used primarily in the car (so consider environmental noise)
My app will work in many languages, such as Italian, Spanish, French and English
My app should be developed with Xamarin (and possibly MvvmCross or similar)
In my app there will be two kinds of voice commands:
to select an item from a short list: the app will show a list of items, such as "apple, kiwi, banana and strawberry", and the user will have to say one of those words.
to change the current view. Typically these voice commands will be something like "cancel", "confirm", "more" and the like
The typical interaction between user, app and server should be this:
the user says one of the commands available in the current view/activity/page
assume here that the user knows exactly which commands he/she can use; it does not matter how he/she knows them (he/she just knows them)
the user could prefix a command with a special phrase, such as "hey 'appname'", to get a command like "hey 'appname', confirm"
Note: the "hey 'appname'" part of the voice command serves only to let the app know when the command starts. The app can always be in listening mode, but it must avoid continuously sending the audio stream to the server to recognize commands
the best case would be for the app to recognize these commands locally, without involving the remote server, since the voice commands are predefined and well known in each view. Alternatively, the app can send the audio to the server, which will return a string (in this example the returned text will be "confirm", since the audio was "hey 'appname', confirm")
the app will map the recognized text to the available commands and invoke the right one
user will receive a feedback by the app. The feedback could be:
voice feedback (text-to-speech)
visual feedback (something on the screen)
both above
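To make the mapping step concrete, here is a minimal sketch of matching the recognized text against the commands available in the current view. It is written in Python only to keep it short; in the app itself this would be C#. The wake phrase "hey appname", the function names and the normalization rules are all assumptions for illustration, not part of any speech API.

```python
import re
import unicodedata

# Placeholder wake phrase; the real app name would go here (assumption).
WAKE_PHRASE = "hey appname"


def normalize(text):
    """Lower-case, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    no_punct = re.sub(r"[^\w\s]", " ", no_accents)
    return " ".join(no_punct.split())


def match_command(recognized_text, available_commands, wake_phrase=WAKE_PHRASE):
    """Return the entry of available_commands that was spoken, or None.

    The recognizer may return the text with or without the wake phrase,
    so the phrase is stripped only when it is actually present.
    """
    spoken = normalize(recognized_text)
    wake = normalize(wake_phrase)
    if spoken == wake:
        return None  # wake phrase only, no command followed
    if spoken.startswith(wake + " "):
        spoken = spoken[len(wake):].strip()
    for command in available_commands:
        if normalize(command) == spoken:
            return command
    return None


if __name__ == "__main__":
    view_commands = ["cancel", "confirm", "more"]
    print(match_command("Hey AppName, confirm", view_commands))
```

Exact matching after normalization is the simplest possible policy. In a noisy car, a tolerant comparison (for example an edit-distance threshold) might work better, but that is a design choice the question leaves open.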
I was looking at Azure Cognitive Services, but in this case, as far as I've understood, there is no way to recognize the start of the command locally (everything works server-side through the REST API or client libraries). So the user would have to press a button before every voice command, and I need to avoid this kind of interaction.
While the app is running, my user has his/her hands on the steering wheel, so he/she can't touch the display every time.
Moreover, I was looking at the Cortana Skills Kit and Bot Framework, but:
It seems that Cortana Skills are available in English only
Actually, I don't need to involve Cortana to launch my app
I have no experience with these topics, so I hope that my question is clear and, more generally, that it can be useful to other newcomers as well.
* UPDATE 1 *
The Speech Recognition with the Voice Command Definition (VCD) file is really close to what I'd need, because:
it has a way to activate the command through a command name shortcut
It works in foreground (and background as well, even if in my case I don't need the background)
Unfortunately, this service works only on Windows, since it uses the local API. Maybe the right approach could be based on the following considerations:
Every platform exposes a local speech recognition api (Cortana, Siri, Google Now)
Xamarin exposes the Siri and Google Now APIs and makes them available through C#
It would be useful to create a facade component to expose the three different local speech api through a common interface
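The facade idea could look roughly like the sketch below. It is in Python only for brevity; in a Xamarin project the same shape would be a C# interface with one implementation per platform. Every name here (SpeechRecognizer, ScriptedRecognizer, VoiceCommands) is hypothetical, and the scripted class merely stands in for a real platform recognizer.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, Iterable, Optional


class SpeechRecognizer(ABC):
    """Common interface that each platform-specific recognizer would implement."""

    @abstractmethod
    def listen(self, phrases: Iterable[str], locale: str) -> Optional[str]:
        """Return the phrase that was heard, or None if nothing matched."""


class ScriptedRecognizer(SpeechRecognizer):
    """Stand-in for a platform implementation: 'hears' a fixed, canned phrase."""

    def __init__(self, heard: str) -> None:
        self._heard = heard

    def listen(self, phrases: Iterable[str], locale: str) -> Optional[str]:
        return self._heard if self._heard in set(phrases) else None


class VoiceCommands:
    """View-level code depends only on the facade, never on a platform API."""

    def __init__(self, recognizer: SpeechRecognizer, locale: str) -> None:
        self._recognizer = recognizer
        self._locale = locale

    def run_once(self, handlers: Dict[str, Callable[[], str]]) -> Optional[str]:
        """Listen for one of the handler keys and invoke the matching handler."""
        phrase = self._recognizer.listen(list(handlers), self._locale)
        if phrase is None:
            return None
        return handlers[phrase]()


if __name__ == "__main__":
    commands = VoiceCommands(ScriptedRecognizer("confirm"), "it-IT")
    print(commands.run_once({"confirm": lambda: "confirmed",
                             "cancel": lambda: "cancelled"}))
```

Passing the phrase list for the current view into listen reflects the requirement that commands are predefined per view, which is what would let a local recognizer work with a small vocabulary.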
I'm wondering if there is some other solution to this. Cortana, as a personal assistant, is available on Windows, iOS and Android. Since Cortana works both with the local API and with a remote service (Cortana Skills), is Cortana the right approach? Does Cortana support many languages (or, at least, is there a roadmap for that support)?
So, just some thoughts here. If you have other ideas or suggestions, please add them here. Thanks
Related
In Outlook, Excel and Word I've created context menu items which, when chosen, allow the user to jump to a desktop application (passing along the context as well, of course). It all seems old school these days, but think VSTO, add-ins or even VBA.
Is something like this possible from a Teams conversation?
EDIT - Example:
In a conversation in MS Teams, John types a message to Fred: "Hey Fred please look at file number 123456." Currently Fred has to highlight and copy this number, open a desktop app and paste the number to search for the information.
If John writes the same message as an EMAIL to Fred, then because Fred has my addin installed, the addin recognizes the number 123456 and Fred simply right clicks on the number and chooses a context menu. (The addin sends a message through a WCF connection to the Desktop app) The Desktop app springs up to the foreground and displays the file to Fred.
So far, in my reading about MS Teams, I have only seen things about the HTTP protocol, which is nice, but I am hoping there is something more.
From what I can understand, developing for Teams currently means web-only add-ons/extensions, or whatever they call them now. Communication with native applications is not possible for developers; even Microsoft is still trying to link documents in Teams to their own desktop apps.
I never want to open the document in Teams, or in Office Online. I always want to use the native desktop program. Would be nice if there was a global setting so documents always opened in desktop applications. (Microsoft Teams UserVoice December 2017)
It appears that Microsoft Teams does not support any of the coding options currently available for Outlook, Word or Excel because, according to the comments above, "Advanced Threat Protection blocks unsafe protocols".
Sadly, web-only add-ons/extensions and the requirement for them to be centrally uploaded make things very difficult for people working in a corporate environment where there is an IT department that creates so much red tape that your application for something ends with a negative result. Gone are the days when people could code up something for a few people to use in their organization.
If you have landed here from a Google search, my suggestion would be to create a browser extension with a native host. You can do whatever you want with the Teams user interface and send the information through the native host to your desktop applications. This will not work with the Microsoft Teams app; however, as that app is simply the website in a window, it is possible that people will just use a browser anyway.
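As a rough illustration of the native host side, the sketch below implements the message framing used by browser native messaging: each message is UTF-8 JSON preceded by a 32-bit length in native byte order, exchanged over stdin/stdout. The handle function and the "fileNumber" field are hypothetical; a real host would forward the value to the desktop application (for example over the existing WCF connection), and the extension manifest and host registration are not shown.

```python
import json
import struct
import sys


def read_message(stream):
    """Read one length-prefixed JSON message; return None at end of stream."""
    header = stream.read(4)
    if len(header) < 4:
        return None
    (length,) = struct.unpack("=I", header)  # 32-bit length, native byte order
    payload = stream.read(length)
    return json.loads(payload.decode("utf-8"))


def write_message(stream, message):
    """Write one JSON message preceded by its 32-bit length."""
    payload = json.dumps(message).encode("utf-8")
    stream.write(struct.pack("=I", len(payload)))
    stream.write(payload)
    stream.flush()


def handle(message):
    """Hypothetical handler: a real host would pass the number to the desktop app."""
    return {"ok": True, "fileNumber": message.get("fileNumber")}


def main():
    """Serve messages from the browser until it closes the pipe."""
    stdin, stdout = sys.stdin.buffer, sys.stdout.buffer
    while True:
        message = read_message(stdin)
        if message is None:
            break
        write_message(stdout, handle(message))
```

The script the browser launches would simply call main(). The binary streams (sys.stdin.buffer, sys.stdout.buffer) are used because the length prefix is raw bytes, not text.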
The question is simple, but the answer is not present anywhere!
Is it possible to use Cortana to start my app with the command "Hey MyApp" instead of "Hey Cortana"?
I don't mean running Cortana and then asking it to run my app via a voice command.
Thank you for any information.
No, that is not possible. "Hey Cortana" is the only thing that triggers the voice recognition; this is built into the Windows core.
If they had not done that, constantly listening to and evaluating everything being said would take a lot of processing power. It would also make it easy to trigger all kinds of actions and apps accidentally, just by having your phone next to you while talking to another person.
Therefore, the start point for any voice command will be "Hey Cortana" on Windows, "Ok Google" on Android and "Hey Siri" on iOS.
You can implement certain 'skills' into Cortana, with which you can trigger actions within your app.
To get started, head over to the Cortana Dev Center.
My organization utilizes a browser-based app (Chrome/Firefox) and Skype to allow kindergarten students to read books with adults, remotely.
One of the biggest problems we have happens with this scenario:
Tutor calls a dedicated Skype number that resides on student laptop (Windows 7)
Student answers Skype call
Skype application window remains open, blocking browser
Student does not know how to close Skype window
Student cannot see browser until support remotely connects to laptop and minimizes Skype window
Training the students, at that age, doesn't work. Teachers tell them not to mess with the laptops anyway.
So the question is this:
Is there a way to automatically minimize Windows application windows via a custom script or is there a way to force browser windows to regain priority on the desktop based on certain triggers?
I'm looking more for pointers on where to do research or what to look at. I can probably build a solution if I know there's a tool or library where I can start.
Have you tried starting Skype with the following command-line parameter?
/minimized
That would start Skype minimized to the tray.
On a second note, I would imagine that if the kids can answer a Skype call, they could be taught to press ALT+TAB after they have answered it (and thereby switch to the browser).
You could also try AutoHotkey (http://ahkscript.org/), which is a scripting language for automating common tasks.
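If a custom script is preferred over AutoHotkey, the same effect can be sketched with the Win32 calls FindWindowW, ShowWindow and SetForegroundWindow from user32.dll, here reached through Python's ctypes. The window titles are assumptions: FindWindowW needs the exact title, so the real Skype and browser titles have to be looked up on the student laptop. The user32 object is passed in as a parameter so the logic can be exercised without Windows.

```python
import ctypes
import sys

SW_MINIMIZE = 6  # ShowWindow command that minimizes the given window


def minimize_and_focus(user32, blocking_title, browser_title):
    """Minimize the blocking window and bring the browser to the foreground.

    Returns (minimized, focused). A window that cannot be found by its
    exact title is skipped.
    """
    minimized = False
    blocker = user32.FindWindowW(None, blocking_title)
    if blocker:
        user32.ShowWindow(blocker, SW_MINIMIZE)
        minimized = True

    focused = False
    browser = user32.FindWindowW(None, browser_title)
    if browser:
        focused = bool(user32.SetForegroundWindow(browser))
    return minimized, focused


if __name__ == "__main__" and sys.platform == "win32":
    user32 = ctypes.windll.user32
    # Declare signatures so window handles are not truncated on 64-bit Windows.
    user32.FindWindowW.argtypes = [ctypes.c_wchar_p, ctypes.c_wchar_p]
    user32.FindWindowW.restype = ctypes.c_void_p
    user32.ShowWindow.argtypes = [ctypes.c_void_p, ctypes.c_int]
    user32.SetForegroundWindow.argtypes = [ctypes.c_void_p]
    # Both titles below are placeholders (assumptions), not real window titles.
    print(minimize_and_focus(user32, "Skype", "Reading App - Google Chrome"))
```

Something still has to trigger the call, for example a loop that runs it every few seconds or a scheduled task. Windows may also refuse SetForegroundWindow from a background process, in which case minimizing the Skype window alone may be enough to reveal the browser.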
Is there a way to make my WinRT (XAML) application act as a screen saver?
As Jerry says, there's no straightforward way to make a Windows Store app screensaver. However, there's a roundabout solution that might work for you on Windows 8, but not Windows RT. I have it nearly working. I'll share what I have so far.
A screensaver is just an executable with a .scr extension that's kept in C:\Windows\System32. For example, look at C:\Windows\System32\Bubbles.scr. The solution I have in mind is to create a .scr screensaver whose only purpose is to launch your Windows Store application, which you say will use XAML.
You can't launch a Windows Store app from the command line directly, so you'll create a launcher app. Take a look at a blog post called Automating the testing of Windows 8 apps by Ashwin Needamangala. Partway down the article, look for the section called Automating the activation of your app. It contains a sample C++ application which can launch Windows Store apps in the following way:
C:\>Win8AppLaunch.exe Microsoft.BingNews_8wekyb3d8bbwe!AppexNews
The sample launcher on that page needs to be modified, but before you do that just copy the code into a C++ console app:
You're almost ready to test it out from the command line, but you need to specify the name of the app as an AppUserModelId. The details are in Ashwin's post, but to paraphrase you first want to allow the execution of PowerShell scripts on your system with:
PS C:\> Set-ExecutionPolicy AllSigned
Then run this PowerShell script:
$installedapps = Get-AppxPackage
foreach ($app in $installedapps)
{
    foreach ($id in (Get-AppxPackageManifest $app).package.applications.application.id)
    {
        $app.packagefamilyname + "!" + $id
    }
}
You might like running it in the Windows PowerShell ISE. It's pretty slick. Find the AppUserModelId of your app and then test Win8AppLaunch.exe from the command line, as shown above. This should launch your Windows Store app from the command line.
Next, modify the C++ launcher to hard-code the AppUserModelId of your application instead of parsing it from a command line argument. I created a Gist of this. The important part is the line where I declare myApp.
Build the new executable, rename it MyScreenSaver.scr and put it in C:\Windows\System32. It will then appear in the Screen Saver Settings Control Panel. You can preview the screensaver there, and it works. However, if you wait for the screensaver to launch, it will briefly bring up a console window and never fully launch. I'm not sure why. I tried disabling the creation of the console window by switching the project to a Windows app, but that didn't help. You can try that yourself by changing Properties | Configuration | Linker | System | SubSystem to WINDOWS. It's a little more involved, as you'll also need to change the entry point from _tmain to _tWinMain. Contact me through my blog if you want the details. My StackOverflow profile lists it.
At this point it's almost fully working. You might try starting with a blank C++ screensaver that you know works, and then copy in the above code. If I get more time, maybe I'll try this myself.
Cool idea. But, no.
If you want your application to really do something for Windows other than run as a simple app, then you write an extension app. Here's the official word:
Extensions: An extension is like an agreement between an app and Windows. Extensions let app developers extend or customize standard Windows features primarily for use in their apps and potentially for use in other apps.
There are these types of extension apps right now:
Account picture provider (extension)
When users decide to change their account picture, they can either select an existing picture or use an app to take a new one. If your app can take pictures, you can use this extension to have Windows list your app in the Account Picture Settings control panel. From there, users can select it to create a new account picture. For more info about this extension, see the UserInformation reference topic. You can also check out our Account picture name sample.
AutoPlay (extension)
When the user connects a device to a computer, Windows fires an AutoPlay event. This extension enables your app to be listed as an AutoPlay choice for one or more AutoPlay events.
Background tasks (extension)
Apps can use background tasks to run app code even when the app is suspended. Background tasks are intended for small work items that require no interaction with the user.
Camera settings (extension)
Your app can provide a custom user interface for selecting camera options and choosing effects when a camera is used to capture photos or video. For more info about this extension, see Developing Windows Store device apps for cameras.
Contact picker (extension)
This extension enables your app to register to provide contact data. Your app is included in the list of apps that Windows displays whenever the user needs access to their contacts.
For more info about this extension, see the Windows.ApplicationModel.Contacts.Provider reference topic. You can also check out Managing user contacts.
File activation (extension)
Files that have the same file name extension are of the same file type. Your app can use existing, well known file types, such as .txt, or create a new file type. The file activation extension enables you to define a new file type or register to handle a file type.
Game Explorer (extension)
Your app can register with Windows as a game. To do this, you must create a Game Definition File (GDF), build it as a binary resource in your app, and declare that resource in the package manifest.
Print task settings (extension)
You can design an app that displays a custom print-related user interface and communicates directly with a print device. When you highlight the features that are specific to a particular make and model of print device, you can provide a richer, more enhanced user experience.
Protocol activation (extension)
Your app can use existing protocols for communication, such as mailto, or create a custom protocol. The protocol activation extension enables you to define a custom protocol or register to handle an existing protocol.
SSL/certificates (extension)
Digital certificates are used to authenticate one entity to another. For example, certificates are often used to authenticate a user to web services over SSL. This extension enables you to install a digital certificate with your app.
cite: http://msdn.microsoft.com/en-us/library/windows/apps/hh464906.aspx
Unfortunately, none of these has anything to do with screen savers. The technical reason you cannot, at this time, write a Windows 8 app that functions as a screensaver is that Windows 8 apps are fundamentally tied to running inside the WinRT execution environment. That shell does not extend past the Start menu in the current version of Windows. So there's no way to execute outside it, such as as a screen saver. Screen savers are still built the "old-fashioned way".
I am programming an app for an experiment by the University of Queensland Psych Department. The app needs to be impossible to exit, or at least it would be preferable if it were impossible to exit. This is not a virus; it is for an experiment with the Groote Eylandt Aborigines. Anyway, do any of you have any idea how to make the app impossible to exit or, even better, how to set it up so that you have to enter a password to exit it?
Furthermore, on a separate subject, do you have any idea how I can save the information in the app to the iPhone? This app will not go through the App Store, so it does not need to follow App Store rules. Therefore, if there were a way to save "Button (whatever button it is) pushed at (time and date)" to the notes section of the iPhone every time a button was pushed in the app, and/or to save audio recorded using the AudioToolbox framework to the actual iPod library, that would be fantastic. Otherwise I would have to make some sort of db or plist file to save everything with if-then statements, I think. Thank you!
Check out the iOS 6 accessibility feature Guided Access:
It allows a parent, teacher, or administrator to limit an iOS device
to one app by disabling the Home button, as well as restrict touch
input on certain areas of the screen
Put the device in a "kiosk" case to keep the home button from being pressed. For storing the data on the device: if it is a small amount of data, use NSUserDefaults; if it will be a large amount of data, I would lean more toward Core Data.
Easiest solution on the market: MOKIMOBILITY has developed software that allows you to lock the home button. It is Mobile Device Management software with a full range of security features. It essentially locks down your iPad so the user is only able to use what you want them to use. It is called +MDM (www.mokimobility.com). The software can be managed from a central interface. Slick software.