How to convert speech to text, without using the IBM Watson API?
That means I need another API for the conversion.
You can try:
Google Cloud Speech: https://cloud.google.com/speech-to-text/. It provides a dictation mode, and you can also select the context of the speech (e.g. medical, school, etc.).
Bing Speech: https://azure.microsoft.com/en-us/services/cognitive-services/speech/. It also provides a dictation mode and lets you select the context of the speech.
Microsoft Cognitive Speech: https://azure.microsoft.com/en-us/services/cognitive-services/custom-speech-service/. You can build your own language model and acoustic model by sending training data to Azure.
CMUSphinx: https://cmusphinx.github.io/. It's open source; you can build your own language model, acoustic model, dictionary, etc., but you have to handle everything yourself. (Highly recommended)
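For example, with Google Cloud Speech, a minimal transcription call could look like the sketch below (assuming the google-cloud-speech Python package is installed and GOOGLE_APPLICATION_CREDENTIALS points at a service-account key; the file name is illustrative):

    # Minimal sketch: transcribe a local 16 kHz WAV file with Google Cloud Speech-to-Text.
    from google.cloud import speech

    def transcribe(path):
        client = speech.SpeechClient()

        with open(path, "rb") as f:
            audio = speech.RecognitionAudio(content=f.read())

        config = speech.RecognitionConfig(
            encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
            sample_rate_hertz=16000,
            language_code="en-US",
        )

        response = client.recognize(config=config, audio=audio)
        for result in response.results:
            print(result.alternatives[0].transcript)

    transcribe("utterance.wav")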
I need to add a speech-to-text feature in Rasa, where the user can ask questions using his voice and the bot will answer him by chat. Does anyone know how I can do this in Rasa?
My front end will be an Android application. Kindly tell me how to do it.
Thanks in advance.
You can build a voice bot with Rasa Open Source as long as you use a Speech to Text (STT) API, since Rasa will only process text. This would involve building a custom channel that takes the voice as input, sends it to an STT API, and returns the text to Rasa.
You can find some detailed examples on the Rasa blog:
https://blog.rasa.com/how-to-build-a-voice-assistant-with-open-source-rasa-and-mozilla-tools/
https://blog.rasa.com/how-to-build-a-mobile-voice-assistant-with-open-source-rasa-and-aimybox/
If you don't mind using something closed source, integrating the Google Speech API is also an option.
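As a rough sketch of the custom-channel idea, assuming a small Flask front end and Rasa's built-in REST channel (transcribe() is a placeholder for whichever STT API you choose; a full custom channel would instead subclass Rasa's InputChannel, as the blog posts above show):

    # Voice front end: accept audio from the Android app, run STT,
    # forward the text to Rasa's REST channel, and return the bot's replies.
    import requests
    from flask import Flask, request, jsonify

    app = Flask(__name__)
    RASA_REST_WEBHOOK = "http://localhost:5005/webhooks/rest/webhook"  # default Rasa REST channel

    def transcribe(audio_bytes):
        """Placeholder: call your chosen STT API here and return the recognized text."""
        raise NotImplementedError

    @app.route("/voice", methods=["POST"])
    def voice():
        text = transcribe(request.data)  # raw audio bytes from the app
        # Rasa's REST channel expects {"sender": ..., "message": ...}
        bot_replies = requests.post(
            RASA_REST_WEBHOOK,
            json={"sender": request.args.get("user", "anonymous"), "message": text},
        ).json()
        # Return the bot's text answers to the app, which displays them as chat.
        return jsonify(bot_replies)

    if __name__ == "__main__":
        app.run(port=8080)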
I have developed an FAQ bot using C# and Bot Builder SDK 3.15.3. We have a large set of question/answer pairs which are uploaded to a QnA Maker service. I have enabled the Direct Line channel, and the bot is displayed on a web page. I have used the Web Chat control provided by Microsoft, with some customization and skinning.
Now I want to enable voice interaction with the bot. For that, I decided to use the Microsoft Speech to Text Cognitive Service.
What I want is that whenever the user speaks an utterance, it is sent to my bot service just like text is. Then inside the C# code I want to run Speech to Text, do a spell check on the retrieved text, and finally send it to the QnA Maker service. For now the response will only be shown as text, but I could also opt to read the response out to the user.
Kindly guide me on how this is achievable. After looking at CognitiveService.js and other articles on enabling speech, I noticed that the Web Chat control sends the voice input directly to the Speech to Text service.
You can make a hybrid between a calling bot which utilizes speech to text and a QnA bot to achieve your goal. For the calling bot, look over the SimpleIVRbot sample to get you going. For QnAMaker, you can reference the SimpleQnABot. It shouldn't take too much work bridging the two into a single unified bot. Just be sure to remove duplicate code and combine files where necessary.
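As a rough sketch of that pipeline (shown in Python for brevity; the C# SDK flow is analogous, and the region, hostname, knowledge-base id and keys below are placeholders), the bridge boils down to two REST calls, with room for your spell-check step between them:

    # 1) Send the audio to the Speech service's short-audio REST endpoint.
    # 2) Send the recognized text to QnA Maker's generateAnswer endpoint.
    import requests

    SPEECH_KEY = "<speech-key>"
    SPEECH_REGION = "westus"
    QNA_HOST = "https://<your-service>.azurewebsites.net"
    QNA_KB_ID = "<kb-id>"
    QNA_ENDPOINT_KEY = "<endpoint-key>"

    def speech_to_text(wav_bytes):
        url = (f"https://{SPEECH_REGION}.stt.speech.microsoft.com"
               "/speech/recognition/conversation/cognitiveservices/v1?language=en-US")
        headers = {
            "Ocp-Apim-Subscription-Key": SPEECH_KEY,
            "Content-Type": "audio/wav; codecs=audio/pcm; samplerate=16000",
        }
        return requests.post(url, headers=headers, data=wav_bytes).json()["DisplayText"]

    def ask_qna(question):
        # A spell-check pass over `question` would slot in before this call.
        url = f"{QNA_HOST}/qnamaker/knowledgebases/{QNA_KB_ID}/generateAnswer"
        headers = {"Authorization": f"EndpointKey {QNA_ENDPOINT_KEY}"}
        answers = requests.post(url, headers=headers, json={"question": question}).json()
        return answers["answers"][0]["answer"]

    with open("question.wav", "rb") as f:
        print(ask_qna(speech_to_text(f.read())))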
Hope this helps!
We are trying to create a multi-language chat bot using the Azure Bot Framework and LUIS.
While designing the architecture, we are struggling to understand the following points:
I am not able to see 'en-GB' in the list of supported languages mentioned in the following documentation:
https://learn.microsoft.com/en-us/azure/cognitive-services/luis/luis-supported-languages
Does that mean LUIS does not support 'en-GB'?
If so, will LUIS really struggle to understand a query written in 'en-GB', given that this app is just a chat bot and not a voice bot?
Do we need to do anything special so that LUIS can understand a query written in any supported language, say 'de-DE', and map it to utterances modeled in English?
I am not able to see 'en-GB' in the list of supported languages mentioned in the following documentation:
https://learn.microsoft.com/en-us/azure/cognitive-services/luis/luis-supported-languages
Does that mean LUIS does not support 'en-GB'?
Yes. But in fact the supported language is English in general, not American English or British English specifically (see below).
If so, will LUIS really struggle to understand a query written in 'en-GB', given that this app is just a chat bot and not a voice bot?
You can use the en-US language. This has nothing to do with chat vs. voice capability; LUIS only processes text. For voice, you first need other tools, such as STT (speech-to-text) services.
Do we need to do anything special so that LUIS can understand a query written in any supported language, say 'de-DE', and map it to utterances modeled in English?
Yes, you have to translate your items.
When you create a project (called an app) in LUIS, the first setting that you must provide is the Culture.
If you want to use several languages in a chatbot project, for example, you have at least two possibilities:
Create one LUIS app for each language, and call the right one. You can select the right one in several ways (using the locale if selected by the user, or using language-detection APIs, for example); see the sketch after this list.
Create one global LUIS app in one language (choosing English may be the right option, as LUIS's main features are available in English first) and translate the input before calling LUIS.
I would recommend the first solution, because translation is never perfect and may lose context, which can be important for LUIS.
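A minimal sketch of the first option, assuming one LUIS app per language (the app ids, key and region are placeholders; a language-detection API could replace the locale lookup):

    # Route each utterance to the LUIS app matching the user's locale,
    # using the LUIS v2 REST endpoint.
    import requests

    LUIS_REGION = "westus"
    LUIS_KEY = "<luis-key>"
    LUIS_APPS = {
        "en": "<english-app-id>",
        "de": "<german-app-id>",
    }

    def query_luis(utterance, locale):
        app_id = LUIS_APPS.get(locale.split("-")[0], LUIS_APPS["en"])  # 'en-GB' -> 'en'
        url = f"https://{LUIS_REGION}.api.cognitive.microsoft.com/luis/v2.0/apps/{app_id}"
        result = requests.get(url, params={"subscription-key": LUIS_KEY, "q": utterance}).json()
        return result["topScoringIntent"]["intent"]

    print(query_luis("Book me a flight to Berlin", "en-GB"))
    print(query_luis("Buche mir einen Flug nach Berlin", "de-DE"))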
Is the speech support in the Bot Framework also available for the German language?
Kind regards
If you read through this blog, in the section "Cross platform speech support in your app using the DirectLine channel" there is this code snippet:
_botClient = new Microsoft.Bot.Client.BotClient(
    BotConnection.DirectLineSecret,
    BotConnection.ApplicationName
)
{
    // We used the Cognitive Services Speech-To-Text API, with speech priming support, as the speech recognizer, as well as the Text-To-Speech API.
    // Alternate/custom speech recognizer & synthesizer implementations are supported as well.
    SpeechRecognizer = new CognitiveServicesSpeechRecognizer(BotConnection.BingSpeechKey),
    SpeechSynthesizer = new CognitiveServicesSpeechSynthesizer(
        BotConnection.BingSpeechKey,
        Microsoft.Bot.Client.SpeechSynthesis.CognitiveServices.VoiceNames.Jessa_EnUs)
};
In theory you could replace this line:
SpeechSynthesizer = new CognitiveServicesSpeechSynthesizer(BotConnection.BingSpeechKey, Microsoft.Bot.Client.SpeechSynthesis.CognitiveServices.VoiceNames.Jessa_EnUs)
with any synthesizer you want. So assuming there is a German synthesizer out there (the Cognitive Services Text-to-Speech service does offer German voices), the answer to your question is yes.
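As a quick check that German synthesis exists on the Cognitive Services side, here is a rough sketch (Python, with placeholder key and region; the de-DE-KatjaNeural voice name is an assumption based on the current neural voice catalog) that calls the Speech service's TTS REST endpoint directly:

    # Synthesize a German sentence to a WAV file via the Speech TTS REST API.
    import requests

    SPEECH_KEY = "<speech-key>"
    SPEECH_REGION = "westeurope"

    ssml = """
    <speak version='1.0' xml:lang='de-DE'>
      <voice name='de-DE-KatjaNeural'>
        Guten Tag! Wie kann ich Ihnen helfen?
      </voice>
    </speak>
    """

    response = requests.post(
        f"https://{SPEECH_REGION}.tts.speech.microsoft.com/cognitiveservices/v1",
        headers={
            "Ocp-Apim-Subscription-Key": SPEECH_KEY,
            "Content-Type": "application/ssml+xml",
            "X-Microsoft-OutputFormat": "riff-16khz-16bit-mono-pcm",
        },
        data=ssml.encode("utf-8"),
    )
    with open("guten_tag.wav", "wb") as f:
        f.write(response.content)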
Is it possible to get voice input as a mix of a predefined grammar (programmatic list constraint/SRGS grammar) and free speech (default dictation grammar) in a UWP application? For example, if I say "Search something", "Search" is a predefined constraint and "something" is free-form text.
I don't think so. But if the goal is to match intents/actions inside your application (with associated topics), you have another solution: try LUIS (Language Understanding Intelligent Service); this service is part of Microsoft Cognitive Services.
Just perform free-speech recognition and send the text to this service (once you have trained it). You can check the following video for additional details: https://www.luis.ai/Help. Note: it is free for up to 100,000 transactions per month.
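A minimal sketch of that approach (Python against the LUIS v2 REST endpoint; the app id, key, "Search" intent and "query" entity are all assumptions you would have to create and train first):

    # Recognize the whole utterance as free speech, then let LUIS split it
    # into an intent ("Search") and an entity (the search term).
    import requests

    LUIS_APP_ID = "<app-id>"
    LUIS_KEY = "<luis-key>"

    def parse(utterance):
        url = f"https://westus.api.cognitive.microsoft.com/luis/v2.0/apps/{LUIS_APP_ID}"
        result = requests.get(url, params={"subscription-key": LUIS_KEY, "q": utterance}).json()
        intent = result["topScoringIntent"]["intent"]
        entities = [(e["type"], e["entity"]) for e in result["entities"]]
        return intent, entities

    # e.g. ("Search", [("query", "flights to paris")]) with a suitably trained app
    print(parse("Search flights to Paris"))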