I am trying to implement voice recognition using PocketSphinx. Currently, I am able to recognize multiple phrases from the grammar file using:
recognizer.addKeywordSearch(KWS_SEARCH,menuGrammar);
The issue is that only the exact phrases from the grammar file get recognized. I want to be able to recognize (and print) more natural English statements. If I can retrieve the statements spoken by the user, I will then search them for certain keywords (and then do the required action).
Can you please guide me on how to recognize multiple statements spoken by the user, without forcing the user to say the specific phrases exactly as written?
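For what it's worth, the keyword-search step described above could look roughly like this once a free-form sentence is available as text. This is a minimal Python sketch, not PocketSphinx code; the ACTIONS table and the find_actions function are made-up names for illustration.

```python
import re

# Hypothetical keyword-to-action table; the names are illustrative only.
ACTIONS = {
    "open": "OPEN_MENU",
    "menu": "OPEN_MENU",
    "call": "START_CALL",
    "stop": "STOP",
}

def find_actions(hypothesis):
    """Return the actions triggered by keywords in a recognized sentence."""
    # Lowercase and split into words so matching ignores case and punctuation.
    words = re.findall(r"[a-z']+", hypothesis.lower())
    actions = []
    for word in words:
        action = ACTIONS.get(word)
        # Keep the first occurrence of each action, in spoken order.
        if action is not None and action not in actions:
            actions.append(action)
    return actions
```

Calling find_actions on the recognized text returns the actions in the order their keywords were spoken, so the user does not have to say any phrase word for word.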
When a batch is created, documents should be separated automatically, without using separator sheets or barcode separators.
How can I classify documents as invoices and supporting documents?
In our project we get many invoices with supporting documents, so the person scanning has to insert the separator sheets manually. To avoid this, we want to classify the supporting documents automatically.
In general the concept would be that you would enable separation in the project and then train your classes with examples to be used for the layout or content classifiers.
However, as I'm sure you've seen, the obstacle with invoices is that they differ enough between vendors that they would not all reliably classify to an Invoice class. The same goes for "Supporting Documents", which are likely to be very different from each other. So unfortunately there isn't a completely easy answer without separator sheets (or barcode stickers affixed to supporting docs).
What you might want to do is write code in one of the separation events, such as Document_AfterSeparate. Despite the name, the document has not yet been split at this point, but the classifiers have run. See Scripting Help topic "Server Script Events Sequence > Document Separation > Standard Document Separation" for more detail. Setting the SplitPage property on the CDocPage (pXDoc.CDoc.Pages.ItemByIndex(lPage).SplitPage) will allow you to use your own logic to determine which pages to separate.
For example if you know that you will always have single page invoices, you can split on the first page and classify accordingly. Or you can try to search for something that indicates the end of the invoice like "Total" or other characteristics. There is an example of how you can use locators to help separation in the Scripting Help topic "Script Samples > Use Locator Results for Standard Document Separation". The example uses a Barcode Locator, but the same concept works if you wanted to try it with a Format Locator or anything else.
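The "search for something that indicates the end of the invoice" idea can be sketched independently of the product. This is plain Python, not Kofax script: split_flags, page_texts, and end_marker are hypothetical names, and how each flag maps onto SplitPage is left to the Scripting Help.

```python
def split_flags(page_texts, end_marker="total"):
    """Return one boolean per page: True if a new document starts on that page.

    Assumption: a page containing end_marker is the last page of its
    document, so the page after it starts a new one.
    """
    flags = []
    previous_ended = True  # the first page always starts a document
    for text in page_texts:
        flags.append(previous_ended)
        previous_ended = end_marker in text.lower()
    return flags
```

With OCR text for three pages where only the first contains "Total", the first two pages each start a document and the third is attached to the second. A real project would likely need a stricter test than a bare substring match.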
Without separator sheets you will need smart classification software like Kofax Transformation Module (KTM). It's kind of expensive, so you will need to verify the cost savings and ROI.
I've got two questions about using the Windows Speech API.
First: I've got my Speech Recognizer set up to detect sentences of a specific structure, namely a verb followed by a noun, with some wildcard stuff in between. However, I would also like it to be able to recognize a "Help" and an "Exit" command that would not fit this structure. How can I get the grammar to recognize another, fundamentally different, structure?
Second: I am using SemanticResultValue to analyze the content of my sentences. I want there to be multiple words that users can say for the same verb--for example, "Go," "Walk," and "Run" would all translate to the same action in the system. How do I assign multiple values to the same SemanticResultValue?
1) Multiple grammars would be the obvious solution here; one grammar for your verb/noun, and a separate grammar for pure verbs.
2) The SemanticResultValue constructor that takes a GrammarBuilder parameter (SemanticResultValue (GrammarBuilder, Object)) would be appropriate here.
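To illustrate the two ideas (separate structures, and several spoken words mapping to one semantic value), here is a small sketch in plain Python. It is not the Speech API; the tables and the interpret function are invented for illustration and only mimic what the two grammars would hand back.

```python
# Several spoken verbs map to one semantic value (hypothetical values).
VERB_SYNONYMS = {
    "go": "MOVE", "walk": "MOVE", "run": "MOVE",
    "take": "TAKE", "grab": "TAKE",
}

# A second, separate "grammar" for one-word commands.
STANDALONE = {"help": "HELP", "exit": "EXIT"}

def interpret(utterance):
    """Return (action, noun) for a recognized utterance, or None."""
    words = utterance.lower().split()
    # Structure 2: a bare command such as "Help" or "Exit".
    if len(words) == 1 and words[0] in STANDALONE:
        return (STANDALONE[words[0]], None)
    # Structure 1: verb first, noun last, anything in between is ignored.
    if len(words) >= 2 and words[0] in VERB_SYNONYMS:
        return (VERB_SYNONYMS[words[0]], words[-1])
    return None
```

"Go", "Walk", and "Run" all come back as the same MOVE action, which is the effect the GrammarBuilder-based SemanticResultValue is meant to give you inside the recognizer.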
Let's say I have a big corpus (for example in English or an arbitrary language), and I want to perform some semantic search on it.
For example I have the query:
"Be careful: [art] armada of [sg] is coming to [do sg]!"
And the corpus contains the following sentence:
"Be careful: an armada of alien ships is coming to destroy our planet!"
It can be seen that my query string could contain "semantic placeholders", such as:
[art] - some placeholder for articles (for example a / an in English)
[sg], [do sg] - some placeholders for NPs and VPs (subjects and predicates)
I would like to develop a library which would be capable of handling these queries efficiently.
I suspect that some kind of POS-tagging would be necessary for parsing the text, but because I don't want to fully reimplement an already existing full-text search engine to make it work, I'm wondering how I could integrate this behaviour into a search engine like Lucene.
I know there are SpanQueries which could behave similarly in some cases, but as far as I can see, Lucene doesn't do any semantic stuff with stored texts.
Is it possible to implement a behavior like this? Or do I have to write my own search engine?
With Lucene, you could add additional tokens to a single item in a TokenStream, but I wouldn't know how to deal with tags that span more than one word.
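The "additional tokens on a single item" idea can be shown without Lucene. In this rough Python sketch each position holds a set of terms (the word plus any tags), so a placeholder in the query matches like an ordinary term. ARTICLES is a toy stand-in for a real POS tagger, annotate and matches are hypothetical names, and multi-word tags are not handled, as noted above.

```python
# Toy lexicon standing in for a real POS tagger (assumption).
ARTICLES = {"a", "an", "the"}

def annotate(tokens):
    """Give each position a set of terms: the word itself plus any tags."""
    stream = []
    for token in tokens:
        terms = {token.lower()}
        if token.lower() in ARTICLES:
            terms.add("[art]")  # extra term stacked on the same position
        stream.append(terms)
    return stream

def matches(query_tokens, stream):
    """True if the query matches a contiguous run of positions."""
    query = [t.lower() for t in query_tokens]
    for start in range(len(stream) - len(query) + 1):
        if all(query[i] in stream[start + i] for i in range(len(query))):
            return True
    return False
```

Here the query "[art] armada of" matches a stream built from "an armada of alien ships", because the position holding "an" also carries the "[art]" term.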
I want to integrate an English dictionary function into my Windows Phone 7 application. Is there such a file or database available somewhere that contains all valid English words?
By the way, I only need something containing the list of words in order to validate inputs made by users. Other things such as definitions, phonetics, thesauruses, etc... are not needed.
Try this link (text file). It obviously doesn't have all valid words, but it should be reasonably effective.
The spell check API that's integrated isn't exposed to developers, so you'll have to make do with a text file like this.
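Checking input against such a text file is mostly a set lookup. A minimal Python sketch, assuming one word per line; load_words and is_valid_word are illustrative names, and the in-memory sample stands in for the real file.

```python
import io

def load_words(lines):
    """Build a lookup set from an iterable of lines, one word per line."""
    return {line.strip().lower() for line in lines if line.strip()}

def is_valid_word(word, words):
    """Case-insensitive membership test against the word set."""
    return word.strip().lower() in words

# Stand-in for the downloaded word list (assumption: one word per line).
sample_file = io.StringIO("apple\nbanana\ncherry\n")
WORDS = load_words(sample_file)
```

Loading the list once into a set keeps each lookup fast, which matters on a phone if the list has a few hundred thousand entries.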
Also, if you only need to check the validity of a word that the user enters in a text input of some kind, then setting InputScope="Text" in the XAML declaration may be all you need.
Hope this helps.
To make the matter more specific:
How to detect people's names (seems like a simple case of named entity extraction?)
How to detect addresses: my best guess is to find postcodes (regexes), country and town names, and take some text around them.
As for phones and emails, they could probably be caught by various regexes + preprocessing
Don't care about education/working experience at this point
Reasoning:
In order to build a full-text index on resumes, all sensitive information should be stripped out of them.
P.S. any 3rd party APIs/services won't do as a solution.
The problem you're interested in is information extraction from semi-structured sources: http://en.wikipedia.org/wiki/Information_extraction
I think you should download a couple of research papers in this area to get a sense of what can be done and what can't.
I feel it can't be done by a machine.
Every other resume will have a different format and layout.
The best you can do is to design an internal format and manually copy every resume content in there. Or ask candidates to fill out your form (not many will bother).
I think that the problem should be broken up into two search domains:
Finding information relating to proper names
Finding information that is formulaic
Firstly, the information relating to proper names could probably best be found by searching for items that are grammatically significant. For instance, English capitalizes only the first word of a sentence and proper nouns. For the grammatical rules, you could look for all words whose first letter is capitalized and check each one against a database that contains the word and its type [i.e. Bob - Name, Elon - Place, England - Place].
Secondly: information that is formulaic. This is more about the email addresses, phone numbers, and physical addresses. All of these have specific formats that don't change. Use a regex, and use an algorithm to assess the quality of the matches.
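A rough Python sketch of that capitalized-word lookup. GAZETTEER is a tiny made-up stand-in for the database, and tag_proper_nouns is an illustrative name.

```python
import re

# Hypothetical stand-in for the word/type database.
GAZETTEER = {"Bob": "Name", "Elon": "Place", "England": "Place"}

def tag_proper_nouns(text):
    """Return (word, type) for every capitalized word in the text."""
    results = []
    # A capital letter followed by at least one lowercase letter.
    for word in re.findall(r"\b[A-Z][a-z]+\b", text):
        results.append((word, GAZETTEER.get(word, "Unknown")))
    return results
```

Sentence-initial words such as "My" come back as "Unknown", which shows why the database check is needed on top of the capitalization rule.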
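As a sketch of the regex-plus-quality-check idea, in Python: the patterns below are deliberately loose assumptions, not complete email or phone grammars, and the digit-count test is one simple way to judge match quality. The names EMAIL_RE, PHONE_RE, and redact are illustrative.

```python
import re

# Loose illustrative patterns (assumptions, not full specifications).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def _redact_phone(match):
    """Quality check: only treat the match as a phone if it has 9-15 digits."""
    digits = sum(ch.isdigit() for ch in match.group())
    return "[PHONE]" if 9 <= digits <= 15 else match.group()

def redact(text):
    """Replace email addresses and likely phone numbers with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub(_redact_phone, text)
    return text
```

The digit-count check keeps date ranges such as "2010-2014" intact even though the loose phone pattern matches them.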
Watch out:
The grammatical rules change based on language. German capitalizes EVERY noun. It might be best to detect the language of the document prior to applying your rules. Also, another issue with this [and my resume sometimes] is how it is designed. If the resume was designed with something other than a text editor [designer tools] the text may not line up, or be in a bitmap format.
TL;DR Version: NLP techniques can help you a lot.