Can I define synonyms for verbs in Luis bot framework? - botframework

At our company, we are building a chatbot using Microsoft LUIS and the Bot Framework. For two of the intents, I don't know how I should annotate the examples.
I have a product called ABC and I need two different intents: Access ABC and Use ABC. That is, each of these two intents has a different answer.
Now, my question is: how can I define synonyms for verbs? Should I define verbs as entities?
Our utterances for the intent Access ABC would be like:
1) How can I access ABC?
2) How can I reach ABC?
Our utterances for the intent Use ABC would be like:
3) How can I use ABC?
4) Is there any introduction to ABC?
5) I am new to ABC. Is there any usage guideline?
The problem that I see is that if I don't tag the verbs for the first intent, how can the system distinguish between 1 (or 2) and 4?

This is a big problem. In your utterances 1 and 2, for example, all you are teaching the model is that the verbs access and reach have no importance, because you are telling it that both sentences belong to the same intent regardless of the verb. This means it will create quite a lot of confusion with your sentence number 3. Training this kind of model is still very empirical. We use QBox (disclaimer: I work for them). Having a couple of utterances per verb for each use case might work, but you are going to have to try it. Be careful with your training examples, though: try to vary the words that matter less as much as you can.
1- how can I access ABC
2- tell me how to reach ABC
3- I can't access ABC
4- I'm struggling to reach ABC
Remember that you will need 10 to 15 examples minimum to reach a good level of performance. This depends on the number of intents you have in your model, of course.
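As a rough aid for keeping track of that minimum, here is a small Python sketch that counts the examples per intent and reports how many are still missing. The intent names, the data layout and the reading of "10 examples" as a per-intent figure are assumptions made for illustration; nothing here calls LUIS itself.

```python
MIN_EXAMPLES = 10  # assumed to be a per-intent lower bound

# Hypothetical training data: intent name -> example utterances.
training = {
    "AccessABC": [
        "how can I access ABC",
        "tell me how to reach ABC",
        "I can't access ABC",
        "I'm struggling to reach ABC",
    ],
    "UseABC": [
        "how can I use ABC",
        "is there any introduction to ABC",
        "I am new to ABC, is there any usage guideline",
    ],
}


def underfilled_intents(data, minimum=MIN_EXAMPLES):
    """Return {intent: number of missing examples} for intents below the minimum."""
    return {
        intent: minimum - len(utterances)
        for intent, utterances in data.items()
        if len(utterances) < minimum
    }


for intent, missing in underfilled_intents(training).items():
    print(f"{intent}: add at least {missing} more varied utterances")
```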

Related

Handling typos / misspellings on list entities

What is the best practice approach to handle typos / misspelling on LUIS List Entities?
I have intents on LUIS which use a list entity (specifically Company Department - HR, Finance, etc). It is common for users to misspell this when putting forward their utterance. LUIS expects an exact match, it doesn't do a "smart" match, and therefore doesn't pick up the misspelled entity.
a) Using Bing Spell Check is not necessarily a good solution. E.g., certain departments are acronyms such as VRPA, and Bing won't correct a typo there.
b) When I used LUIS a year ago, I would pre-process the utterance and use a Levenshtein distance algorithm to fix typos on list entities before feeding them to LUIS.
I would imagine that by now LUIS has some better out of the box way of handling this very common use case.
I'd appreciate input on what the best practice approach is to handle this.
@acambitsis and I exchanged messages via his UserVoice ticket, but I'm going to post the answer here for others.
A combination of Bing and Simple Entities might be what you're looking for, then (they're machine-learned).
I was able to accomplish something close and attached images.
In entities, I created a Simple entity with the role, VRPA. In intents, I created the Show Me intent and added sample utterances "Show me the VRPA" and "Show me the VPRA". I clicked on V**A and selected the Simple Entity:VRPA role. After training, I tried "show me the varp" and it correctly guessed "varp" was the "Simple:VRPA" entity.
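You may also find RegEx entities useful. For acronyms, you could do something like /[vrpa]{4}/i, and then any four-letter combination such as VRPA/VPRA/VARP/ARVP would match. (A bare /[vrpa]/i only matches a single one of those letters, so it would also fire on almost any word.)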
I highly recommend reading through the Entity Types and Improve App Performance to see if anything jumps out to solve your particular issues.
This may not do exactly what you're looking for. If not, I'd recommend implementing a fuzzy-matching algo of your choice.
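If you do go the fuzzy-matching route, a pre-processing step along the lines of what the question describes could look like the Python sketch below: each token is compared against the known list values with Levenshtein distance and replaced when it is a near miss. The department list, the function names and the thresholds are all assumptions for illustration and would need tuning on real utterances (short values such as HR are easy to over-correct).

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


# Hypothetical list-entity values.
DEPARTMENTS = ["HR", "Finance", "VRPA", "Hospitality"]


def correct_entities(utterance: str, vocabulary=DEPARTMENTS, max_distance: int = 2) -> str:
    """Replace tokens that are a near miss of a known value with that value."""
    corrected = []
    for token in utterance.split():
        best = min(vocabulary, key=lambda v: levenshtein(token.lower(), v.lower()))
        distance = levenshtein(token.lower(), best.lower())
        # Only fix near misses, and scale the tolerance with the value's length.
        if 0 < distance <= max_distance and distance <= len(best) // 2:
            corrected.append(best)
        else:
            corrected.append(token)
    return " ".join(corrected)


print(correct_entities("show me the VPRA budget"))
```

The corrected string is what you would then send to LUIS, so the list entity sees an exact match.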

Creating Staff Directory Lookup Bot with LUIS Integration

I'm trying to set up LUIS to connect to my Azure Web App Bot. My IT Director has asked me to test the bot on a "Simple" Staff Directory Lookup (hosted on Azure SQL VMs).
I was trying to configure LUIS to understand intents such as 'Who is in Hospitality', or 'Who is Joe Bloggs', but I'm struggling with how to do this.
Do I use entities for departments and people? Are there Pre-Built Intents for 'Greetings' and other commonly used intents?
Any help would be appreciated.
You have several questions, so I have split my answer into 2 parts.
Information detection (department, names)
[I want to] understand intents such as 'Who is in Hospitality', or 'Who is Joe Bloggs', but I'm struggling with how to do this.
Do I use entities for departments and people?
Department:
If you have a limited and known list of departments, you can create an entity of type List. It performs an exact text match against the items of this list (see doc here).
If you don't have this list, use an entity of type Simple (see doc here) and label this entity in several varied example utterances that you provide. In that case you can also improve detection by adding a Phrase list: it helps without requiring an exact match against the list. You should keep improving it over time.
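To make the "exact text match" behaviour of a List entity concrete, here is a plain Python illustration. It is not how LUIS is implemented, and the department names, synonyms and function name are invented for the example; it only shows why a misspelled department falls through a list lookup.

```python
# Hypothetical department list: canonical name -> accepted synonyms.
DEPARTMENT_LIST = {
    "Hospitality": ["hospitality", "hotel services"],
    "Human Resources": ["hr", "human resources"],
    "Finance": ["finance", "accounts"],
}


def find_department(utterance: str):
    """Return the canonical department whose synonym appears verbatim, else None."""
    # Pad with spaces so only whole words or phrases match.
    text = f" {utterance.lower().strip('?!. ')} "
    for canonical, synonyms in DEPARTMENT_LIST.items():
        for synonym in synonyms:
            if f" {synonym} " in text:
                return canonical
    return None


print(find_department("Who is in Hospitality?"))
print(find_department("Who is in Hospitalty?"))  # misspelled, so no match
```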
People:
People name detection will be a little trickier. You can have a look at the Communication.ContactName pre-built entity. If that doesn't work, create your own simple entity, but I'm not sure the results will be relevant.
"Small talk" part
Are there Pre-Built Intents for 'Greetings' and other commonly used intents?
There are no pre-built intents, but there is a Labs project called Personality Chat that is designed to manage such cases (in English only for the moment): https://labs.cognitive.microsoft.com/en-us/project-personality-chat
It is still a lab version, so you should not use it in production, but it is mostly open source, so you can give it a try and see if it fits your needs.

dynamically classify categories

I am new at the idea of programming algorithms. I can work with simplistic ideas, but my current project requires that I create something a bit more complicated.
I'm trying to create a categorization system based on keywords and subsets of 'general' categories that filter down into more detailed categories that requires as little work as possible from the user.
I.E.
Sports >> Baseball >> Pitching >> Nolan Ryan
So, if a user decides they want to talk about "Baseball" and they filter the search, I would like to also include "Sports".
User enters: "baseball"
User is then taken to Sports >> Baseball
Now I understand that this would be impossible without a living, breathing dynamic program that connects those two categories in some way. It would also require some user input initially, and many more inputs throughout the lifetime of the software in order to maintain it and keep it up to date.
But alas, asking for such an algorithm would be frivolous without detailing very concrete specifics about what I'm trying to do. And I'm not trying to ask for a handout.
Instead, I am curious if people are aware of similar systems that have already been implemented and if there is documentation out there describing how it has been done. Or even some real life examples of your own projects.
In short, I have a 'plan' but it requires more user input than I really want. I feel getting more info on the subject would be the best course of action before jumping head first into developing this program.
Thanks
IMHO it isn't as hard as you think. What you want is called tagging, and you can do it automatically just by setting the correlation between tags (i.e. a tag can hold its own meaningful information plus its relations with other tags). Then, if the user selects a tag, you relate it to the others by looking it up in your ADT collection (which can be as simple as an array).
Tag:
Sport
Related Tags
Football
Soccer
...
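A minimal Python sketch of that tag structure, assuming relations are symmetric and stored in a dictionary keyed by tag name (the names Tag, relate and related_tags are made up for the example):

```python
from dataclasses import dataclass, field


@dataclass
class Tag:
    name: str
    related: set = field(default_factory=set)  # names of related tags


tags = {}  # tag name -> Tag


def relate(a: str, b: str) -> None:
    """Record a two-way relation between tags a and b, creating them if needed."""
    tags.setdefault(a, Tag(a)).related.add(b)
    tags.setdefault(b, Tag(b)).related.add(a)


def related_tags(name: str) -> list:
    """Return the related tag names in alphabetical order (empty if unknown)."""
    tag = tags.get(name)
    return sorted(tag.related) if tag else []


relate("Sport", "Football")
relate("Sport", "Soccer")
relate("Sport", "Baseball")

print(related_tags("Baseball"))
print(related_tags("Sport"))
```

Selecting "Baseball" then lets you pull in "Sport" without the user having to choose it.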
I'm hoping this helps!
It sounds like what you want to do is create a tree/menu structure, and then be able to rapidly retrieve the "breadcrumb" for any given key in the tree.
Here's what I would think:
Create the tree with all the branches. It's okay if you want branches to share keys - as long as you can give the user a "choice" of "Multiple found, please choose which one... ?"
For every key in the tree, generate the breadcrumb. This is time-consuming, and if the tree is very large and updating regularly then it may be something better done offline, in the cloud, or via hadoop, etc.
Store the key and the breadcrumb in a key/value store such as redis, or in memory/cached as desired. You'll want every value to have an array if you want to share keys across categories/branches.
When the user selects a key - the key is looked up in the store, and if the resulting value contains only one match, then you simply construct the breadcrumb to take the user where you want them to go. If it has multiple, you give them a choice.
I would even say, if you need something more organic, say a user can create "new topic" dynamically from anywhere else, then you might want to not use a tree at all after the initial import - instead just update your key/value store in real-time.
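The tree-and-breadcrumb steps above can be sketched in Python as follows. The nested-dict tree, the in-memory dictionary standing in for the key/value store, and the function names are assumptions for illustration; a real deployment might keep the index in Redis as suggested.

```python
from collections import defaultdict

# Hypothetical category tree as nested dicts (child name -> subtree).
TREE = {
    "Sports": {
        "Baseball": {"Pitching": {"Nolan Ryan": {}}},
        "Softball": {"Pitching": {}},
    }
}


def build_index(tree):
    """Map every key (lower-cased) to the list of breadcrumbs that reach it."""
    index = defaultdict(list)

    def walk(node, path):
        for key, children in node.items():
            crumb = path + (key,)
            index[key.lower()].append(crumb)
            walk(children, crumb)

    walk(tree, ())
    return index


INDEX = build_index(TREE)


def lookup(key: str) -> list:
    """Return all breadcrumbs for a key; more than one means the user must choose."""
    return [" >> ".join(crumb) for crumb in INDEX.get(key.lower(), [])]


print(lookup("baseball"))
print(lookup("pitching"))  # shared key: offer the user a choice
```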

Intern Problem Statement for a bank

I saw an internship opportunity at a bank in Dubai. They have a defined problem statement to be solved in 2 months. They gave us just 2 lines:
"Basically the problem is about name matching logic.
There are two fields (variables) – both are employer names, and it’s a free text field. So we need to write a program to match these two variables."
Can anyone help me understand it? Is it just simple pattern matching?
Any help/comments would be appreciated.
I think this is what they are asking for:
They have two sources of related data, for example, one from an internal database, and the other from name card input.
Because the two fields are free text fields, there will be inconsistency. For example, Nitin Garg, or Garg, Nitin, or Mr. Nitin Garg, etc. Here is an extreme case of Gadaffi.
What you are supposed to do is to find a way to match all the names for a specific person together.
In short, match two pieces of data together by employer names, taking possible inconsistency into account.
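A first step for this kind of inconsistency is to normalize both fields before comparing them. The Python sketch below lower-cases the text, drops punctuation and a few titles, and ignores word order. The title list and function names are assumptions; for employer names you would probably extend the ignored words with suffixes such as "ltd" or "llc", and this does nothing for misspellings.

```python
import re

# Hypothetical set of tokens to ignore; extend it for your own data.
TITLES = {"mr", "mrs", "ms", "dr"}


def normalize(name: str) -> frozenset:
    """Lower-case, split on non-alphanumerics, drop titles, ignore word order."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    return frozenset(t for t in tokens if t not in TITLES)


def same_name(a: str, b: str) -> bool:
    """True when both names reduce to the same set of tokens."""
    return normalize(a) == normalize(b)


print(same_name("Nitin Garg", "Garg, Nitin"))
print(same_name("Mr. Nitin Garg", "nitin garg"))
```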
Once upon a time there was a nice simple answer to the problem of matching up names despite mis-spellings and different transliterations - Soundex. But people have put a lot of work into this problem, so now you should probably use the results of that work, which is built into databases and add-ons - some free. See Fuzzy matching using T-SQL and http://anastasiosyal.com/archive/2009/01/11/18.aspx and http://msdn.microsoft.com/en-us/magazine/cc163731.aspx
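For reference, here is a small Python sketch of the classic American Soundex code mentioned above, written from the standard rules (keep the first letter, encode consonants as digits, collapse adjacent duplicates, treat H and W as transparent, pad to four characters). It is only meant to show the idea; the database features linked above are the better choice in practice.

```python
def soundex(name: str) -> str:
    """Return the 4-character American Soundex code for a name ('' if no letters)."""
    codes = {}
    for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                           ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in letters:
            codes[ch] = digit

    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""

    result = letters[0]
    prev = codes.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate two letters with the same code
        code = codes.get(ch, "")  # vowels have no code and reset prev
        if code and code != prev:
            result += code
        prev = code
    return (result + "000")[:4]


print(soundex("Robert"), soundex("Rupert"))
```

Two names are treated as a phonetic match when their codes are equal, which is why Robert and Rupert collide.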

How should I validate and manage a login namespace?

This is silly, but I haven't found this information. If you have names of concepts and suitable references, just let me know.
I'd like to understand how I should validate a given named id for a generic entity, such as an email login, just like Yahoo, Google and Microsoft do.
I mean, if you already have a user named foo, trying to create foo2 will be denied, as it is likely to be someone trying to mislead users with a fake id.
Coming to mind:
Levenshtein Distance
Hamming Distance
You're going to have to take a two-pass approach.
The first is a regular expression to validate that the entity name meets your specifications as much as possible. For example, disallowing certain characters.
The second is to perform some type of fuzzy search during the name creation. This could be as simple as a LIKE '%value%' WHERE clause or as complicated as using some type of full-text search and limiting hits to a certain relevance rating.
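The two passes can be sketched in Python like this. The naming policy in the regular expression, the 0.8 similarity threshold and the function names are assumptions chosen for illustration, and the similarity check uses the standard library's difflib rather than a database query.

```python
import re
from difflib import SequenceMatcher

# Hypothetical policy: starts with a letter, 3 to 30 characters from a-z, 0-9, '.', '_'.
NAME_PATTERN = re.compile(r"[a-z][a-z0-9._]{2,29}")


def is_well_formed(name: str) -> bool:
    """Pass 1: the name must match the naming policy in full."""
    return NAME_PATTERN.fullmatch(name) is not None


def similar_existing(name: str, existing, threshold: float = 0.8) -> list:
    """Pass 2: return existing names whose similarity ratio reaches the threshold."""
    return [
        other for other in existing
        if SequenceMatcher(None, name.lower(), other.lower()).ratio() >= threshold
    ]


def can_register(name: str, existing) -> bool:
    """Accept a name only if it is well formed and not too close to an existing one."""
    return is_well_formed(name) and not similar_existing(name, existing)


users = ["foo"]
print(can_register("foo2", users))   # too close to "foo"
print(can_register("alice", users))
```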
That said, I would guess the failure rate of the match (both false positives and false negatives) would be high enough to justify not doing this.
Good luck.

Resources