MS LUIS: Number of Intents / Data Imbalance - botframework

The LUIS documentation page recommends treating data imbalance (i.e. differing numbers of utterances across intents) as a first priority. We currently see a mean of 19 utterances per intent on our dashboard, so I think I should bring every intent to about 20 utterances.
Now my question: when I use active learning by adding endpoint utterances, each utterance is added to the intent it fits best (Active Learning Documentation). How can I ensure that the number of utterances per intent stays roughly equal (e.g. around 20 in our example)? It seems to me that assigning endpoint utterances to intents will naturally create a data imbalance again.
Thanks a lot!
Best,
Mark

After your initial model is satisfactory, the intents no longer need to be balanced. Active learning specifically tries to correct for cases the model has not seen before, so if your existing examples already cover all your cases, you don’t need to actively correct the balance.
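While the initial model is still being built, the balance can be checked by counting labeled utterances per intent. The following is a minimal Python sketch, not something from the LUIS tooling: the `(utterance, intent)` pair format, the function names, and the 50% tolerance around the mean are all assumptions chosen for illustration.

```python
from collections import Counter


def utterance_counts(examples):
    """Count labeled utterances per intent from (utterance, intent) pairs."""
    return Counter(intent for _, intent in examples)


def imbalanced_intents(counts, tolerance=0.5):
    """Return intents whose count differs from the mean by more than
    `tolerance` times the mean (hypothetical rule of thumb)."""
    mean = sum(counts.values()) / len(counts)
    return {
        intent: n
        for intent, n in counts.items()
        if abs(n - mean) > tolerance * mean
    }


# Hypothetical labeled examples, e.g. taken from an exported app.
examples = [
    ("book a flight", "BookFlight"),
    ("fly me to Paris", "BookFlight"),
    ("I need a plane ticket", "BookFlight"),
    ("reserve a seat to Rome", "BookFlight"),
    ("get me on the next flight", "BookFlight"),
    ("flight to Oslo please", "BookFlight"),
    ("cancel my booking", "Cancel"),
    ("please cancel the trip", "Cancel"),
    ("what's the weather", "Weather"),
]

counts = utterance_counts(examples)
flagged = imbalanced_intents(counts)
print(counts)
print(flagged)
```

Intents that are flagged as too small are candidates for more examples before relying on active learning.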

Related

Is there a maximum number of intents for a LUIS model? And/or Is there a recommended max?

Creating a LUIS model and wondering if there is a maximum number of intents you can make. I found some very old links searching the interwebs that say 20 is the max. Is 20 still the max today? If the max is higher, or there is no max, is there a best-practice recommendation?
Maximum is 500 intents and 100 entities per application
https://blog.botframework.com/2018/01/16/luis-quick-start-list-entities/
As rajesh mentioned, the maximum number of intents that a LUIS app can support is 500, but the limit of 100 entities per application applies to Simple entities. For the limits on other entity types, check the LUIS boundaries documentation.
Besides, if you need more than the maximum number of intents, you can divide your intents across multiple LUIS apps and use a different LUIS app for each of your systems. Alternatively, you can merge similar intents to reduce the number of intents.

Why does the score change after I add one utterance and delete it? Shouldn't it be the same?

When I use LUIS to create intents, I get a score of 0.53.
When I add one question, it changes to 0.82.
But when I remove the question, the score does not go back to 0.53; it goes to 0.62.
Is it normal for LUIS to act like this?
Scores are not absolute, and only have meaning relative to other scores in the same request.
LUIS training is nondeterministic, so between versions, and even between exporting and reimporting the exact same version of the app, an application and its models will not necessarily return the exact same scores.
Your system should use the highest scoring intent regardless of its value. For example, a score below 0.5 does not necessarily mean that LUIS has low confidence. Providing more training data can help increase the score of the most-likely intent.
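As a minimal Python sketch of "use the highest-scoring intent regardless of its value": the `response` dictionary below is hand-written and only assumed to resemble the `intents` list of a verbose prediction response; the intent names and the `top_intent` helper are hypothetical.

```python
def top_intent(intents):
    """Return (name, score) of the highest-scoring intent.

    No fixed threshold is applied: scores are only compared
    with each other within the same response.
    """
    best = max(intents, key=lambda item: item["score"])
    return best["intent"], best["score"]


# Hand-written example data (assumed shape, not a real API reply).
response = {
    "query": "I want a book about dragons",
    "intents": [
        {"intent": "None", "score": 0.12},
        {"intent": "OrderBook", "score": 0.41},
        {"intent": "FindAuthor", "score": 0.27},
    ],
}

name, score = top_intent(response["intents"])
print(name, score)
```

Here `OrderBook` is selected even though its score is below 0.5, because it is the highest score in this response.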

Many small models vs. one big model

I am currently implementing kind of a questionnaire with a chatbot and use LUIS to interpret the answers. This questionnaire is divided into segments of maybe 20 questions.
Now the following question came to mind: Which option is better?
1. I make one LUIS model per question. Since these questions can include implicit subquestions (i.e. "Do you want to pay all at once or in rates" could include a "How high should these rates be?") I end up with maybe 3-5 intents per question (including None).
2. I can make one model per segment. Let's assume that this is possible and fits in the 80 intents per model.
My first intuition was to think that the first alternative is better since this should be way more robust. When there are only 5 intents to choose from, then it may be easier to determine the right one. As far as I know, there is no restriction in how many models you can have (...is this correct?).
So here is my question for SO: What other benefits/drawbacks are there and is there maybe an option that is objectively the best?
You can have as many models as you want, there is no limit on this. But onto the rest of your question:
You intend to use LUIS to interpret every response? I'm curious as to the actual design of the questionnaire and why you need (or want) open ended responses and not multiple-choice questions. "Do you want to pay all at once or in rates" itself is a binary question. Branching off of this, users might respond with, "Yes I want to pay all at once", which could use LUIS. Or they could respond with, "rates" which could be one of two choices available to the user in a Prompt/FormFlow. "rates" is also much shorter than the first answer and thus a selection that would probably be typed more often than not.
Multiple-choice questions provide a standardized input which would reduce the amount of work you'd have to do in managing your data. It also would most likely reduce the amount of effort needed to maintain the models and questionnaire.
Objectively speaking, one model is more likely to be less work, but we can drill down a little further:
First option:
If your questionnaire segments include 20 questions and you have 2 segments, you have 40 individual models to maintain which is a nightmare.
Additionally, you might experience latency issues depending on your recognizer order, because you have to wait for a response from 40 endpoints. That said, it IS possible to turn off recognizers so you might only need to wait for one recognizer. But then you need to manually turn on the next recognizer and turn off the previous one. You should also be aware that handling multiple "None" intents is a nightmare if you wish to leave every recognizer active.
I'm assuming that you'll want assistance in managing your models after you realize the pain of handling forty of them by yourself. You can add collaborators, but then you need to add them to multiple models as well. One day you'll (probably) have to remove them from all of the models they have collaborator status on.
The first option IS more robust but also involves a rather extreme amount of work hours. You're partially right in that fewer intents is helpful because of fewer possibilities for the model to predict. But the predictions of your models become more accurate with more utterances and labeling, so any bonus gained by having 5 intents per model is most likely to be rendered moot.
Second option:
Using one model per segment, as mentioned above, is less work to maintain. But what are some downsides? Until you train your model well enough, there may indeed be some unintended consequences due to false-positive predictions. That said, you could account for that in your questionnaire/bot/questionnaire-bot's code: specifically look for the intents expected for the current question, and use the highest scoring intent from this subset if the highest scoring intent overall doesn't match your question.
Another downside is that if it's one model and a collaborator makes a catastrophic mistake, it affects the entire segment. With multiple models, the mistake would only affect the one question/model, so that's a plus.
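The subset lookup described for the one-model-per-segment approach could be sketched as follows. This is a minimal illustration in Python under stated assumptions, not bot framework code: the intent names, the score values, and the `resolve_intent` helper are hypothetical.

```python
def resolve_intent(intents, expected):
    """Pick the highest-scoring intent among those expected for the
    current question; return None if none of them was predicted."""
    candidates = [item for item in intents if item["intent"] in expected]
    if not candidates:
        return None
    return max(candidates, key=lambda item: item["score"])["intent"]


# Hypothetical scores from a segment-wide model for one user answer.
intents = [
    {"intent": "None", "score": 0.55},
    {"intent": "PayAllAtOnce", "score": 0.30},
    {"intent": "PayInRates", "score": 0.10},
    {"intent": "AskRateAmount", "score": 0.05},
]

# Intents that make sense for the question currently being asked.
expected_for_question = {"PayAllAtOnce", "PayInRates"}

chosen = resolve_intent(intents, expected_for_question)
print(chosen)
```

If the overall top intent is already in the expected set, this returns it unchanged; otherwise it falls back to the best match within the subset.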
Aside from not having to deal with multiple None-intent handling, you can quickly label utterances that should belong to the None intent. What you label as an intent in a singular model essentially makes it stand out more against the other intents inside of the model. If you have multiple models, an answer that triggers a specific intent in one model needs to trigger the None intent in your other models, otherwise, you'll end up with multiple high scoring intents (and the relevant/expected intents might not be the highest scoring).
End:
I recommend the second option, simply because it's less work. Also, I'm not sure of the questionnaire's goals, but as a general rule, I question the need to put in AI where it's not needed. Here is a link that talks about factors that do not contribute to a bot's success (note that Natural Language is one of these factors).

I want to train LUIS AI with sufficient utterances uploaded through the LUIS API

I am new to LUIS AI.
I would like to train LUIS for my bot, whose users want to buy books online. A user may enter "I want XYZ", where XYZ is a book, or "I want ABC", where ABC is an author.
They can write find, find out, search, searching, looking, would like to see, would like to find, or anything else they want to write.
My requirement is to begin with an Excel sheet of utterances and entities; after I upload it and click on train, the application should be trained well enough to handle at least 90% of such user input.
The problem is how I should write utterances to cover the huge variety of possible user input. I already have approximately 65 relevant and diverse utterances, but the model is still not trained well enough to handle all user input.
Please suggest how to proceed with the utterances to meet this requirement.
Scientists often take 30 minutes of conversation or 200 utterances as a good enough sample to conduct research [1]. That is an order-of-magnitude estimate that is good to know and to compare ourselves to.
Now, to get the most variability in incoming utterances, one must find a good source of similar requests. In my case, sites like Yahoo Answers are great for finding the usual structure of requests on the topic I work on. I would suggest finding a place where people make queries with a similar objective: Google AdWords helper is a general but solid start.
[1] http://www.scielo.br/scielo.php?pid=S1516-18462015000401143&script=sci_arttext&tlng=en
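For the spreadsheet-to-upload step mentioned in the question, here is a minimal Python sketch that turns rows exported as CSV into labeled-example objects. The column names (`utterance`, `intent`, `entity`, `entity_text`) are hypothetical, and the field names `text`, `intentName`, `entityLabels`, `startCharIndex`, and `endCharIndex` (assumed inclusive) are an assumption about the shape the LUIS authoring API expects for labeled examples; verify them against the current API reference before uploading. No network call is made here.

```python
import csv
import io


def rows_to_examples(csv_text):
    """Convert CSV rows into labeled-example dicts (assumed payload shape)."""
    examples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        text = row["utterance"]
        labels = []
        if row["entity_text"]:
            start = text.find(row["entity_text"])
            if start >= 0:
                labels.append({
                    "entityName": row["entity"],
                    "startCharIndex": start,
                    # Assumption: the end index is inclusive.
                    "endCharIndex": start + len(row["entity_text"]) - 1,
                })
        examples.append({
            "text": text,
            "intentName": row["intent"],
            "entityLabels": labels,
        })
    return examples


# Hypothetical sheet contents, as exported to CSV.
SHEET = """utterance,intent,entity,entity_text
I want Dune,FindBook,BookTitle,Dune
looking for books by Frank Herbert,FindBook,Author,Frank Herbert
hello there,None,,
"""

examples = rows_to_examples(SHEET)
for example in examples:
    print(example)
```

Varying the verbs (find, search, looking for, would like to see) across rows of such a sheet is one way to cover the phrasings listed in the question.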

Is there a way to influence AlchemyAPI sentiment analysis?

I was using AlchemyAPI for text analysis. I want to know if there is a way to influence the API results or fine-tune them as required.
I was trying to analyse different call center conversations available on the internet to understand the sentiment, i.e. whether the customer was unsatisfied or angry and hence the conversation is negative.
For 9 out of 10 conversations it gave the sentiment as positive, and for 1 it was negative. That conversation was about an emergency response system (911 in the US). It seems that words such as shooting, fear, panic, police, and siren could have caused this result.
But the whole conversation was actually fruitful. The caller was not angry with the service; instead, the call center person solved the caller's problem and the caller was relaxed. So logically this should not be treated as negative.
What is the way ahead to customize the AlchemyAPI behavior?
We are currently looking at the tools that would be required to allow customization of the AlchemyAPI services. Our current service is entirely pre-trained on billions of web pages, but customization is on the road map. I can't give you any timelines this early, but keep checking back!
Zach, Dev Evangelist AlchemyAPI
