Training LUIS to predict entities without Phrase List


I am trying to train LUIS to recognize an entity from a few utterances. I initially tried training with a few utterances that contain different entity values. The entity values are made up of two or more words, for example 'customer engagement', 'empower your teams', etc.
I am not able to get LUIS to identify the entity correctly because the number of words varies.
I cannot use a Phrase List because the values are dynamic.
How can I train LUIS to recognize the multiple words in the utterance and identify the entity effectively?

One way to approach this is to use a list entity inside of a composite entity, although this still requires you to provide some training data in the form of canonical values and synonyms. Other than this, you'll currently have to provide larger amounts of training data or phrase list data, as LUIS doesn't look at the definition of a word.

Related

How to use ml luis entity and list entity together in bot framework

When I try to use both a list entity and an ml entity together in Bot Composer, I get the following error:
"Check STATUS with {#idtype} {#id=132354}" has mix of entites with labelled values and ones without. Please update utterance to either include labelled values for all entities or remove labelled values from all entities."
The utterance is: Check STATUS with {#idtype} {#id=132354}. Here, idtype is a list entity and id is an ml entity.

What that means is that, for a specific utterance you have to either add labels (the "=132354" part) to all entities in the utterance or remove them from all entities.
For your specific sample, since you should have the "132354" value included in the list definition, you could remove it from the utterance.
However, ml entities require (at least some) utterances with labeled entities, after all that's the only way the machine can learn what that entity looks like 😉. So, wherever you label an ml entity, you should also label the list entity or any other entity in the utterance.
You can read more about this topic in the Best practices for building a language understanding (LUIS) app documentation page.
BTW an utterance with unlabeled entities is considered a pattern, you can read more about this in the Patterns improve prediction accuracy page.
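The rule above can be checked locally before uploading utterances. The following is a minimal sketch, assuming the {#name} / {#name=value} placeholder syntax shown in the error message; the function name label_status and the label value "order" are hypothetical and are not part of LUIS or Bot Composer.

```python
import re

# Matches placeholders such as {#idtype} or {#id=132354},
# i.e. the syntax that appears in the error message above.
PLACEHOLDER = re.compile(r"\{#(?P<name>[^=}]+)(?:=(?P<value>[^}]*))?\}")


def label_status(utterance):
    """Classify an utterance as 'labelled', 'unlabelled', 'mixed' or 'none'.

    'mixed' is the case the error message complains about: some entities
    carry a labelled value and others do not.
    """
    labelled = 0
    unlabelled = 0
    for match in PLACEHOLDER.finditer(utterance):
        if match.group("value") is None:
            unlabelled += 1
        else:
            labelled += 1
    if labelled and unlabelled:
        return "mixed"
    if labelled:
        return "labelled"
    if unlabelled:
        return "unlabelled"
    return "none"


# The utterance from the question mixes both styles.
status_mixed = label_status("Check STATUS with {#idtype} {#id=132354}")
# Removing all labels leaves only unlabelled placeholders (treated as a pattern).
status_pattern = label_status("Check STATUS with {#idtype} {#id}")
# Labelling every entity ("order" is a made-up value for illustration).
status_labelled = label_status("Check STATUS with {#idtype=order} {#id=132354}")
```

Running a check like this over all utterances would flag the mixed ones before the tool rejects them.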

Difference between Phrase lists and list entities?

I have started working with Microsoft bot framework and LUIS.
I have a problem understanding the difference between phrase lists and list entities.
Can you help me? Examples would be great.
Thanks

What is a list entity?
Definition in documentation here, I highlighted the main points
List entities represent a fixed set of related words in your system.
Each list entity may have one or more forms. They aren't machine
learned, and are best used for a known set of variations on ways to
represent the same concept. List entities are not labeled in
utterances or trained by the system.
A list entity is an explicitly specified list of values. Unlike other
entity types, LUIS does not discover additional values for list
entities during training. Therefore, each list entity forms a closed
set.
If there is more than one list entity with the same value, each entity
is returned in the endpoint query.
What is a phrase list feature?
Definition in documentation here:
A phrase list includes a group of values (words or phrases) that
belong to the same class and must be treated similarly (for example,
names of cities or products). What LUIS learns about one of them is
automatically applied to the others as well. This is not a white list
of matched words.
When to use phrase lists instead of list entities
I think the best answer is (still in the documentation, here):
When you use a phrase list, LUIS can still take context into account and generalize to identify items that are similar to, but not an exact match as items in a list. If you need your LUIS app to be able to generalize and identify new items in a category, it's better to use a phrase list.
In contrast, a list entity explicitly defines every value an entity can take, and only identifies values that match exactly. A list entity may be appropriate for an app in which all instances of an entity are known and don't change often, like the food items on a restaurant menu that changes infrequently. In a system in which you want to be able to recognize new instances of an entity, like a meeting scheduler that should recognize the names of new contacts, or an inventory app that should recognize new products, it's better to use another type of entity and then use phrase list features to help guide LUIS to recognize examples of the entity.
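To illustrate the "closed set" behaviour of a list entity, here is a minimal local sketch of exact matching against canonical forms and synonyms. It is not the LUIS implementation or API; the menu data and the function names build_lookup and match_list_entity are assumptions made up for the example.

```python
import re

# Hypothetical list entity: canonical form -> synonyms (not from the docs).
MENU_ITEMS = {
    "cheeseburger": ["cheese burger", "burger with cheese"],
    "fries": ["french fries", "chips"],
}


def build_lookup(list_entity):
    """Map every known surface form (lower-cased) to its canonical form."""
    lookup = {}
    for canonical, synonyms in list_entity.items():
        for form in [canonical, *synonyms]:
            lookup[form.lower()] = canonical
    return lookup


def match_list_entity(utterance, list_entity):
    """Return canonical forms for every exact (case-insensitive) match."""
    lookup = build_lookup(list_entity)
    # Try longer forms first so multi-word values win over shorter ones.
    forms = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(form) for form in forms) + r")\b",
        re.IGNORECASE,
    )
    return [lookup[m.group(1).lower()] for m in pattern.finditer(utterance)]


# Values that are in the list are found and normalized.
known = match_list_entity("I'd like a Cheese Burger and chips", MENU_ITEMS)
# A new item is not found: nothing generalizes beyond the list.
unknown = match_list_entity("I'd like a veggie wrap", MENU_ITEMS)
```

In this sketch "veggie wrap" is never returned, because nothing outside the list is matched. That is the situation where the quoted documentation recommends a machine-learned entity helped by a phrase list instead.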

Do I have to make my own LUIS entity to recognize the word "latest"

I am currently using the prebuilt entity ORDINAL and it serves very well in recognizing the words FIRST and LAST. However, from tests, I see that my users use the word "latest" and it doesn't recognize it as an ORDINAL.
Should I just make my own entity then? Any help pointing me in the right direction would be appreciated.

For this instance you have three options ahead of you, two of which are in LUIS itself.
LUIS: Option 1 - Simple Entity
Create a simple entity in your application and add latest and its synonyms to your LUIS application.
Benefits include less code debt, being able to label tokens manually, and using machine learning to recognize latest and its synonyms (note: you still have to provide the synonyms for LUIS to recognize)
Cons include one less entity to use in your application (current limit for any combination of simple, hierarchical and composite entities is 30 per application).
LUIS: Option 2 - List Entity
Create a list entity in your application and add latest as the canonical form of a sublist with its synonyms as values in the list for matching.
Benefits include RegEx matching, abstracted away from your application. LUIS will recognize any token that already exists in the list entity.
Cons include losing one list entity for one word with a finite set of synonyms (current limit for list entities is 50 per application). You will have to add each token manually to the sublist for it to be recognized. Users are unable to label tokens with a list entity. A list entity is also not used in the machine learning aspect of LUIS, so it does not help improve intent prediction scores.
Application level: RegExp/sub-string parsing
Create a token extractor (using RegExp or some other technique) to recognize the word latest and its synonyms
Benefits for this include using fewer LUIS resources (entities and list entities) and, less importantly, perhaps a minuscule reduction in the time it takes to receive results from LUIS.
Cons for this include increased code debt due to the matching you have to perform in your application.
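As a rough sketch of the application-level option, a small regular-expression extractor could look like the following. The synonym list and the function name extract_latest are assumptions for illustration; the real set of synonyms depends on what your users actually type.

```python
import re

# Assumed synonym set for "latest"; adjust to your users' wording.
LATEST_SYNONYMS = ["latest", "newest", "most recent"]

# Word-boundary anchored, case-insensitive alternation of the synonyms.
LATEST_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(s) for s in LATEST_SYNONYMS) + r")\b",
    re.IGNORECASE,
)


def extract_latest(utterance):
    """Return (start, end, text) for each occurrence of a 'latest' synonym."""
    return [
        (m.start(), m.end(), m.group(0))
        for m in LATEST_PATTERN.finditer(utterance)
    ]


hits = extract_latest("Show me the latest order")
misses = extract_latest("Show me the first order")
```

The application would run this on the raw utterance alongside the LUIS result and treat a hit the same way it treats an ORDINAL value of "last". That merging logic is the extra code debt mentioned above.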

What is the List entity count(x/50) in Luis apart from Entity count?

In LUIS AI tool there is a "list entity count" shown. What is it actually?

I'm not entirely sure what your question is, but here's what I think you're asking about.
The List entity count (x/50) shows how many list entities you have created out of the maximum of 50 allowed in your LUIS model. This limit is separate from the limit for the other types of entities: Prebuilt, Simple, Hierarchical (and their children) and Composite entities. You can only have a combination of up to 30 of these entities.
The reason why these 50 don't contribute to your limit of 30 is that these entities aren't machine learned. They're recognized through regex.
Each list entity can contain up to 20,000 items.
You can find more information on Entities in LUIS here.

How do I create a custom role for entities in Sphinx?

In my project we define threats/risks and countermeasures. I want to keep track and refer to both types of entities in Sphinx, as well as generating a list of both threats/risks and the countermeasures. Let's say I have 30 risks and 50 countermeasures (many-to-many relationship).
I'd be happy just to have lists of both and the ability to refer to each item by number (e.g. "risk #23", "countermeasure #12"). It would be even better if the system could display the relationship automatically.
The content of each item is, let's say, a single paragraph or even shorter, which is why I'd rather not use regular headings. And I cannot refer to items in lists or table rows. So I'm looking for something like a Figure in Sphinx (numbered, with caption), but for arbitrary types of entities.
My current approach is to create a custom RST role for this. Is this the right approach? If so, where to start?
