Difference between Phrase lists and list entities? - botframework

I have started working with the Microsoft Bot Framework and LUIS.
I am having trouble understanding the difference between phrase lists and list entities.
Can you help me? Examples would be great.
Thanks

What is a list entity?
The definition from the documentation, with the main points:
List entities represent a fixed set of related words in your system.
Each list entity may have one or more forms. They aren't machine
learned, and are best used for a known set of variations on ways to
represent the same concept. List entities are not labeled in
utterances or trained by the system.
A list entity is an explicitly specified list of values. Unlike other
entity types, LUIS does not discover additional values for list
entities during training. Therefore, each list entity forms a closed
set.
If there is more than one list entity with the same value, each entity
is returned in the endpoint query.
What is a phrase list feature?
The definition from the documentation:
A phrase list includes a group of values (words or phrases) that
belong to the same class and must be treated similarly (for example,
names of cities or products). What LUIS learns about one of them is
automatically applied to the others as well. This is not a white list
of matched words.
When to use phrase lists instead of list entities
I think the best answer is (still from the documentation):
When you use a phrase list, LUIS can still take context into account and generalize to identify items that are similar to, but not an exact match as items in a list. If you need your LUIS app to be able to generalize and identify new items in a category, it's better to use a phrase list.
In contrast, a list entity explicitly defines every value an entity can take, and only identifies values that match exactly. A list entity may be appropriate for an app in which all instances of an entity are known and don't change often, like the food items on a restaurant menu that changes infrequently. In a system in which you want to be able to recognize new instances of an entity, like a meeting scheduler that should recognize the names of new contacts, or an inventory app that should recognize new products, it's better to use another type of entity and then use phrase list features to help guide LUIS to recognize examples of the entity.

Related

How to use ml luis entity and list entity together in bot framework

When I try to use both a list entity and an ml entity together in Bot Framework Composer, I get the following error:
"Check STATUS with {#idtype} {#id=132354}" has mix of entites with labelled values and ones without. Please update utterance to either include labelled values for all entities or remove labelled values from all entities."
The utterance is: Check STATUS with {#idtype} {#id=132354}. Here idtype is a list entity and id is an ml entity.
This means that, for any given utterance, you have to either add labels (the "=132354" part) to all entities in the utterance or remove them from all entities.
For your specific sample, since you should have the "132354" value included in the list definition, you could remove it from the utterance.
However, ml entities require (at least some) utterances with labeled entities, after all that's the only way the machine can learn what that entity looks like 😉. So, wherever you label an ml entity, you should also label the list entity or any other entity in the utterance.
You can read more about this topic in the Best practices for building a language understanding (LUIS) app documentation page.
BTW, an utterance with unlabeled entities is considered a pattern; you can read more about this in the Patterns improve prediction accuracy page.
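The all-or-nothing rule can be checked before publishing. The following is a minimal Python sketch that assumes the {#name} and {#name=value} notation exactly as it appears in the utterance above; the label_status helper and the "ticket" value are hypothetical and not part of any Bot Framework tooling.

```python
import re

# Matches {#name} or {#name=value}, the notation used in the utterance above.
ENTITY = re.compile(r"\{#(?P<name>[^=}]+)(?:=(?P<value>[^}]*))?\}")


def label_status(utterance):
    """Classify an utterance as 'labelled', 'pattern', 'mixed', or 'plain'."""
    values = [m.group("value") for m in ENTITY.finditer(utterance)]
    if not values:
        return "plain"          # no entity placeholders at all
    labelled = [v is not None for v in values]
    if all(labelled):
        return "labelled"       # every entity carries a value
    if not any(labelled):
        return "pattern"        # no entity carries a value
    return "mixed"              # the case the error message complains about


print(label_status("Check STATUS with {#idtype} {#id=132354}"))         # mixed
print(label_status("Check STATUS with {#idtype=ticket} {#id=132354}"))  # labelled
print(label_status("Check STATUS with {#idtype} {#id}"))                # pattern
```

Only the "mixed" case corresponds to the error in the question; the other two forms are both accepted.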

Distributed GraphQL in microservices

I'm trying to write microservices in Java. I've implemented GraphQL endpoints using graphql-spring-boot-starter.
Now I'm struggling to make it efficient.
The data model is like a tree, and I need to query data from multiple services at once. The problem is how to filter on a member of a collection, something like CONTAINS in a database, when the data is not in a separate table but in a separate microservice. Maybe the problem is that the domain is not correctly split between services?
Let's take an example: I have 3 microservices: users, libraries, books. Every library has a collection of users and books (just lists of identifiers, like foreign keys). Every book has a name and a genre. Every library has lists of the books borrowed by each user (identifiers too).
Question 1 - Should a library hold the lists of books and users (just identifiers, like foreign keys)? Is that the correct approach?
Question 2 - I want to find libraries in which specified users (by surname) have borrowed books of a specified genre. Going from the top, I first need to find the libraries containing those users. That's not easy, as the names are in a different service. We first need to query for users and gather their identifiers, and only then can we query for libraries. But that isn't all. Next we need to find the books for every user and check their genres, in yet another service. And that's still not all. I want everything presented nicely, so the whole output should be sorted and paged. That forces me to collect all the data from all services and then sort and page it, which of course will not be efficient.
Please don't concentrate on this example; I'm looking for a general approach, not a solution to this one case. I've tried to use DataFetchers, but it's troublesome and there are no good examples of GraphQL-to-GraphQL calls. Most examples cover calling REST endpoints etc.
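For reference, the naive flow from Question 2 can be sketched as follows. This is Python rather than Java, with in-memory dictionaries standing in for the three services; every name and data value is hypothetical. As a small simplification, it fetches the matching book identifiers in one call instead of checking genres per user. It shows why sorting and paging can only happen after the full result set has been collected.

```python
# In-memory stand-ins for the three services; all data here is made up.
USERS = {1: "Smith", 2: "Jones", 3: "Smith"}
BOOKS = {10: ("Dune", "sci-fi"), 11: ("Emma", "romance"), 12: ("Solaris", "sci-fi")}
LIBRARIES = {
    "Central": {"loans": {1: [10, 11], 2: [12]}},
    "North": {"loans": {3: [11]}},
    "East": {"loans": {3: [12], 2: [10]}},
}


def users_by_surname(surname):
    """Stand-in for a call to the users service."""
    return {uid for uid, name in USERS.items() if name == surname}


def books_by_genre(genre):
    """Stand-in for a call to the books service."""
    return {bid for bid, (_, g) in BOOKS.items() if g == genre}


def find_libraries(surname, genre, page, page_size):
    user_ids = users_by_surname(surname)      # call 1: users service
    book_ids = books_by_genre(genre)          # call 2: books service
    hits = [
        name
        for name, lib in LIBRARIES.items()    # call 3: libraries service
        if any(
            uid in user_ids and book_ids.intersection(borrowed)
            for uid, borrowed in lib["loans"].items()
        )
    ]
    hits.sort()                               # sorting needs the full result set
    start = page * page_size
    return hits[start:start + page_size]      # paging happens only at the end


print(find_libraries("Smith", "sci-fi", page=0, page_size=1))  # ['Central']
print(find_libraries("Smith", "sci-fi", page=1, page_size=1))  # ['East']
```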

Training LUIS to predict entities without Phrase List

I am trying to train LUIS to recognize an entity from a few utterances. I initially tried training with a few utterances containing different entity values. The entity values are made up of two or more words, for example 'customer engagement', 'empower your teams', etc.
I am not able to get LUIS to identify the entity correctly because of the variation in the number of words.
I cannot use a phrase list, as the values are dynamic.
How can I train LUIS to recognize the multiple words in the utterance and identify the entity effectively?
This still requires you to provide some training data in the form of canonical values and synonyms, but another way to approach this would be to use a list entity inside a composite entity. Other than this, you'll currently have to provide a larger amount of training data or phrase list data, as LUIS doesn't look at the definition of a word.

Do I have to make my own LUIS entity to recognize the word "latest"

I am currently using the prebuilt entity ORDINAL and it serves very well in recognizing the words FIRST and LAST. However, from tests, I see that my users use the word "latest" and it doesn't recognize it as an ORDINAL.
Should I just make my own entity then? Any help pointing me in the right direction would be appreciated.
For this instance you have three options ahead of you, two of which are in LUIS itself.
LUIS: Option 1 - Simple Entity
Create a simple entity in your application and add latest and its synonyms to your LUIS application.
Benefits include less code debt, being able to label tokens manually, and using machine learning to recognize latest and its synonyms (note: you still have to provide the synonyms for LUIS to recognize them).
Cons include one less entity to use in your application (current limit for any combination of simple, hierarchical and composite entities is 30 per application).
LUIS: Option 2 - List Entity
Create a list entity in your application and add latest as the canonical form of a sublist with its synonyms as values in the list for matching.
Benefits include RegEx matching, abstracted away from your application. LUIS will recognize any token that already exists in the list entity.
Cons include losing one list entity for one word with a finite set of synonyms (current limit for list entities is 50 per application). You will have to add each token manually to the sublist for it to be recognized. Users are unable to label tokens with a list entity. A list entity is also not used in the machine learning aspect of LUIS, so it does not help improve intent prediction scores.
Application level: RegExp/sub-string parsing
Create a token extractor (using RegExp or some other technique) to recognize the word latest and its synonyms.
Benefits for this include lower expenditure of LUIS resources (entities and list entities) and, less importantly, perhaps a minuscule reduction in the time it takes to receive results from LUIS.
Cons for this include increased code debt due to the matching you have to perform in your application.
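For the application-level option, a token extractor could look roughly like the Python sketch below. The synonym list and the extract_latest helper are assumptions made for illustration; the real list would come from what your users actually type.

```python
import re

# Hypothetical synonym list; extend it with whatever your users actually type.
LATEST_SYNONYMS = ["latest", "most recent", "newest"]

# Longest alternatives first so multi-word synonyms are tried before shorter ones;
# \b restricts matches to whole words.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(s) for s in sorted(LATEST_SYNONYMS, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def extract_latest(utterance):
    """Return the matched token in lower case, or None if no synonym appears."""
    match = _PATTERN.search(utterance)
    return match.group(1).lower() if match else None


print(extract_latest("Show me the Latest invoice"))   # latest
print(extract_latest("Show me the most recent one"))  # most recent
print(extract_latest("Show me the first invoice"))    # None
```

A check like this can run on the raw utterance before or after the LUIS call, with the prebuilt ORDINAL entity still handling FIRST and LAST.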

How do I create a custom role for entities in Sphinx?

In my project we define threats/risks and countermeasures. I want to keep track and refer to both types of entities in Sphinx, as well as generating a list of both threats/risks and the countermeasures. Let's say I have 30 risks and 50 countermeasures (many-to-many relationship).
I'd be happy just to have lists of both and the ability to refer to each item by number (e.g. "risk #23", "countermeasure #12"). It would be even better if the system could display the relationship automatically.
The content of each is, let's say, a single paragraph or even shorter, which is why I'd rather not use regular headings. And I cannot refer to items in lists or table rows. So I'm looking for something like a Figure in Sphinx (numbered, with caption), but for arbitrary types of entities.
My current approach is to create a custom RST role for this. Is this the right approach? If so, where to start?
