How to set up Microsoft LUIS to detect composed names (dash separated) - azure-language-understanding

I want to detect a person's name in LUIS, including composed names (e.g. Mary-Anne).
Setup:
a simple custom entity for names
a pattern feature for dash-separated words: ^\w*-\w*$
a phrase list feature to try to get at least some examples working: [marc-andre, marie-anne, jean-marc]
I trained and published (on staging), yet LUIS never detects the whole composed name; it returns only the first part as the entity (e.g. the entity is "marc" instead of "marc-andre").
Do you know how to configure LUIS to properly detect my composed name entity?
Update taking Denise's answer into account
In the Luis.ai UI, I didn't realize that while labelling an utterance, you can click more than once to select multiple words for a single entity.

I was able to configure a simple custom entity like you describe. I posted the JSON that you can import into LUIS here.
Without seeing the JSON for your LUIS app it's hard to tell why it fails to recognize the dash-separated names - feel free to post the JSON for your LUIS app here. Sometimes a LUIS app won't recognize an entity due to a lack of labeling. A key part of getting LUIS to recognize an entity is labeling enough examples. A pattern feature is just a hint to LUIS -- you still need to define example utterances that have the labeled entity. For example, if you have defined an intent called MyNameIs and want to recognize the Name entity within them, you'll want to add a variety of utterances to the MyNameIs intent that contain dash-separated names, and label each name with the entity.
When I added the pattern feature I used + to indicate "one or more" in the regex instead of *. However, this difference shouldn't break your pattern feature.
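To make the * versus + difference concrete, here is a minimal Python sketch. It uses Python's re module only to illustrate the regex semantics; how LUIS itself evaluates a pattern feature is not shown, and the sample strings are made up for illustration.

```python
import re

# Pattern from the question: zero or more word characters on each side.
STAR = re.compile(r"^\w*-\w*$")
# Variant from the answer: one or more word characters on each side.
PLUS = re.compile(r"^\w+-\w+$")


def matches(pattern: "re.Pattern[str]", text: str) -> bool:
    """Return True when the whole string satisfies the pattern."""
    return pattern.match(text) is not None


# Hypothetical sample strings, mapped to (star_result, plus_result).
samples = ["marc-andre", "-", "marc-", "marc"]
results = {s: (matches(STAR, s), matches(PLUS, s)) for s in samples}

for text, (star, plus) in results.items():
    print(f"{text!r}: star={star} plus={plus}")
```

Both variants accept a real dash-separated name; the * form additionally accepts degenerate strings such as a lone dash, which is why the difference should not stop a composed name from matching.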
Another problem that can happen with hyphens is in the JSON that LUIS returns. When you inspect the JSON result from LUIS you can see how the Name entity is identified. Notice that in the entity field, LUIS inserts spaces around the hyphen, but the startIndex and endIndex fields identify the position of the entity in the original utterance. So if your code reads the entity field instead of applying startIndex and endIndex to the query field, the name it extracts will contain those extra spaces.
{
  "query": "my name is anne-marie",
  "topScoringIntent": {
    "intent": "MyNameIs",
    "score": 0.9912877
  },
  "entities": [
    {
      "entity": "anne - marie",
      "type": "Name",
      "startIndex": 11,
      "endIndex": 20,
      "score": 0.8978088
    }
  ]
}
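A minimal Python sketch of recovering the original text of the entity from the query field, using the response above as input. The helper name entity_text is hypothetical; the sketch assumes endIndex is inclusive, which is what the sample suggests (positions 11 to 20 cover the ten characters of "anne-marie").

```python
import json

# The LUIS response shown above, embedded as a string for the example.
response = json.loads("""
{
  "query": "my name is anne-marie",
  "topScoringIntent": {"intent": "MyNameIs", "score": 0.9912877},
  "entities": [
    {
      "entity": "anne - marie",
      "type": "Name",
      "startIndex": 11,
      "endIndex": 20,
      "score": 0.8978088
    }
  ]
}
""")


def entity_text(query: str, entity: dict) -> str:
    """Slice the original utterance; endIndex is treated as inclusive."""
    return query[entity["startIndex"]: entity["endIndex"] + 1]


names = [
    entity_text(response["query"], e)
    for e in response["entities"]
    if e["type"] == "Name"
]
print(names)
```

Slicing the query this way keeps the hyphenated name exactly as the user typed it, instead of the tokenized form in the entity field.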


Graphql type with id property that can have different values for same id

I was wondering if an object type that has an id property has to have the same content given the same id. At the moment the same id can have different content.
The following query:
const query = gql`
  query products(
    $priceSelector: PriceSelectorInput!
  ) {
    productProjectionSearch(
      priceSelector: $priceSelector
    ) {
      total
      results {
        masterVariant {
          # If you do the following it will work
          # anythingButId: id
          id
          scopedPrice {
            country
          }
        }
      }
    }
  }
`;
If the PriceSelectorInput is {currency: "USD", country: "US"} then the result is:
{
  "productProjectionSearch": {
    "total": 2702,
    "results": [
      {
        "name": "Sweater Pinko white",
        "masterVariant": {
          "id": 1,
          "scopedPrice": {
            "country": "US",
            "__typename": "ScopedPrice"
          },
          "__typename": "ProductSearchVariant"
        },
        "__typename": "ProductProjection"
      }
    ],
    "__typename": "ProductProjectionSearchResult"
  }
}
If the PriceSelectorInput is {currency: "EUR", country: "DE"} then the result is:
{
  "productProjectionSearch": {
    "total": 2702,
    "results": [
      {
        "name": "Sweater Pinko white",
        "masterVariant": {
          "id": 1,
          "scopedPrice": {
            "country": "DE",
            "__typename": "ScopedPrice"
          },
          "__typename": "ProductSearchVariant"
        },
        "__typename": "ProductProjection"
      }
    ],
    "__typename": "ProductProjectionSearchResult"
  }
}
In both cases masterVariant (of type ProductSearchVariant) has an id of 1, but scopedPrice differs. This breaks the Apollo cache's defaultDataIdFromObject function, as demonstrated in this repo. Is this a bug in Apollo, or does the type definition of ProductSearchVariant violate the GraphQL standard?
TL;DR
No, it does not break the spec. The spec mandates absolutely nothing regarding caching.
Literature for people that may be interested
From the end of the overview section
Because of these principles [... one] can quickly become productive without reading extensive documentation and with little or no formal training. To enable that experience, there must be those that build those servers and tools.
The following formal specification serves as a reference for those builders. It describes the language and its grammar, the type system and the introspection system used to query it, and the execution and validation engines with the algorithms to power them. The goal of this specification is to provide a foundation and framework for an ecosystem of GraphQL tools, client libraries, and server implementations -- spanning both organizations and platforms -- that has yet to be built. We look forward to working with the community in order to do that.
As we just saw, the spec says nothing about caching or implementation details; that is left to the community. The rest of the document goes on to specify how the type system, the language, requests, and responses should be handled.
Also note that the document does not mention which underlying protocol is being used (although commonly it's HTTP). You could effectively run GraphQL communication over a USB device or over infra-red light.
We hosted a talk at one of our tech conferences which you might find interesting. Here's a link:
GraphQL Anywhere - Our Journey With GraphQL Mesh & Schema Stitching • Uri Goldshtein • GOTO 2021
If we "Ctrl+F" the spec for terms such as "Cache" or "ID", we find the following section, which I think helps us reach a conclusion here:
ID
The ID scalar type represents a unique identifier, often used to refetch an object or as the key for a cache. The ID type is serialized in the same way as a String; however, it is not intended to be human‐readable. While it is often numeric, it should always serialize as a String.
Result Coercion
GraphQL is agnostic to ID format, and serializes to string to ensure consistency across many formats ID could represent, from small auto‐increment numbers, to large 128‐bit random numbers, to base64 encoded values, or string values of a format like GUID.
GraphQL servers should coerce as appropriate given the ID formats they expect. When coercion is not possible they must raise a field error.
Input Coercion
When expected as an input type, any string (such as "4") or integer (such as 4) input value should be coerced to ID as appropriate for the ID formats a given GraphQL server expects. Any other input value, including float input values (such as 4.0), must raise a query error indicating an incorrect type.
It mentions that this field is commonly used as a cache key (and it is the default cache key for the Apollo collection of GraphQL implementations), but it tells us nothing about "consistency of the returned data".
Here's the link for the full specification document for GraphQL
Warning! Opinionated - My take on IDs
Of course, nothing I am about to say has anything to do with the GraphQL specification.
Sometimes an ID alone is not enough information to decide whether to cache something. Let's think about user searches:
Say I have a FavouriteSearch entity that has an ID in my database and a field called textSearch. I'd typically want to expose a property results: [Result!]! on my GraphQL schema, referencing all the results that this specific text search yielded.
These results are very likely to differ between the moment I make the search and five minutes later when I revisit my favourite search (think of a text search on a platform such as TikTok, where users upload content at massive scale).
So, given this definition of the FavouriteSearch entity, it makes sense that caching it by ID behaves unexpectedly.
If we look at the problem from a different angle, we might instead want a SearchResults entity with an ID and a timestamp, plus a join table referencing all the posts related to the initial text search. In that case it would make sense to return consistent content for the results property on our GraphQL schema.
The point is that it depends on how we define our entities, and ultimately it is not related to the GraphQL spec.
A solution for your problem
You can specify how Apollo generates the key it later uses in the cache, as @Matt already pointed out in the comments. You may want to tap into that and override the behavior for those entities whose __typename equals the type of your masterVariant property, returning NO_KEY (or similar) for all of them, so that your ApolloClient does not cache those specific fields.
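To illustrate why the shared id causes trouble and what opting a type out of keying achieves, here is a toy model in Python. It is not Apollo's API: default_key, custom_key, write, and UNKEYED_TYPES are hypothetical names, and the dictionary only mimics the idea of a normalized cache keyed by type name plus id.

```python
from typing import Callable, Dict, Optional

# Hypothetical: types whose id does not determine their content.
UNKEYED_TYPES = {"ProductSearchVariant"}

KeyFn = Callable[[dict], Optional[str]]


def default_key(obj: dict) -> Optional[str]:
    """Toy stand-in for keying by type name plus id."""
    if "__typename" in obj and "id" in obj:
        return f'{obj["__typename"]}:{obj["id"]}'
    return None


def custom_key(obj: dict) -> Optional[str]:
    """Refuse to produce a key for the opted-out types."""
    if obj.get("__typename") in UNKEYED_TYPES:
        return None
    return default_key(obj)


def write(cache: Dict[str, dict], obj: dict, key_fn: KeyFn) -> None:
    """Store obj under its key; objects sharing a key are merged."""
    key = key_fn(obj)
    if key is not None:
        cache.setdefault(key, {}).update(obj)


us = {"__typename": "ProductSearchVariant", "id": 1,
      "scopedPrice": {"country": "US"}}
de = {"__typename": "ProductSearchVariant", "id": 1,
      "scopedPrice": {"country": "DE"}}

shared: Dict[str, dict] = {}
write(shared, us, default_key)
write(shared, de, default_key)  # same key, so the US price is overwritten

isolated: Dict[str, dict] = {}
write(isolated, us, custom_key)
write(isolated, de, custom_key)  # no key, so nothing is stored by id

print(shared)
print(isolated)
```

In the toy model the two responses collide under one key with the default function, while the custom function keeps them out of the id-keyed store entirely; a real client would then hold such objects only as part of the query result that returned them.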
I hope this was helpful!

FHIR search in the reference resource

I have a Slot resource which has a reference to a Schedule resource. The Schedule resource has a list of actors, including Patient, Practitioner, and others. If I want to retrieve a booked slot with the assigned Practitioner's name, what is the correct FHIR server query?
Example data inside Schedule reference:
"actor": [
  {
    "reference": "Practitioner/{id}",
    "display": "Dr.John"
  },
  ...]
I tried with "[base]/{resource with id}&_include=Slot:schedule&_include:iterate=Practitioner:actor[0]". But it is not working.
A few issues:
If you're going to have _include, you have to do 'search', not 'read'. Search must be done against the type, not resource type + id. If you want to filter to a specific resource, that needs to be expressed as a search criteria, not as part of the base path.
You can't specify a repetition to include. I.e. no "[0]". If you include, you get them all.
The second include needs to refer to the path to practitioner from Slot. You can filter by what type you want if desired. I've done that below.
So the search should look like this:
[base]/Slot?_id=123&_include=Slot:schedule&_include:iterate=Schedule:actor:Practitioner
Be aware that not all servers will support all _includes, or even support _include at all.
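The search URL above can be assembled programmatically. This is a minimal Python sketch using only the standard library; the base URL https://example.org/fhir and the function name slot_search_url are placeholders, and the colons are left unescaped purely so the output reads like the query shown above.

```python
from urllib.parse import urlencode


def slot_search_url(base: str, slot_id: str) -> str:
    """Build a Slot search that also pulls in the Schedule and its Practitioner actors."""
    params = [
        ("_id", slot_id),                                      # filter as a search criterion
        ("_include", "Slot:schedule"),                         # Slot -> Schedule
        ("_include:iterate", "Schedule:actor:Practitioner"),   # Schedule -> Practitioner only
    ]
    # safe=":" keeps the colons readable instead of percent-encoding them.
    return f"{base}/Slot?{urlencode(params, safe=':')}"


url = slot_search_url("https://example.org/fhir", "123")
print(url)
```

A list of tuples is used instead of a dict so that further _include parameters can be appended without overwriting one another.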

Dispatch CLI not passing Entities from Luis App

When generating a Dispatch model using the CLI, it doesn't pass the entities from the referenced LUIS app. This drastically affects the accuracy of the dispatch app.
For example, for the utterance "My [iPhone] isn't working", iPhone is attached to a list entity named CellPhoneType. There are three items in the list: iPhone, Samsung, Smartphone.
In the bot emulator, using the Dispatch, if I write "my iPhone isn't working", the dispatch model passes it to LUIS, as it should. However, if I write "my smartphone isn't working", the dispatch tool sends it over to QnA Maker.
I checked the model, and the entities are not passed from the referenced app. I also tested with simple entities, and they do not work either.
I have the most recent version of the CLI installed.
Is this normal, or is this a bug? Is there a workaround to fix this?
So, a couple of things to address here regarding how you've built your LUIS model and what to expect from dispatch. Skip down to 2.) if you're reading this post and already have entities working well in child LUIS models. @AlexandreViegas, read point 1.) for help building your LUIS model so that dispatch detects intents properly.
1. Use a Simple Entity + Phrase List to take Advantage of LUIS's machine learning--not List Entity
Right now it seems that a list entity is not the best choice here, and this is not how list entities are intended to be used. List entities are meant for terms that have multiple ways of referring to the same thing.
Examples of When You Would Want to Use a List Entity
For example, California, Cali, CA, and The Golden State are all terms that refer to the same thing (a state). You can create a "States" list entity, include all 50 U.S. states and their nicknames. Now since this is a closed, explicit list, there is no machine learning when you use a list entity--LUIS will only detect "States" list entity if there's an exact text match.
Another example of when you would want to use list entities would be say with "Departments" for a school. You could have "chemistry", "CHEM142", "chem", etc. all meant to refer to that specific department, and do so with the rest of the departments in the school.
Why you want to use a Simple Entity and add a Phrase List
You can refer to this other StackOverflow answer I wrote, regarding how to create a simple entity and boost the signal of the entity using a phrase list.
To avoid duplicating the answer linked above: in essence, you want to use a simple entity so that LUIS can predict terms as a CellPhoneType entity even if you did not explicitly include them in your model.
For example, you could have a Phone intent with utterances that label various words as the CellPhoneType entity.
When I go to the Test panel, I type in "sunflower" and "moonstone" as made up mobile phones (maybe some phone company in the future creates phones with these names as their models):
Above you can see LUIS correctly predicts Phone intent and correctly extracts sunflower and moonstone as CellPhoneType entities.
However if I enter in brand names of mobiles that don't exist in the English language--for example Blackberry's "Z3" or T-Mobile's "G2X", LUIS cannot detect this with our model as is right now. (See 2 most recent utterances).
Above you can see utterances "i'd like to order a z3" and "my g2x is broken" do not properly predict as Phone intent, nor do z3 or g2x get detected as CellPhoneType entity. This is where phrase lists come in. As specified in the docs, phrase lists are good for boosting the signal of what a cell phone type may look like, as well as adding proprietary or foreign words to your LUIS model, such as the "made-up" words of many cell phone models. Again, refer to the StackOverflow answer I linked to, if you need guidance on how to create a phrase list.
After adding different names of cell phone models to phrase list
2. Query the endpoint of the LUIS model that was created by dispatch directly
Clarification:
When you add a child LUIS model to dispatch, even if that child LUIS model has entities in it, they will not show up in the parent LUIS model created by dispatch.
The exception to the above would be if you labelled an entity in a pattern.
Entities do not need to be labelled in the parent LUIS model because, when you call the endpoint of the parent LUIS model, it makes a sort of shared call under the hood, so you don't have to ping LUIS twice.
You see the entities labelled by the child LUIS model in the connectedServiceResult property.
How to extract entities from child LUIS model, using your parent dispatch LUIS app
Make sure to publish both the child LUIS app and the parent dispatch app.
In your parent dispatch-created LUIS app, go to Manage > Keys and Endpoints and click "Endpoint" to open a browser tab where you can query the parent app by typing in the URL after q=
Type your utterances in the URL after q= to see the entities and intents extracted by the child LUIS model under connectedServiceResult
https://westus.api.cognitive.microsoft.com/luis/v2.0/apps/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx?verbose=true&timezoneOffset=-360&subscription-key=b7xxxxxxxxxxxxxxxxxxxxxxxxxxxx67&q=my%20iphone%20is%20broken
{
  "query": "my iphone is broken",
  "topScoringIntent": {
    "intent": "l_Reminders",
    "score": 0.99594605
  },
  "intents": [
    {
      "intent": "l_Reminders",
      "score": 0.99594605
    },
    {
      "intent": "None",
      "score": 0.002990469
    }
  ],
  "entities": [],
  "connectedServiceResult": {
    "query": "my iphone is broken",
    "topScoringIntent": {
      "intent": "Phone",
      "score": 0.9658808
    },
    "intents": [
      {
        "intent": "Phone",
        "score": 0.9658808
      },
      {
        "intent": "Calendar.Add",
        "score": 0.0142210266
      },
      {
        "intent": "Calendar.Find",
        "score": 0.0112086516
      },
      {
        "intent": "None",
        "score": 0.009813501
      },
      {
        "intent": "Email",
        "score": 0.0025855056
      }
    ],
    "entities": [
      {
        "entity": "iphone",
        "type": "CellPhoneType",
        "startIndex": 3,
        "endIndex": 8,
        "score": 0.998970151
      }
    ]
  }
}
Above you can see that the parent LUIS app created from dispatch properly identifies iphone from the utterance my iphone is broken as a CellPhoneType entity.
Note: you will not see results from the child LUIS model in the Test panel of the parent dispatch, because the UI does not show connectedServiceResult
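A minimal Python sketch of reading the child model's entities out of a dispatch response like the one above. The sample is a trimmed copy of that response; the helper name child_entities is hypothetical, and the sketch simply treats a missing connectedServiceResult as "no child entities".

```python
import json
from typing import List, Tuple

# Trimmed copy of the dispatch response shown above.
sample = json.loads("""
{
  "query": "my iphone is broken",
  "topScoringIntent": {"intent": "l_Reminders", "score": 0.99594605},
  "entities": [],
  "connectedServiceResult": {
    "query": "my iphone is broken",
    "topScoringIntent": {"intent": "Phone", "score": 0.9658808},
    "entities": [
      {
        "entity": "iphone",
        "type": "CellPhoneType",
        "startIndex": 3,
        "endIndex": 8,
        "score": 0.998970151
      }
    ]
  }
}
""")


def child_entities(dispatch_response: dict) -> List[Tuple[str, str]]:
    """Return (type, text) pairs found by the child model, if any."""
    child = dispatch_response.get("connectedServiceResult") or {}
    return [(e["type"], e["entity"]) for e in child.get("entities", [])]


print(child_entities(sample))
```

The top-level entities list stays empty in this sample, so code that only looks there would conclude that no entity was detected.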

Google Places Nearby search results - missing detail Data?

I'm currently working on a project in which we perform "Nearby" queries for places using keywords, and then we make follow-up "Detail" requests to obtain more information about specific places of interest.
With Google's new pricing model in the works, the documentation warns about the cost of the Nearby search, but the warning seems to imply that the follow-up detail request will no longer be necessary because our original search should give us everything we need:
By default, when a user selects a place, Nearby Search returns all of
the available data fields for the selected place, and you will be
billed accordingly. There is no way to constrain Nearby Search
requests to only return specific fields. To keep from requesting (and
paying for) data that you don't need, use a Find Place request
instead.
However, I'm not seeing this. When I run a sample request, the results from my Nearby request contain only minimal data about the places Google finds. To get details, I still have to make a follow-up detail request.
Does anyone know if there's something I may be overlooking? I'm including my request URL (sans API key).
https://maps.googleapis.com/maps/api/place/nearbysearch/json?key=xxxxxxxxxx&location=30.7329,-88.081987&radius=5000&keyword=insurance
And this is an example of one of the results I received:
{
  "geometry": {
    "location": {
      "lat": 30.69254,
      "lng": -88.0443999
    },
    "viewport": {
      "northeast": {
        "lat": 30.69387672989272,
        "lng": -88.04309162010728
      },
      "southwest": {
        "lat": 30.69117707010728,
        "lng": -88.04579127989273
      }
    }
  },
  "icon": "https://maps.gstatic.com/mapfiles/place_api/icons/generic_business-71.png",
  "id": "53744cdc03f8a9726593a767424b14f7f8f86049",
  "name": "Ann M Hartwell - Aflac Insurance Agent",
  "place_id": "ChIJj29KxNZPmogRJovoXjMDpQI",
  "plus_code": {
    "compound_code": "MXV4+26 Mobile, Alabama",
    "global_code": "862HMXV4+26"
  },
  "reference": "CmRbAAAAcHM1P7KgNiZgVOm1pWojLto9Bqx96h2BkA-IyfN5oAz1-OICsRXiZOgwmwHb-eX7z679eFjpzPzey0brgect1UMsAiyawKpb5NLlgr_Pk8wBJpagRcKQF1VSvEm7Nq6CEhCfR0pM5wiiwpqAE1eE6eCRGhQPJfQWcWllOVQ5e1yVpZYbCsD01w",
  "scope": "GOOGLE",
  "types": [
    "insurance_agency",
    "point_of_interest",
    "establishment"
  ],
  "vicinity": "70 N Joachim St, Mobile"
}
I thought about deleting this question, but I guess I'll leave it up in case anyone else is confused like I was.
It turns out the extra detail fields I was looking for in the Nearby Search results were there...sort of.
Google's new pricing model categorizes place data fields into three tiers: Basic, Contact, and Atmosphere (Basic data is free, but the other two add to the cost).
As part of these changes, Place API calls have been expanded to allow users to specify the data fields they want so that they don't have to pay for that extra data if they don't need it.
The Nearby Search query, as per the note in the question, includes all the data fields available, and doesn't support a parameter for controlling the data -- it's always going to return data that falls into the [Basic + Contact + Atmosphere] bucket.
So far, that's all well and good.
Where things became confusing to me, though, is the specifics of what is included in the different data tiers. I skimmed through these notes several times before I noticed the contents were different.
This is how the fields break down with the Places details request:
Basic
The Basic category includes the following fields: address_component,
adr_address, alt_id, formatted_address, geometry, icon, id, name,
permanently_closed, photo, place_id, plus_code, scope, type, url,
utc_offset, vicinity
Contact
The Contact category includes the following fields:
formatted_phone_number, international_phone_number, opening_hours,
website
Atmosphere
The Atmosphere category includes the following fields: price_level,
rating, review
And this is how it looks for the Places search request:
Basic
The Basic category includes the following fields: formatted_address,
geometry, icon, id, name, permanently_closed, photos, place_id,
plus_code, scope, types
Contact
The Contact category includes the following field: opening_hours
(Place Search returns only open_now; use a Place Details request to
get the full opening_hours results).
Atmosphere
The Atmosphere category includes the following fields: price_level,
rating
I haven't found documentation for it specifically, but the results from a Nearby Search request seem close (but not identical) to those of a Place Search (including Contact and Atmosphere).
I had originally thought that, because Nearby Search results now include Contact and Atmosphere data (when available), they would contain all the fields itemized as Contact and Atmosphere data in the Place Details documentation, but that's not the case.
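Since the full Contact fields still require a follow-up request, here is a minimal Python sketch of building a Place Details URL that asks only for selected fields. The function name details_url and the YOUR_API_KEY placeholder are assumptions for illustration; the field names come from the Place Details lists quoted above, and the place_id is the one from the sample result in the question.

```python
from typing import List
from urllib.parse import urlencode

DETAILS_ENDPOINT = "https://maps.googleapis.com/maps/api/place/details/json"


def details_url(place_id: str, fields: List[str], api_key: str) -> str:
    """Build a Place Details request limited to the given fields."""
    params = {
        "place_id": place_id,
        "fields": ",".join(fields),  # comma-separated list of field names
        "key": api_key,
    }
    # safe="," keeps the field list readable instead of percent-encoding commas.
    return f"{DETAILS_ENDPOINT}?{urlencode(params, safe=',')}"


url = details_url(
    "ChIJj29KxNZPmogRJovoXjMDpQI",
    ["name", "formatted_phone_number", "website"],
    "YOUR_API_KEY",
)
print(url)
```

Requesting only the fields you need keeps the follow-up call within the data tiers you actually intend to pay for.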

Allow multiple provider states with parameters ( Golang )

As our team (namely myself and two other developers) spiked on PACT over the past week or so, one area of concern has been the inability to associate parameters with provider states. Given the absence of this key feature (which is slated for the version 3 release), we are unlikely to get buy-in from each of our respective service sub-teams.
@MattFellows - Any projections on when version 3 might be available for Go? Any chance we can get this feature earlier?
Allow multiple provider states with parameters
In previous versions, provider states are defined as a descriptive string. There is no way to infer the data required for the state without encoding the values into the description.
{
  "providerState": "an alligator with the given name Mary exists and the user Fred is logged in"
}
The change would be:
{
  "providerStates": [
    {
      "name": "an alligator with the given name exists",
      "params": {"name" : "Mary"}
    }, {
      "name": "the user is logged in",
      "params" : { "username" : "Fred"}
    }
  ]
}
You are correct in that it won't be available until version 3.
You can still achieve what you are after, however. The state itself is just a handle for the Consumer to some set of data on the Provider - that can be a one-to-one or one-to-many mapping - it's completely up to you.
Typically the Provider is notified of the state during verification; it then sets up a test data fixture (often by seeding a database) that establishes the 'state' of the entire system based on that reference, which allows the Consumer test to run.
Whilst the ability to pass through parameters and multiple states is nice, it's somewhat an advanced feature and I very much doubt this will be the first problem you run into as a team. I've never needed to use them myself.
For a crude but effective example of this, take a look at the gin code in the examples folder of the project.
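The idea of treating the state string as a handle for a fixture can be sketched in a language-neutral way. The Python below is not Pact's API and not the Go example referred to above: db, seed_mary_and_fred, STATE_HANDLERS, and setup_state are hypothetical names, and the in-memory dictionary stands in for a real test database.

```python
from typing import Callable, Dict, List

# Hypothetical in-memory stand-in for the provider's test database.
db: Dict[str, List[dict]] = {"alligators": [], "sessions": []}


def seed_mary_and_fred() -> None:
    """Fixture for the combined state: one alligator and one logged-in user."""
    db["alligators"].append({"name": "Mary"})
    db["sessions"].append({"username": "Fred"})


# One descriptive string maps to one fixture; the values live in the fixture.
STATE_HANDLERS: Dict[str, Callable[[], None]] = {
    "an alligator with the given name Mary exists and the user Fred is logged in":
        seed_mary_and_fred,
}


def setup_state(state: str) -> None:
    """Reset the store, then run the fixture registered for this state."""
    for rows in db.values():
        rows.clear()
    STATE_HANDLERS[state]()


setup_state(
    "an alligator with the given name Mary exists and the user Fred is logged in"
)
print(db)
```

The trade-off is that every combination of values needs its own state string and fixture, which is exactly what parameterised states are meant to remove.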
