How do I handle the special character ‘/’ in rasa? - rasa-nlu

For example, take the sentence "hi good afternoon how are you doing". The intent of this sentence is 'greet'. When I call the 'model/parse' API, it returns the correct intent and entities. But when I add the special character '/' in front of the sentence, e.g. "/ hi good afternoon how are you doing", the intent returned by 'model/parse' is '/ hi good afternoon how are you doing' instead of 'greet'. I read the Rasa source code; this behavior comes from the RegexInterpreter.
How do I deal with a leading '/' while still respecting the RegexInterpreter in the source code, as in the example I cite below? It would be best to solve this without modifying the source code. Please help me, thanks.
To clarify, I want the following three behaviors to hold at the same time:
1. When message.text = '/greet{"people": "tom"}':
The actual result returned by 'model/parse' is as follows:
{
"text": "/greet{\"people\": \"tom\"}",
"intent": {
"name": "greet",
"confidence": 1.0
},
"intent_ranking": [
{
"name": "greet",
"confidence": 1.0
}
],
"entities": [
{
"entity": "people",
"start": 6,
"end": 22,
"value": "tom"
}
]
}
2. When message.text = 'Hi Tom, good afternoon':
The actual result returned by 'model/parse' is as follows:
{
"text": "Hi Tom, good afternoon",
"intent": {
"name": "greet",
"confidence": 0.923
},
"intent_ranking": [
{
"name": "greet",
"confidence": 0.923
}
],
"entities": [
{
"entity": "people",
"start": 2,
"end": 5,
"value": "tom",
"confidence": 0.8433478958,
"extractor": "CRFEntityExtractor"
}
]
}
3. When message.text = '/ Hi Tom, good afternoon':
The actual result returned by 'model/parse' is as follows (this is not what I want):
{
"text": "/ Hi Tom, good afternoon",
"intent": {
"name": "Hi Tom, good afternoon",
"confidence": 1.0
},
"intent_ranking": [
{
"name": "Hi Tom, good afternoon",
"confidence": 1.0
}
],
"entities": []
}
But the result I expect is as follows:
{
"text": "Hi Tom, good afternoon",
"intent": {
"name": "greet",
"confidence": 0.923
},
"intent_ranking": [
{
"name": "greet",
"confidence": 0.923
}
],
"entities": [
{
"entity": "people",
"start": 2,
"end": 5,
"value": "tom",
"confidence": 0.8433478958,
"extractor": "CRFEntityExtractor"
}
]
}
Note that the only difference between the second and third cases is that the third message.text adds '/' at the beginning.
So, is there a method that solves this problem cleanly and satisfies all three situations at the same time?

The presence of / at the beginning of a user message is the built-in way to trigger an intent directly. From the example you've given, it doesn't look like something that would occur frequently (if at all) with real users. That said, if you do want to be sure that only actual intents can be triggered with the /, you could create a custom component that strips off a leading / when it is not followed by an intent already in your training data.
However, before going to that effort, I'd recommend checking how often this is actually happening.
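As a sketch of that custom component's core check (the Rasa component glue is omitted; `normalize_slash_prefix` is a hypothetical helper you would call from the component's `process()` method, with `known_intents` collected from your training data):

```python
import re

def normalize_slash_prefix(text, known_intents):
    """Strip a stray leading '/' unless it introduces a genuine intent trigger.

    Rasa treats messages starting with '/' as direct intent triggers,
    optionally followed by a JSON entity payload. If the token right after
    the '/' is not a trained intent, treat the slash as noise and drop it
    so the message goes through normal NLU parsing instead.
    """
    match = re.match(r"^/\s*([A-Za-z_]\w*)", text)
    if match and match.group(1) in known_intents:
        return text  # genuine trigger, e.g. '/greet{"people": "tom"}'
    stripped = text.lstrip()
    if stripped.startswith("/"):
        return stripped[1:].lstrip()  # stray slash: remove it
    return text

# Case 1 is left untouched, case 3 is repaired:
normalize_slash_prefix('/greet{"people": "tom"}', {"greet"})
# -> '/greet{"people": "tom"}'
normalize_slash_prefix('/ Hi Tom, good afternoon', {"greet"})
# -> 'Hi Tom, good afternoon'
```

With the slash removed, case 3 would then be classified by the normal pipeline, matching the expected output above.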

Related

Entity Extraction fails for Sinhala Language

I'm trying chatbot development for the Sinhala language using Rasa NLU.
My config.yml
pipeline:
- name: "WhitespaceTokenizer"
- name: "CRFEntityExtractor"
- name: "EntitySynonymMapper"
- name: "CountVectorsFeaturizer"
- name: "EmbeddingIntentClassifier"
And in data.json I have added sample data as below.
When I train the NLU model and test a sample input where "සිංහලෙන්" should be extracted as the medium entity, the output contains only the intent and the entity value, not the entity name.
What am I doing wrong?
{
"text": "සිංහලෙන් දේශන පවත්වන්නේ නැද්ද?",
"intent": "ask_medium",
"entities": [{
"start":0,
"end":8,
"value": "සිංහලෙන්",
"entity": "medium"
}]
},
{
"text": "සිංහලෙන් lectures කරන්නේ නැද්ද?",
"intent": "ask_medium",
"entities": [{
"start":0,
"end":8,
"value": "සිංහලෙන්",
"entity": "medium"
}]
}
The response I get when testing the nlu model is
{'intent':
{'name': 'ask_langmedium', 'confidence': 0.9747527837753296}, 'entities':
[{'start': 10,
'end': 18,
'value': 'සිංහලෙන්',
'entity': '-',
'confidence': 0.5970129041418675,
'extractor': 'CRFEntityExtractor'}],
'intent_ranking': [
{'name': 'ask_langmedium', 'confidence': 0.9747527837753296},
{'name': 'ask_langmedium_request_possibility', 'confidence': 0.07433460652828217}],
'text': 'උගන්නන්නේ සිංහලෙන් ද ?'}
If this is your complete dataset, then I'm not sure how you were able to generate the model, because Rasa requires at least two intents. I added another intent ("hello"), replicated the rest of your data in my own code, and it worked out well; this is the output I got.
Enter a message: උගන්නන්නේ සිංහලෙන් ද?
{
"intent": {
"name": "ask_medium",
"confidence": 0.9638749361038208
},
"entities": [
{
"start": 10,
"end": 18,
"value": "\u0dc3\u0dd2\u0d82\u0dc4\u0dbd\u0dd9\u0db1\u0dca",
"entity": "medium",
"confidence": 0.7177257810884379,
"extractor": "CRFEntityExtractor"
}
]
}
This is my full Code
DataSet.json
{
"rasa_nlu_data": {
"common_examples": [
{
"text": "හෙලෝ",
"intent": "hello",
"entities": []
},
{
"text": "සිංහලෙන් දේශන පවත්වන්නේ නැද්ද?",
"intent": "ask_medium",
"entities": [{
"start":0,
"end":8,
"value": "සිංහලෙන්",
"entity": "medium"
}]
},
{
"text": "සිංහලෙන් lectures කරන්නේ නැද්ද?",
"intent": "ask_medium",
"entities": [{
"start":0,
"end":8,
"value": "සිංහලෙන්",
"entity": "medium"
}]
}
],
"regex_features" : [],
"lookup_tables" : [],
"entity_synonyms": []
}
}
nlu_config.yml
pipeline: "supervised_embeddings"
Training Command
python -m rasa_nlu.train -c ./config/nlu_config.yml --data ./data/sh_data.json -o models --fixed_model_name nlu --project current --verbose
testing.py
from rasa_nlu.model import Interpreter
import json

interpreter = Interpreter.load('./models/current/nlu')

def predict_intent(text):
    results = interpreter.parse(text)
    print(json.dumps({
        "intent": results["intent"],
        "entities": results["entities"]
    }, indent=2))

keep_asking = True
while keep_asking:
    text = input('Enter a message: ')
    if text == 'exit':
        keep_asking = False
        break
    else:
        predict_intent(text)

How to make LUIS respond with the matched entity

I am setting up a LUIS service for dutch.
I have this sentence:
Hi, ik ben igor -> meaning Hi, I'm igor
Here Hi is a simple entity called Hey, which can take several different values (hey, hello, ...) that I specified as a phrase list.
And Igor is a simple entity called Name.
In the dashboard I can see that Igor has been correctly mapped as a Name entity, but the retrieved result is the following:
{
"query": "Hi, ik ben igor",
"topScoringIntent": {
"intent": "Greeting",
"score": 0.462906122
},
"intents": [
{
"intent": "Greeting",
"score": 0.462906122
},
{
"intent": "None",
"score": 0.41605103
}
],
"entities": [
{
"entity": "hi",
"type": "Hey",
"startIndex": 0,
"endIndex": 1,
"score": 0.9947428
}
]
}
Is it possible to solve this? I do not want to make a phrase list of all the names that exist.
I managed to train LUIS to recognize even asdaasdasd:
{
"query": "Heey, ik ben asdaasdasd",
"topScoringIntent": {
"intent": "Greeting",
"score": 0.5320666
},
"intents": [
{
"intent": "Greeting",
"score": 0.5320666
},
{
"intent": "None",
"score": 0.236944184
}
],
"entities": [
{
"entity": "asdaasdasd",
"type": "Name",
"startIndex": 13,
"endIndex": 22,
"score": 0.8811139
}
]
}
To be honest, I don't have a great guide on how to do this, but here's what worked:
Add multiple example utterances with the entity positions labeled
I did this for about 5 utterances
No phrase list was necessary
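For reference, "example utterances with entity positions labeled" corresponds to labeled examples in the LUIS authoring API. Below is a minimal sketch (plain Python, no SDK) that only builds the JSON body for one such example; the field names (`intentName`, `entityLabels`, `startCharIndex`, `endCharIndex`) and the inclusive end index are assumptions to verify against the LUIS documentation, though they match the inclusive indices seen in the responses above:

```python
# Sketch: build the JSON body for one labeled example utterance, as used by
# the LUIS authoring API when adding labeled examples. Field names are
# assumptions to check against the current LUIS docs; indices are inclusive.
def labeled_utterance(text, intent, entities):
    return {
        "text": text,
        "intentName": intent,
        "entityLabels": [
            {"entityName": name, "startCharIndex": start, "endCharIndex": end}
            for name, start, end in entities
        ],
    }

# Label both the greeting token and the name in the Dutch example above.
example = labeled_utterance(
    "Heey, ik ben Igor",
    "Greeting",
    [("Hey", 0, 3), ("Name", 13, 16)],
)
```

Uploading a handful of examples like this, with the Name span labeled in different positions, is what lets LUIS generalize to unseen names such as "asdaasdasd".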
I'm going to accept this as an answer, but once someone explains in-depth and technically what is happening behind the covers, I will accept that answer.

Skype bot not showing response from webhook but shows correct result for embedded custom payload in api ai

Hello, I'm building a Skype bot using api.ai (or Dialogflow, as it is now called). Anyway, this is my custom payload:
{
"skype": {
"type": "",
"attachmentLayout": "",
"text": "",
"attachments": [
{
"contentType": "",
"content": {
"title": "",
"images": [
{
"url": ""
}
],
"buttons": [
{
"type": "",
"title": "",
"value": ""
}
]
}
}
]
}
}
And here is my webhook response:
{
"data": {
"skype": {
"type": "message",
"attachmentLayout": "carousel",
"text": "Here you go!",
"attachments": [
{
"contentType": "application/vnd.microsoft.card.hero",
"content": {
"title": "Italian Cassoulet (Italian Chili)",
"images": [
{
"url": "http://img.food.boxspace.in/image/rbk_57139479f2705/hdpi.jpg"
}
],
"buttons": [
{
"type": "openUrl",
"title": "View Recipe",
"value": "http://recipebk.com/Share.html#url=rbk_57139479f2705"
}
]
}
}
]
}
}
}
Now, if I embed this response I get the result as a carousel of cards on Skype, but when I try the same from my webhook, no message is displayed. Can someone tell me what I'm doing wrong? I've already checked this Stackoverflow question and this api.ai link, but it's been of no use so far.
Alright, so if I understand correctly: creating the response in the API.ai online console works, but when you generate the JSON from your webhook it fails?
Just for reference, it may be a bit difficult to test, but in the online console you can switch the "default response" panel (on the right, where you test your intents) to "skype". That way you can look at the error message at the bottom to see whether there's an error and why.
Now that that's cleared out of the way: even though the documentation says you should embed custom payloads from the webhook in the data field, I simply don't. I follow the exact same format API.ai generates for the response by overriding the messages field in the webhook response. As examples say more than words, I'll give you the full webhook response that creates a couple of lists of rich cards for one of my chatbot intents. As you'll notice, I put everything in the messages field of the JSON.
{
"speech": "",
"displayText": "",
"data": {
},
"contextOut": [
],
"source": "Webhook",
"messages": [
{
"type": 4,
"platform": "skype",
"speech": "",
"payload": {
"skype": {
"attachmentLayout": "list",
"attachments": [
{
"contentType": "application\/vnd.microsoft.card.hero",
"content": {
"title": "Unit 2A",
"subtitle": "",
"text": "These timeslots are available for 2017-10-16",
"images": [
],
"buttons": [
{
"type": "imBack",
"title": "from 13:00 until 14:00 Unit 2A",
"value": "from 13:00 until 14:00 Unit 2A"
},
{
"type": "imBack",
"title": "from 14:00 until 15:00 Unit 2A",
"value": "from 14:00 until 15:00 Unit 2A"
},
{
"type": "imBack",
"title": "from 15:00 until 16:00 Unit 2A",
"value": "from 15:00 until 16:00 Unit 2A"
}
]
}
},
{
"contentType": "application\/vnd.microsoft.card.hero",
"content": {
"title": "Unit 1",
"subtitle": "",
"text": "These timeslots are available for 2017-10-16",
"images": [
],
"buttons": [
{
"type": "imBack",
"title": "from 13:00 until 14:00 Unit 1",
"value": "from 13:00 until 14:00 Unit 1"
},
{
"type": "imBack",
"title": "from 14:00 until 15:00 Unit 1",
"value": "from 14:00 until 15:00 Unit 1"
},
{
"type": "imBack",
"title": "from 15:00 until 16:00 Unit 1",
"value": "from 15:00 until 16:00 Unit 1"
},
{
"type": "imBack",
"title": "from 16:00 until 17:00 Unit 1",
"value": "from 16:00 until 17:00 Unit 1"
}
]
}
}
]
}
}
}
]
}
Do note, however, that API.ai will simply override the messages this way and pass them along to Skype. For more information about rich cards you can read https://learn.microsoft.com/en-us/bot-framework/rest-api/bot-framework-rest-connector-add-rich-cards and use that JSON structure in your api.ai webhook.
I've given the full example because it's very difficult for me to test your setup from the way you've phrased your question; API.ai is also a black box in some cases, with undocumented features...

I'm using Sentiment on NLU, getting this error: "warnings": [ "sentiment: cannot locate keyphrase"

when I enter this request:
{
"text": "Il sindaco pensa solo a far realizzare rotonde...non lo disturbate per le cavolate! ,Che schifo!",
"features":
{
"sentiment": {
"targets": [
"aggressione", "aggressioni", "agguati", "agguato", "furto", "furti", "lavoro nero",
"omicidi", "omicidio", "rapina", "rapine", "ricettazione", "ricettazioni", "rom", "zingari", "zingaro",
"scippo", "scippi", "spaccio", "scommesse"
]
},
"categories": {},
"entities": {
"emotion": true,
"sentiment": true,
"limit": 5
},
"keywords": {
"emotion": true,
"sentiment": true,
"limit": 5
}
}
}
I get this response:
{
"language": "it",
"keywords": [
{
"text": ",Che schifo",
"relevance": 0.768142
}
],
"entities": [],
"categories": [
{
"score": 0.190673,
"label": "/law, govt and politics/law enforcement/police"
},
{
"score": 0.180499,
"label": "/style and fashion/clothing/pants"
},
{
"score": 0.160763,
"label": "/society/crime"
}
],
"warnings": [
"sentiment: cannot locate keyphrase"
]
}
Why don't I receive output for the document sentiment? If NLU does not find the key phrases, it returns this warning without any sentiment for the text! Is this an NLU error to fix?
If NLU does not find any of the key phrases you passed, it throws the warning "cannot locate keyphrase". It does return the document sentiment as long as at least one of the targets is present in the text.
If you are not sure whether the target phrases appear in your text, make a separate API call just for sentiment, without any targets, to retrieve the document sentiment.
I would not call it a bug on the NLU side, but the service could be lenient instead of strict when it does not find any target phrase in a given text.

Use a many to many relation in Elasticsearch

We currently have a problem performing a query in Elasticsearch (or, more precisely, designing a mapping) for a relational problem that we couldn't solve with our non-document-oriented SQL mindset.
We want to create a many-to-many relation between different Elasticsearch entries, so that we can edit an entry once and have every usage of it stay up to date.
To describe the problem, we'll use the following simple data model:
Broadcast        Content
------------     ------------
Id               Id
Day              Title
Contents []      Description
So we have two different types to index: broadcasts and contents.
A broadcast can have many contents, and a single content can also be part of different broadcasts (e.g. a repetition).
JSON like:
index/broadcasts
{
"Id": "B1",
"Day": "2014-10-15",
"Contents": [
"C1",
"C2"
]
}
{
"Id": "B2",
"Day": "2014-10-16",
"Contents": [
"C1",
"C3"
]
}
index/contents
{
"Id": "C1",
"Title": "Wake up",
"Description": "Morning show with Jeff Bridges"
}
{
"Id": "C2",
"Title": "Have a break!",
"Description": "Everything about Android"
}
{
"Id": "C3",
"Title": "Late Night Disaster",
"Description": "Comedy show"
}
Now we want to rename "Late Night Disaster" to something more precise and keep all references up to date.
How could we approach this? Are there further options in ES, like includes in RavenDB?
Nested objects and parent-child relations haven't helped us so far.
What about denormalizing? It seems difficult coming from an SQL mindset, but give it a try: even with millions of documents, Lucene indexing can help, and renaming can be a batch job.
[
{
"Id": "B1",
"Day": "2014-10-15",
"Contents": [
{
"Title": "Wake up",
"Description": "Morning show with Jeff Bridges"
},
{
"Title": "Have a break!",
"Description": "Everything about Android"
}
]
},
{
"Id": "B2",
"Day": "2014-10-16",
"Contents": [
{
"Title": "Wake up",
"Description": "Morning show with Jeff Bridges"
},
{
"Title": "Late Night Disaster",
"Description": "Comedy show"
}
]
}
]
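With the denormalized layout above, the batch rename can be done in one pass. In current Elasticsearch versions a natural fit is the `_update_by_query` API; the sketch below only builds the request body, and the Painless loop and field names assume the mapping shown above, so treat it as a starting point rather than a tested query:

```python
def rename_content_title(old_title, new_title):
    """Build an _update_by_query body that rewrites every embedded
    Contents.Title matching old_title across all broadcast documents."""
    return {
        "query": {"match_phrase": {"Contents.Title": old_title}},
        "script": {
            "lang": "painless",
            "source": (
                "for (def c : ctx._source.Contents) {"
                "  if (c.Title == params.old) { c.Title = params.new }"
                "}"
            ),
            "params": {"old": old_title, "new": new_title},
        },
    }

body = rename_content_title("Late Night Disaster", "Late Night Show")
```

The body would be POSTed to the broadcasts index's `_update_by_query` endpoint; the trade-off of denormalizing is exactly this kind of fan-out write in exchange for simple, fast reads.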
