Not able to add new intent in LUIS with ":" - azure-language-understanding

My whole application has intents whose names contain ":". But now, when I try to add a new intent, I get the error: "BadArgument: Intent and entity name cannot contain the character ":" or "$"".

Welcome to Stack Overflow. Unfortunately, these special characters should not be used in intent names because they are reserved for other uses. An example of how the colon is used is in entity roles in patterns. My recommendation is to rename your intents. I also recommend storing the intent names in your application as constant string resources so that the values can be easily changed.
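As a minimal sketch of the "constant string resources" advice, the snippet below keeps intent names in one frozen object and provides a helper that rewrites legacy names. The intent names, the helper names, and the choice of "_" as the replacement are all hypothetical; only the restriction on ":" and "$" comes from the error message in the question.

```javascript
// Hypothetical intent names; the originals are assumed to have looked like "Order:Create".
const INTENTS = Object.freeze({
  ORDER_CREATE: "Order_Create",
  ORDER_CANCEL: "Order_Cancel",
});

// Turn a legacy name into one without the reserved characters.
// "_" is an arbitrary replacement chosen for this sketch.
function toSafeIntentName(name) {
  return name.replace(/[:$]/g, "_");
}

// True when a name contains neither ":" nor "$".
function isValidIntentName(name) {
  return !/[:$]/.test(name);
}
```

With the names centralized like this, renaming an intent in LUIS only requires changing one value in the application.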

Related

How to fix different intent getting identified when input contains special characters

In my LUIS application I have a 'Greeting' intent. The intent identified for 'hi' is 'Greeting' but for 'hi.......' some other intent is identified.
After training the 'hi.......' as 'Greeting' it gets identified as 'Greeting' correctly. There are some other variants too with special characters which need to be trained to make it work.
How do I make LUIS identify these inputs as Greeting without training it on every variant with special characters?
This is being used in Microsoft Bot Framework v3 in C#.
You can either train your LUIS model with all possible variations that include special characters or you can strip out all of the special characters before you send it to LUIS. I would recommend the latter. Here is an example of how you would do that in Node.
turnContext.activity.text = turnContext.activity.text.replace(/[^a-zA-Z ]/g, "");
Hope this helps!
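Note that the one-line regex above keeps only ASCII letters and spaces, so it also removes digits and accented letters. A slightly more forgiving sketch is shown below. It assumes a Node version that supports Unicode property escapes in regular expressions, and the function name normalizeUtterance is made up for this example.

```javascript
// Strip punctuation and symbols but keep letters (any script), digits, and spaces.
function normalizeUtterance(text) {
  return text
    .replace(/[^\p{L}\p{N}\s]/gu, "") // drop everything except letters, digits, whitespace
    .replace(/\s+/g, " ")             // collapse runs of whitespace into one space
    .trim();
}
```

It could be applied in the same place as the original snippet, for example turnContext.activity.text = normalizeUtterance(turnContext.activity.text); before the text is sent to LUIS.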

Rules for field names in ElasticSearch 6?

Currently all I can find online is:
must not start with underscore "_"
must not contain comma ","
must not contain hash mark "#"
usage of point "." is discouraged but possible
field names must not be longer than 255
But it seems that these are the rules for ElasticSearch 5 and older versions.
I did some experiments and found:
using dots (.) may result in various kinds of errors, e.g. illegal_state_exception, array_index_out_of_bounds_exception, but sometimes it's legal
empty strings are not allowed (illegal_argument_exception)
leading underscores, commas, hash marks seem to be legal in ElasticSearch 6
field names can be longer than 255 (but perhaps there's a new limit?)
I wonder whether there's an official document for this? Am I just being blind?
We are currently planning an upgrade from 5.6.5 to 6.2.x.
I'm looking for evidence to support the worrying comment "...as underscores in field names will not be allowed" mentioned in Breaking Changes for Watcher in 6.0.0-alpha2.
I've been unable to find any additional evidence that underscores are now verboten. I'll open a support case referencing this question to get an official response on this.

How to use a DN containing commas as the attribute value in an LDAP search filter?

I was attempting to search our directory based on an attribute whose value is a DN. However, our user RDNs are of the form CN=Surname, GivenName, which requires that the comma be quoted in the full DN. But given an attribute like manager, whose value is the DN of another user, I was unable to search for all users having a specific manager. I tried (manager=CN=Surname\, GivenName,CN=users,DC=mydomain,DC=com), but got a syntax error "Bad search filter". I tried various options for quoting the DN, but all either gave me a syntax error or failed to match any objects. What am I doing wrong?
(Note that if I were looking for user objects directly, I could search for simply (CN=Surname, GivenName), with no quoting required, but I was searching for users having a specific manager. The comma-containing attribute value only becomes a problem when part of a Distinguished Name.)
The problem is that quoting the comma in the Common Name is not for the benefit of the filter parser, but for the benefit of the DN parser; the attribute value passed to that by the filter has to literally contain the backslash character. Unfortunately, the backslash is also (differently) special in LDAP filters, thus the syntax errors.
The solution is simple, but it isn't as obvious as doubling the backslash; backslash in LDAP filters works like % in URIs, so you have to use a literal backslash followed by the 2-digit hexadecimal code point for a backslash:
(manager=CN=Surname\5c, Givenname,OU=org,DC=mydomain,DC=com)
It turns out there's an example of this specific use case at the very bottom of https://docs.oracle.com/cd/E19424-01/820-4811/gdxpo/index.html#6ng8i269q.
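The same escaping can be done programmatically. The sketch below applies the hex-escape rule described above to the characters that are special in LDAP filter values (backslash, asterisk, parentheses, and NUL). The function name escapeLdapFilterValue is hypothetical, and the DN is the example from the question.

```javascript
// Escape a value for use inside an LDAP search filter:
// each special character becomes a backslash plus its 2-digit hex code.
function escapeLdapFilterValue(value) {
  return value.replace(/[\\*()\0]/g, (ch) =>
    "\\" + ch.charCodeAt(0).toString(16).padStart(2, "0")
  );
}

// The DN as the DN parser needs to see it: the comma in the CN is quoted
// with a literal backslash (written as "\\" in a JavaScript string).
const managerDn = "CN=Surname\\, GivenName,CN=users,DC=mydomain,DC=com";

// The filter parser then needs that backslash written as \5c.
const filter = "(manager=" + escapeLdapFilterValue(managerDn) + ")";
```

Escaping happens in two layers here: first the DN is quoted for the DN parser, then the result is escaped for the filter parser.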

LUIS issues with special characters

(TEXT) is converted to ( TEXT ) in LUIS when an entity is identified, which causes issues with special characters.
Here "monthly iq dashboard (hospitalists)" is returned as reportname --> "monthly iq dashboard ( hospitalists )" in Entities. So when we use this entity in Bot Framework, we run into problems comparing it with the actual report name stored in our metadata (database).
The issue you reported seems to be that whitespace is added around some special characters. I reproduced the issue on my side, and I found that similar issues have been reported by others:
LUIS inserts whitespace in utterances when punctuation present causing entity getting incorrectly parsed
LUIS cannot take care of special characters
when we use this entity in bot framework we are facing issues while comparing to actual report name stored in Metadata (database)
To solve it, as Nicolas R and NiteLordz mentioned in comments, you can try to handle that in your code. And to remove whitespace from ( hospitalists ), the following regex would be helpful.
Regex regex = new Regex(@"\(\s\w*\s\)");
input = regex.Replace(input, m => m.Value.Replace(" ", ""));
Note: I can reproduce the issue, and the same issue appears when processing something like a URL that contains "/", "." and so on.
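For bots written in Node, a comparable sketch is shown below. It removes the spaces LUIS inserts just inside parentheses, and it adds a looser comparison that ignores all whitespace and letter case, which may help with the URL case mentioned in the note. Both function names are hypothetical, and ignoring whitespace entirely is an assumption that may be too permissive for some data.

```javascript
// Remove whitespace directly after "(" and directly before ")".
function tightenParentheses(text) {
  return text.replace(/\(\s+/g, "(").replace(/\s+\)/g, ")");
}

// Compare two strings while ignoring all whitespace and letter case.
function sameIgnoringWhitespace(a, b) {
  const squash = (s) => s.replace(/\s+/g, "").toLowerCase();
  return squash(a) === squash(b);
}
```

The first helper restores the stored form of the report name; the second avoids rewriting the entity at all and only relaxes the comparison against the database value.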

LUIS separates words in German without a logical rule

Does LUIS have a solution for German compound words like "SALAMIPIZZA" (Engl.: salami pizza)?
The German instance separates words without any logical rule. "Apfelsaft" (Engl.: apple juice), for example, is not separated, but "SALAMIPIZZA" is.
You can use a list entity to account for this. For example, create a list entity named "Foods", then create a canonical form called "SALAMIPIZZA", and in that canonical form's list, enter "SALAMIPIZZA" and "SALAMI PIZZA". This would also allow you to account for other spellings, such as "SALAMI-PIZZA".
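To illustrate what the list entity does conceptually, the sketch below maps several spellings to one canonical form. This is a plain in-memory model written for this example; it is not the LUIS API or the LUIS export format, and the names foods and resolveCanonical are hypothetical.

```javascript
// Hypothetical model of a list entity: canonical form -> accepted spellings.
const foods = {
  SALAMIPIZZA: ["SALAMIPIZZA", "SALAMI PIZZA", "SALAMI-PIZZA"],
};

// Return the canonical form for a spelling, or null when nothing matches.
function resolveCanonical(listEntity, text) {
  const needle = text.trim().toUpperCase();
  for (const [canonical, synonyms] of Object.entries(listEntity)) {
    if (synonyms.includes(needle)) return canonical;
  }
  return null;
}
```

Whichever spelling the user types, the application only ever has to handle the canonical form.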
