URI Search fails with parse_exception when query string contains slash - elasticsearch

I get an error message when I include a slash in the query string.
The query looks like this:
"query_string": {
"query": "usr0\\/7\\/0\\/20",
"default_field": "logmsg"
"analyzer": "keyword"
}
My document looks like this:
{
  "_index" : "logstash-log-2016.11.03",
  "_type" : "log",
  "_id" : "AVgpFuqyvHnB4OYqM9QE",
  "_score" : 2.2499034,
  "_source" : {
    "message" : "#<SNMP::SNMP_Trap:0x5383e289 @request_id=63766, @error_index=0, @error_status=0, @value=#<SNMP::TimeTicks:0x3cbfc0fd @value=2033549672>>,blablabla>",
    "@timestamp" : "2016-11-03T07:28:37.177Z",
    "type" : "usrinfo",
    "logmsg" : "DISMAN-EVENT-MIB::sysUpTimeInstance:235 days, 08:44:56.72,SNMPv2-MIB::snmpTrapOID:IF-MIB::linkUp,IF-MIB::ifIndex.132:132,IF-MIB::ifDescr.132:usr0/7/0/20,IF-MIB::ifType.132:6,CISCO-IF-EXTENSION-MIB::cieIfStateChangeReason.132:up",
    "error_status" : "0"
  }
}
I want to get the documents whose logmsg field contains the keyword "usr0/7/0/20", but I get no hits back.
This occurs with Elasticsearch 2.3.5.

The backslash escapes the forward slash, but you also need to escape the backslash itself, like this:
{
  "query": {
    "query_string": {
      "query": "user0\\/0\\/0\\/2",
      "default_field": "name"
    }
  }
}
However, this will not work if your goal is to search for the token user0/0/0/2 in your message field. You either need to use a term query or add "analyzer": "keyword" to your query_string query; otherwise user0/0/0/2 will get tokenized into user0, 0, and 2.
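For example, a term query bypasses query-time analysis entirely, so the slashes need no escaping at all. A minimal sketch, assuming logmsg is indexed in a way that preserves usr0/7/0/20 as a single token (e.g. a not_analyzed field):
{
  "query": {
    "term": {
      "logmsg": "usr0/7/0/20"
    }
  }
}
If logmsg goes through the standard analyzer at index time, no query-side trick will help, because the token usr0/7/0/20 was never indexed in one piece.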

Related

ElasticSearch results are inaccurate

My current query is:
GET /index/_search
{
  "query": {
    "simple_query_string": {
      "query": "(\"cheaper+than,+therapy\")",
      "analyzer": "standard",
      "flags": "OR|AND",
      "fields": ["name"]
    }
  }
}
My main problem is that at the moment this still finds matches like "GOLF . . . CHEAPER THAN THERAPY". I don't want matches like this. I want to fix the odd typo and normalize the search query, but I don't want to broaden it. So the trademarks "GOLF . . . CHEAPER THAN THERAPY" and "RUNNING IS: CHEAPER THAN THERAPY" should not be results.
The results should only show entries that are almost the same as my search query.
I tried fuzziness and similar options, but they did not help.
The name field is a text field.
I expect the following results:
CHEAPER THAN THERAPY
CHEAPER THAN, THERAPY
I don't expect the following results:
GOLF . . . CHEAPER THAN THERAPY
"CHEAPER THAN THERAPY" MOORENKO'S
SHOPPING IS CHEAPER THAN THERAPY!
RUNNING IS: CHEAPER THAN THERAPY
CHEAPER THAN THERAPY AND WAY MORE FUN!
What do I have to do to get more accurate results?
You can use a fuzzy query on the keyword field.
The standard analyzer is the default analyzer, used when none is specified. It provides grammar-based tokenization; basically, it breaks text into a number of tokens.
So when you use simple_query_string, it just checks whether any document contains the tokens ["CHEAPER", "THAN", "THERAPY"].
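You can see that tokenization for yourself with the _analyze API (a quick sketch; no index is needed for a built-in analyzer):
POST _analyze
{
  "analyzer": "standard",
  "text": "GOLF . . . CHEAPER THAN THERAPY"
}
This returns the tokens [golf, cheaper, than, therapy], which is why documents like "GOLF . . . CHEAPER THAN THERAPY" match the token-based query.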
You can instead use a fuzzy query on text.keyword, which matches against the whole string:
{
  "query": {
    "fuzzy": {
      "text.keyword": {
        "value": "CHEAPER THAN THERAPY",
        "fuzziness": "AUTO"
      }
    }
  }
}
Result
[
  {
    "_index" : "index129",
    "_type" : "_doc",
    "_id" : "pnXJM3oBX7bKb5rQ30Vb",
    "_score" : 1.6739764,
    "_source" : {
      "text" : "CHEAPER THAN THERAPY"
    }
  },
  {
    "_index" : "index129",
    "_type" : "_doc",
    "_id" : "p3XJM3oBX7bKb5rQ60UT",
    "_score" : 1.5902774,
    "_source" : {
      "text" : "CHEAPER THAN, THERAPY"
    }
  }
]

ElasticSearch - Fuzzy search in list elements

I've got some documents stored in ElasticSearch like this:
{
  "tag" : ["tag1", "tag2", "tag3"]
  ...
}
I want to search through the "tag" field. I know that It should work with a query like:
{
  "query": {
    "match" : { "tag" : "tag1" }
  }
}
But, I don't want to use a match, I want to use a fuzzy search through the list, for example, something like:
{
  "query": {
    "fuzzy" : { "tag" : "tagg1" }
  }
}
The problem is, the above query doesn't return anything. What should I use instead?
What is the type of the tag field in your Elasticsearch mapping?
I have tried with the following type for the tag field, on Elasticsearch 7.2:
"tag" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
And it works well for me.
The fuzzy query would be:
{
  "query": {
    "fuzzy": { "tag" : "tagg1" }
  }
}
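For reference, a minimal reproduction under the mapping above (the index name tags-test is made up):
PUT tags-test/_doc/1
{
  "tag" : ["tag1", "tag2", "tag3"]
}

GET tags-test/_search
{
  "query": {
    "fuzzy": { "tag": "tagg1" }
  }
}
fuzzy is a term-level query, so it runs against the individual indexed tokens of the text field; "tagg1" is within edit distance 1 of the token "tag1", so the document is returned.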

Favor exact matches over nGram in elasticsearch

I am trying to map a field as nGram and 'exact' match, and make the exact matches appear first in the search results. This is an answer to a similar question, but I am struggling to make it work.
No matter what boost value I specify for the 'exact' field I get the same results order each time. This is how my field mapping looks:
"name" : {
"type" : "multi_field",
"fields" : {
"name" : {
"type" : "string",
"boost" : 2.0,
"analyzer" : "ngram"
},
"exact" : {
"type" : "string",
"boost" : 4.0,
"analyzer" : "simple",
"include_in_all" : false
}
}
}
And this is what the query looks like:
{
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "fields": ["name", "name.exact"],
          "query": "Woods"
        }
      }
    }
  }
}
Understanding how the score is calculated
Elasticsearch has an option for producing an explanation with every search result, by setting the explain parameter to true:
POST <Index>/<Type>/_search?explain&format=yaml
{
  "query" : " ....."
}
This produces a lot of output for every hit, which can be overwhelming, but it is worth taking some time to understand what it all means.
The output of explain can be hard to read as JSON, so adding format=yaml makes it easier to read.
Understanding why a document is matched or not
You can run the query against a specific document, as shown below, to see an explanation of how the matching is done:
GET <Index>/<type>/<id>/_explain
{
  "query": "....."
}
The multi_field mapping is correct, but the search query needs to be changed like this:
{
  "query": {
    "filtered": {
      "query": {
        "multi_match": { # changed from "query_string"
          "fields": ["name", "name.exact"],
          "query": "Woods",
          # added this so the engine does a "sum of" instead of a "max of"
          # this is deprecated in the latest versions but works with 0.x
          "use_dis_max": false
        }
      }
    }
  }
}
Now the results take the 'exact' match into account and add its contribution to the score.
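On recent Elasticsearch versions, where use_dis_max has been removed, the equivalent "sum of field scores" behaviour comes from setting the multi_match type to most_fields. A sketch (the filtered wrapper is likewise gone in modern versions):
{
  "query": {
    "multi_match": {
      "query": "Woods",
      "fields": ["name", "name.exact"],
      "type": "most_fields"
    }
  }
}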

Elasticsearch search for words containing the '#' character

For example, I am right now searching like this:
http://localhost:9200/posts/post/_search?q=content:%23sachin
But, I am getting all the results with 'sachin' and not '#sachin'. Also, I am writing a regular expression for getting the count of terms. The facet looks like this:
"facets": {
"content": {
"terms": {
"field": "content",
"size": 1000,
"all_terms": false,
"regex": "#sachin",
"regex_flags": [
"DOTALL",
"CASE_INSENSITIVE"
]
}
}
}
This is not returning any values. I think it has something to do with escaping the '#' inside the regular expression, but I am not sure how to do it. I have tried escaping it with \ and \\, but that did not work. Can anyone help me in this regard?
This article gives information on how to preserve # and @ using custom analyzers:
https://web.archive.org/web/20160304014858/http://www.fullscale.co/blog/2013/03/04/preserving_specific_characters_during_tokenizing_in_elasticsearch.html
curl -XPUT 'http://localhost:9200/twitter' -d '{
  "settings" : {
    "index" : {
      "number_of_shards" : 1,
      "number_of_replicas" : 1
    },
    "analysis" : {
      "filter" : {
        "tweet_filter" : {
          "type" : "word_delimiter",
          "type_table": ["# => ALPHA", "@ => ALPHA"]
        }
      },
      "analyzer" : {
        "tweet_analyzer" : {
          "type" : "custom",
          "tokenizer" : "whitespace",
          "filter" : ["lowercase", "tweet_filter"]
        }
      }
    }
  },
  "mappings" : {
    "tweet" : {
      "properties" : {
        "msg" : {
          "type" : "string",
          "analyzer" : "tweet_analyzer"
        }
      }
    }
  }
}'
This doesn't deal with facets, but redefining the type of those special characters in the analyzer could help.
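If you go this route, you can check that the analyzer keeps the hash sign with the _analyze API. A sketch in the same curl style (older versions take the text in the request body like this; newer ones expect a JSON body):
curl -XGET 'http://localhost:9200/twitter/_analyze?analyzer=tweet_analyzer' -d '#sachin is batting'
The response should list #sachin as a single token rather than sachin.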
Another approach worth considering is to index a special (e.g. "reserved") word instead of the hash symbol, for example HASHSYMBOLCHAR. Make sure you replace '#' characters in the query in the same way.
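One way to implement that substitution at index time is a mapping character filter. A sketch (the index, filter, and analyzer names here are illustrative, and the same analysis must be applied on the query side):
curl -XPUT 'http://localhost:9200/posts_reserved' -d '{
  "settings" : {
    "analysis" : {
      "char_filter" : {
        "hash_to_word" : {
          "type" : "mapping",
          "mappings" : ["# => HASHSYMBOLCHAR"]
        }
      },
      "analyzer" : {
        "reserved_word_analyzer" : {
          "type" : "custom",
          "char_filter" : ["hash_to_word"],
          "tokenizer" : "standard",
          "filter" : ["lowercase"]
        }
      }
    }
  }
}'
With this, "#sachin" is indexed as the single token hashsymbolcharsachin, and a query run through the same analyzer will match it.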

How should I query Elastic Search given my mapping and using keywords?

I have a very simple mapping which looks like this (I streamlined the example a bit):
{
  "location" : {
    "properties": {
      "name": { "type": "string", "boost": 2.0, "analyzer": "snowball" },
      "description": { "type": "string", "analyzer": "snowball" }
    }
  }
}
Now I index a lot of locations using some random values which are based on real English words.
I'd like to be able to search for locations that match any of the given keywords in either the name or the description field (name is more important, hence the boost I gave it). I tried a few different queries and they don't return any results.
{
  "fields" : ["name", "description"],
  "query" : {
    "terms" : {
      "name" : ["savage"],
      "description" : ["savage"]
    }
  },
  "from" : 0,
  "size" : 500
}
Considering there are locations which have the word savaged in the description, it should get me some results (savage is the stem of savaged). It yields 0 results using the above query. I've been using curl to query ES:
curl -XGET -d #query.json http://localhost:9200/myindex/locations/_search
If I use query string instead:
curl -XGET http://localhost:9200/fieldtripfinder/locations/_search?q=description:savage
I actually get one result (of course now it would be searching the description field only).
Basically, I am looking for a query that will do an OR kind of search using multiple keywords, comparing them to the values in both the name and the description fields.
Snowball stems "savage" into "savag", which is why the term "savage" didn't return any results. However, when you specify "savage" in the URL, it gets analyzed, and you get results. Depending on what your intention is, you can either use the correct stem ("savag") or have your terms analyzed by using a "match" query instead of "terms":
{
  "fields" : ["name", "description"],
  "query" : {
    "bool" : {
      "should" : [
        { "match" : { "name" : "savage" } },
        { "match" : { "description" : "savage" } }
      ]
    }
  },
  "from" : 0,
  "size" : 500
}
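To verify the stemming, you can run the words through the analyzer directly. A sketch in the same curl style (modern versions expect a JSON body for _analyze instead of query parameters):
curl -XGET 'http://localhost:9200/_analyze?analyzer=snowball' -d 'savage savaged'
Both words come back as the single token savag, which is why the exact term "savage" finds nothing while an analyzed query matches.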
