ELK bool query with match and prefix - elasticsearch

I'm new to ELK. I have a problem with the following search query:
curl --insecure -H "Authorization: ApiKey $ESAPIKEY" -X GET "https://localhost:9200/commsrch/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"should" : [
{"match" : {"cn" : "franc"}},
{"prefix" : {"srt" : "99889300200"}}
]
}
}
}
'
I need to find all documents that satisfy the condition: either field "cn" contains "franc", or field "srt" starts with "99889300200".
Index mapping:
{
"commsrch" : {
"mappings" : {
"properties" : {
"addr" : {
"type" : "text",
"index" : false
},
"cn" : {
"type" : "text",
"analyzer" : "compname"
},
"srn" : {
"type" : "text",
"analyzer" : "srnsrt"
},
"srt" : {
"type" : "text",
"analyzer" : "srnsrt"
}
}
}
}
}
Index settings:
{
"commsrch" : {
"settings" : {
"index" : {
"routing" : {
"allocation" : {
"include" : {
"_tier_preference" : "data_content"
}
}
},
"number_of_shards" : "1",
"provided_name" : "commsrch",
"creation_date" : "1675079141160",
"analysis" : {
"filter" : {
"ngram_filter" : {
"type" : "ngram",
"min_gram" : "3",
"max_gram" : "4"
}
},
"analyzer" : {
"compname" : {
"filter" : [
"lowercase",
"stop",
"ngram_filter"
],
"type" : "custom",
"tokenizer" : "whitespace"
},
"srnsrt" : {
"type" : "custom",
"tokenizer" : "standard"
}
}
},
"number_of_replicas" : "1",
"uuid" : "C15EXHnaTIq88JSYNt7GvA",
"version" : {
"created" : "8060099"
}
}
}
}
}
The query works properly with only one condition. If the query has only the "match" condition, the result has the proper document count, and the same is true if it has only the "prefix" condition.
With both the "match" and the "prefix" conditions together, the result contains only documents that match the "prefix" condition.
I can't find anything in the Elasticsearch docs about a limitation on mixing "prefix" and "match", but there seems to be a problem somewhere. Please help me find where the problem is.
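One way to narrow this down is to check which tokens the compname analyzer actually produces, since the ngram_filter (min_gram 3, max_gram 4) rewrites both the indexed values and the match query text. A minimal diagnostic sketch using the _analyze API (the sample text is only an illustration):
curl --insecure -H "Authorization: ApiKey $ESAPIKEY" -X GET "https://localhost:9200/commsrch/_analyze?pretty" -H 'Content-Type: application/json' -d'
{
  "analyzer": "compname",
  "text": "franc"
}
'
It can also help to look at hits.total in the combined query's response rather than only the returned documents: by default only the 10 highest-scoring hits are shown, so if the prefix matches score higher they can crowd the match hits off the first page.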

Continuing my experiments, I have one more problem.
Example:
Source data:
1st document, cn field: "put stone is done"
2nd document, cn field: "job one or two"
The mapping and index settings are the same as described above.
Request:
{
"query": {
"bool": {
"should" : [
{"match" : {"cn" : "one"}},
{"prefix" : {"cn" : "one"}}
]
}
}
}
As I understand it, the first document got the higher score because it has more repeats of "one". But I need high scores for documents that have at least one word in field "cn" starting with the string "one". I have experimented with this query:
{
"query": {
"bool": {
"should": [
{"match": {"cn": "one"}},
{
"constant_score": {
"filter": {
"prefix": {
"cn": "one"
}
},
"boost": 100
}
}
]
}
}
}
But it doesn't work properly. What's wrong with my query?
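One thing that may explain this: with the ngram_filter (min_gram 3, max_gram 4), words such as "stone" and "done" also emit the trigram "one", so the prefix clause matches the first document too, and the bool should query then adds the match score on top of the constant_score boost for both documents, leaving the first one ahead. A hedged way to verify how each clause contributes is to run the same request with "explain": true:
{
  "explain": true,
  "query": {
    "bool": {
      "should": [
        {"match": {"cn": "one"}},
        {
          "constant_score": {
            "filter": {"prefix": {"cn": "one"}},
            "boost": 100
          }
        }
      ]
    }
  }
}
Each hit in the response then carries an _explanation object that breaks the score down per clause.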

Related

Query to find all docs that match exact terms on all the fields in the query

I have a simple doc structure as follows.
{
"did" : "1",
"uid" : "user1",
"mid" : "pc-linux1",
"path" : "/tmp/path1"
}
I need to query Elasticsearch so that it matches all the fields exactly:
GET index2/_search
{
"query": {
"bool":{
"must": [
{
"term" : { "uid" : "user1"}
},
{
"term" : { "mid" : "pc-linux1"}
},
{
"term" : { "did" : "1"}
},
{
"term" : { "path" : "/tmp/path1"}
}
]
}
}
}
The matching should happen without any kind of Elasticsearch 'analysis' on the values, so that "/tmp/path1" is matched as a full term.
I tried to use a custom mapping with
"index" : false
which does not work.
PUT /index2?include_type_name=true
{
"mappings" : {
"_doc": {
"properties" : {
"did" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"index" : false,
"ignore_above" : 256
}
}
},
"mid" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"index" : false,
"ignore_above" : 256
}
}
},
"path" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"index" : false,
"ignore_above" : 256
}
}
},
"uid" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"index" : false,
"ignore_above" : 256
}
}
}
}
}
}
}
I am using Elasticsearch 7.0, and a few posts suggest a custom mapping with
"index" : "not_analyzed"
but that is not accepted as a valid mapping in Elasticsearch 7.0.
Any suggestions?
If you want to match exact terms, try this query:
GET index2/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"uid": "user1"
}
},
{
"match": {
"mid": "pc-linux1"
}
},
{
"match": {
"did": "1"
}
},
{
"match": {
"path": "/tmp/path1"
}
}
]
}
}
}
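Note that these match queries still run the input through each field's analyzer. If genuinely un-analyzed, exact-value matching is required, a hedged alternative is to keep the keyword subfields indexed (drop "index" : false from them in the mapping) and point term queries at those subfields; assuming the default subfield name keyword, the query would look roughly like this:
GET index2/_search
{
  "query": {
    "bool": {
      "must": [
        { "term": { "uid.keyword": "user1" } },
        { "term": { "mid.keyword": "pc-linux1" } },
        { "term": { "did.keyword": "1" } },
        { "term": { "path.keyword": "/tmp/path1" } }
      ]
    }
  }
}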

Elastic Search error - variable [relevancy] is not defined

I am trying to query my products ElasticSearch index and create a script_score but I keep receiving the error Variable [relevancy] is not defined.
I tried replacing the script with just a number, then with Math.log(_score) to make sure the script_score was working properly and the math function is ok, and both queries executed as expected. I also tried doc['relevancy'].value and received the same error.
My query is:
curl -X GET "localhost:9200/products/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"function_score": {
"query": {
"multi_match" : {
"query": "KQ",
"fields": [ "item_id", "extended_desc", "mfg_part_no" ]
}
},
"script_score" : {
"script": "Math.log(_score) + Math.log(doc['relevancy'])"
},
"boost_mode": "replace"
}
}
}
'
And the mapping for this index is:
{
"products" : {
"mappings" : {
"properties" : {
"#timestamp" : {
"type" : "date"
},
"#version" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"extended_desc" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"frecno" : {
"type" : "long"
},
"item_id" : {
"type" : "text",
"analyzer" : "my_analyzer"
},
"mfg_part_no" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"relevancy" : {
"type" : "long"
}
}
}
}
}
The fix is to replace ' with \u0027 inside the script, because the whole request body is wrapped in single quotes for curl:
curl -X GET "localhost:9200/products/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"function_score": {
"query": {
"multi_match" : {
"query": "KQ",
"fields": [ "item_id", "extended_desc", "mfg_part_no" ]
}
},
"script_score" : {
"script": "Math.log(_score) + Math.log(doc[\u0027relevancy\u0027].value)"
},
"boost_mode": "replace"
}
}
}
'
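An alternative, if the \u0027 escaping feels fragile, is to keep the request body in a file and let curl read it with --data-binary, so the single quotes inside the script need no escaping at all (the file name here is just an example):
curl -X GET "localhost:9200/products/_search?pretty" -H 'Content-Type: application/json' --data-binary "@script_score_query.json"
where script_score_query.json contains the same body with "script": "Math.log(_score) + Math.log(doc['relevancy'].value)" written with plain single quotes.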

Elasticsearch multi_match + nested search

I am trying to execute a multi_match + nested search in ElasticSearch 6.4. I have the following mappings:
"name" : {
"type" : "text"
},
"status" : {
"type" : "short"
},
"user" : {
"type" : "nested",
"properties" : {
"first_name" : {
"type" : "text"
},
"last_name" : {
"type" : "text"
},
"pk" : {
"type" : "integer"
},
"profile" : {
"type" : "nested",
"properties" : {
"location" : {
"type" : "nested",
"properties" : {
"name" : {
"type" : "text",
"analyzer" : "html_strip"
}
}
}
}
}
}
},
And this is the html_strip analyzer:
"html_strip" : {
"filter" : [
"lowercase",
"stop",
"snowball"
],
"char_filter" : [
"html_strip"
],
"type" : "custom",
"tokenizer" : "standard"
}
And my current query is this one:
"query": {
"bool": {
"must": {
"multi_match": {
"query": 'Paris',
"fields": ['name', 'user.profile.location.name']
},
},
"filter": {
"term": {
"status": 1
}
}
}
}
Obviously searching for "Paris" in user.profile.location.name doesn't work. I was trying to adapt my code following this answer https://stackoverflow.com/a/48836012/12007123 but without any success.
What I am basically trying to achieve is to be able to search for a value across multiple fields, which may or may not be nested.
I was also checking this discussion https://discuss.elastic.co/t/multi-match-query-string-with-nested-and-non-nested-fields/118652/5 but nothing I tried was successful.
If I just search on name, the search works fine.
Any tips on how I can achieve this the right way would be much appreciated.
EDIT:
While I didn't get an answer to my initial question, I followed Nikolay's (@nikolay-vasiliev) comment and changed the mappings to object instead of nested.
At least now I am able to search in user.profile.location.name. This is what the new mapping for user looks like:
"user" : {
"properties" : {
"first_name" : {
"type" : "text"
},
"last_name" : {
"type" : "text"
},
"pk" : {
"type" : "integer"
},
"profile" : {
"properties" : {
"location" : {
"properties" : {
"name" : {
"type" : "text",
"analyzer" : "html_strip"
}
}
}
}
}
}
},
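For reference, with the original nested mappings it is usually possible to combine a query on the top-level field with a nested query in one bool, chaining a nested query for each level of the path; a rough sketch, not tested against this exact mapping:
"query": {
  "bool": {
    "should": [
      { "multi_match": { "query": "Paris", "fields": ["name"] } },
      {
        "nested": {
          "path": "user",
          "query": {
            "nested": {
              "path": "user.profile",
              "query": {
                "nested": {
                  "path": "user.profile.location",
                  "query": { "match": { "user.profile.location.name": "Paris" } }
                }
              }
            }
          }
        }
      }
    ],
    "minimum_should_match": 1,
    "filter": { "term": { "status": 1 } }
  }
}
The explicit "minimum_should_match": 1 matters here because the bool query also has a filter clause, which would otherwise make the should clauses optional.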

How to highlight ngram tokens in a word using elastic search

I would like to highlight just the ngrams which match, not the whole word.
Example:
term: "Wo"
highlight should be: "<em>Wo</em>nderfull world!"
currently it is: "<em>Wonderfull</em> world!"
Mapping is:
{
"global_search_1495732922733" : {
"mappings" : {
"meeting" : {
"properties" : {
...
"name" : {
"type" : "text",
"analyzer" : "meeteor_index_analyzer",
"search_analyzer" : "meeteor_search_term_analyzer"
},
...
}
}
}
}
}
Analyzers are:
"analysis" : {
"filter" : {
"meeteor_stemmer" : {
"name" : "english",
"type" : "stemmer"
},
"meeteor_ngram" : {
"type" : "nGram",
"min_gram" : "2",
"max_gram" : "15"
}
},
"analyzer" : {
"meeteor_search_term_analyzer" : {
"filter" : [
"lowercase",
"asciifolding"
],
"tokenizer" : "standard"
},
"meeteor_index_analyzer" : {
"filter" : [
"lowercase",
"asciifolding",
"meeteor_ngram"
],
"tokenizer" : "standard"
},
"meeteor_project_id_analyzer" : {
"tokenizer" : "standard"
}
}
},
Concrete example:
curl -XGET 'localhost:9200/global_search/meeting/_search?pretty' -H 'Content-Type: application/json' -d'
{
"query": {
"match": {
"name": "Me"
}
},
"highlight":{
"fields": {
"name": {}
}
}
}
'
The result is:
"...highlight" : {
"name" : [
"Sad <em>Meeting</em>"
]
}
The correct way to achieve what you want is to use ngram as a tokenizer, not as a filter. You can do something like this:
"analysis" : {
"filter" : {
"meeteor_stemmer" : {
"name" : "english",
"type" : "stemmer"
}
},
"tokenizer" : {
"meeteor_ngram_tokenizer" : {
"type" : "nGram",
"min_gram" : "2",
"max_gram" : "15"
}
},
"analyzer" : {
"meeteor_search_term_analyzer" : {
"filter" : [
"lowercase",
"asciifolding"
],
"tokenizer" : "standard"
},
"meeteor_index_analyzer" : {
"filter" : [
"lowercase",
"asciifolding"
],
"tokenizer" : "meeteor_ngram_tokenizer"
},
"meeteor_project_id_analyzer" : {
"tokenizer" : "standard"
}
}
},
It will generate the highlighting by ngram for you like this:
"...highlight" : {
"name" : [
"Sad <em>Me</em>eting"
]
}
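Keep in mind that analyzer changes only apply to documents indexed after the change, so the existing data has to be reindexed; a minimal sketch, assuming the new settings have been applied to a freshly created index named global_search_ngram (the name is just an example):
POST _reindex
{
  "source": { "index": "global_search_1495732922733" },
  "dest": { "index": "global_search_ngram" }
}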

Using Email tokenizer in elasticsearch

I tried some examples from the Elasticsearch documentation and from Google, but nothing helped me figure this out.
The sample data I have is just a few blog posts. I am trying to find all posts with a given email address. When I use "email": "someone" I see all the posts matching someone, but when I change it to someone@gmail.com nothing shows up!
"hits": [
{
"_index": "blog",
"_type": "post",
"_id": "2",
"_score": 1,
"_source": {
"user": "sreenath",
"email": "someone#gmail.com",
"postDate": "2011-12-12",
"body": "Trying to figure out this",
"title": "Elastic search testing"
}
}
]
When I use the GET query shown below, I see all posts matching someone@anything.com. But I want to change
{ "term" : { "email" : "someone" }} to { "term" : { "email" : "someone@gmail.com" }}
GET blog/post/_search
{
"query" : {
"filtered" : {
"filter" : {
"and" : [
{ "term" :
{ "email" : "someone" }
}
]
}
}
}
}
I did the following curl -XPUT, but it did not help:
curl -XPUT localhost:9200/test/ -d '
{
"settings" : {
"analysis" : {
"filter" : {
"email" : {
"type" : "pattern_capture",
"preserve_original" : 1,
"patterns" : [
"([^#]+)",
"(\\p{L}+)",
"(\\d+)",
"#(.+)"
]
}
},
"analyzer" : {
"email" : {
"tokenizer" : "uax_url_email",
"filter" : [ "email", "lowercase", "unique" ]
}
}
}
}
}
'
You have created a custom analyzer for email addresses but you are not using it. You need to declare the email field in your mapping type to actually use that analyzer, like below. Also make sure to create the right index with that analyzer, i.e. blog and not test:
change this
|
v
curl -XPUT localhost:9200/blog/ -d '{
"settings" : {
"analysis" : {
"filter" : {
"email" : {
"type" : "pattern_capture",
"preserve_original" : 1,
"patterns" : [
"([^#]+)",
"(\\p{L}+)",
"(\\d+)",
"#(.+)"
]
}
},
"analyzer" : {
"email" : {
"tokenizer" : "uax_url_email",
"filter" : [ "email", "lowercase", "unique" ]
}
}
}
},
"mappings": { <--- add this
"post": {
"properties": {
"email": {
"type": "string",
"analyzer": "email"
}
}
}
}
}
'
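Once the blog index is created this way, a quick hedged check that the email analyzer splits addresses as intended is the _analyze API (query-string form, which fits the older Elasticsearch version implied by the filtered query above):
curl -XGET 'localhost:9200/blog/_analyze?analyzer=email&text=someone@gmail.com&pretty'
With the pattern_capture filter above, the token list should include the full lowercased address alongside pieces such as someone and gmail.com, which is what allows a term query on someone@gmail.com to match.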
