Why does elastic search wildcard query return no results? - elasticsearch

Query #1 in Kibana returns results, however Query #2 returns no results. I search for only "bob" and get results, but when searching for "bob smith", no results, even though "Bob Smith" exists in the index. Any reason why?
Query #1: returns results
GET people/_search
{
"query": {
"wildcard" : {
"name" : "*bob*"
}
}
}
Results:
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 2,
"successful" : 2,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 23,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "people",
"_type" : "_doc",
"_id" : "xxxxx",
"_score" : 1.0,
"_source" : {
"name" : "Bob Smith",
...
Query #2: returns nothing.. why(?)
GET people/_search
{
"query": {
"wildcard" : {
"name" : "*bob* *smith*"
}
}
}
results...nothing
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 2,
"successful" : 2,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
}
}

It looks like the empty result is caused by your index mapping. If the field is of type "text", you are actually searching the inverted index: with the standard analyzer, "Bob Smith" is stored as the separate tokens "bob" and "smith", and a wildcard pattern is matched against each token individually, never against the whole string "Bob Smith". A pattern that contains a space, such as "*bob* *smith*", therefore cannot match any single token. If you want to search "Bob Smith" as one token, use the "keyword" type (possibly with a lowercase normalizer if you want case-insensitive search).
For example:
PUT test
{
"settings": {
"analysis": {
"normalizer": {
"lowercase_normalizer": {
"type": "custom",
"char_filter": [],
"filter": [
"lowercase",
"asciifolding"
]
}
}
}
},
"mappings": {
"properties": {
"name": {
"type": "keyword",
"ignore_above": 256,
"normalizer": "lowercase_normalizer"
}
}
}
}
PUT test/_doc/1
{
"name" : "Bob Smith"
}
GET test/_search
{
"query": {
"wildcard": {
"name": "*bob* *Smith*"
}
}
}
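If changing the mapping and reindexing is not an option, the following is a minimal sketch of an alternative. It assumes that name stays a "text" field analyzed with the standard analyzer (the question does not show the actual mapping). Each wildcard clause is matched against the individual tokens, and the bool query requires both clauses to match:
GET people/_search
{
  "query": {
    "bool": {
      "must": [
        { "wildcard": { "name": "*bob*" } },
        { "wildcard": { "name": "*smith*" } }
      ]
    }
  }
}
Note that this only requires both patterns to match some token in the field. It does not enforce word order or adjacency, so a document named "Smith Bobby" would match as well.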

Related

Highlight multi_match doesn't take last term

When I search for multiple keywords, the last term is not highlighted in the result.
This is the index and mapping:
PUT objects
{
"mappings": {
"properties": {
"title": {
"type": "search_as_you_type"
}
}
}
}
And this is my search:
// query
GET objects/_search
{
"query": {
"multi_match": {
"query": "Goldenen Vlies",
"type": "bool_prefix",
"fields": [
"title",
"title._2gram",
"title._3gram",
"title._index_prefix"
]
}
},
"highlight": {
"fields": {
"title": {}
}
},
"_source": false
}
The output I get is the following:
{
"took" : 1,
"timed_out" : false,
"_shards" : {...},
"hits" : {
"total" : {
"value" : 23,
"relation" : "eq"
},
"max_score" : 7.628418,
"hits" : [
{
"_index" : "objects",
"_id" : "AWj1tIEBIysZ6sOt9vqw",
"_score" : 7.628418,
"highlight" : {
"title" : [
"Schwurkreuz des Ordens vom <em>Goldenen</em> Vlies" <-------
]
}
}
]
}
}
However, this would be the expected/desired output:
{
"took" : 1,
"timed_out" : false,
"_shards" : {...},
"hits" : {
"total" : {
"value" : 23,
"relation" : "eq"
},
"max_score" : 7.628418,
"hits" : [
{
"_index" : "objects",
"_id" : "AWj1tIEBIysZ6sOt9vqw",
"_score" : 7.628418,
"highlight" : {
"title" : [
"Schwurkreuz des Ordens vom <em>Goldenen</em> <em>Vlies</em>" <-------
]
}
}
]
}
}
It does work as expected when I add an extra trailing space to the query, like so: "query": "Goldenen Vlies ", but I want to know whether there is a better solution.
Try it with the "best_fields" type instead:
{
"query": {
"multi_match": {
"query": "Goldenen Vlies",
"type": "best_fields",
"fields": [
"title",
"title._2gram",
"title._3gram",
"title._index_prefix"
]
}
},
"highlight": {
"fields": {
"title": {}
}
},
"_source": false
}

Query consecutive words using match_phrase elasticsearch works unexpected

I have the parameter name as a text:
{
"mappings": {
"_doc": {
"properties": {
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
Because of the nature of the text type in Elasticsearch, a match query matches on every word in the phrase. That's why in some cases I get results like the following:
POST /example-tags/_search
{
"query": {
"match": {
"name": "Jordan Rudess was born in 1956"
}
}
}
// Results
{
"took" : 28,
"timed_out" : false,
"_shards" : {
"total" : 2,
"successful" : 2,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 2.1596613,
"hits" : [
{
"_index" : "example-tags",
"_type" : "_doc",
"_id" : "6101e538bc8ec610aff699e4",
"_score" : 4.1596613,
"_source" : {
"name" : "Jordan Rudess"
}
},
{
"_index" : "example-tags",
"_type" : "_doc",
"_id" : "610123538bc8ec61034ff699e4",
"_score" : 4.1796613,
"_source" : {
"name" : "Alice in Chains"
}
}
]
}
}
As you can see, for the text Jordan Rudess was born in 1956 I get the result Alice in Chains just because of the word in. I want to avoid this behaviour.
If I try:
POST /example-tags/_search
{
"query": {
"match_phrase": {
"name": "Dream Theater keyboardist's Jordan Rudess was born in 1956"
}
}
}
// Results
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
}
}
So, in the previous example I was expecting to get the Jordan Rudess tag name, but I get empty results.
I need to get the maximum occurrences in tag.name of consecutive words in a phrase. How can I achieve that?

elasticsearch does not return expected returns

I'm completely new to Elasticsearch. I tried the search API, but it's not returning what I expected.
What I did
POST /test/_doc/1
{
"name": "Hello World"
}
GET /test/_doc/1
Response:
{
"_index" : "test",
"_type" : "_doc",
"_id" : "1",
"_version" : 5,
"_seq_no" : 28,
"_primary_term" : 1,
"found" : true,
"_source" : {
"name" : "Hello World"
}
}
GET /test/_mapping
Response:
{
"test" : {
"mappings" : {
"properties" : {
"name" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"query" : {
"properties" : {
"term" : {
"properties" : {
"name" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
}
}
}
}
}
}
}
GET /test/_search
{
"query": {
"term": {
"name": "Hello"
}
}
}
Response:
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
}
}
GET /test/_search
{
"query": {
"term": {
"name": "Hello World"
}
}
}
Response:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
}
}
My Elasticsearch version is 7.3.2.
The last two searches should return document 1, is that correct? Why do they match nothing?
The problem is that you are using term queries. Term queries are not analysed, so Hello does not match the term hello stored in your index. Note the case difference. Likewise, Hello World matches nothing, because the text field stores the separate tokens hello and world rather than the whole string.
From the Elasticsearch documentation on term-level queries:
Unlike full-text queries, term-level queries do not analyze search terms. Instead, term-level queries match the exact terms stored in a field.
Match queries, by contrast, also analyse the search term:
{
"query": {
"match": {
"name": "Hello"
}
}
}
You can use _analyze to check how your terms are indexed.
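As a minimal sketch of that check, the request below asks Elasticsearch to analyze the sample text with whatever analyzer the name field of the test index uses. With the mapping shown in the question (a text field with no explicit analyzer, so the standard analyzer applies), the tokens come back as hello and world, which is why the term query for Hello finds nothing:
GET /test/_analyze
{
  "field": "name",
  "text": "Hello World"
}
If an exact, unanalyzed match is what you want, the mapping in the question also has a name.keyword sub-field that stores the original string as a single term. A term query against it, sketched here under that assumption, should match document 1:
GET /test/_search
{
  "query": {
    "term": {
      "name.keyword": "Hello World"
    }
  }
}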

Nested boolean aggregation in elastic

I have json payloads as such
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 61,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "myindex",
"_type" : "_doc",
"_id" : "CAojVWwBO8H0jj7a_j3P",
"_score" : 1.0,
"_source" : {
"appName" : "BigApp",
"appVer" : "1.0",
"reviews" : {
"reviewer" : {
"value" : "Bob"
},
"testsPass" : [
{
"name" : "unit",
"pass" : false
},
{
"name" : "integraton",
"pass" : false
},
{
"name" : "ui",
"pass" : false
}
]
}
}
}
]
}
}
In elastic I want to aggregate the boolean values under testsPass to return true if all of the pass values are true.
I am new to Elastic and struggling to write a query in that shape, can someone please help?
So far I have tried nested aggregators but can't get the syntax right.
Looking at your data, I'm assuming the structure of your mapping is as follows:
Mapping:
PUT myindex
{
"mappings": {
"properties": {
"appName":{
"type": "keyword"
},
"appVer": {
"type": "keyword"
},
"reviews":{
"properties": {
"reviewer":{
"properties":{
"value": {
"type": "keyword"
}
}
},
"testsPass":{
"type": "nested"
}
}
}
}
}
}
Sample Documents:
POST myindex/_doc/1
{
"appName":"BigApp",
"appVer":"1.0",
"reviews":{
"reviewer":{
"value":"Bob"
},
"testsPass":[
{
"name":"unit",
"pass":false
},
{
"name":"integraton",
"pass":false
},
{
"name":"ui",
"pass":false
}
]
}
}
POST myindex/_doc/2
{
"appName":"MidApp",
"appVer":"1.0",
"reviews":{
"reviewer":{
"value":"Bob"
},
"testsPass":[
{
"name":"unit",
"pass":true
},
{
"name":"integraton",
"pass":true
},
{
"name":"ui",
"pass":true
}
]
}
}
POST myindex/_doc/3
{
"appName":"SmallApp",
"appVer":"1.0",
"reviews":{
"reviewer":{
"value":"Bob"
},
"testsPass":[
{
"name":"unit",
"pass":true
},
{
"name":"integraton",
"pass":true
},
{
"name":"ui",
"pass":false
}
]
}
}
Note that of the documents above, only the second one (appName: MidApp) has all pass values set to true.
Aggregation Query:
POST myindex/_search
{
"size":0,
"aggs":{
"pass_reviewers":{
"filter":{
"bool":{
"must":[
{
"nested":{
"path":"reviews.testsPass",
"query":{
"match":{
"reviews.testsPass.pass":"true"
}
}
}
}
],
"must_not":[
{
"nested":{
"path":"reviews.testsPass",
"query":{
"match":{
"reviews.testsPass.pass":"false"
}
}
}
}
]
}
},
"aggs":{
"myhits":{
"top_hits":{
"size":10
}
}
}
}
}
}
Note that the above returns only the matching document, as the result of the Top Hits aggregation. The main aggregation here is in the filter section, which is simply a Filter Aggregation.
Response:
{
"took" : 7,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"pass_reviewers" : {
"doc_count" : 1, <------ Note this. Returns count of docs. This is result of filtered aggregation
"myhits" : { <------ Start of top hits aggregation
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "myindex",
"_type" : "_doc",
"_id" : "2", <----- Document
"_score" : 1.0,
"_source" : {
"appName" : "MidApp",
"appVer" : "1.0",
"reviews" : {
"reviewer" : {
"value" : "Bob"
},
"testsPass" : [
{
"name" : "unit",
"pass" : true
},
{
"name" : "integraton",
"pass" : true
},
{
"name" : "ui",
"pass" : true
}
]
}
}
}
]
}
}
}
}
}
If you just want the query to return the documents where all values are true, without necessarily using an aggregation, you can simply use the query below:
Query:
POST myindex/_search
{
"query":{
"bool":{
"must":[
{
"nested":{
"path":"reviews.testsPass",
"query":{
"match":{
"reviews.testsPass.pass":"true"
}
}
}
}
],
"must_not":[
{
"nested":{
"path":"reviews.testsPass",
"query":{
"match":{
"reviews.testsPass.pass":"false"
}
}
}
}
]
}
}
}
The core execution logic is the same in both queries; I've just narrowed it down to the logic you are looking for.
Response:
{
"took" : 5,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 0.597837,
"hits" : [
{
"_index" : "myindex",
"_type" : "_doc",
"_id" : "2",
"_score" : 0.597837,
"_source" : {
"appName" : "MidApp",
"appVer" : "1.0",
"reviews" : {
"reviewer" : {
"value" : "Bob"
},
"testsPass" : [
{
"name" : "unit",
"pass" : true
},
{
"name" : "integraton",
"pass" : true
},
{
"name" : "ui",
"pass" : true
}
]
}
}
}
]
}
}
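As a hedged variant of the query above: since pass is a boolean and relevance scoring is not needed for a pure yes/no condition, the same logic can be sketched with term queries in filter context. This assumes that reviews.testsPass.pass was dynamically mapped as a boolean, which is what the sample documents would normally produce:
POST myindex/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "nested": {
            "path": "reviews.testsPass",
            "query": {
              "term": { "reviews.testsPass.pass": true }
            }
          }
        }
      ],
      "must_not": [
        {
          "nested": {
            "path": "reviews.testsPass",
            "query": {
              "term": { "reviews.testsPass.pass": false }
            }
          }
        }
      ]
    }
  }
}
The filter clause keeps documents that have at least one passing test, and the must_not clause drops any document that has at least one failing test, so only documents whose tests all pass remain.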
Hope this helps!

enabled fielddata on text field in ElasticSearch but aggregation is not working

According to the documentation you can run ElasticSearch aggregations on fields that are type keyword or not a text field or which have fielddata set to true in the index mapping.
I am trying to count city_names in an nginx log. It works fine with the int field result. But it does not work with the field city_name, even after I updated the index mapping to set fielddata=true. That should not have been required, as it was of type keyword.
By "does not work" I mean that I get:
"aggregations" : {
"cities" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [ ]
}
}
Here is the field mapping:
"city_name" : {
"type" : "text",
"fielddata" : true
},
And here is the aggregation query:
curl -XGET --user $pwd --header 'Content-Type: application/json' https://58571402f5464923883e7be42a037917.eu-central-1.aws.cloud.es.io:9243/logstash/_search?pretty -d '{
"aggs" : {
"cities": {
"terms" : { "field": "city_name"}
}
}
}'
If you don't get any error when executing your search, it is more likely a problem with the data. Are you sure you have at least one document with the field city_name filled?
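One way to check this, sketched here against the logstash index name taken from the curl command in the question, is to count the documents in which city_name has a value by using an exists query:
GET logstash/_count
{
  "query": {
    "exists": {
      "field": "city_name"
    }
  }
}
A count of 0 would mean that no document in the index has a value for city_name, which would explain the empty buckets.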
I tried to reproduce your issue with ElasticSearch 6.6.2.
I created an index
PUT cities
{
"mappings": {
"city": {
"dynamic": "true",
"properties": {
"id": {
"type": "long"
},
"city_name": {
"type": "text",
"fielddata": true
}
}
}
}
}
I added one document without the city_name
PUT cities/city/1
{
"id": "1"
}
When I performed the search:
GET cities/_search
{
"aggs": {
"cities": {
"terms" : { "field": "city_name"}
}
}
}
I got no buckets in the cities aggregation. But when I added one document with the city name filled:
PUT cities/city/2
{
"id": "2",
"city_name": "London"
}
I got the expected result:
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 2,
"max_score" : 1.0,
"hits" : [
{
"_index" : "cities",
"_type" : "city",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"id" : "2",
"city_name" : "london"
}
},
{
"_index" : "cities",
"_type" : "city",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"id" : "1"
}
}
]
},
"aggregations" : {
"cities" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "london",
"doc_count" : 1
}
]
}
}
}
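Note that the bucket key is the analyzed token (london), not the original value. If you want the buckets to keep the original strings and would rather not enable fielddata, a common alternative is a keyword sub-field. The following is only a sketch: the index name cities_v2 is hypothetical, and the mapping uses the typeless syntax of Elasticsearch 7.x, so on 6.x the properties would have to sit under a mapping type as in the example above:
PUT cities_v2
{
  "mappings": {
    "properties": {
      "city_name": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    }
  }
}
GET cities_v2/_search
{
  "size": 0,
  "aggs": {
    "cities": {
      "terms": { "field": "city_name.keyword" }
    }
  }
}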

Resources