Elasticsearch query string

Why doesn't the second query below return the same kind of result as the third one? What am I doing wrong?
I run this query:
{
"size": 20,
"track_total_hits": false,
"_source": [
"title"
],
"query": {
"bool": {
"must": [
{
"query_string": {
"query": "63 ",
"default_field": "title",
"type": "phrase_prefix"
}
}
]
}
}
}
and get this result:
{
"hits": {
"max_score": 13.483224,
"hits": [
{
"_index": "products_2022_11_3_17_30_44_56920",
"_type": "_doc",
"_id": "19637",
"_score": 13.483224,
"_source": {
"title": "Заднее стекло 6302BGNE"
}
}
]
}
}
All right. After this I type one more character:
"query": "63 2"
and get an empty result:
"hits" : {
"max_score" : null,
"hits" : [ ]
}
}
Then I add one more character:
"query": "63 21"
and get a non-empty result again:
{
"hits": [
{
"_index": "products_2022_11_3_17_30_44_56920",
"_type": "_doc",
"_id": "105863",
"_score": 440.54578,
"_source": {
"title": "Лампа накаливания 63 21 0 151 620 BMW"
}
}
]
}
Index mapping:
"title": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
GET products/_settings
{
"products_2022_11_7_8_57_7_118045" : {
"settings" : {
"index" : {
"routing" : {
"allocation" : {
"include" : {
"_tier_preference" : "data_content"
}
}
},
"number_of_shards" : "1",
"provided_name" : "products_2022_11_7_8_57_7_118045",
"creation_date" : "1667800627119",
"number_of_replicas" : "0",
"uuid" : "GV6-5tzQQPavncFUcvq9NA",
"version" : {
"created" : "7170299"
}
}
}
}
}

It looks like the analyzer on your title field is creating tokens in such a way that they don't match your search term.
I used the standard analyzer for the title field and indexed the sample documents you showed, and all three queries return results. For example:
{
"size": 20,
"track_total_hits": true,
"_source": [
"title"
],
"query": {
"bool": {
"must": [
{
"query_string": {
"query": "63 2",
"default_field": "title",
"type": "phrase_prefix"
}
}
]
}
}
}
Search Result
"hits": [
{
"_index": "74308224",
"_type": "_doc",
"_id": "2",
"_score": 1.1689311,
"_source": {
"title": "Лампа накаливания 63 21 0 151 620 BMW"
}
}
]
Sharing your index mapping and settings would help identify why you are not getting the expected result.
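To make the analyzer point concrete, here is a minimal Python sketch of phrase-prefix matching. It is only a toy model: `tokenize` is a rough stand-in for the standard analyzer (lowercase, split on non-word characters), and `phrase_prefix_match` is a hypothetical helper, not an Elasticsearch API. It treats every query token except the last as an exact term and the last one as a prefix.

```python
import re

def tokenize(text):
    # Rough stand-in for the standard analyzer (assumption):
    # lowercase and split on anything that is not a letter, digit or underscore.
    return re.findall(r"\w+", text.lower())

def phrase_prefix_match(query, title):
    # All query tokens but the last must match consecutive title tokens exactly;
    # the last query token only has to be a prefix of the next title token.
    q = tokenize(query)
    t = tokenize(title)
    if not q:
        return False
    n = len(q)
    for i in range(len(t) - n + 1):
        window = t[i:i + n]
        if window[:-1] == q[:-1] and window[-1].startswith(q[-1]):
            return True
    return False

titles = [
    "Заднее стекло 6302BGNE",
    "Лампа накаливания 63 21 0 151 620 BMW",
]

for query in ["63 ", "63 2", "63 21"]:
    matches = [t for t in titles if phrase_prefix_match(query, t)]
    print(repr(query), "->", matches)
```

Under these assumptions each of the three queries matches at least one title, which agrees with the test above. The sketch ignores the cap that real phrase-prefix queries place on how many terms the trailing prefix may expand to, and any custom analyzer on the real index would change the tokens.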

Related

How to get specific items from nested object in elastic search

I've prepared an Elasticsearch query in which I'm trying to fetch results from nested objects. The query looks something like this:
{
"from": 0,
"size": 100,
"_source": {
"excludes": [
"#version"
]
},
"query": {
"bool": {
"must": [
{
"term": {
"doc.workflow_id.keyword": "workflow1"
}
},
{
"nested": {
"path": "doc.attributes",
"query": {
"bool": {
"filter": [
{
"match": {
"doc.attributes.name": "color"
}
},
{
"bool": {
"should": [
{
"wildcard": {
"doc.attributes.value.rawold": "*green*"
}
}
]
}
}
]
}
}
}
},
{
"nested": {
"path": "doc.attributes",
"query": {
"bool": {
"filter": [
{
"match": {
"doc.attributes.name": "price"
}
},
{
"bool": {
"should": [
{
"wildcard": {
"doc.attributes.value.rawold": "*34*"
}
}
]
}
}
]
}
}
}
}
],
"must_not": []
}
}
}
Output:
"hits" : [
{
"_index" : "sample_index",
"_type" : "_doc",
"_id" : "mv1",
"_score" : null,
"_source" : {
"doc" : {
"workflow_id" : "workflow1",
"attributes" : [
{
"name" : "price",
"value" : "34"
},
{
"name" : "weight",
"value" : "10"
},
{
"name" : "color",
"value" : "green"
},
{
"name" : "city",
"value" : "#error"
}
]
}
}
},
{
"_index" : "sample_index",
"_type" : "_doc",
"_id" : "mv2",
"_score" : null,
"_source" : {
"doc" : {
"workflow_id" : "workflow1",
"attributes" : [
{
"name" : "price",
"value" : "34"
},
{
"name" : "color",
"value" : "green"
}
]
}
}
}
]
I've omitted a few trivial details in query and output for simplicity. The attributes array in the response is of type nested and contains name and value fields of type string.
I've put filters on attributes color and price, but as you can see, I'm getting other attributes too in the attributes array. Can I somehow pass specific attribute names to the ES query and get the value of those attributes only?
I tried using inner_hits in both nested queries, but it returns the attribute value only for the passed attribute name in the nested query.
E.g.
{
"nested": {
"path": "doc.attributes",
"query": {
"bool": {
"filter": [
{
"match": {
"doc.attributes.name": "color"
}
},
{
"bool": {
"should": [
{
"wildcard": {
"doc.attributes.value.rawold": "*green*"
}
}
]
}
}
]
}
},
"inner_hits": {
"name": "two",
"_source": [
"doc.product_attributes.name",
"doc.product_attributes.value"
]
}
}
}
gives this result:
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": null,
"hits": [
{
"_index": "sample_index",
"_type": "_doc",
"_id": "mv1",
"_score": null,
"_source": {
"doc": {
"workflow_id": "workflow1",
"attributes": [
{
"name": "price",
"value": "34"
},
{
"name": "weight",
"value": "34"
},
{
"name": "color",
"value": "green"
},
{
"name": "city",
"value": "#ERROR"
}
]
}
},
"inner_hits": {
"two": {
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 0.0,
"hits": [
{
"_index": "sample_index",
"_type": "_doc",
"_id": "mv1",
"_nested": {
"field": "doc.attributes",
"offset": 1
},
"_score": 0.0,
"_source": {
"name": "color",
"value": "green"
}
}
]
}
}
}
},
{
"_index": "sample_index",
"_type": "_doc",
"_id": "mv2",
"_score": null,
"_source": {
"doc": {
"workflow_id": "workflow1",
"attributes": [
{
"name": "price",
"value": "34"
},
{
"name": "color",
"value": "green"
}
]
}
},
"inner_hits": {
"two": {
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 0.0,
"hits": [
{
"_index": "sample_index",
"_type": "_doc",
"_id": "mv1",
"_nested": {
"field": "doc.attributes",
"offset": 1
},
"_score": 0.0,
"_source": {
"name": "color",
"value": "green"
}
}
]
}
}
}
}
]
}
Note the attribute name and value received inside the inner_hits object.
I also want the response to include the names and values of attributes on which I'm not putting any filter. For example, if I want the attribute names and values for weight, color and city only, how do I do that?
I've checked the thread "select matching objects from array in elasticsearch", but it doesn't solve my problem.
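The thread does not include an answer to this question. One possible workaround, sketched here under assumptions, is to trim the attributes array on the client after the search returns. `pick_attributes` is a hypothetical helper, and the hit is a trimmed copy of the first hit shown above.

```python
def pick_attributes(hit, wanted_names):
    """Return only the nested attributes whose name is in wanted_names."""
    attributes = hit["_source"]["doc"]["attributes"]
    return [a for a in attributes if a["name"] in wanted_names]

# Trimmed copy of the first hit from the output above.
hit = {
    "_id": "mv1",
    "_source": {
        "doc": {
            "workflow_id": "workflow1",
            "attributes": [
                {"name": "price", "value": "34"},
                {"name": "weight", "value": "10"},
                {"name": "color", "value": "green"},
                {"name": "city", "value": "#error"},
            ],
        }
    },
}

selected = pick_attributes(hit, {"weight", "color", "city"})
print(selected)
```

This keeps the query unchanged and costs only the extra bytes of the full attributes array in each hit. Whether that trade-off is acceptable depends on how large the arrays are.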

Two filters (RANGE) in different fields in elasticsearch

I am a beginner in Elasticsearch and I want the query below to work with two filters, each a range on a different field, but only the first range works.
This filter works as expected:
"range" : {"pgrk" : { "gte" : 1, "lte" : 10} }
Could someone tell me why this second filter below doesn't work?
"should" : {
"range" : {"url_length" : { "lte" : 100 } }
My full query with the two filters follows:
{
"from" : 0, "size" : 10,
"sort" : [
{ "pgrk" : {"order" : "desc"} },
{ "url_length" : {"order" : "asc"} }
],
"query": {
"bool": {
"must": {
"multi_match" : {
"query": "netflix",
"type": "cross_fields",
"fields": [ "titulo", "descricao", "url" ],
"operator": "and"
}
},
"filter": {
"range" : {"pgrk" : { "gte" : 1, "lte" : 10} }
},
"should" : {
"range" : {"url_length" : { "lte" : 100 } }
}
}
}
}
I'm not sure what your requirements are, since the index mapping and sample documents are not provided, but I created my own mapping and sample documents to show how to put multiple range queries in filter context.
Please comment if the results don't match your requirements, so that I can modify the answer.
Index definition
{
"mappings": {
"properties": {
"title": {
"type": "text"
},
"url": {
"type": "keyword"
},
"pgrk": {
"type": "integer"
},
"url_length": {
"type": "integer"
}
}
}
}
Index sample docs
{
"title": "netflix",
"url" : "www.netflix.com", --> this shouldn't match as `pgrk > 10`
"pgrk": 12,
"url_length" : 50
}
{
"title": "Netflix", --> this should match both filters
"url" : "www.netflix.com",
"pgrk": 8,
"url_length" : 50
}
{
"title": "Netflix", --> this should match both filters
"url" : "www.netflix",
"pgrk": 5,
"url_length" : 50
}
{
"title": "netflix",
"url" : "www.netflix",
"pgrk": 5,
"url_length" : 80 --> note pgrk is 5, the same as the previous doc, but url_length differs
}
Search query
{
"from": 0,
"size": 10,
"sort": [
{
"pgrk": {
"order": "desc"
}
},
{
"url_length": {
"order": "asc"
}
}
],
"query": {
"bool": {
"must": {
"multi_match": {
"query": "netflix",
"type": "cross_fields",
"fields": [
"title",
"url"
],
"operator": "and"
}
},
"filter": [ --> note filter array to have multiple range queries in filter context
{
"range": {
"pgrk": {
"gte": 1,
"lte" : 10
}
}
},
{
"range": {
"url_length": {
"lte": 100
}
}
}
]
}
}
}
And the search result, which returns only three docs (two of them have the same pgrk value):
"hits": [
{
"_index": "so_range",
"_type": "_doc",
"_id": "1",
"_score": null,
"_source": {
"title": "netflix",
"url": "www.netflix.com",
"pgrk": 8,
"url_length": 50
},
"sort": [
8,
50
]
},
{
"_index": "so_range",
"_type": "_doc",
"_id": "3",
"_score": null,
"_source": {
"title": "netflix",
"url": "www.netflix",
"pgrk": 5,
"url_length": 50
},
"sort": [
5,
50
]
},
{
"_index": "so_range",
"_type": "_doc",
"_id": "4",
"_score": null,
"_source": {
"title": "netflix",
"url": "www.netflix",
"pgrk": 5,
"url_length": 80
},
"sort": [
5,
80
]
}
]
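The behaviour above can be imitated in a few lines of Python. This is a simplified model rather than Elasticsearch itself: every range in the filter array must hold for a document to be kept, and the survivors are sorted by pgrk descending, then url_length ascending. In a bool query that also has must or filter clauses, a should clause is optional by default and only affects the score, which is why the original should range did not exclude anything. The document ids and the `in_range` and `search` helpers are made up for illustration.

```python
# The four sample documents from the answer, with made-up ids.
docs = [
    {"id": "a", "pgrk": 12, "url_length": 50},
    {"id": "b", "pgrk": 8, "url_length": 50},
    {"id": "c", "pgrk": 5, "url_length": 50},
    {"id": "d", "pgrk": 5, "url_length": 80},
]

def in_range(value, gte=None, lte=None):
    # Mirrors the inclusive gte / lte bounds of a range query.
    if gte is not None and value < gte:
        return False
    if lte is not None and value > lte:
        return False
    return True

def search(docs, filters):
    # Every filter must pass (logical AND), like the clauses of a filter array.
    kept = [
        d for d in docs
        if all(in_range(d[field], **bounds) for field, bounds in filters)
    ]
    # Sort by pgrk descending, then url_length ascending.
    return sorted(kept, key=lambda d: (-d["pgrk"], d["url_length"]))

filters = [
    ("pgrk", {"gte": 1, "lte": 10}),
    ("url_length", {"lte": 100}),
]

result = search(docs, filters)
for d in result:
    print(d["id"], [d["pgrk"], d["url_length"]])
```

The document with pgrk 12 is dropped by the first range, and the two documents that share pgrk 5 are ordered by url_length, matching the sort values in the result above.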

ElasticSearch: nested items count in search results

I have the following mapping:
{
"test_index" : {
"mappings" : {
"test_type" : {
"properties" : {
"field1" : {
"type" : "string"
},
"field2" : {
"type" : "string"
},
"items" : {
"type" : "nested",
"properties" : {
"nested_field1" : {
"type" : "string"
},
"nested_field2" : {
"type" : "string"
}
}
}
}
}
}
}
}
With the search results, I want to get the total number of nested items inside the results structure:
{
"hits": {
"total": 2,
"max_score": 1.0,
"hits": [
{
"_index": "test_index",
"_type": "test_type",
"_id": "AWfAc79wljtimCd5JZlJ",
"_score": 1.0,
"_source": {
"field1": "Some string 1",
"field2": "Some string 2",
"items": [
{
"nested_field1": "Some val1",
"nested_field2": "Some val2"
}
],
"totalItems": 1
}
},
{
"_index": "test_index",
"_type": "test_type",
"_id": "AZxfc79dtrt878xx",
"_score": 1.0,
"_source": {
"field1": "Some string 3",
"field2": "Some string 4",
"items": [
{
"nested_field1": "Some val3",
"nested_field2": "Some val4"
},
{
"nested_field1": "Some val5",
"nested_field2": "Some val6"
}
],
"totalItems": 2
}
}
]
}
}
Can I achieve this via aggregations?
Since you had the great idea to also store the totalItems field at the root level, you can simply sum that field to get the number of nested items:
{
"query": {
"match_all": {}
},
"aggs": {
"total_items": {
"sum": {
"field": "totalItems"
}
}
}
}
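The sum aggregation only works if totalItems is actually indexed as a numeric field on each document. If it is not, the same numbers can be derived on the client from the returned hits. The following Python sketch assumes the hits look like the ones in the question; it is an illustration, not part of the original answer.

```python
# Hits shaped like the ones in the question, reduced to the items array.
hits = [
    {"_source": {"items": [
        {"nested_field1": "Some val1", "nested_field2": "Some val2"},
    ]}},
    {"_source": {"items": [
        {"nested_field1": "Some val3", "nested_field2": "Some val4"},
        {"nested_field1": "Some val5", "nested_field2": "Some val6"},
    ]}},
]

# Per-document count, i.e. what totalItems would hold for each hit.
per_hit = [len(hit["_source"]["items"]) for hit in hits]

# Overall count, i.e. what the sum aggregation would report for these two docs.
total = sum(per_hit)

print(per_hit, total)
```

Note that this only counts items in the hits that were returned, while the aggregation runs over every document matching the query.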

How to return nested documents and some of their fields via a query over the main document?

I have the following index on elasticsearch:
PUT /blog
{
"mappings": {
"threadQ":{
"properties": {
"title" : {
"type" : "string",
"analyzer" : "standard"
},
"body" : {
"type" : "string",
"analyzer" : "standard"
},
"posts":{
"type": "nested",
"properties": {
"comment": {
"type": "string",
"analyzer": "standard"
},
"prototype": {
"type": "string",
"analyzer": "standard"
},
"customScore":{
"type": "long"
}
}
}
}
}
}
}
And I added one document:
PUT /blog/threadQ/1
{
"title": "What is c#?",
"body": "C# is a good programming language, makes it easy to develop!",
"posts": [{
"comment": "YEP!",
"prototype": "Hossein Bakhtiari",
"customScore": 2
},
{
"comment": "NEVER EVER :O",
"prototype": "Garpizio En Larri",
"customScore": 3
}]
}
So the following query works:
POST /blog/threadQ/_search
{
"query": {
"bool": {
"must": [{
"nested": {
"query": {
"query_string": {
"fields": ["posts.comment"],
"query": "YEP"
}
},
"path": "posts"
}
}]
}
}
}
And the result is the document.
Now I want to make a query like this:
SELECT threadQ.posts.customScore FROM threadQ WHERE threadQ.posts.comment = "YEP!"
Please tell me how I can implement it.
To return a specific field in the document, use either the fields or the _source parameter.
Here _source is used:
curl -XGET http://localhost:9200/blog/threadQ/_search -d '
{
"_source" : "posts.customScore",
"query": {
"bool": {
"must": [{
"nested": {
"query": {
"query_string": {
"fields": ["posts.comment"],
"query": "YEP"
}
},
"path": "posts"
}
}]
}
}
}'
It will return:
"hits" : {
"total" : 1,
"max_score" : 2.252763,
"hits" : [ {
"_index" : "myindex",
"_type" : "threadQ",
"_id" : "1",
"_score" : 2.252763,
"_source":{"posts":[{"customScore":2},{"customScore":3}]}
} ]
}
}
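Note that the response contains the customScore of both posts, although only one comment matches "YEP". The following Python sketch imitates that trimming on the sample document to show why: a dotted path keeps the named field in every element of the array and does not drop elements that failed the nested query. `filter_source` is a hypothetical helper written for this illustration and only handles a single dotted path.

```python
def filter_source(value, path):
    """Keep only the given dotted path (as a list of keys), descending into lists."""
    if not path:
        return value
    if isinstance(value, list):
        # Apply the same path to every element of an array.
        return [filter_source(item, path) for item in value]
    key, rest = path[0], path[1:]
    if not isinstance(value, dict) or key not in value:
        return {}
    return {key: filter_source(value[key], rest)}

# The document indexed in the question.
doc = {
    "title": "What is c#?",
    "body": "C# is a good programming language, makes it easy to develop!",
    "posts": [
        {"comment": "YEP!", "prototype": "Hossein Bakhtiari", "customScore": 2},
        {"comment": "NEVER EVER :O", "prototype": "Garpizio En Larri", "customScore": 3},
    ],
}

filtered = filter_source(doc, "posts.customScore".split("."))
print(filtered)
```

The printed structure has the same shape as the _source in the response above.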
Finally, the problem was solved with dynamic templates. The new index structure looks like this:
PUT /my_index
{
"mappings": {
"my_type": {
"properties": {
"Id":{
"type": "integer",
"analyzer": "standard"
},
"name":{
"type": "string",
"analyzer": "english"
}
},
"dynamic_templates": [
{ "en": {
"match": "*",
"match_mapping_type": "string",
"mapping": {
"type": "string",
"analyzer": "english"
}
}}
]
}}}
And the query:
POST /my_index/my_type/_search
{
"query": {
"function_score": {
"query": {"match_all": {}},
"functions": [
{
"script_score": {
"script": "doc.apple.value * _score"
}
}
]
}
}
}
And the result looks like this:
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 14,
"hits": [
{
"_index": "my_index",
"_type": "my_type",
"_id": "2",
"_score": 14,
"_source": {
"Id": 2,
"name": "Second One",
"iphone": 20,
"apple": 14
}
},
{
"_index": "my_index",
"_type": "my_type",
"_id": "3",
"_score": 14,
"_source": {
"Id": 3,
"name": "Third One",
"apple": 14
}
},
{
"_index": "my_index",
"_type": "my_type",
"_id": "1",
"_score": 1,
"_source": {
"Id": 1,
"name": "First One",
"iphone": 2,
"apple": 1
}
}
]
}
}
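The scores in this result can be reproduced by hand. The sketch below is plain Python, not Elasticsearch, and it assumes that match_all gives every document a query score of 1 and that function_score combines the query score with the script value by multiplication, which is the default boost_mode. The documents are reduced to the fields that matter.

```python
# Documents from the result above, reduced to Id and the scored field.
docs = [
    {"Id": 1, "apple": 1},
    {"Id": 2, "apple": 14},
    {"Id": 3, "apple": 14},
]

QUERY_SCORE = 1.0  # assumed score that match_all assigns to every document

def final_score(doc):
    # The script computes doc.apple.value * _score ...
    script_value = doc["apple"] * QUERY_SCORE
    # ... and the default boost_mode multiplies it with the query score.
    return QUERY_SCORE * script_value

# Highest score first; ties keep their original order.
ranked = sorted(docs, key=final_score, reverse=True)
for doc in ranked:
    print(doc["Id"], final_score(doc))
```

Documents 2 and 3 both score 14 and document 1 scores 1, which agrees with the _score values shown.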

boosting along with prefix query with all

I want to be able to run a prefix query on EACH of the search terms, matching in any field, and I would like to have highlighting. I formulated a query which seems to work. Now I want to update the query so that matches in one of the fields yield a higher score than matches in the other fields.
For example, I index the following data (this is just a sample; in my real data there are many more fields than just these two):
PUT /my_index/my_type/abc124
{
"title" : "blah",
"description" : "golf"
}
PUT /my_index/my_type/abc123
{
"title" : "blah golf",
"description" : "course"
}
PUT /my_index/my_type/abc125
{
"title" : "blah golf tee",
"description" : "course"
}
Then I can query as mentioned with a query like:
POST my_index/my_type/_search
{
"query": {
"bool": {
"must": [
{
"prefix": {
"_all" : "gol"
}
},
{
"prefix": {
"_all": "bla"
}
}
]
}
},
"highlight":{
"fields":{
"*":{}
}
}
}
Which produces the result:
{
"took": 11,
"timed_out": false,
"_shards": {
"total": 4,
"successful": 4,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 1.4142135,
"hits": [
{
"_index": "my_index",
"_type": "my_type",
"_id": "abc125",
"_score": 1.4142135,
"_source": {
"title": "blah golf tee",
"description": "course"
},
"highlight": {
"title": [
"<em>blah</em> <em>golf</em> tee"
]
}
},
{
"_index": "my_index",
"_type": "my_type",
"_id": "abc124",
"_score": 1.4142135,
"_source": {
"title": "blah",
"description": "golf"
},
"highlight": {
"description": [
"<em>golf</em>"
],
"title": [
"<em>blah</em>"
]
}
},
{
"_index": "my_index",
"_type": "my_type",
"_id": "abc123",
"_score": 1.4142135,
"_source": {
"title": "blah golf",
"description": "course"
},
"highlight": {
"title": [
"<em>blah</em> <em>golf</em>"
]
}
}
]
}
}
How can I modify the scoring using function_score or other means so that I can score matches on title field higher than other fields? Do I need to change the query to multi-match instead of using _all? Any suggestions would be appreciated.
Regards,
LT
Try adding a should section to your bool query. It gives a higher score to the whole query if any of the should clauses match, and those clauses are not required to match for the query to return results.
For example, try this:
POST my_index/my_type/_search
{
"query": {
"bool": {
"must": [
{
"prefix": {
"_all": "gol"
}
},
{
"prefix": {
"_all": "bla"
}
}
],
"should": [
{
"prefix": {
"title": {
"value": "gol",
"boost": 3
}
}
},
{
"prefix": {
"title": {
"value": "bla",
"boost": 3
}
}
}
]
}
},
"highlight": {
"fields": {
"*": {}
}
}
}
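With more search terms, writing the must and should clauses by hand gets repetitive. The following Python sketch builds the same request body from a list of terms. `build_query` and its parameters are hypothetical names chosen for this example, and the boost of 3 is simply the value used above.

```python
import json

def build_query(terms, boosted_field="title", boost=3):
    # One mandatory prefix clause on _all per term.
    must = [{"prefix": {"_all": term}} for term in terms]
    # One optional, boosted prefix clause on the preferred field per term.
    should = [
        {"prefix": {boosted_field: {"value": term, "boost": boost}}}
        for term in terms
    ]
    return {
        "query": {"bool": {"must": must, "should": should}},
        "highlight": {"fields": {"*": {}}},
    }

body = build_query(["gol", "bla"])
print(json.dumps(body, indent=2))
```

For the terms gol and bla this produces the same structure as the query above, so it can be sent as the body of the same _search request.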

Resources