Elasticsearch: How to optimize the source parameter in a script function?

I have the following data in an Elasticsearch index called products:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 4,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "products",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"prod_id" : 1,
"currency" : "USD",
"price" : 1
}
},
{
"_index" : "products",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"prod_id" : 2,
"currency" : "INR",
"price" : 60
}
},
{
"_index" : "products",
"_type" : "_doc",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"prod_id" : 3,
"currency" : "EUR",
"price" : 2
}
},
{
"_index" : "products",
"_type" : "_doc",
"_id" : "5",
"_score" : 1.0,
"_source" : {
"prod_id" : 5,
"currency" : "MYR",
"price" : 1
}
}
]
}
}
I am sorting the data on the price field, converted to a common currency, with the following script:
GET products/_search
{
"query": {
"function_score": {
"query": {
"match_all": {}
},
"functions": [{
"script_score": {
"script": {
"params": {
"USD": 1,
"SGD": 0.72,
"MYR": 0.24,
"INR": 0.014,
"EUR": 1.12
},
"source": "doc['price'].value * (doc.currency.value == 'eur'? params.EUR : doc.currency.value == 'myr' ? params.MYR : doc.currency.value == 'inr' ? params.INR : 1)"
}
}
}]
}
},
"sort": [
{
"_score": {
"order": "desc"
}
}
]
}
Because the currency field in the products index is of type text, it is indexed with the standard analyzer, which converts it to lower case.
I want to optimise this part of the script, as I may end up with 20-30 currencies:
"source": "doc['price'].value * (doc.currency.value == 'eur'? params.EUR : doc.currency.value == 'myr' ? params.MYR : doc.currency.value == 'inr' ? params.INR : 1)"

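I was able to optimize the source script with the following working solution. It reads the currency from the currency.keyword sub-field, which is not analyzed (assuming the default dynamic mapping) and so keeps the original upper-case code, and uses that value directly as the key into params: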
GET products/_search
{
"query": {
"function_score": {
"query": {
"match_all": {}
},
"functions": [{
"script_score": {
"script": {
"params": {
"USD": 1,
"SGD": 0.72,
"MYR": 0.24,
"INR": 0.014,
"EUR": 1.12
},
"source": "doc['price'].value * params[doc['currency.keyword'].value]"
}
}
}]
}
},
"sort": [
{
"_score": {
"order": "desc"
}
}
]
}
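With 20-30 currencies it may be easier to generate the request body from a rates table than to edit it by hand. Below is a minimal Python sketch of that idea. The helper names (build_price_sort_body, converted_price, rank) are hypothetical and not part of any Elasticsearch client; the rates and products are the ones from the question. The rank function only mirrors the script locally so the expected order can be checked without a cluster.

```python
from typing import Dict, List


def build_price_sort_body(rates: Dict[str, float]) -> dict:
    """Build the function_score request body, passing `rates` as script params."""
    return {
        "query": {
            "function_score": {
                "query": {"match_all": {}},
                "functions": [{
                    "script_score": {
                        "script": {
                            # One entry per currency code; keys must match the
                            # values stored in currency.keyword (upper case here).
                            "params": dict(rates),
                            "source": "doc['price'].value * params[doc['currency.keyword'].value]",
                        }
                    }
                }],
            }
        },
        "sort": [{"_score": {"order": "desc"}}],
    }


def converted_price(price: float, currency: str, rates: Dict[str, float]) -> float:
    """Local mirror of the script: price times the rate looked up by currency code."""
    return price * rates[currency]


def rank(products: List[dict], rates: Dict[str, float]) -> List[int]:
    """Return prod_id values ordered by converted price, highest first."""
    ordered = sorted(
        products,
        key=lambda p: converted_price(p["price"], p["currency"], rates),
        reverse=True,
    )
    return [p["prod_id"] for p in ordered]


RATES = {"USD": 1, "SGD": 0.72, "MYR": 0.24, "INR": 0.014, "EUR": 1.12}
PRODUCTS = [
    {"prod_id": 1, "currency": "USD", "price": 1},
    {"prod_id": 2, "currency": "INR", "price": 60},
    {"prod_id": 3, "currency": "EUR", "price": 2},
    {"prod_id": 5, "currency": "MYR", "price": 1},
]

body = build_price_sort_body(RATES)
order = rank(PRODUCTS, RATES)
```

Note that, unlike the original ternary chain, the map lookup has no fallback of 1: a document whose currency code is missing from params would make the lookup return nothing, so every code that occurs in the index should be present in the rates table.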

Related

Order results by smallest absolute difference from input

Can Elasticsearch find the closest number to an input?
Example: I have apartments with 1, 2, 5, 6 and 10 rooms. I want a search for apartments with 5 rooms to order results by absolute difference (e.g. |6-5| = 1, |2-5| = 3, etc.).
What I want to see: 5, 6, 2, 1, 10.
GET appartaments/_search
{
"query": {
"bool": {
"must":[
{
"match":{
"properties.id":1
}
},
{
"match":{
"properties.value":"5"
}
}
]
}
}
}
You can probably achieve what you want using script-based sorting:
GET appartaments/_search
{
"sort": {
"_script": {
"type": "number",
"script": {
"lang": "painless",
"source": "Math.abs(params.value - Integer.parseInt(doc['properties.value.keyword'].value))",
"params": {
"value": 5
}
},
"order": "asc"
}
},
"query": {
"bool": {
"must": [
{
"match": {
"properties.id": 1
}
}
]
}
}
}
Results =>
"hits" : [
{
"_index" : "appartaments",
"_type" : "_doc",
"_id" : "5",
"_score" : null,
"_source" : {
"properties" : {
"id" : 1,
"value" : "5"
}
},
"sort" : [
0.0
]
},
{
"_index" : "appartaments",
"_type" : "_doc",
"_id" : "6",
"_score" : null,
"_source" : {
"properties" : {
"id" : 1,
"value" : "6"
}
},
"sort" : [
1.0
]
},
{
"_index" : "appartaments",
"_type" : "_doc",
"_id" : "2",
"_score" : null,
"_source" : {
"properties" : {
"id" : 1,
"value" : "2"
}
},
"sort" : [
3.0
]
},
{
"_index" : "appartaments",
"_type" : "_doc",
"_id" : "1",
"_score" : null,
"_source" : {
"properties" : {
"id" : 1,
"value" : "1"
}
},
"sort" : [
4.0
]
},
{
"_index" : "appartaments",
"_type" : "_doc",
"_id" : "10",
"_score" : null,
"_source" : {
"properties" : {
"id" : 1,
"value" : "10"
}
},
"sort" : [
5.0
]
}
]
}
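The sort key in that script is just the absolute difference between the requested value and the stored value, parsed from a string. As a rough check of the ordering, here is a small Python sketch that applies the same key locally and also builds the sort clause for an arbitrary target. The function names are hypothetical; the field name properties.value.keyword is taken from the answer above.

```python
from typing import List


def order_by_distance(values: List[str], target: int) -> List[str]:
    """Sort string-encoded numbers by |target - value|, closest first."""
    return sorted(values, key=lambda v: abs(target - int(v)))


def build_distance_sort(target: int) -> dict:
    """Build the script-based sort clause used in the answer for a given target."""
    return {
        "_script": {
            "type": "number",
            "script": {
                "lang": "painless",
                "source": "Math.abs(params.value - Integer.parseInt(doc['properties.value.keyword'].value))",
                "params": {"value": target},
            },
            "order": "asc",
        }
    }


rooms = ["1", "2", "5", "6", "10"]
ordered_rooms = order_by_distance(rooms, 5)
sort_clause = build_distance_sort(5)
```

For the rooms in the question and a target of 5, the local ordering is 5, 6, 2, 1, 10, which agrees with the sort values 0, 1, 3, 4, 5 in the response above.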

Elasticsearch override field value in search results

The use case is that I want to hide the product price (change it to 0) if a user is not logged in.
PUT /products
{
"mappings": {
"properties": {
"price": {
"type": "scaled_float",
"scaling_factor": 100
}
}
}
}
POST /products/_doc
{
"price": 101
}
POST /products/_doc
{
"price": 102
}
POST /products/_doc
{
"price": 103
}
I tried to use runtime_mappings with the following script, but the result still contains the original data.
GET /products/_search
{
"query": {
"match_all": {}
},
"runtime_mappings": {
"price": {
"type": "double",
"script": "if(0 == 1) {emit(333);} else{emit(222);}"
}
}
}
What am I missing? Is the script condition invalid?
Thanks.
-- Edit --
I expect every price to be 222, but the original price is returned.
Expected output:
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 4,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "products",
"_id" : "XXphNoMBcFxgyV6Mwe1-",
"_score" : 1.0,
"_source" : {
"price" : 222
}
},
{
"_index" : "products",
"_id" : "XnphNoMBcFxgyV6Mye2c",
"_score" : 1.0,
"_source" : {
"price" : 222
}
},
{
"_index" : "products",
"_id" : "X3phNoMBcFxgyV6M0e2W",
"_score" : 1.0,
"_source" : {
"price" : 222
}
},
{
"_index" : "products",
"_id" : "YHphNoMBcFxgyV6M3u0V",
"_score" : 1.0,
"_source" : {
"price" : 222
}
}
]
}
}
Actual output:
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 4,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "products",
"_id" : "XXphNoMBcFxgyV6Mwe1-",
"_score" : 1.0,
"_source" : {
"price" : 101
}
},
{
"_index" : "products",
"_id" : "XnphNoMBcFxgyV6Mye2c",
"_score" : 1.0,
"_source" : {
"price" : 102
}
},
{
"_index" : "products",
"_id" : "X3phNoMBcFxgyV6M0e2W",
"_score" : 1.0,
"_source" : {
"price" : 105
}
},
{
"_index" : "products",
"_id" : "YHphNoMBcFxgyV6M3u0V",
"_score" : 1.0,
"_source" : {
"price" : 0
}
}
]
}
}
After retrying the example in the official documentation, I realized that I had left out the fields key in the query body.
GET /products/_search
{
"query": {
"match_all": {}
},
"runtime_mappings": {
"price": {
"type": "double",
"script": "if(0 == 1) {emit(333);} else{emit(222);}"
}
},
"fields": ["price"]
}
Now I get both the original value (in _source) and the value from the script (in fields).
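A runtime field does not rewrite _source; its value comes back under the fields key of each hit, as a list. If the original price must not reach the client at all, one option is to also send "_source": false and read the price only from fields. The Python sketch below shows that under stated assumptions: the helper names are hypothetical, and the sample hit is an illustration of the response shape, not output copied from a cluster.

```python
from typing import Optional


def build_hidden_price_body(script: str) -> dict:
    """Request body that overrides price at query time and omits _source."""
    return {
        "query": {"match_all": {}},
        "runtime_mappings": {
            "price": {"type": "double", "script": script}
        },
        "fields": ["price"],
        # Suppress the stored document so the original price is not returned.
        "_source": False,
    }


def displayed_price(hit: dict, field: str = "price") -> Optional[float]:
    """Read the runtime value from `fields`; never fall back to _source."""
    values = hit.get("fields", {}).get(field)
    return values[0] if values else None


request_body = build_hidden_price_body("if(0 == 1) {emit(333);} else{emit(222);}")

# Illustrative hit shape: values under `fields` are always lists.
sample_hit = {
    "_index": "products",
    "_id": "XXphNoMBcFxgyV6Mwe1-",
    "_score": 1.0,
    "_source": {"price": 101},
    "fields": {"price": [222.0]},
}
shown = displayed_price(sample_hit)
```

Reading only from fields means a hit without the runtime value yields None rather than leaking the stored price.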

Getting incorrect inner hits from parent child relationship when combined with boolean query

Hi everyone,
I am getting incorrect inner hits when combining a parent-child query with a boolean query. To reproduce the issue, I create this index:
PUT /my-index-000001
{
"mappings": {
"_routing": {
"required": true
},
"properties": {
"parentProperty": {
"type": "text"
},
"childProperty": {
"type": "text"
},
"id": {
"type": "integer"
},
"myJoinField": {
"type": "join",
"relations": {
"parent": "mychild"
}
}
}
}
}
Then I add these three documents (the document with id "1" is the parent of the other two):
POST /my-index-000001/_doc/1?routing=1
{
"id": 1,
"parentProperty": "a parent document",
"myJoinField": "parent"
}
POST /my-index-000001/_doc/2?routing=1
{
"id": 2,
"childProperty": "queensland civil administration",
"myJoinField": {
"name":"mychild",
"parent":"1"
}
}
POST /my-index-000001/_doc/3?routing=1
{
"id": 3,
"childProperty": "beautiful weather",
"myJoinField": {
"name":"mychild",
"parent":"1"
}
}
Now the index is set up with 3 documents. I am looking for all child documents that meet this boolean query: [childProperty contains either "queensland civil" or both "beautiful" and "nothing"].
I expect Elasticsearch to return only the child document with id "2", since the child document with id "3" does not contain the term "nothing".
The translated version of this query is as follows:
GET /my-index-000001/_search
{
"query": {
"bool": {
"minimum_should_match": 1,
"should": [
{
"has_child": {
"inner_hits": {
"name": "opr1"
},
"query": {
"query_string": {
"analyzer": "stop",
"query": "childProperty:(\"queensland civil\")"
}
},
"type": "mychild"
}
},
{
"bool": {
"must": [
{
"has_child": {
"inner_hits": {
"name": "opr2"
},
"query": {
"query_string": {
"query": "childProperty:(beautiful)"
}
},
"type": "mychild"
}
},
{
"has_child": {
"inner_hits": {
"name": "opr3"
},
"query": {
"query_string": {
"query": "childProperty:(nothing)"
}
},
"type": "mychild"
}
}
]
}
}
]
}
}
}
The result returned by Elasticsearch is as follows:
{
"took" : 24,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "my-index-000001",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_routing" : "1",
"_source" : {
"id" : 1,
"parentProperty" : "a parent document",
"myJoinField" : "parent"
},
"inner_hits" : {
"opr1" : {
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.2814486,
"hits" : [
{
"_index" : "my-index-000001",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.2814486,
"_routing" : "1",
"_source" : {
"id" : 2,
"childProperty" : "queensland civil administration",
"myJoinField" : {
"name" : "mychild",
"parent" : "1"
}
}
}
]
}
},
"opr2" : {
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 0.7549127,
"hits" : [
{
"_index" : "my-index-000001",
"_type" : "_doc",
"_id" : "3",
"_score" : 0.7549127,
"_routing" : "1",
"_source" : {
"id" : 3,
"childProperty" : "beautiful weather",
"myJoinField" : {
"name" : "mychild",
"parent" : "1"
}
}
}
]
}
},
"opr3" : {
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
}
}
}
}
]
}
}
As you can see, Elasticsearch returns both child documents in the inner hits, which clearly goes against what I have written in the "must" section of the query.
But if I rewrite the query as follows, it returns ONLY the expected document (the document with id "2"):
GET /my-index-000001/_search
{
"query": {
"bool": {
"must": [
{
"has_child": {
"inner_hits": {
"name": "opr1"
},
"query": {
"bool": {
"minimum_should_match": 1,
"should": [
{
"query_string": {
"query": "childProperty:(\"queensland civil\")"
}
},
{
"bool": {
"must": [
{
"query_string": {
"query": "childProperty:(beautiful)"
}
},
{
"query_string": {
"query": "childProperty:(weather1)"
}
}
]
}
}
]
}
},
"type": "mychild"
}
}
]
}
}
}
Here is the correct result:
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "my-index-000001",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_routing" : "1",
"_source" : {
"id" : 1,
"parentProperty" : "a parent document",
"myJoinField" : "parent"
},
"inner_hits" : {
"opr1" : {
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.2814486,
"hits" : [
{
"_index" : "my-index-000001",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.2814486,
"_routing" : "1",
"_source" : {
"id" : 2,
"childProperty" : "queensland civil administration",
"myJoinField" : {
"name" : "mychild",
"parent" : "1"
}
}
}
]
}
}
}
}
]
}
}
I would appreciate it if someone could tell me what I did wrong in the first query, or whether this is the default behavior in Elasticsearch when it comes to parent/child relationships.
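One way to read the first result, offered as my understanding rather than a confirmed answer: each has_child clause decides whether the parent matches, and the inner hits for each named clause are collected separately for every returned parent, without regard to whether the surrounding bool branch as a whole matched. The Python sketch below is only a toy model of that reading, with the two child documents from the question and crude substring/term checks standing in for the real analysis. All names in it are hypothetical.

```python
from typing import Callable, Dict, List

# The two child documents of parent "1" from the question.
CHILDREN: Dict[str, str] = {
    "2": "queensland civil administration",
    "3": "beautiful weather",
}

# One predicate per named inner_hits clause (simplified stand-ins for query_string).
CLAUSES: Dict[str, Callable[[str], bool]] = {
    "opr1": lambda text: "queensland civil" in text,
    "opr2": lambda text: "beautiful" in text.split(),
    "opr3": lambda text: "nothing" in text.split(),
}


def inner_hits_per_clause(children: Dict[str, str],
                          clauses: Dict[str, Callable[[str], bool]]) -> Dict[str, List[str]]:
    """Collect matching child ids for every clause independently."""
    return {
        name: [cid for cid, text in children.items() if pred(text)]
        for name, pred in clauses.items()
    }


def parent_matches(inner: Dict[str, List[str]]) -> bool:
    """Parent-level bool: opr1 OR (opr2 AND opr3)."""
    return bool(inner["opr1"]) or (bool(inner["opr2"]) and bool(inner["opr3"]))


def effective_inner_hits(inner: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Client-side filter: keep only groups whose bool branch actually matched."""
    kept: Dict[str, List[str]] = {}
    if inner["opr1"]:
        kept["opr1"] = inner["opr1"]
    if inner["opr2"] and inner["opr3"]:
        kept["opr2"] = inner["opr2"]
        kept["opr3"] = inner["opr3"]
    return kept


inner = inner_hits_per_clause(CHILDREN, CLAUSES)
matched = parent_matches(inner)
filtered = effective_inner_hits(inner)
```

In this model the parent matches through opr1 alone, opr2 still lists child "3", and opr3 is empty, which is the shape of the first response. The second query avoids this because the whole boolean condition is evaluated against each child inside a single has_child. If the first form has to be kept, a client-side filter along the lines of effective_inner_hits is one possible workaround.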

Find nearest timestamp

I'm using Elasticsearch 6.4.2, and I need to find the previous and next docs relative to a specified timestamp.
It is roughly like running SELECT TOP 1 * from table WHERE date < 2019-01-01 ORDER BY date DESC and SELECT TOP 1 * from table WHERE date > 2019-01-01 ORDER BY date ASC on a SQL table, to find the previous and next records relative to 2019-01-01.
Any ideas?
Data:
[
{
"_index" : "index25",
"_type" : "_doc",
"_id" : "mceIBm4B1qXGA4PnKzvZ",
"_score" : 1.0,
"_source" : {
"id" : 1,
"date" : "2019-10-01"
}
},
{
"_index" : "index25",
"_type" : "_doc",
"_id" : "mseIBm4B1qXGA4PnRDvs",
"_score" : 1.0,
"_source" : {
"id" : 2,
"date" : "2019-10-02"
}
},
{
"_index" : "index25",
"_type" : "_doc",
"_id" : "m8eIBm4B1qXGA4PncDv9",
"_score" : 1.0,
"_source" : {
"id" : 3,
"date" : "2019-10-03"
}
},
{
"_index" : "index25",
"_type" : "_doc",
"_id" : "nMeIBm4B1qXGA4Pnhjvs",
"_score" : 1.0,
"_source" : {
"id" : 4,
"date" : "2019-10-04"
}
},
{
"_index" : "index25",
"_type" : "_doc",
"_id" : "nceIBm4B1qXGA4Pnmjtm",
"_score" : 1.0,
"_source" : {
"id" : 5,
"date" : "2019-10-05"
}
}
]
Query: I am using two filter aggregations, each with a nested terms aggregation, to get the first date greater than and the first date less than 2019-10-03.
{
"size": 0,
"aggs": {
"above": {
"filter": {
"range": {
"date": {
"gt": "2019-10-03"
}
}
},
"aggs": {
"TopDocument": {
"terms": {
"field": "date",
"size": 1,
"order": {
"_term": "asc"
}
},
"aggs": {
"documents": {
"top_hits": {
"size": 10
}
}
}
}
}
},
"below":{
"filter": {
"range": {
"date": {
"lt": "2019-10-03"
}
}
},
"aggs": {
"TopDocument": {
"terms": {
"field": "date",
"size": 1,
"order": {
"_term": "desc"
}
},
"aggs": {
"documents": {
"top_hits": {
"size": 10
}
}
}
}
}
}
}
}
Response:
{
"below" : {
"doc_count" : 2,
"TopDocument" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 1,
"buckets" : [
{
"key" : 1569974400000,
"key_as_string" : "2019-10-02T00:00:00.000Z",
"doc_count" : 1,
"documents" : {
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "index25",
"_type" : "_doc",
"_id" : "mseIBm4B1qXGA4PnRDvs",
"_score" : 1.0,
"_source" : {
"id" : 2,
"date" : "2019-10-02"
}
}
]
}
}
}
]
}
},
"above" : {
"doc_count" : 2,
"TopDocument" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 1,
"buckets" : [
{
"key" : 1570147200000,
"key_as_string" : "2019-10-04T00:00:00.000Z",
"doc_count" : 1,
"documents" : {
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "index25",
"_type" : "_doc",
"_id" : "nMeIBm4B1qXGA4Pnhjvs",
"_score" : 1.0,
"_source" : {
"id" : 4,
"date" : "2019-10-04"
}
}
]
}
}
}
]
}
}
}
You can try this. The equivalent of SELECT TOP 1 * from table WHERE date < 2019-01-01 ORDER BY date DESC:
{
"sort": [
{
"date": {
"order": "desc"
}
}
],
"query": {
"bool": {
"filter": [
{
"range": {
"date": {
"lt": "2019-01-01"
}
}
}
]
}
},
"size": 1
}
The equivalent of SELECT TOP 1 * from table WHERE date > 2019-01-01 ORDER BY date ASC:
{
"sort": [
{
"date": {
"order": "asc"
}
}
],
"query": {
"bool": {
"filter": [
{
"range": {
"date": {
"gt": "2019-01-01"
}
}
}
]
}
},
"size": 1
}
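The two request bodies differ only in the range operator and the sort order, so they can be generated from one helper. Here is a minimal Python sketch under that assumption. The function names are hypothetical, and nearest_local only mirrors the logic on a plain list of ISO dates (which compare correctly as strings) so the expected neighbours can be checked without a cluster.

```python
from typing import List, Optional, Tuple


def neighbour_query(field: str, pivot: str, direction: str) -> dict:
    """Build the size-1 search body for the previous or next document."""
    if direction == "previous":
        operator, order = "lt", "desc"   # latest document strictly before the pivot
    elif direction == "next":
        operator, order = "gt", "asc"    # earliest document strictly after the pivot
    else:
        raise ValueError("direction must be 'previous' or 'next'")
    return {
        "sort": [{field: {"order": order}}],
        "query": {"bool": {"filter": [{"range": {field: {operator: pivot}}}]}},
        "size": 1,
    }


def nearest_local(dates: List[str], pivot: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (previous, next) around the pivot for ISO-formatted date strings."""
    earlier = [d for d in dates if d < pivot]
    later = [d for d in dates if d > pivot]
    return (max(earlier) if earlier else None, min(later) if later else None)


DATES = ["2019-10-01", "2019-10-02", "2019-10-03", "2019-10-04", "2019-10-05"]
previous_body = neighbour_query("date", "2019-10-03", "previous")
next_body = neighbour_query("date", "2019-10-03", "next")
neighbours = nearest_local(DATES, "2019-10-03")
```

With the sample data and a pivot of 2019-10-03, the neighbours are 2019-10-02 and 2019-10-04, the same documents the aggregation-based query returned.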

How to get only particular elements from an array in Elasticsearch

I'm very new to Elasticsearch. I'm trying to get some particular elements from an array.
I created my index like this:
PUT store
{
"mappings": {
"properties": {
"storeList": {"type": "nested"},
"storeLocation": {"type": "text"},
"storePinCode" : {"type": "long"}
}
}
}
and my data looks like this:
{
"storeLocation": "tirupati",
"storePinCode" : 517501,
"storeList" : [
{
"storeName" : "apollo",
"storeType" : "med"
},
{
"storeName" : "carrots",
"storeType" : "restaurants"
},
{
"storeName" : "more",
"storeType" : "supermarket"
}
]
},
{
"storeLocation": "hyderabad",
"storePinCode" : 500038,
"storeList" : [
{
"storeName" : "apollo",
"storeType" : "med"
},
{
"storeName" : "bahar cafe",
"storeType" : "restaurants"
},
{
"storeName" : "dmart",
"storeType" : "supermarket"
}
]
}
My expected output is as follows:
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "store",
"_type" : "_doc",
"_id" : "Yk8SFWwB2zt5weEsMHn7",
"_score" : 1.0,
"_source" : {
"storeLocation" : "tirupati",
"storePinCode" : 517501,
"storeList" : [
{
"storeName" : "apollo",
"storeType" : "med"
}
]
}
},
{
"_index" : "store",
"_type" : "_doc",
"_id" : "ZE8SFWwB2zt5weEsqnkd",
"_score" : 1.0,
"_source" : {
"storeLocation" : "hyderabad",
"storePinCode" : 500038,
"storeList" : [
{
"storeName" : "apollo",
"storeType" : "med"
}
]
}
}
]
}
}
To achieve that, I tried the following query:
POST store/_search
{
"query": {
"nested": {
"path": "storeList",
"query": {
"bool" : {
"must" : [
{"match":{"storeList.storeName": "apollo"}}
]
}
},
"inner_hits": {}
}
}
}
I'm getting output, but it is not exactly what I expect. Is it possible to get the output in the shape I expect?
Actual Output:
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 1.093527,
"hits" : [
{
"_index" : "store",
"_type" : "_doc",
"_id" : "Yk8SFWwB2zt5weEsMHn7",
"_score" : 1.093527,
"_source" : {
"storeLocation" : "tirupati",
"storePinCode" : 517501,
"storeList" : [
{
"storeName" : "apollo",
"storeType" : "med"
},
{
"storeName" : "carrots",
"storeType" : "restaurants"
},
{
"storeName" : "more",
"storeType" : "supermarket"
}
]
},
"inner_hits" : {
"storeList" : {
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.093527,
"hits" : [
{
"_index" : "store",
"_type" : "_doc",
"_id" : "Yk8SFWwB2zt5weEsMHn7",
"_nested" : {
"field" : "storeList",
"offset" : 0
},
"_score" : 1.093527,
"_source" : {
"storeName" : "apollo",
"storeType" : "med"
}
}
]
}
}
}
},
{
"_index" : "store",
"_type" : "_doc",
"_id" : "ZE8SFWwB2zt5weEsqnkd",
"_score" : 1.093527,
"_source" : {
"storeLocation" : "hyderabad",
"storePinCode" : 500038,
"storeList" : [
{
"storeName" : "apollo",
"storeType" : "med"
},
{
"storeName" : "bahar cafe",
"storeType" : "restaurants"
},
{
"storeName" : "dmart",
"storeType" : "supermarket"
}
]
},
"inner_hits" : {
"storeList" : {
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.093527,
"hits" : [
{
"_index" : "store",
"_type" : "_doc",
"_id" : "ZE8SFWwB2zt5weEsqnkd",
"_nested" : {
"field" : "storeList",
"offset" : 0
},
"_score" : 1.093527,
"_source" : {
"storeName" : "apollo",
"storeType" : "med"
}
}
]
}
}
}
}
]
}
}
Could you please help me out with this?
@ajay sharma, as you suggested, I changed my query like this:
GET store/_search
{
"_source": {
"includes": [ "*" ],
"excludes": [ "storeList" ]
},
"query": {
"nested": {
"path": "storeList",
"inner_hits": {
"_source": [
"storeName", "storeType"
]
},
"query": {
"bool": {
"must": [
{"match":{"storeList.storeName": "more"}}
]
}
}
}
}
}
but I'm getting a response like the one below, with an empty _source in the inner hit:
{
"_index" : "store",
"_type" : "_doc",
"_id" : "Yk8SFWwB2zt5weEsMHn7",
"_score" : 1.0946013,
"_source" : {
"storeLocation" : "tirupati",
"storePinCode" : 517501
},
"inner_hits" : {
"storeList" : {
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.0946013,
"hits" : [
{
"_index" : "store",
"_type" : "_doc",
"_id" : "Yk8SFWwB2zt5weEsMHn7",
"_nested" : {
"field" : "storeList",
"offset" : 2
},
"_score" : 1.0946013,
"_source" : { }
}
]
}
}
}
}
I cannot reply to your comment with a comment, so I am sharing this as an answer.
I have updated the query below; please check. The change is in the inner_hits _source list, which now uses the full paths storeList.storeName and storeList.storeType. I replicated your index on my local machine and got the desired result.
Query
{
"_source": {
"includes": [ "*" ],
"excludes": [ "storeList" ]
},
"query": {
"nested": {
"path": "storeList",
"inner_hits": {
"_source": [
"storeList.storeName", "storeList.storeType"
]
},
"query": {
"bool": {
"must": [
{"match":{"storeList.storeName": "more"}}
]
}
}
}
}
}
Output
"hits": {
"total": 1,
"max_score": 0.9808292,
"hits": [
{
"_index": "store",
"_type": "store",
"_id": "2",
"_score": 0.9808292,
"_source": {
"storeLocation": "tirupati",
"storePinCode": 517501
},
"inner_hits": {
"storeList": {
"hits": {
"total": 1,
"max_score": 0.9808292,
"hits": [
{
"_nested": {
"field": "storeList",
"offset": 2
},
"_score": 0.9808292,
"_source": {
"storeList": {
"storeType": "supermarket",
"storeName": "more"
}
}
}
]
}
}
}
}
]
}
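The response still carries the matched elements under inner_hits rather than inside _source, so getting the exact shape from the question takes a small client-side step. Below is a Python sketch of that step. The function name flatten_nested_hit is hypothetical, and the handling of both inner-hit shapes (elements wrapped under storeList as in the output above, or returned bare as in the question's earlier response) is an assumption based on the two responses shown in this thread.

```python
from typing import List


def flatten_nested_hit(hit: dict, path: str = "storeList") -> dict:
    """Copy matched nested elements from inner_hits back into _source[path]."""
    source = dict(hit.get("_source", {}))
    inner = hit.get("inner_hits", {}).get(path, {}).get("hits", {}).get("hits", [])

    elements: List[dict] = []
    for inner_hit in inner:
        element = inner_hit.get("_source", {})
        # Some responses wrap the element under the nested path; unwrap if so.
        if isinstance(element.get(path), dict):
            element = element[path]
        elements.append(element)

    # Return the hit without inner_hits and with the filtered array in _source.
    result = {key: value for key, value in hit.items() if key != "inner_hits"}
    source[path] = elements
    result["_source"] = source
    return result


# Hit taken from the output above.
sample_hit = {
    "_index": "store",
    "_type": "store",
    "_id": "2",
    "_score": 0.9808292,
    "_source": {"storeLocation": "tirupati", "storePinCode": 517501},
    "inner_hits": {
        "storeList": {
            "hits": {
                "total": 1,
                "max_score": 0.9808292,
                "hits": [{
                    "_nested": {"field": "storeList", "offset": 2},
                    "_score": 0.9808292,
                    "_source": {"storeList": {"storeType": "supermarket", "storeName": "more"}},
                }],
            }
        }
    },
}
flat = flatten_nested_hit(sample_hit)
```

After this step each hit has the store-level fields plus a storeList array containing only the matching elements, which is the layout asked for in the question.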
