How to set condition for aggregation in elasticsearch

Let's say I have an index
"products": {
"aliases": {},
"mappings": {
"products": {
"properties": {
"id": {
"type": "long"
},
"price": {
"type": "double"
},
"discount": {
"type": "double"
},
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
}
I need to get all products with a discount in the range 0-10 and a price in the range 5-20, such that the total price of the matched products is less than 200.
I know how to filter on the fields:
{
"query": {
"bool": {
"must": [
{
"range": {
"price": {
"from": 5,
"to": 20,
"include_lower": true,
"include_upper": true,
"boost": 1.0
}
}
},
{
"range": {
"discount": {
"from": 0,
"to": 10,
"include_lower": true,
"include_upper": true,
"boost": 1.0
}
}
}
]
}
}
}
I also know how to aggregate the price:
"aggregations": {
"total_price": {
"sum": {
"field": "price"
}
}
}
But how do I set an upper bound on this total_price?
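As a starting point, the filter and the sum aggregation from the question can be sent in one request and the total compared against the bound on the client side. The sketch below is only an illustration under that assumption: it does not make Elasticsearch enforce the bound itself, the function names and the `limit` parameter are hypothetical, and `gte`/`lte` are used in place of the `from`/`to` form shown above.

```python
def build_request(price_range=(5, 20), discount_range=(0, 10)):
    """Combine the two range filters and the sum aggregation in one search body."""
    return {
        "size": 0,  # only the aggregation result is needed for the check
        "query": {
            "bool": {
                "filter": [
                    {"range": {"price": {"gte": price_range[0], "lte": price_range[1]}}},
                    {"range": {"discount": {"gte": discount_range[0], "lte": discount_range[1]}}},
                ]
            }
        },
        "aggs": {"total_price": {"sum": {"field": "price"}}},
    }


def total_within_bound(response, limit=200):
    """Return True when the aggregated total_price is below the limit."""
    return response["aggregations"]["total_price"]["value"] < limit


# Hand-written response in the shape a sum aggregation returns (not real data).
sample_response = {"aggregations": {"total_price": {"value": 150.0}}}
print(total_within_bound(sample_response))
```

The request body would be sent to the `products` index with whatever client is in use; only the dictionary construction and the check are shown here.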

Related

elastic search nested sub aggregations

We are using elastic search which holds records as documents with following definition
{
"loadtender": {
"aliases": {},
"mappings": {
"_doc": {
"_meta": {
"version": 20
},
"properties": {
"carrierId": {
"type": "long"
},
"destinationData": {
"type": "keyword"
},
"destinationZip": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 50
}
}
},
"effStartTime": {
"type": "date"
},
"endTime": {
"type": "date"
},
"id": {
"type": "long"
},
"mustRespondByTime": {
"type": "date"
},
"orgdiv": {
"type": "keyword"
},
"originData": {
"type": "keyword"
},
"originZip": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 50
}
}
},
"purchaseOrderNum": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 255
}
}
},
"startTime": {
"type": "date"
},
"tenderStatus": {
"type": "keyword"
},
"tenderedTime": {
"type": "date"
}
}
}
},
"settings": {
"index": {
"creation_date": "1655105542470",
"number_of_shards": "5",
"number_of_replicas": "1",
"uuid": "ohcXgA8EQ5iJj0X6_4BqXA",
"version": {
"created": "6080499"
},
"provided_name": "loadtender"
}
}
}
}
I am trying to run a search that returns the following filtered results.
Input parameters: startDate (yesterday), originData.originCity and originData.destinationCity
Output required:
Three buckets: 0-30 days, 30-60 days and 60-90 days
Under each of these, buckets of distinct originData.city and destinationData.city combinations
Under each of those, buckets for each unique carrierId with the corresponding record list / count
Basically, I am trying to achieve something like the following:
{
"aggregations": {
"aggr": {
"buckets": [
{
"key": "0-30 days",
"doc_count": 10,
"aggr": {
"buckets": [
{
"key": "(originCity)Menasha, WI, US|Hanover, MD, US (DestinationCity)",
"aggr": {
"buckets": [
{
"key": "10183-carrierId",
"count": 10
}
]
}
}
]
}
},
{
"key": "30-60 days",
"doc_count": 11,
"aggr": {
"buckets": [
{
"key": "Dallas, TX, US|Houston, TX, US",
"aggr": {
"buckets": [
{
"key": "10183-carrierId",
"count": 10
},
{
"key": "10022-carrierId",
"count": 1
}
]
}
}
]
}
}
]
}
}
}
I've tried the following, but I cannot find a way to break the results down further using sub-aggregations.
{
"_source":["id", "effStartTime", "carrierId", "originData", "destinationData"],
"size": 100,
"query": {
"bool": {
"must": [
{
"bool": {
"must": [
{
"range": {
"startTime": {
"from": "2021-08-27T23:59:59.000Z",
"to": "2022-09-01T00:00:00.000Z",
"include_lower": true,
"include_upper": true,
"boost": 1
}
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
}
],
"must_not": [
{
"term": {
"tenderStatus": {
"value": "REMOVED",
"boost": 1
}
}
}
],
"filter" : {
"exists" : {
"field" : "carrierId"
}
},
"adjust_pure_negative": true,
"boost": 1
}
},
"aggregations": {
"aggr": {
"terms": {
"script": "doc['originData'].values[0] + '|' + doc['destinationData'].values[0]"
}
}
}
}
I was beginning to wonder whether this is even possible, or whether I should switch to issuing multiple queries instead.
I was able to achieve this using the following sub-aggregations:
"aggregations": {
"aggr":{
"date_range": {
"field": "startTime",
"format": "MM-yyyy",
"ranges": [
{"from": "now-1M/M", "to": "now"}, --> from 30 days back to now
{"from": "now-2M/M", "to": "now-1M/M"}, --> from 60 days back to 30 days back
{"from": "now-3M/M", "to": "now-2M/M"}, --> from 90 days back to 60 days back
{"from": "now-12M/M", "to": "now-3M/M"}
]
},
"aggregations": {
"aggr":{
"terms": {
"script": "doc['originData'].values[0] + '|' + doc['destinationData'].values[0]" --> concatenated origin and destination address as a key
},
"aggregations": {
"aggr": {
"terms": {
"field": "carrierId" --> nested carrier count
}
}
}
}
}
}
}
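The nesting above follows a regular pattern: each level sits under the "aggregations" key of the level before it. A minimal Python sketch of that pattern follows; the helper name `nest_aggregations` is hypothetical, and only two of the date ranges are included to keep it short.

```python
def nest_aggregations(levels, name="aggr"):
    """Wrap each level inside the previous one under an "aggregations" key."""
    root = None
    # Build from the innermost level outwards.
    for level in reversed(levels):
        node = {name: dict(level)}  # shallow copy so the input is not modified
        if root is not None:
            node[name]["aggregations"] = root
        root = node
    return root


levels = [
    # Level 1: date buckets on startTime.
    {"date_range": {
        "field": "startTime",
        "format": "MM-yyyy",
        "ranges": [
            {"from": "now-2M/M", "to": "now-1M/M"},
            {"from": "now-3M/M", "to": "now-2M/M"},
        ],
    }},
    # Level 2: concatenated origin and destination as the bucket key.
    {"terms": {"script": "doc['originData'].values[0] + '|' + doc['destinationData'].values[0]"}},
    # Level 3: one bucket per carrier.
    {"terms": {"field": "carrierId"}},
]

aggregations = nest_aggregations(levels)
print(list(aggregations["aggr"].keys()))
```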
Following is the response template that I receive.
"aggregations": {
"aggr": {
"buckets": [
{
"key": "09-2021-06-2022",
"from": 1630454400000,
"from_as_string": "09-2021",
"to": 1654041600000,
"to_as_string": "06-2022",
"doc_count": 1,
"aggr": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "Dallas, TX, US|Houston, TX, US",
"doc_count": 14,
"aggr": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": 10022,
"doc_count": 14
}
]
}
}
]
}
}
]
}
}
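A response in this shape can be flattened into one row per (date range, lane, carrier) on the client side. The sketch below assumes every level is named "aggr", as in the request; the function name `flatten_buckets` is hypothetical, and the sample data is trimmed from the response shown above.

```python
def flatten_buckets(aggregations, name="aggr"):
    """Turn the three nested bucket levels into flat tuples."""
    rows = []
    for period in aggregations[name]["buckets"]:
        for lane in period[name]["buckets"]:
            for carrier in lane[name]["buckets"]:
                rows.append(
                    (period["key"], lane["key"], carrier["key"], carrier["doc_count"])
                )
    return rows


# Trimmed copy of the response above (only the keys the function reads).
response = {
    "aggregations": {
        "aggr": {
            "buckets": [
                {
                    "key": "09-2021-06-2022",
                    "doc_count": 1,
                    "aggr": {
                        "buckets": [
                            {
                                "key": "Dallas, TX, US|Houston, TX, US",
                                "doc_count": 14,
                                "aggr": {"buckets": [{"key": 10022, "doc_count": 14}]},
                            }
                        ]
                    },
                }
            ]
        }
    }
}

rows = flatten_buckets(response["aggregations"])
for row in rows:
    print(row)
```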
Thank you to all of you for your efforts and time. Do let me know if you discover any better way.

elasticsearch with range sub query in nested query

I am trying to run range filters inside a nested query.
Here is my ES mapping: there is one "id" field (long) and a nested field called "my_field" with five sub-fields in it.
{
"my_index": {
"mappings": {
"dynamic": "strict",
"properties": {
"id": {
"type": "long"
},
"my_field": {
"type": "nested",
"properties": {
"x": {
"type": "long"
},
"y": {
"type": "long"
},
"z": {
"type": "long"
},
"a": {
"type": "double"
},
"b": {
"type": "long"
}
}
}
}
}
}
}
My question is how to retrieve documents with a nested ES query that contains range sub-queries.
For example, I'm trying to get the two documents with id 11111 and id 22222, with the nested query restriction "x > 15" or "a > 0.5", and with an inner hits size limit, which is 20 here.
{
"_source": false,
"query": {
"bool": {
"must": {
"nested": {
"inner_hits": {
"size": 20
},
"path": "my_field",
"query": {
"bool": {
"should": [
{
"range": {
"x": {
"from": 15,
"include_lower": true,
"include_upper": true,
"to": null
}
}
},
{
"range": {
"a": {
"from": 0.5,
"include_lower": true,
"include_upper": true,
"to": null
}
}
}
]
}
}
}
},
"should": [
{
"term": {
"id": 11111
}
},
{
"term": {
"id": 22222
}
}
]
}
},
"timeout": "5000ms",
"track_total_hits": true
}
However, no hits are returned.
Please use dot notation in your query so that each field name includes the complete nested path, e.g.,
"range": {
"my_field.x": { "from": ... }
}
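Applied to the query from the question, that advice could look like the following Python sketch, which builds the nested clause with fully qualified field names. The helper names are hypothetical, and `gte` stands in for the `from` plus `include_lower` form used above.

```python
def qualify(path, field):
    """Prefix a sub-field with its nested path, e.g. my_field + x -> my_field.x."""
    return f"{path}.{field}"


def nested_range_query(path, lower_bounds, inner_hits_size=20):
    """Build a nested query whose range clauses use the full dotted field path."""
    should = [
        {"range": {qualify(path, field): {"gte": bound}}}
        for field, bound in lower_bounds.items()
    ]
    return {
        "nested": {
            "path": path,
            "inner_hits": {"size": inner_hits_size},
            "query": {"bool": {"should": should}},
        }
    }


query = nested_range_query("my_field", {"x": 15, "a": 0.5})
print([next(iter(clause["range"])) for clause in query["nested"]["query"]["bool"]["should"]])
```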

Getting error "No mapping found for [logdata.timestamp] in order to sort on"

I am getting the error "No mapping found for [logdata.timestamp] in order to sort on" with the following mapping:
{
"dynamic": "false",
"_meta": {
"version": 2,
"updateTimeInMs": 1607537203813
},
"properties": {
"log": {
"properties": {
"logid": {
"type": "keyword"
},
"logdata": {
"type": "text",
"index": false
},
"timestamp": {
"type": "date"
},
"version": {
"type": "integer",
"index": false,
"doc_values": false
}
}
}
}
}
I am using the following query to fetch the results.
Note: the fields logid and timestamp are indexed.
{
"from": 0,
"size": 1000,
"query": {
"bool": {
"must": [
{
"term": {
"logid": {
"value": "1",
"boost": 1.0
}
}
},
{
"range": {
"timestamp": {
"from": 1607212800,
"to": 1607299200,
"include_lower": true,
"include_upper": true,
"boost": 1.0
}
}
}
],
"adjust_pure_negative": true,
"boost": 1.0
}
},
"sort": [
{
"timestamp": {
"order": "asc"
}
}
]
}
Based on the mapping you have provided, logid and timestamp are sub-fields of log, so you should use the full paths log.logid and log.timestamp. The modified search query is:
{
"from": 0,
"size": 1000,
"query": {
"bool": {
"must": [
{
"term": {
"log.logid": {
"value": "1",
"boost": 1.0
}
}
},
{
"range": {
"log.timestamp": {
"from": 1607212800,
"to": 1607299200,
"include_lower": true,
"include_upper": true,
"boost": 1.0
}
}
}
],
"adjust_pure_negative": true,
"boost": 1.0
}
},
"sort": [
{
"log.timestamp": { <-- note this
"order": "asc"
}
}
]
}
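To check which full field names a mapping actually defines, the "properties" tree can be walked and the dotted paths collected. This is a small sketch over the mapping from the question; the helper name `field_paths` is hypothetical.

```python
def field_paths(properties, prefix=""):
    """Return {full dotted field name: type} for every leaf field in a mapping."""
    paths = {}
    for name, definition in properties.items():
        full_name = f"{prefix}{name}"
        if "properties" in definition:
            # Object field: descend and extend the prefix.
            paths.update(field_paths(definition["properties"], prefix=f"{full_name}."))
        else:
            paths[full_name] = definition.get("type")
    return paths


mapping = {
    "properties": {
        "log": {
            "properties": {
                "logid": {"type": "keyword"},
                "logdata": {"type": "text", "index": False},
                "timestamp": {"type": "date"},
                "version": {"type": "integer", "index": False, "doc_values": False},
            }
        }
    }
}

paths = field_paths(mapping["properties"])
print(sorted(paths))
```

The sortable date field shows up as log.timestamp; neither timestamp nor logdata.timestamp exists as a field name.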

How to do sorting on a field with composite aggregation in elastic search

We are using Elasticsearch version 6.8.6 and are trying to sort on a field in combination with a composite aggregation, but the results of the aggregation are not in the expected order.
This is our mapping:
{
"properties": {
"department": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256.0,
"type": "keyword"
}
}
},
"project": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256.0,
"type": "keyword"
}
}
},
"billingUnit": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256.0,
"type": "keyword"
}
}
},
"billingType": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256.0,
"type": "keyword"
}
}
},
"application": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256.0,
"type": "keyword"
}
}
},
"environmet": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256.0,
"type": "keyword"
}
}
},
"cost": {
"type": "float"
}
}
}
With the following query the sorting does not work; the results are not in alphabetical order:
{
"query": {
"bool": {
"must": [
{
"match_phrase": {
"department": {
"query": "HR",
"slop": 0,
"zero_terms_query": "NONE",
"boost": 1.0
}
}
}
],
"adjust_pure_negative": true,
"boost": 1.0
}
},
"sort": [
{
"project.keyword": {
"order": "desc"
}
}
],
"aggs": {
"TERM_RANGE": {
"composite": {
"size": 10000,
"sources": [
{
"billingUnitKey": {
"terms": {
"field": "billingUnit.keyword",
"missing_bucket": false
}
}
},
{
"billingTypeKey": {
"terms": {
"field": "billingType.keyword",
"missing_bucket": false
}
}
}
]
},
"aggregations": {
"TOTAL": {
"sum": {
"field": "cost"
}
},
"dataHits": {
"top_hits": {
"from": 0,
"size": 1,
"version": false,
"seq_no_primary_term": false,
"explain": false,
"_source": {
"includes": [
"application.keyword",
"environmet.keyword"
],
"excludes": []
},
"docvalue_fields": [
{
"field": "application.keyword"
},
{
"field": "environmet.keyword"
}
]
}
},
"paginate_bucket": {
"bucket_sort": {
"sort": [],
"from": 0,
"size": 100,
"gap_policy": "SKIP"
}
}
}
}
}
}
Sorting works fine with the following query, which has no aggregation:
{
"query": {
"match": {
"department": "HR"
}
},
"size": 100,
"sort": [
{
"project.keyword": {
"order": "desc"
}
}
]
}
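The top-level sort applies to the returned hits, not to the aggregation buckets. To control the bucket order, use the order parameter on the sources of the composite aggregation: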
https://www.elastic.co/guide/en/elasticsearch/reference/7.8/search-aggregations-bucket-composite-aggregation.html#_order
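As a rough illustration, the composite sources from the question could be generated with an explicit order on each source. The helper name `composite_sources` is hypothetical, and the sketch assumes each source is a terms source on the `.keyword` sub-field, as in the original query.

```python
def composite_sources(fields, order="asc"):
    """Build composite terms sources, each with an explicit sort order."""
    return [
        {f"{name}Key": {"terms": {"field": f"{name}.keyword", "order": order}}}
        for name in fields
    ]


aggs = {
    "TERM_RANGE": {
        "composite": {
            "size": 100,
            "sources": composite_sources(["billingUnit", "billingType"], order="desc"),
        }
    }
}

for source in aggs["TERM_RANGE"]["composite"]["sources"]:
    print(source)
```

Composite buckets are ordered by their source values, so a field such as project would need to be one of the sources for the buckets to be ordered by it.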

Need help to correctly perform wildcard search on a field

My sData.Name mapping looks as below:
{
"abc_history": {
"mappings": {
"abc-data-type": {
"sData.Name": {
"full_name": "sData.Name",
"mapping": {
"Name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
}
}
My sData.startDate mapping looks as below
{
"abc_history": {
"mappings": {
"abc-data-type": {
"sData.startDate": {
"full_name": "sData.startDate",
"mapping": {
"startDate": {
"type": "date"
}
}
}
}
}
}
}
I am trying to perform a wildcard search on sData.Name using the following query:
{
"from": 0,
"size": 20,
"query": {
"bool": {
"must":[
{"range": {"requestDate": { "gte": "2019-10-01T08:00:00.000Z" }}},
{
"wildcard": {
"sData.Name": "*Scream*"
}
}
]
}
},
"sort": [
{ "requestDate": {"order": "desc"}}
]
}
The above query returns an empty response.
How should I modify my query so that I can perform a wildcard search on sData.Name?
The response from http://{serverhost}:{port}/abc_history/_search looks like this:
{
"took": 181,
"timed_out": false,
"_shards": {
"total": 3,
"successful": 3,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": null,
"hits": [
{
"_index": "abc_history",
"_type": "abc-data-type",
"_id": "5e29cbb7965809fe6cb22a7b",
"_score": null,
"_source": {
"sData": [
{
"status": "ASSIGNED",
"Name": "CloudView abcmission Automation Support",
"startDate": "2020-01-26T20:12:57.091Z"
},
{
"status": "RESOLVED",
"Name": "DSE - Tools Engineering",
"startDate": "2020-01-27T20:12:57.091Z"
},
{
"status": "CLOSED",
"Name": "abcmission Orchestration",
"startDate": "2020-01-29T20:12:57.091Z"
},
{
"status": "ASSIGNED",
"Name": "CloudView abcmission Automation Support",
"startDate": "2020-01-29T20:19:29.687Z"
}
]
},
"sort": [
1579797431366
]
}
]
}
}
I am mainly concerned with querying sData.Name, and I want to search only the last array element. In my case that means searching only sData[3].Name; in other words, the keyword DSE should be searched only within "Name": "CloudView abcmission Automation Support".
I created the index from your input. Try running the wildcard query against the keyword sub-field:
"wildcard": {
"sData.Name.keyword": {
"wildcard": "*DSE*",
"boost": 1
}
}
The full query is:
PUT /abc_history
{
"mappings": {
"abc-data-type": {
"properties": {
"sData": {
"properties": {
"status": {
"type": "keyword"
},
"Name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
}
}
GET /abc_history/_search
{
"from": 0,
"size": 200,
"query": {
"bool": {
"filter": [
{
"bool": {
"must": [
{
"wildcard": {
"sData.Name.keyword": {
"wildcard": "*DSE*",
"boost": 1
}
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
}
}
It may
GET /abc_history/_search
{
"from": 0,
"size": 200,
"query": {
"bool": {
"filter": [
{
"bool": {
"must": [
{
"wildcard": {
"sData.Name": {
"wildcard": "*ddd*",
"boost": 1
}
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
},
"sort": [
{
"sData.startDate": {
"order": "asc"
}
}
]
}
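One likely reason the keyword sub-field behaves differently: a text field is analyzed into lowercased tokens at index time, while a keyword field keeps the whole original string. The Python sketch below only simulates that difference. It is not Elasticsearch code; the tokenizer is a crude stand-in for the standard analyzer, and it assumes the wildcard pattern itself is not analyzed, which may differ between versions.

```python
import re
from fnmatch import fnmatchcase


def rough_standard_tokens(text):
    """Crude stand-in for the standard analyzer: split on non-word chars, lowercase."""
    return [token.lower() for token in re.findall(r"\w+", text)]


def wildcard_matches_text(pattern, value):
    """A text field is matched token by token against the indexed (lowercased) terms."""
    return any(fnmatchcase(token, pattern) for token in rough_standard_tokens(value))


def wildcard_matches_keyword(pattern, value):
    """A keyword field is matched against the whole, unmodified string."""
    return fnmatchcase(value, pattern)


name = "DSE - Tools Engineering"
print(wildcard_matches_text("*DSE*", name))     # upper-case pattern vs. lowercased tokens
print(wildcard_matches_keyword("*DSE*", name))  # upper-case pattern vs. original string
print(wildcard_matches_text("*dse*", name))     # lower-case pattern vs. lowercased tokens
```

Under these assumptions, an upper-case pattern such as *DSE* or *Scream* only has a chance of matching on sData.Name.keyword, while a lower-cased pattern can match individual tokens of sData.Name.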
