I'm grouping by offerId, and each offerId bucket has two sub-buckets: price <= 0 and price > 0. I need to make sure that the price <= 0 bucket also includes documents where the price field is missing:
{
"size": 0,
"aggs": {
"by_offer_id": {
"terms": {
"field": "offerId"
},
"aggs": {
"by_price": {
"range": {
"field": "price",
"ranges": [
{
"to": 0
},
{
"from": 0
}
]
},
"aggs": {
"price_stats": {
"stats": {
"field": "price"
}
}
}
}
}
}
}
}
I've tried adding "missing": 0 after "field": "price", but it throws a SearchPhaseExecutionException.
I'm using 1.7.5, but could potentially use syntax from 2.4.x.
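For context, the missing parameter only appeared in Elasticsearch 2.0, which is why 1.7.5 rejects it. If you can use 2.4.x syntax, the sub-aggregation would presumably look like this (a sketch; worth double-checking that the range aggregation honors missing in that exact version):
{
  "by_price": {
    "range": {
      "field": "price",
      "missing": 0,
      "ranges": [
        { "to": 0 },
        { "from": 0 }
      ]
    }
  }
}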
In this particular case I don't even need to set "missing": 0,
{
"size": 0,
"aggs": {
"by_offer_id": {
"terms": {
"field": "offerId"
},
"aggs": {
"price_stats": {
"stats": {
"field": "price"
}
}
}
}
}
}
because the terms aggregation returns the total document count, while the stats aggregation only includes documents with an existing price and reports that count. I can deduce how many documents don't have a price field by subtraction.
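For illustration, with made-up numbers: if one offerId bucket came back as below, the bucket matched 10 documents while the stats counted only 7 prices, so 3 documents lack the price field.
{
  "key": "offer-1",
  "doc_count": 10,
  "price_stats": {
    "count": 7,
    "min": -2.0,
    "max": 49.99,
    "avg": 15.0,
    "sum": 105.0
  }
}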
I think you should use a script, like this (note that Painless requires Elasticsearch 5.0+; on 1.x/2.x you'd write the equivalent in Groovy):
{
"size": 0,
"aggs": {
"by_offer_id": {
"terms": {
"field": "offerId"
},
"aggs": {
"by_price": {
"range": {
"script": {
"lang": "painless",
"source": "doc['price'].value ==null ? 0 : doc['price'].value"
},
"ranges": [
{
"to": 0
},
{
"from": 0
}
]
},
"aggs": {
"price_stats": {
"stats": {
"field": "price"
}
}
}
}
}
}
}
}
or, if every document is guaranteed to have a price value, simply:
"source": "doc['price'].value * 1"
Related
I have a data structure in Elasticsearch that looks like:
{
"name": "abc",
"date": "2022-10-08T21:30:40.000Z",
"rank": 3
}
I want to get, for each unique name, the rank of the document (or the whole document) with the most recent date.
I currently have this:
"aggs": {
"group-by-name": {
"terms": {
"field": "name"
},
"aggs": {
"max-date": {
"max": {
"field": "date"
}
}
}
}
}
How can I get the rank (or the whole document) for each result, if possible in one request?
You can use one of the options below.
Collapse
"collapse": {
"field": "name"
},
"sort": [
{
"date": {
"order": "desc"
}
}
]
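As a complete request this looks roughly like the following (a sketch; my-index is a placeholder, and it assumes name is a keyword field and date is a date field):
POST my-index/_search
{
  "collapse": {
    "field": "name"
  },
  "sort": [
    {
      "date": {
        "order": "desc"
      }
    }
  ]
}
Each hit is then the most recent document for its name, so the rank comes back directly in _source without any aggregation.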
Top hits aggregation
{
"aggs": {
"group-by-name": {
"terms": {
"field": "name",
"size": 100
},
"aggs": {
"top_doc": {
"top_hits": {
"sort": [
{
"date": {
"order": "desc"
}
}
],
"size": 1
}
}
}
}
}
}
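With the top_hits variant, each group-by-name bucket carries its newest document, and the rank can be read from the nested hit. A hypothetical bucket for the sample document above would come back as:
{
  "key": "abc",
  "doc_count": 12,
  "top_doc": {
    "hits": {
      "hits": [
        {
          "_source": {
            "name": "abc",
            "date": "2022-10-08T21:30:40.000Z",
            "rank": 3
          }
        }
      ]
    }
  }
}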
I'm using ES v7.3. As per my requirements, I am aggregating some fields to fetch the required docs in the response. There is a further requirement to also fetch the total count of all such docs that contain the nested field qualifying the aggregation condition described below, but I have not found a way to do that.
The aggregation query that I am currently using to fetch the documents is:
"aggs": {
"users": {
"composite": {
"sources": [
{
"users": {
"terms": {
"field": "co_profileId.keyword"
}
}
}
],
"size": 5000
},
"aggs": {
"sessions": {
"nested": {
"path": "co_score"
},
"aggs": {
"last_4_days": {
"filter": {
"range": {
"co_score.sessionTime": {
"gte": "2021-01-10T00:00:31.399Z",
"lte": "2021-01-14T01:37:31.399Z"
}
}
},
"aggs": {
"score_count": {
"sum": {
"field": "co_score.value"
}
}
}
}
}
},
"page_view_count_filter": {
"bucket_selector": {
"buckets_path": {
"sessionCount": "sessions > last_4_days > score_count"
},
"script": "params.sessionCount > 100"
}
},
"filtered_users": {
"top_hits": {
"size": 1,
"_source": {
"includes": [
"co_profileId",
"co_type",
"co_score"
]
}
}
}
}
}
}
Sample doc:
{
"co_profileId": "14654325",
"co_type": "identify",
"co_updatedAt": "2021-01-11T11:37:33.499Z",
"co_score": [
{
"value": 3,
"sessionTime": "2021-01-09T01:37:31.399Z"
},
{
"value": 3,
"sessionTime": "2021-01-10T10:47:33.419Z"
},
{
"value": 6,
"sessionTime": "2021-01-11T11:37:33.499Z"
}
]
}
I'm new to Elasticsearch, so this question could be quite trivial for you, but here I go:
I'm using kibana_sample_data_ecommerce, whose documents have a mapping like this:
{
...
"order_date" : <datetime>
"taxful_total_price" : <double>
...
}
I want to get the basic daily behavior of the data, expecting documents like this:
[
{
"qtime" : "00:00",
"mean" : 20,
"std" : 40
},
{
"qtime" : "01:00",
"mean" : 150,
"std" : 64
},
...
]
So, the process I think I need is:
Group all records by day ->
Group by time window within each day ->
Sum all records in each time window ->
Cumulative sum of the sums by time window, thus I get the behavior of a day ->
Extended_stats over the same time window across all days
I can express most of that pipeline as a query, but I can't unwrap those buckets to compute the statistics across days. Can you give me some advice on how to do that operation and get that result?
Here is my current query (Kibana Dev Tools):
POST kibana_sample_data_ecommerce/_search
{
"size": 0,
"query": {
"bool": {
"must": [
{
"range": {
"order_date": {
"gt": "now-1M",
"lte": "now"
}
}
}
]
}
},
"aggs": {
"day_histo": {
"date_histogram": {
"field": "order_date",
"calendar_interval": "day"
},
"aggs": {
"qmin_histo": {
"date_histogram": {
"field": "order_date",
"calendar_interval": "hour"
},
"aggs": {
"qminute_sum": {
"sum": {
"field": "taxful_total_price"
}
},
"cumulative_qminute_sum": {
"cumulative_sum": {
"buckets_path": "qminute_sum"
}
}
}
}
}
}
}
}
Here's how you pull off the extended stats:
{
"size": 0,
"query": {
"bool": {
"must": [
{
"range": {
"order_date": {
"gt": "now-4M",
"lte": "now"
}
}
}
]
}
},
"aggs": {
"by_day": {
"date_histogram": {
"field": "order_date",
"calendar_interval": "day"
},
"aggs": {
"by_hour": {
"date_histogram": {
"field": "order_date",
"calendar_interval": "hour"
},
"aggs": {
"by_taxful_total_price": {
"extended_stats": {
"field": "taxful_total_price"
}
}
}
}
}
}
}
}
yielding one extended_stats bucket per hour within each day.
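As a hypothetical illustration (all values made up): an hourly bucket containing three orders of 10.0, 20.0, and 30.0 would come back shaped like this:
{
  "key_as_string": "2023-03-01T10:00:00.000Z",
  "doc_count": 3,
  "by_taxful_total_price": {
    "count": 3,
    "min": 10.0,
    "max": 30.0,
    "avg": 20.0,
    "sum": 60.0,
    "sum_of_squares": 1400.0,
    "variance": 66.66666666666667,
    "std_deviation": 8.16496580927726
  }
}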
Here is my query:
GET _search
{
"size": 0,
"query": {
"bool": {
"must": [
{
"match": {
"serviceName.keyword": "directory-view-service"
}
},
{
"match": {
"path": "thewall"
}
},
{
"range": {
"#timestamp": {
"from": "now-31d",
"to": "now"
}
}
}
]
}
},
"aggs": {
"by_day": {
"date_histogram": {
"field": "date",
"interval": "7d"
},
"aggs": {
"byUserUid": {
"terms": {
"field": "token_userId.keyword",
"size": 150000
},
"aggs": {
"filterByCallNumber": {
"bucket_selector": {
"buckets_path": {
"doc_count": "_count"
},
"script": {
"inline": "params.doc_count <= 1"
}
}
}
}
}
}
}
}
}
I want my query to return all users who called my endpoint at least once within a 1-month range, bucketed in 7-day intervals; up to that point everything works.
But the result is a bucket array with 370 elements, and I only need to know the array's size...
Is there a keyword for this, or how else can I handle it?
Thanks
I actually want to aggregate all the values of a field in my index whose length is greater than 6, within some date range.
I could fetch all values of the field, grouped by that keyword. Now, I want to add a condition that checks whether the keyword length is more than 6.
Here is the query I have come up with so far:
"size": 0,
"aggs": {
"range":{
"date_range": {
"field": "timestamp",
"ranges": [
{
"from": "now-1d/d",
"to": "now"
}
]
},
"aggs": {
"group_by_name":{
"terms": {
"field": "name.keyword",
"size": 100
}
}
}
}
}
}
You can do this using a simple Painless script. Check out the terms aggregation docs.
{
"size": 0,
"aggs": {
"range": {
"date_range": {
"field": "timestamp",
"ranges": [
{
"from": "now-1d/d",
"to": "now"
}
]
},
"aggs": {
"group_by_name": {
"terms": {
"script": {
"source": """
if (doc['name.keyword'].value.toString().length() > 6) {
return doc['name.keyword'].value;
}
""",
"lang": "painless"
},
"size": 100
}
}
}
}
}
}
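One caveat with this script: if a document has no name.keyword value at all, doc['name.keyword'].value will throw an error. A slightly safer variant (a sketch, same field name assumed) guards for that first:
"script": {
  "lang": "painless",
  "source": """
    if (doc['name.keyword'].size() > 0 && doc['name.keyword'].value.length() > 6) {
      return doc['name.keyword'].value;
    }
  """
}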