I want to group by a property and return only the cheapest item in each group.
Given the following data in an Elasticsearch index:
Name | price | type
bear | 15 | animal
bal | 4 | toy
duck | 10 | animal
bear | 13 | animal
doll | 16 | toy
dog | 20 | animal
I would like the following result:
Name | price | type
duck | 10 | animal
bal | 4 | toy
I've tried to get such a result with the following query:
{
"aggregations": {
"aggregation_1": {
"terms": {
"field": "type.keyword",
"order": {
"price_min": "desc"
},
"size": 5
},
"aggregations": {
"price_min": {
"min": {
"field": "price"
}
}
}
}
},
"size": 10
}
But that query returns all items. Is aggregation the wrong approach to get what I want?
Use a terms aggregation with a top_hits sub-aggregation, i.e. for each product type, find the hit with the lowest price:
{
"size": 0,
"aggregations": {
"aggregation_1": {
"terms": {
"field": "type.keyword",
"size": 5
},
"aggregations": {
"cheapest": {
"top_hits": {
"size": 1,
"sort": {
"price": "asc"
}
}
}
}
}
}
}
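To check what the terms + top_hits combination should return, here is a minimal pure-Python sketch of the same grouping logic over the sample rows from the question. It does not talk to Elasticsearch, and the function name cheapest_per_type is made up for illustration.

```python
# Sample rows from the question: (name, price, type)
ROWS = [
    ("bear", 15, "animal"),
    ("bal", 4, "toy"),
    ("duck", 10, "animal"),
    ("bear", 13, "animal"),
    ("doll", 16, "toy"),
    ("dog", 20, "animal"),
]


def cheapest_per_type(rows):
    """Keep only the lowest-priced (name, price) pair for each type.

    Mirrors a terms bucket per type with a top_hits of size 1
    sorted by price ascending.
    """
    best = {}
    for name, price, kind in rows:
        if kind not in best or price < best[kind][1]:
            best[kind] = (name, price)
    return best


result = cheapest_per_type(ROWS)
for kind, (name, price) in result.items():
    print(f"{name} | {price} | {kind}")
```

In the real response, each bucket's single hit is found under the "cheapest" sub-aggregation of that bucket.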
I have data in the following format, which has duplicate values in the ID field.
ID PDE_ID Currency
1 21 USD 35
1 23 USD 34
2 25 CAD 43
3 26 INR 33
When an ID is duplicated, we need to pick the latest record by the PDE_ID column, and then run aggregations such as sum, min, max, and value_count on the result.
I tried top_hits and max, but neither supports sub-aggregations. So I can get the distinct, latest records, but cannot run any aggregation (sum/min/max/count) on top of them.
Any help is much appreciated.
For anyone who is stuck on the same problem, the query below worked for me:
GET dispute-1-2022-04-*/_search?size=0
{
"aggs": {
"Duplicates": {
"terms": {
"field": "PDE.keyword"
},
"aggs": {
"div_id": {
"terms": {
"field": "PDE_DETAIL_ID.keyword",
"order": {
"_key": "desc"
},
"size": 1
},
"aggs": {
"individual_sum": {
"sum": {
"field": "DIV_ID"
}
}
}
},
"max_cal": {
"max_bucket": {
"buckets_path": "div_id>individual_sum"
}
}
}
},
"total_min": {
"max_bucket": {
"buckets_path": "Duplicates>max_cal"
}
}
}
}
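The intent of the query (keep the latest record per ID, then aggregate over what is left) can be sketched in plain Python against the sample rows from the question. This is only an illustration of the logic, not of the Elasticsearch response; the fourth column has no header in the question, so the name "amount" is an assumption, as are the function names.

```python
# Sample rows from the question. "amount" is a hypothetical name for the
# unlabeled fourth column.
ROWS = [
    {"ID": 1, "PDE_ID": 21, "currency": "USD", "amount": 35},
    {"ID": 1, "PDE_ID": 23, "currency": "USD", "amount": 34},
    {"ID": 2, "PDE_ID": 25, "currency": "CAD", "amount": 43},
    {"ID": 3, "PDE_ID": 26, "currency": "INR", "amount": 33},
]


def latest_per_id(rows):
    """For each ID, keep the row with the highest PDE_ID."""
    latest = {}
    for row in rows:
        current = latest.get(row["ID"])
        if current is None or row["PDE_ID"] > current["PDE_ID"]:
            latest[row["ID"]] = row
    return list(latest.values())


def summarize(rows, field):
    """Compute sum, min, max and value_count over one numeric field."""
    values = [row[field] for row in rows]
    return {
        "sum": sum(values),
        "min": min(values),
        "max": max(values),
        "value_count": len(values),
    }


stats = summarize(latest_per_id(ROWS), "amount")
print(stats)
```

Note that the sketch compares PDE_ID numerically, whereas a terms aggregation on a keyword field orders its keys as strings.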
I have documents with the following mapping:
{
"id": {"type": "integer"},
"owner": {"type": "object"},
"company_id": {"type": "integer"},
"summary": {"type": "object"},
"create_date": {"type": "date"}
}
Basically, I want to filter by the owner's id and by create_date within the last 12 months, and then aggregate over the keys inside the summary objects.
Example of data I have:
id | owner | company_id | summary | create_date
01 | {"id": 1, "name": "x"} | 1 | {"data1": 2, "data2": 5, "data3": 6} | "2020-09-22T01:04:17.852112Z"
02 | {"id": 2, "name": "y"} | 2 | {"data1": 2, "data2": 5, "data4": 6} | "2020-09-17T04:11:45.851231Z"
03 | {"id": 3, "name": "z"} | 3 | {"data1": 0, "data2": 4, "data3": 6} | "2019-02-02T12:19:27.852121Z"
The output I want:
month-year | aggregate of summary keys
09-2020 (any indicator/format of month and year) |{"data1":1, "data2": 5, "data3": 6, "data4": 6}
Here I want the average of every key inside the summary object, for each month of the last 12 months.
GET data/_search
{
"size": 0, // <====== Search hits are not needed in the output, only the aggregations
"query": {
"bool": {
"filter": [
{
"range": {
"create_date": {
"gte": "now-12M" // <========== 'M' means months; 'now' is the current timestamp
}
}
},
{
"term": {
"owner.id": 4
}
}
]
}
},
"aggs": {
"NAME": { //<====== Any custom name you choose for this aggregation
"terms": { // <============ Groups documents by the field's values and returns a document count per group
"field": "summary.v1",
"size": 10 // <==== Maximum number of buckets to return
}
}
}
}
I added some explanatory comments to the query. Other important things to learn:
Queries & Filters
Terms Aggregation
Edit: Use the aggregation below in the existing query if you need monthly averages.
"aggs": {
"monthly_grouping": {
"date_histogram": {
"field": "create_date",
"calendar_interval": "month", // <===== Use "interval" on Elasticsearch versions before 7.2
"missing": "0"
},"aggs": {
"average_V1": {
"avg": {
"field": "summary.v1"
}
},
"average_V2": { //<===== Similarly add other fields if required
"avg": {
"field": "summary.v2"
}
}
}
}
}
Read about the Date Histogram aggregation in the Elasticsearch documentation.
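To make the expected numbers concrete, here is a rough pure-Python sketch of the same idea: drop documents older than a cutoff, bucket the rest by month, and average each summary key per bucket. It runs on the three sample documents from the question and skips the owner filter for brevity; the cutoff date and the function name monthly_averages are assumptions chosen for the example.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Sample documents from the question (only the fields used here).
DOCS = [
    {"summary": {"data1": 2, "data2": 5, "data3": 6},
     "create_date": "2020-09-22T01:04:17.852112Z"},
    {"summary": {"data1": 2, "data2": 5, "data4": 6},
     "create_date": "2020-09-17T04:11:45.851231Z"},
    {"summary": {"data1": 0, "data2": 4, "data3": 6},
     "create_date": "2019-02-02T12:19:27.852121Z"},
]


def monthly_averages(docs, since):
    """Average every summary key per month for docs created at or after `since`."""
    values = defaultdict(lambda: defaultdict(list))
    for doc in docs:
        created = datetime.strptime(
            doc["create_date"], "%Y-%m-%dT%H:%M:%S.%fZ"
        ).replace(tzinfo=timezone.utc)
        if created < since:
            continue  # plays the role of the range filter on create_date
        month = created.strftime("%m-%Y")  # one bucket per calendar month
        for key, value in doc["summary"].items():
            values[month][key].append(value)
    return {
        month: {key: sum(v) / len(v) for key, v in keys.items()}
        for month, keys in values.items()
    }


# Hypothetical cutoff standing in for "now-12M".
SINCE = datetime(2019, 10, 1, tzinfo=timezone.utc)
averages = monthly_averages(DOCS, SINCE)
print(averages)
```

A key that is missing from some documents (data3 and data4 here) is averaged only over the documents that contain it, which is also how the avg aggregation treats documents without the field.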
I have my products indexed as follows:
product_id | key | text | number
1 | A1 | pc | <null>
1 | A1 | mac | <null>
1 | A2 | <null> | 23
1 | A2 | <null> | 30
2 | A1 | pc | <null>
3 | A2 | <null> | 25
4 | A2 | <null> | 32
4 | A1 | linux | <null>
Now I want to find the products where
key = A1 and text is either pc or mac
key = A2 and number is between 22 and 28
This should give me product_id 1, 2 and 3, but not product_id 4, because its number is outside the range and its A1 text is not one of the selected values.
My mappings are:
text: { type: keyword, index: not_analyzed }
number: { type: float, index: not_analyzed }
product_id: { type: integer, index: not_analyzed }
key: { type: keyword, index: not_analyzed }
The following works perfectly if only text values are selected:
{
"query": {
"bool": {
"must": [
{
"term": {
"key": "A1"
}
},
{
"terms": {
"text": [
"mac",
"pc"
]
}
}
]
}
},
"aggs": {
"BUCKET_NAME": {
"terms": {
"field": "text",
"min_doc_count":2
}
}
}
}
But if I add my range, it does not work any more.
You can put both conditions in a should clause, as below, and then simply apply the aggregation.
{
"size": 0,
"query": {
"bool": {
"should": [
{
"bool": {
"must": [
{
"term": {
"key": "A1"
}
},
{
"terms": {
"text": [
"pc",
"mac"
]
}
}
]
}
},
{
"bool": {
"must": [
{
"term": {
"key": "A2"
}
},
{
"range": {
"number": {
"gte": 22,
"lte": 28
}
}
}
]
}
}
]
}
},
"aggs": {
"PRODUCT_IDs": {
"terms": {
"field": "product_id"
}
}
}
}
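The bool/should structure above is an OR of two AND groups. As a quick check against the sample rows from the question, the same predicate can be written in plain Python (a sketch only; the function name matches is made up):

```python
# Sample rows from the question: (product_id, key, text, number)
ROWS = [
    (1, "A1", "pc", None),
    (1, "A1", "mac", None),
    (1, "A2", None, 23),
    (1, "A2", None, 30),
    (2, "A1", "pc", None),
    (3, "A2", None, 25),
    (4, "A2", None, 32),
    (4, "A1", "linux", None),
]


def matches(row):
    """True if the row satisfies either of the two 'must' groups."""
    _, key, text, number = row
    # First should clause: key = A1 AND text in (pc, mac)
    text_clause = key == "A1" and text in ("pc", "mac")
    # Second should clause: key = A2 AND 22 <= number <= 28
    range_clause = key == "A2" and number is not None and 22 <= number <= 28
    return text_clause or range_clause


# Equivalent of the terms aggregation on product_id over the matching rows.
product_ids = sorted({row[0] for row in ROWS if matches(row)})
print(product_ids)
```

A product is returned as soon as one of its rows matches either group; requiring a product to match both groups at once would need a different approach, because each row is a separate document.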
I am stuck on an issue in Elasticsearch.
I am trying to get unique documents from a filtered data set and do an aggregation on top of them.
Our data set looks something like this:
ID object Property1 Property2
12 123 Test1 Fest2
23 234 Test3 Fest4
5 123 Test1 Fest2
55 123 Test2 Fest4
3 234 Test2 Fest2
I would like to filter the devices based on Property2 and aggregate (group by) on Property1 of the unique filtered records.
Could someone help me on this?
Filtering and getting unique records.
{
"size": 0,
"query": { "match": { "Property2": "Fest2" }},
"aggs": {
"Unique-Object": {
"terms": {
"field": "object.keyword",
"size": 20
},
"aggs": {
"top_uids_hits": {
"top_hits": {
"sort": [
{
"_score": {
"order": "desc"
}
}
],
"size": 1
}
}
}
}
}
}
I would like to group by Property1, as below, on the output of the above DSL. Could someone shed some light on this?
Group by Property1
"aggs": {
"Property1_count": {
"terms": {
"field": "Property1.keyword"
}
}
}
Thanks
Arun S
I think this should be your query. Try it; it nests one aggregation inside another.
See the documentation on the cardinality aggregation.
{
"_source":false,
"query": {
"match": {
"Property2": "xxxx"
}
},
"aggregations": {
"byProperty1": {
"terms": {
"field": "Property1.keyword"
},
"aggs": {
"byId": {
"cardinality": {
"field": "id"
}
}
}
}
}
}
For a Java implementation, see my other answer.
Also see the related question on grouping by multiple fields.
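The "aggregation inside an aggregation" idea (a terms bucket per Property1 with a distinct count inside each bucket) can be sketched in plain Python on the sample rows from the question. One assumption to flag: the sketch counts distinct object values, since that is the field the question deduplicates on, whereas the query above counts distinct id values. The function name is made up for illustration.

```python
from collections import defaultdict

# Sample rows from the question.
ROWS = [
    {"ID": 12, "object": "123", "Property1": "Test1", "Property2": "Fest2"},
    {"ID": 23, "object": "234", "Property1": "Test3", "Property2": "Fest4"},
    {"ID": 5, "object": "123", "Property1": "Test1", "Property2": "Fest2"},
    {"ID": 55, "object": "123", "Property1": "Test2", "Property2": "Fest4"},
    {"ID": 3, "object": "234", "Property1": "Test2", "Property2": "Fest2"},
]


def distinct_objects_by_property1(rows, property2):
    """Filter on Property2, then count distinct objects per Property1."""
    seen = defaultdict(set)
    for row in rows:
        if row["Property2"] == property2:  # the query part
            seen[row["Property1"]].add(row["object"])  # terms bucket + distinct set
    return {prop: len(objects) for prop, objects in seen.items()}


counts = distinct_objects_by_property1(ROWS, "Fest2")
print(counts)
```

Unlike this exact set-based count, the cardinality aggregation in Elasticsearch returns an approximate count, which matters mostly for high-cardinality fields.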
I have this mapping in my Elasticsearch index:
{
"my_index": {
"mappings": {
"vehicules": {
"properties": {
"name": {
"type": "text"
},
"category_id": {
"type": "integer"
},
"price": {
"type": "float"
}
}
}
}
}
}
I've inserted some demo data into this index:
+--------------+-------------+-------+
| Name | Category ID | Price |
+--------------+-------------+-------+
| Car 1 | 1 | 1500 |
| Car 2 | 1 | 4000 |
| Car 3 | 1 | 2500 |
| Motorcycle 1 | 2 | 3000 |
| Motorcycle 2 | 2 | 1400 |
| Motorcycle 3 | 2 | 2700 |
| Truck 1 | 3 | 19000 |
| Truck 2 | 3 | 15000 |
+--------------+-------------+-------+
I would like to sort all the products by price, ascending, and group the results by category. The categories themselves should be sorted by the lowest price among their child items, which gives:
{
"2": [ <= Category whose lowest price comes first (1400)
{
"name": "Motorcycle 2",
"price": 1400
},
{
"name": "Motorcycle 3",
"price": 2700
},
{
"name": "Motorcycle 1",
"price": 3000
}
],
"1": [
{
"name": "Car 1",
"price": 1500
},
{
"name": "Car 3",
"price": 2500
},
{
"name": "Car 2",
"price": 4000
}
],
"3": [
{
"name": "Truck 2",
"price": 15000
},
{
"name": "Truck 1",
"price": 19000
}
]
}
Is it possible to get this kind of result, or something close to it, with ES? I'm a complete beginner with ES and I've tried many different queries in the Kibana Dev Tools, without success.
I think I found the query to have the desired result. I'm not sure it's fully optimized, but it works.
GET my_index/my_type/_search
{
"size": 0,
"aggs": {
"grouped_by_cat": {
"terms": {
"field": "category_id",
"order": {
"min_price_aggs": "asc"
}
},
"aggs": {
"min_price_aggs": {
"min": {
"field": "price"
}
},
"list_top_hits": {
"top_hits": {
"_source": {
"includes": [
"name",
"price"
]
},
"sort": [
{
"price": {
"order": "asc"
}
}
],
"size": 10000
}
}
}
}
}
}
Does this query seem correct to you?
You can achieve the above result with the following query:
GET my_index/vehicules/_search
{
"size": 0,
"aggs" : {
"category_id_aggs": {
"terms" : {
"field" : "category_id"
},
"aggs": {
"data": {
"top_hits": {
"size": 10,
"sort": [{"price":"asc"}],
"_source": {
"includes" : ["name","price"]
}
}
}
}
}
}
}
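For reference, the output shape the question asks for (hits sorted by price inside each category, and categories ordered by their cheapest item) can be sketched in plain Python on the demo data. This only illustrates the target ordering and does not query Elasticsearch; the function name is made up. In the queries above, the ordering of categories comes from a min sub-aggregation referenced in the terms "order", as in the first query.

```python
from collections import defaultdict

# Demo data from the question: (name, category_id, price)
VEHICLES = [
    ("Car 1", 1, 1500),
    ("Car 2", 1, 4000),
    ("Car 3", 1, 2500),
    ("Motorcycle 1", 2, 3000),
    ("Motorcycle 2", 2, 1400),
    ("Motorcycle 3", 2, 2700),
    ("Truck 1", 3, 19000),
    ("Truck 2", 3, 15000),
]


def group_sorted_by_min_price(rows):
    """Return [(category_id, hits)] with hits and categories sorted by price."""
    groups = defaultdict(list)
    for name, category_id, price in rows:
        groups[category_id].append({"name": name, "price": price})
    # Like top_hits with a price-ascending sort inside each bucket.
    for hits in groups.values():
        hits.sort(key=lambda hit: hit["price"])
    # Like ordering the terms buckets by a min(price) sub-aggregation:
    # after sorting, the first hit of each group holds its minimum price.
    return sorted(groups.items(), key=lambda item: item[1][0]["price"])


grouped = group_sorted_by_min_price(VEHICLES)
for category_id, hits in grouped:
    print(category_id, [hit["name"] for hit in hits])
```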