How to paginate sorted data (with terms aggregation) using composite aggregation?

How do I write a pipeline aggregation to paginate sorted data, where the sorting is done by a terms aggregation ordered by one of its sub-aggregations?
GET index_name/_search
{
"query": {<some querying>},
"aggs": {
"pagination": {
"composite": {
"sources": [
{
"grouping": {
"terms": {
"field": "field_name.keyword",
"order": "desc"
}
}
}
]
},
"aggs": {
"results": {
"terms": {
"field": "field_name.keyword",
"order": {
"sub_aggregation": "desc"
}
},
"aggs": {
"sub_aggregation": {
"filter": {
"term": {
"field_name": "value"
}
}
}
}
}
}
}
}
}
The main problem is combining the following two sub-problems:
P1. Sorting the data by a selected key. For this I used a terms aggregation whose order refers to its sub-aggregation.
P2. Paginating the sorted data. For this I used a composite aggregation with a terms source.
When the composite aggregation is a sub-aggregation of the terms aggregation, I get the following error:
[composite] aggregation cannot be used with a parent aggregation of type: [TermsAggregatorFactory]
When I nest them the other way round, I get the paginated terms data (P2) and the sorted data (P1) in separate buckets.
How can I combine the two?

Related

Deduplicate and perform composite aggregation on deduced result

I have an index in Elasticsearch that contains daily transaction data. Each document has the following main fields:
TxnId, Status, TxnType, userId
Two documents can have the same TxnId.
I'm looking for a query that aggregates over Status and TxnType for unique TxnIds. Basically, I'm looking for something like: select unique txnIds from user_table group by status, txnType.
I have one ES query that deduplicates on TxnId, and another that performs a composite aggregation on status and txnType. I want to do both in a single query.
I tried the collapse feature, and I also tried the cardinality and dedup features, but the query does not give the correct output:
{
"size": 0,
"query": {
"bool": {
"filter": [
{
"term": {
"streamSource": 3
}
}
]
}
},
"collapse": {
"field": "txnId"
},
"aggs": {
"buckets": {
"composite": {
"size": 30,
"sources": [
{
"status": {
"terms": {
"field": "status"
}
}
},
{
"txnType": {
"terms": {
"field": "txnType"
}
}
}
]
}
}
}
}

Is it possible to paginate term aggregation result with search term?

Is it possible to use pagination in a terms aggregation query with a search term?
I need to paginate the result of the following query, but I am not able to find a solution.
{
"sort": [{
"create_date": {
"order": "desc"
}
}],
"query": {
"bool": {
"must": []
}
},
"aggs": {
"genres": {
"terms": {
"field": "mentions.keyword",
"include": "insta.*"
}
}
}
}
You could use from and size to tell the engine which range of documents to return each time you come back for the next page. Keep two variables in your service design, and have whoever calls the service pass their values (the starting offset and the limit). Note that from and size page through the matching documents (hits); they do not change which buckets the terms aggregation returns.
{
"from": from,
"size": limit,
"sort": [{
"create_date": {
"order": "desc"
}
}],
"query": {
"bool": {
"must": []
}
},
"aggs": {
"genres": {
"terms": {
"field": "mentions.keyword",
"include": "insta.*"
}
}
}
}
If you expose this query through a service, for example mysearch, call the service like this:
mysearch?searchTerm=theWord&from=0&limit=15
For the next call, do the same with a different from value (from is a zero-based offset, so the second page of 15 starts at 15):
mysearch?searchTerm=theWord&from=15&limit=15
If this information is not enough, please post some sample documents to play with.
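As a rough illustration of the from/limit bookkeeping described above, here is a minimal Python sketch that turns a zero-based page number into the request body. The helper name build_page_request and the page parameter are hypothetical and not part of the original answer; the query body is copied from the question.

```python
def build_page_request(page, limit):
    """Build the search body for a zero-based page number (hypothetical helper)."""
    if page < 0 or limit <= 0:
        raise ValueError("page must be >= 0 and limit must be > 0")
    return {
        # "from" is a zero-based offset into the sorted hits
        "from": page * limit,
        "size": limit,
        "sort": [{"create_date": {"order": "desc"}}],
        "query": {"bool": {"must": []}},
        "aggs": {
            "genres": {
                "terms": {"field": "mentions.keyword", "include": "insta.*"}
            }
        },
    }
```

With a limit of 15, page 0 maps to from=0 and page 1 maps to from=15. This only pages the hits; the genres buckets are the same on every page.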
If you are trying to page through the buckets of a terms aggregation, you can use either of two options:
Option 1: In a terms aggregation, you can use partitions to paginate the data.
Option 2: You can use a composite aggregation. In a composite aggregation you can only access the data sequentially using the after key; you won't be able to jump between pages.
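To make the sequential access concrete, here is a minimal Python sketch of the after key mechanics. It only builds request bodies and reads the key out of a response dict; sending the request is left to whatever client you use. The function names, the aggregation name genres, and the source name mention are assumptions for illustration. The include pattern from the question is left out so the sketch shows only the paging part.

```python
def composite_request(field, size, after_key=None):
    """Build a composite aggregation body; pass after_key to get the next page."""
    composite = {
        "size": size,
        "sources": [{"mention": {"terms": {"field": field}}}],
    }
    if after_key is not None:
        # "after" resumes right after the last bucket of the previous page
        composite["after"] = after_key
    return {"size": 0, "aggs": {"genres": {"composite": composite}}}


def next_after_key(response, agg_name="genres"):
    """Return the after_key of a composite response, or None if it has none."""
    return response["aggregations"][agg_name].get("after_key")
```

A caller would send composite_request(field, size), read next_after_key from the response, and pass it into the next composite_request, repeating until a page comes back with no buckets. There is no way to ask for "page 7" directly; you have to walk the pages in order.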

how to order on doc count for terms aggregation within a composite aggregation?

I was trying the composite aggregation in Elasticsearch and found it weird that something I can normally do within a terms aggregation isn't supported for terms within a composite aggregation.
See the query below :
GET _search
{
"size": 0,
"query": {
"match_all": {}
},"aggs": {
"compo": {
"composite": {
"sources": [
{
"terms_inside": {
"terms": {
"field": "result_type",
"order": {
"_count": "asc" // not supported here!
}
}
}
}
]
}
},
"just_terms" :{
"terms": {
"field": "result_type",
"order": {
"_count": "asc" // supported here
}
}
}
}
}
Is this just the way it is, or is there a way to get buckets sorted by doc count with a nested terms aggregation? I want to use paging and sorting on the terms aggregation.
It cannot be done. The composite aggregation paginates its results, so it is designed not to compute the counts for all buckets, only for those in the current page.
https://discuss.elastic.co/t/composite-aggregation-order-by/139563/5
Before Elasticsearch 7.12, you cannot aggregate on multiple terms and order by doc_count. From Elasticsearch 7.12 onward, you can use a multi_terms aggregation.
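If the number of buckets is small enough, one possible workaround (not from the answers above, just a sketch) is to walk every composite page and sort by doc_count on the client. In the Python below, fetch_page is a hypothetical callable that you would supply: it takes the previous after key (or None) and returns the composite section of the response, that is, a dict with buckets and optionally after_key.

```python
def collect_and_sort(fetch_page, ascending=True):
    """Collect all composite buckets via fetch_page, then sort by doc_count."""
    buckets = []
    after = None
    while True:
        agg = fetch_page(after)
        page = agg.get("buckets", [])
        if not page:
            break  # an empty page means every bucket has been seen
        buckets.extend(page)
        after = agg.get("after_key")
        if after is None:
            break
    return sorted(buckets, key=lambda b: b["doc_count"], reverse=not ascending)
```

This reads every bucket before it can return the first sorted one, which is exactly the work the composite aggregation avoids, so it is only reasonable for fields with a modest number of distinct values.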

Get all documents from elastic search with a field having same value

Say I have documents of type Order, and they have a field bulkOrderId. The bulkOrderId represents a group (bulk) of orders issued at once. All orders in a group have the same ID, like this:
Order {
"bulkOrderId": "bulkOrder:12345678"
}
The ID is unique and is generated using a UUID.
How do I find groups of orders with the same bulkOrderId in Elasticsearch when the bulkOrderId is not known? Is it possible?
You can achieve that using a terms aggregation and a top_hits sub-aggregation, like this:
{
"query": {
"match_all": {}
},
"aggs": {
"bulks": {
"terms": {
"field": "bulkOrderId",
"size": 10
},
"aggs": {
"orders": {
"top_hits": {
"size": 10
}
}
}
}
}
}
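To show how the result of that query can be consumed, here is a small Python sketch that turns the response into a mapping from bulkOrderId to its orders. The function name group_orders is hypothetical; the aggregation names bulks and orders are the ones used in the query above.

```python
def group_orders(response):
    """Map each bulkOrderId bucket key to the _source of its top_hits documents."""
    groups = {}
    for bucket in response["aggregations"]["bulks"]["buckets"]:
        hits = bucket["orders"]["hits"]["hits"]
        groups[bucket["key"]] = [hit["_source"] for hit in hits]
    return groups
```

Keep in mind that with the sizes in the query, only the 10 largest groups are returned, and at most 10 orders per group.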

multiple metric sub aggregations situation with ElasticSearch

I am aware that Elasticsearch supports sub-aggregations under bucket aggregations (a bucket aggregation can have bucket or metric sub-aggregations), and that sub-aggregations aren't possible under metric aggregations. Maybe that makes sense, but here is my use case.
I have a terms aggregation as the parent, with another terms aggregation as its child. The child terms aggregation has a child aggregation of type top_hits. top_hits is a metric aggregation, so it can't take any child aggregation. Now I need to include an avg aggregation in the mix. Since top_hits is the last aggregation in the tree and is a metric aggregation, avg can't be added as its child.
The following is the desired aggregation structure. (Of course it's invalid, since top_hits is a metric aggregation, and the same is true of avg.)
{
"aggregations": {
"top_makes": {
"terms": {
"field": "make"
},
"aggregations": {
"top_models": {
"terms": {
"field": "model"
},
"aggregations": {
"top_res": {
"top_hits": {
"_source": {
"include": [
"model",
"color"
]
},
"size": 10
}
}
}
}
},
"aggregations": {
"avg_length": {
"avg": {
"field": "vlength"
}
}
}
}
}
}
What's the workaround or best way to address this?
I think this will work; please verify:
{
"aggregations": {
"top_makes": {
"terms": {
"field": "make"
},
"aggregations": {
"top_models": {
"terms": {
"field": "model"
},
"aggregations": {
"top_res": {
"top_hits": {
"_source": {
"include": [
"model",
"color"
]
},
"size": 10
}
},
"avg_length": {
"avg": {
"field": "vlength"
}
}
}
}
}
}
}
}
The point is that a parent aggregation can have one or more siblings as sub-aggregations.
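For reading the result, here is a minimal Python sketch that flattens the response into one row per make and model. It assumes avg_length sits next to top_res as a sub-aggregation of top_models, so each model bucket carries both; the function name summarize is hypothetical.

```python
def summarize(response):
    """Flatten make/model buckets into rows with the average and the top hits."""
    rows = []
    for make in response["aggregations"]["top_makes"]["buckets"]:
        for model in make["top_models"]["buckets"]:
            rows.append({
                "make": make["key"],
                "model": model["key"],
                # sibling metric aggregations appear side by side in the bucket
                "avg_length": model["avg_length"]["value"],
                "top": [hit["_source"] for hit in model["top_res"]["hits"]["hits"]],
            })
    return rows
```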

Resources