Find documents per category - elasticsearch

I'm new to the Elasticsearch world.
I have run an aggregation and got the results. Now I need to see which documents are inside each category (bucket). How can I do that?

You can simply add the top_hits aggregation as a sub-aggregation of your terms aggregation (the inner "aggs" block below), like this:
{
"aggs": {
"categories": {
"terms": {
"field": "category"
},
"aggs": {
"top_category_hits": {
"top_hits": {}
}
}
}
}
}
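By default top_hits returns only the top three documents per bucket, so it usually helps to set its size explicitly. A minimal sketch in Python that builds the request body as a plain dict; the function name build_category_hits_body and its parameters are hypothetical, and the body still has to be sent to the _search endpoint with whatever client you use:

```python
import json


def build_category_hits_body(field="category", buckets=10,
                             docs_per_bucket=5, source_fields=None):
    """Build a terms aggregation with a top_hits sub-aggregation.

    All names here are illustrative; adjust them to your own mapping.
    """
    top_hits = {"size": docs_per_bucket}
    if source_fields:
        # Return only the listed fields for each hit.
        top_hits["_source"] = {"includes": list(source_fields)}
    return {
        "size": 0,  # skip the top-level hits, only the aggregation is needed
        "aggs": {
            "categories": {
                "terms": {"field": field, "size": buckets},
                "aggs": {"top_category_hits": {"top_hits": top_hits}},
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_category_hits_body(source_fields=["title"]), indent=2))
```

The documents of each bucket then appear under top_category_hits.hits.hits in the response.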

Related

How to paginate sorted data (with terms aggregation) using composite aggregation?

How can I write a pipeline aggregation to paginate sorted data, where the sorting is done by a terms aggregation ordered by its sub-aggregation?
GET index_name/_search
{
"query": {<some querying>},
"aggs": {
"pagination": {
"composite": {
"sources": [
{
"grouping": {
"terms": {
"field": "field_name.keyword",
"order": "desc"
}
}
}
]
},
"aggs": {
"results": {
"terms": {
"field": "field_name.keyword",
"order": {
"sub_aggregation": "desc"
}
},
"aggs": {
"sub_aggregation": {
"filter": {
"term": {
"field_name": "value"
}
}
}
}
}
}
}
}
}
The main problem is combining the following two sub-problems:
P1. Sorting the data by a selected key, for which I used a terms aggregation whose order refers to its sub-aggregation.
P2. Paginating the sorted data, for which I used a composite aggregation together with the terms aggregation.
When the composite aggregation is a sub-aggregation of the terms aggregation, I get the following error:
[composite] aggregation cannot be used with a parent aggregation of type: [TermsAggregatorFactory]
When I nest them the other way around, I get the paginated terms data (P2) and the sorted data (P1) in separate buckets.
How can I combine the two?
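One possible direction, offered as a sketch rather than a confirmed fix: keep the terms aggregation ordered by its sub-aggregation (P1) and page through its buckets with a bucket_sort pipeline aggregation that sets only from and size (P2), instead of nesting composite. Without a sort clause, bucket_sort only truncates the buckets, leaving them in the order the parent terms aggregation produced. The helper name build_paged_terms_body and its parameters are hypothetical, and the terms size has to cover every bucket up to the requested page, so this suits moderate cardinalities only.

```python
def build_paged_terms_body(field, filter_field, filter_value, page, page_size):
    """Terms buckets ordered by a filter sub-aggregation, paged with bucket_sort.

    `page` is zero-based. All parameter names are illustrative.
    """
    offset = page * page_size
    return {
        "size": 0,
        "aggs": {
            "results": {
                "terms": {
                    "field": field,
                    # terms must return every bucket up to the end of this page
                    "size": offset + page_size,
                    "order": {"sub_aggregation": "desc"},
                },
                "aggs": {
                    "sub_aggregation": {
                        "filter": {"term": {filter_field: filter_value}}
                    },
                    # no "sort": keep the terms order, just cut out one page
                    "page": {"bucket_sort": {"from": offset, "size": page_size}},
                },
            }
        },
    }
```

For example, build_paged_terms_body("field_name.keyword", "field_name", "value", page=2, page_size=20) asks terms for 60 buckets and keeps buckets 40 to 59.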

Get all documents from elastic search with a field having same value

Say I have documents of type Order with a field bulkOrderId. The bulkOrderId identifies a group (bulk) of orders issued at once; all orders in the group share the same ID, like this:
Order {
"bulkOrderId": "bulkOrder:12345678"
}
The ID is unique and is generated as a UUID.
How do I find groups of orders with the same bulkOrderId in Elasticsearch when the bulkOrderId is not known in advance? Is it possible?
You can achieve that using a terms aggregation and a top_hits sub-aggregation, like this:
{
"query": {
"match_all": {}
},
"aggs": {
"bulks": {
"terms": {
"field": "bulkOrderId",
"size": 10
},
"aggs": {
"orders": {
"top_hits": {
"size": 10
}
}
}
}
}
}
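Note that with both sizes set to 10, the query returns at most 10 groups with at most 10 orders each. To illustrate how the groups can be read back, here is a small Python sketch that walks the aggregation part of the response. The function name group_orders is hypothetical, and the sample response in the usage note is made up to match the usual bucket layout, not taken from a real cluster.

```python
def group_orders(response, agg_name="bulks", hits_name="orders"):
    """Map each bulkOrderId bucket key to the _source of its top hits."""
    groups = {}
    for bucket in response["aggregations"][agg_name]["buckets"]:
        hits = bucket[hits_name]["hits"]["hits"]
        groups[bucket["key"]] = [hit["_source"] for hit in hits]
    return groups


# Hypothetical response fragment, trimmed to the fields used above.
sample_response = {
    "aggregations": {
        "bulks": {
            "buckets": [
                {
                    "key": "bulkOrder:12345678",
                    "doc_count": 2,
                    "orders": {"hits": {"hits": [
                        {"_id": "1", "_source": {"bulkOrderId": "bulkOrder:12345678"}},
                        {"_id": "2", "_source": {"bulkOrderId": "bulkOrder:12345678"}},
                    ]}},
                }
            ]
        }
    }
}

if __name__ == "__main__":
    for bulk_id, orders in group_orders(sample_response).items():
        print(bulk_id, len(orders))
```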

Elasticsearch - exclude filter from aggregations

I query the following:
{
"query": {
"bool": {
"filter": {
"terms": {
"agent_id": [
"58becc297513311ad81577eb"
]
}
}
}
},
"aggs": {
"agent_id": {
"terms": {
"field": "agent_id"
}
}
}
}
I would like the aggregation to ignore this filter. In Solr there is an option to tag a filter and use the tag to exclude that filter from the facet query.
How can I do the same in Elasticsearch?
One way to approach this problem is to use post_filter.
It may be a performance concern, so if it doesn't fit your SLA, an alternative approach is to use a global bucket.
You can use post_filter in Elasticsearch. A post filter is applied to the search hits after the aggregations have been computed, so the aggregations are not affected by it. That makes it a good fit for e-commerce search, where the facet counts should not shrink as the user drills down with filters.
You can build a query like the following:
{
"aggs": {
"agent_id": {
"terms": {
"field": "agent_id",
"size": 10
}
}
},
"post_filter": {
"terms": {
"agent_id": [
"58becc297513311ad81577eb"
]
}
}
}
Thanks
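The global-bucket alternative mentioned above can be sketched as follows. A global aggregation ignores the main query entirely, so the filter can stay in query while the terms counts are computed over all documents in the index. The function name build_global_agg_body and the aggregation name all_docs are hypothetical. If other query clauses should still narrow the counts, they would need to be repeated in a filter aggregation inside the global bucket.

```python
def build_global_agg_body(field, values, buckets=10):
    """Filter the hits in `query`, but count terms over all documents."""
    return {
        "query": {"bool": {"filter": {"terms": {field: list(values)}}}},
        "aggs": {
            "all_docs": {
                # "global" takes an empty body and escapes the query scope
                "global": {},
                "aggs": {field: {"terms": {"field": field, "size": buckets}}},
            }
        },
    }
```

With this layout the counts are found under aggregations.all_docs.agent_id.buckets rather than directly under aggregations.agent_id.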

Elastic count by facets that exists only for some documents

I have a facet that exists only in some of the documents. I want to know how many documents have each possible value of the facet, and how many don't have this facet at all.
The facet is Color. My current query returns the count for the different colors, but doesn't return the count for documents without a color:
"facets": {
"_Properties": {
"terms": {
"field": "Color",
"size": 100
}
}
}
Thanks!
Facets have been deprecated in Elasticsearch in favor of aggregations. You can combine a terms aggregation with a missing aggregation for this. The query below covers your requirement:
"aggs": {
"_Properties": {
"terms": {
"field": "Color",
"size": 100
}
},
"_MissingColor": {
"missing": {
"field": "Color"
}
}
}
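To show where the two numbers end up, here is a short Python sketch that reads both aggregations from a response. The function name color_counts is hypothetical and the sample data in the usage note is invented for illustration.

```python
def color_counts(response):
    """Return ({color: count}, count_of_docs_without_color)."""
    aggs = response["aggregations"]
    per_color = {b["key"]: b["doc_count"] for b in aggs["_Properties"]["buckets"]}
    missing = aggs["_MissingColor"]["doc_count"]
    return per_color, missing


# Hypothetical response fragment, trimmed to the fields used above.
sample_response = {
    "aggregations": {
        "_Properties": {"buckets": [
            {"key": "red", "doc_count": 4},
            {"key": "blue", "doc_count": 1},
        ]},
        "_MissingColor": {"doc_count": 7},
    }
}

if __name__ == "__main__":
    per_color, missing = color_counts(sample_response)
    print(per_color, "without color:", missing)
```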

multiple metric sub aggregations situation with ElasticSearch

I am aware that Elasticsearch supports sub-aggregations under bucket aggregations (a bucket aggregation can have bucket or metric sub-aggregations), and that sub-aggregations aren't possible under metric aggregations. Maybe that makes sense, but here is the use case.
I have a terms aggregation as the parent, with another terms aggregation as its child. The child terms aggregation has a top_hits child. top_hits is a metric aggregation, so it can't take any child aggregation. Now I need to add an avg aggregation to the mix. Since top_hits is the last aggregation in the tree, I can't add avg as its child.
The following shows the desired aggregation levels (of course it's invalid, given that top_hits is a metric aggregation, and the same is true of avg):
{
"aggregations": {
"top_makes": {
"terms": {
"field": "make"
},
"aggregations": {
"top_models": {
"terms": {
"field": "model"
},
"aggregations": {
"top_res": {
"top_hits": {
"_source": {
"include": [
"model",
"color"
]
},
"size": 10
}
}
}
}
},
"aggregations": {
"avg_length": {
"avg": {
"field": "vlength"
}
}
}
}
}
}
What's the workaround or best way to address this?
I think this will work, but please verify:
{
"aggregations": {
"top_makes": {
"terms": {
"field": "make"
},
"aggregations": {
"top_models": {
"terms": {
"field": "model"
},
"aggregations": {
"top_res": {
"top_hits": {
"_source": {
"include": [
"model",
"color"
]
},
"size": 10
}
},
"avg_length": {
"avg": {
"field": "vlength"
}
}
}
}
}
}
}
}
The point is that a parent aggregation can have one or more sibling sub-aggregations.
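As an illustration of the sibling layout, the following Python sketch flattens a response into one row per make and model. It assumes avg_length sits next to top_res inside each top_models bucket; the function name summarize is hypothetical and the sample response is invented.

```python
def summarize(response):
    """Flatten make -> model buckets into rows with the average and top hits."""
    rows = []
    for make in response["aggregations"]["top_makes"]["buckets"]:
        for model in make["top_models"]["buckets"]:
            rows.append({
                "make": make["key"],
                "model": model["key"],
                # avg and top_hits are siblings, so both hang off the same bucket
                "avg_length": model["avg_length"]["value"],
                "hits": [h["_source"] for h in model["top_res"]["hits"]["hits"]],
            })
    return rows


# Hypothetical response fragment, trimmed to the fields used above.
sample_response = {
    "aggregations": {"top_makes": {"buckets": [{
        "key": "honda",
        "top_models": {"buckets": [{
            "key": "civic",
            "avg_length": {"value": 4.5},
            "top_res": {"hits": {"hits": [
                {"_source": {"model": "civic", "color": "red"}},
            ]}},
        }]},
    }]}}
}

if __name__ == "__main__":
    for row in summarize(sample_response):
        print(row["make"], row["model"], row["avg_length"], len(row["hits"]))
```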

Resources