ElasticSearch - Ordering aggregation by nested aggregation on nested field

{
"query": {
"match_all": {}
},
"from": 0,
"size": 0,
"aggs": {
"itineraryId": {
"terms": {
"field": "iid",
"size": 2147483647,
"order": [
{
"price>price>price.max": "desc"
}
]
},
"aggs": {
"duration": {
"stats": {
"field": "drn"
}
},
"price": {
"nested": {
"path": "prl"
},
"aggs": {
"price": {
"filter": {
"terms": {
"prl.cc.keyword": [
"USD"
]
}
},
"aggs": {
"price": {
"stats": {
"field": "prl.spl.vl"
}
}
}
}
}
}
}
}
}
}
Here, I am getting the error:
"Invalid terms aggregation order path [price>price>price.max]. Terms
buckets can only be sorted on a sub-aggregator path that is built out
of zero or more single-bucket aggregations within the path and a final
single-bucket or a metrics aggregation at the path end. Sub-path
[price] points to non single-bucket aggregation"
The query works fine if I order by the duration aggregation instead:
"order": [
{
"duration.max": "desc"
}
]
So is there any way to order a terms aggregation by a nested aggregation on a nested field, i.e. something like the following?
"order": [
{
"price>price>price.max": "desc"
}
]

As Val pointed out in the comments, Elasticsearch does not support this yet.
Until then, you can aggregate on the nested field first and then use the reverse nested aggregation to aggregate the duration, which lives at the root of the document.
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-reverse-nested-aggregation.html
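A rough sketch of the shape this suggestion could take, using the field names from the question. The aggregation names usd_prices, price_stats and back_to_root are made up for illustration. The sketch only shows how reverse_nested steps back from the nested prl documents to the root document to reach drn; it does not by itself reproduce the per-iid buckets or their ordering.

```json
{
  "size": 0,
  "aggs": {
    "price": {
      "nested": { "path": "prl" },
      "aggs": {
        "usd_prices": {
          "filter": { "term": { "prl.cc.keyword": "USD" } },
          "aggs": {
            "price_stats": { "stats": { "field": "prl.spl.vl" } },
            "back_to_root": {
              "reverse_nested": {},
              "aggs": {
                "duration": { "stats": { "field": "drn" } }
              }
            }
          }
        }
      }
    }
  }
}
```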

Related

ELASTICSEARCH - How can i get an aggregation in a boolean field?

I want to run an aggregation on a boolean field, but the output is an error:
query:
"""
{
"size": 0,
"aggs": {
"RecentCreated": {
"terms": {
"field": "created_at.keyword",
"order": {
"_key": "desc"
},
"size": 1
},
"aggs": {
"nestedData": {
"nested": {
"path": "data.add.serv"
},
"aggs": {
"NAME": {
"terms": {
"field": "data.add.serv.beast",
"include": true
}
}
}
}
}
}
}
}
"""
error:
"type" : "x_content_parse_exception",
"reason" : "[terms] include doesn't support values of type: VALUE_BOOLEAN"
I have read that it is possible to convert the true values into 1 with a script to get a count in the aggregation, but I cannot get a result for the true values.
How can I get a count of the documents where the boolean field is true?
I think what you might want to do is use a filter aggregation over your nested document rather than a terms aggregation. So in short change this bit of your query:
"aggs": {
"NAME": {
"terms": {
"field": "data.add.serv.beast",
"include": true
}
}
}
to
"aggs": {
"NAME": {
"filter": {
"term": {
"data.add.serv.beast": true
}
}
}
}
I'm not too familiar with nested aggregations, so there might still be an error in my syntax. The main point is to use a filter aggregation rather than a terms aggregation; hopefully that works for you.
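For reference, this is roughly what the whole request from the question would look like with that change applied. It is a sketch assembled from the two snippets above and has not been run against the asker's index.

```json
{
  "size": 0,
  "aggs": {
    "RecentCreated": {
      "terms": {
        "field": "created_at.keyword",
        "order": { "_key": "desc" },
        "size": 1
      },
      "aggs": {
        "nestedData": {
          "nested": { "path": "data.add.serv" },
          "aggs": {
            "NAME": {
              "filter": {
                "term": { "data.add.serv.beast": true }
              }
            }
          }
        }
      }
    }
  }
}
```

The doc_count of the NAME bucket is then the number of nested data.add.serv documents whose beast field is true.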

Elasticsearch: Aggregate all unique values of a field and apply a condition or filter by another field

My documents look like this:
{
"ownID": "Val_123",
"parentID": "Val_456",
"someField": "Val_78",
"otherField": "Val_90",
...
}
I am trying to get all (unique, as in one instance) results for a list of ownID values, while filtering by a list of parentID values and vice-versa.
What I have done so far:
Get (separate!) unique values for ownID and parentID in key1 and key2:
{
"size": 0,
"aggs": {
"key1": {
"terms": {
"field": "ownID",
"include": {
"partition": 0,
"num_partitions": 10
},
"size": 100
}
},
"key2": {
"terms": {
"field": "parentID",
"include": {
"partition": 0,
"num_partitions": 10
},
"size": 100
}
}
}
}
Use a filter to get (some) results matching either ownID OR parentID:
{
"size": 0,
"query": {
"bool": {
"should": [
{
"terms": {
"ownID": ["Val_1","Val_2","Val_3"]
}
},
{
"terms": {
"parentID": ["Val_8","Val_9"]
}
}
]
}
},
"aggs": {
"my_filter": {
"top_hits": {
"size": 30000,
"_source": {
"include": ["ownID", "parentID","otherField"]
}
}
}
}
}
However, I need to get separate results for each filter in the second query, and get:
(1) the parentID of the documents matching some value of ownID
(2) the ownID for the documents matching some value of parentID.
So far I managed to do it using two similar queries (see below for (1)), but I would ideally want to combine them and query only once.
{
"size": 0,
"query": {
"bool": {
"should": [
{
"terms": {
"ownID": [ "Val_1", "Val_2", "Val_3" ]
}
}
]
}
},
"aggs": {
"my_filter": {
"top_hits": {
"size": 30000,
"_source": {
"include": "parentID"
}
}
}
}
}
I'm using Elasticsearch version 5.2
If I understand your question correctly, you want the aggregation counts to be computed irrespective of the filter, while the search hits contain only the filtered documents. Elasticsearch has a separate kind of filter for this, the post filter: https://www.elastic.co/guide/en/elasticsearch/reference/5.5/search-request-post-filter.html
It is simple: it filters the hits after the aggregations have been computed.
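A minimal sketch of how that could look with the fields from the question. The hit size of 100 and the choice to filter on ownID are assumptions for illustration: the key1 and key2 aggregations are computed over every document matched by the query (here, all documents, since no query is given), while the returned hits are restricted by post_filter.

```json
{
  "size": 100,
  "aggs": {
    "key1": { "terms": { "field": "ownID", "size": 100 } },
    "key2": { "terms": { "field": "parentID", "size": 100 } }
  },
  "post_filter": {
    "terms": { "ownID": ["Val_1", "Val_2", "Val_3"] }
  }
}
```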

Need aggregation of only the query results

I need to run an aggregation over only the limited results I get from the query, but it is not working: the aggregation includes documents beyond the size limit of the query. Here is the query I am running:
{
"size": 500,
"query": {
"bool": {
"must": [
{
"term": {
"tags.keyword": "possiblePurchase"
}
},
{
"term": {
"clientName": "Ci"
}
},
{
"range": {
"firstSeenDate": {
"gte": "now-30d"
}
}
}
],
"must_not": [
{
"term": {
"tags.keyword": "skipPurchase"
}
}
]
}
},
"sort": [
{
"firstSeenDate": {
"order": "desc"
}
}
],
"aggs": {
"byClient": {
"terms": {
"field": "clientName",
"size": 25
},
"aggs": {
"byTarget": {
"terms": {
"field": "targetName",
"size": 6
},
"aggs": {
"byId": {
"terms": {
"field": "id",
"size": 5
}
}
}
}
}
}
}
}
I need the aggregations to consider only the first 500 results of the query, sorted by the field I specify in the query. I am completely lost. Thanks for the help.
The scope of an aggregation is the full set of documents matching your query; the size parameter only controls how many hits are fetched and returned.
If you want to restrict the aggregation to the first n hits of a query, I would suggest the sampler aggregation in combination with your query.
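A hedged sketch of that suggestion, with the question's query shortened and only the first aggregation level kept. The aggregation name sample is made up. Two caveats apply: the sampler aggregation picks the top-scoring documents, not the first documents in the requested firstSeenDate sort order, and shard_size is applied per shard, so on an index with several shards the sample can contain more than 500 documents in total.

```json
{
  "size": 500,
  "query": {
    "bool": {
      "must": [
        { "term": { "tags.keyword": "possiblePurchase" } },
        { "range": { "firstSeenDate": { "gte": "now-30d" } } }
      ]
    }
  },
  "aggs": {
    "sample": {
      "sampler": { "shard_size": 500 },
      "aggs": {
        "byClient": {
          "terms": { "field": "clientName", "size": 25 }
        }
      }
    }
  }
}
```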

How to mention from and size for the first level of elastic search aggregation in nested aggregation?

I have written a query that gets buckets based on id and then sorts them. This works fine. But how do I make it return only the buckets from position 100 to 200 for the aggregation_by_id aggregation?
{
"query": {
"match_all": {}
},
"size": 0,
"aggregations": {
"aggregation_by_id": {
"terms": {
"field": "id.keyword",
"size": 200
},
"aggs": {
"sort_timestamp": {
"top_hits": {
"sort": [{
"timestamp": {
"order": "desc",
"unmapped_type": "long"
}
}],
"size": 1
}
}
}
}
}
}

Paging the top_hits aggregation in ElasticSearch

Right now I'm doing a top_hits aggregation in Elasticsearch that groups my data by a field, sorts each group by a date, and picks the top 1.
I need to page these aggregation results so that I can pass in a pageSize and a pageNumber, but I don't know how.
In addition, I need the total number of results of this aggregation so we can show it in a table in our web interface.
The aggregation looks like this:
POST my_index/_search
{
"size": 0,
"aggs": {
"top_artifacts": {
"terms": {
"field": "artifactId.keyword"
},
"aggs": {
"top_artifacts_hits": {
"top_hits": {
"size": 1,
"sort": [{
"date": {
"order": "desc"
}
}]
}
}
}
}
}
}
If I understand what you want, you should be able to paginate with a Composite Aggregation. You can still pass a size parameter for the page size, but instead of a from offset you pass the key of the last bucket you received.
POST my_index/_search
{
"size": 0,
"aggs": {
"top_artifacts": {
"composite": {
"sources": [
{
"artifact": {
"terms": {
"field": "artifactId.keyword"
}
}
}
],
"size": 1, // OPTIONAL SIZE (How many buckets)
"after": {
"artifact": "FOO_BAZ" // Buckets after this bucket key
}
},
"aggs": {
"hits": {
"top_hits": {
"size": 1,
"sort": [
{
"timestamp": {
"order": "desc"
}
}
]
}
}
}
}
}
}
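For the first page there is no previous bucket key, so a sketch of the initial request simply leaves out after. The page size of 20 is an assumption, and the sort uses the date field from the question.

```json
{
  "size": 0,
  "aggs": {
    "top_artifacts": {
      "composite": {
        "size": 20,
        "sources": [
          { "artifact": { "terms": { "field": "artifactId.keyword" } } }
        ]
      },
      "aggs": {
        "hits": {
          "top_hits": {
            "size": 1,
            "sort": [ { "date": { "order": "desc" } } ]
          }
        }
      }
    }
  }
}
```

The composite aggregation in the response carries an after_key object; copying it into after in the next request fetches the following page.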
