Elasticsearch Aggregation Query Builder - elasticsearch

I am new to working with aggregations in Elasticsearch, and I am on version 5.3. I have a query that does a terms aggregation together with a filter, but when I try to build the same query with AggregationBuilders, I can't get it to look the same as the manual query. The manual query works and it is:
{
"size": 0,
"aggs": {
"data": {
"nested": {
"path": "data"
},
"aggs": {
"my_filters": {
"filter": {
"bool": {
"must": [
{
"term": {
"data.genre": "sci-fi"
}
}
]
}
},
"aggs": {
"my_agg_field": {
"terms": {
"field": "data.director.keyword"
}
}
}
}
}
}
}
}
But in my code, I need to use AggregationBuilder to create the query. So I use the following:
AggregationBuilder aggregation =
AggregationBuilders
.nested("data", "data")
.subAggregation(
AggregationBuilders.filter("filters", query)
.subAggregation(
AggregationBuilders
.terms("bucket_field").field("data.director.keyword")
)
);
With AggregationBuilders, I get the following query:
{
"aggs": {
"filters" : {
"filter" : {
"nested" : {
"query" : {
"bool" : {
"must" : [
{
"match" : {
"data.genre" : {
"query" : "sci-fi",
"operator" : "OR"
}
}
}
]
}
},
"path" : "data"
}
},
"aggregations" : {
"bucket_field" : {
"terms" : {
"field" : "data.director.keyword"
}
}
}
}
}
}
This aggregation query returns results but doesn't aggregate the buckets like the manual query does and just returns:
"aggregations": {
"filters": {
"doc_count": 62,
"bucket_field": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": []
}
}
},
I need help converting my manual query into an AggregationBuilders query. Thanks!
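Judging only from the generated JSON above, one likely cause is that the outer nested aggregation is missing and the filter wraps a nested query instead, so the terms aggregation runs against root documents, where data.director.keyword is not visible. The target shape is a nested aggregation containing a filter aggregation containing the terms aggregation. Below is a minimal sketch of that nesting, written as Python dictionaries rather than with the Java builder; the helper names (nested_agg, filter_agg, terms_agg) are hypothetical and only mirror the JSON structure of the manual query.

```python
import json


def terms_agg(name, field):
    """Return a {name: terms-aggregation} fragment."""
    return {name: {"terms": {"field": field}}}


def filter_agg(name, query, sub_aggs):
    """Return a {name: filter-aggregation} fragment with sub-aggregations."""
    return {name: {"filter": query, "aggs": sub_aggs}}


def nested_agg(name, path, sub_aggs):
    """Return a {name: nested-aggregation} fragment with sub-aggregations."""
    return {name: {"nested": {"path": path}, "aggs": sub_aggs}}


# Plain bool/term query: inside the nested aggregation the filter already
# runs against the nested "data" documents, so it is not wrapped in a
# nested query here.
genre_filter = {"bool": {"must": [{"term": {"data.genre": "sci-fi"}}]}}

body = {
    "size": 0,
    "aggs": nested_agg(
        "data",
        "data",
        filter_agg(
            "my_filters",
            genre_filter,
            terms_agg("my_agg_field", "data.director.keyword"),
        ),
    ),
}

print(json.dumps(body, indent=2))
```

In the Java code this would probably correspond to passing a plain term or bool query, not a nested query, as `query` to AggregationBuilders.filter(...). That is an assumption about what `query` held, since the question does not show it.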

Related

Elasticsearch aggregation query with filters

I wrote an Elasticsearch query to get the aggregated doc count for the matching keyword "webserver1". Below is the query:
POST _search?filter_path=aggregations.*.buckets
{
"query": {
"bool": {
"must": [
{
"match": {
"hostname": "webserver1"
}
}
]
}
},
"aggs": {
"webserver1": {
"terms": {
"field": "webserver1"
}
}
}
}
Response:
{
"aggregations" : {
"webserver1" : {
"buckets" : [
{
"key" : "webserver1",
"doc_count" : 36715
}
]
}
}
}
Is there a way to keep only the wanted values and display them like this:
{
"webserver1" : 36715
}
I have checked multiple resources, but I have not been able to find any filters or options to do it.
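As far as I can tell, filter_path only trims the response and does not rename or flatten it, so one option is to reshape the result on the client. Below is a minimal sketch, assuming the response has already been parsed into a dictionary; the function name flatten_buckets is hypothetical.

```python
def flatten_buckets(response, agg_name):
    """Map each bucket key of one terms aggregation to its doc_count."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


# Sample response copied from the question above.
response = {
    "aggregations": {
        "webserver1": {
            "buckets": [{"key": "webserver1", "doc_count": 36715}]
        }
    }
}

print(flatten_buckets(response, "webserver1"))  # {'webserver1': 36715}
```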

Search by internal field in Elasticsearch

Structure:
{
.................
"mp": "CAR",
"nPhoto": 1,
"items": [
{
"availableQuantity": 3
},
{
"availableQuantity": 0
},
{
"availableQuantity": 0
}
],
............................
}
If I filter by mp field, I generate the following query:
GET catalog/_search
{
"from" : 0,
"size" : 0,
"aggregations" : {
"brand" : {
"filter" : {
"bool" : {
"must" : {
"term" : {
"mp" : "CAR"
}
}
}
},
"aggregations" : {
"photosQuantity" : { "sum" : { "field" : "nPhoto" } }
}
}
}
}
But how do I generate the query if I need to filter by the field availableQuantity, where availableQuantity > 0 for at least one of the items?
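What you probably want is a nested query in the filter part (this assumes items is mapped as a nested field), with a strict gt bound since you need availableQuantity > 0.
Something along the lines of this:
{
"from": 0,
"size": 0,
"aggregations": {
"brand": {
"filter": {
"nested": {
"path": "items",
"query": {
"range": {
"items.availableQuantity": {
"gt": 0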
}
}
}
}
},
"aggregations": {
"photosQuantity": {
"sum": {
"field": "nPhoto"
}
}
}
}
}
}
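To make the intended semantics concrete, here is a toy in-memory model (not Elasticsearch itself) of "sum nPhoto over the documents where at least one item has availableQuantity strictly greater than 0". The sample documents and both function names are hypothetical.

```python
def has_available_item(doc):
    """True when at least one item has availableQuantity strictly above 0."""
    return any(item["availableQuantity"] > 0 for item in doc.get("items", []))


def photos_quantity(docs):
    """Sum nPhoto over the documents that pass the availability filter."""
    return sum(doc["nPhoto"] for doc in docs if has_available_item(doc))


# Hypothetical sample documents shaped like the structure in the question.
docs = [
    {"mp": "CAR", "nPhoto": 1,
     "items": [{"availableQuantity": 3}, {"availableQuantity": 0}]},
    {"mp": "CAR", "nPhoto": 4,
     "items": [{"availableQuantity": 0}, {"availableQuantity": 0}]},
    {"mp": "BIKE", "nPhoto": 2,
     "items": [{"availableQuantity": 5}]},
]

print(photos_quantity(docs))  # 3
```

With a non-strict bound (>= 0) every document in this sample would pass and the sum would be 7, which is why the bound matters for the "at least one item in stock" requirement.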

Elasticsearch Filter Query

I am using Elasticsearch 1.5.2. I stored some products with a field named "allergic" and some others without this field. The values of this field can be fish, milk, nuts, etc. I want a query that returns only products that do not have the "allergic" field at all, and I want to integrate it into another aggregation query. I want to make just one query: first eliminate products that have the "allergic" field, and then execute the aggregation query of the second block.
How do I integrate this:
{
"constant_score" : {
"filter" : {
"missing" : { "field" : "allergic" }
}
}
}
to this aggregation query:
POST tes1/_search?search_type=count
{
"aggs" : {
"fruits" : {
"filter" : {
"query":{
"query_string": {
"query": "Fruits",
"fields": [
"category"
]
}
}},
"aggs" : {
"minprice": {
"top_hits": {
"sort": [
{
"prix en €/kg": {
"order": "asc"
}
}
], "size":400
}
}
}
}} }
You need to add the query part before the aggregations. The query filters the documents first, and the aggregation then runs on the filtered result set.
POST tes1/_search
{
"_source": false,
"size": 1000,
"query":
{ "constant_score" : {
"filter" : {
"missing" : { "field" : "allergic" }
}
}
},
"aggs" : {
"fruits" : {
"filter" : {
"query":{
"query_string": {
"query": "Fruits",
"fields": [
"category"
]
}
}},
"aggs" : {
"minprice": {
"top_hits": {
"sort": [
{
"prix en €/kg": {
"order": "asc"
}
}
], "size":400
}
}
}
}} }
On a side note, please consider upgrading Elasticsearch to the latest version, as 1.x is no longer supported.
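Following up on the upgrade note: to my knowledge, the missing filter and search_type=count were both removed in 5.0, and the filter aggregation now takes the query directly instead of a "query" wrapper. Below is a hedged sketch of how the same request body might look on a newer version, written as a Python dictionary; the helper name no_field_query is hypothetical and the field names are taken from the question.

```python
import json


def no_field_query(field):
    """Match documents that do not have any value for `field`."""
    return {"bool": {"must_not": {"exists": {"field": field}}}}


body = {
    "size": 0,  # assumption: replaces search_type=count on newer versions
    "query": no_field_query("allergic"),
    "aggs": {
        "fruits": {
            # The query goes directly under "filter" on newer versions.
            "filter": {
                "query_string": {"query": "Fruits", "fields": ["category"]}
            },
            "aggs": {
                "minprice": {
                    "top_hits": {
                        "sort": [{"prix en €/kg": {"order": "asc"}}],
                        # Kept from the question; newer versions may cap
                        # this via index.max_inner_result_window.
                        "size": 400,
                    }
                }
            },
        }
    },
}

print(json.dumps(body, indent=2))
```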

Count how many documents have an attribute or are missing that attribute in Elasticsearch

How can I write a single Elasticsearch query that will count how many documents either have a value for a field or are missing that field?
This query successfully counts the docs missing the field:
POST localhost:9200/<index_name_here>/_search
{
"size": 0,
"aggs" : {
"Missing_Field" : {
"missing": { "field": "group_doc_groupset_id" }
}
}
}
This query does the opposite, counting documents NOT missing the field:
POST localhost:9200/<index_name_here>/_search
{
"size": 0,
"aggs" : {
"Not_Missing_Field" : {
"exists": { "field": "group_doc_groupset_id" }
}
}
}
How can I write one that combines both? For example, this yields a syntax error:
POST localhost:9200/<index_name_here>/_search
{
"size": 0,
"aggs" : {
"Missing_Field_Or_Not" : {
"missing": { "field": "group_doc_groupset_id" },
"exists": { "field": "group_doc_groupset_id" }
}
}
}
GET indexname/_search?size=0
{
"aggs": {
"a1": {
"missing": {
"field": "status"
}
},
"a2": {
"filter": {
"exists": {
"field": "status"
}
}
}
}
}
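The two sibling aggregations above can be generated for any field and read back from the response. Here is a minimal sketch; the function names and aggregation names are hypothetical, and the sample response uses made-up numbers purely to show the shape.

```python
def missing_and_present_aggs(field):
    """Build sibling aggregations counting docs without and with `field`."""
    return {
        "size": 0,
        "aggs": {
            "missing_field": {"missing": {"field": field}},
            "present_field": {"filter": {"exists": {"field": field}}},
        },
    }


def read_counts(response):
    """Return (missing, present) doc counts from the search response."""
    aggs = response["aggregations"]
    return aggs["missing_field"]["doc_count"], aggs["present_field"]["doc_count"]


body = missing_and_present_aggs("group_doc_groupset_id")

# Made-up response with illustrative numbers, only to show the shape.
sample_response = {
    "aggregations": {
        "missing_field": {"doc_count": 12},
        "present_field": {"doc_count": 30},
    }
}

print(read_counts(sample_response))  # (12, 30)
```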
As per the newer Elasticsearch recommendation in the docs:
GET {your_index_name}/_search #or _count, to see just the value
{
"query": {
"bool": {
"must_not": { # here can be also "must"
"exists": {
"field": "{field_to_be_searched}"
}
}
}
}
}
Edit: _count returns the exact number of matching documents. With _search, if there are more than 10k matches, the total is shown as:
"hits" : {
"total" : {
"value" : 10000, # 10k
"relation" : "gte" # Greater than or equal to
}
}

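A capped total like the one above can be told apart from an exact one by checking the relation field. The sketch below assumes the hits object has been parsed into a dictionary; describe_total is a hypothetical helper and the "exact" sample is made up for illustration.

```python
def describe_total(hits):
    """Render hits.total, marking it as a lower bound when it is capped."""
    total = hits["total"]
    prefix = "at least " if total["relation"] == "gte" else ""
    return f"{prefix}{total['value']} documents"


capped = {"total": {"value": 10000, "relation": "gte"}}
exact = {"total": {"value": 42, "relation": "eq"}}  # made-up example

print(describe_total(capped))  # at least 10000 documents
print(describe_total(exact))   # 42 documents

# Assumption: on versions that report totals this way, adding
# "track_total_hits": True to the search body requests an exact count.
exact_count_body = {
    "size": 0,
    "track_total_hits": True,
    "query": {"match_all": {}},
}
```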
Post filter on subaggregation in elasticsearch

I am trying to run a post filter on the aggregated data, but it is not working as I expected. Can someone review my query and tell me if I am doing anything wrong here?
"query" : {
"bool" : {
"must" : {
"range" : {
"versionDate" : {
"from" : null,
"to" : "2016-04-22T23:13:50.000Z",
"include_lower" : false,
"include_upper" : true
}
}
}
}
},
"aggregations" : {
"associations" : {
"terms" : {
"field" : "association.id",
"size" : 0,
"order" : {
"_term" : "asc"
}
},
"aggregations" : {
"top" : {
"top_hits" : {
"from" : 0,
"size" : 1,
"_source" : {
"includes" : [ ],
"excludes" : [ ]
},
"sort" : [ {
"versionDate" : {
"order" : "desc"
}
} ]
}
},
"disabledDate" : {
"filter" : {
"missing" : {
"field" : "disabledDate"
}
}
}
}
}
}
}
Steps in the query:
1. Filter by indexDate less than or equal to a given date.
2. Aggregate based on formId, forming one bucket per formId.
3. Sort in descending order and return the top hit per bucket.
4. Run a sub-aggregation filter after the sort sub-aggregation to remove all documents from the buckets where the disabled date is not null. (This step is not working.)
The whole purpose of post_filter is to run after aggregations have been computed. As such, post_filter has no effect whatsoever on aggregation results.
What you can do in your case is to apply a top-level filter aggregation so that documents with no disabledDate are not taken into account in aggregations, i.e. consider only documents with disabledDate.
{
"query": {
"bool": {
"must": {
"range": {
"versionDate": {
"from": null,
"to": "2016-04-22T23:13:50.000Z",
"include_lower": true,
"include_upper": true
}
}
}
}
},
"aggregations": {
"with_disabled": {
"filter": {
"exists": {
"field": "disabledDate"
}
},
"aggs": {
"form.id": {
"terms": {
"field": "form.id",
"size": 0
},
"aggregations": {
"top": {
"top_hits": {
"size": 1,
"_source": {
"includes": [],
"excludes": []
},
"sort": [
{
"versionDate": {
"order": "desc"
}
}
]
}
}
}
}
}
}
}
}
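The point about post_filter can be illustrated with a toy in-memory model (not Elasticsearch itself): aggregations are computed first, and the post filter narrows only the returned hits. The function toy_search, the sample documents, and the form_id field name are all hypothetical.

```python
from collections import Counter


def toy_search(docs, post_filter=None):
    """Toy model: aggregate over all docs, then apply post_filter to hits only."""
    # Aggregations are computed from every document matched by the query.
    buckets = Counter(doc["form_id"] for doc in docs)
    # post_filter runs afterwards and narrows only the returned hits.
    hits = [doc for doc in docs if post_filter is None or post_filter(doc)]
    return hits, dict(buckets)


# Hypothetical documents; disabledDate is None when the field is missing.
docs = [
    {"form_id": "a", "disabledDate": None},
    {"form_id": "a", "disabledDate": "2016-01-01"},
    {"form_id": "b", "disabledDate": "2016-02-01"},
]

hits, buckets = toy_search(
    docs, post_filter=lambda d: d["disabledDate"] is not None
)
print(len(hits))  # 2
print(buckets)    # {'a': 2, 'b': 1}
```

The bucket counts are the same with or without the post filter. To change them, the condition has to be part of the query or of a filter aggregation that wraps the terms aggregation.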
