I am new to aggregations in Elasticsearch and am using version 5.3. I have a query that combines a terms aggregation with a filter, but when I try to build the same query with AggregationBuilders, I can't get it to match the manual query. The manual query works and looks like this:
{
"size": 0,
"aggs": {
"data": {
"nested": {
"path": "data"
},
"aggs": {
"my_filters": {
"filter": {
"bool": {
"must": [
{
"term": {
"data.genre": "sci-fi"
}
}
]
}
},
"aggs": {
"my_agg_field": {
"terms": {
"field": "data.director.keyword"
}
}
}
}
}
}
}
}
But in my code, I need to use AggregationBuilder to create the query. So I use the following:
AggregationBuilder aggregation =
AggregationBuilders
.nested("data", "data")
.subAggregation(
AggregationBuilders.filter("filters", query)
.subAggregation(
AggregationBuilders
.terms("bucket_field").field("data.director.keyword")
)
);
With AggregationBuilders, I get the following query:
{
"aggs": {
"filters" : {
"filter" : {
"nested" : {
"query" : {
"bool" : {
"must" : [
{
"match" : {
"data.genre" : {
"query" : "sci-fi",
"operator" : "OR"
}
}
}
]
}
},
"path" : "data"
}
},
"aggregations" : {
"bucket_field" : {
"terms" : {
"field" : "data.director.keyword"
}
}
}
}
}
}
This query returns a result, but unlike the manual query it produces no buckets and returns only:
"aggregations": {
"filters": {
"doc_count": 62,
"bucket_field": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": []
}
}
},
I need help converting my manual query into an equivalent AggregationBuilders query. Thanks!
I wrote an Elasticsearch query to get the aggregated doc count for documents matching the keyword "webserver1". The query is:
POST _search?filter_path=aggregations.*.buckets
{
"query": {
"bool": {
"must": [
{
"match": {
"hostname": "webserver1"
}
}
]
}
},
"aggs": {
"webserver1": {
"terms": {
"field": "webserver1"
}
}
}
}
Response:
{
"aggregations" : {
"webserver1" : {
"buckets" : [
{
"key" : "webserver1",
"doc_count" : 36715
}
]
}
}
}
Is there a way to return only the key and its count, in a form like this:
{
"webserver1" : 36715
}
I have checked multiple resources, but I can't find any filter or option that does this.
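As far as I know, filter_path only trims the response and does not reshape it, so one option is to flatten the buckets on the client side. Below is a minimal Python sketch, assuming the response has already been parsed into a dict. The helper name flatten_terms_buckets is hypothetical and not part of any Elasticsearch client.

```python
def flatten_terms_buckets(response, agg_name):
    """Map each bucket key of a terms aggregation to its doc_count.

    Hypothetical helper: works on an already-parsed response dict and
    returns an empty dict if the aggregation is absent.
    """
    buckets = response.get("aggregations", {}).get(agg_name, {}).get("buckets", [])
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


# Response shape copied from the example response above.
response = {
    "aggregations": {
        "webserver1": {
            "buckets": [{"key": "webserver1", "doc_count": 36715}]
        }
    }
}

flat = flatten_terms_buckets(response, "webserver1")
print(flat)
```

The same helper works when the terms aggregation returns several buckets, since each key becomes one entry in the resulting dict.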
Structure:
{
.................
"mp": "CAR",
"nPhoto": 1,
"items": [
{
"availableQuantity": 3,
},
{
"availableQuantity": 0,
},
{
"availableQuantity": 0,
}
],
............................
}
}
If I filter by the mp field, I generate the following query:
GET catalog/_search
{
"from" : 0,
"size" : 0,
"aggregations" : {
"brand" : {
"filter" : {
"bool" : {
"must" : {
"term" : {
"mp" : "CAR"
}
}
}
},
"aggregations" : {
"photosQuantity" : { "sum" : { "field" : "nPhoto" } }
}
}
}
}
But how do I build the query if I need to filter on availableQuantity, so that only documents where at least one item has availableQuantity > 0 are counted?
What you probably want is a nested query in the filter part (this assumes items is mapped as a nested type).
Something along the lines of this:
{
"from": 0,
"size": 0,
"aggregations": {
"brand": {
"filter": {
"nested": {
"path": "items",
"query": {
"range": {
"items.availableQuantity": {
"gt": 0
}
}
}
}
},
"aggregations": {
"photosQuantity": {
"sum": {
"field": "nPhoto"
}
}
}
}
}
}
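The same request body can also be assembled programmatically. Here is a small Python sketch, assuming items is mapped as a nested type and using a strict gt bound so that items with a quantity of 0 are excluded, as the question asks. The function name nested_range_filter_agg is made up for illustration.

```python
import json


def nested_range_filter_agg(path, field, greater_than, sub_aggs):
    """Build a filter aggregation whose filter is a nested range query.

    Hypothetical helper: `path` is the nested object path and `field`
    the field inside it; a document passes if at least one nested
    object has field > greater_than.
    """
    return {
        "filter": {
            "nested": {
                "path": path,
                "query": {
                    "range": {f"{path}.{field}": {"gt": greater_than}}
                },
            }
        },
        "aggregations": sub_aggs,
    }


body = {
    "from": 0,
    "size": 0,
    "aggregations": {
        "brand": nested_range_filter_agg(
            "items",
            "availableQuantity",
            0,
            {"photosQuantity": {"sum": {"field": "nPhoto"}}},
        )
    },
}

# Print the JSON that would be sent as the search request body.
print(json.dumps(body, indent=2))
```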
I am using Elasticsearch 1.5.2. I stored some products with a field named "allergic" and others without it. The values of this field can be fish, milk, nuts, etc. I want a query that returns only products that do not have the "allergic" field at all, and I want to combine it with another aggregation query. In other words, I want a single query that first eliminates products that have the "allergic" field and then executes the aggregation from the second block.
How do I integrate this:
{
"constant_score" : {
"filter" : {
"missing" : { "field" : "allergic" }
}
}
}
to this aggregation query:
POST tes1/_search?search_type=count
{
"aggs" : {
"fruits" : {
"filter" : {
"query":{
"query_string": {
"query": "Fruits",
"fields": [
"category"
]
}
}},
"aggs" : {
"minprice": {
"top_hits": {
"sort": [
{
"prix en €/kg": {
"order": "asc"
}
}
], "size":400
}
}
}
}} }
You need to add the query part before the aggregations. The query filters the documents first, and the aggregations then run on the resulting set.
POST tes1/_search
{
"_source": false,
"size": 1000,
"query":
{ "constant_score" : {
"filter" : {
"missing" : { "field" : "allergic" }
}
}
},
"aggs" : {
"fruits" : {
"filter" : {
"query":{
"query_string": {
"query": "Fruits",
"fields": [
"category"
]
}
}},
"aggs" : {
"minprice": {
"top_hits": {
"sort": [
{
"prix en €/kg": {
"order": "asc"
}
}
], "size":400
}
}
}
}} }
On a side note, please consider upgrading Elasticsearch to the latest version, as 1.x is no longer supported.
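To make the order of evaluation concrete, here is a small in-memory Python illustration of the same idea. It does not talk to Elasticsearch. The documents and the price field are invented for the example, and plain list operations stand in for the query, the filter aggregation and top_hits.

```python
# Invented sample documents; only some of them carry an "allergic" field.
docs = [
    {"name": "apple", "category": "Fruits", "price": 2.5},
    {"name": "peanut bar", "category": "Snacks", "price": 4.0, "allergic": "nuts"},
    {"name": "banana", "category": "Fruits", "price": 1.2},
    {"name": "fruit yogurt", "category": "Fruits", "price": 3.0, "allergic": "milk"},
]

# Step 1: the top-level query keeps only documents without "allergic".
matched = [d for d in docs if "allergic" not in d]

# Step 2: the "fruits" filter aggregation sees only the matched documents.
fruits = [d for d in matched if d["category"] == "Fruits"]

# Step 3: top_hits sorted by price, ascending.
top_hits = sorted(fruits, key=lambda d: d["price"])
names = [d["name"] for d in top_hits]
print(names)
```

The "fruit yogurt" document belongs to the Fruits category but never reaches the aggregation, because the query removed it in the first step.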
How can I write a single Elasticsearch query that will count how many documents either have a value for a field or are missing that field?
This query successfully counts the docs missing the field:
POST localhost:9200//<index_name_here>/_search
{
"size": 0,
"aggs" : {
"Missing_Field" : {
"missing": { "field": "group_doc_groupset_id" }
}
}
}
This query does the opposite, counting documents NOT missing the field:
POST localhost:9200//<index_name_here>/_search
{
"size": 0,
"aggs" : {
"Not_Missing_Field" : {
"exists": { "field": "group_doc_groupset_id" }
}
}
}
How can I write one that combines both? For example, this yields a syntax error:
POST localhost:9200//<index_name_here>/_search
{
"size": 0,
"aggs" : {
"Missing_Field_Or_Not" : {
"missing": { "field": "group_doc_groupset_id" },
"exists": { "field": "group_doc_groupset_id" }
}
}
}
GET indexname/_search?size=0
{
"aggs": {
"a1": {
"missing": {
"field": "status"
}
},
"a2": {
"filter": {
"exists": {
"field": "status"
}
}
}
}
}
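Both missing and filter are single-bucket aggregations, so each one reports a doc_count under its own name in the response. A short Python sketch of reading the two counts follows. The sample response and its numbers are invented, and doc_counts is a hypothetical helper.

```python
def doc_counts(response, agg_names):
    """Return {aggregation name: doc_count} for single-bucket aggregations."""
    aggs = response["aggregations"]
    return {name: aggs[name]["doc_count"] for name in agg_names}


# Invented response for the a1 (missing) and a2 (exists) aggregations.
sample = {
    "aggregations": {
        "a1": {"doc_count": 3},
        "a2": {"doc_count": 7},
    }
}

counts = doc_counts(sample, ["a1", "a2"])
# Every document either has the field or lacks it, so the two counts
# add up to the number of documents the search matched.
total = counts["a1"] + counts["a2"]
print(counts, total)
```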
Per the current Elasticsearch documentation, the recommended approach is:
GET {your_index_name}/_search #or _count, to see just the value
{
"query": {
"bool": {
"must_not": { # here can be also "must"
"exists": {
"field": "{field_to_be_searched}"
}
}
}
}
}
Edit: _count returns the exact number of matching documents. With _search, if more than 10k documents match, the total is capped and shown as:
"hits" : {
"total" : {
"value" : 10000, # capped at 10k
"relation" : "gte" # greater than or equal to
}
}
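When reading the total from a _search response in client code, it helps to check the relation before treating the value as exact. The Python sketch below assumes a parsed response dict. It also accepts the plain-integer form of hits.total that older versions returned. The helper name describe_total is made up.

```python
def describe_total(hits):
    """Return (value, is_exact) for the hits.total of a search response.

    Handles the object form ({"value": ..., "relation": ...}) and the
    plain-integer form used by older Elasticsearch versions.
    """
    total = hits["total"]
    if isinstance(total, dict):
        # "eq" means the value is exact, "gte" means it is a lower bound.
        return total["value"], total.get("relation", "eq") == "eq"
    return total, True


capped = describe_total({"total": {"value": 10000, "relation": "gte"}})
exact = describe_total({"total": {"value": 42, "relation": "eq"}})
legacy = describe_total({"total": 42})
print(capped, exact, legacy)
```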
I am trying to run a post filter on the aggregated data, but it is not working as I expected. Can someone review my query and tell me if I am doing anything wrong here?
{
"query" : {
"bool" : {
"must" : {
"range" : {
"versionDate" : {
"from" : null,
"to" : "2016-04-22T23:13:50.000Z",
"include_lower" : false,
"include_upper" : true
}
}
}
}
},
"aggregations" : {
"associations" : {
"terms" : {
"field" : "association.id",
"size" : 0,
"order" : {
"_term" : "asc"
}
},
"aggregations" : {
"top" : {
"top_hits" : {
"from" : 0,
"size" : 1,
"_source" : {
"includes" : [ ],
"excludes" : [ ]
},
"sort" : [ {
"versionDate" : {
"order" : "desc"
}
} ]
}
},
"disabledDate" : {
"filter" : {
"missing" : {
"field" : "disabledDate"
}
}
}
}
}
}
}
Steps in the query:
1. Filter by indexDate less than or equal to a given date.
2. Aggregate on formId, forming one bucket per formId.
3. Sort in descending order and return the top hit per bucket.
4. Run a filter sub-aggregation after the sort sub-aggregation to remove from the buckets all documents whose disabled date is not null. (This is the part that is not working.)
The whole purpose of post_filter is to run after aggregations have been computed. As such, post_filter has no effect whatsoever on aggregation results.
What you can do in your case is to apply a top-level filter aggregation so that documents with no disabledDate are not taken into account in aggregations, i.e. consider only documents with disabledDate.
{
"query": {
"bool": {
"must": {
"range": {
"versionDate": {
"from": null,
"to": "2016-04-22T23:13:50.000Z",
"include_lower": true,
"include_upper": true
}
}
}
}
},
"aggregations": {
"with_disabled": {
"filter": {
"exists": {
"field": "disabledDate"
}
},
"aggs": {
"form.id": {
"terms": {
"field": "form.id",
"size": 0
},
"aggregations": {
"top": {
"top_hits": {
"size": 1,
"_source": {
"includes": [],
"excludes": []
},
"sort": [
{
"versionDate": {
"order": "desc"
}
}
]
}
}
}
}
}
}
}
}