Elasticsearch: how to do filtered search and aggregation at the same time

Conceptually, I need to run a filtered search and an aggregation together, like this:
{
"filtered" : {
"query": {
"match_all" : {
}
},
"aggregations": {
"facets": {
"terms": {
"field": "subject"
}
}
},
"filter" : {
...
}
}
}
The query above does not work; it fails with the following error message:
[filtered] query does not support [aggregations]]
While trying to solve this, I found the Filter Aggregation and the Filters Aggregation online, but neither seems to address my need.
Could someone show me the structure of the correct query that can achieve my goal?
Thanks and regards.

The scope of an aggregation is the query and all the filters in it. This means that if you place the aggregation alongside the query at the top level of the request, rather than inside the filtered query, it should work.
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {}
}
},
"aggregations": {
"facets": {
"terms": {
"field": "subject"
}
}
}
}
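For readers on newer versions: the filtered query was deprecated in Elasticsearch 2.0 and removed in 5.0, and the same scoping is usually written there as a bool query with a filter clause. Here is a minimal sketch that builds such a request body as a Python dict. The function name and the term filter on "status" are made up for illustration, since the real filter is elided in the question.

```python
import json


def filtered_search_with_terms_agg(filter_clause, agg_field, agg_name="facets"):
    """Build a request body with the aggregation next to the query.

    Because the aggregation is a sibling of "query", it is computed over
    the documents matched by the query together with its filter.
    """
    return {
        "query": {
            "bool": {
                "must": {"match_all": {}},
                "filter": filter_clause,
            }
        },
        "aggregations": {
            agg_name: {"terms": {"field": agg_field}}
        },
    }


# Hypothetical filter; substitute whatever filter the real search needs.
body = filtered_search_with_terms_agg({"term": {"status": "published"}}, "subject")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a _search request. The only point of the sketch is the layout: "aggregations" sits at the top level, not inside the query.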

Related

Ignore "match" clause from query in aggregation

I have a query with aggregations. One of the aggregations is on the field starsCount. The query also contains a clause that filters on the starsCount field, along with other match clauses (hidden for clarity).
I want the starsCount aggregation to ignore the starsCount filtering in its results (the aggregation's result should be as if I had run the same query without the match clause on the starsCount field), while the other aggregation keeps its current behavior.
Can this be done in a single query, or should I use multiple queries?
Here is the (simplified) query:
{
[...]
"aggs": {
"group_by_service": {
"comment": "keep current behaviour",
"terms": {
"field": "services",
"size": 46
}
},
"group_by_stars": {
"comment": "ignore the filter on the starsCount field",
"terms": {
"field": "starsCount",
"size": 100
}
}
},
"query": {
"bool": {
"must": [
[...] filters on other properties, non-relevant
{
"match": {
"starsCount": {
"query": "2"
}
}
}
]
}
}
}
Yes, you can achieve this in a single query by combining post_filter with the filter aggregation.
Follow these steps to build the query:
Remove the starsCount match clause from the main query so that it no longer affects the group_by_stars aggregation.
Because the starsCount clause must still filter the returned documents, move it to post_filter. A query inside post_filter filters the hits after the aggregations have been calculated.
With starsCount no longer part of the main query, none of the aggregations are affected by it. However, the filter should still affect every aggregation except group_by_stars. To achieve this, wrap each of the other aggregations in a filter aggregation that applies the starsCount filter.
The resulting query is shown below. (Note that I have used a term query instead of match. You can still use match, but term is the better choice here.):
{
"aggs": {
"some_other_agg":{
"filter": {
"term": {
"starsCount": "2"
}
},
"aggs": {
"some_other_agg_filtered": {
"terms": {
"field": "some_other_field"
}
}
}
},
"group_by_service": {
"filter": {
"term": {
"starsCount": "2"
}
},
"aggs": {
"group_by_service_filtered": {
"terms": {
"field": "services",
"size": 46
}
}
}
},
"group_by_stars": {
"terms": {
"field": "starsCount",
"size": 100
}
}
},
"query": {
"bool": {
"must": [
{...} //filter on other properties
]
}
},
"post_filter": {
"term": {
"starsCount": "2"
}
}
}
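If there are several facets, writing the wrappers by hand gets repetitive. Below is a rough sketch of a helper that generates the same shape of request: each terms aggregation is wrapped in a filter aggregation holding every selection except the one on its own field, and all selections go into post_filter. The function name, its parameters, and the match_all base query are my own placeholders, not anything from Elasticsearch. It also assumes a version where bool accepts a filter clause (2.0 or later).

```python
def facet_request(base_query, facet_fields, selections, size=10):
    """Build a search body where each facet ignores the selection on its own field.

    base_query   -- query clause for everything that is not a facet selection
    facet_fields -- mapping of aggregation name -> field for a terms aggregation
    selections   -- mapping of field -> selected value (also applied as post_filter)
    """
    def term_filters(exclude_field=None):
        # One term clause per selection, skipping the excluded field if given.
        return [{"term": {f: v}} for f, v in selections.items() if f != exclude_field]

    aggs = {}
    for name, field in facet_fields.items():
        terms = {"terms": {"field": field, "size": size}}
        others = term_filters(exclude_field=field)
        if others:
            # Wrap in a filter aggregation so the other selections still apply.
            aggs[name] = {
                "filter": {"bool": {"filter": others}},
                "aggs": {name + "_filtered": terms},
            }
        else:
            aggs[name] = terms

    body = {"query": base_query, "aggs": aggs}
    all_filters = term_filters()
    if all_filters:
        # post_filter narrows the hits only; aggregations are computed before it.
        body["post_filter"] = {"bool": {"filter": all_filters}}
    return body


body = facet_request(
    {"match_all": {}},  # placeholder for the elided filters on other properties
    {"group_by_service": "services", "group_by_stars": "starsCount"},
    {"starsCount": "2"},
)
```

With the single starsCount selection used here, group_by_stars comes out as a plain terms aggregation, while group_by_service is wrapped in the starsCount filter, which matches the hand-written query above.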

Aggregation of fields inside nested type field

I want to aggregate keyword type field which lies inside a nested type field. The mapping for nested field is as below:
"Nested_field" : {
"type" : "nested",
"properties" : {
"Keyword_field" : {
"type" : "keyword"
}
}
}
And the part of query which I am using to aggregate is as below:
"aggregations": {
"Nested_field": {
"aggregations": {
"Keyword_field": {
"terms": {
"field": "Nested_field.Keyword_field"
}
}
},
"filter": {
"bool": {}
}
}
}
But this does not return the correct aggregation. Even though there are documents with Keyword_field values, the query returns 0 buckets, so something must be wrong in my aggregation query. Can anyone help me find what's wrong?
I think you need to provide a nested path in there. This worked in ES 5, but it looks like you're using 6 based on the "aggregations" vs "aggs", so let me know if it doesn't work and I'll scrap this answer. Give this a try:
{
"aggregations": {
"nested_level": {
"nested": {
"path": "Nested_field"
},
"aggregations": {
"keyword_field": {
"terms": {
"field": "Nested_field.Keyword_field"
}
}
}
}
}
}
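One thing to watch when reading the result: with a nested aggregation, the terms buckets sit one level deeper in the response, under the nested aggregation's name. Here is a small sketch of pulling them out in Python. The helper name is mine, and the sample response is hand-written to show the shape, not real output.

```python
def nested_terms_buckets(response, nested_name, terms_name):
    """Return {bucket key: doc_count} for a terms agg inside a nested agg."""
    nested = response.get("aggregations", {}).get(nested_name, {})
    buckets = nested.get(terms_name, {}).get("buckets", [])
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


# Hand-written illustration of the response shape (values are invented).
sample_response = {
    "aggregations": {
        "nested_level": {
            "doc_count": 3,
            "keyword_field": {
                "buckets": [
                    {"key": "alpha", "doc_count": 2},
                    {"key": "beta", "doc_count": 1},
                ]
            },
        }
    }
}

counts = nested_terms_buckets(sample_response, "nested_level", "keyword_field")
print(counts)
```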

How to aggregate query result in elasticsearch

I am new to Elasticsearch. I want the Elasticsearch result to be like that of the following SQL query:
select distinct(car_name) from car_master where car_name like '%SUV%'
I am getting results by doing:
{ "query": {
"query_string": {
"fields" : ["car_name"],
"query": "*SUV*"
}
}
}
But I want distinct records.
You are almost there; you simply need to add a terms aggregation on the car_name field:
{
"query": {
"query_string": {
"fields" : ["car_name"],
"query": "*SUV*"
}
},
"aggs": {
"cars": {
"terms": {
"field": "car_name"
}
}
}
}
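Two details worth knowing when using this for "distinct": a terms aggregation returns only the top 10 buckets by default unless you set its size, and a top-level "size": 0 suppresses the hits if you only want the distinct values. Here is a sketch of both the request and of reading the bucket keys. The function names, the "distinct" aggregation name, the limit of 1000, and the sample response are all assumptions for illustration. If car_name is an analyzed text field, the bucket keys may be individual tokens rather than whole names; that depends on the mapping, which the question does not show.

```python
def distinct_values_request(field, pattern, max_values=1000):
    """Request body that returns no hits, only a terms aggregation on `field`."""
    return {
        "size": 0,  # skip the hits; only the aggregation is needed
        "query": {"query_string": {"fields": [field], "query": pattern}},
        "aggs": {"distinct": {"terms": {"field": field, "size": max_values}}},
    }


def distinct_values(response):
    """Each bucket key is one distinct value of the aggregated field."""
    return [bucket["key"] for bucket in response["aggregations"]["distinct"]["buckets"]]


request_body = distinct_values_request("car_name", "*SUV*")

# Invented response, only to show where the values live.
sample_response = {
    "aggregations": {
        "distinct": {
            "buckets": [
                {"key": "Compact SUV", "doc_count": 2},
                {"key": "SUV Deluxe", "doc_count": 1},
            ]
        }
    }
}
names = distinct_values(sample_response)
print(names)
```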

Elasticsearch scoped aggregation not desired results

I have the following query, but the aggregation doesn't seem to be acting on the results of the query.
The query returns 3 results, but there are 10 items in the aggregation. It looks like the aggregation is acting on all of the queried results.
Basically, how do I get the aggregation to take the given query as its input?
{
"query": {
"filtered": {
"filter": {
"and": [
{
"geo_distance": {
"coordinates": [
-79.3931,
43.6709
],
"distance": "15km"
}
},
{
"term": {
"user.type": "2"
}
}
]
},
"query": {
"match": {
"user.shoes": "314"
}
}
}
},
"aggs": {
"dedup": {
"terms": { "field": "user.id" },
"aggs": {
"dedup_docs": {
"top_hits": {
"size": 1
}
}
}
}
}
}
So as it turns out, I was expecting the aggregation to act on the paginated results given by the query. And that's incorrect.
The aggregation takes as input "all results" of the query, not just the paginated one.
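If what you actually need is deduplication within a single page of hits, one option is to do it client-side on the returned hits. Here is a rough sketch, assuming the hits carry _source and that user.id is a path inside it; the helper names and the sample response are made up.

```python
def get_path(source, dotted):
    """Follow a dotted path such as "user.id" through nested dicts."""
    value = source
    for part in dotted.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value


def dedup_page(response, field):
    """Keep the first hit for each distinct value of `field` on this page."""
    seen = set()
    unique = []
    for hit in response["hits"]["hits"]:
        key = get_path(hit["_source"], field)
        if key in seen:
            continue
        seen.add(key)
        unique.append(hit)
    return unique


# Invented page of hits, only to show the shape.
sample_response = {
    "hits": {
        "hits": [
            {"_id": "a", "_source": {"user": {"id": 7}}},
            {"_id": "b", "_source": {"user": {"id": 7}}},
            {"_id": "c", "_source": {"user": {"id": 9}}},
        ]
    }
}
page = dedup_page(sample_response, "user.id")
print([hit["_id"] for hit in page])
```

This only removes duplicates inside the page you fetched, so page sizes become uneven; the aggregation in the question still sees every matching document.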

Filtering nested aggregation result on number of buckets

I have this query that does a nested aggregation, giving me the unique machineid values per unique key. What I want Elasticsearch to return is only those keys with two or more unique machineid values. I can of course solve this problem application-side, but is there a way to solve it directly in the query? Or maybe I am going about this the wrong way?
My query:
{
"query": {
"filtered": {
"filter": {
"bool": {
"must_not": {
"term" : { "key" : "" }
}
}
}
}
},
"aggs": {
"keys": {
"terms": {
"field": "key",
"size" : 0
},
"aggs": {
"machines": {
"terms": {
"field": "machineid",
"size" : 0
}
}
}
}
}
}
Example document:
{
"timestamp":"2014-05-23T08:21:51+00:00",
"machineid":"1444056739053156926",
"hash":"77f595dee5ffacea72b135b1fce1312e",
"key":"XXXXXX-XXXXXX-XXXXXX-XXXXXX"
}
I have been looking at scripted metric aggregation but it doesn't seem to be what I'm looking for.
Issue #4404 and issue #8110 on Elasticsearch GitHub seem to describe my problem but they are both closed.
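For reference, the application-side fallback mentioned above could look roughly like this. It walks the "keys" buckets from the response and keeps only those with at least two "machines" sub-buckets. The function name and the sample response are made up for illustration.

```python
def keys_with_min_machines(response, min_machines=2):
    """Return {key: [machineid, ...]} for keys seen on at least `min_machines` machines."""
    result = {}
    for key_bucket in response["aggregations"]["keys"]["buckets"]:
        machines = [m["key"] for m in key_bucket["machines"]["buckets"]]
        if len(machines) >= min_machines:
            result[key_bucket["key"]] = machines
    return result


# Invented response in the shape produced by the keys/machines aggregation.
sample_response = {
    "aggregations": {
        "keys": {
            "buckets": [
                {
                    "key": "KEY-A",
                    "doc_count": 3,
                    "machines": {
                        "buckets": [
                            {"key": "m1", "doc_count": 2},
                            {"key": "m2", "doc_count": 1},
                        ]
                    },
                },
                {
                    "key": "KEY-B",
                    "doc_count": 1,
                    "machines": {"buckets": [{"key": "m3", "doc_count": 1}]},
                },
            ]
        }
    }
}
shared = keys_with_min_machines(sample_response)
print(shared)
```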
