Ignore "match" clause from query in aggregation - elasticsearch

I have a query with aggregations. One of the aggregations is on the field starsCount. A query clause filters on the starsCount field, alongside other match clauses (hidden for clarity).
I want the starsCount aggregation to ignore the starsCount filter in its results (it should behave as if I had run the same query without the match clause on starsCount), while the other aggregation keeps its current behavior.
Can this be done in a single query, or should I use multiple queries?
Here is the (simplified) query:
{
[...]
"aggs": {
"group_by_service": {
"comment": "keep current behaviour",
"terms": {
"field": "services",
"size": 46
}
},
"group_by_stars": {
"comment": "ignore the filter on the starsCount field",
"terms": {
"field": "starsCount",
"size": 100
}
}
},
"query": {
"bool": {
"must": [
[...] filters on other properties, non-relevant
{
"match": {
"starsCount": {
"query": "2"
}
}
}
]
}
}
}

Yes, you can achieve this in a single query by combining post_filter with the filter aggregation.
Follow these steps to build the query:
1. Remove the starsCount match query from the main query, since it should not affect the group_by_stars aggregation.
2. The starsCount match query should still filter the returned documents, so move it to post_filter. Any query inside post_filter filters the hits after the aggregations have been calculated.
3. With starsCount no longer part of the main query, none of the aggregations is affected by it. What is required, however, is that this filter affects every aggregation except group_by_stars. To achieve this, wrap each of the other aggregations in a filter aggregation that applies the same starsCount clause.
The resulting query is shown below. (Note that I have used a term query instead of match. You can still use match, but term is the better choice for an exact value like this.)
{
"aggs": {
"some_other_agg":{
"filter": {
"term": {
"starsCount": "2"
}
},
"aggs": {
"some_other_agg_filtered": {
"terms": {
"field": "some_other_field"
}
}
}
},
"group_by_service": {
"filter": {
"term": {
"starsCount": "2"
}
},
"aggs": {
"group_by_service_filtered": {
"terms": {
"field": "services",
"size": 46
}
}
}
},
"group_by_stars": {
"terms": {
"field": "starsCount",
"size": 100
}
}
},
"query": {
"bool": {
"must": [
{...} //filter on other properties
]
}
},
"post_filter": {
"term": {
"starsCount": "2"
}
}
}
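The three steps can also be applied programmatically. The following is a minimal Python sketch, not part of the original answer: the helper name exclude_filter_from_agg and the "_filtered" suffix are hypothetical, and match_all stands in for the elided filters on other properties. It only builds the request body as a dict; sending it to Elasticsearch is left out.

```python
import copy
import json


def exclude_filter_from_agg(main_filters, excluded_filter, aggs, unfiltered_agg):
    """Build a search body where `excluded_filter` narrows the hits and every
    aggregation except `unfiltered_agg` (hypothetical helper)."""
    wrapped = {}
    for name, agg in aggs.items():
        if name == unfiltered_agg:
            # This aggregation sees everything the main query matches.
            wrapped[name] = copy.deepcopy(agg)
        else:
            # Re-apply the excluded filter through a filter aggregation.
            wrapped[name] = {
                "filter": copy.deepcopy(excluded_filter),
                "aggs": {name + "_filtered": copy.deepcopy(agg)},
            }
    return {
        "query": {"bool": {"must": copy.deepcopy(main_filters)}},
        "aggs": wrapped,
        # post_filter narrows the hits after the aggregations are computed.
        "post_filter": copy.deepcopy(excluded_filter),
    }


body = exclude_filter_from_agg(
    main_filters=[{"match_all": {}}],  # stand-in for the elided filters
    excluded_filter={"term": {"starsCount": "2"}},
    aggs={
        "group_by_service": {"terms": {"field": "services", "size": 46}},
        "group_by_stars": {"terms": {"field": "starsCount", "size": 100}},
    },
    unfiltered_agg="group_by_stars",
)
print(json.dumps(body, indent=2))
```

Note that the wrapped aggregations are nested one level deeper in the response, under the name of the filter aggregation.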

Related

Query on multiple range of document

I want to search within a specific set of documents rather than the whole index. I already know the ids of those documents from a separate process. For example, I want to match some sentences against the 'pLabel' field, but only among the documents whose ids I know. My attempt is below, but it returns many documents I did not expect.
In other words, given the documents in the group (eid1, eid2, eid3, ...), I want the query to return only the matching documents from within that group.
How do I fix the query to get the right search result?
{
"query": {
"bool": {
"must": [
{
"query_string": {
"default_field": "pLabel" ,
"query": "search words here"
}
}
] ,
"must_not": [] ,
"should": [
{
"term": {
"eid": "eid1"
}
} ,
{
"term": {
"eid": "eid2"
}
}
]
}
} ,
"size": 0 ,
"_source": [
"eid"
] ,
"aggs": {
"eids": {
"terms": {
"field": "eid" ,
"size": 1000
}
}
}
}
You need to move the document ID restriction from the should clause into the must clause.
As written, the query can return any document that matches the query_string clause; the should clause only makes it prefer documents that match the given IDs.
Also, you should use a terms query instead of several term queries:
{
"query": {
"bool": {
"must": [
{
"query_string": {
"default_field": "pLabel",
"query": "search words here"
}
},
{
"terms": {
"eid": ["eid1", "eid2"]
}
}
]
}
},
"size": 0,
"_source": [
"eid"
],
"aggs": {
"eids": {
"terms": {
"field": "eid",
"size": 1000
}
}
}
}
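Because the ID restriction is a yes/no condition, it could also go in the filter clause of the bool query, which is required like must but does not contribute to the score. The following Python sketch builds such a query as a dict; the helper name restrict_to_ids is hypothetical, and it assumes the ids live in the eid field as in the question.

```python
def restrict_to_ids(text_query, id_field, ids):
    """Return a bool query requiring `text_query` AND membership in `ids`
    (hypothetical helper)."""
    return {
        "bool": {
            "must": [text_query],
            # filter clauses must match but do not affect scoring
            "filter": [{"terms": {id_field: list(ids)}}],
        }
    }


query = restrict_to_ids(
    {"query_string": {"default_field": "pLabel", "query": "search words here"}},
    "eid",
    ["eid1", "eid2"],
)
```

Either placement restricts the results to the known documents; the difference is only whether the ID match takes part in relevance scoring.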

Elasticsearch scoped aggregation not desired results

I have the following query, but the aggregation doesn't seem to be acting on the query results.
The query returns 3 results, yet there are 10 items in the aggregation. It looks like the aggregation is acting on all of the queried results.
Basically, how do I get the aggregation to take the given query as its input?
{
"query": {
"filtered": {
"filter": {
"and": [
{
"geo_distance": {
"coordinates": [
-79.3931,
43.6709
],
"distance": "15km"
}
},
{
"term": {
"user.type": "2"
}
}
]
},
"query": {
"match": {
"user.shoes": "314"
}
}
}
},
"aggs": {
"dedup": {
"terms": { "field": "user.id" },
"aggs": {
"dedup_docs": {
"top_hits": {
"size": 1
}
}
}
}
}
}
So as it turns out, I was expecting the aggregation to act only on the paginated results returned by the query, and that is incorrect.
The aggregation takes as input all the documents matched by the query, not just the current page of hits.
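The distinction can be illustrated with a toy model in plain Python. This is not Elasticsearch code: the documents, the user_id key, and the search function are made up for illustration, and the model only mimics how pagination limits the hits while the aggregation still counts every matched document.

```python
from collections import Counter

# Hypothetical documents that all match the query; only the user id matters.
matched = [{"user_id": uid} for uid in ["a", "a", "b", "c", "c", "c", "d"]]


def search(matched_docs, offset=0, size=3):
    """Toy model of a search response: paginated hits, aggregation over all matches."""
    hits = matched_docs[offset:offset + size]  # what from/size would return
    buckets = Counter(doc["user_id"] for doc in matched_docs)  # terms aggregation input
    return hits, buckets


hits, buckets = search(matched)
print(len(hits), len(buckets))
```

Here the page holds 3 hits while the aggregation has 4 buckets, one per distinct user id across all 7 matched documents.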

Elasticsearch - exclude filter from aggregations

I query the following:
{
"query": {
"bool": {
"filter": {
"terms": {
"agent_id": [
"58becc297513311ad81577eb"
]
}
}
}
},
"aggs": {
"agent_id": {
"terms": {
"field": "agent_id"
}
}
}
}
I would like the aggregation to ignore this filter. In Solr there is an option to tag a filter and use that tag to exclude the filter from the facet query.
How can I do the same in Elasticsearch?
One way to approach this problem is to use post_filter.
That might be a performance concern, so if it doesn't fit your SLA, an alternative approach is to use a global aggregation bucket.
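A minimal Python sketch of the global-bucket alternative follows, building the request body as a dict. The helper name agg_ignoring_query and the wrapper name all_docs are hypothetical. Keep in mind that a global aggregation ignores the entire query, not just one filter, so it only fits when the aggregation should run over every document in the index.

```python
def agg_ignoring_query(agg_name, field, size=10):
    """Wrap a terms aggregation in a global aggregation so that the search
    query does not narrow it (hypothetical helper)."""
    return {
        "all_docs": {  # hypothetical wrapper name
            "global": {},
            "aggs": {agg_name: {"terms": {"field": field, "size": size}}},
        }
    }


body = {
    "query": {
        "bool": {
            "filter": {"terms": {"agent_id": ["58becc297513311ad81577eb"]}}
        }
    },
    "aggs": agg_ignoring_query("agent_id", "agent_id"),
}
```

The hits are still restricted to the selected agent, while the buckets under all_docs cover all agents.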
You can use post_filter in Elasticsearch. The post filter is applied to the hits after the aggregations have been computed, so the aggregations are not narrowed by it. This makes it a good fit for e-commerce search, where the facet counts must stay visible while the user drills down with filters.
You can build a query like the following:
{
"aggs": {
"agent_id": {
"terms": {
"field": "agent_id",
"size": 10
}
}
},
"post_filter": {
"terms": {
"agent_id": [
"58becc297513311ad81577eb"
]
}
}
}

To find the distinct fields in an elastic search query

I need the values of only one field and there are duplicate values in it.
POST _search
{
"query": {
"bool": {
"must": [
{"term": {
"report": {
"value": "some_value"
}
}}
]
}
},
"fields": [
"field_name"
]
}
I need only the distinct values of field_name.
You could add a terms aggregation on that field to your query: each bucket key in the response is one distinct value. If you also need a representative document for each value, a top_hits sub-aggregation can be nested inside it. The terms aggregation looks like this:
"aggs": {
"values": {
"terms": {
"field": "your_field"
}
}
}
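Reading the distinct values back out of the response could look like the following Python sketch. The helper name distinct_values is hypothetical, and sample_response is a hand-written stand-in with made-up values in the shape of a terms aggregation result.

```python
def distinct_values(response, agg_name="values"):
    """Collect the bucket keys of a terms aggregation from a search response
    (hypothetical helper)."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return [bucket["key"] for bucket in buckets]


# Hand-written sample in the shape of a terms aggregation response.
sample_response = {
    "aggregations": {
        "values": {
            "buckets": [
                {"key": "alpha", "doc_count": 3},
                {"key": "beta", "doc_count": 1},
            ]
        }
    }
}
print(distinct_values(sample_response))
```

By default a terms aggregation returns only the top 10 buckets, so set "size" on it if the field has more distinct values than that.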

ElasticSearch: getting facets from all results with filter query

I don't know whether the title of this question is clear enough.
I have a text search with language filter in the left pane in ElasticSearch. When a specific language filter is selected in the left pane from search results (from a query), I still want to get the language facets from all search results from the query. I know this is possible in Solr but I am not sure whether this is doable in ElasticSearch.
Yes, you can achieve this by using post_filter instead of a normal filter. post_filter filters the search hits after the aggregations have been computed, so the aggregations are calculated over every document that matches the main query.
So instead of this:
{
"query": {
"bool": {
"filter": {
"term": {
"some_field": "some_value"
}
}
}
},
"aggs": {
"languages": {
"terms": {
"field": "language"
}
}
}
}
Do this:
{
"post_filter": {
"term": {
"some_field": "some_value"
}
},
"aggs": {
"languages": {
"terms": {
"field": "language"
}
}
}
}
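The order of operations can be shown with a toy model in plain Python. This is an illustration only, not Elasticsearch code: the documents and the function search_with_post_filter are made up, and the language field is used as the selected filter.

```python
from collections import Counter

# Hypothetical documents already matched by the main query.
matched = [
    {"language": "en", "title": "one"},
    {"language": "en", "title": "two"},
    {"language": "fr", "title": "trois"},
]


def search_with_post_filter(matched_docs, language):
    """Toy model: facets come from all matches, hits are narrowed afterwards."""
    facets = Counter(doc["language"] for doc in matched_docs)  # computed first
    hits = [doc for doc in matched_docs if doc["language"] == language]  # post filter
    return hits, facets


hits, facets = search_with_post_filter(matched, "fr")
```

Selecting "fr" leaves a single hit, while the facets still report the counts for both languages.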
