Terms aggregation not returning other buckets - elasticsearch

I'm not able to get the other buckets from a terms aggregation when I combine it with a filter aggregation. Is there any way to do this in Elasticsearch?
Mapping: customer has a nested address field, and address has a nested properties field.
I've tried the following:
{
"size": 0,
"aggs": {
"address": {
"nested": {
"path": "address"
},
"aggs": {
"shipping_to_address": {
"aggs": {
"city": {
"terms": {
"field": "address.city.name.keyword",
"size": 10,
"missing": "others"
}
}
},
"filter": {
"bool": {
"must": [
{
"nested": {
"path": "address.properties",
"query": {
"bool": {
"filter": [
{
"term": {
"address.properties.type": "shipping_to"
}
}
]
}
}
}
}
]
}
}
}
}
}
}
}
The above only returns the buckets matching the filter.
{
"hits": {
"total": 3,
"max_score": 0,
"hits": []
},
"aggregations": {
"address": {
"doc_count": 3,
"shipping_to_address": {
"doc_count": 1,
"city": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "new york",
"doc_count": 1
}
]
}
}
}
}
}
I would like to see the other buckets as below:
"buckets": [
{
"key": "new york",
"doc_count": 1
},
{
"key": "others",
"doc_count": 2
}
]

You need to add "min_doc_count": 0 to the terms aggregation; it will then also return empty buckets.
{
"size": 0,
"aggs": {
"address": {
"nested": {
"path": "address"
},
"aggs": {
"shipping_to_address": {
"aggs": {
"city": {
"terms": {
"field": "address.city.name.keyword",
"size": 10,
"min_doc_count":0,
"missing": "others"
}
}
},
"filter": {
"bool": {
"must": [
{
"nested": {
"path": "address.properties",
"query": {
"bool": {
"filter": [
{
"term": {
"address.properties.type": "shipping_to"
}
}
]
}
}
}
}
]
}
}
}
}
}
}
}
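One thing worth noting: "missing" only adds a bucket for documents that are inside the aggregation's scope but have no value for the field, so nested address documents rejected by the filter never reach the terms aggregation at all. If the goal is the single "others" count from the question, one option is to derive it on the client from the two doc_count values already present in the response. A minimal Python sketch, assuming the response shape shown in the question (the function name add_others_bucket is hypothetical, not part of any Elasticsearch client):

```python
def add_others_bucket(aggregations, others_key="others"):
    """Return the city buckets plus one bucket for nested docs outside the filter."""
    address = aggregations["address"]
    filtered = address["shipping_to_address"]
    buckets = list(filtered["city"]["buckets"])
    # Nested address docs that did not pass the shipping_to filter.
    others = address["doc_count"] - filtered["doc_count"]
    if others > 0:
        buckets.append({"key": others_key, "doc_count": others})
    return buckets


# Aggregations part of the response shown in the question.
response_aggs = {
    "address": {
        "doc_count": 3,
        "shipping_to_address": {
            "doc_count": 1,
            "city": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 0,
                "buckets": [{"key": "new york", "doc_count": 1}],
            },
        },
    }
}

print(add_others_bucket(response_aggs))
```

With the sample response this yields the "new york" bucket with a count of 1 and an "others" bucket with a count of 2, which matches the desired output in the question.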

Related

Remove selected filters from Nested and Aggregation in Elasticsearch to perform filtered search

I am trying to use Elasticsearch (version 7.8.0) to perform faceted search and have gotten something to work. However, I want to remove the selected filters from the returned aggregations. For example, if I had a shop selling clothes and I filtered the products by colour: red, I would not want colour: red to appear in my aggregations, since it has already been selected.
To perform my filters, I have the following nested field on my example products:
"search_filters" : {
"type" : "nested",
"properties" : {
"key" : {
"type" : "keyword"
},
"value" : {
"type" : "keyword"
}
}
},
So if I had data like
"search_filters" : [
{
"key" : "colour",
"value" : "red"
}
]
I then perform a search like:
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "search_filters",
"query": {
"bool": {
"must": [
{
"match": {
"search_filters.key": "colour"
}
},
{
"match": {
"search_filters.value": "red"
}
}
]
}
}
}
}
]
}
},
"aggs": {
"filters": {
"nested": {
"path": "search_filters"
},
"aggs": {
"search_keys": {
"terms": {
"field": "search_filters.key"
},
"aggs": {
"search_values": {
"terms": {
"field": "search_filters.value"
}
}
}
}
}
}
}
}
This gives me the right documents, but my aggregations still show colour: red, e.g.:
{
...
"filters": {
"doc_count": 31,
"search_keys": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "colour",
"doc_count": 31,
"search_values": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "red",
"doc_count": 31
}
...
]
}
}
...
]
}
}
}
Is there a way to exclude these from elasticsearch or do I have to manually parse and ignore/remove my selected filters?
I have seen the use of filter in the aggregation field part of the request but I couldn't get that to work with my filter field; especially when I applied more than one filter e.g. colour and size.
You can use a filter aggregation to exclude the nested entries that match "search_filters.key": "colour" and "search_filters.value": "red". Additional selected filters can be excluded with additional clauses of the same shape.
{
"query": {
"nested": {
"path": "search_filters",
"query": {
"bool": {
"must": [
{
"match": {
"search_filters.key": "colour"
}
},
{
"match": {
"search_filters.value": "red"
}
}
]
}
}
}
},
"aggs": {
"filters": {
"nested": {
"path": "search_filters"
},
"aggs": {
"filterResult": {
"filter": {
"bool": {
"must_not": {
"bool": {
"must": [
{
"match": {
"search_filters.key": "colour"
}
},
{
"match": {
"search_filters.value": "red"
}
}
]
}
}
}
},
"aggs": {
"search_keys": {
"terms": {
"field": "search_filters.key"
},
"aggs": {
"search_values": {
"terms": {
"field": "search_filters.value"
}
}
}
}
}
}
}
}
}
}
The search result will be:
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 0.8754687,
"hits": [
{
"_index": "66222818",
"_type": "_doc",
"_id": "1",
"_score": 0.8754687,
"_source": {
"search_filters": [
{
"key": "colour",
"value": "red" // note this
}
]
}
}
]
},
"aggregations": {
"filters": {
"doc_count": 1,
"filterResult": {
"doc_count": 0,
"search_keys": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [] // note this
}
}
}
}
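For the case the question raises, where more than one filter is selected (e.g. colour and size), the exclusion filter can be generated rather than written by hand. A minimal Python sketch, assuming the search_filters mapping from the question; the helper name build_exclusion_filter and the second filter "size": "M" are made up for illustration:

```python
def build_exclusion_filter(selected):
    """Return a bool filter that rejects nested entries equal to any selected pair."""
    return {
        "bool": {
            "must_not": [
                {
                    "bool": {
                        "must": [
                            {"match": {"search_filters.key": key}},
                            {"match": {"search_filters.value": value}},
                        ]
                    }
                }
                for key, value in selected.items()
            ]
        }
    }


# "size": "M" is a hypothetical second selected filter.
selected = {"colour": "red", "size": "M"}
exclusion = build_exclusion_filter(selected)
print(len(exclusion["bool"]["must_not"]))
```

Each selected pair gets its own must_not clause here because a single nested entry holds only one key and one value; putting both pairs into the same must list would describe an entry that cannot exist, so nothing would be excluded.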

Add multiple filters to nested aggregation filters Elasticsearch

I would like to add a couple more filters to the filter aggregation in the "inner" portion of the aggs section. The other two filters I need are already in the query section. The code below works correctly; it just needs the second and third nested filters from the query section added to the aggregation, where I currently filter only on the "givingMatch.db_type" terms.
Here is the current code that needs the additional filters:
GET /testserver/_search
{
"query": {
"bool": {
"filter": [
{
"nested": {
"path": "givingMatch",
"query": {
"bool": {
"filter": {
"terms": {
"givingMatch.db_type": [
"FECmatch",
"StateMatch"
]
}
}
}
}
}
},
{
"nested": {
"path": "givingMatch",
"query": {
"bool": {
"filter": {
"range": {
"givingMatch.Status": {
"from": 0,
"to": 8
}
}
}
}
}
}
},
{
"nested": {
"path": "givingMatch",
"query": {
"bool": {
"filter": {
"range": {
"givingMatch.QualityScore": {
"from": 17
}
}
}
}
}
}
}
]
}
},
"aggs": {
"categories": {
"nested": {
"path": "givingMatch"
},
"aggs": {
"inner": {
"filter": {
"terms": {
"givingMatch.db_type":["FECmatch","StateMatch"]
}
},
"aggs":{
"org_category": {
"terms": {
"field": "givingMatch.org_category",
"size": 1000
},
"aggs": {
"total": {
"sum":{
"field": "givingMatch.low_gift"
}
}
}
}
}
}
}
}
},
"size": 0
}
Giving these results:
...."aggregations": {
"categories": {
"doc_count": 93084,
"inner": {
"doc_count": 65492,
"org_category": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "DEM",
"doc_count": 28829,
"total": {
"value": 29859163
}
},
{
"key": "REP",
"doc_count": 21561,
"total": {
"value": 69962305
}
},...
Hopefully this will save someone else a few hours. To add multiple filters, the aggregate section would become:
GET /testserver/_search
{
"aggs": {
"categories": {
"nested": {
"path": "givingMatch"
},
"aggs": {
"inner": {
"filter": {
"bool": {
"must": [
{
"terms": {
"givingMatch.db_type": [
"FECmatch",
"StateMatch"
]
}
},
{
"range": {
"givingMatch.QualityScore": {
"from": 17
}
}
},
{
"range": {
"givingMatch.Status": {
"from": 0,
"to": 8
}
}
}
]
}
},
"aggs": {
"org_category": {
"terms": {
"field": "givingMatch.org_category",
"size": 1000
},
"aggs": {
"total": {
"sum": {
"field": "givingMatch.low_gift"
}
}
}
}
}
}
}
}
}
}
This allows for multiple filters within the nested aggs.
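Since the same three conditions appear in both the query and the aggregation, one way to keep them in sync is to define them once and build both parts from that list. A minimal Python sketch, assuming the field names from the question; the helper names nested_query_filters and inner_agg_filter are hypothetical:

```python
NESTED_PATH = "givingMatch"

# The three conditions from the question, defined once.
clauses = [
    {"terms": {"givingMatch.db_type": ["FECmatch", "StateMatch"]}},
    {"range": {"givingMatch.Status": {"from": 0, "to": 8}}},
    {"range": {"givingMatch.QualityScore": {"from": 17}}},
]


def nested_query_filters(path, conditions):
    """Wrap each condition in its own nested query, as in the question's query section."""
    return [
        {"nested": {"path": path, "query": {"bool": {"filter": c}}}}
        for c in conditions
    ]


def inner_agg_filter(conditions):
    """Combine all conditions into one bool filter for the nested aggregation."""
    return {"bool": {"must": list(conditions)}}


body = {
    "size": 0,
    "query": {"bool": {"filter": nested_query_filters(NESTED_PATH, clauses)}},
    "aggs": {
        "categories": {
            "nested": {"path": NESTED_PATH},
            "aggs": {
                "inner": {
                    "filter": inner_agg_filter(clauses),
                    "aggs": {
                        "org_category": {
                            "terms": {"field": "givingMatch.org_category", "size": 1000},
                            "aggs": {"total": {"sum": {"field": "givingMatch.low_gift"}}},
                        }
                    },
                }
            },
        }
    },
}
```

Note that the two parts are not strictly equivalent: three separate nested clauses in the query can each be satisfied by a different givingMatch entry of the same parent, whereas the bool filter inside the nested aggregation requires a single entry to satisfy all three conditions.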

Elasticsearch summing buckets

I have the following request, which returns the count of all documents with a status of "Accepted", "Released" or "Closed".
{
"size": 0,
"query": {
"bool": {
"must": [
{
"query_string": {
"query": "*",
"analyze_wildcard": true
}
}
],
"must_not": []
}
},
"aggs": {
"slices": {
"terms": {
"field": "status.raw",
"include": {
"pattern": "Accepted|Released|Closed"
}
}
}
}
}
In my case the response is:
"buckets": [
{
"key": "Closed",
"doc_count": 2216
},
{
"key": "Accepted",
"doc_count": 8
},
{
"key": "Released",
"doc_count": 6
}
]
Now I'd like to add all of them up into a single field.
I tried using pipeline aggregations and even tried the following sum_bucket (which apparently only works on multi-bucket):
"total":{
"sum_bucket":{
"buckets_path": "slices"
}
}
Anyone able to help me out with this?
Use sum_bucket with your existing aggregation, pointing buckets_path at the bucket counts:
"aggs": {
"slices": {
"terms": {
"field": "status.raw",
"include": {
"pattern": "Accepted|Released|Closed"
}
}
},
"sum_total": {
"sum_bucket": {
"buckets_path": "slices._count"
}
}
}
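To make clear what the "slices._count" path adds up, here is the same arithmetic done on the client over the buckets from the question. This is only a sketch of the calculation in Python, not a replacement for the pipeline aggregation; the function name sum_bucket_counts is hypothetical:

```python
def sum_bucket_counts(buckets):
    """Add up doc_count over all buckets, like sum_bucket over 'slices._count'."""
    return sum(bucket["doc_count"] for bucket in buckets)


# Buckets copied from the response in the question.
buckets = [
    {"key": "Closed", "doc_count": 2216},
    {"key": "Accepted", "doc_count": 8},
    {"key": "Released", "doc_count": 6},
]

total = sum_bucket_counts(buckets)
print(total)  # 2230
```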
What I would do is to use the filters aggregation instead and define all the buckets you need, like this:
{
"size": 0,
"query": {
"bool": {
"must": [
{
"query_string": {
"query": "*",
"analyze_wildcard": true
}
}
],
"must_not": []
}
},
"aggs": {
"slices": {
"filters": {
"filters": {
"accepted": {
"term": {
"status.raw": "Accepted"
}
},
"released": {
"term": {
"status.raw": "Released"
}
},
"closed": {
"term": {
"status.raw": "Closed"
}
},
"total": {
"terms": {
"status.raw": [
"Accepted",
"Released",
"Closed"
]
}
}
}
}
}
}
}
You could add a count with a value_count sub-aggregation and then use the sum_bucket pipeline aggregation:
{
"aggs": {
"unique_status": {
"terms": {
"field": "status.raw",
"include": "Accepted|Released|Closed"
},
"aggs": {
"count": {
"value_count": {
"field": "status.raw"
}
}
}
},
"sum_status": {
"sum_bucket": {
"buckets_path": "unique_status>count"
}
}
},
"size": 0
}

Elasticsearch applying filters to aggregation

I'm trying to build a facets system using Elasticsearch to display the number of documents which match a query.
I'm currently doing this query on /_search?search_type=count:
{
"query": {
"query_string": {
"query": "status:(1|2) AND categories:A"
}
},
"aggs": {
"all_products": {
"global": {},
"aggs": {
"countries": {
"aggs": {
"counter": {
"terms": ["min_doc_count": 0, "field": "country"],
"aggs": ["unique": ["cardinality": ["field": "id"]]]
}
}
},
"categories": {
"aggs": {
"counter": {
"terms": ["min_doc_count": 0, "field": "category"],
"aggs": ["unique": ["cardinality": ["field": "id"]]]
}
}
},
"statuses": {
"aggs": {
"counter": {
"terms": ["min_doc_count": 0, "field": "status"],
"aggs": ["unique": ["cardinality": ["field": "id"]]]
}
}
}
}
}
}
}
the documents have the following structure:
{
"id": 123,
"name": "Title",
"categories": ["A", "B", "C"],
"country": "United Kingdom",
"status": 1
}
so the output I'm looking for should be:
Country
UK: 123
USA: 1000
Category
Motors: 23
Fashion: 1100
Status
Active: 1120
Not Active: 3
I don't know how to filter the aggregations properly: right now they count all the documents for the specified field, without taking the query status:(1|2) AND categories:A into account.
The elastic version is 1.7.2.
You simply need to remove the global aggregation, since it is not influenced by the query. Move your countries, categories and statuses aggregations to the top level, like this:
{
"query": {
"query_string": {
"query": "status:(1|2) AND categories:A"
}
},
"aggs": {
"countries": {
"aggs": {
"counter": {
"terms": ["min_doc_count": 0, "field": "country"],
"aggs": ["unique": ["cardinality": ["field": "id"]]]
}
}
},
"categories": {
"aggs": {
"counter": {
"terms": ["min_doc_count": 0, "field": "category"],
"aggs": ["unique": ["cardinality": ["field": "id"]]]
}
}
},
"statuses": {
"aggs": {
"counter": {
"terms": ["min_doc_count": 0, "field": "status"],
"aggs": ["unique": ["cardinality": ["field": "id"]]]
}
}
}
}
}
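The difference between the two scopes can be illustrated without a cluster. The following is a toy in-memory Python sketch, not Elasticsearch itself: the four documents are invented, and matches_query is a hand-written stand-in for what the query status:(1|2) AND categories:A is intended to select (status 1 or 2, and category A).

```python
from collections import Counter

# Invented sample documents in the shape shown in the question.
docs = [
    {"id": 1, "categories": ["A", "B"], "country": "United Kingdom", "status": 1},
    {"id": 2, "categories": ["A"], "country": "USA", "status": 2},
    {"id": 3, "categories": ["C"], "country": "USA", "status": 1},
    {"id": 4, "categories": ["A"], "country": "USA", "status": 3},
]


def matches_query(doc):
    """Stand-in for the intended query: status is 1 or 2 and categories contains A."""
    return doc["status"] in (1, 2) and "A" in doc["categories"]


def count_by_country(documents):
    """Count documents per country, like a terms aggregation on country."""
    return Counter(doc["country"] for doc in documents)


# Global scope: every document is counted, the query is ignored.
global_counts = count_by_country(docs)
# Query scope: only documents matching the query are counted.
query_counts = count_by_country(d for d in docs if matches_query(d))

print(dict(global_counts))
print(dict(query_counts))
```

In this toy data the global scope counts three documents for USA, while the query scope counts only one, which is the behaviour the question is asking for.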
Fabio, I saw your post on Upwork. I have a working example for ES 2.4; maybe it will help you.
"index": "{{YOUR ELASTIC INDEX}}",
"type": "{{YOUR ELASTIC TYPE}}",
"body": {
"aggs": {
"trademarks": { // aggs NAME
"terms": {
"field": "id", // field name in ELASTIC base
"size": 100 // count of results YOU need
}
},
"materials": { //another aggs NAME
"terms": {
"field": "materials.name", // field name in ELASTIC base
"size": 100 // count of results YOU need
}
},
"certificate": {
"terms": {
"field": "certificate_type_id",
"size": 100
}
},
"country": {
"terms": {
"field": "country.id",
"size": 100
}
},
"price": {
"stats": {
"field": "price"
}
}
},
"from": 0, // start from
"size": 20, // results count
"query": {
"constant_score": {
"filter": { //apply filter
"bool": {
"should": [{ // all categories You need to show
"term": {
"categories": "10142"
}
}, {
"term": {
"categories": "10143"
}
}, {
"term": {
"categories": "10144"
}
}, {
"term": {
"categories": "10145"
}
}, {
"term": {
"categories": "12957"
}
}, {
"term": {
"categories": "13968"
}
}, {
"term": {
"categories": "14353"
}
}, {
"term": {
"categories": "16954"
}
}, {
"term": {
"categories": "18243"
}
}, {
"term": {
"categories": "10141"
}
}],
"must": [{ // if you want another filed to filter for example filter BY field trademark_id
"bool": {
"should": [{
"term": {
"trademark_id": "2872"
}
}, {
"term": {
"trademark_id": "2879"
}
}, {
"term": {
"trademark_id": "2914"
}
}]
}
}, {
"bool": { // filter by PRICE
"must": [{
"range": {
"price": {
"from": 5.97,
"to": 15752.69
}
}
}]
}
}]
}
}
}
},
"sort": { //here SORT BY desc or asc
"updated_at": "desc" //updated_at - field from ES base
}
}

Elasticsearch sum_bucket aggregation to sum the values contained in resulting buckets

I have a query as follows:
{
"size": 0,
"query": {
"filtered": {
"query": {
"bool": {
"must": [
{
"match": {
"_type": "grx-ipx"
}
},
{
"range": {
"#timestamp": {
"gte": "2015-09-08T15:00:00.000Z",
"lte": "2015-09-08T15:10:00.000Z"
}
}
}
]
}
},
"filter": {
"and": [
{
"terms": {
"inSightCustID": [
"ASD001",
"ZXC049"
]
}
},
{
"terms": {
"reportFamily": [
"GRXoIPX",
"LTEoIPX"
]
}
}
]
}
}
},
"_source": [
"inSightCustID",
"fiveMinuteIn",
"reportFamily",
"#timestamp"
],
"aggs": {
"timestamp": {
"terms": {
"field": "#timestamp",
"size": 5
},
"aggs": {
"reportFamily": {
"terms": {
"field": "reportFamily"
},
"aggs": {
"averageFiveMinute": {
"avg": {
"field": "fiveMinuteIn"
}
}
}
}
}
},
"distinct_timestamps": {
"cardinality": {
"field": "#timestamp"
}
}
}
}
The result of this query looks like:
...
"aggregations": {
"distinct_timestamps": {
"value": 3,
"value_as_string": "1970-01-01T00:00:00.003Z"
},
"timestamp": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": 1441724700000,
"key_as_string": "2015-09-08T15:05:00.000Z",
"doc_count": 10,
"reportFamily": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "GRXoIPX",
"doc_count": 5,
"averageFiveMinute": {
"value": 1687.6
}
},
{
"key": "LTEoIPX",
"doc_count": 5,
"averageFiveMinute": {
"value": 56710.6
}
}
]
}
},
...
For each timestamp bucket, I want to show the sum of the averageFiveMinute values across its reportFamily buckets. In the example above, that would be the sum of 1687.6 and 56710.6. I want this for every timestamp bucket.
Here is what I have tried:
{
"size": 0,
"query": {
"filtered": {
"query": {
"bool": {
"must": [
{
"match": {
"_type": "grx-ipx"
}
},
{
"range": {
"#timestamp": {
"gte": "2015-09-08T15:00:00.000Z",
"lte": "2015-09-08T15:10:00.000Z"
}
}
}
]
}
},
"filter": {
"and": [
{
"terms": {
"inSightCustID": [
"ASD001",
"ZXC049"
]
}
},
{
"terms": {
"reportFamily": [
"GRXoIPX",
"LTEoIPX"
]
}
}
]
}
}
},
"_source": [
"inSightCustID",
"fiveMinuteIn",
"reportFamily",
"#timestamp"
],
"aggs": {
"timestamp": {
"terms": {
"field": "#timestamp",
"size": 5
},
"aggs": {
"reportFamily": {
"terms": {
"field": "reportFamily"
},
"aggs": {
"averageFiveMinute": {
"avg": {
"field": "fiveMinuteIn"
}
}
}
},
"sum_AvgFiveMinute": {
"sum_bucket": {
"buckets_path": "reportFamily>averageFiveMinute"
}
}
}
},
"distinct_timestamps": {
"cardinality": {
"field": "#timestamp"
}
}
}
}
I have added:
"sum_AvgFiveMinute": {
"sum_bucket": {
"buckets_path": "reportFamily>averageFiveMinute"
}
}
But unfortunately, this triggers an exception: Parse Failure [Could not find aggregator type [sum_bucket] in [sum_AvgFiveMinute]
I expected the results to be something like:
...
"aggregations": {
"distinct_timestamps": {
"value": 3,
"value_as_string": "1970-01-01T00:00:00.003Z"
},
"timestamp": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": 1441724700000,
"key_as_string": "2015-09-08T15:05:00.000Z",
"doc_count": 10,
"reportFamily": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "GRXoIPX",
"doc_count": 5,
"averageFiveMinute": {
"value": 1687.6
}
},
{
"key": "LTEoIPX",
"doc_count": 5,
"averageFiveMinute": {
"value": 56710.6
}
}
]
},
"sum_AvgFiveMinute": {
"value": 58398.2
}
},
...
What is wrong with this query and how can I achieve the expected result?
Here is a link to the sum bucket aggregation docs.
Many thanks for the help.
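A possible explanation, offered with some caution: pipeline aggregations such as sum_bucket were introduced in Elasticsearch 2.0, and the "filtered" query and "and" filter in the request suggest a 1.x cluster, which would be consistent with the "Could not find aggregator type" error. If upgrading is not an option, the sum can be computed on the client from the response. A minimal Python sketch, assuming the response shape shown above (the function name sum_of_averages is hypothetical):

```python
def sum_of_averages(timestamp_bucket):
    """Sum averageFiveMinute over the reportFamily buckets of one timestamp bucket."""
    values = (
        family["averageFiveMinute"]["value"]
        for family in timestamp_bucket["reportFamily"]["buckets"]
    )
    # An avg over a bucket without values comes back as null, so skip None.
    return sum(v for v in values if v is not None)


# One timestamp bucket copied from the response in the question.
timestamp_bucket = {
    "key": 1441724700000,
    "key_as_string": "2015-09-08T15:05:00.000Z",
    "doc_count": 10,
    "reportFamily": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
            {"key": "GRXoIPX", "doc_count": 5, "averageFiveMinute": {"value": 1687.6}},
            {"key": "LTEoIPX", "doc_count": 5, "averageFiveMinute": {"value": 56710.6}},
        ],
    },
}

total = sum_of_averages(timestamp_bucket)
print(round(total, 1))
```

Applying this to every entry in the "timestamp" buckets list gives one sum per timestamp, which corresponds to the sum_AvgFiveMinute value in the expected output.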
