Elasticsearch aggregation return null values as 0? - elasticsearch

I can count the hits per day that match my queried string with this code, but if the span of a whole week has no hits, then the query will return nothing - as opposed to returning 0 for each day. Is there a way I can 'default' to 0?
GET index/_search
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "match_phrase": {
            "message": "Cannot login"
          }
        },
        {
          "range": {
            "#timestamp": {
              "gte": "2021-07-01",
              "lte": "2021-07-07"
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "hit_count_per_day": {
      "date_histogram": {
        "field": "#timestamp",
        "calendar_interval": "day"
      }
    }
  }
}

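For this purpose, you need to add extended_bounds to your aggregation, as shown below. It forces the histogram to create buckets across the whole range, even for days with no hits: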
GET index/_search
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "match_phrase": {
            "message": "Cannot login"
          }
        },
        {
          "range": {
            "#timestamp": {
              "gte": "2021-07-01",
              "lte": "2021-07-07"
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "hit_count_per_day": {
      "date_histogram": {
        "field": "#timestamp",
        "calendar_interval": "day",
        "extended_bounds": {
          "min": "2021-07-01",
          "max": "2021-07-07"
        }
      }
    }
  }
}
Please let me know if it did not solve your problem.

When Elasticsearch returns 0s in an aggregation response, it means some values exist in the index but are not shown because of filters.
For example, I had a term filter on "region":"mexico" and it returned:
"colombia": 0
"argentina": 0
"mexico": 7
The first two appear because they are in the dataset (index) but were filtered out.
Hope this helps.
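
As a minimal sketch of that behaviour, assuming the buckets come from a terms aggregation on a keyword-mapped region field (the aggregation name by_region is made up for illustration): a terms aggregation only returns zero-count buckets when min_doc_count is set to 0. In that case, terms that exist in the index but do not match the query come back with a doc_count of 0.

```json
{
  "size": 0,
  "query": {
    "term": {
      "region": "mexico"
    }
  },
  "aggs": {
    "by_region": {
      "terms": {
        "field": "region",
        "min_doc_count": 0
      }
    }
  }
}
```

With the default min_doc_count of 1, the same request would return only the mexico bucket.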

Related

Filter an elasticsearch result after an aggregation

I have this Elasticsearch query that gets every x-location for which the number of documents (with timestamp gte 1 month ago) is greater than 5000. I'm also able to get the most recent data timestamp for each of these x-locations.
Is it possible to add an additional filter at the end of the query, in order to ignore all x-locations whose most recent timestamp is older than 2 days ago?
The query:
GET /mypattern-*/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {"method": "GET"}},
        {
          "range": {
            "timestamp": {
              "gte": "now-1M"
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "location_terms": {
      "terms": {
        "field": "x-location.keyword",
        "min_doc_count": 500,
        "size": 1000,
        "order": {
          "recent_timestamp": "desc"
        }
      },
      "aggs": {
        "recent_timestamp": {
          "max": {
            "field": "timestamp"
          }
        }
      }
    }
  }
}
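
One possible approach, which is not taken from the original thread, is a bucket_selector pipeline aggregation placed next to recent_timestamp. The sketch below assumes the client computes the cutoff itself (now minus 2 days, in epoch milliseconds) and passes it as a script parameter. The aggregation name recent_only, the variable name latest, and the value 1625097600000 are placeholders for illustration only.

```json
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {"match": {"method": "GET"}},
        {"range": {"timestamp": {"gte": "now-1M"}}}
      ]
    }
  },
  "aggs": {
    "location_terms": {
      "terms": {
        "field": "x-location.keyword",
        "min_doc_count": 500,
        "size": 1000,
        "order": {"recent_timestamp": "desc"}
      },
      "aggs": {
        "recent_timestamp": {
          "max": {"field": "timestamp"}
        },
        "recent_only": {
          "bucket_selector": {
            "buckets_path": {"latest": "recent_timestamp"},
            "script": {
              "source": "params.latest >= params.cutoff",
              "params": {"cutoff": 1625097600000}
            }
          }
        }
      }
    }
  }
}
```

The max aggregation on a date field yields epoch milliseconds, so the comparison is numeric. Buckets for which the script returns false are dropped from the response.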

malformed bool query elasticsearch - Elasticsearch watcher

Hi, I have the Elasticsearch query below, which I am running in Dev Tools. I keep getting errors for my bool query, but it looks correct to me: it filters on the #timestamp field and should retrieve only one day's worth of data.
"input": {
"search": {
"request": {
"indices": [
"<iovation-*>"
],
"body": {
"size": 0,
"query": {
"bool": {
"must": {
"range": {
"#timestamp": {
"gte": "now-1d"
}
}
}
},
"aggs": {
"percentiles": {
"percentiles": {
"field": "logstash.load.duration",
"percents": 95,
"keyed": false
}
},
"dates": {
"date_histogram": {
"field": "#timestamp",
"calendar_interval": "5m",
"min_doc_count": 1
}
}
}
}
}
}
}
},
Any help is appreciated thanks!
There are a few errors in your query.
Whenever an aggregation is used along with the query part, the structure is:
{
  "query": {},
  "aggs": {}
}
You are missing one } at the end of the query part, so aggs ends up nested inside query.
Calendar intervals do not accept multiple quantities like 2d, 2m, etc.
If you need a fixed interval such as 5m, use the fixed_interval parameter instead.
Modify your query as follows:
{
  "size": 0,
  "query": {
    "bool": {
      "must": {
        "range": {
          "#timestamp": {
            "gte": "now-1d"
          }
        }
      }
    } // note this
  },
  "aggs": {
    "percentiles": {
      "percentiles": {
        "field": "logstash.load.duration",
        "percents": 95,
        "keyed": false
      }
    },
    "dates": {
      "date_histogram": {
        "field": "#timestamp",
        "fixed_interval": "5m", // note this
        "min_doc_count": 1
      }
    }
  }
}
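
For reference, here is a sketch of how the corrected body might be placed back into the watcher input from the question. The index pattern and field names are copied from the question. Writing percents as an array is the documented form and is my own adjustment, and the field in the date histogram is assumed to be the same #timestamp field used in the range query.

```json
{
  "input": {
    "search": {
      "request": {
        "indices": ["<iovation-*>"],
        "body": {
          "size": 0,
          "query": {
            "bool": {
              "must": {
                "range": {
                  "#timestamp": {"gte": "now-1d"}
                }
              }
            }
          },
          "aggs": {
            "percentiles": {
              "percentiles": {
                "field": "logstash.load.duration",
                "percents": [95],
                "keyed": false
              }
            },
            "dates": {
              "date_histogram": {
                "field": "#timestamp",
                "fixed_interval": "5m",
                "min_doc_count": 1
              }
            }
          }
        }
      }
    }
  }
}
```

Here query and aggs are siblings inside body, which is the structure described above.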

Need aggregation of only the query results

I need to run an aggregation over only the limited results I get from the query, but it is not working: it includes results beyond the query's size limit. Here is the query I am running:
{
  "size": 500,
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "tags.keyword": "possiblePurchase"
          }
        },
        {
          "term": {
            "clientName": "Ci"
          }
        },
        {
          "range": {
            "firstSeenDate": {
              "gte": "now-30d"
            }
          }
        }
      ],
      "must_not": [
        {
          "term": {
            "tags.keyword": "skipPurchase"
          }
        }
      ]
    }
  },
  "sort": [
    {
      "firstSeenDate": {
        "order": "desc"
      }
    }
  ],
  "aggs": {
    "byClient": {
      "terms": {
        "field": "clientName",
        "size": 25
      },
      "aggs": {
        "byTarget": {
          "terms": {
            "field": "targetName",
            "size": 6
          },
          "aggs": {
            "byId": {
              "terms": {
                "field": "id",
                "size": 5
              }
            }
          }
        }
      }
    }
  }
}
I need the aggregations to only consider the first 500 results of the query, sorted by the field I am requesting on the query. I am completely lost. Thanks for the help
The scope of the aggregation is every hit that matches your query; the size parameter only specifies the number of hits to fetch and display.
If you want to restrict the scope of the aggregation to the first n hits of a query, I would suggest using the sampler aggregation in combination with your query.
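
A minimal sketch of that suggestion, with the original aggregations wrapped in a sampler (the name sample is made up for illustration). Note two caveats: shard_size is a per-shard limit, and the sampler keeps the top-scoring documents rather than following the sort clause of the request, so this only approximates "the first 500 hits sorted by firstSeenDate".

```json
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {"term": {"tags.keyword": "possiblePurchase"}},
        {"term": {"clientName": "Ci"}},
        {"range": {"firstSeenDate": {"gte": "now-30d"}}}
      ],
      "must_not": [
        {"term": {"tags.keyword": "skipPurchase"}}
      ]
    }
  },
  "aggs": {
    "sample": {
      "sampler": {
        "shard_size": 500
      },
      "aggs": {
        "byClient": {
          "terms": {"field": "clientName", "size": 25},
          "aggs": {
            "byTarget": {
              "terms": {"field": "targetName", "size": 6},
              "aggs": {
                "byId": {
                  "terms": {"field": "id", "size": 5}
                }
              }
            }
          }
        }
      }
    }
  }
}
```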

ES query ignoring time range filter

I have mimicked how Kibana does a query search and have come up with the query below. Basically, I'm looking for the last 6 days of data (including days with no data, since I need to feed it to a graph). But the returned buckets cover more than just those days. I would like to understand where I'm going wrong with this.
{
  "version": true,
  "size": 0,
  "sort": [
    {
      "#timestamp": {
        "order": "desc",
        "unmapped_type": "boolean"
      }
    }
  ],
  "_source": {
    "excludes": []
  },
  "aggs": {
    "target_traffic": {
      "date_histogram": {
        "field": "#timestamp",
        "interval": "1d",
        "time_zone": "Asia/Kolkata",
        "min_doc_count": 0,
        "extended_bounds": {
          "min": "now-6d/d",
          "max": "now"
        }
      },
      "aggs": {
        "days_filter": {
          "filter": {
            "range": {
              "#timestamp": {
                "gt": "now-6d",
                "lte": "now"
              }
            }
          },
          "aggs": {
            "in_bytes": {
              "sum": {
                "field": "netflow.in_bytes"
              }
            },
            "out_bytes": {
              "sum": {
                "field": "netflow.out_bytes"
              }
            }
          }
        }
      }
    }
  },
  "stored_fields": [
    "*"
  ],
  "script_fields": {},
  "docvalue_fields": [
    "#timestamp",
    "netflow.first_switched",
    "netflow.last_switched"
  ],
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "( flow.src_addr: ( \"10.5.5.1\" OR \"10.5.5.2\" ) OR flow.dst_addr: ( \"10.5.5.1\" OR \"10.5.5.2\" ) ) AND flow.traffic_locality: \"private\"",
            "analyze_wildcard": true,
            "default_field": "*"
          }
        }
      ]
    }
  }
}
If you put the range filter inside your aggregation section without any date range in your query, the date_histogram runs on all your data, so you get one bucket per day over the whole dataset, and the filter only limits the metrics inside each bucket.
The range query on #timestamp should be moved into the query section so that aggregations are computed only on the data you want, i.e. the last 6 days.
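
A sketch of what that could look like, with the Kibana-specific extras (version, sort, stored_fields and so on) left out for brevity. Placing the range in a bool filter clause rather than in must is my own choice, since the range does not need to contribute to scoring. The interval parameter is kept exactly as written in the question.

```json
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "( flow.src_addr: ( \"10.5.5.1\" OR \"10.5.5.2\" ) OR flow.dst_addr: ( \"10.5.5.1\" OR \"10.5.5.2\" ) ) AND flow.traffic_locality: \"private\"",
            "analyze_wildcard": true,
            "default_field": "*"
          }
        }
      ],
      "filter": [
        {
          "range": {
            "#timestamp": {"gt": "now-6d", "lte": "now"}
          }
        }
      ]
    }
  },
  "aggs": {
    "target_traffic": {
      "date_histogram": {
        "field": "#timestamp",
        "interval": "1d",
        "time_zone": "Asia/Kolkata",
        "min_doc_count": 0,
        "extended_bounds": {"min": "now-6d/d", "max": "now"}
      },
      "aggs": {
        "in_bytes": {"sum": {"field": "netflow.in_bytes"}},
        "out_bytes": {"sum": {"field": "netflow.out_bytes"}}
      }
    }
  }
}
```

With the range applied in the query, the days_filter wrapper is no longer needed, and extended_bounds still pads the histogram with empty buckets for days without data.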

ElasticSearch - significant term aggregation with range

I would like to know how I can add a range to a significant terms aggregation query. For example:
{
  "query": {
    "terms": {
      "text_content": [
        "searchTerm"
      ]
    },
    "range": {
      "dateField": {
        "from": "date1",
        "to": "date2"
      }
    }
  },
  "aggregations": {
    "significantQTypes": {
      "significant_terms": {
        "field": "field1",
        "size": 10
      }
    }
  },
  "size": 0
}
will not work. Any suggestions on how to specify the range?
Instead of using a range query, use a range filter as the relevance/score doesn't seem to matter in your case.
Then, in order to combine your query with a range filter, you should use a filtered query (see documentation).
Try something like this:
{
  "query": {
    "filtered": {
      "query": {
        "terms": {
          "text_content": [
            "searchTerm"
          ]
        }
      },
      "filter": {
        "range": {
          "dateField": {
            "from": "date1",
            "to": "date2"
          }
        }
      }
    }
  },
  "aggs": {
    "significantQTypes": {
      "significant_terms": {
        "field": "field1",
        "size": 10
      }
    }
  },
  "size": 0
}
Hope this helps!
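
Note that the filtered query was deprecated in Elasticsearch 2.0 and removed in 5.0. On newer versions, the same idea can be sketched with a bool query that puts the range in a filter clause. The values date1 and date2 are the placeholders from the question, and gte/lte are used here in place of from/to.

```json
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "terms": {
            "text_content": ["searchTerm"]
          }
        }
      ],
      "filter": [
        {
          "range": {
            "dateField": {"gte": "date1", "lte": "date2"}
          }
        }
      ]
    }
  },
  "aggs": {
    "significantQTypes": {
      "significant_terms": {
        "field": "field1",
        "size": 10
      }
    }
  }
}
```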
