Query elasticsearch with multiple numeric ranges - elasticsearch

{
"query": {
"filtered": {
"query": {
"match": {
"log_path": "message_notification.log"
}
},
"filter": {
"numeric_range": {
"time_taken": {
"gte": 10
}
}
}
}
},
"aggs": {
"distinct_user_ids": {
"cardinality": {
"field": "user_id"
}
}
}
}
I have to run this query 20 times because I want the number of notification times above each of the following thresholds: [10,30,60,120,240,300,600,1200..]. Right now I run a loop and make 20 separate queries to fetch this.
Is there a saner way to query Elasticsearch once and get the counts for each of these thresholds?

What you probably want is a range aggregation.
Here is a possible query; you can add more ranges or alter the existing ones:
{
"size": 0,
"query": {
"match": {
"log_path": "message_notification.log"
}
},
"aggs": {
"intervals": {
"range": {
"field": "time_taken",
"ranges": [
{
"to": 50
},
{
"from": 50,
"to": 100
},
{
"from": 100
}
]
},
"aggs": {
"distinct_user_ids": {
"cardinality": {
"field": "user_id"
}
}
}
}
}
}
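Since the question asks for counts above each threshold rather than between neighbouring ones, one option is to give every range only a from value. Range aggregations accept overlapping ranges, so a document is counted in each bucket it qualifies for. The following is a minimal Python sketch that builds such a request body from the threshold list; the helper name build_threshold_query and the bucket keys are illustrative assumptions, not part of the original answer.

```python
import json


def build_threshold_query(thresholds, log_path="message_notification.log"):
    """Build one search body with an open-ended range bucket per threshold."""
    # Each range has only "from", so buckets overlap: a value of 700
    # falls into the >=10, >=30, ... and >=600 buckets at the same time.
    ranges = [{"key": f">={t}", "from": t} for t in thresholds]
    return {
        "size": 0,  # only aggregation results are needed, no hits
        "query": {"match": {"log_path": log_path}},
        "aggs": {
            "intervals": {
                "range": {"field": "time_taken", "ranges": ranges},
                "aggs": {
                    "distinct_user_ids": {"cardinality": {"field": "user_id"}}
                },
            }
        },
    }


body = build_threshold_query([10, 30, 60, 120, 240, 300, 600, 1200])
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as a single search request in place of the 20 looped queries.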

Related

malformed bool query elasticsearch - Elasticsearch watcher

Hi, I have the Elasticsearch query below, which I am running in Dev Tools. I keep getting errors for my bool query, but it looks correct to me: it filters on the @timestamp field to retrieve only one day's worth of data.
"input": {
"search": {
"request": {
"indices": [
"<iovation-*>"
],
"body": {
"size": 0,
"query": {
"bool": {
"must": {
"range": {
"@timestamp": {
"gte": "now-1d"
}
}
}
},
"aggs": {
"percentiles": {
"percentiles": {
"field": "logstash.load.duration",
"percents": 95,
"keyed": false
}
},
"dates": {
"date_histogram": {
"field": "@timestamp",
"calendar_interval": "5m",
"min_doc_count": 1
}
}
}
}
}
}
}
},
Any help is appreciated thanks!
There are a few errors in your query.
When an aggregation is used together with a query, the two are sibling keys at the top level of the request body:
{
"query": {},
"aggs": {}
}
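You are missing one } at the end of the query part, so aggs ends up nested inside query.
Calendar intervals do not accept multiples such as 2d or 5m; only a single unit (for example 1d) is allowed.
For a fixed-length interval such as 5m, use the fixed_interval parameter instead of calendar_interval.
Modify your query as follows: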
{
"size": 0,
"query": {
"bool": {
"must": {
"range": {
"@timestamp": {
"gte": "now-1d"
}
}
}
} // note this
},
"aggs": {
"percentiles": {
"percentiles": {
"field": "logstash.load.duration",
"percents": 95,
"keyed": false
}
},
"dates": {
"date_histogram": {
"field": "@timestamp",
"fixed_interval": "5m", // note this
"min_doc_count": 1
}
}
}
}
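The choice between calendar_interval and fixed_interval described above can be sketched as a small Python helper. This is only an illustration: the function names interval_param and date_histogram are hypothetical, and preferring calendar_interval for values such as 1d that are valid in both forms is an assumption of this sketch.

```python
import re

# Calendar intervals allow only a single unit.
CALENDAR_INTERVALS = {
    "1m", "1h", "1d", "1w", "1M", "1q", "1y",
    "minute", "hour", "day", "week", "month", "quarter", "year",
}
# Fixed intervals allow any multiple of ms, s, m, h or d.
FIXED_INTERVAL_RE = re.compile(r"^\d+(ms|s|m|h|d)$")


def interval_param(interval):
    """Return the date_histogram parameter name suited to the interval."""
    if interval in CALENDAR_INTERVALS:
        return "calendar_interval"
    if FIXED_INTERVAL_RE.match(interval):
        return "fixed_interval"
    raise ValueError(f"unsupported interval: {interval!r}")


def date_histogram(field, interval, min_doc_count=1):
    """Build a date_histogram aggregation body with the matching parameter."""
    return {
        "date_histogram": {
            "field": field,
            interval_param(interval): interval,
            "min_doc_count": min_doc_count,
        }
    }


dates_agg = date_histogram("@timestamp", "5m")
print(dates_agg)
```

With "5m" the helper emits fixed_interval, which matches the corrected query above.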

Elasticsearch distinct records in order with pagination

How do I get records after an aggregation on a terms field, in order and with pagination? So far I have this:
{
"query": {
"bool": {
"filter": [
{
"terms": {
"user_id.keyword": [
"user@domain.com"
]
}
},
{
"range": {
"creation_time": {
"gte": "2019-02-04T19:00:00.000Z",
"lte": "2019-05-04T19:00:00.000Z"
}
}
}
],
"should": [
{
"wildcard": {
"operation": "*sol*"
}
},
{
"wildcard": {
"object_id": "*sol*"
}
},
{
"wildcard": {
"user_id": "*sol*"
}
},
{
"wildcard": {
"user_type": "*sol*"
}
},
{
"wildcard": {
"client_ip": "*sol*"
}
},
{
"wildcard": {
"country": "*sol*"
}
},
{
"wildcard": {
"workload": "*sol*"
}
}
]
}
},
"aggs": {
"user_ids": {
"terms": {
"field": "country.keyword",
"include": ".*United.*"
}
}
},
"from": 0,
"size": 10,
"sort": [
{
"creation_time": {
"order": "desc"
}
}
]
}
I looked into this, and some people say it's possible using composite aggregations or partitions, but I am not sure how to actually achieve it.
I also looked into bucket_sort, but I can't seem to get it to work:
"my_bucket_sort": {
"bucket_sort": {
"sort": [
{
"user_ids": {
"order": "desc"
}
}
],
"size": 3
}
}
I am a noob at this. Kindly help me out. Thanks.
As the field is country, which presumably doesn't have high cardinality, you could set size to a number high enough to return all countries in a single request:
"aggs": {
"user_ids": {
"terms": {
"field": "country.keyword",
"include": ".*United.*",
"size": 10000
}
}
}
Alternatively, for a high-cardinality field, you could filter the aggregation first and then use partitioning to page through the values:
{
"size": 0,
"aggs": {
"user_ids": {
"filter": {
"wildcard" : { "country" : "*United*" }
},
"aggs": {
"countries": {
"terms": {
"field": "country.keyword",
"include": {
"partition": 0,
"num_partitions": 20
},
"size": 10000
}
}
}
}
}
}
Here you increase the value of partition by one with each query you send, from 0 up to 19.
See the Elastic documentation for further details.
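Paging through the partitions can be sketched in Python as a generator that yields one request body per partition. The function name partition_requests is hypothetical, and the outer filter aggregation from the answer is left out to keep the sketch short.

```python
import json


def partition_requests(field, num_partitions, page_size=10000):
    """Yield one terms-aggregation request body per partition."""
    for partition in range(num_partitions):
        yield {
            "size": 0,  # skip hits, return aggregation buckets only
            "aggs": {
                "countries": {
                    "terms": {
                        "field": field,
                        # Each request sees a disjoint slice of the terms.
                        "include": {
                            "partition": partition,
                            "num_partitions": num_partitions,
                        },
                        "size": page_size,
                    }
                }
            },
        }


bodies = list(partition_requests("country.keyword", 20))
print(json.dumps(bodies[0], indent=2))
```

Each body would be sent as its own search request, and the buckets from all 20 responses combined on the client.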

Elasticsearch term aggregation and range with timestamp

I'm trying to count the number of logs grouped by user agent.
This is what I have:
GET /myindex/_search
{
"size": 30,
"stored_fields": ["req.headers.user-agent.keyword"],
"aggs": {
"group_by_userAgent": {
"terms": {
"field": "req.headers.user-agent.keyword"
}
}
}
}
I want to add a "query the last 15 minutes" feature. I tried adding a range query and ended up with the following, which does not work:
GET /myindex/_search
{
"size": 30,
"stored_fields": ["req.headers.user-agent.keyword"],
"aggs": {
"group_by_userAgent": {
"terms": {
"field": "req.headers.user-agent.keyword"
},
"range": {
"timestamp": {
"gt": "now-15m"
}
}
}
}
}
How do I combine a terms aggregation with a range that uses the "now-15m" syntax?
The range belongs in the query section, not inside aggs. The time range expression itself is fine as it is.
I think what you're looking for is this: the number of docs in the first 30 user-agent buckets, i.e. the top 30 user agents producing the most logs.
GET /myindex/_search
{
"size": 0,
"query": {
"range": {
"@timestamp": {
"gt": "now-15m"
}
}
},
"aggs": {
"group_by_userAgent": {
"terms": {
"field": "req.headers.user-agent.keyword",
"size": 30
}
}
}
}
You can get the aggregation results for user agents in two ways: wrap the terms aggregation in a filter aggregation, or put the range in a top-level query.
POST phrase_index/_search
{
"aggs": {
"date_range_filtered_agg": {
"filter": {
"range": {
"timestamp": {
"gte": "now-15m/m"
}
}
},
"aggs": {
"group_by_userAgent": {
"terms": {
"field": "req.headers.user-agent.keyword",
"size": 10
}
}
}
}
},
"size": 30,
"stored_fields": ["req.headers.user-agent.keyword"]
}
POST phrase_index/_search
{
"query": {
"range": {
"timestamp": {
"gte": "now-15m/m"
}
}
},
"aggs": {
"group_by_userAgent": {
"terms": {
"field": "req.headers.user-agent.keyword",
"size": 10
}
}
},
"size": 30,
"stored_fields": ["req.headers.user-agent.keyword"]
}
You need a filter aggregation first to apply the range query, then add a terms sub-aggregation.
See: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-filter-aggregation.html

Elasticsearch: How to get the result buckets size

Here is my query:
GET _search
{
"size": 0,
"query": {
"bool": {
"must": [
{
"match": {
"serviceName.keyword": "directory-view-service"
}
},
{
"match": {
"path": "thewall"
}
},
{
"range": {
"@timestamp": {
"from": "now-31d",
"to": "now"
}
}
}
]
}
},
"aggs": {
"by_day": {
"date_histogram": {
"field": "date",
"interval": "7d"
},
"aggs": {
"byUserUid": {
"terms": {
"field": "token_userId.keyword",
"size": 150000
},
"aggs": {
"filterByCallNumber": {
"bucket_selector": {
"buckets_path": {
"doc_count": "_count"
},
"script": {
"inline": "params.doc_count <= 1"
}
}
}
}
}
}
}
}
}
I want my query to return every user who called my endpoint at least once, over a one-month range in 7-day intervals. Up to that point everything works.
But the result is a buckets array with 370 elements, and I only need to know the size of that array.
Is there a keyword for this, or how else can I handle it?
Thanks
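One simple option is to count the buckets on the client after the response arrives. The sketch below assumes the response has the shape produced by the by_day and byUserUid aggregations above; the function name count_user_buckets and the sample response are made up for illustration.

```python
def count_user_buckets(response):
    """Return the number of byUserUid buckets for each by_day bucket."""
    counts = {}
    for period in response["aggregations"]["by_day"]["buckets"]:
        # date_histogram buckets carry a numeric key and usually a
        # formatted key_as_string; prefer the readable one if present.
        label = period.get("key_as_string", period["key"])
        counts[label] = len(period["byUserUid"]["buckets"])
    return counts


# Hypothetical, heavily trimmed response used only to exercise the helper.
sample_response = {
    "aggregations": {
        "by_day": {
            "buckets": [
                {
                    "key_as_string": "2019-01-03T00:00:00.000Z",
                    "key": 1546473600000,
                    "byUserUid": {
                        "buckets": [
                            {"key": "user-a", "doc_count": 1},
                            {"key": "user-b", "doc_count": 1},
                        ]
                    },
                },
                {
                    "key": 1547078400000,
                    "byUserUid": {"buckets": [{"key": "user-c", "doc_count": 1}]},
                },
            ]
        }
    }
}

bucket_counts = count_user_buckets(sample_response)
print(bucket_counts)
```

This still transfers every bucket over the network, so it only removes the need to inspect the array by hand.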

ElasticSearch - significant term aggregation with range

I would like to know how to add a range to a significant terms aggregation query. For example:
{
"query": {
"terms": {
"text_content": [
"searchTerm"
]
},
"range": {
"dateField": {
"from": "date1",
"to": "date2"
}
}
},
"aggregations": {
"significantQTypes": {
"significant_terms": {
"field": "field1",
"size": 10
}
}
},
"size": 0
}
This does not work. Any suggestions on how to specify the range?
Instead of using a range query, use a range filter, as the relevance score doesn't seem to matter in your case.
Then, to combine your query with a range filter, use a filtered query (see documentation).
Try something like this:
{
"query": {
"filtered": {
"query": {
"terms": {
"text_content": [
"searchTerm"
]
}
},
"filter": {
"range": {
"dateField": {
"from": "date1",
"to": "date2"
}
}
}
}
},
"aggs": {
"significantQTypes": {
"significant_terms": {
"field": "field1",
"size": 10
}
}
},
"size": 0
}
Hope this helps!
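The filtered query was deprecated in Elasticsearch 2.0 and removed in 5.0, so on newer clusters the same combination is usually written as a bool query with a filter clause. The following Python sketch builds that body; the helper name significant_terms_in_range is hypothetical, the use of gte/lte instead of from/to is a choice made here, and "date1"/"date2" are the placeholders from the question.

```python
import json


def significant_terms_in_range(search_terms, date_from, date_to):
    """Build a bool query (scored terms match + unscored range filter)."""
    return {
        "size": 0,
        "query": {
            "bool": {
                # must clauses contribute to scoring
                "must": {"terms": {"text_content": search_terms}},
                # filter clauses only include or exclude documents
                "filter": {
                    "range": {"dateField": {"gte": date_from, "lte": date_to}}
                },
            }
        },
        "aggs": {
            "significantQTypes": {
                "significant_terms": {"field": "field1", "size": 10}
            }
        },
    }


bool_body = significant_terms_in_range(["searchTerm"], "date1", "date2")
print(json.dumps(bool_body, indent=2))
```

The aggregation part is unchanged from the answer above; only the query wrapper differs.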
