Query for the lack of requests at specific points in time - elasticsearch

I have an Elasticsearch/Kibana stack that stores every request the application receives. It stores general information about each request (RequestTimestamp, IP, Headers, HttpStatus, Route, etc.), and there are at least a few requests per minute.
I would like to know if there's a way to query Kibana/Elasticsearch for the points in time at which the application didn't receive any requests for, let's say, 3 minutes.
I know it can be done programmatically, but it needs to be done purely with queries (so I can show it on the dashboard).

You could use a date histogram aggregation.
You could specify a 3m interval and query over a specific day.
So you would get 24*60/3 = 480 values for each day.
You could plot them on a chart and see the gaps.
If you are an expert ES user, you could try filtering the buckets with a bucket selector pipeline aggregation (see the sketch below) or smoothing the series with a moving average aggregation.
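A minimal sketch of the bucket-selector variant, which keeps only the empty 3-minute buckets. The index name requests is an assumption; the field name comes from the question (fixed_interval requires ES 7.2+; on older versions use interval):

```
# index name "requests" is an assumption; RequestTimestamp comes from the question
POST /requests/_search
{
  "size": 0,
  "query": {
    "range": { "RequestTimestamp": { "gte": "now-24h" } }
  },
  "aggs": {
    "per_3m": {
      "date_histogram": {
        "field": "RequestTimestamp",
        "fixed_interval": "3m",
        "min_doc_count": 0
      },
      "aggs": {
        "empty_only": {
          "bucket_selector": {
            "buckets_path": { "count": "_count" },
            "script": "params.count == 0"
          }
        }
      }
    }
  }
}
```

min_doc_count: 0 makes the histogram emit buckets even where no documents fall, and the bucket selector then discards every bucket that did receive requests, so only the gaps remain.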

Related

Grafana with Elastic - Show requests count together with average response time

I'm new to Grafana and I'm trying to create a graph that shows the request count together with the average response time. I was able to create the request count, but now I'm struggling to add the response-time information. Is there an option to show both pieces of information inside one panel, or do I need to create two panels, one with the request count and another with the average time?
And another question: is there an option to show the average time in milliseconds?
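For what it's worth, Grafana's Elasticsearch data source lets you add several metrics (e.g. Count and Average of a response-time field) to the same query, and it draws both series in one panel. That corresponds roughly to the sketch below, where the index name and the responseTimeMs field are assumptions:

```
# index name "requests" and field "responseTimeMs" are assumptions
POST /requests/_search
{
  "size": 0,
  "aggs": {
    "per_minute": {
      "date_histogram": { "field": "RequestTimestamp", "fixed_interval": "1m" },
      "aggs": {
        "avg_response_time": { "avg": { "field": "responseTimeMs" } }
      }
    }
  }
}
```

The request count is each bucket's doc_count and the average comes from the avg sub-aggregation; if the field is already stored in milliseconds, setting the panel's unit to milliseconds handles the display.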

How to find memory usage difference in Grafana

I am working on a graph panel in Grafana, using Elasticsearch as a data source. In the data source I have memory-used with a timestamp. I am trying to send an alert notification when the difference is more than 100 MB. How do I find the difference between the memory used on day one and the memory used on the current day, and send an alert notification?
You would set up a query grouped by timestamp and define it based on whether you want the 100 MB difference measured on the max value or the average. Assuming it is the max value, your query would be something like the sketch below.
You would then set the alert in the Alert tab, based on the query and the diff in values over 24 hours.
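A rough sketch of the underlying Elasticsearch query, where the index name and exact field names are assumptions (in Grafana itself you would build the equivalent with a Max metric, a 1d date histogram, and a serial-difference pipeline metric, if your Grafana version supports it):

```
# index name "metrics" and field names are assumptions
POST /metrics/_search
{
  "size": 0,
  "aggs": {
    "per_day": {
      "date_histogram": { "field": "@timestamp", "fixed_interval": "1d" },
      "aggs": {
        "max_memory": { "max": { "field": "memory-used" } },
        "day_over_day": {
          "serial_diff": { "buckets_path": "max_memory", "lag": 1 }
        }
      }
    }
  }
}
```

day_over_day is the change in the daily maximum against the previous day, which is the value the alert rule would compare against 100 MB.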

ElasticSearch [6.5] fetching multiple records execution time issue

I am trying to fetch about 2.5 million records from Elasticsearch using Elasticsearch's Java High Level REST Client. Fetching all the records with the scroll API takes too much time (15 to 22 minutes, depending on the number of records), since it is limited to fetching 10,000 records per request. I tried sliced scroll as well, but that takes more time than a normal scroll. The following are my assumptions about the sliced scroll API:
I divided my scroll request into five slices, which creates 5 requests.
I send the 5 requests in different threads.
Because every sliced scroll request is an individual request, I guess each sliced scroll request first fetches all the records (2.5 million) and then filters out the records that belong to that particular slice.
This results in more time.
Can anyone tell me a more efficient way to fetch all the records?
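For reference, this is roughly what each of the five sliced scroll requests looks like at the REST level; the index name is an assumption, and each thread would use a different slice.id from 0 to 4:

```
# index name "records" is an assumption; run with slice.id = 0..4, one per thread
POST /records/_search?scroll=1m
{
  "slice": { "id": 0, "max": 5 },
  "size": 10000,
  "query": { "match_all": {} }
}
```

One relevant detail from the Elasticsearch docs: slicing is done on the _id field by default, and when the number of slices is larger than the number of shards, the slice filter is very slow on the first call of each scroll (it has to visit every document on the shard to build the filter, which is then cached), which may explain why sliced scroll was slower here.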

Aggregation by ID on Elasticsearch or by timestamp with unsupervised clustering

I have data log entries stored in Elasticsearch, each with its own timestamp. I now have a dashboard that can get the aggregation by day/week using the Date Histogram aggregation.
Now I want to get the data in chunks (data logs are written several times per transaction, spanning up to several minutes) by analyzing the "cluster" of logs according to their timestamps, to identify whether they belong to the same "transaction". Would it be possible for Elasticsearch to automatically find the meaningful buckets and aggregate the data accordingly?
Another approach I'm trying is to group the data by transaction ID - however, there's a warning that to do this I need to enable fielddata, which will use a significant amount of memory. Any suggestions?
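On the fielddata warning: it usually appears when aggregating on an analyzed text field. With the default dynamic mapping, a string field also gets a keyword sub-field that can be aggregated without enabling fielddata. A minimal sketch, where the index and field names are assumptions:

```
# index name "logs" and field name "transactionId" are assumptions
POST /logs/_search
{
  "size": 0,
  "aggs": {
    "by_transaction": {
      "terms": { "field": "transactionId.keyword", "size": 1000 }
    }
  }
}
```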

How to add calculations to an Elastic Search database?

I'm using Elasticsearch to index large amounts of sensor data for analytics purposes. The table has 4 million+ rows and is growing fast - I expect 40 million within the next year. This makes Elasticsearch seem like a natural fit, especially with tools such as Kibana to easily display the data.
Elasticsearch seems great, however there are some more complex calculations that have to be performed as well. One such calculation is our "average user time", where we take two data points (the timestamp of an item being picked up and the timestamp of it being placed back), subtract one from the other, and average all of these for one specific customer over a specific timeframe. The SQL query would look something like "select * from events where event_type = 'object picked up' or event_type = 'object placed back down'"; we would then take all these events, get the diffs of their timestamps, add them together, and divide by the count.
These types of calculations, to my understanding, are not the kind of thing Elasticsearch is meant to do. I've had people recommend Hadoop, but that could take a long time to set up, and we could instead use a fast language like Go or Node/JavaScript to batch-process things and add them to the DB periodically... but what is the right way to do this, allowing for future scalability and playing nicely with Elasticsearch?
Our setup is: Rails, AngularJS, Elasticsearch, Heroku, Postgres.
Maybe you could try using scripted metrics. In combination with filters, they can give you a more or less proper solution to your problem:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-scripted-metric-aggregation.html
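A rough sketch of what a scripted metric could look like for the "average user time" calculation. The index name, field names, the customer filter, and the assumption that pickups and put-backs alternate cleanly per customer are all mine; event_type is assumed to be mapped as keyword:

```
# index, field names, and the customer filter below are assumptions
POST /events/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        { "term": { "customer_id": "customer-42" } },
        { "range": { "timestamp": { "gte": "now-7d" } } },
        { "terms": { "event_type": ["object picked up", "object placed back down"] } }
      ]
    }
  },
  "aggs": {
    "avg_user_time_ms": {
      "scripted_metric": {
        "init_script": "state.events = []",
        "map_script": """
          // collect (type, timestamp) pairs per shard
          state.events.add([
            'type': doc['event_type'].value,
            'ts': doc['timestamp'].value.toInstant().toEpochMilli()
          ])
        """,
        "combine_script": "return state.events",
        "reduce_script": """
          // merge events from all shards and sort them chronologically
          def all = new ArrayList();
          for (s in states) { all.addAll(s); }
          all.sort((a, b) -> Long.compare((long) a['ts'], (long) b['ts']));
          // pair each pickup with the next put-back and average the gaps
          long total = 0; int count = 0; def picked = null;
          for (e in all) {
            if (e['type'] == 'object picked up') {
              picked = e['ts'];
            } else if (picked != null) {
              total += (long) e['ts'] - (long) picked;
              count++;
              picked = null;
            }
          }
          return count > 0 ? total / (double) count : 0;
        """
      }
    }
  }
}
```

Worth noting: a scripted metric runs Painless over every matching document, so at tens of millions of documents it's worth filtering tightly (as above) and benchmarking before putting it on a dashboard. Precomputing the duration per pickup/put-back pair at index time would turn this into a plain avg aggregation.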
