Kibana Dashboard multiple time periods and search terms - kibana-4

Is it possible to give different time periods or different search terms to each Visualization in a Kibana Dashboard?

Currently - no.
This is on the Elastic team's list of planned enhancements, but it does not have a due date yet.
You can follow the open issue here: https://github.com/elastic/kibana/issues/3578

I think I've understood your question.
Let's suppose this is your data within Elasticsearch:
timestamp level message
19:05:15 error connection failed
19:06:30 debug connection successful
You can show the percentage of each level over different time periods (10% debug, 20% error, 14% info, and so on). For instance, you can design one chart for the last hour and another for the last day in the same dashboard, so you don't need to manipulate the date picker in the header.
First, you have to write a query that filters your data by timestamp (e.g. the last day):
@timestamp:[now-1d TO now]
Second, you need to save this search, and name it.
Finally, design whatever visualization you need based on this
search, and the results will be bound to it.
Repeat with different time periods.
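As a rough sketch of the "repeat with different time periods" step, the query strings for several saved searches could be generated like this. The helper name and the saved-search names are hypothetical; only the query syntax comes from the answer above.

```python
def time_window_query(window, field="@timestamp"):
    """Build a Lucene query string matching documents from the last `window`.

    `window` uses Elasticsearch date-math units, e.g. "1h", "1d", "7d".
    """
    return f"{field}:[now-{window} TO now]"


# One saved search per time period; the names are made up for illustration.
saved_searches = {
    "levels-last-hour": time_window_query("1h"),
    "levels-last-day": time_window_query("1d"),
}

for name, query in saved_searches.items():
    print(name, "->", query)
```

Each string would be pasted into the Kibana search bar and saved under its own name before building a visualization on top of it.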
Hope this helps. Bye.

Related

Reasons & Consequences of putting a Date in Elastic Index Name

I am looking at sending my app logs to Elastic (6.x) via Filebeat and Logstash. As mentioned in Configure the Logstash output and recommended elsewhere, it seems that I need to add the date to the index name. The stated reason is that when the time comes to delete old data, it is easier to delete an entire index by date than individual documents. Is this true?
If I should follow this recommendation of adding the date to the index name, what additional things do I need to do to ensure seamless querying? By this I mean querying, especially in Kibana, over e.g. the past day, which would need to look at today's index as well as yesterday's index.
Speaking of querying in Kibana, is there a way of simply working with the base index name without the date stamp i.e. setting it up so that I do not see or have to deal with the date named indexes?
Edit: Kamal raised a good point that I have not provided any information about my cluster and my needs. The following is what I'm working with:
What is your daily data creation/expected count
I'm not sure. I don't expect anything more than a GB of data a day, and no more than a couple of 100K documents a day. Since these are logs, I don't expect any updates to the documents once they are created.
Growth rate of the data in the future (1 year - 5 years)
At the moment, I don't see the growth rate to cross a GB a day.
How many teams are using the same cluster apart from yours if there is any
The cluster would be used (actually queried) by just my team. We are about 5 right now, but I don't see more than 10 users (and that's not concurrent, just over a day or month)
Usage patterns, type of queries used etc.
I'm not sure, but there certainly would not be updates to the data other than deletions
Hardware details
I've not worked this out with management. For the most part I expect 3 nodes. Also, this is not critical, i.e. if we lose all of our logs for some reason, I would not lose sleep over it.
First of all, you need to take a step back and work out whether you really need multiple indexes or a single one (where you filter documents at query time using a date field).
Some questions you should answer before making such a decision:
What is your daily data creation/expected count
Growth rate of the data in the future (1 year - 5 years)
How many teams are using the same cluster apart from yours if there is any
Usage patterns, type of queries used etc.
Hardware details
Advantages
In a way, having multiple indexes (with the date as part of the index name) would be more beneficial.
You can delete the old indexes without affecting new ones.
If you have to change the mapping, you can do so in the new index without affecting the old ones. That is comparatively little overhead; with a single index you have to reindex all the documents, which takes a lot more time if the index is large. If this happens every now and then, you would need to schedule such operations at times of minimal usage, which can harm productivity.
Searching across multiple indexes is still convenient.
I'm not really sure, but scaling is easier with multiple indexes.
Disadvantages
Additional shards are created for each and every index, which can waste some storage space.
Overhead for the monitoring/operations team to maintain multiple indexes.
At times it can lead to over-creation of indexes.
If there are no mapping changes and few documents are inserted (in the 100s or a few 100s), it would be better to use a single index.
The only correct way to figure out what's best is to set up a cluster that closely resembles the production one, with data that resembles production data, try various configurations, and see which solution fits best.
Speaking of querying in Kibana, is there a way of simply working with
the base index name without the date stamp i.e. setting it up so that
I do not see or have to deal with the date named indexes?
Yes, there is. If you have indexes with names like logs-0001, logs-0002, you can use logs-* as the index name when you query.
Including the date in an index name is a very common practice among Elasticsearch users. It helps with archiving and purging old indices, as you mentioned. You don't need to do anything extra to be able to query. Set up your index base name as an index pattern for your indices, e.g. logstash-*, and you can query on that index pattern in Kibana.
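To make the purging argument concrete, here is a minimal Python sketch that names daily indices and picks out the ones past a retention period. It assumes the `base-YYYY.MM.dd` naming that Logstash uses by default; the function names and the 7-day retention are illustrative assumptions, and the actual deletion call is left out.

```python
from datetime import date, datetime


def daily_index_name(base, day):
    """Name of the index holding one day of data, e.g. logstash-2018.05.01."""
    return f"{base}-{day:%Y.%m.%d}"


def index_age_days(index_name, base, today):
    """Age in whole days of a daily index, parsed from its date suffix."""
    suffix = index_name[len(base) + 1:]
    day = datetime.strptime(suffix, "%Y.%m.%d").date()
    return (today - day).days


def expired_indices(index_names, base, today, keep_days):
    """Indices whose date suffix is at least `keep_days` days old."""
    return [
        name for name in index_names
        if index_age_days(name, base, today) >= keep_days
    ]


today = date(2018, 5, 10)
names = [daily_index_name("logstash", date(2018, 5, d)) for d in (1, 9, 10)]
print(expired_indices(names, "logstash", today, keep_days=7))
```

Each returned name can be deleted as a whole index, while queries keep going through the `logstash-*` pattern and never need to mention a specific date.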

How to merge old data to save space in Elasticsearch

I tried to find information about this, but I have not found what I was looking for.
I am storing metrics every minute in an Elasticsearch database. My idea is that the frequency is important only for a short period.
For example, I want to have my metrics every minute for the past week, but then I would like to merge these metrics in order to have only one document of metrics for each past week.
I have an idea to achieve this with a stream processing framework such as Spark Streaming or Flink, but my question is: is there a native way, tool, or trick to make this happen in Elasticsearch?
Thank you, I hope my question is clear enough; otherwise leave a comment for more details.
One idea would be to have a weekly index in which you store all your per-minute metrics. Once the week has passed, you could run an aggregation query on the past week's index and aggregate all the information at the day or week level. You'd then store that aggregated information as new documents in another historical index that you can query later on. I don't think it's necessary to use Spark Streaming for this; ES aggregations can do the job pretty easily.
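A minimal sketch of that idea in Python follows. It only builds the aggregation request body and reshapes a response into roll-up documents; sending the request and indexing the results are left to whichever client you use. The field names `value` and `@timestamp`, the aggregation names, and the output document shape are assumptions. `calendar_interval` is the parameter name in recent Elasticsearch versions; older versions use `interval` instead.

```python
def weekly_rollup_request(metric_field="value", time_field="@timestamp",
                          bucket="day"):
    """Request body that averages one metric per `bucket` of time."""
    return {
        "size": 0,  # only the aggregation is needed, not the raw hits
        "aggs": {
            "per_bucket": {
                "date_histogram": {
                    "field": time_field,
                    "calendar_interval": bucket,
                },
                "aggs": {
                    "avg_metric": {"avg": {"field": metric_field}},
                },
            }
        },
    }


def rollup_documents(response):
    """Turn the aggregation response into documents for a history index."""
    buckets = response["aggregations"]["per_bucket"]["buckets"]
    return [
        {
            "timestamp": b["key_as_string"],
            "doc_count": b["doc_count"],
            "avg": b["avg_metric"]["value"],
        }
        for b in buckets
    ]
```

Passing `bucket="week"` would produce a single document per week instead of one per day.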

Is there a way to maintain aging in documents in elastic search

Here is the problem.
I have about 1 million records in my indexes. The documents have an aging property which increases daily. Every night a scheduler runs, calculates the aging from the current date and the created date in each document, and updates the index.
The problem is that as the data grows, the bulk update leads to "GC overhead limit exceeded". I added a pause between updates, but it did not help.
Now I am researching the use of a Groovy script with 'update_with_query'.
I want to ask whether there is any other way to maintain the age. For example, in Jira the overdue days increase every day. Or do I have to fetch, visit, and update the documents?
Every time the bulk request runs, I can see Elasticsearch throttling: 'now throttling indexing: numMergesInFlight=5, maxNumMerges=4'. I have read about this but am not sure what to do. I think there should be another approach to calculating aging, because as the data grows this problem is going to persist.
In the end I want queries like "give me all docs whose aging is 100" or "give me all documents whose aging > 100".
The answer was simple. I was thinking about it the wrong way around.
If the query is "get all docs where aging > 2", it means I need to get all docs that were created more than two days ago. Simply convert '2' to a date relative to the current date and use a range query on the created date, and that should solve the problem.
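A small Python sketch of that conversion, under a few assumptions: the created date lives in a field called `created` (a hypothetical name), aging is counted in whole calendar days, and timestamps are compared without time zone handling. The functions only build the range query clause.

```python
from datetime import date, datetime, time, timedelta


def _midnight(day):
    """ISO timestamp for the start of `day`, e.g. 2024-01-08T00:00:00."""
    return datetime.combine(day, time.min).isoformat()


def aging_greater_than(days, today, field="created"):
    """Documents created before the start of the day `days` days ago."""
    cutoff = today - timedelta(days=days)
    return {"range": {field: {"lt": _midnight(cutoff)}}}


def aging_equals(days, today, field="created"):
    """Documents created at any time during the day `days` days ago."""
    start = today - timedelta(days=days)
    end = start + timedelta(days=1)
    return {"range": {field: {"gte": _midnight(start), "lt": _midnight(end)}}}


print(aging_greater_than(100, date.today()))
print(aging_equals(100, date.today()))
```

Because the age is derived at query time, nothing stored in the index has to change from day to day, so the nightly bulk update is no longer needed.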

How to add calculations to an Elastic Search database?

I'm using Elastic Search to index large amounts of sensor data for analytics purposes. The table has 4 million + rows and growing fast - expecting 40 million within the next year. This makes Elastic Search seem like a natural fit, especially with tools such as Kibana to easily display the data.
Elastic Search seems great, however there are some more complex calculations that have to be performed as well. One such calculation is our "average user time", where we take two data points (timestamp of item picked up and timestamp of item placed back), subtract one from the other, and average the results for one specific customer over a specific timeframe. The SQL query would look something like "select * from events where event_type = 'object picked up' or event_type = 'object placed back down'"; we would then take all these events, compute the differences between their timestamps, add them together, and divide by the count.
To my understanding, these types of calculations are not the type of thing that Elastic Search is meant to do. I've had people recommend Hadoop, but that could take a long time to set up, and we could use a fast language like Go or Node/JavaScript to batch process things and add them to the DB periodically... but what is the right way to do this, allowing for future scalability and working nicely with Elastic Search?
Our setup is: Rails, AngularJS, Elastic Search, Heroku, Postgres.
Maybe you could try the scripted metric aggregation. Combined with filters, it can give you a more or less workable solution to your problem:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-scripted-metric-aggregation.html
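For comparison, the "average user time" calculation described in the question can be sketched in plain Python as a batch step. This is not the scripted metric aggregation itself, only an illustration of the arithmetic. The tuple layout and the `item_id` used to pair events are assumptions; the event type strings are taken from the question.

```python
from datetime import datetime

PICKED_UP = "object picked up"
PLACED_BACK = "object placed back down"


def average_user_time(events):
    """Average seconds between picking an item up and placing it back.

    `events` is an iterable of (item_id, event_type, timestamp) tuples,
    assumed to be sorted by timestamp and filtered to one customer.
    Returns None when no complete pick-up/put-down pair is found.
    """
    picked_at = {}   # item_id -> timestamp of the pending pick-up
    durations = []   # seconds for each completed pair
    for item_id, event_type, timestamp in events:
        if event_type == PICKED_UP:
            picked_at[item_id] = timestamp
        elif event_type == PLACED_BACK and item_id in picked_at:
            started = picked_at.pop(item_id)
            durations.append((timestamp - started).total_seconds())
    return sum(durations) / len(durations) if durations else None


events = [
    ("a", PICKED_UP, datetime(2024, 1, 1, 10, 0, 0)),
    ("b", PICKED_UP, datetime(2024, 1, 1, 10, 0, 10)),
    ("a", PLACED_BACK, datetime(2024, 1, 1, 10, 0, 30)),
    ("b", PLACED_BACK, datetime(2024, 1, 1, 10, 1, 40)),
]
print(average_user_time(events))
```

The result of such a batch job could be written back as its own document, so that Kibana only has to display a precomputed number.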

How to do a time range search in Kibana

We are using ELK for log aggregation. Is it possible to search for events that occurred during a particular time range? Let's say I want to see all exceptions that occurred between 10am and 11am during the last month.
Is it possible to extract the time part from @timestamp and do a range search on that somehow (similar to date() in SQL)?
Thanks to Magnus, who pointed me to scripted fields. Take a look at:
https://www.elastic.co/blog/kibana-4-beta-3-now-more-filtery
or
https://www.elastic.co/guide/en/elasticsearch/reference/1.3/search-request-script-fields.html
Unfortunately, you cannot use these scripted fields in queries, only in visualisations.
So I resorted to a workaround and use Logstash's drop filter to remove the events I don't want to show up in Kibana in the first place. That is not perfect for obvious reasons, but it does the job.
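If the events can be exported from Elasticsearch first (for example with an ordinary date-range query covering the month), the time-of-day condition can also be applied on the client side. The sketch below is a different technique from the drop filter above and rests on assumptions: each event is a dict whose `@timestamp` value has already been parsed into a `datetime`, and the helper names are made up.

```python
from datetime import datetime, time


def in_daily_window(timestamp, start=time(10, 0), end=time(11, 0)):
    """True when the time-of-day part falls in [start, end), on any date."""
    return start <= timestamp.time() < end


def filter_by_time_of_day(events, start=time(10, 0), end=time(11, 0)):
    """Keep only the events whose @timestamp lies in the daily window."""
    return [e for e in events if in_daily_window(e["@timestamp"], start, end)]


events = [
    {"@timestamp": datetime(2024, 3, 1, 10, 15), "message": "exception A"},
    {"@timestamp": datetime(2024, 3, 2, 11, 0), "message": "exception B"},
    {"@timestamp": datetime(2024, 3, 9, 10, 59), "message": "exception C"},
]
for event in filter_by_time_of_day(events):
    print(event["@timestamp"], event["message"])
```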

Resources