Hey, I am using Marvel alongside Elasticsearch, and I am trying to avoid using Curator to clean up indices that look like ".marvel-2015-*". Is there a specific config, or set of configs, that I can use to accomplish this?
Note: I am using Chef to provision the node, and inside the logstash cookbook I am setting the attribute in default.rb like so:
default['logstash']['instance_default']['curator_days_to_keep'] = 14
I would assume this sets the maximum number of these indices to 14, but when I added some fake ".marvel-2015-*" indices they still appeared and were not cleared out.
I realize that I am talking about both a tool for working with Marvel indices (Curator) and Marvel itself, but I am new to these tools and need help connecting the dots.
Ideally I want Marvel to have the logic to remove these indices by itself, and I don't know if there is some option to accomplish this in plugins/marvel/marvel-1.3.1.jar.
Any help would be appreciated.
I agree that ideally Marvel should provide this as a configuration option but, at the time of writing, it does not, and over time the Marvel indexes can become quite big.
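You can check how much space they take on your own cluster with the cat indices API (assuming Elasticsearch is reachable on localhost:9200):
curl 'http://localhost:9200/_cat/indices/.marvel-*?v&h=index,store.size'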
I know you want to avoid using Curator, but short of writing your own script or plugin to manage this, it is by far the easiest way to deal with it.
To purge Marvel indexes older than 30 days, you can do:
curator delete indices --timestring '%Y.%m.%d' --prefix '.marvel-2' --older-than 30 --time-unit 'days'
To test what would be deleted, I recommend first running with --dry-run:
curator --dry-run delete indices --timestring '%Y.%m.%d' --prefix '.marvel-2' --older-than 30 --time-unit 'days'
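For reference, if you are on Curator 4.x or newer, the single-command syntax above has been replaced by YAML action files; a rough, untested equivalent (run with curator --config config.yml delete_marvel.yml, where delete_marvel.yml is a file name I've made up) would be:
actions:
  1:
    action: delete_indices
    description: Delete .marvel indices older than 30 days, based on index name
    options:
      ignore_empty_list: True
    filters:
    - filtertype: pattern
      kind: prefix
      value: .marvel-2
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 30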
I have an Elasticsearch cluster on Kubernetes, and I also have a Curator job that deletes indices older than 7 days.
I want to change Curator to work according to a certain condition:
If a document has key1=value1, delete it after 10 days; otherwise delete it after 7 days.
Is there any way to do it?
Curator is limited to deleting indices as a whole; it does not work at the document level.
Under the hood, Curator calls DELETE index-name, and there is no way to configure it to call the delete by query API, which is what you are asking for.
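If you really need document-level expiry, one option outside of Curator is to call the delete by query API yourself (built into Elasticsearch since 5.0), for example from a cron job. A rough sketch, assuming a hypothetical index pattern my-index-*, a @timestamp field, and the key1=value1 condition from the question:
curl -XPOST 'http://localhost:9200/my-index-*/_delete_by_query' -H 'Content-Type: application/json' -d '
{
  "query": {
    "bool": {
      "must": [
        { "term": { "key1": "value1" } },
        { "range": { "@timestamp": { "lt": "now-10d" } } }
      ]
    }
  }
}'
You would still use Curator for the regular 7-day index-level deletion.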
I work with several log files that I process with Logstash. I split them into several documents (multiline) and then extract the information I want.
The problem is that I end up with several documents that contain nothing interesting, and they take up space.
Do you know a way to delete documents where no information was extracted by Logstash?
Thank you very much for your help!
In older versions of Elasticsearch, when creating an index you could specify a _ttl field that indicates the expiry of a document in the index. You could set the TTL to a value of, say, 24 hours. Read more here.
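For reference, on those older (pre-2.0) versions the mapping looked roughly like this (a sketch, assuming a hypothetical index called logs with a type called event):
curl -XPUT 'http://localhost:9200/logs' -d '
{
  "mappings": {
    "event": {
      "_ttl": { "enabled": true, "default": "24h" }
    }
  }
}'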
However, TTL has been deprecated as of version 2.0 since it's a clumsy way of removing stale data. Personally, I create rolling indexes with Logstash and have a cron job that simply drops the daily index at end of day via curl.
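A rough sketch of such a cron job (assuming daily indices named logstash-YYYY.MM.DD, Elasticsearch on localhost:9200, and GNU date; the 30-day retention is just an example):
# crontab entry: every night at 23:55, delete the daily index from 30 days ago
# (note that % has to be escaped in crontab)
55 23 * * * curl -XDELETE "http://localhost:9200/logstash-$(date -d '30 days ago' +\%Y.\%m.\%d)"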
Also refer to this article from ES
https://www.elastic.co/guide/en/elasticsearch/guide/current/retiring-data.html
I have a Graylog 2.1 server that has been running for some time. I hadn't paid attention to my retention rate recently and came in this morning to find Graylog partially crashed because the disk was out of space. Nearly 100% of the disk space is currently taken up by Elasticsearch shards. The Graylog web interface is not usable in its current state. I tried some of the standard Ubuntu tricks for freeing up disk space, like apt-get autoremove and apt-get clean, but wasn't able to free enough to get the web interface working.
The problem is that all of the documentation I can currently find for changing the retention rate and cycling the shards goes through the web interface. The relevant config options no longer appear to be present in the Graylog config file.
Does anyone know of a manual, CLI, way of purging data from the Elasticsearch Shards in Graylog 2.1?
First aid: check which indices are present:
curl http://localhost:9200/_cat/indices
Then delete the oldest indices (do not delete all of them):
curl -XDELETE http://localhost:9200/graylog_1
curl -XDELETE http://localhost:9200/graylog_2
curl -XDELETE http://localhost:9200/graylog_3
Fix: You can then reduce the parameter elasticsearch_max_number_of_indices in /etc/graylog/server/server.conf to a value that fits your disk.
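For example, in /etc/graylog/server/server.conf (the value 10 is just an illustration; choose a count whose total size fits your disk):
# keep at most 10 indices; older ones are removed by the retention strategy
elasticsearch_max_number_of_indices = 10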
If Elasticsearch is still starting, you can simply delete indices with the Delete Index API, which is, after using Graylog directly (System / Indices page in the web interface), the preferred way of getting rid of Elasticsearch indices.
If you're totally screwed (i.e. neither Graylog nor Elasticsearch is starting), you can still delete the complete data from Elasticsearch's data path (see Directory Layout).
There is a list of indexes in the Graylog admin panel, under
"/system/indices"
There is a delete button for each index. You can check old indexes and delete them if they are not required.
You can also delete Elasticsearch log files older than 7 days:
sudo find /var/log/elasticsearch/ -type f -mtime +7 -delete
You should set up a retention strategy from within Graylog. If you manage the indices yourself and delete the wrong index, you might break your Graylog.
Go to System / Indices, select the default index set, choose "Edit index set", and there you'll find the index rotation and retention settings.
Is it possible to show the size (physical size, e.g. MB) of one or more ES indices in Kibana?
Thanks
Kibana only:
It's not possible out of the box to view the disk size of indices in Kibana.
Use the cat indices API to find out how big your indices are (that works even without Kibana).
If you need to view that data in Kibana, index the output of the cat API into a dedicated Elasticsearch index and then analyse it in Kibana.
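For example (assuming Elasticsearch on localhost:9200), this lists each index with its document count and size on disk:
curl 'http://localhost:9200/_cat/indices?v&h=index,docs.count,store.size'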
If plugins/tools other than Kibana are acceptable, read on:
Check the Elasticsearch community plugins. The Head plugin (which I would recommend) gives you the info you want, in addition to many other details, like stats about your shards, nodes, etc.
Alternatively, you could use the commercial Marvel plugin from Elastic. I have never used it before, but it should be capable of what you want, and much more. However, Marvel is likely overkill for this, so I wouldn't recommend it in the first place.
Although not a Kibana plugin, cerebro is the official replacement for Kopf and runs as a standalone web server that can connect remotely to Elasticsearch instances. The UI is very informative and functional.
https://github.com/lmenezes/cerebro
Is there a way to find out the names of all the indices ever created, even after an index might have been deleted? Does Elasticsearch store such historical info?
Thanks
Using a plugin that keeps an audit trail of all changes that happened in your ES cluster might do the trick.
If you use the changes plugin (or a more recent one), you can query it for all changes across all indices using
curl -XGET http://localhost:9200/_changes
and the response will contain all the index names that were at least created. I'm not sure this plugin works with the latest versions of ES, though.