Removing Old Records for Logstash / Elasticsearch / Kibana

Edit: This post is pretty old and Elasticsearch/Logstash/Kibana have evolved a lot since it was written.

Part 4 of 4 – Part 1, Part 2, Part 3

Now that you’ve got all your logs flying through logstash into elasticsearch, how do you remove old records that are no longer doing anything but consuming disk space and RAM for the index?

These are all functions of elasticsearch. Deleting is pretty easy, as is closing an index.
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-delete-index.html
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-open-close.html
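
If you just want to do this by hand for a single index, here is a rough sketch using curl against the two APIs linked above. It assumes elasticsearch is listening on 127.0.0.1:9200 and that your indexes use logstash's default daily naming (logstash-YYYY.MM.dd); the function names are my own, not part of any tool.

```shell
#!/bin/sh
# Sketch: delete or close one daily logstash index via the REST API.
# Assumptions: elasticsearch on 127.0.0.1:9200, default logstash-YYYY.MM.dd index names.
ES_URL="http://127.0.0.1:9200"

# Build the index name for a given date string (YYYY.MM.dd).
index_name() {
    printf 'logstash-%s\n' "$1"
}

# Delete an index outright. This is not reversible.
delete_index() {
    curl -s -XDELETE "$ES_URL/$(index_name "$1")"
}

# Close an index. The data stays on disk but is blocked for reads and writes.
close_index() {
    curl -s -XPOST "$ES_URL/$(index_name "$1")/_close"
}
```

For example, `close_index 2014.01.15` would close logstash-2014.01.15. Doing this for every old index quickly gets tedious, which is why a tool is nicer.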

The awesome people working on elasticsearch already have the solution! It’s called curator.
https://github.com/elasticsearch/curator
https://logstash.jira.com/browse/LOGSTASH-211

I like the idea of being able to let a cron job kick off the cleanup so I don’t forget.

To install curator, we’ll first have to install pip.
[text]
sudo apt-get install python-pip
[/text]

Then use pip to install elasticsearch-curator:
[text]
pip install elasticsearch-curator
[/text]

When making a cron job, I always use full paths, so check where curator was installed:
[text]
which curator
/usr/local/bin/curator
[/text]

Edit the crontab. Any user should have access, so I’ll run this under my user.
[text]
crontab -e
[/text]

Add the following line to run curator at 20 minutes past midnight (system time). It connects to the elasticsearch node on 127.0.0.1, deletes all indexes older than 120 days, and closes all indexes older than 90 days.
[text]
20 0 * * * /usr/local/bin/curator --host 127.0.0.1 -d 120 -c 90
[/text]
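
To get a feel for which indexes those two thresholds would hit on a given day, a small sketch like this prints the cutoff index names. It assumes GNU date (for `-d "N days ago"`) and logstash's default logstash-YYYY.MM.dd naming; `cutoff_index` is a helper I made up for illustration, not something curator provides.

```shell
#!/bin/sh
# Sketch: print the daily index names at the 120-day and 90-day cutoffs.
# Assumptions: GNU date and default logstash-YYYY.MM.dd index names.

# Print the index name for the day N days before today.
cutoff_index() {
    date -d "$1 days ago" +logstash-%Y.%m.%d
}

echo "delete indexes older than: $(cutoff_index 120)"
echo "close indexes older than:  $(cutoff_index 90)"
```

Anything dated before the first name would be deleted, and anything dated before the second would be closed.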

If you prefer an alternative, here’s one written in Perl.
https://github.com/bloonix/logstash-delete-index

8 thoughts on “Removing Old Records for Logstash / Elasticsearch / Kibana”

  1. Good article, helped me a lot! Thank you for that!

    I suggest using long parameters for cron jobs. It increases readability. Also maybe mention the --dry-run argument before executing the cronjob 🙂

  2. Hi, due to a lot of log data coming into my small server, I would like to delete log data that is older than 1 hour. How can I use the above crontab line to delete by hour instead of by day?

  3. You can also set curator to hold off pruning indices until disk usage reaches a certain size. Example:

    */15 * * * * root curator --loglevel CRITICAL delete --disk-space 480 indices --all-indices

    This prunes disk usage down to 480GB.

  4. Thanks, very helpful. It might be worth adding that the above commands work for the 3.5 branch of curator; it didn’t work for me until I specifically went for 3.5:

    pip install elasticsearch-curator==3.5.0
