after following this tuto (https://www.bmc.com/blogs/elasticsearch-logs-beats-logstash/) in order to use logstash to analyze some log files, my index was created fine at the first time, then I wanted to re-index new files with new filters and new repositories so I deleted via "curl XDELETE" the index and now when I restart logstash and filebeat, the index is not created anymore.. I dont see any errors while launching the components.
Do I need to delete something else in order to re-create my index?
Ok since my guess (see comments) was correct, here's the explanation:
To avoid that filebeat reads and publishes lines of a file over and over again, it uses a registry to store the current state of the harvester:
The registry file stores the state and location information that Filebeat uses to track where it was last reading.
As you stated, filebeat successfully harvested the files, sent the lines to logstash and logstash published the events to elasticsearch which created the desired index. Since filebeat updated its registry, no more lines had to be harvested and thus no events were published to logstash again, even when you deleted the index. When you inserted some new lines, filebeat reopened the harvester and published only the new lines (which came after the "registry checkpoint") to logstash.
The default location of the registry file is ${path.data}/registry (see Filebeat's Directory Layout Overview).
... maybe the curl api call is not the best solution to restart the index
This has nothing to do with deleting the index. Deleting the index happens inside elasticsearch. Filebeat has no clue about your actions in elasticsearch.
Q: Is there a way to re-create an index based on old logs?
Yes, there are some ways you should take into consideration:
You can use the reindex API which copies documents from one index to another. You can update the documents while reindexing them into the new index.
In contrast to the reindex you can use the update by query API to update documents that will remain in the original index.
Lastly you could of course delete the registry file. However this could cause data loss. But for development purposes I guess that's fine.
Hope I could help you.
Related
I'm using elasticsearch 5.3.3.
A couple of years ago I created some index snapshots and uploaded to S3 as a backup.
I recently needed to restore this backup and noticed that my snapshot is missing the meta info.
Comparing to a working snapshot I see these missing files:
meta-*
snap-*
index-0
index.latest
I know the data must be there because of the size of the directory and some text I see when I open some files that I believe are lucene segments.
I'm trying to find a way to recover the data by rebuilding the index somehow but I can't find good info about this. I even tried some low level lucene functions on the segments but it seems the snapshot doesn't a a store segments* file for lucene to read.
Any idea how can I recover this data?
Some info:
Elasticsearch 5.3.3
Lucene 6.4.2
Single index snapshot
Hope there's a way to recover the data.
Thank you all.
The problem is that Filebeats is sending duplicated logs to Elasticsearch, when I restart Filebeats, he sends the whole log again.
I have been mounting /var/share/filebeat/data to the container where I am runnig Filebeats. I also had change the permissions of the share directory, to be owned by the filebeats user.
I am using Elasticsearch 8.1.2
The most probable reason for this is persistent volume location for filebeat registry. Essentially, filebeat creates a registry to keep track of all log files processed and to what offset. If this registry is not stored on a persistent location (for instance stored to /tmp) and filebeat is restarted, the registry file will be lost and new one will be created. This tells filebeat to tail all the log files present at specified path from beginning, hence the duplicate logs.
To resolve this, please mount a persistent volume to filebeat (may be hostpath) and configure it to be used for storing registry.
Thanks for the answers, but the issue was that in the initial setup we didn't define an ID tag for the filestream input type. As simple as that.
https://www.elastic.co/guide/en/beats/filebeat/current/_step_1_set_an_identifier_for_each_filestream_input.html
I have an ELK stack running on the Kubernetes cluster with security enabled. Everything is running fine and I am able to push data to an index. After logging in to Kibana as an admin user, and I to "Discover" it asks me to create an index pattern. So I have some metricbeat data, and I create a pattern and saved it. But when I go back to discover, it is prompting me to create an index pattern again!
I don't find any errors in Kibana/Elastic pods
Really appreciate any pointers
Elastisearch version: 7.10.1
What finally worked for me was destroy and recreate Kibana. After recreating kibana i was able to see all the index patterns i have been trying to save
I have started working on ELK recently and have a doubt regarding handling of multiple types of logs.
I have two sets of logs on my server that I want to analyse, one from my android application and the other from my website. I have successfully transferred logs from this server via filebeat to the ELK server.
I have created two filters for either types of logs and have successfully imported these logs into logstash and then Kibana.
This link helped do the above stuff.
https://www.digitalocean.com/community/tutorials/how-to-install-elasticsearch-logstash-and-kibana-elk-stack-on-centos-7
The above link directs to use the logs in the filebeat index in Kibana and start analysing(I successfully did for one type of logs). But the problem that I am facing is that since both these logs are very different, they need to be analysed differently. How do I do this in Kibana. Should I create multiple filebeat indexes there and import them, or should it be just one single index, or some other way. I am not very clear on this(could not find much documentation), hence would request to please help and guide me here.
Elasticsearch organizes by index and type. Elastic used to compare these to SQL concepts, but now offers a new explanation.
Since you say that the logs are very different, Elastic is saying that you should use different indexes.
In Kibana, the visualization is tied to an index. If you had one panel from each index, you can show them both on the same dashboard.
I have lot of data indexed in my elasticsearch.
I deleted elasticsearch folder and then extarct again fresh zip of elasticsearch and start the elasticsearch server.
I am surprised because after staring new elasticsearch server, I again found all old data and this problem persists again and again.
Can any please help me? I don't want to get all old data indexed in elasticsearch.
Regards
Given the cluster health response it's not a problem with multiple nodes running on the same cluster as suggested by Igor. I'd suggest you to check the java processes running. You could maybe have an elasticsearch hanging somewhere which keeps writing in that folder.