Elasticsearch data issue - elasticsearch

Today I faced a weird problem with elasticsearch and I don't know how to fix the issue. The scenrario is somewhat like this
Elasticsearch cluster with 2 nodes :
Total docs 1000
Now one of my server goes down and all write and reads are handled by second server. Now say there are 10000 docs in the ES2.
Due to system problem now whole elasticsearch network is down and both ES1 and ES2 are down. Now somehow I manually go and make ES1 up and now its taking all the write requests.
ES2 comes up and now discovers ES1 as its master.
ES2 syncs up with master and all the data written to ES2 node is lost.
Is there any way to recover the lost data. Is this an expected behaviour in distributed system?
Please let me know if something is not clear.

Related

Elasticsearch reindex gets stuck

Context
We have two Elasticsearch clusters with 6 and 3 nodes each. The cluster with 6 nodes is the one we use in production environment and we use the one with 3 nodes for testing purposes. (We have the same problem in both clusters). All the nodes have the following characteristics:
Elasticsearch 7.4.2
1TB HDD disk
8 GB RAM
In our case, we need to reindex some of the indexes. Those indexes have billions of documents and a size between 50GB and 250GB.
Problem
Whenever we start reindexing, internally or from a remote source, the task starts working correctly but it reaches a point where it stops reindexing, without apparent reason. We canĀ“t see anything in the logs. The task is not cancelled or anything, it only stops reindexing documents, it looks like the task gets stuck. We tried changing GC strategies, we used CMS and Shenandoah but nothing changes.
Has anyone run into the same problem?
It's difficult to find the RCA of these issues without debugging it and with the little information you provided(missing cluster and index configuration, index slow logs information, elasticsearch error logs, Elasticsearch hot threads to name a few).

Limiting Elasticsearch data retention below disk space

Scenario:
We use Elasticsearch & logstash to do application logging for a moderately high traffic system
This system generates ~200gb of logs every single day
We use 4 instances sharded; and want to retain roughly last 3 days worth of logs
So, we implemented a "cleanup" system, running daily, which removes all data older than 3 days
So far so good. However, a few days ago, some subsystem generated a persistent spike of data logs, resulting in filling up all available disk space within a few hours, which turned the cluster red. This also meant, that the cleanup system wasn't able to connect to ES, as the entire cluster was down -on account of disk being full. This is extremely problematic, as it limits our visibility into what's going on -and blocks our ability to see what caused this in the first place.
Doing root cause analysis here, a few questions pop out:
How can we look at the system in eg Kibana when the cluster status is red?
How can we tell ES to throw away (oldest-first) logs if there is no more space, rather than going status=red?
In what ways can we make sure this does not happen ever again?
Date based index patterns are tricky with spiky loads. There are two things to combine this for a smooth setup without needing manual intervention:
Switch to rollover indices. You can then define that you want to create a new index once your existing one has reached X GB. Then you don't care about the log volume per day any more, but you can simply keep as many indices around as you have disk space (and leave some buffer / fine tune the watermarks).
To automate the rollover, removal of indices, and optionally setting of an alias, we have Elastic Curator:
Example for rollover
Example for delete index, but you want to combine this with the count filtertype
PS: There will be another solution soon, called Index Lifecycle Management. It's built into Elasticsearch directly and can be configured through Kibana, but it's only around the corner at the moment.
How can we look at the system in eg Kibana when the cluster status is red?
Kibana can't connect to ES if it's already down. Best to poll Cluster health API to get cluster's current state.
How can we tell ES to throw away (oldest-first) logs if there is no more space, rather than going status=red?
This option is not inbuilt within Elasticsearch. Best way is to monitor disk space using Watcher or some other tool and have your monitoring send out an alert + trigger a job that cleansup old logs if the disk usage goes below a specified threshold.
In what ways can we make sure this does not happen ever again?
Monitor the disk space of your cluster nodes.

Elasticsearch replica shard

I am trying to setup a replica shard on a second node, and I tried the codes on es website without any success. So, how could I modify the config file on both es instances? All help is much appreciated.
FYI, here is what I want to setup:
ES1: primary node for index A
ES2: backup node for index A
In case ES1 fails, ES2 should still be available. I should maybe curl [ip]:9200/index/type and still be able to access the data.

Use Elasticsearch as backup store

My application receives and parse thousands of small JSON snippets each about ~1Kb every hour. I want to create a backup of all incoming JSON snippets.
Is it a good idea to use Elasticsearch to backup this snippets in an index with f.ex. "number_of_replicas:" 4? Never read that anyone has used Elasticsearch for this.
Is my data safe in Elasticsearch when I use a cluster of servers and replicas or should I better use another storage for this use case?
(Writing it to the local file system isn't safe, as our hard discs crashes often. First I have thought about using HDFS, but this isn't made for small files.)
First you need to find difference between replica and backups.
replica is more than one copy of data at run time.It increases high availability and failover support,it wont support accidental delete of data.
Backup is copy of whole data at backup time.it will be used to restore when system crashed.
Elastic search for back up.. its not good idea.. Elastic search is a search engine not DB.If you have not configured ES cluster carefully,then you will end up with loss of data.
So in my opinion ,
To store json object, we got lot of dbs.. For example mongodb is a nosql db.We can easily configure it with more replicas.It means high availability of data and failover support.As you asked its also opensource and more reliable.
for more info about mongodb refer https://www.mongodb.org/
Update:
In elasticsearch if you create index with more shards it'll be distributed among nodes.If a node fails then the data will be lost.But in mongoDB more node means ,each mongodb node contains its own copy of data.If a mongodb fails then we can retrieve out data from replica mongodbs. We need to be more conscious about replica setup and shard allocation in Elasticsearch. But in mongoDB it's easier and good architecture too.
Note: I didn't say storing data in elasticsearch is not safe.I mean, comparing to mongodb,it's difficult to configure replica and maintain in elasticsearch.
Hope it helps..!

Clustering in Elasticsearch

I have implemented clustering using Elasticsearch. ElasticHead UI displays detected nodes.
However I am not sure how it works. Any one could please provide me a link/direction that shows how clustering works with elasticsearch?
ElasticSearch uses multicasting to see if there are other nodes with same cluster name present in the network.
If there is such a node it connects to it.
It shares it data with it depending upon the shard configuration.
http://www.elasticsearch.org/guide/reference/modules/discovery/zen.html
Read the above to get the full idea

Resources