Elasticsearch replica shard - elasticsearch

I am trying to set up a replica shard on a second node, and I tried the examples on the ES website without any success. So, how should I modify the config file on both ES instances? All help is much appreciated.
FYI, here is what I want to set up:
ES1: primary node for index A
ES2: backup node for index A
In case ES1 fails, ES2 should still be available: I should be able to curl [ip]:9200/index/type and still access the data.
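
A minimal sketch of one way to do this, assuming a pre-7.x version with zen discovery (the cluster name, node names, IPs and the index name "indexA" below are placeholders):

# elasticsearch.yml on ES1
cluster.name: my-cluster
node.name: es1
network.host: 192.168.1.1
discovery.zen.ping.unicast.hosts: ["192.168.1.1", "192.168.1.2"]

# elasticsearch.yml on ES2
cluster.name: my-cluster
node.name: es2
network.host: 192.168.1.2
discovery.zen.ping.unicast.hosts: ["192.168.1.1", "192.168.1.2"]

Once both nodes have joined the same cluster, one replica per primary shard is enough for index A:

curl -XPUT localhost:9200/indexA/_settings -d '{
"index.number_of_replicas" : 1
}'

Elasticsearch then places each replica on the other node automatically, so if ES1 goes down you can still curl [ip]:9200/index/type against ES2 (the cluster reports yellow instead of green until ES1 returns).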

Related

Elastic search indices are getting recreated after deletion

We are running a single-node cluster, as a single instance.
Filebeat is the log forwarder for Logstash.
We have indices like
abc_12.06.2018
abc_13.06.2018
With 5 primary shards and 1 replica shard.
When I delete abc_12.06.2018, it is deleted at that moment, but after some time the index gets recreated.
The same happens with replica 0 as well.
Please help.
It looks like Filebeat just writes logs to the index that you deleted and recreates it. The root cause isn't in Elasticsearch.
Do the "recreated" indices have any data?

Old Elasticsearch shards are not deleted after relocation

Our Elasticsearch cluster has two data directories. We recently restarted all the nodes in the cluster. After the successful restart process, we observed increased disk space usage on few nodes. When we examined the folders inside the data directory, we found that there are orphaned shards.
For example, an orphaned shard "15" exists at location data_dir0/cluster_name/nodes/0/indices/index_name/15, while one of the replicas of the same shard "15" exists on the same node inside other data directory, here at data_dir1/cluster_name/nodes/0/indices/index_name/15. This shard "15" from data_dir1 is also included in cluster metadata and thus, we assume that shard "15" from data_dir0 is an orphaned shard and has to be deleted by Elasticsearch. But Elasticsearch hasn't deleted the orphaned shard yet, even after 6 days since last restart.
We found this topic https://discuss.elastic.co/t/old-shards-on-re-joining-nodes-useful/182661 relating to our issue, but it did not help us, as Elasticsearch did not take care of that orphaned shard. We also raised the question on the Elastic forum but are not getting quick replies, so I am asking it here since Stack Overflow has a larger community.
This also happened to our cluster; we run Elasticsearch 6.1.3. One specific node had 88% of its disk used, and it turned out there were shard leftovers from a previous relocation of our production index.
To fix this I stopped Elasticsearch on the node (make sure you have plenty of disk space on your other data nodes) and let Elasticsearch's relocation do its work. Once it was done and rebalanced, I deleted the index folder and started Elasticsearch again. This went quite painlessly.
What version of Elasticsearch are you running?
Is your cluster green? If so, those shard files should be deleted by Elasticsearch during initialization. But if that shard has unallocated replicas at the time the node rejoined the cluster, Elasticsearch won't remove pre-existing shard files on disk.
You can manually delete the directory if you don't need the shard. Or you can try restarting Elasticsearch on the node and let it delete the files for you.
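Before removing anything by hand, it is worth confirming that the copies the cluster actually uses are allocated and healthy; a sketch, assuming the default port and substituting your real index name for index_name:

curl -s 'localhost:9200/_cluster/health?pretty'
curl -s 'localhost:9200/_cat/shards/index_name?v'

If the cluster is green and shard 15 of index_name shows a STARTED primary and replica on the expected nodes, the directory under data_dir0 is not referenced by the cluster state and is only taking up disk.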
We also got help from the Elastic forum here: https://discuss.elastic.co/t/old-shards-not-deleted-upon-relocation/71161/6
Restarting the node did not help and we do not want to manually delete the folders. So, we are going to replace the affected nodes one by one.
@chani It would be great if you could provide an official link for the manual-delete suggestion.

Elasticsearch data issue

Today I faced a weird problem with Elasticsearch and I don't know how to fix the issue. The scenario is somewhat like this:
Elasticsearch cluster with 2 nodes:
Total docs: 1000
Now one of my servers goes down and all writes and reads are handled by the second server. Now say there are 10000 docs in ES2.
Due to a system problem the whole Elasticsearch cluster then goes down, both ES1 and ES2. Now I manually bring ES1 up and it takes all the write requests.
ES2 comes up and discovers ES1 as its master.
ES2 syncs up with the master, and all the data written to the ES2 node is lost.
Is there any way to recover the lost data? Is this expected behaviour in a distributed system?
Please let me know if something is not clear.
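This is, unfortunately, the expected failure mode when each of two nodes can elect itself master and keep accepting writes on its own: whichever copy later rejoins as a follower discards its divergent history, and short of a snapshot or filesystem backup taken before the outage there is generally no built-in way to get that data back. On pre-7.x versions, a common mitigation (a prevention sketch, not a recovery method) is to require a master quorum in elasticsearch.yml on both nodes:

# require a quorum of 2 master-eligible nodes before a master can be elected
discovery.zen.minimum_master_nodes: 2

With only two nodes this means the cluster refuses writes whenever either node is down, which is why a third, possibly master-only, node is usually recommended.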

Elasticsearch some indices unassigned after a brain-split happened

Using ES version 1.3.1.
Found a split-brain, then restarted the entire cluster. Now only the latest index got correctly allocated, leaving all other indices unassigned...
I've checked several nodes; there is index data saved on disk, and I've tried restarting those nodes, but the shards still won't get allocated...
Please see this screenshot:
http://i.stack.imgur.com/d6jT7.png
I've tried the "Cluster reroute" API: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-reroute.html. However, I got an exception like "cannot allocate Primary Shard"...
Please help and any comment is welcome. Thanks a lot.
Don't allocate primary shards with the _cluster/reroute API; this will create an empty shard with no data.
Try setting your replica count to 0.
If that doesn't work, set index.gateway logging to TRACE and restart a node that contains saved index data for one of the unassigned shards. What do you see in the logs for that node or in the logs for the master node?
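A sketch of what those two suggestions look like with the 1.x API (the endpoint is a placeholder; add an index name to the _settings path to limit it to specific indices):

curl -XPUT localhost:9200/_settings -d '{
"index.number_of_replicas" : 0
}'
curl -XPUT localhost:9200/_cluster/settings -d '{
"transient" : { "logger.index.gateway" : "TRACE" }
}'

Then restart one node that still has shard data on disk and watch its log, and the master's, for index.gateway messages explaining why the local copy is or is not being recovered.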

how we decide in which node we should store document in elasticsearch?

Since I am new to ES, I need help.
I read that it is possible to specify the shard where a document will be stored using 'routing'. But is it possible to require that a document be saved on a particular node?
Suppose I have two nodes, node1 and node2. My requirement is that if I add a document from node1 to the index 'attenadance', its primary shard should be stored on node1 and the replica may be on node2. Likewise, if I add a document from node2 to the index 'attenadance', its primary shard should be stored on node2 and the replica may be on node1... Please advise: is this possible in ES? If yes, please tell me how to achieve it.
Each node has a specific value associated with rack in config.yml. Node 1 has the setting node.rack: rack1, Node 2 a setting of node.rack: rack2, and so on.
We can create an index that will only deploy on nodes that have rack set to rack1 by setting index.routing.allocation.include.rack to rack1. For example:
curl -XPUT localhost:9200/test/_settings -d '{
"index.routing.allocation.include.rack" : "rack1"
}'
further ref:
Elasticsearch official doc
You don't control where shards/replicas go -- Elasticsearch handles that... In general, it won't put a replica of a shard on the same node as its primary. There is a really good explanation of how it all works here: Shards and replicas in Elasticsearch
There is also good documentation on using shard routing on the Elasticsearch blog if you need to group data together (but be careful, because it can generate hot spots).
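For completeness, routing only chooses the shard, never the node; a sketch of the request syntax (the index, type, id, field names and routing value are placeholders):

curl -XPUT 'localhost:9200/attenadance/doc/1?routing=user1' -d '{
"user" : "user1", "status" : "present"
}'
curl -XGET 'localhost:9200/attenadance/_search?routing=user1' -d '{
"query" : { "term" : { "user" : "user1" } }
}'

Documents that share a routing value land on the same shard, and searches that pass the same value only hit that shard; which node that shard ends up on is still decided by the allocation settings above.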
