Scaling down Elasticsearch cluster - elasticsearch

First of all, I would like to mention that I am not an Elasticsearch expert.
I have a 3-node Elasticsearch cluster. The utilization of the resources is not proportional to the cost, so I have decided to remove 2 nodes.
Now I am wondering: what is the graceful way to remove 2 nodes out of 3 without downtime? What could the consequences be?
I cannot completely shut down the whole cluster. Running Elasticsearch version: 5.6.8
Any help or suggestion will be really appreciated.

For high availability you need at least 3 nodes for the master election. Be sure to set discovery.zen.minimum_master_nodes correctly:
2 (= majority) for 3 nodes — only this is highly available
2 for 2 nodes (also the majority), but you are losing HA because as soon as one node is down you will not be able to elect a master any more
1 for 1 node
If you are removing data nodes, be sure that the data is replicated to at least one other node. Either set the replication factor number_of_replicas to 2 (= 3 copies, so on all nodes in your case) if you want to kill 2 of the 3 nodes. Or slightly more gracefully, set "index.routing.allocation.require._name": "A" to ensure that data must be allocated on the node with the name A. Ensure with the cat shards API that the surviving node has all the required data.
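On 5.6 all of this can be done through the REST API; a rough sketch, assuming the surviving node is indeed named A and Elasticsearch is reachable on localhost:9200 (adjust both to your setup):
# pin every index to the node named A before decommissioning the other two
$ curl -XPUT 'localhost:9200/_all/_settings' -H 'Content-Type: application/json' -d '{ "index.routing.allocation.require._name": "A" }'
# verify that each shard now has a started copy on A
$ curl -XGET 'localhost:9200/_cat/shards?v'
# once only one node is left, lower the master quorum to match (1 for 1 node, as above)
$ curl -XPUT 'localhost:9200/_cluster/settings' -H 'Content-Type: application/json' -d '{ "persistent": { "discovery.zen.minimum_master_nodes": 1 } }'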

Related

Why do I need 3 nodes for a k-safety value of 1

Referring to Vertica documentation -
"Minimum Subcluster Size for K-Safe Databases
In a K-safe database, subclusters must have at least three nodes in order to operate. Each subcluster tries to maintain subscriptions to all shards in the database. If a subcluster has less than three nodes, it cannot maintain shard coverage. Vertica returns an error if you attempt to rebalance shards in a subcluster with less than three nodes in a K-safe database." from https://www.vertica.com/docs/10.0.x/HTML/Content/Authoring/Eon/Subclusters.htm?TocPath=Vertica%20Architecture%3A%20Eon%20Versus%20Enterprise%20Mode|Eon%20Mode%20Concepts|_____3
Why do I need 3 nodes?
Wouldn't things work if K-safety is 1 and there are only 2 shards? So node 1 has shard 1 and shard 2, and so does node 2? If node 2 fails, then node 1 serves all queries? Does it have to do with QUORUM: with two nodes, if 1 node goes down then QUORUM is lost and thus the database shuts down?
As @minatmerva put it in his comment:
Working on several nodes is what Massive Parallel Processing (MPP) is all about. Working on 2 nodes when the third is down is still MPP. Working on 1 node when the 2nd is down isn't MPP any more. So working MPP on 2 nodes is not foreseen at all.

From how many nodes do you need dedicated master nodes

A question. Is there any recommendation on the number of nodes at which you should start using dedicated master nodes in an Elasticsearch cluster?
My setup:
4 nodes: for non-critical data (32GB RAM each). Can be the master node.
3 nodes: for critical data (16GB RAM each).
Do the master nodes need the same memory as the data nodes?
At any given time you can have only one elected master node, but for availability you should have more than one master-eligible node by setting node.master.
The master node is the only node in a cluster that can make changes to the cluster state. This means that if your master node is rebooted or down, you will not be able to make any changes to your cluster.
Well, at some point it is a bit hard to say what is right or best practice because it always depends on many parameters.
With your setup I would rather go with 3 nodes and up to 64 GB of memory per node; otherwise you are losing some performance on communication between your 7 servers while they are not utilizing 100% of their resources. Then all 3 nodes must be able to become master, and set
discovery.zen.minimum_master_nodes: 2
This parameter is quite important to avoid split brain when each node could become a master.
For your critical data you must use 1 replica to prevent loss of data.
Another option would be to make master-only nodes and data-only nodes.
In any case you should always have 3 master-eligible nodes; this will allow you to upgrade without downtime and make sure that you have an always-on setup.
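Purely as an illustration (the values are a sketch for a 3-node setup like the one described, not a prescription), the relevant lines of elasticsearch.yml on each of the three nodes could look roughly like this:
# every node holds data and is eligible to be elected master
node.master: true
node.data: true
# majority of the 3 master-eligible nodes, to avoid split brain
discovery.zen.minimum_master_nodes: 2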

discovery.zen.minimum_master_nodes value for a cluster of two nodes

I have two dedicated Windows Servers (Windows Server 2012R2, 128GB memory on each server) for ES (2.2.0). If I have one node on each server and the two nodes form a cluster, what is the proper value for
discovery.zen.minimum_master_nodes
I read this general rule in elasticsearch.yml:
Prevent the "split brain" by configuring the majority of nodes (total
number of nodes / 2 + 1):
I saw this SO thread:
Proper value of ES_HEAP_SIZE for a dedicated machine with two nodes in a cluster
There is an answer saying:
As described in Elasticsearch Pre-Flight Checklist, you can set
discovery.zen.minimum_master_nodes to at least (N/2)+1 on clusters
with N > 2 nodes.
Please note "N > 2". What is the proper value in my case?
N is the number of ES nodes (not physical machines but ES processes) that can be part of the cluster.
In your case, with one node on two machines, N = 2 (note that it was 4 in the thread linked above), so the formula N/2 + 1 yields 2, which means that both of your nodes MUST be eligible as master nodes if you want to prevent split brain situations.
If you set that value to 1 (which is the default value!) and you experience networking issues and both of your nodes can't see each other for a brief moment, each node will think it is alone in the cluster and both will elect themselves as master. You end up in a situation where you have two masters and that's not a good thing. Whereas if you set that value to 2 and you experience networking issues, the current master node will stay elected and the second node will never decide to elect itself as new master. Whenever network is back up, the second node will rejoin the cluster and continue serving requests.
The ideal topology is to have 3 dedicated master nodes (i.e. with master: true and data: false) and have discovery.zen.minimum_master_nodes set to 2. That way you'll never have to change the setting, no matter how many data nodes are part of your cluster.
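A minimal sketch of that topology in elasticsearch.yml terms (node roles only; network and discovery settings omitted):
# on each of the 3 dedicated master nodes
node.master: true
node.data: false
discovery.zen.minimum_master_nodes: 2
# on every data node
node.master: false
node.data: true
discovery.zen.minimum_master_nodes: 2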
So the N > 2 constraint should indeed be N >= 2, but I guess it was somehow implied, because otherwise you're creating a fertile ground for split brain situations.
Interestingly, in ES 7 discovery.zen.minimum_master_nodes no longer needs to be defined:
https://www.elastic.co/blog/a-new-era-for-cluster-coordination-in-elasticsearch

elasticsearch: Proper config in 3 node cluster for each node to have full copy of index?

3 node cluster of ElasticSearch 1.7.2 on CentOS
From a traditional cluster perspective, in a 3 node environment the approach is to tolerate the failure of any one node while the cluster stays operational.
The default elasticsearch.yml reflects this, and all is well.
In our environment, 3 nodes, we want any one node to be able to stand alone and operate even if both other nodes are lost.
We believe the following achieves this:
index.number_of_replicas: 2 # in 3-node cluster, every node will have p or r copy of every shard
discovery.zen.minimum_master_nodes: 2 # reqd for 3 node env, but what happens when only 1 node survives?
Any additions or changes to the above approach?
We also have a three node cluster with all nodes being capable of becoming master. I guess apart from minimum master nodes, the rest of the config remains the same as the default. Just as a word of precaution: when the cluster has only one node working, there are no replicas available on that node. Try not to have that situation in production while you are indexing data; otherwise, if the data set is huge, it takes a good amount of time to propagate all changes and reallocate shards once the other nodes are back up. Cheers.
The answer is:
index.number_of_replicas: 2
On a 3 node system, this means every node will have a replica of every shard, so any 1 node can stand alone/has all the data.
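One caveat worth noting: index.number_of_replicas in elasticsearch.yml only acts as the default for newly created indices, so for indices that already exist you would bump the replica count through the settings API, roughly like this (host assumed to be localhost:9200):
# raise the replica count on all existing indices so every node can hold a full copy
$ curl -XPUT 'localhost:9200/_all/_settings' -d '{ "index": { "number_of_replicas": 2 } }'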
Distributed databases are meant to be resistant to failures, but each node is not meant to stand alone. It would be possible to set up ES such that each node has 100% of the data from each of the indexes, but that would mean extra replicas and fewer shards. Both of those are going to lead to reduced performance from the cluster.
If you are really worried that 2 of your nodes will go down at the same time, I suggest adding a 4th data node instead of setting things up so that the 3rd node can stand alone.

Elasticsearch - Add one node to a running cluster

I have an Elasticsearch cluster (it's not a real cluster because I have only 1 node).
The cluster has 1.8 TB of data, something like 350 indexes.
I would like to add a new node to the cluster BUT I don't want to run replication for all the data.
I want to split my shards across 2 nodes (each node will have 1 shard).
For each index I have 2 shards, 0 & 1, and I would like to split my data.
Is this possible? How will this affect Kibana performance?
Thanks a lot
Amit
For each index I have 2 shards, 0 & 1, and I would like to split my data. Is this possible?
When you add your second node to your cluster, your data will automatically be "rebalanced" and replicated.
Presumably, if you were to run
$ curl -XGET localhost:9200/_cluster/health?pretty
then you would see that your current cluster health is probably yellow because there is no replication taking place. You can tell because you presumably have an equal number of assigned primary shards as you have unassigned shards (unallocated replicas).
What will happen when you start the second node in the cluster? It will immediately begin to copy shards from the original node. Once complete, the data will exist on both nodes and this is what you want to happen. As you scale further by eventually adding a third node, you will actually spread the shards across the cluster in a less predictable way that does not result in a 1:1 relationship between the nodes. It's only once you add a third node that you can reasonably avoid copying all of the shard data to every node.
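If you want to watch that copying happen, the cat APIs give a reasonable view (a sketch only; host assumed to be localhost:9200 as in the health check above):
# shard copies currently being recovered onto the new node
$ curl -XGET 'localhost:9200/_cat/recovery?v'
# final location of every shard once rebalancing is done
$ curl -XGET 'localhost:9200/_cat/shards?v'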
Other considerations:
Be sure to set discovery.zen.minimum_master_nodes to 2. This should always be set to M / 2 + 1 using integer division (truncated division), where M is the number of master-eligible nodes in your cluster. If you don't set this setting, then you will eventually end up losing data.
You want replication because it gives you higher availability in the event of hardware failure on either node. Due to the above setting, with a two node cluster your cluster would be read-only until you added a second node again or unset the setting, but at least the data would still exist.
How will this affect Kibana performance?
It's hard to say whether this will really improve performance, but it most likely will simply by spreading the workload across two machines.

Resources