How many masters in a three-node cluster - Elasticsearch

I stumbled on this question: how many masters can there be in a three-node cluster? I came across a point in an article on the internet saying that search and index requests should not be sent to the elected master. Is that correct? So, if I have three nodes acting as master (out of which one is the elected master), should I point incoming logs to be indexed and searched to the other master nodes apart from the elected master? Please clarify. Thanks in advance.

In a three-node cluster, all nodes most likely hold data and are master-eligible. That is the simplest situation, in which you don't have to worry about anything else.
If you have a larger cluster, you can have a couple of nodes which are configured as dedicated master nodes. That is, they are master-eligible but they don't hold any data. For example, you could have 3 dedicated master nodes and 7 data nodes (not master-eligible). Exactly one of the dedicated master nodes will always be the elected master.
The point is that since the dedicated master nodes don't hold data, they will not directly serve index and search requests. If you send an index or search request to them, they have no choice but to delegate it to one of the 7 data nodes.
From the Elasticsearch Reference for Modules - Node:
dedicated master nodes are nodes with the settings node.data: false
and node.master: true. We actively promote the use of dedicated master
nodes in critical clusters to make sure that there are 3 dedicated
nodes whose only role is to be master, a lightweight operational
(cluster management) responsibility. By reducing the amount of
resource intensive work that these nodes do (in other words, do not
send index or search requests to these dedicated master nodes), we
greatly reduce the chance of cluster instability.
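As a rough sketch, using the same pre-7.x settings as the quote, the counterpart elasticsearch.yml entries for the 7 data nodes in the example above would simply flip the two flags:
node.master: false
node.data: true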
A related question is how many master-eligible nodes there should be in a cluster. The answer is essentially: at least 3, in order to prevent split brain (a situation in which, due to a network failure, two masters are elected simultaneously).
The Elasticsearch Guide has a section on Minimum Master Nodes, an excerpt:
When you have a split brain, your cluster is at danger of losing data.
Because the master is considered the supreme ruler of the cluster, it
decides when new indices can be created, how shards are moved, and so
forth. If you have two masters, data integrity becomes perilous, since
you have two nodes that think they are in charge.
This setting tells Elasticsearch to not elect a master unless there
are enough master-eligible nodes available. Only then will an election
take place.
This setting should always be configured to a quorum (majority) of
your master-eligible nodes. A quorum is (number of master-eligible
nodes / 2) + 1. Here are some examples:
If you have ten regular nodes (can hold data, can become master), a quorum is 6.
If you have three dedicated master nodes and a hundred data nodes, the quorum is 2, since you need to count only nodes that are master-eligible.
If you have two regular nodes, you are in a conundrum. A quorum would be 2, but this means a loss of one node will make your cluster inoperable. A setting of 1 will allow your cluster to function, but doesn't protect against split brain. It is best to have a minimum of three nodes in situations like this.
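Sticking with the second example (three dedicated master nodes and a hundred data nodes), a minimal pre-7.x sketch of the resulting setting would be:
discovery.zen.minimum_master_nodes: 2   # (3 / 2) + 1 = 2, counting only master-eligible nodes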

Related

Number of nodes AWS Elasticsearch

I read the documentation, but unfortunately I still don't understand one thing. While creating an AWS Elasticsearch domain, I need to choose the "Number of nodes" in the "Data nodes" section.
If I specify 3 data nodes and 3 AZs, what does it actually mean?
My guesses are:
I'll get 3 nodes with their own storage (EBS). One of the nodes is the master and the other 2 are replicas in different AZs, just copies of the master, so that data isn't lost if the master node breaks.
I'll get 3 nodes with their own storage (EBS). All of them will work independently and their storage holds different data. So data can be processed by different nodes at the same time and stored on different volumes.
It looks like the other AZs should hold replicas, but then I don't understand why I see different amounts of free space on different nodes.
Please explain how it works.
Many thanks for any info or links.
I haven't used AWS Elasticsearch, but I've used the Cloud Elasticsearch service.
Using 3 AZs (availability zones) means that your cluster will be spread across 3 zones in order to make it resilient. If one zone has problems, then the nodes in that zone will have problems as well.
As the description section mentions, you need to specify a multiple of 3 nodes if you choose 3 AZs. If you have 3 nodes, then every AZ will have one node. If one zone has problems, that node is out, and the two remaining ones will have to pick up from there.
Now, in order to answer your question about what you get with this configuration: you can check it yourself. Use this via Kibana or any HTTP client:
GET _nodes
Check for the sections:
nodes.roles
nodes.attributes
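If you prefer a more compact view, the _cat API should show roughly the same information; for example, the following request (column names assumed from the _cat/nodes docs) lists each node's roles and marks the elected master:
GET _cat/nodes?v&h=name,node.role,master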
In the various documentation pages, blog posts, etc. you will see that for production usage, 3 nodes and 3 AZs are a good starting point for a resilient production cluster.
So let's take it step by step:
You need an odd number of master-eligible nodes in order to avoid the split brain problem.
You need more than one node in your cluster in order to make it resilient (in case a node becomes unavailable).
Combining these two, you end up with a minimum requirement of 3 nodes (no mention of zones yet).
But having one master and two data nodes will not cut it. You need to have 3 master-eligible nodes. Then, if one node is out, the other two can still form a quorum and elect a new master, so your cluster stays operational with two nodes. But in order for this to work, you need to set up your primary and replica shards in a way that any two of your nodes can hold your entire data.
Examples (for simplicity we have only one index):
1 primary, 2 replicas. Every node holds one shard which is 100% of the data
3 primaries, 1 replica. Every node will hold one primary and one replica (33% primary, 33% replica). Two nodes combined (which is the minimum to form a quorum as well) will hold all your data (and some more)
You can have more combinations but you get the idea.
As you can see, the shard configuration needs to go along with your number and type of nodes (master-eligible, data only etc).
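As a minimal sketch of the second example (the index name is just a placeholder), the shard layout is declared when the index is created, roughly like this:
PUT my-index
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  }
}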
Now, if you add the availability zones, you take care of the problem of one zone being problematic. If your cluster were located entirely in one zone (3 nodes in one zone), then if that zone had problems your whole cluster would be out.
If you set up one master node and two data nodes (which are not master-eligible), having 3 AZs (or even 3 nodes) doesn't do much for resiliency, since if the master goes out, your cluster cannot elect a new one and it will be out until a master node is up again. For the same setup, if a data node goes out and your shards are configured with redundancy (meaning the two remaining nodes combined hold all the data), then things will keep working fine.
Your questions should be covered by the following three points.
If I specify 3 data nodes and 3 AZs, what does it actually mean?
This means that your data and its replicas will be spread across the 3 AZs, with no replica placed in the same AZ as its corresponding data node. Check this link. For example, say you want 2 data nodes in 2 AZs: the data on DN1 will be stored in (let's say) AZ1 and its replicas in AZ2, while the data on DN2 will be in AZ2 and its replicas in AZ1.
It looks like the other AZs should hold replicas, but then I don't understand why I see different amounts of free space on different nodes.
That is because when you give your AWS Elasticsearch domain some amount of storage, the cluster divides the specified storage space across all data nodes. If you specify 100G of storage on a cluster with 2 data nodes, it divides the storage space equally over the data nodes, i.e. two data nodes with 50G of available storage space each.
Sometimes you will see more nodes than you specified for the cluster. It took me a while to understand this behaviour. The reason is that when you update these configs on AWS ES, it takes some time for the cluster to stabilize. So if you see more data or master nodes than expected, hold on for a while and wait for it to stabilize.
Thanks everyone for the help. To understand how much space is available/allocated, run the following queries:
GET /_cat/allocation?v
GET /_cat/indices?v
GET /_cat/shards?v
So, if I create 3 nodes, I get 3 different nodes with separate storage; they are not replicas. Some data is stored on one node, some on another.

Rejoin separated data node to cluster

I have 3 Elasticsearch nodes, all of which act as master-eligible data nodes.
Due to a connectivity issue, one node left the cluster and promoted itself to master. Now I have two clusters: the first with two nodes and the other with one node. As all the nodes were behind a load balancer, all of them were receiving requests from Logstash. What will happen if I restart the single-node cluster and try to add it back to the original cluster?
The problem that you are encountering is called the split brain problem.
Here is a description of it:
The problem comes in when a node falls down or there's simply a lapse
in communication between nodes for some reason. If one of the slave
nodes cannot communicate with the master node, it initiates the
election of a new master node from those it's still connected with.
That new master node then will take over the duties of the previous
master node. If the older master node rejoins the cluster or
communication is restored, the new master node will demote it to a
slave so there's no conflict. For the most part, this process is
seamless and "just works."
However, consider a scenario where you have just two nodes: one master
and one slave. If communication between the two is disrupted, the
slave will be promoted to a master, but once communication is
restored, you end up with two master nodes. The original master node
thinks the slave dropped and should rejoin as a slave, while the new
master thinks the original master dropped and should rejoin as a
slave. Your cluster, therefore, is said to have a split brain.
Reference: https://qbox.io/blog/split-brain-problem-elasticsearch
To avoid this problem, add this to the elasticsearch.yml file on your master-eligible nodes:
discovery.zen.minimum_master_nodes: 2
The formula for this is: prevent the "split brain" by configuring a majority of master-eligible nodes, i.e. (total number of master-eligible nodes / 2) + 1.
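As a quick worked example for the three-node cluster in the question: (3 / 2) + 1 = 1 + 1 = 2 (using integer division), which is where the value of 2 above comes from.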

From how many nodes do you need dedicated master nodes?

A question: is there any recommendation on the number of nodes at which you should start using dedicated master nodes in an Elasticsearch cluster?
My setup:
4 nodes for non-critical data (32 GB RAM each), which can be master nodes.
3 nodes for critical data (16 GB RAM each).
Do the master nodes need the same amount of memory as the data nodes?
At any given time you can have only one elected master node, but for availability you should have more than one master-eligible node by setting node.master: true.
The master node is the only node in a cluster that can make changes to the cluster state. This means that if your master node is rebooted or down, you will not be able to make any changes to your cluster.
Well, to some extent it is hard to say what is right or best practice, because it always depends on many parameters.
With your setup I would rather go with 3 nodes and up to 64 GB of memory per node; otherwise you lose some performance on communication between your 7 servers while they are not utilizing 100% of their resources. Then all 3 nodes must be able to become master, and set
discovery.zen.minimum_master_nodes: 2
This parameter is important to avoid split brain when every node could become a master.
For your critical data you must use 1 replica to prevent loss of data.
Another option would be to have master-only nodes and data-only nodes.
In any case, you should always have at least 3 master-eligible nodes; this will allow you to upgrade without downtime and make sure that you always have an operational setup.
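Pulling that recommendation together, a minimal sketch of elasticsearch.yml for each of the three suggested nodes (pre-7.x settings assumed) would be something like:
node.master: true
node.data: true
discovery.zen.minimum_master_nodes: 2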

Can a two-node (one on each box) cluster provide uninterrupted services if one box is down?

In my production environment, I have a two-node cluster (ES 2.2.0) and each node sits on a different physical box. Inside elasticsearch.yml, I have the following:
discovery.zen.minimum_master_nodes: 2
My question is: if one box is down, can the other node continue to function normally to provide uninterrupted search services (indexing and searching, writing and reading)?
If you have two nodes, each of them master-eligible, and discovery.zen.minimum_master_nodes set to 1, then if the network goes down and the two nodes don't see each other for a while, you'll get into a split brain situation because each node will elect itself as master.
However, with a setting of 2, you have two possible situations:
if the non-master goes down, the other node will continue to function properly (since it is already master)
if the master goes down, the other won't be able to elect itself as the master (since it will wait for a second master-eligible node to be visible).
For this reason, with only two nodes, you need to choose between the possibility of a split brain (with minimum_master_nodes: 1) or a potentially RED cluster (with minimum_master_nodes: 2). The best way to overcome this is to include a third master-only node and then minimum_master_nodes: 2 would make sense.
Just try it out:
Start your cluster, bring down the master node, what happens?
Start your cluster, bring down the non-master node, what happens?
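To see the outcome of each experiment, the standard cluster APIs are enough; for example:
GET _cluster/health
GET _cat/master?v
The first shows the overall cluster status, the second shows which node (if any) is currently the elected master.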
The purpose of minimum master nodes is to maintain the stability of the cluster.
If you have only 2 nodes in the cluster and set minimum master nodes to 2, the cluster will expect both nodes to be up in order to serve the various search services.
If one node goes down in your 2-node cluster (which had a minimum master nodes setting of 2), theoretically the cluster goes down.
First, this setting helps prevent split brain, the existence of two masters in a single cluster.
If you have two nodes, a setting of 1 will allow your cluster to function but doesn't protect against split brain. It is best to have a minimum of three nodes in situations like this.

elasticsearch: Proper config in 3 node cluster for each node to have full copy of index?

3-node cluster of Elasticsearch 1.7.2 on CentOS
In a traditional cluster perspective, for a 3 node environment, the approach is to allow the failure of any one node, and the cluster will still be operational.
The default elasticsearch.yml reflects this, and all is well.
In our environment, 3 nodes, we want any one node to be able to stand alone and operate even if both other nodes are lost.
We believe the following achieves this:
index.number_of_replicas: 2 # in 3-node cluster, every node will have p or r copy of every shard
discovery.zen.minimum_master_nodes: 2 # reqd for 3 node env, but what happens when only 1 node survives?
Any additions or changes to the above approach?
We also have a three-node cluster with all nodes capable of becoming master. I guess apart from minimum master nodes, the rest of the config remains the same as the defaults. Just as a word of precaution: when the cluster has only one node working, there are no replicas available on that node. Try not to be in that situation in production while you are indexing data; otherwise, if the data set is huge, it takes a good amount of time to propagate all changes and reallocate shards once the other nodes come back up. Cheers.
The answer is:
index.number_of_replicas: 2
On a 3-node system, this means every node will hold a copy (primary or replica) of every shard, so any one node can stand alone and has all the data.
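Note that the number of replicas can also be changed on an existing index without reindexing; a sketch (index name is a placeholder) using the index settings API:
PUT my-index/_settings
{
  "index": {
    "number_of_replicas": 2
  }
}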
Distributed databases are meant to be resistant to failures, but each node is not meant to stand alone. It is possible to set up ES such that each node has 100% of the data from every index, but that means extra replicas and fewer shards, both of which reduce the cluster's performance.
If you are really worried that 2 of your nodes will go down at the same time, I suggest adding a 4th data node instead of setting things up so that the 3rd node can stand alone.
