I know that it is possible to define more than one master-eligible node for an Elasticsearch cluster, where only one acts as the master and the others can step in if necessary. See also https://stackoverflow.com/a/15022820/2648551 .
What I don't understand is how I can determine which master is active and which could step in if necessary.
This is the setup I currently have (a rough elasticsearch.yml sketch follows the list):
node-01: master (x) data(-)
node-02: master (-) data(x)
node-03: master (-) data(x)
node-04: master (-) data(x)
node-05: master (-) data(x)
node-06: master (-) data(x)
Now I want to make, say, node-02 master-eligible as well. Can I rely on ES being smart enough to always pick the non-data node (node-01) as the active master, or could node-02 ever act as the active master while all nodes are present and there are no problems? Or is that something I just don't have to worry about?
I am currently using ElasticSearch 1.7 [sic!], but I am also interested in answers based on the latest versions.
A few years later, and just for context: we "can" now decide which node becomes master, although it is not straightforward, it is possible.
Elasticsearch now has an API called voting_config_exclusions which can be used to move away from the current master node, e.g.
let's say you have 3 master-eligible nodes in your cluster:
$ GET _cat/nodes?v
ip node.role master name
192.168.0.10 cdfhilmrstw - node-10
192.168.0.20 cdfhilmrstw * node-20
192.168.0.30 cdfhilmrstw - node-30
192.168.0.99 il - node-99
and Elasticsearch has selected node-20 as the active master. You can then run the following call to remove the active master from voting.
POST /_cluster/voting_config_exclusions?node_names=node_name
This will make Elasticsearch elect another, more or less randomly chosen, master-eligible node as master (if you have more than one left). Keep doing this for whichever node becomes active until the right one is elected as master.
Note: this doesn't remove the node; it only makes it a non-voting node (no longer the active master) and allows another node to become the active master.
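For the example above, where node-20 is the active master, the sequence could look like this (node names are taken from the sample output; adjust them to your cluster):

POST /_cluster/voting_config_exclusions?node_names=node-20
GET _cat/nodes?v

After the POST, the master column (*) in the _cat/nodes output should move to node-10 or node-30; if it is still not the node you want, repeat the exclusion call with the newly elected node's name.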
Once done, make sure to run the command below to remove the exclusions and allow all eligible nodes to become master if and when the currently selected node goes down.
DELETE /_cluster/voting_config_exclusions
Thank You
In short, no, you can't decide which of the master-eligible nodes will become the master, because the master node is elected (it was in ES 1.7, and it still is in ES 6.2).
No, you can't rely on Elasticsearch being smart enough to always pick the non-data node as the active master. In fact, as of now (6.2) they advise having dedicated master nodes (i.e. nodes that do not perform any data operations):
To ensure that your master node is stable and not under pressure, it
is a good idea in a bigger cluster to split the roles between
dedicated master-eligible nodes and dedicated data nodes.
... It is important
for the stability of the cluster that master-eligible nodes do as
little work as possible.
(Note that they are talking about a "bigger cluster".)
I can only assume that this also holds for the earlier versions and the documentation just got richer.
There is a problem with the configuration that you have posted. Although you have many nodes, loss of one (the master node, node-01) will make your cluster non-functional. To avoid this situation you may choose one of these options:
use the default strategy and make all nodes both data nodes and master-eligible nodes;
make a set of dedicated master-only nodes (at least 3 of them); see the configuration sketch after this list.
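A dedicated master-only node (in 6.x) is simply a node that is master-eligible but holds no data; a minimal elasticsearch.yml sketch could look like this:

node.master: true
node.data: false
node.ingest: false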
It would be nice to know the reason why the ES defaults are not good enough for you, because usually they are good enough.
However, if this is a case where you need dedicated master nodes, make sure you have at least 3 of them and that discovery.zen.minimum_master_nodes is set high enough to avoid the "split brain" situation:
discovery.zen.minimum_master_nodes = (master_eligible_nodes / 2) + 1
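For example, with 3 master-eligible nodes the quorum is (3 / 2) + 1 = 2 (using integer division), so each master-eligible node would get this line in its elasticsearch.yml:

discovery.zen.minimum_master_nodes: 2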
Hope that helps!
Related
We are currently setting up an environment with two elasticsearch instances (clustered servers).
Since it's clustered, we need to make sure that data (indexes) are synced between the two instances.
We do not have the possibility to set up an additional (3rd) server/instance to act as the 'master'.
Therefore we have configured both instances as master-eligible and data nodes. So instance 1 is a master and data node, and instance 2 is also a master and data node.
The synchronization works fine when both instances are up and running. But when one instance is down, the other keeps trying to connect to the instance that is down, which obviously fails because the instance is down. As a result, the node that is up also stops functioning, because it cannot connect to its 'master' node (which is the node that is down), even though the instance itself is also master-eligible.
The following errors are logged in this case:
org.elasticsearch.cluster.block.ClusterBlockException: blocked by: [SERVICE_UNAVAILABLE/2/no master];
org.elasticsearch.transport.ConnectTransportException: [xxxxx-xxxxx-2][xx.xx.xx.xx:9300] connect_exception
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: no further information: xx.xx.xx.xx/xx.xx.xx.xx:9300
In short: two Elasticsearch master-eligible instances in a clustered setup. When one is down, the other one does not function because it cannot connect to the 'master' instance.
Desired result: If one of the master instances is down, the other should continue functioning (without throwing errors).
Any recommendations on how to solve this, without having to set up an additional server that is the 'master' and the other 2 the 'slaves'?
Thanks
To be able to elect a master, a minimum of 2 master-eligible nodes must be available.
That's why you must have a minimum of 3 master-eligible nodes if you want your cluster to survive the loss of one node.
You can just add a small, specialized master-only node by setting all other roles to false. This node needs very few resources.
As described in this post:
https://discuss.elastic.co/t/master-node-resource-requirement/84609
Dedicated master nodes need persistent storage, but not a lot of it. 1-2 CPU cores and 2-4GB RAM is often sufficient for smaller deployments. As dedicated master nodes do not store data you can also set the heap to a higher percentage (75%-80%) of total RAM that is recommended for data nodes.
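On recent versions (7.9 and later) such a node can be declared with the node.roles setting; a one-line elasticsearch.yml sketch (older versions use the individual node.master / node.data flags instead):

node.roles: [ master ]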
If there is no option to add one more node, then you can set
minimum_master_nodes to 1. This will keep your ES cluster up even if only 1 node is up. But it may lead to a split-brain issue, as a single node is then allowed to form a cluster on its own.
In that scenario you have to restart the cluster to resolve the split brain.
I would suggest you upgrade to Elasticsearch 7.0 or above. There you can live with two nodes, each master-eligible, and the split-brain issue will not occur.
You should not have 2 master-eligible nodes in the cluster, as it is a very risky thing and can lead to the split-brain issue.
Master nodes don't require many resources, but as you have just two data nodes, you can still live without dedicated master nodes (but please be aware that this has downsides), just to save cost.
So simply remove the master role from one of the nodes and you should be good to go; a configuration sketch follows below.
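A sketch of the elasticsearch.yml change on the node that should no longer be master-eligible (flag syntax used before 7.9):

node.master: false
node.data: true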
A question: is there any recommendation on how many nodes you need before you should use dedicated master nodes in an Elasticsearch cluster?
My setup:
4 nodes: for non-critical data (32 GB RAM each). These can be the master nodes.
3 nodes: for critical data (16 GB RAM each).
Do the master nodes have the same memory requirements as the data nodes?
At any given time you can have only one active master node, but for availability you should have more than one master-eligible node, by setting node.master.
The master node is the only node in a cluster that can make changes to the cluster state. This means that if your master node is rebooted or goes down, you will not be able to make any changes to your cluster.
Well, at some point it is a bit hard to say what is right or best practice, because it always depends on many parameters.
With your setup I would rather go with 3 nodes and up to 64 GB of memory per node; otherwise you are losing some performance on communication between your 7 servers while they are not utilizing 100% of their resources. Then all 3 nodes must be able to become master, and set
discovery.zen.minimum_master_nodes: 2
This parameter is quite important to avoid split brain when each node could become a master.
For your critical data you must use 1 replica to prevent loss of data.
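Replicas are configured per index; a minimal sketch using the index settings API (the index name here is only an example):

PUT /critical-data/_settings
{
  "index": {
    "number_of_replicas": 1
  }
}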
Another option would be to have master-only nodes and data-only nodes.
So at some point the number of master-eligible nodes should always be 3; this will allow you to upgrade without downtime and make sure that you have an always-on setup.
In my production environment, I have a two-node cluster (ES 2.2.0) and each node sits on a different physical box. Inside elasticsearch.yml, I have the following:
discovery.zen.minimum_master_nodes: 2
My question is: if one box is down, can the other node continue to function normally to provide uninterrupted search services (index and search, write and read)?
If you have two nodes and each is master-eligible, and you leave discovery.zen.minimum_master_nodes at its default of 1, then if the network goes down and the two nodes don't see each other for a while, you'll get into a split-brain situation, because each node will elect itself as master.
However, with a setting of 2, you have two possible situations:
if the non-master goes down, the other node will continue to function properly (since it is already master)
if the master goes down, the other won't be able to elect itself as the master (since it will wait for a second master-eligible node to be visible).
For this reason, with only two nodes, you need to choose between the possibility of a split brain (with minimum_master_nodes: 1) or a potentially RED cluster (with minimum_master_nodes: 2). The best way to overcome this is to include a third master-only node and then minimum_master_nodes: 2 would make sense.
Just try it out (two useful checks are sketched after this list):
Start your cluster, bring down the master node, what happens?
Start your cluster, bring down the non-master node, what happens?
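To see which node is currently the elected master and what state the cluster is in, two standard calls are handy (run them against whichever node is still up):

GET _cat/master?v
GET _cluster/health?pretty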
The purpose of minimum_master_nodes is to maintain the stability of the cluster.
If you have only 2 nodes in the cluster and set minimum_master_nodes to 2, the cluster will expect both nodes to be up in order to serve the various search services.
If one node goes down in your 2-node cluster (which has minimum_master_nodes set to 2), the cluster will effectively go down.
First, this setting helps prevent split brains, the existence of two masters in a single cluster.
If you have two nodes, a setting of 1 will allow your cluster to function, but doesn't protect against split brain. It is best to have a minimum of three nodes in situations like this.
I stumbled on the question of how many masters there can be in a three-node cluster. I came across a point in an article on the internet saying that search and index requests should not be sent to the elected master. Is that correct? So, if I have three nodes acting as masters (out of which one is the elected master), should I point incoming logs to be indexed and searched to the other master nodes, apart from the elected master? Please clarify. Thanks in advance.
In a three-node cluster, all nodes most likely hold data and are master-eligible. That is the simplest situation, in which you don't have to worry about anything else.
If you have a larger cluster, you can have a couple of nodes which are configured as dedicated master nodes. That is, they are master-eligible and they don't hold any data. For example you would have 3 dedicated master nodes and 7 data nodes (not master-eligible). Exactly one of the dedicated master nodes will always be the elected master.
The point is that since the dedicated master nodes don't hold data, they will not directly service index and search requests. If you send an index or search request to them, they have no choice but to delegate it to one of the 7 data nodes.
From the Elasticsearch Reference for Modules - Node:
dedicated master nodes are nodes with the settings node.data: false
and node.master: true. We actively promote the use of dedicated master
nodes in critical clusters to make sure that there are 3 dedicated
nodes whose only role is to be master, a lightweight operational
(cluster management) responsibility. By reducing the amount of
resource intensive work that these nodes do (in other words, do not
send index or search requests to these dedicated master nodes), we
greatly reduce the chance of cluster instability.
A related question is how many master-eligible nodes there should be in a cluster. The answer essentially is: at least 3, in order to prevent split brain (a situation in which, due to a network error, two masters are elected simultaneously).
The Elasticsearch Guide has a section on Minimum Master Nodes, an excerpt:
When you have a split brain, your cluster is at danger of losing data.
Because the master is considered the supreme ruler of the cluster, it
decides when new indices can be created, how shards are moved, and so
forth. If you have two masters, data integrity becomes perilous, since
you have two nodes that think they are in charge.
This setting tells Elasticsearch to not elect a master unless there
are enough master-eligible nodes available. Only then will an election
take place.
This setting should always be configured to a quorum (majority) of
your master-eligible nodes. A quorum is (number of master-eligible
nodes / 2) + 1. Here are some examples:
If you have ten regular nodes (can hold data, can become master), a
quorum is 6.
If you have three dedicated master nodes and a hundred data nodes, the quorum is 2, since you need to count only nodes that are master eligible.
If you have two regular nodes, you are in a conundrum. A quorum would be 2, but this means a loss of one node will
make your cluster inoperable. A setting of 1 will allow your cluster
to function, but doesn’t protect against split brain. It is best to
have a minimum of three nodes in situations like this.
I have a 3 node cluster with minimum_master_nodes set to 2. If I shut down all nodes except the master, leaving one node online, the cluster is no longer operational.
Is this by design? It seems like the node that was the master should remain operational; instead I get errors like this:
{"error":"MasterNotDiscoveredException[waited for [30s]]","status":503}
All the other settings are stock and I am using the aws cloud plugin.
Yes, this is intentional.
Split brain
Imagine a situation where the other 2 nodes were still running but couldn't communicate with the third node - you'd end up with two clusters, otherwise known as a "split brain".
As the two clusters could be updating and deleting data independently of each other, recovery would be very difficult - you wouldn't have a single source of truth for the data.
By setting minimum_master_nodes to (n/2)+1 (where n is the number of master-eligible nodes) you can prevent a split brain.
Single Node
If you know that the first two nodes have definitely died and are not coming back, you can set minimum_master_nodes to 1 on the remaining node (and also set it to 1 on the other nodes before you restart them).
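A sketch of changing it on the running node without a restart, via the cluster settings API (pre-7.x versions, where this setting still exists and is dynamic):

PUT /_cluster/settings
{
  "transient": {
    "discovery.zen.minimum_master_nodes": 1
  }
}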
There is also a "no master block" option that lets you control what happens when you don't have a valid cluster - e.g. you could make the remaining node read-only until the cluster is re-established.