Rejoin separated data node to cluster - elasticsearch

I have 3 Elasticsearch nodes, all of them acting as master-eligible data nodes.
Due to a connectivity issue, one node left the cluster and promoted itself as master. Now I have two clusters: the first one with two nodes and the other with one node. As all the nodes were behind a load balancer, all of them were receiving requests from Logstash. What will happen if I restart the single-node cluster and try to add it back to the original cluster?

The problem that you are encountering is called the split brain problem.
Here is a description of it:
The problem comes in when a node falls down or there's simply a lapse
in communication between nodes for some reason. If one of the slave
nodes cannot communicate with the master node, it initiates the
election of a new master node from those it's still connected with.
That new master node then will take over the duties of the previous
master node. If the older master node rejoins the cluster or
communication is restored, the new master node will demote it to a
slave so there's no conflict. For the most part, this process is
seamless and "just works."
However, consider a scenario where you have just two nodes: one master
and one slave. If communication between the two is disrupted, the
slave will be promoted to a master, but once communication is
restored, you end up with two master nodes. The original master node
thinks the slave dropped and should rejoin as a slave, while the new
master thinks the original master dropped and should rejoin as a
slave. Your cluster, therefore, is said to have a split brain.
Reference link: https://qbox.io/blog/split-brain-problem-elasticsearch
To avoid this problem, add this to the elasticsearch.yml file on your master-eligible nodes: discovery.zen.minimum_master_nodes: 2
The formula for this setting is a majority of the master-eligible nodes: (total number of master-eligible nodes / 2) + 1. With 3 master-eligible nodes that works out to 3 / 2 + 1 = 2 (integer division), so a single isolated node can never elect itself master.
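As a minimal sketch, assuming a pre-7.x cluster and placeholder cluster/host names (not taken from the question), the relevant part of elasticsearch.yml on each of the 3 master-eligible data nodes would look like this:
# elasticsearch.yml on each of the 3 master-eligible data nodes (zen discovery, pre-7.x)
cluster.name: my-cluster                                          # placeholder cluster name
node.master: true                                                 # node may be elected master
node.data: true                                                   # node also holds data
discovery.zen.ping.unicast.hosts: ["host-1", "host-2", "host-3"]  # placeholder host names
discovery.zen.minimum_master_nodes: 2                             # quorum: (3 / 2) + 1 = 2
With this in place the isolated node can no longer elect itself master, so it never forms a second cluster; once connectivity is back and the node is restarted, it should rediscover the two-node side and rejoin it as a non-master node.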

Related

elasticsearch 7.X cluster with specified master node

I have 3 Elasticsearch nodes. How can I cluster these three nodes so that the same node is always master? I didn't find any good docs about the new Elasticsearch 7 way of specifying discovery and master nodes:
discovery.seed_hosts: [ ]
cluster.initial_master_nodes: []
For example, I have nodes a, b and c, and I want node a to be the master. What should discovery.seed_hosts and cluster.initial_master_nodes be for the master node and for the other nodes?
UPDATE
Using Daniel's answer, and after checking that the ports are open and the nodes have the same cluster name, the other nodes still didn't join the cluster. Is there any additional config needed?
UPDATE 2
It looks like the nodes found each other, but for some reason they can't elect a master node:
master not discovered or elected yet, an election requires 2 nodes
with ids [wOZEfOs9TvqGWIHHcKXtkQ, Cs0xaF-BSBGMGB8a-swznA]
Solution
Deleting the data folder on all nodes, starting one node, and then adding the other nodes with the first node (as master) as the seed host.
Elasticsearch allows you to specify the role of a node. A node (an instance of Elasticsearch) can serve as a coordinating node, master node, voting_only node, data node, ingest node or machine learning node.
With respect to master nodes you can only configure which nodes potentially can become the (active) master, but you cannot specify which one of the so-called master-eligible nodes will be the active master node.
The only exception to this is when you only configure one master-eligible node, then obviously only this one can become the active master. But be aware that in order to get true high availability you need to have at least 3 master-eligible nodes (this ensures that your cluster will still be 100% operational even when losing one of the master-eligible nodes).
Therefore Elastic recommends configuring 3 or 5 nodes in your cluster as master-eligible nodes. You can configure that role via the node.master property in the elasticsearch.yml file. Setting it to true (the default) allows that node to become master, while false ensures that this node will never become master and will not participate in master elections.
Over the lifetime of your cluster, (master-eligible) nodes might get added and removed. Elasticsearch automatically manages your cluster and the master election process, with the ultimate goal of preventing a split brain scenario, i.e. ending up with 2 clusters that go by the same name but have independent master nodes. To prevent that from happening when starting up your cluster for the very first time (bootstrapping your cluster), Elastic requires you to configure the cluster.initial_master_nodes property with the names of the nodes that will initially serve as master-eligible nodes. This property only needs to be configured on nodes that are master-eligible, and the setting is only considered for the very first startup of your cluster. As values, you put in the names configured via the node.name property of your master-eligible nodes.
The discovery.seed_hosts property supports the discovery process, which is all about enabling a new node to establish communication with an already existing cluster and eventually join it when the cluster.name matches. You are supposed to configure it with an array of host names (not node names!) on which you expect other instances of Elasticsearch belonging to the same cluster to be running. You don't need to add all 100 host names of the 100 nodes you may have in your cluster; it's sufficient to list the host names of the most stable nodes there. As master-eligible nodes are supposed to be very stable nodes, Elastic recommends putting the host names of all master-eligible nodes (typically 3) in there. Whenever you start or restart a node, it goes through this discovery process.
Conclusion
With a cluster made up of 3 nodes, you would configure all of them as master-eligible nodes, list the 3 node names in the cluster.initial_master_nodes setting, and also put all 3 host names in the discovery.seed_hosts setting to support the discovery process.
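As a minimal sketch for the 3-node example from the question (node names a, b and c come from the question, the host names are placeholders):
# elasticsearch.yml on node a (b and c are identical apart from node.name)
cluster.name: my-cluster                              # must match on all 3 nodes
node.name: a                                          # b / c on the other nodes
node.master: true                                     # all 3 nodes are master-eligible
node.data: true
discovery.seed_hosts: ["host-a", "host-b", "host-c"]  # host names, not node names
cluster.initial_master_nodes: ["a", "b", "c"]         # node names, only used when bootstrapping
Keep in mind that cluster.initial_master_nodes is only honoured the very first time a brand-new cluster starts up; once the cluster has formed, the setting is ignored and can be removed.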
Useful information from the Elasticsearch reference:
Important discovery and cluster formation settings
Discovery and cluster formation settings
Bootstrapping a cluster

Can a two-node (one on each box) cluster provide uninterrupted services if one box is down?

In my production environment, I have a two-node cluster (ES 2.2.0) and each node sits on a different physical box. Inside elasticsearch.yml, I have the following:
discovery.zen.minimum_master_nodes: 2
My question is: if one box is down, can the other node continue to function normally and provide uninterrupted search services (index and search, write and read)?
If you have two nodes and each is master-eligible, then with discovery.zen.minimum_master_nodes left at 1, a network partition between the two nodes leads to a split brain, because each node will elect itself as a master.
However, with a setting of 2, you have two possible situations:
if the non-master goes down, the other node will continue to function properly (since it is already master)
if the master goes down, the other won't be able to elect itself as the master (since it will wait for a second master-eligible node to be visible).
For this reason, with only two nodes, you need to choose between the possibility of a split brain (with minimum_master_nodes: 1) and a cluster that stops accepting requests when the master is lost (with minimum_master_nodes: 2). The best way to overcome this is to include a third master-only node, and then minimum_master_nodes: 2 makes sense.
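A sketch of such a third, master-only node, assuming ES 2.x zen discovery and placeholder cluster/host names:
# elasticsearch.yml on the dedicated, master-only tie-breaker node
cluster.name: my-cluster
node.master: true                                               # master-eligible
node.data: false                                                # holds no data, so a small machine is enough
discovery.zen.ping.unicast.hosts: ["box-1", "box-2", "box-3"]   # placeholder host names
discovery.zen.minimum_master_nodes: 2                           # quorum of 3 master-eligible nodes
With three master-eligible nodes and minimum_master_nodes: 2, losing any single box still leaves a quorum, so the surviving nodes can keep or elect a master and continue serving reads and writes.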
Just try it out:
Start your cluster, bring down the master node, what happens?
Start your cluster, bring down the non-master node, what happens?
The purpose of minimum master nodes is to maintain the stability of the cluster.
If you have only 2 nodes in the cluster and minimum_master_nodes is set to 2, the cluster expects 2 master-eligible nodes to be up in order to serve requests.
If one node goes down in such a 2-node cluster, the cluster effectively goes down with it.
This setting primarily helps prevent split brain, i.e. the existence of two masters in a single cluster.
If you have two nodes, a setting of 1 will allow your cluster to keep functioning, but it doesn't protect against split brain. It is best to have a minimum of three master-eligible nodes in situations like this.

What is the role of the elected master in Elasticsearch (ELK Stack)?

What is the main purpose of the elected master in ELK? Suppose the elected master has only node.master enabled, node.data disabled, and is not supposed to take any search or indexing requests.
I have a 3-node cluster in which 1 node is the elected master. I have Kibana as the front-end UI for querying data and Logstash sending data to the cluster for indexing (for real-time log analysis). Is it a good idea to send search/indexing requests to the other 2 master-eligible nodes apart from the elected master, or to select one node for searching and another for indexing, leaving the elected master untouched? Please advise.
Please suggest which would be the best plan: Plan A, Plan B, or Plan C.
The elected master is a node elected from among the master-eligible nodes, and its function is to maintain the cluster state. The cluster state is the data that holds information about the entire cluster: which nodes are present, which indices exist, which shards live on which nodes, and so on. Although only the master is allowed to make changes to the cluster state, every node keeps a copy of it. This removes the single dependency on the master and makes it possible for any master-eligible node to become master.
Since the master's workload is light, it doesn't make sense to have dedicated master nodes unless you have a very large number of nodes.
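For a small 3-node cluster like the one in the question, a common setup (a sketch with placeholder names) is to keep all nodes master-eligible and data-holding and send Kibana and Logstash traffic to any of them:
# elasticsearch.yml, identical on all 3 nodes apart from node.name
cluster.name: elk-cluster   # placeholder name
node.name: node-1           # node-2 / node-3 on the others
node.master: true           # every node is master-eligible; one of them gets elected
node.data: true             # every node holds data and can serve search and index requests
Any of the three nodes can take search and indexing requests; the elected master carries a little extra cluster-management work, but at this scale that overhead is negligible.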

How many masters in a three-node cluster

I was stumped by the question of how many masters there can be in a three-node cluster. I came across a point in an article on the internet saying that search and index requests should not be sent to the elected master. Is that correct? So, if I have three nodes acting as master-eligible nodes (out of which one is the elected master), should I point incoming logs for indexing and searching to the other master-eligible nodes, apart from the elected master? Please clarify. Thanks in advance.
In a three-node cluster, all nodes most likely hold data and are master-eligible. That is the simplest situation, in which you don't have to worry about anything else.
If you have a larger cluster, you can have a couple of nodes which are configured as dedicated master nodes. That is, they are master-eligible and they don't hold any data. For example you would have 3 dedicated master nodes and 7 data nodes (not master-eligible). Exactly one of the dedicated master nodes will always be the elected master.
The point is that since the dedicated master nodes don't hold data, they will not directly service index and search requests. If you send an index or search request to them, they have no choice but to delegate it to one of the 7 data nodes.
From the Elasticsearch Reference for Modules - Node:
dedicated master nodes are nodes with the settings node.data: false
and node.master: true. We actively promote the use of dedicated master
nodes in critical clusters to make sure that there are 3 dedicated
nodes whose only role is to be master, a lightweight operational
(cluster management) responsibility. By reducing the amount of
resource intensive work that these nodes do (in other words, do not
send index or search requests to these dedicated master nodes), we
greatly reduce the chance of cluster instability.
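As a sketch of the 3 dedicated masters + 7 data nodes example (pre-7.x settings; values are illustrative, not from the reference):
# elasticsearch.yml on each of the 3 dedicated master nodes
node.master: true
node.data: false
discovery.zen.minimum_master_nodes: 2   # quorum of 3 master-eligible nodes

# elasticsearch.yml on each of the 7 data nodes
node.master: false
node.data: true
discovery.zen.minimum_master_nodes: 2   # same value on every node
Index and search traffic is sent only to the data nodes; the dedicated masters are left alone to manage the cluster.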
A related question is how many master-eligible nodes there should be in a cluster. The answer essentially is at least 3, in order to prevent split brain (a situation in which, due to a network error, two masters are elected simultaneously).
The Elasticsearch Guide has a section on Minimum Master Nodes, an excerpt:
When you have a split brain, your cluster is at danger of losing data.
Because the master is considered the supreme ruler of the cluster, it
decides when new indices can be created, how shards are moved, and so
forth. If you have two masters, data integrity becomes perilous, since
you have two nodes that think they are in charge.
This setting tells Elasticsearch to not elect a master unless there
are enough master-eligible nodes available. Only then will an election
take place.
This setting should always be configured to a quorum (majority) of
your master-eligible nodes. A quorum is (number of master-eligible
nodes / 2) + 1. Here are some examples:
If you have ten regular nodes (can hold data, can become master), a
quorum is 6.
If you have three dedicated master nodes and a hundred data nodes, the quorum is 2, since you need to count only nodes that are master eligible.
If you have two regular nodes, you are in a conundrum. A quorum would be 2, but this means a loss of one node will
make your cluster inoperable. A setting of 1 will allow your cluster
to function, but doesn’t protect against split brain. It is best to
have a minimum of three nodes in situations like this.
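Applied to the examples in the excerpt, the setting itself is a one-liner in elasticsearch.yml (pre-7.x; in Elasticsearch 7.x and later the setting is ignored because quorum handling became automatic):
# ten regular nodes: quorum = (10 / 2) + 1 = 6
discovery.zen.minimum_master_nodes: 6

# three dedicated masters plus a hundred data nodes: quorum = (3 / 2) + 1 = 2
discovery.zen.minimum_master_nodes: 2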

Elasticsearch minimum master nodes

I have a 3-node cluster with minimum_master_nodes set to 2. If I shut down all nodes except the master, leaving one node online, the cluster is no longer operational.
Is this by design? It seems like the node that was the master should remain operational, but instead I get errors like this:
{"error":"MasterNotDiscoveredException[waited for [30s]]","status":503}
All the other settings are stock and I am using the aws cloud plugin.
Yes, this is intentional.
Split brain
Imagine a situation where the other 2 nodes were still running but couldn't communicate with the third node - you'd end up with two clusters, otherwise known as a "split brain".
As the two clusters could be updating and deleting data independently of each other, recovery would be very difficult - you wouldn't have a single source of truth for the data.
By setting minimum_master_nodes to (n/2)+1 (where n is the number of master-eligible nodes) you can prevent a split brain.
Single Node
If you know that the first two nodes have definitely died and are not coming back, you can set minimum_master_nodes to 1 on the remaining node (and also set it to 1 on the other nodes before you restart them).
There is also the discovery.zen.no_master_block option, which lets you control what happens when you don't have a valid master - e.g. you could make the remaining node read-only until the cluster is re-established.
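A sketch of those two options in elasticsearch.yml (ES 2.x zen discovery; the values shown are assumptions for the degraded, single-node scenario described above):
# option 1: you know the other two nodes are gone for good
discovery.zen.minimum_master_nodes: 1   # the remaining node may elect itself master

# option 2: keep the quorum requirement, but stay read-only while there is no master
discovery.zen.no_master_block: write    # reject writes, keep serving reads ("write" is the default)
If you go with option 1, remember to raise minimum_master_nodes back to 2 once the full 3-node cluster is restored, otherwise you reopen the door to split brain.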
