How to add dedicated master node to existing elasticsearch cluster - elasticsearch

We have a 6-node Elasticsearch 6.4 cluster. Three of the nodes are master-eligible and act as both master and data nodes.
We are considering adding 3 dedicated master nodes, because the 3 master/data nodes sometimes show high resource utilization and we worry that one of them might crash during working hours.
We are looking for a procedure to add 3 new dedicated master servers to the existing cluster and to turn the current 3 master/data nodes into data-only nodes.

We found a procedure for this at the link below.
https://discuss.elastic.co/t/introduction-of-dedicated-master-nodes/43601
We followed these steps from the post (except disabling the HTTP port):
shut down the cluster
modify the existing 5 nodes to set master: false and data: true
configure the 3 new nodes with master: true and data: false
modify all nodes to discover the cluster using the addresses of the 3 new master nodes
optionally, disable the HTTP port on the master nodes so that they do not receive REST requests
start the cluster
We are still in the experimental stage, so a full cluster restart is not an issue for us. However, the linked thread also discusses how to add dedicated masters dynamically and avoid split-brain issues.
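The role changes in the steps above can be sketched as a small Python helper that renders the relevant elasticsearch.yml lines for each node type. This is only an illustration under assumptions: the host names are hypothetical, the setting names are the Elasticsearch 6.x ones (node.master, node.data, discovery.zen.ping.unicast.hosts, discovery.zen.minimum_master_nodes), and the list of master hosts is assumed to contain exactly the master-eligible nodes. Setting minimum_master_nodes to a quorum of the master-eligible nodes is the usual guard against split brain in 6.x.

```python
def minimum_master_nodes(master_eligible):
    """Quorum of master-eligible nodes: floor(n / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def node_settings(is_master, is_data, master_hosts):
    """Build the role and discovery settings (6.x names) for one node."""
    return {
        "node.master": is_master,
        "node.data": is_data,
        "discovery.zen.ping.unicast.hosts": list(master_hosts),
        "discovery.zen.minimum_master_nodes": minimum_master_nodes(len(master_hosts)),
    }


def render_yaml(settings):
    """Render the settings dict as flat elasticsearch.yml lines."""
    lines = []
    for key, value in settings.items():
        if isinstance(value, bool):
            text = "true" if value else "false"
        elif isinstance(value, list):
            text = "[" + ", ".join('"%s"' % v for v in value) + "]"
        else:
            text = str(value)
        lines.append("%s: %s" % (key, text))
    return "\n".join(lines)


# Hypothetical host names for the 3 new dedicated masters.
MASTER_HOSTS = ["master-1", "master-2", "master-3"]

dedicated_master = node_settings(True, False, MASTER_HOSTS)
data_only = node_settings(False, True, MASTER_HOSTS)

print(render_yaml(dedicated_master))
print(render_yaml(data_only))
```

The two rendered fragments differ only in node.master and node.data; every node points at the same three master addresses, as described in the steps.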

Related

Can you run an elasticsearch data node after deleting the data folder?

I am running a three-node Elasticsearch (ELK) cluster. All nodes have the same roles (data, master, etc.). The disk on node 3 that holds the data folder became corrupt, and the data is probably unrecoverable. The other nodes are running normally, and one of them has taken over the master role.
Will the cluster work normally if I replace the disk and make the empty directory available to Elasticsearch again, or am I risking crashing the whole cluster?
EDIT: As this is not explicitly mentioned in the answer, yes, if you add your node with an empty data folder, the cluster will continue normally as if you added a new node to the cluster, but you have to deal with the missing data. In my case, I lost the data as I do not have replicas.
Let me try to explain that in a simple way.
Your data got corrupted on node-3, so if you add that node again it will not have the older data, i.e. the shards that were stored on node-3 will remain unavailable to the cluster.
Did you have replica shards configured for the indices?
What is the current status (yellow/red) of the cluster with node-3 removed?
If a primary shard isn't available, the master node promotes one of the active replicas to become the new primary. If there are currently no active replicas, the status of the cluster will remain red.
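The promotion rule described above can be illustrated with a toy Python model. It does not talk to Elasticsearch; the shard dictionaries and function names are hypothetical and only mimic how the green/yellow/red status follows from which shard copies are active.

```python
def lose_primary(shard):
    """Simulate losing the node that held this shard's primary copy."""
    if shard["replicas_active"] > 0:
        # An active replica is promoted, leaving one fewer replica.
        return dict(shard, primary_active=True,
                    replicas_active=shard["replicas_active"] - 1)
    # No replica to promote: the primary stays unassigned.
    return dict(shard, primary_active=False)


def health(shards):
    """Derive a cluster-style status from a list of shard states."""
    if any(not s["primary_active"] for s in shards):
        return "red"      # at least one primary is missing
    if any(s["replicas_active"] < s["replicas_configured"] for s in shards):
        return "yellow"   # all primaries active, some replicas missing
    return "green"


with_replica = {"primary_active": True, "replicas_active": 1, "replicas_configured": 1}
without_replica = {"primary_active": True, "replicas_active": 0, "replicas_configured": 0}

print(health([lose_primary(with_replica)]))     # replica promoted
print(health([lose_primary(without_replica)]))  # nothing to promote
```

With one replica the status drops to yellow until a new replica is allocated; with no replicas it goes red, which matches the situation in the question's EDIT.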

How to know total nodes in an elasticsearch cluster?

I have a 3-node Elasticsearch cluster. If one or more nodes go down, I can easily check them manually. But as the number of nodes in the cluster grows, checking them manually will become difficult. So how can I get all the nodes of the cluster (specifically the node names), even the ones that are down?
To get the live/healthy nodes I hit this API endpoint:
curl -X GET "hostname/ip:port/_cat/nodes?v&pretty"
Is there an endpoint that returns the total nodes and the unhealthy/down nodes of an Elasticsearch cluster?
I was trying to list all the nodes using discovery.seed_hosts from the elasticsearch.yml config file, but I don't know how to do it or whether it is the right approach.
I don't think there is any API that reports offline nodes. Whether your entire cluster or a single node is down, Elasticsearch doesn't provide a way to check the health of a node that is not running. You need to depend on an external script, code, or monitoring tool that pings all your nodes and prints their status.
You can write a custom script that calls the API below, which returns all the nodes currently available in the cluster. Once you have the response, you can extract the IP or hostname of each node; any node that is missing from the response can be considered down.
GET _cat/nodes?format=json&filter_path=ip,name
Another option is to enable cluster monitoring, which gives you the status of the entire cluster, but again it only shows information about running nodes.
Please check this answer for how Kibana shows offline nodes in Cluster Monitoring.
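A minimal sketch of the custom-script approach from the answer, in Python with only the standard library. The expected node names and the base URL are assumptions you would supply yourself (for example from your inventory or config management). The script uses the _cat/nodes h parameter to select the ip and name columns.

```python
import json
from urllib.request import urlopen


def fetch_live_node_names(base_url, timeout=5.0):
    """Return the names of the nodes currently in the cluster."""
    url = base_url + "/_cat/nodes?format=json&h=ip,name"
    with urlopen(url, timeout=timeout) as resp:
        rows = json.load(resp)
    return {row["name"] for row in rows}


def find_down_nodes(expected, live):
    """Expected nodes that are missing from the live set, sorted by name."""
    live_set = set(live)
    return sorted(name for name in expected if name not in live_set)


# Hypothetical inventory of node names that should be in the cluster.
EXPECTED_NODES = ["node-1", "node-2", "node-3"]
```

Usage would look like find_down_nodes(EXPECTED_NODES, fetch_live_node_names("http://localhost:9200")), where the URL is a placeholder for any reachable node. If no node is reachable at all, urlopen raises an exception, which the calling script should treat as "whole cluster down".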

How to configure 3 new instances as dedicated master nodes in a running cluster where all nodes are master and data nodes (Elasticsearch)?

Context:
We have an Elasticsearch cluster with 10 nodes that are all configured as master: true and data: true.
Due to the characteristics of our infrastructure, all the nodes of a cluster (a cluster of virtual machines in this case) take their configuration from a GitHub repository. In other words, every node has the same configuration.
Of these 10 nodes configured as master: true, data: true, 3 are configured as master eligible.
Steps we performed:
We turned off the 10 nodes that were being used by the cluster (all master: true and data: true).
We changed the configuration of the old nodes (let's call that group of virtual machines elastic-data) to data: true, and of the new nodes (let's call that new group of virtual machines elastic-master) to master: true.
We set the new masters as master eligible in both configurations (elastic-master and elastic-data nodes).
We restarted the app.
Problem we found:
The cluster started normally. Queries for cluster administration tasks (listing nodes, looking up configuration, etc.) were very fast, whereas with the previous configuration the cluster did not respond most of the time. However, when we tried to query the data we got: cannot allocate because all found copies of the shard are either stale or corrupt.
After hours of trying to recover from that state, we decided to roll back the configuration. As a result, the cluster continues to be unstable.
A step-by-step guide on how to do this without leaving the cluster in the described state would be highly appreciated.

Why Druid segments become unavailable after data ingestion

The Druid cluster shows certain segments of a data source as unavailable after data ingestion.
Ex: 72.4% available (2352 segments, 647 segments unavailable)
We have a clustered deployment with 3 nodes:
Master node (coordinator and overlord)
Data node (historical and middlemanager)
Query node (broker and router)
Is there any specific reason why this is happening?
The issue was resolved after a clean restart of the master and data nodes. However, just restarting the nodes without cleaning the data did not work.

Adding a cluster to an existing Elasticsearch in ELK

Currently I have an existing:
1. Elasticsearch
2. Logstash
3. Kibana
I have existing data on them.
Now I have set up an ELK cluster with 3 master nodes, 5 data nodes, and 3 client nodes.
But I am not sure how to get the existing data into it.
Is it possible to make the existing ES node a data node and attach it to the cluster? Will the data then be replicated to the other data nodes, so that I can take that node offline afterwards?
Option 1
How about trying it with fewer nodes first? It is not hard to test whether this works: set up one node, feed it some data, then add one more node, configure the two as a cluster, and see whether the data gets synchronized.
Option 2
Another option is to use an Elasticsearch migration tool like https://github.com/taskrabbit/elasticsearch-dump. Basically, you set up a clean cluster and migrate all the data from the old node to this cluster.
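As an alternative to an external tool, the same "clean cluster, then migrate" idea can be sketched with Elasticsearch's own reindex-from-remote feature. This is a different technique from elasticsearch-dump, not what the answer proposes. The host names and index names below are hypothetical, and the old node's address has to be listed in reindex.remote.whitelist in the new cluster's elasticsearch.yml before the request is accepted.

```python
import json
from urllib.request import Request, urlopen


def build_reindex_body(remote_host, source_index, dest_index):
    """Body for POST _reindex that pulls an index from a remote node."""
    return {
        "source": {"remote": {"host": remote_host}, "index": source_index},
        "dest": {"index": dest_index},
    }


def start_reindex(new_cluster_url, body, timeout=60.0):
    """Send the reindex request to a node of the NEW cluster."""
    req = Request(
        new_cluster_url + "/_reindex",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


# Hypothetical old node and index names.
body = build_reindex_body("http://old-es-node:9200", "logs-old", "logs-old")
print(json.dumps(body, indent=2))
```

Calling start_reindex("http://new-cluster-node:9200", body) would run the copy, one index per request. The URL is a placeholder for any node of the new cluster.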

Resources