Elasticsearch Unicast Weird Behavior in Clustering

I have two nodes, each of which forms its own cluster (each with a single empty node):
0.0.0.0:9200 (elasticsearch)
0.0.0.0:9201 (test-1)
The node at 9200 is in cluster elasticsearch (probably the default cluster.name). The node at 9201 is in cluster test-1. (Additionally, whether it matters or not, I bind network.host of both nodes to 0.0.0.0.)
I want to join a new node to test-1. When I leave the discovery.zen.ping.unicast.hosts setting commented out, the new node successfully joins test-1. However, when I set it to something else, e.g. ["0.0.0.0"] or ["127.0.1"], it fails to join...
Joining a new node to elasticsearch causes no problem: ["0.0.0.0"], ["127.0.1"] and ["IP"] all worked well. (But ["0.0.0.0", "ANOTHER-IP"] failed... Please answer about this as well if possible...)
What causes this joining issue? Has anybody experienced problems like this?

discovery.zen.ping.unicast.hosts should contain the IPs of all the nodes joining the cluster. Do this for all the nodes in the cluster, and use real IPs, not 0.0.0.0 or 127.0.0.1.
As your new node is trying to join the test-1 cluster, you can also try changing the port of the new node to 9201 and see if it joins.
The minimal things required to form a cluster:
Same cluster.name
Put different node.name
discovery.zen.ping.unicast.hosts - IPs of all the nodes in the cluster.
gateway.recover_after_nodes and discovery.zen.minimum_master_nodes - comment these lines out (if they are not already) for all the nodes of the cluster.
Lastly, check your firewall settings and disable the firewall if necessary. Check that the nodes can talk to each other.
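For reference, here is a minimal sketch of what the new node's elasticsearch.yml could look like, assuming the existing test-1 node is reachable at the hypothetical address 192.168.1.10 and listens on the default transport port 9300:
# elasticsearch.yml for the new node (addresses are hypothetical)
cluster.name: test-1                 # must match the existing cluster
node.name: test-1-node-2             # any name not already used in the cluster
network.host: 192.168.1.11           # this node's own IP, not 0.0.0.0
# Unicast discovery: point at the node(s) already in test-1
discovery.zen.ping.unicast.hosts: ["192.168.1.10:9300"]
If both of your original nodes run on the same machine (which 0.0.0.0:9200 and 0.0.0.0:9201 suggests), the test-1 node is probably bound to transport port 9301 rather than 9300, so the unicast host entry would need that port (e.g. 127.0.0.1:9301); a wrong transport port is a common reason such a join fails.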

How to know total nodes in an elasticsearch cluster?

I have a 3-node Elasticsearch cluster. If one or more nodes go down, I can easily check them manually. But suppose the number of nodes in the cluster grows; then it becomes difficult to check them manually. So, how can I get all the nodes (specifically the names of the nodes) of the cluster, even the ones that are down?
To get the live/healthy nodes I hit the API endpoint:
curl -X GET "hostname/ip:port/_cat/nodes?v&pretty"
Is there any endpoint I can use to get the total nodes and the unhealthy/down nodes in an Elasticsearch cluster?
I was trying to list all the nodes using discovery.seed_hosts in the elasticsearch.yml config file, but I don't know how to do it, or whether it is even the right approach.
I don't think there is any API to know about offline nodes. If your entire cluster is down, or a single node is down, Elasticsearch doesn't provide any way to check that node's health. You need to rely on an external script, a piece of code, or a monitoring tool that pings all your nodes and reports their status.
You can write a custom script which calls the API below; it returns all the nodes that are currently part of the cluster. Once you have the response, you can extract the IP or hostname of each node, and any node that does not appear in the response can be considered down.
GET _cat/nodes?format=json&filter_path=ip,name
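For instance, a minimal shell sketch along these lines (the expected node list and the localhost:9200 address are assumptions for illustration):
#!/bin/sh
# Hypothetical list of the nodes that should be in the cluster
EXPECTED="node-1 node-2 node-3"
# Names of the nodes the cluster can currently see
LIVE=$(curl -s "http://localhost:9200/_cat/nodes?h=name")
for node in $EXPECTED; do
  # Any expected node missing from the live list is treated as down
  echo "$LIVE" | grep -qw "$node" || echo "$node appears to be DOWN"
done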
Another option is to enable cluster monitoring, which will give you the status of the entire cluster, but again it only shows information about running nodes.
Please check this answer for how Kibana shows offline nodes in Cluster Monitoring.

Elasticsearch cluster in AWS ECS

I'm trying to create an Elasticsearch cluster in AWS ECS but I'm getting the warning "message": "master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes". My elasticsearch.yml and task definition are the same for all nodes. How can I differentiate between the master and the other nodes? Should I have a separate elasticsearch.yml/task definition for the master node?
My elasticsearch.yml :
cluster.name: "xxxxxxxxxxx"
bootstrap.memory_lock: false
network.host: 0.0.0.0
network.publish_host: _ec2:privateIp_
transport.publish_host: _ec2:privateIp_
discovery.seed_providers: ec2
discovery.ec2.tag.project: xxxxxxx-elasticsearch
discovery.ec2.endpoint: ec2.${REGION}.amazonaws.com
s3.client.default.endpoint: s3.${REGION}.amazonaws.com
cloud.node.auto_attributes: true
cluster.routing.allocation.awareness.attributes: aws_availability_zone
xpack.security.enabled: false
I have faced a similar problem as well. First, you need to create an initial cluster and make it ready to form a cluster. It is possible to start by using an initial-node configuration in elasticsearch.yml. The setup I am using is one ECS instance running a single Elasticsearch Docker container (as Elasticsearch requires a good amount of memory):
cluster.initial_master_nodes: '<<INITIAL_NODE_IPADDRESS>>'
The configuration above kick-starts the cluster, meaning Elasticsearch is ready for nodes to join. In the next step, add the configuration below:
cluster.initial_master_nodes: [<<MASTER_NODE_IPADDRESS>>,<<INITIAL_NODE_IPADDRESS>>]
discovery.seed_hosts: [<<MASTER_NODE_IPADDRESS>>,<<INITIAL_NODE_IPADDRESS>>]
Then you can add as many data nodes as you want; this depends on how much data you have.
Note: the IP addresses belong to different nodes, so use the AWS SSM Parameter Store to store the IPs securely, and use an entrypoint.sh to fetch them and update the elasticsearch.yml file dynamically when the container starts.
I hope this solves the problem.
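As an illustration, a rough entrypoint.sh sketch along these lines (the SSM parameter names /es/master-ip and /es/initial-ip, and the placeholder tokens in elasticsearch.yml, are hypothetical; the final exec assumes the official Elasticsearch image's entrypoint):
#!/bin/sh
# Fetch the node IPs stored in AWS SSM Parameter Store (hypothetical parameter names)
MASTER_IP=$(aws ssm get-parameter --name /es/master-ip --query 'Parameter.Value' --output text)
INITIAL_IP=$(aws ssm get-parameter --name /es/initial-ip --query 'Parameter.Value' --output text)
# Substitute the placeholders in elasticsearch.yml before starting Elasticsearch
sed -i "s/__MASTER_IP__/$MASTER_IP/g; s/__INITIAL_IP__/$INITIAL_IP/g" /usr/share/elasticsearch/config/elasticsearch.yml
# Hand off to the image's normal entrypoint
exec /usr/local/bin/docker-entrypoint.sh "$@"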

What does discovery.seed_hosts and cluster.initial_master_nodes mean in ES

I am using ES 7.10.1, and I am reading https://www.elastic.co/guide/en/elasticsearch/reference/current/important-settings.html#unicast.hosts.
I have 5 master nodes (node.master: true, node.data: false) and 20 data nodes (node.master: false, node.data: true).
I have the following four questions:
1. Should both discovery.seed_hosts and cluster.initial_master_nodes be specified with the master nodes? I mean, could I specify data nodes in these two configurations?
2. Since I have 5 master nodes in my case, how many nodes should I specify in these two settings? I assume I don't have to list all 5 of them?
3. It looks to me like discovery.seed_hosts is the equivalent of the old discovery.zen.ping.unicast.hosts?
4. It looks to me like cluster.initial_master_nodes is the equivalent of the old discovery.zen.minimum_master_nodes?
Thanks!
1. It is recommended to include at least 3 master nodes in these settings for fault tolerance.
2. I would put all 5 to avoid confusion.
3. Yes, it is; the legacy setting will still work, though.
4. Yes, it is; the legacy setting will still work, though.
Remember that cluster.initial_master_nodes is only used once, when the cluster is bootstrapped, and is ignored afterwards; it is recommended to remove it from the configuration after bootstrapping.
Cluster bootstrapping
Cluster Settings
In short, discovery.seed_hosts is the list of master-eligible nodes a new node uses to join the cluster, and cluster.initial_master_nodes is the initial list used to bootstrap the cluster.
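To make the distinction concrete, here is a hedged sketch of what the two settings might look like on the nodes of such a cluster (the host names master-1 ... master-5 are hypothetical):
# elasticsearch.yml (hypothetical host names)
# Addresses a starting node contacts to discover the rest of the cluster:
discovery.seed_hosts: ["master-1", "master-2", "master-3", "master-4", "master-5"]
# Only consulted the very first time a brand-new cluster is bootstrapped, then ignored:
cluster.initial_master_nodes: ["master-1", "master-2", "master-3", "master-4", "master-5"]
Note that cluster.initial_master_nodes matches against node.name values, while discovery.seed_hosts takes host addresses, so the two lists only coincide like this when the node names equal the host names.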

How to check/make sure of Elasticsearch load balancer?

Using NEST, ASP.NET Core 3.1 and Elasticsearch, I have created a 3-node cluster with default roles.
How can I check that the queries/search queries are balanced between my local machines?
I tried to monitor the metrics of each server/node while indexing a large amount of data, and I saw that only the nodes holding the related primary and replica shards were engaged during the indexing process.
But I need to check and make sure that the requests are balanced/divided between my nodes in a round-robin manner, and I do not know how to verify that. Is there any way or any tool to make sure that, for example, the first search query engages node-1 and the second search query engages node-3?
Any hint, keyword or help is appreciated.
My elasticsearch.yml configuration for each node (all 3 nodes are configured similarly):
cluster.name: my-cluster
node.name: node-1
network.host: 192.168.254.137
http.port: 9200
discovery.seed_hosts: ["192.168.254.137", "192.168.254.135", "192.168.254.136"]
cluster.initial_master_nodes: ["192.168.254.137", "192.168.254.135", "192.168.254.136"]
My index is distributed as below:
index shard prirep state docs store ip node
suggestionindex 0 p STARTED 2000 170.5kb 192.168.254.136 node-3
suggestionindex 0 r STARTED 2000 90.5kb 192.168.254.137 node-1
My appsettings.json :
"ElasticsearchSettings": {
// IP of one of the 3 master eligible nodes
"uri": "http://192.168.254.137:9200/",
"basicAuthUser": "",
"basicAuthPassword": ""
},
Are all the search queries always sent to the primary shard (node-1)? Or are the search queries balanced between node-1 and node-3 in my case?
If it is balanced, how can I check it?
Who balances it between the nodes? NEST or my master node?
Elasticsearch internally load-balances queries across all the data nodes, so you don't have to do anything on your side. If you are on Elasticsearch 7.x, it uses a smart load-balancing technique called adaptive replica selection; before that, it was based on a round-robin technique by default.
The Elastic blog post I mentioned has all the details of how it works.
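One way to check where searches are actually being executed (a sketch, assuming the cluster is reachable at localhost:9200) is to compare each node's search counter before and after running some queries:
# Per-node count of search queries executed so far; run it, issue a few
# searches, run it again, and compare the deltas across nodes.
curl -s "http://localhost:9200/_nodes/stats/indices/search?filter_path=nodes.*.name,nodes.*.indices.search.query_total&pretty"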

Remove of data folder is not synced in Elasticsearch upon index delete

We have an ES cluster with 2 nodes. When we delete an index, not all folders in the cluster (on the filesystem) are deleted, which causes some problems when restarting one server.
Then our deleted indices get redistributed in some weird state, and the cluster health is no longer green.
Example: we delete an index named someIndex, and after deletion we check the filesystem and see this:
Node1
ElasticSearch\data\clustername\nodes\0\indices\
ElasticSearch\data\clustername\nodes\1\indices\
Node2
ElasticSearch\data\clustername\nodes\0\indices\
ElasticSearch\data\clustername\nodes\1\indices\someIndex (<-- still present)
Anyone know what's causing this?
ES-version: 0.90.5
There are two node directories on each of your machines (these are nodes\0 and nodes\1).
When you start Elasticsearch, you start up a node (in ES lingo). Your machine can host multiple nodes, which happens if you start Elasticsearch multiple times. The default setting for the HTTP port is the range 9200-9300; that means ES looks for a free port in that range and binds the node to it (the same is true for the transport module with 9300-9400).
So, if you start an ES process while another is still running, i.e. still bound to a port, you start a second node, and ES will create a new directory for it. Maybe this happened when you issued a restart but ES couldn't shut down in time before the new node started up.
But now you have a third node in your cluster, and ES will assign shards to it. Then you do a cluster restart or something similar and start one node on each of your machines. ES cannot find the shards that were assigned to the third node, because it isn't running, and it will show you a red or yellow state, depending on which shards live on the third node. If you delete your index data, you won't delete the data from this missing node.
If you don't care about the data, you can just shut down ES and delete these directories, or start two ES nodes on each of your machines and then delete the index again.
Then you could change the port settings to one specific port; that would prevent a second process from starting up, since it wouldn't be able to bind to a free port.
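A sketch of the corresponding elasticsearch.yml settings (setting names as used by the 0.90.x line):
# Pin the ports to a single value instead of a range
http.port: 9200
transport.tcp.port: 9300
Later releases also have node.max_local_storage_nodes, which limits how many nodes may share one data path, though I am not certain it is available as far back as 0.90.x.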
