Elasticsearch: observer: timeout notification from cluster service

I have an Elasticsearch cluster with 3 master/data nodes, one dedicated client node, and a Logstash instance sending events to the cluster via the client node.
The client node is not able to connect to the cluster, and I see the errors below in its log:
[2015-10-24 00:18:29,657][DEBUG][action.admin.indices.create] [ESClient] observer: timeout notification from cluster service. timeout setting [1m], time since start [1m]
[2015-10-24 00:18:30,743][DEBUG][action.admin.indices.create] [ESClient] no known master node, scheduling a retry
I have gone through this answer, but it does not work for me. My master/data Elasticsearch config looks like this:
cluster.name: elasticsearch
node.name: "ESMasterData1"
node.master: true
node.data: true
index.number_of_shards: 7
index.number_of_replicas: 1
bootstrap.mlockall: true
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["es-master3:9300", "kibana:9300", "es-master2:9300", "es-master1:9300"]
cloud.aws.access_key: AK
cloud.aws.secret_key: J0
The ES client config looks like this:
cluster.name: elasticsearch
node.name: "ESClient"
node.master: false
node.data: false
index.number_of_shards: 7
index.number_of_replicas: 1
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["es-master1:9300", "es-master2:9300", "es-master3:9300", "kibana:9300"]
bootstrap.mlockall: true
cloud.aws.access_key: AK
cloud.aws.secret_key: J0
The nodes have the standard configuration, such as JVM heap set to 30GB and mlockall set to true.
The Logstash output looks like this:
elasticsearch {
index => "j-%{env}-%{app}-%{iver}-%{[@metadata][app_log_time]}"
cluster => "elasticsearch"
host => "kibana"
port => "9300"
protocol => "transport"
}
Telnet works fine from the ES client node to the ES master/data nodes on port 9300. All three ES master/data nodes are also able to talk to each other. I have also verified with iperf that TCP and UDP are enabled between the client and the master/data machines.
I am using Elasticsearch version 1.7.1 on Debian 7.
Can someone tell me what is going wrong, or how I can debug this?

Related

Elasticsearch Cluster - No known master node, scheduling a retry

I have a server running Elasticsearch and Kibana. I have added a second node to form a cluster, but I only want that second node to replicate data from the master node.
Based on the limited documentation on how to do this, I am running into an issue on the second node with the following error:
[DEBUG][action.admin.indices.get ] [Match] no known master node, scheduling a retry
I am unable to determine the best configuration for both servers to achieve this but this is what I have done so far:
Master Node Config:
cluster.name: elasticsearch
node.master: true
path.data: /local00/elasticsearch/
path.work: /local00/el_temp/
network.host: 0.0.0.0
http.port: 9200
script.disable_dynamic: true
Node 2
cluster.name: elasticsearch
node.master: false
node.data: true
index.number_of_shards: 5
index.number_of_replicas: 1
path.data: /local00/elasticsearch/
path.work: /local00/el_temp/
network.host: 0.0.0.0
http.port: 9200
script.disable_dynamic: true
I am assuming I am missing additional config somewhere. Any help will be much appreciated.
I got it working with the following changes, taken from the answer to "How to set up ES cluster?":
Node 1:
cluster.name: mycluster
node.name: "node1"
node.master: true
node.data: true
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["node1.example.com"]
Node 2:
cluster.name: mycluster
node.name: "node2"
node.master: false
node.data: true
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["node1.example.com"]
If you are trying to connect an additional node to an existing ES cluster, make sure that this node has the same ES plugins as the other nodes. If it does not, the node cannot fully join the cluster (other nodes may show it as connected, but HTTP commands cannot be run on it) and fails with errors like:
[2016-07-21 11:56:59,564][DEBUG][action.admin.cluster.health] [dev-marvel1] no known master node, scheduling a retry
[2016-07-21 11:57:05,313][INFO ][rest.suppressed ] /_cluster/health Params: {pretty=true}
MasterNotDiscoveredException[waited for [30s]]
at org.elasticsearch.action.support.master.TransportMasterNodeAction$4.onTimeout(TransportMasterNodeAction.java:154)
at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:239)
at org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(InternalClusterService.java:574)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
In my case the problem was with the license plugin. After removing it, everything worked.
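A mismatch like this is easier to spot when the plugin lists of all nodes are compared side by side. The sketch below is illustrative only: the function name and the sample data are made up, and it assumes you have already collected each node's plugin names yourself (for example from the output of `_cat/plugins`).

```python
def plugin_mismatches(plugins_by_node):
    """Map each node to the plugins it lacks compared with the rest of the cluster."""
    # Union of every plugin seen on any node.
    all_plugins = set()
    for plugins in plugins_by_node.values():
        all_plugins.update(plugins)

    mismatches = {}
    for node, plugins in plugins_by_node.items():
        missing = sorted(all_plugins - set(plugins))
        if missing:
            mismatches[node] = missing
    return mismatches


# Hypothetical sample data: one node carries an extra plugin.
sample = {
    "node1": ["marvel-agent"],
    "dev-marvel1": ["license", "marvel-agent"],
}
print(plugin_mismatches(sample))
```

An empty result means every node reports the same plugin set; any entry names the node and the plugins it is missing relative to the others.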
If you found this because you are working on your local machine, the quickest solution is to free up hung nodes by killing their processes individually. I found these processes using:
ps -alx | grep elastic
kill -9 {pid}

Elasticsearch cluster initialization

I just set up a 3-node Elasticsearch cluster, with each node having common settings (pasted at the end of the post).
However, when I start my master node, and try to get the cluster status or even check if any one of the nodes is up, I get a 503 as the status code. Also, shutdowns (on any of the nodes) do not work.
Could someone please tell me what I'm doing wrong here? The log file on Node 1 says:
[ESNode1] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
Here are snippets from the elasticsearch.yml config files:
Node 1
cluster.name: myCluster
node.name: ESNode1
node.master: true
node.data: true
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.timeout: 20s # just for good measure
discovery.zen.ping.multicast.enabled: false
Node 2
cluster.name: myCluster
node.name: ESNode2
node.master: true
node.data: true
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.timeout: 20s
discovery.zen.ping.multicast.enabled: false
Node 3
cluster.name: myCluster
node.name: ESNode3
node.master: false
node.data: true
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.timeout: 20s
discovery.zen.ping.multicast.enabled: false
Thank you!
You have set the minimum number of master nodes to 2. This means a master can only be elected when at least two master-eligible nodes can see each other. That is fine; however, combined with the setting discovery.zen.ping.multicast.enabled: false, it is hard to get working. Disabling multicast means the nodes will not look for each other automatically, so you should list the nodes manually using the discovery.zen.ping.unicast.hosts setting.
You can find more information here:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-discovery-zen.html#unicast
An example for three nodes running on one machine:
discovery.zen.ping.unicast.hosts: ["127.0.0.1:9300","127.0.0.1:9301","127.0.0.1:9302"]
Disabling multicast discovery means discovery pings will only be sent to specific addresses. The addresses/hosts are those specified in discovery.zen.ping.unicast.hosts.
Note that a single address can be specified. When a node joins it becomes aware of all nodes in the cluster, and can start communicating with them directly.
To clarify using Jettros' example:
discovery.zen.ping.unicast.hosts: ["127.0.0.1:9300"]
will cause the nodes bound to 9301 and 9302 to ping only 9300.
If 9301 joins first it 'already knows' all other nodes in the cluster (just 9300).
If 9302 subsequently joins it will become aware of 9301 and vice-versa.
If 9301 and 9302 cannot join 9300, the cluster will not be formed.
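The join behaviour described above can be sketched as a toy model. This is only an illustration under simplifying assumptions, not how Elasticsearch implements discovery: the function name is made up, nodes never retry, and quorum settings such as minimum_master_nodes are ignored.

```python
def simulate_joins(seed_hosts, start_order):
    """Toy model of unicast discovery: return the nodes that end up in the cluster.

    A node joins if a seed host is already in the cluster; a node that is
    itself a seed host can start the cluster alone. Quorum settings such as
    minimum_master_nodes are deliberately ignored, and nodes do not retry.
    """
    cluster = []
    for node in start_order:
        seed_is_up = any(seed in cluster for seed in seed_hosts)
        if seed_is_up or node in seed_hosts:
            cluster.append(node)
    return cluster


seeds = ["127.0.0.1:9300"]
# All three nodes start, seed first: everyone ends up in the cluster.
print(simulate_joins(seeds, ["127.0.0.1:9300", "127.0.0.1:9301", "127.0.0.1:9302"]))
# The seed never starts: the other two have nobody to ping.
print(simulate_joins(seeds, ["127.0.0.1:9301", "127.0.0.1:9302"]))
```

In this model the second call returns an empty list, which mirrors the last point above: without 9300, the nodes on 9301 and 9302 never find each other.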

How could I set up a ES cluster?

I have a master node whose IP is 192.168.1.101 and a non-master node whose IP is 192.168.1.106. Both use the same version, Elasticsearch 1.2.0.
But after I started the master node and the non-master node, I got the following message:
[2014-06-04 02:38:49,350][INFO ][discovery.zen ] [node2] failed to send join request to master [[node1][TxZ5wuhnT1awPC1gEjYPdw][flyers-MacBook-Air.local][inet[/192.168.1.101:9300]]{master=true}], reason [org.elasticsearch.ElasticsearchTimeoutException: Timeout waiting for task.]
Config of the master node:
cluster.name: mycluster
node.name: "node1"
node.master: true
node.data: true
index.number_of_shards: 5
index.number_of_replicas: 1
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["192.168.1.101"]
Config of the non-master node:
cluster.name: mycluster
node.name: "node2"
node.master: false
node.data: true
index.number_of_shards: 5
index.number_of_replicas: 1
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["192.168.1.101"]
I don't know why this exception happens. Please give me some tips. Thanks in advance.
After I set network.bind_host, network.publish_host, and network.host to the node's own IP, it worked. Very strange.
I had the same issue until I found out that my ES node did not bind to eth0 as expected but to eth2 instead. Of course this could not work because the registration response from the master node could not be sent to the IP address of my other network.
I was able to fix this behaviour by setting the following parameter in my elasticsearch.yml (on the server which was not able to join the cluster)
network.publish_host: "_eth0:ipv4_"
It would be better to change ["192.168.1.101"] to ["192.168.1.101", "192.168.1.106"] in both configurations.
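Since the answers above point at nodes publishing an address the other side cannot reach, a quick check of the transport port in both directions can help narrow things down. The following is a minimal sketch using only the Python standard library; the function name is made up, and it only tests whether a TCP connection can be opened, not whether Elasticsearch accepts the join.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For the setup in this question one would, for instance, call can_connect("192.168.1.101", 9300) on the non-master machine and can_connect("192.168.1.106", 9300) on the master machine. A failure in either direction suggests a binding or firewall problem rather than an Elasticsearch setting.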

Trying to replicate across two remote servers

I have two Elasticsearch instances on different VPSs, each with its own host name, and I haven't been able to get them to replicate to each other. Both are version 0.90.2.
My settings are:
cluster.name: mycluster
node.name: "nodeA"
node.master: true
node.data: true
index.number_of_shards: 5
index.number_of_replicas: 1
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["nodeB.example.com"]
and:
cluster.name: mycluster
node.name: "nodeB"
node.master: false
node.data: true
index.number_of_shards: 5
index.number_of_replicas: 1
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["nodeA.example.com"]
When I attempt to start the nodeB instance the result is:
[INFO ][discovery.zen] [nodeB]
failed to send join request to master
[[nodeA][IZFgtrTPSISQR7VklS3www][inet[/*.*.*.*:9300]]{master=true}],
reason [org.elasticsearch.ElasticSearchTimeoutException:
Timeout waiting for task.]
So communication is there, because nodeB knows the other node's name is 'nodeA', but what am I missing, or what can I check?
UPDATE:
Unfortunately this turned out to be a server issue on nodeB and had nothing to do with ES.
For the record, the above settings work fine, and the unicast setting on nodeA is pointless: since nodeB is not a master, only nodeB needs to know about nodeA.
UPDATED:
Set the following in both config files:
discovery.zen.ping.unicast.hosts: ["master_node_ip"]
If the settings provided above are all of your configuration, then you should also add a cluster name to your settings.
cluster.name: my_cool_cluster_name

How to set up ES cluster?

Assume I have 5 machines that I want to run an Elasticsearch cluster on, all connected to a shared drive. I put a single copy of Elasticsearch onto that shared drive so all of them can see it. Do I just start Elasticsearch from that shared drive on all of my machines, and the clustering automatically works its magic? Or do I have to configure specific settings to get Elasticsearch to realize that it is running on 5 machines? If so, what are the relevant settings? Should I worry about configuring replicas, or is that handled automatically?
It's super easy.
You'll need each machine to have its own copy of Elasticsearch (simply copy the one you have now). The reason is that each machine (node) keeps its own files, which are sharded across the cluster.
The only thing you really need to do is edit the config file to include the name of the cluster.
If all machines have the same cluster name, Elasticsearch will do the rest automatically (as long as the machines are all on the same network).
Read here to get you started:
https://www.elastic.co/guide/en/elasticsearch/guide/current/deploy.html
When you create indexes (where the data goes) you define at that time how many replicas you want (they'll be distributed around the cluster)
It is usually handled automatically.
If autodiscovery doesn't work, edit the Elasticsearch config file to enable unicast discovery:
Node 1:
cluster.name: mycluster
node.name: "node1"
node.master: true
node.data: true
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["node1.example.com"]
Node 2:
cluster.name: mycluster
node.name: "node2"
node.master: false
node.data: true
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["node1.example.com"]
and so on for nodes 3, 4, and 5. Make node 1 the master, and the rest data-only nodes.
Edit: Please note that, by ES convention, a cluster with N master-eligible nodes needs a quorum of N/2+1 of them for master election and fail-over (this is the value to use for discovery.zen.minimum_master_nodes). Master-eligible nodes may or may not also be data nodes.
Also, if auto-discovery doesn't work, the most probable reason is that the network doesn't allow it (so it is disabled). If too many auto-discovery pings take place across multiple servers, the resources needed to manage those pings can prevent other services from running correctly.
For example, think of a 10,000-node cluster with all 10,000 nodes sending auto-discovery pings.
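The N/2+1 rule mentioned above is normally read as a quorum over master-eligible nodes, using integer division. A small sketch of that arithmetic follows; the function name is made up for illustration.

```python
def minimum_master_nodes(master_eligible_nodes):
    """Quorum size (N // 2 + 1) for N master-eligible nodes."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


for n in (1, 2, 3, 5):
    print(n, minimum_master_nodes(n))
```

With 3 master-eligible nodes the quorum is 2, so the cluster tolerates losing one of them. With only 2 the quorum is also 2, so losing either node leaves the cluster without a master.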
Elasticsearch 7 changed the configuration for cluster initialisation.
It is important to note that ES instances communicate internally over the transport layer (TCP), not over the HTTP protocol that is normally used to perform operations on indices. Below is a sample config for a 2-machine cluster.
Machine 1 config:
cluster.name: cluster-new
node.name: node-1
node.master: true
node.data: true
bootstrap.memory_lock: true
network.host: 0.0.0.0
http.port: 9200
transport.host: 102.123.322.211
transport.tcp.port: 9300
discovery.seed_hosts: ["102.123.322.211:9300", "102.123.322.212:9300"]
cluster.initial_master_nodes:
- "node-1"
- "node-2"
Machine 2 config:
cluster.name: cluster-new
node.name: node-2
node.master: true
node.data: true
bootstrap.memory_lock: true
network.host: 0.0.0.0
http.port: 9200
transport.host: 102.123.322.212
transport.tcp.port: 9300
discovery.seed_hosts: ["102.123.322.211:9300", "102.123.322.212:9300"]
cluster.initial_master_nodes:
- "node-1"
- "node-2"
cluster.name: This has to be the same across all machines that are going to be part of the cluster.
node.name: Identifier for the ES instance. Defaults to the machine name if not given.
node.master: Specifies whether this ES instance is master-eligible.
node.data: Specifies whether this ES instance is a data node (holds data).
bootstrap.memory_lock: Disables swapping. You can start the cluster without setting this flag, but it is recommended to set the lock. More info: https://www.elastic.co/guide/en/elasticsearch/reference/master/setup-configuration-memory.html
network.host: Set to 0.0.0.0 if you want to expose the ES instance over the network. 0.0.0.0 is different from 127.0.0.1 (aka localhost or the loopback address).
It means all IPv4 addresses on the machine. If the machine has multiple IP addresses and a server listening on 0.0.0.0, a client can reach the server through any of those IPv4 addresses.
http.port: The port on which this ES instance listens for HTTP requests.
transport.host: The IPv4 address of the host (this is used to communicate with other ES instances running on different machines). More info: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-transport.html
transport.tcp.port: 9300 (the port on which the machine accepts TCP transport connections).
discovery.seed_hosts: This was changed in recent versions. List the IPv4 addresses, with TCP port (important), of the ES instances that are going to be part of this cluster. This should be the same across all ES instances in the cluster.
cluster.initial_master_nodes: The node names (node.name) of the ES instances that are going to participate in master election (quorum-based decision making: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-discovery-quorums.html#modules-discovery-quorums).
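Because these settings have to agree across machines, a small consistency check over the per-node settings can catch typos before startup. The sketch below is an assumption-laden illustration: the function name is made up, and it expects each node's settings as a plain Python dict (how you load them from elasticsearch.yml is left out).

```python
def check_bootstrap_config(configs):
    """Return a list of human-readable problems found across node configs."""
    problems = []
    if not configs:
        return problems
    node_names = {cfg["node.name"] for cfg in configs}
    reference = configs[0]  # every other node is compared against the first one
    for cfg in configs:
        name = cfg["node.name"]
        if cfg["cluster.name"] != reference["cluster.name"]:
            problems.append(f"{name}: cluster.name differs from {reference['node.name']}")
        if sorted(cfg["discovery.seed_hosts"]) != sorted(reference["discovery.seed_hosts"]):
            problems.append(f"{name}: discovery.seed_hosts differs from {reference['node.name']}")
        unknown = sorted(set(cfg["cluster.initial_master_nodes"]) - node_names)
        if unknown:
            problems.append(f"{name}: cluster.initial_master_nodes lists unknown nodes {unknown}")
    return problems


# The two sample machines from above, reduced to the settings being checked.
seed_hosts = ["102.123.322.211:9300", "102.123.322.212:9300"]
machines = [
    {"cluster.name": "cluster-new", "node.name": "node-1",
     "discovery.seed_hosts": seed_hosts,
     "cluster.initial_master_nodes": ["node-1", "node-2"]},
    {"cluster.name": "cluster-new", "node.name": "node-2",
     "discovery.seed_hosts": seed_hosts,
     "cluster.initial_master_nodes": ["node-1", "node-2"]},
]
print(check_bootstrap_config(machines))
```

An empty list means the checked settings agree. A misspelled entry in cluster.initial_master_nodes, for example "node2" instead of "node-2", would be reported for the node whose config contains it.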
I tried the steps that @KannarKK suggested on ES 2.0.2; however, I could not get the cluster up and running. I eventually figured out that, since I had set the TCP port number on the master, discovery.zen.ping.unicast.hosts in the slave configuration needs the master's TCP port number along with its IP address for discovery. With the following configuration it works for me.
Node 1
cluster.name: mycluster
node.name: "node1"
node.master: true
node.data: true
http.port: 9200
transport.tcp.port: 9300
discovery.zen.ping.multicast.enabled: false
# I think unicast.host on master is redundant.
discovery.zen.ping.unicast.hosts: ["node1.example.com"]
Node 2
cluster.name: mycluster
node.name: "node2"
node.master: false
node.data: true
http.port: 9201
transport.tcp.port: 9301
discovery.zen.ping.multicast.enabled: false
# The port number of Node 1
discovery.zen.ping.unicast.hosts: ["node1.example.com:9300"]
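One way to avoid guessing which port a host entry resolves to is to write the port out for every entry. The helper below is a hypothetical sketch, not part of Elasticsearch: it assumes 9300 as the default transport port and handles only host names and IPv4 addresses (an IPv6 literal would be mistaken for an entry that already has a port).

```python
def with_transport_port(hosts, default_port=9300):
    """Append default_port to entries that do not already name a port."""
    normalized = []
    for host in hosts:
        if ":" in host:
            # Already in host:port form, keep as-is.
            normalized.append(host)
        else:
            normalized.append(f"{host}:{default_port}")
    return normalized


print(with_transport_port(["node1.example.com", "node1.example.com:9301"]))
```

The resulting list can then be pasted into discovery.zen.ping.unicast.hosts so that every entry states its transport port explicitly.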
