I am new to ES. I want to add some new nodes to the same cluster via the elasticsearch.yml file, but I get this error when I check the cluster health
(GET _cluster/health):
"error": "MasterNotDiscoveredException[waited for [30s]]",
"status": 503
These are my config files, one per node:
======nodeA======
cluster.name: mycluster6
node.name: "nodeA"
node.master: true
node.data: true
discovery.zen.ping.multicast.enabled: false
======nodeB======
cluster.name: mycluster6
node.name: "nodeB"
node.master: false
node.data: true
discovery.zen.ping.multicast.enabled: false
======nodeC======
cluster.name: mycluster6
node.name: "nodeC"
node.master: false
node.data: true
discovery.zen.ping.multicast.enabled: false
======nodeD======
cluster.name: mycluster6
node.name: "nodeD"
node.master: false
node.data: true
discovery.zen.ping.multicast.enabled: false
discovery.zen.minimum_master_nodes: 3
In the console:
java.lang.NullPointerException
at org.elasticsearch.marvel.agent.Utils.extractHostsFromHttpServer(Utils.java:90)
at org.elasticsearch.marvel.agent.exporter.ESExporter.openAndValidateConnection(ESExporter.java:344)
at org.elasticsearch.marvel.agent.exporter.ESExporter.openExportingConnection(ESExporter.java:212)
at org.elasticsearch.marvel.agent.exporter.ESExporter.exportXContent(ESExporter.java:275)
at org.elasticsearch.marvel.agent.exporter.ESExporter.exportNodeStats(ESExporter.java:173)
at org.elasticsearch.marvel.agent.AgentService$ExportingWorker.exportNodeStats(AgentService.java:305)
at org.elasticsearch.marvel.agent.AgentService$ExportingWorker.run(AgentService.java:225)
at java.lang.Thread.run(Thread.java:745)
[2015-05-05 14:00:58,053][TRACE][discovery.zen ] [nodeD] full ping responses: {none}
[2015-05-05 14:00:58,053][DEBUG][discovery.zen ] [nodeD] filtered ping responses: (filter_client[true], filter_data[false]) {none}
[2015-05-05 14:00:58,053][TRACE][discovery.zen ] [nodeD] not enough master nodes [[]]
[2015-05-05 14:00:58,053][TRACE][discovery.zen ] [nodeD] starting to ping
[2015-05-05 14:01:01,065][TRACE][discovery.zen ] [nodeD] full ping responses: {none}
[2015-05-05 14:01:01,065][DEBUG][discovery.zen ] [nodeD] filtered ping responses: (filter_client[true], filter_data[false]) {none}
[2015-05-05 14:01:01,065][TRACE][discovery.zen ] [nodeD] not enough master nodes [[]]
[2015-05-05 14:01:01,065][TRACE][discovery.zen ] [nodeD] starting to ping
[2015-05-05 14:01:04,079][TRACE][discovery.zen ] [nodeD] full ping responses: {none}
[2015-05-05 14:01:04,079][DEBUG][discovery.zen ] [nodeD] filtered ping responses: (filter_client[true], filter_data[false]) {none}
[2015-05-05 14:01:04,079][TRACE][discovery.zen ] [nodeD] not enough master nodes [[]]
[2015-05-05 14:01:04,079][TRACE][discovery.zen ] [nodeD] starting to ping
[2015-05-05 14:01:05,795][ERROR][marvel.agent ] [nodeD] exporter [es_exporter] has thrown an exception:
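Two things in the configs above stand out, and either alone would keep a master from being discovered. First, multicast pinging is disabled but no discovery.zen.ping.unicast.hosts list is set, so the nodes have no way to find each other (hence the empty "full ping responses: {none}" in the log). Second, discovery.zen.minimum_master_nodes: 3 can never be satisfied, because only nodeA is master-eligible. A minimal sketch of a fix, using a placeholder for nodeA's real address:
# on every node (nodeA..nodeD); <nodeA-address> is a placeholder
discovery.zen.ping.unicast.hosts: ["<nodeA-address>:9300"]
# only one master-eligible node exists, so the quorum must be 1
discovery.zen.minimum_master_nodes: 1
The marvel exporter NullPointerException is most likely a side effect of the node not having joined a cluster, not the root cause.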
Related
Now I have two nodes (192.168.72.129, 192.168.72.130).
These are the settings in config/elasticsearch.yml:
======node-1======
cluster.name: cluster-es
node.name: node-1
network.host: 0.0.0.0
node.master: true
node.data: true
http.port: 9200
http.cors.allow-origin: "*"
http.cors.enabled: true
transport.port: 9300
http.max_content_length: 200mb
cluster.initial_master_nodes: ["node-1"]
discovery.seed_hosts: ["192.168.72.129","192.168.72.130"]
gateway.recover_after_nodes: 2
network.tcp.keep_alive: true
network.tcp.no_delay: true
transport.tcp.compress: true
cluster.routing.allocation.cluster_concurrent_rebalance: 16
cluster.routing.allocation.node_concurrent_recoveries: 16
cluster.routing.allocation.node_initial_primaries_recoveries: 16
======node-2======
cluster.name: cluster-es
node.name: node-2
network.host: 0.0.0.0
node.master: true
node.data: true
http.port: 9200
http.cors.allow-origin: "*"
http.cors.enabled: true
transport.port: 9300
http.max_content_length: 200mb
cluster.initial_master_nodes: ["node-1"]
discovery.seed_hosts: ["192.168.72.129","192.168.72.130"]
gateway.recover_after_nodes: 2
network.tcp.keep_alive: true
network.tcp.no_delay: true
transport.tcp.compress: true
cluster.routing.allocation.cluster_concurrent_rebalance: 16
cluster.routing.allocation.node_concurrent_recoveries: 16
cluster.routing.allocation.node_initial_primaries_recoveries: 16
but when I curl http://192.168.72.129:9200/_cat/nodes,
only one node shows. How can I solve this?
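A quick way to narrow this down, sketched with plain curl: query both nodes' root endpoints and compare the cluster_uuid fields. If the two nodes report different cluster_uuid values, each one bootstrapped its own single-node cluster (for example, because both were started before they could reach each other over discovery.seed_hosts), and the stray node's data directory has to be wiped before it can join the other node:
curl http://192.168.72.129:9200/   # note the cluster_uuid
curl http://192.168.72.130:9200/   # if it differs, the nodes formed separate clusters
curl http://192.168.72.130:9200/_cat/nodes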
I have 2 servers and created an Elasticsearch node on each. The contents of the docker-compose.yml files are as follows:
es0:
  image: elasticsearch:7.6.0
  container_name: es0
  environment:
    - "ES_JAVA_OPTS=-Xms1024m -Xmx1024m"
  ulimits:
    memlock:
      soft: -1
      hard: -1
  ports:
    - 9200:9200
    - 9300:9300
  volumes:
    - "/mnt/docker/es0/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml"
    - "/mnt/docker/es0/data:/usr/share/elasticsearch/data"
    - "/mnt/docker/es0/plugins:/usr/share/elasticsearch/plugins"
    - "/mnt/docker/es0/config/cert:/usr/share/elasticsearch/config/cert"
es1:
  image: elasticsearch:7.6.0
  container_name: es1
  environment:
    - "ES_JAVA_OPTS=-Xms1024m -Xmx1024m"
  ulimits:
    memlock:
      soft: -1
      hard: -1
  ports:
    - 9200:9200
    - 9300:9300
  volumes:
    - "/mnt/docker/es1/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml"
    - "/mnt/docker/es1/data:/usr/share/elasticsearch/data"
    - "/mnt/docker/es1/plugins:/usr/share/elasticsearch/plugins"
    - "/mnt/docker/es1/config/cert:/usr/share/elasticsearch/config/cert"
and I configured elasticsearch.yml like this:
======es-00======
cluster.name: hs-cluster
node.name: es-00
node.master: true
node.data: true
http.host: 0.0.0.0
http.port: 9200
transport.host: 0.0.0.0
transport.tcp.port: 9300
#network.host: 0.0.0.0
network.bind_host: ["192.168.0.2", "101.xx.xx.136"]
network.publish_host: 192.168.0.2
gateway.recover_after_nodes: 1
http.cors.enabled: true
http.cors.allow-origin: "*"
cluster.initial_master_nodes: ["es-00", "es-01"]
discovery.seed_hosts: [ "192.168.0.2:9300", "192.168.0.3:9300" ]
bootstrap.memory_lock: true
bootstrap.system_call_filter: false
======es-01======
cluster.name: hs-cluster
node.name: es-01
node.master: true
node.data: true
http.host: 0.0.0.0
http.port: 9200
transport.host: 0.0.0.0
transport.tcp.port: 9300
#network.host: 0.0.0.0
network.bind_host: ["192.168.0.3", "101.xx.xx.137"]
network.publish_host: 192.168.0.3
gateway.recover_after_nodes: 1
http.cors.enabled: true
http.cors.allow-origin: "*"
cluster.initial_master_nodes: ["es-00", "es-01"]
discovery.seed_hosts: [ "192.168.0.2:9300", "192.168.0.3:9300" ]
bootstrap.memory_lock: true
bootstrap.system_call_filter: false
When I run the instances, they all start successfully. But when I call _cluster/state?pretty, they both return this error message:
{
  "error" : {
    "root_cause" : [
      {
        "type" : "master_not_discovered_exception",
        "reason" : null
      }
    ],
    "type" : "master_not_discovered_exception",
    "reason" : null
  },
  "status" : 503
}
That means they can't find each other.
I also tried setting network.host: 0.0.0.0,
but the result was the same.
Does anyone know the reason for this master_not_discovered_exception? How can I resolve it?
By the way, I can run the cluster on a single server with docker-compose, but across different servers it fails. I also ran telnet xxx 9300 on each server, and they all connected.
What is your default docker-engine network configuration?
Sometimes multiple servers end up on the same Docker bridge network range, so containers don't route from one server to another.
To resolve this, modify the daemon.json file on each host so the bridge IPs don't overlap:
node1
{
  "bip": "10.40.18.1/28"
}
node2
{
  "bip": "10.40.18.65/28"
}
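After editing daemon.json (normally /etc/docker/daemon.json), the Docker daemon has to be restarted for the new bridge subnet to take effect; a short sketch for systemd-based hosts:
sudo systemctl restart docker
docker network inspect bridge | grep Subnet   # verify the new bip is active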
I used this guide for SSL cert creation.
I'm trying to connect to a remote Elasticsearch cluster. Both clusters use SSL certificates signed by the same CA. Is this possible?
Local cluster:
cluster.name: client1
searchguard.enterprise_modules_enabled: false
node.name: ekl.test.com
node.master: true
node.data: true
node.ingest: true
network.host: 0.0.0.0
#http.host: 0.0.0.0
network.publish_host: ["ekl1.test1.com","ekl.test.com"]
http.port: 9200
discovery.zen.ping.unicast.hosts: ["ekl.test.com", "ekl2.test2.com"]
discovery.zen.minimum_master_nodes: 1
xpack.security.enabled: false
searchguard.ssl.transport.pemcert_filepath: '/etc/elasticsearch/ssl/node1.pem'
searchguard.ssl.transport.pemkey_filepath: 'ssl/node1.key'
searchguard.ssl.transport.pemtrustedcas_filepath: '/etc/elasticsearch/ssl/root-ca.pem'
searchguard.ssl.transport.enforce_hostname_verification: false
searchguard.ssl.transport.resolve_hostname: false
searchguard.ssl.http.enabled: true
searchguard.ssl.http.pemcert_filepath: '/etc/elasticsearch/ssl/node1_http.pem'
searchguard.ssl.http.pemkey_filepath: '/etc/elasticsearch/ssl/node1_http.key'
searchguard.ssl.http.pemtrustedcas_filepath: '/etc/elasticsearch/ssl/root-ca.pem'
searchguard.nodes_dn:
- CN=ekl.test.com,OU=Ops,O=BugBear BG\, Ltd.,DC=BugBear,DC=com
- CN=ekl1.test1.com,OU=Ops,O=BugBear BG\, Ltd.,DC=BugBear,DC=com
searchguard.authcz.admin_dn:
- CN=admin.test.com,OU=Ops,O=BugBear Com\, Inc.,DC=example,DC=com
Remote cluster:
cluster.name: client2
searchguard.enterprise_modules_enabled: false
node.name: ekl1.test.com
node.master: false
node.data: true
node.ingest: false
network.host: 0.0.0.0
#http.host: 0.0.0.0
network.publish_host: ["ekl.test.com","ekl1.test1.com"]
http.port: 9200
discovery.zen.ping.unicast.hosts: ["ekl6.test1.com", "ekl1.test1.com"]
discovery.zen.minimum_master_nodes: 1
xpack.security.enabled: false
searchguard.ssl.transport.pemcert_filepath: '/etc/elasticsearch/ssl/node2.pem'
searchguard.ssl.transport.pemkey_filepath: 'ssl/node2.key'
searchguard.ssl.transport.pemtrustedcas_filepath: '/etc/elasticsearch/ssl/root-ca.pem'
searchguard.ssl.transport.enforce_hostname_verification: false
searchguard.ssl.transport.resolve_hostname: false
searchguard.ssl.http.enabled: true
searchguard.ssl.http.pemcert_filepath: '/etc/elasticsearch/ssl/node2_http.pem'
searchguard.ssl.http.pemkey_filepath: '/etc/elasticsearch/ssl/node2_http.key'
searchguard.ssl.http.pemtrustedcas_filepath: '/etc/elasticsearch/ssl/root-ca.pem'
searchguard.nodes_dn:
- CN=ekl.test.com,OU=Ops,O=BugBear BG\, Ltd.,DC=BugBear,DC=com
- CN=ekl1.test1.com,OU=Ops,O=BugBear BG\, Ltd.,DC=BugBear,DC=com
searchguard.authcz.admin_dn:
- CN=admin.test.com,OU=Ops,O=BugBear Com\, Inc.,DC=example,DC=com
The certificates are self-signed.
I can curl the remote cluster from the local one:
curl -vX GET "https://admin:Pass@ekl1.test1.com:9200"
I added the remote domain in the Kibana GUI: ekl1.test1.com:9200
and am getting this error in the ES log:
[RemoteClusterConnection] [4P1fXFO] fetching nodes from external cluster [client2] failed
org.elasticsearch.transport.ConnectTransportException: [][172.31.37.123:9200] handshake_timeout[30s]
Solved by specifying port 9300 instead of 9200 in the Kibana interface,
and adding:
http.cors.enabled: true
http.cors.allow-origin: "*"
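For context, a remote cluster must be registered against the transport port (9300), not the HTTP port (9200), which is why the handshake timed out against 172.31.37.123:9200. As a sketch, the same registration can be done through the cluster settings API instead of the Kibana GUI (the setting is cluster.remote.* from 6.5 onward, search.remote.* on older 6.x):
curl -X PUT "https://admin:Pass@ekl.test.com:9200/_cluster/settings" \
  -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "cluster.remote.client2.seeds": ["ekl1.test1.com:9300"]
  }
}'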
I have a 1-data, 1-master ES cluster (using 6.4.2 on CentOS 7).
On my master01:
==> /opt/elasticsearch/logs/master01-elastic.my-local-domain-master01-elastic/esa-local-stg-cluster.log <==
[2019-02-08T11:06:21,267][INFO ][o.e.n.Node ] [master01-elastic] initialized
[2019-02-08T11:06:21,267][INFO ][o.e.n.Node ] [master01-elastic] starting ...
[2019-02-08T11:06:21,460][INFO ][o.e.t.TransportService ] [master01-elastic] publish_address {10.18.0.13:9300}, bound_addresses {10.18.0.13:9300}
[2019-02-08T11:06:21,478][INFO ][o.e.b.BootstrapChecks ] [master01-elastic] bound or publishing to a non-loopback address, enforcing bootstrap checks
[2019-02-08T11:06:24,543][INFO ][o.e.c.s.MasterService ] [master01-elastic] zen-disco-elected-as-master ([0] nodes joined)[, ], reason: new_master {master01-elastic}{10kX4tQMTzS0O8AQYvieZw}{GH9oflu7QZuJB_U7sPJDlg}{10.18.0.13}{10.18.0.13:9300}{xpack.installed=true}
[2019-02-08T11:06:24,550][INFO ][o.e.c.s.ClusterApplierService] [master01-elastic] new_master {master01-elastic}{10kX4tQMTzS0O8AQYvieZw}{GH9oflu7QZuJB_U7sPJDlg}{10.18.0.13}{10.18.0.13:9300}{xpack.installed=true}, reason: apply cluster state (from master [master {master01-elastic}{10kX4tQMTzS0O8AQYvieZw}{GH9oflu7QZuJB_U7sPJDlg}{10.18.0.13}{10.18.0.13:9300}{xpack.installed=true} committed version [1] source [zen-disco-elected-as-master ([0] nodes joined)[, ]]])
[2019-02-08T11:06:24,575][INFO ][o.e.h.n.Netty4HttpServerTransport] [master01-elastic] publish_address {10.18.0.13:9200}, bound_addresses {10.18.0.13:9200}
[2019-02-08T11:06:24,575][INFO ][o.e.n.Node ] [master01-elastic] started
[2019-02-08T11:06:24,614][INFO ][o.e.l.LicenseService ] [master01-elastic] license [c2004733-fa30-4249-bb07-d5f2238816ad] mode [basic] - valid
[2019-02-08T11:06:24,615][INFO ][o.e.g.GatewayService ] [master01-elastic] recovered [0] indices into cluster_state
[root@master01-elastic ~]# systemctl status elasticsearch
● master01-elastic_elasticsearch.service - Elasticsearch-master01-elastic
Loaded: loaded (/usr/lib/systemd/system/master01-elastic_elasticsearch.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2019-02-08 11:06:12 EST; 2 days ago
Docs: http://www.elastic.co
Main PID: 18695 (java)
CGroup: /system.slice/master01-elastic_elasticsearch.service
├─18695 /bin/java -Xms2g -Xmx2g -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+AlwaysPreTouch -server -Djava.awt.headless=true -Dfile.encoding...
└─18805 /usr/share/elasticsearch/modules/x-pack/x-pack-ml/platform/linux-x86_64/bin/controller
Feb 08 11:06:12 master01-elastic systemd[1]: Started Elasticsearch-master01-elastic.
[root@master01-elastic ~]# ss -tula | grep -i 9300
[root@master01-elastic ~]#
cluster logs on my master01:
[2019-02-11T02:36:21,406][INFO ][o.e.n.Node ] [master01-elastic] initialized
[2019-02-11T02:36:21,406][INFO ][o.e.n.Node ] [master01-elastic] starting ...
[2019-02-11T02:36:21,619][INFO ][o.e.t.TransportService ] [master01-elastic] publish_address {10.18.0.13:9300}, bound_addresses {10.18.0.13:9300}
[2019-02-11T02:36:21,654][INFO ][o.e.b.BootstrapChecks ] [master01-elastic] bound or publishing to a non-loopback address, enforcing bootstrap checks
[2019-02-11T02:36:24,813][INFO ][o.e.c.s.MasterService ] [master01-elastic] zen-disco-elected-as-master ([0] nodes joined)[, ], reason: new_master {master01-elastic}{10kX4tQMTzS0O8AQYvieZw}{Vgq60hVVRn-3aO_uBuc2uQ}{10.18.0.13}{10.18.0.13:9300}{xpack.installed=true}
[2019-02-11T02:36:24,818][INFO ][o.e.c.s.ClusterApplierService] [master01-elastic] new_master {master01-elastic}{10kX4tQMTzS0O8AQYvieZw}{Vgq60hVVRn-3aO_uBuc2uQ}{10.18.0.13}{10.18.0.13:9300}{xpack.installed=true}, reason: apply cluster state (from master [master {master01-elastic}{10kX4tQMTzS0O8AQYvieZw}{Vgq60hVVRn-3aO_uBuc2uQ}{10.18.0.13}{10.18.0.13:9300}{xpack.installed=true} committed version [1] source [zen-disco-elected-as-master ([0] nodes joined)[, ]]])
[2019-02-11T02:36:24,856][INFO ][o.e.h.n.Netty4HttpServerTransport] [master01-elastic] publish_address {10.18.0.13:9200}, bound_addresses {10.18.0.13:9200}
[2019-02-11T02:36:24,856][INFO ][o.e.n.Node ] [master01-elastic] started
[2019-02-11T02:36:24,873][INFO ][o.e.l.LicenseService ] [master01-elastic] license [c2004733-fa30-4249-bb07-d5f2238816ad] mode [basic] - valid
[2019-02-11T02:36:24,875][INFO ][o.e.g.GatewayService ] [master01-elastic] recovered [0] indices into cluster_state
This makes the master undiscoverable, so on my data01:
[2019-02-11T02:24:09,882][WARN ][o.e.d.z.ZenDiscovery ] [data01-elastic] not enough master nodes discovered during pinging (found [[]], but needed [1]), pinging again
Also on my data01:
[root@data01-elastic ~]# cat /etc/elasticsearch/data01-elastic/elasticsearch.yml | grep -i zen
discovery.zen.minimum_master_nodes: 1
discovery.zen.ping.unicast.hosts: 10.18.0.13:9300
[root@data01-elastic ~]# ping 10.18.0.13
PING 10.18.0.13 (10.18.0.13) 56(84) bytes of data.
64 bytes from 10.18.0.13: icmp_seq=1 ttl=64 time=0.171 ms
64 bytes from 10.18.0.13: icmp_seq=2 ttl=64 time=0.147 ms
How can I further troubleshoot this?
The cluster was deployed using these ansible scripts, with this configuration for the master:
- hosts: masters
  tasks:
    - name: Elasticsearch Master Configuration
      import_role:
        name: elastic.elasticsearch
      vars:
        es_instance_name: "{{ ansible_hostname }}"
        es_data_dirs:
          - "{{ data_dir }}"
        es_log_dir: "/opt/elasticsearch/logs"
        es_config:
          node.name: "{{ ansible_hostname }}"
          cluster.name: "{{ cluster_name }}"
          discovery.zen.ping.unicast.hosts: "{% for host in groups['masters'] -%}{{ hostvars[host]['ansible_ens33']['ipv4']['address'] }}:9300{% if not loop.last %},{% endif %}{%- endfor %}"
          http.port: 9200
          transport.tcp.port: 9300
          node.data: false
          node.master: true
          bootstrap.memory_lock: true
          network.host: '{{ ansible_facts["ens33"]["ipv4"]["address"] }}'
          discovery.zen.minimum_master_nodes: 1
        es_xpack_features: []
        es_scripts: false
        es_templates: false
        es_version_lock: true
        es_heap_size: 2g
        es_api_port: 9200
and this for the data:
- hosts: data
  tasks:
    - name: Elasticsearch Data Configuration
      import_role:
        name: elastic.elasticsearch
      vars:
        es_instance_name: "{{ ansible_hostname }}"
        es_data_dirs:
          - "{{ data_dir }}"
        es_log_dir: "/opt/elasticsearch/logs"
        es_config:
          node.name: "{{ ansible_hostname }}"
          cluster.name: "{{ cluster_name }}"
          discovery.zen.ping.unicast.hosts: "{% for host in groups['masters'] -%}{{ hostvars[host]['ansible_ens33']['ipv4']['address'] }}:9300{% if not loop.last %},{% endif %}{%- endfor %}"
          http.port: 9200
          transport.tcp.port: 9300
          node.data: true
          node.master: false
          bootstrap.memory_lock: true
          network.host: '{{ ansible_facts["ens33"]["ipv4"]["address"] }}'
          discovery.zen.minimum_master_nodes: 1
        es_xpack_features: []
        es_scripts: false
        es_templates: false
        es_version_lock: true
        es_heap_size: 6g
        es_api_port: 9200
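One check worth making before touching the deployment, since ping only proves ICMP reachability and says nothing about TCP port 9300: probe the transport port directly from the data node (assuming nc is installed):
nc -zv 10.18.0.13 9300   # succeeds only if the transport port is reachable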
The 2 VMs I was trying to establish communication between were CentOS 7, which has firewalld enabled by default.
Disabling and stopping the service solved the issue.
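For reference, a sketch of both the blunt fix described above and a more surgical alternative that keeps firewalld running (standard firewalld/systemctl commands):
# blunt: stop and disable firewalld entirely
sudo systemctl stop firewalld
sudo systemctl disable firewalld
# surgical: keep firewalld, but open the Elasticsearch ports
sudo firewall-cmd --permanent --add-port=9200/tcp --add-port=9300/tcp
sudo firewall-cmd --reload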
I can't start Elasticsearch with node.master:false
elasticsearch.yml
cluster.name: graylog2
node.name: "second"
node.master: false
node.data: true
index.number_of_shards: 2
bootstrap.mlockall: true
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: 192.168.93.76
script.disable_dynamic: true
service elasticsearch restart
netstat -an | grep 9200
(no output)
YAML has a very strict syntax; you need to add a space between node.master and false:
node.master: false
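For illustration, the failing and the working form side by side; without the space, YAML parses the whole line as a single scalar instead of a key/value pair, and Elasticsearch fails to read the setting:
node.master:false    # invalid - no space after the colon
node.master: false   # valid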