Elasticsearch keeps stopping and starting again

I am running Elasticsearch on a 4 GB instance with a 2 GB heap. Elasticsearch runs fine, but after some requests it stops and then starts again on its own, with the following logs:
[2016-07-11 11:11:04,492][INFO ][discovery ] [Ev Teel Urizen] elasticsearch/9amXiXXmTTyV_l_s3t6gnw
[2016-07-11 11:11:07,568][INFO ][cluster.service ] [Ev Teel Urizen] new_master {Ev Teel Urizen}{9amXiXXmTTyV_l_s3t6gnw}{P.P.P.P}{P.P.P.P:9300}, reason: zen-disco-join(elected_as_master, [0] joins received)
[2016-07-11 11:11:07,611][INFO ][http ] [Ev Teel Urizen] publish_address {P.P.P.P:9200}, bound_addresses {P.P.P.P:9200}
[2016-07-11 11:11:07,611][INFO ][node ] [Ev Teel Urizen] started
[2016-07-11 11:11:07,641][INFO ][gateway ] [Ev Teel Urizen] recovered [1] indices into cluster_state
[2016-07-11 11:11:07,912][INFO ][cluster.routing.allocation] [Ev Teel Urizen] Cluster health status changed from [RED] to [GREEN] (reason: [shards started [[user-details-index][0]] ...]).
[2016-07-11 11:13:01,482][INFO ][node ] [Ev Teel Urizen] stopping ...
[2016-07-11 11:13:01,503][INFO ][node ] [Ev Teel Urizen] stopped
[2016-07-11 11:13:01,503][INFO ][node ] [Ev Teel Urizen] closing ...
[2016-07-11 11:13:01,507][INFO ][node ] [Ev Teel Urizen] closed
where P.P.P.P is the private IP of the instance.
EDIT: Logs after changing log-level to DEBUG
[2016-07-11 12:46:34,129][DEBUG][index.shard ] [Cameron Hodge] [user-details-index][0] recovery completed from [shard_store], took [106ms]
[2016-07-11 12:46:34,129][DEBUG][cluster.action.shard ] [Cameron Hodge] [user-details-index][0] sending shard started for target shard [[user-details-index][0], node[TUeznaAxQsqaFq6iDbVUVw], [P], v[149], s[INITIALIZING], a[id=j_DlCimrSoi7kff-9Ah9xw], unassigned_info[[reason=CLUSTER_RECOVERED], at[2016-07-11T12:46:33.774Z]]], indexUUID [tStiDr3PRWaKIKCIHk5h0A], message [after recovery from store]
[2016-07-11 12:46:34,129][DEBUG][cluster.action.shard ] [Cameron Hodge] received shard started for target shard [[user-details-index][0], node[TUeznaAxQsqaFq6iDbVUVw], [P], v[149], s[INITIALIZING], a[id=j_DlCimrSoi7kff-9Ah9xw], unassigned_info[[reason=CLUSTER_RECOVERED], at[2016-07-11T12:46:33.774Z]]], indexUUID [tStiDr3PRWaKIKCIHk5h0A], message [after recovery from store]
[2016-07-11 12:46:34,130][DEBUG][cluster.service ] [Cameron Hodge] processing [shard-started ([user-details-index][0], node[TUeznaAxQsqaFq6iDbVUVw], [P], v[149], s[INITIALIZING], a[id=j_DlCimrSoi7kff-9Ah9xw], unassigned_info[[reason=CLUSTER_RECOVERED], at[2016-07-11T12:46:33.774Z]]), reason [after recovery from store]]: execute
[2016-07-11 12:46:34,131][INFO ][cluster.routing.allocation] [Cameron Hodge] Cluster health status changed from [RED] to [GREEN] (reason: [shards started [[user-details-index][0]] ...]).
[2016-07-11 12:46:34,131][DEBUG][cluster.service ] [Cameron Hodge] cluster state updated, version [4], source [shard-started ([user-details-index][0], node[TUeznaAxQsqaFq6iDbVUVw], [P], v[149], s[INITIALIZING], a[id=j_DlCimrSoi7kff-9Ah9xw], unassigned_info[[reason=CLUSTER_RECOVERED], at[2016-07-11T12:46:33.774Z]]), reason [after recovery from store]]
[2016-07-11 12:46:34,131][DEBUG][cluster.service ] [Cameron Hodge] publishing cluster state version [4]
[2016-07-11 12:46:34,135][DEBUG][cluster.service ] [Cameron Hodge] set local cluster state to version 4
[2016-07-11 12:46:34,135][DEBUG][index.shard ] [Cameron Hodge] [user-details-index][0] state: [POST_RECOVERY]->[STARTED], reason [global state is [STARTED]]
[2016-07-11 12:46:34,153][DEBUG][cluster.service ] [Cameron Hodge] processing [shard-started ([user-details-index][0], node[TUeznaAxQsqaFq6iDbVUVw], [P], v[149], s[INITIALIZING], a[id=j_DlCimrSoi7kff-9Ah9xw], unassigned_info[[reason=CLUSTER_RECOVERED], at[2016-07-11T12:46:33.774Z]]), reason [after recovery from store]]: took 23ms done applying updated cluster_state (version: 4, uuid: G_fovEaNRFOfF0lHjz0h2A)
[2016-07-11 12:47:00,583][DEBUG][indices.memory ] [Cameron Hodge] recalculating shard indexing buffer, total is [203.1mb] with [1] active shards, each shard set to indexing=[203.1mb], translog=[64kb]
[2016-07-11 12:47:01,648][INFO ][node ] [Cameron Hodge] stopping ...
[2016-07-11 12:47:01,660][DEBUG][indices ] [Cameron Hodge] [user-details-index] closing ... (reason [shutdown])
[2016-07-11 12:47:01,661][DEBUG][indices ] [Cameron Hodge] [user-details-index] closing index service (reason [shutdown])
[2016-07-11 12:47:01,662][DEBUG][index ] [Cameron Hodge] [user-details-index] [0] closing... (reason: [shutdown])
[2016-07-11 12:47:01,664][DEBUG][index.shard ] [Cameron Hodge] [user-details-index][0] state: [STARTED]->[CLOSED], reason [shutdown]
[2016-07-11 12:47:01,664][DEBUG][index.shard ] [Cameron Hodge] [user-details-index][0] operations counter reached 0, will not accept any further writes
[2016-07-11 12:47:01,664][DEBUG][index.engine ] [Cameron Hodge] [user-details-index][0] flushing shard on close - this might take some time to sync files to disk
[2016-07-11 12:47:01,666][DEBUG][index.engine ] [Cameron Hodge] [user-details-index][0] close now acquiring writeLock
[2016-07-11 12:47:01,666][DEBUG][index.engine ] [Cameron Hodge] [user-details-index][0] close acquired writeLock
[2016-07-11 12:47:01,668][DEBUG][index.translog ] [Cameron Hodge] [user-details-index][0] translog closed
[2016-07-11 12:47:01,672][DEBUG][index.engine ] [Cameron Hodge] [user-details-index][0] engine closed [api]
[2016-07-11 12:47:01,672][DEBUG][index.store ] [Cameron Hodge] [user-details-index][0] store reference count on close: 0
[2016-07-11 12:47:01,672][DEBUG][index ] [Cameron Hodge] [user-details-index] [0] closed (reason: [shutdown])
[2016-07-11 12:47:01,672][DEBUG][indices ] [Cameron Hodge] [user-details-index] closing index cache (reason [shutdown])
[2016-07-11 12:47:01,672][DEBUG][index.cache.query.index ] [Cameron Hodge] [user-details-index] full cache clear, reason [close]
[2016-07-11 12:47:01,673][DEBUG][index.cache.bitset ] [Cameron Hodge] [user-details-index] clearing all bitsets because [close]
[2016-07-11 12:47:01,673][DEBUG][indices ] [Cameron Hodge] [user-details-index] clearing index field data (reason [shutdown])
[2016-07-11 12:47:01,674][DEBUG][indices ] [Cameron Hodge] [user-details-index] closing analysis service (reason [shutdown])
[2016-07-11 12:47:01,674][DEBUG][indices ] [Cameron Hodge] [user-details-index] closing mapper service (reason [shutdown])
[2016-07-11 12:47:01,674][DEBUG][indices ] [Cameron Hodge] [user-details-index] closing index query parser service (reason [shutdown])
[2016-07-11 12:47:01,680][DEBUG][indices ] [Cameron Hodge] [user-details-index] closing index service (reason [shutdown])
[2016-07-11 12:47:01,680][DEBUG][indices ] [Cameron Hodge] [user-details-index] closed... (reason [shutdown])
[2016-07-11 12:47:01,680][INFO ][node ] [Cameron Hodge] stopped
[2016-07-11 12:47:01,680][INFO ][node ] [Cameron Hodge] closing ...
[2016-07-11 12:47:01,685][INFO ][node ] [Cameron Hodge] closed
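Both excerpts show the node logging "stopping ..." at INFO level and then closing its index cleanly, which looks more like an orderly shutdown than a crash. One way to narrow this down is to measure how long the node stays up between "started" and "stopping ..." and to look at the exact stop times (both stops above happen one second past the minute). The following is a minimal Python sketch for that, assuming log lines in the same format as above; the function name uptimes and the regular expression are illustrative, not part of Elasticsearch.

```python
import re
from datetime import datetime

# Timestamp, level, logger name, node name, and message of a log line
# in the format shown above. Padding inside the brackets is tolerated.
LINE_RE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}\]"
    r"\[\w+\s*\]\[([\w.]+)\s*\] \[[^\]]+\] (.*)$"
)


def uptimes(lines):
    """Return (started_at, stopping_at, seconds_up) for each start/stop pair."""
    results = []
    started_at = None
    for line in lines:
        match = LINE_RE.match(line)
        if not match or match.group(2) != "node":
            continue  # only lines from the [node] logger matter here
        timestamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        message = match.group(3).strip()
        if message == "started":
            started_at = timestamp
        elif message == "stopping ..." and started_at is not None:
            seconds_up = (timestamp - started_at).total_seconds()
            results.append((started_at, timestamp, seconds_up))
            started_at = None
    return results


sample = [
    "[2016-07-11 11:11:07,611][INFO ][node ] [Ev Teel Urizen] started",
    "[2016-07-11 11:13:01,482][INFO ][node ] [Ev Teel Urizen] stopping ...",
]

for started, stopped, seconds in uptimes(sample):
    print(started, "->", stopped, "up for", seconds, "seconds")
```

If the stop times keep landing on the same second of the minute, that may hint at something external (a scheduled job or a service manager) stopping the process, but this is only a hint and not a diagnosis.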

Related

Elasticsearch cannot bind to a public IP address

I installed Elasticsearch (version 5.0.0-alpha4) with apt-get and run it with Oracle JRE 8.
It works fine when bound to the local IP address.
curl 127.0.0.1:9200
{
"name" : "Deadly Ernest",
"cluster_name" : "elasticsearch",
"version" : {
"number" : "5.0.0-alpha4",
"build_hash" : "3f5b994",
"build_date" : "2016-06-27T16:23:46.861Z",
"build_snapshot" : false,
"lucene_version" : "6.1.0"
},
"tagline" : "You Know, for Search"
}
But when I try to bind to a public IP address, it fails.
I uncommented and edited this line in /etc/elasticsearch/elasticsearch.yml:
network.host: 139.129.221.xxx
When I start the service with sudo -i service elasticsearch restart, it seems to be OK:
* Starting Elasticsearch Server ...done.
But the log shows that the server doesn't start correctly:
[2016-07-12 14:23:18,359][INFO ][node ] [Algrim the Strong] starting ...
[2016-07-12 14:23:18,471][INFO ][transport ] [Algrim the Strong] publish_address {139.129.221.xxx:9300}, bound_addresses {139.129.221.xxx:9300}
[2016-07-12 14:23:18,492][ERROR][bootstrap ] [Algrim the Strong] Exception
java.lang.RuntimeException: bootstrap checks failed
initial heap size [268435456] not equal to maximum heap size [2147483648]; this can cause resize pauses and prevents mlockall from locking the entire heap
please set [discovery.zen.minimum_master_nodes] to a majority of the number of master eligible nodes in your cluster
at org.elasticsearch.bootstrap.BootstrapCheck.check(BootstrapCheck.java:125)
at org.elasticsearch.bootstrap.BootstrapCheck.check(BootstrapCheck.java:85)
at org.elasticsearch.bootstrap.BootstrapCheck.check(BootstrapCheck.java:65)
at org.elasticsearch.bootstrap.Bootstrap$5.validateNodeBeforeAcceptingRequests(Bootstrap.java:178)
at org.elasticsearch.node.Node.start(Node.java:373)
at org.elasticsearch.bootstrap.Bootstrap.start(Bootstrap.java:193)
at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:252)
at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:96)
at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:91)
at org.elasticsearch.cli.SettingCommand.execute(SettingCommand.java:54)
at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:91)
at org.elasticsearch.cli.Command.main(Command.java:53)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:70)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:63)
Suppressed: java.lang.IllegalStateException: initial heap size [268435456] not equal to maximum heap size [2147483648]; this can cause resize pauses and prevents mlockall from locking the entire heap
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1374)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
at org.elasticsearch.bootstrap.BootstrapCheck.check(BootstrapCheck.java:126)
... 13 more
Suppressed: java.lang.IllegalStateException: please set [discovery.zen.minimum_master_nodes] to a majority of the number of master eligible nodes in your cluster
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1374)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
at org.elasticsearch.bootstrap.BootstrapCheck.check(BootstrapCheck.java:126)
... 13 more
[2016-07-12 14:23:18,528][INFO ][node ] [Algrim the Strong] stopping ...
[2016-07-12 14:23:18,566][INFO ][node ] [Algrim the Strong] stopped
[2016-07-12 14:23:18,566][INFO ][node ] [Algrim the Strong] closing ...
[2016-07-12 14:23:18,595][INFO ][node ] [Algrim the Strong] closed
After several tries, I finally got it working with the following configuration.
/etc/elasticsearch/jvm.options
-Xms2g
-Xmx2g
/etc/elasticsearch/elasticsearch.yml
network.host: 139.129.221.xxx
http.port: 9200
discovery.zen.minimum_master_nodes: 1
Then run sudo service elasticsearch restart to start elasticsearch.
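The first bootstrap check in the log above complains that the initial heap (268435456 bytes, i.e. 256 MB) differs from the maximum heap (2147483648 bytes, i.e. 2 GB), which is why setting -Xms and -Xmx to the same value helps. As a rough sketch, a small Python script can verify that a jvm.options file sets both to the same size before restarting. The helper names heap_bytes, heap_settings, and heap_is_fixed are made up for this example, and it only handles the plain -Xms/-Xmx forms shown above.

```python
import re

# Multipliers for the size suffixes accepted by -Xms / -Xmx.
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def heap_bytes(value):
    """Convert a JVM size such as '256m' or '2g' to a number of bytes."""
    match = re.fullmatch(r"(\d+)([kKmMgG]?)", value)
    if not match:
        raise ValueError("unrecognized heap size: %r" % value)
    return int(match.group(1)) * _UNITS[match.group(2).lower()]


def heap_settings(jvm_options_text):
    """Collect the -Xms and -Xmx values (in bytes) from jvm.options text."""
    settings = {}
    for raw_line in jvm_options_text.splitlines():
        line = raw_line.strip()
        if line.startswith("-Xms"):
            settings["Xms"] = heap_bytes(line[4:])
        elif line.startswith("-Xmx"):
            settings["Xmx"] = heap_bytes(line[4:])
    return settings


def heap_is_fixed(jvm_options_text):
    """True when both options are present and set to the same size."""
    settings = heap_settings(jvm_options_text)
    return (
        "Xms" in settings
        and "Xmx" in settings
        and settings["Xms"] == settings["Xmx"]
    )


print(heap_is_fixed("-Xms2g\n-Xmx2g\n"))    # the working configuration above
print(heap_is_fixed("-Xms256m\n-Xmx2g\n"))  # the sizes reported in the error
```

To check a real file, read it first, for example with open("/etc/elasticsearch/jvm.options").read(), and pass the text to heap_is_fixed.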

How to configure elasticsearch.yml for the repository-hdfs plugin of Elasticsearch

elasticsearch 2.3.2
repository-hdfs 2.3.1
I configured the elasticsearch.yml file as shown in the official Elastic documentation:
repositories
hdfs:
uri: "hdfs://<host>:<port>/" # optional - Hadoop file-system URI
path: "some/path" # required - path with the file-system where data is stored/loaded
load_defaults: "true" # optional - whether to load the default Hadoop configuration (default) or not
conf_location: "extra-cfg.xml" # optional - Hadoop configuration XML to be loaded (use commas for multi values)
conf.<key> : "<value>" # optional - 'inlined' key=value added to the Hadoop configuration
concurrent_streams: 5 # optional - the number of concurrent streams (defaults to 5)
compress: "false" # optional - whether to compress the metadata or not (default)
chunk_size: "10mb" # optional - chunk size (disabled by default)
But it raises an exception saying the format is incorrect.
Error info:
Exception in thread "main" SettingsException
[Failed to load settings from [elasticsearch.yml]]; nested: ScannerException[while scanning a simple key'
in 'reader', line 99, column 2:
repositories
^
could not find expected ':'
in 'reader', line 100, column 10:
hdfs:
^];
Likely root cause: while scanning a simple key
in 'reader', line 99, column 2:
repositories
^
could not find expected ':'
in 'reader', line 100, column 10:
hdfs:
I edited it to:
repositories:
hdfs:
uri: "hdfs://191.168.4.220:9600/"
But it doesn't work.
I want to know what the correct format is.
I found the AWS configuration for elasticsearch.yml:
cloud:
aws:
access_key: AKVAIQBF2RECL7FJWGJQ
secret_key: vExyMThREXeRMm/b/LRzEB8jWwvzQeXgjqMX+6br
repositories:
s3:
bucket: "bucket_name"
region: "us-west-2"
private-bucket:
bucket: <bucket not accessible by default key>
access_key: <access key>
secret_key: <secret key>
remote-bucket:
bucket: <bucket in other region>
region: <region>
external-bucket:
bucket: <bucket>
access_key: <access key>
secret_key: <secret key>
endpoint: <endpoint>
protocol: <protocol>
I imitated it, but it still doesn't work.
I tried to install repository-hdfs 2.3.1 in Elasticsearch 2.3.2, but it failed:
ERROR: Plugin [repository-hdfs] is incompatible with Elasticsearch [2.3.2]. Was designed for version [2.3.1]
The plugin can only be installed in Elasticsearch 2.3.1.
You should specify the uri, path, and conf_location options, and maybe delete the conf.<key> option. Take the following config as an example:
security.manager.enabled: false
repositories.hdfs:
uri: "hdfs://master:9000" # optional - Hadoop file-system URI
path: "/aaa/bbb" # required - path with the file-system where data is stored/loaded
load_defaults: "true" # optional - whether to load the default Hadoop configuration (default) or not
conf_location: "/home/ec2-user/app/hadoop-2.6.3/etc/hadoop/core-site.xml,/home/ec2-user/app/hadoop-2.6.3/etc/hadoop/hdfs-site.xml" # optional - Hadoop configuration XML to be loaded (use commas for multi values)
concurrent_streams: 5 # optional - the number of concurrent streams (defaults to 5)
compress: "false" # optional - whether to compress the metadata or not (default)
chunk_size: "10mb" # optional - chunk size (disabled by default)
I started ES successfully:
[----#----------- elasticsearch-2.3.1]$ bin/elasticsearch
[2016-05-06 04:40:58,173][INFO ][node ] [Protector] version[2.3.1], pid[17641], build[bd98092/2016-04-04T12:25:05Z]
[2016-05-06 04:40:58,174][INFO ][node ] [Protector] initializing ...
[2016-05-06 04:40:58,830][INFO ][plugins ] [Protector] modules [reindex, lang-expression, lang-groovy], plugins [repository-hdfs], sites []
[2016-05-06 04:40:58,863][INFO ][env ] [Protector] using [1] data paths, mounts [[/ (rootfs)]], net usable_space [8gb], net total_space [9.9gb], spins? [unknown], types [rootfs]
[2016-05-06 04:40:58,863][INFO ][env ] [Protector] heap size [1007.3mb], compressed ordinary object pointers [true]
[2016-05-06 04:40:58,863][WARN ][env ] [Protector] max file descriptors [4096] for elasticsearch process likely too low, consider increasing to at least [65536]
[2016-05-06 04:40:59,192][INFO ][plugin.hadoop.hdfs ] Loaded Hadoop [1.2.1] libraries from file:/home/ec2-user/app/elasticsearch-2.3.1/plugins/repository-hdfs/
[2016-05-06 04:41:01,598][INFO ][node ] [Protector] initialized
[2016-05-06 04:41:01,598][INFO ][node ] [Protector] starting ...
[2016-05-06 04:41:01,823][INFO ][transport ] [Protector] publish_address {xxxxxxxxx:9300}, bound_addresses {xxxxxxx:9300}
[2016-05-06 04:41:01,830][INFO ][discovery ] [Protector] hdfs/9H8wli0oR3-Zp-M9ZFhNUQ
[2016-05-06 04:41:04,886][INFO ][cluster.service ] [Protector] new_master {Protector}{9H8wli0oR3-Zp-M9ZFhNUQ}{xxxxxxx}{xxxxx:9300}, reason: zen-disco-join(elected_as_master, [0] joins received)
[2016-05-06 04:41:04,908][INFO ][http ] [Protector] publish_address {xxxxxxxxx:9200}, bound_addresses {xxxxxxx:9200}
[2016-05-06 04:41:04,908][INFO ][node ] [Protector] started
[2016-05-06 04:41:05,415][INFO ][gateway ] [Protector] recovered [1] indices into cluster_state
[2016-05-06 04:41:06,097][INFO ][cluster.routing.allocation] [Protector] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[website][0], [website][0]] ...]).
But when I try to create a snapshot:
PUT /_snapshot/my_backup
{
"type": "hdfs",
"settings": {
"path":"/aaa/bbb/"
}
}
I get the following error:
Caused by: java.io.IOException: Mkdirs failed to create file:/aaa/bbb/tests-zTkKRtoZTLu3m3RLascc1w
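The path in the error starts with file:, which is the scheme Hadoop uses for the local filesystem, so the repository appears to have been resolved against local disk rather than HDFS. One thing that may be worth trying is to pass the HDFS URI in the repository settings of the request itself instead of relying only on elasticsearch.yml. The Python sketch below just inspects the scheme and builds such a request body; it assumes the plugin accepts the same uri and path setting names in the REST body as in the YAML above, and the helper names are invented for this example.

```python
import json
from urllib.parse import urlparse


def filesystem_scheme(location):
    """Return the URI scheme of a Hadoop-style path, e.g. 'file' or 'hdfs'."""
    return urlparse(location).scheme


def hdfs_repository_body(uri, path):
    """Build a repository body that names the HDFS URI explicitly.

    Assumption: 'uri' and 'path' are accepted here just as they are
    under repositories.hdfs in elasticsearch.yml.
    """
    return {"type": "hdfs", "settings": {"uri": uri, "path": path}}


# Path copied from the error message above.
failed_path = "file:/aaa/bbb/tests-zTkKRtoZTLu3m3RLascc1w"
print(filesystem_scheme(failed_path))

# Values taken from the example config above.
body = hdfs_repository_body("hdfs://master:9000", "/aaa/bbb")
print(json.dumps(body, indent=2))
```

The printed JSON can be used as the body of the PUT /_snapshot/my_backup request shown above.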

Elasticsearch crashing repeatedly

I run Elasticsearch 1.6.2 on a Debian GNU/Linux 8 server.
Elasticsearch crashes repeatedly, apparently for no reason.
I don't have any error messages in the Elasticsearch logs, but I do have some warnings.
Before a crash:
[2016-02-11 05:10:21,893][INFO ][monitor.jvm ] [noeud-0] [gc][young][5554661][67] duration [790ms], collections [1]/[1s], total [790ms]/[1.9m], memory [320mb]->[320mb]/[1.9gb], all_pools {[young] [266.2mb]->[266.2mb]/[266.2mb]}{[survivor] [1.2mb]->[1.2mb]/[33.2mb]}{[old] [52.5mb]->[52.5mb]/[1.6gb]}
[2016-02-11 20:42:08,422][INFO ][monitor.jvm ] [noeud-0] [gc][young][5610361][68] duration [808ms], collections [1]/[1.6s], total [808ms]/[1.9m], memory [319.8mb]->[56.7mb]/[1.9gb], all_pools {[young] [265.8mb]->[3.3mb]/[266.2mb]}{[survivor] [801.6kb]->[361.7kb]/[33.2mb]}{[old] [53.2mb]->[53.2mb]/[1.6gb]}
And after the crash and the restart:
[2016-02-12 12:08:22,472][INFO ][node ] [noeud-0] version[1.6.2], pid[833], build[6220391/2015-07-29T09:24:47Z]
[2016-02-12 12:08:22,473][INFO ][node ] [noeud-0] initializing ...
[2016-02-12 12:08:22,609][INFO ][plugins ] [noeud-0] loaded [], sites [head]
[2016-02-12 12:08:22,661][INFO ][env ] [noeud-0] using [1] data paths, mounts [[/ (/dev/simfs)]], net usable_space [17.2gb], net total_space [20gb], types [simfs]
[2016-02-12 12:08:25,480][INFO ][node ] [noeud-0] initialized
[2016-02-12 12:08:25,481][INFO ][node ] [noeud-0] starting ...
[2016-02-12 12:08:25,570][INFO ][transport ] [noeud-0] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/ip.ip.ip.ip:9300]}
[2016-02-12 12:08:25,649][INFO ][discovery ] [noeud-0] my_cluster_name/123abc
[2016-02-12 12:08:29,436][INFO ][cluster.service ] [noeud-0] new_master [noeud-0][123abc][java8][inet[/ip.ip.ip.ip:9300]], reason: zen-disco-join (elected_as_master)
[2016-02-12 12:08:29,462][INFO ][http ] [noeud-0] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/ip.ip.ip.ip:9200]}
[2016-02-12 12:08:29,463][INFO ][node ] [noeud-0] started
[2016-02-12 12:08:29,496][INFO ][gateway ] [noeud-0] recovered [1] indices into cluster_state
[2016-02-12 19:12:05,812][WARN ][monitor.jvm ] [noeud-0] [gc][young][25368][2] duration [1.5s], collections [1]/[2.7s], total [1.5s]/[1.5s], memory [295.2mb]->[29.3mb]/[1.9gb], all_pools {[young] [266.2mb]->[4.9mb]/[266.2mb]}{[survivor] [28.9mb]->[15mb]/[33.2mb]}{[old] [0b]->[9.4mb]/[1.6gb]}
I read that Elasticsearch could be killed by the OOM killer, but there is no /var/log/kern.log file to check that.
What can I do to investigate and find the reason why Elasticsearch crashes?
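When /var/log/kern.log is missing, kernel messages can often still be found elsewhere, for example in the output of dmesg or in /var/log/syslog or /var/log/messages, depending on how logging is set up. A minimal Python sketch for scanning such text for OOM-killer entries is shown below. The function name find_oom_kills is made up, the sample lines are invented for illustration (they are not from the poster's machine), and the exact kernel wording varies between kernel versions, so the pattern is an assumption that may need adjusting.

```python
import re

# Typical kernel wording when the OOM killer picks and kills a process.
# Assumption: the message format differs slightly across kernel versions.
OOM_RE = re.compile(
    r"(Out of memory: Kill(?:ed)? process|Killed process) (\d+) \(([^)]+)\)"
)


def find_oom_kills(lines):
    """Return (pid, process_name, line) for every OOM-killer entry found."""
    hits = []
    for line in lines:
        match = OOM_RE.search(line)
        if match:
            hits.append((int(match.group(2)), match.group(3), line.rstrip()))
    return hits


# Invented sample lines, for illustration only.
sample = [
    "kernel: [1234.567] Out of memory: Kill process 1234 (java) score 900 or sacrifice child",
    "kernel: [1234.568] Killed process 1234 (java) total-vm:4000000kB, anon-rss:2000000kB",
    "kernel: [1300.000] eth0: link up",
]

for pid, name, line in find_oom_kills(sample):
    print(pid, name, line)
```

To scan a real file, pass an open file object, for example find_oom_kills(open("/var/log/syslog")). The simfs mount in the startup log suggests the server may be an OpenVZ container; in that case the host can kill processes without leaving a kernel log inside the container, and the failcnt column of /proc/user_beancounters may be the place to look.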

Elasticsearch nodes not forming a cluster

I am trying to set up an ES cluster on Windows machines using unicast. I think I have made all the required configuration changes, but my ES nodes still do not form a cluster. Could someone please let me know what I am missing? Please find my elasticsearch.yml configurations below.
=======Node 8 Config=======
cluster.name: elasticsearch
node.name: NODE8
node.data: true
network.host: "10.249.167.8"
network.publish_host: "10.249.167.8"
network.bind: "10.249.167.8"
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["10.249.167.9", "10.249.167.10", "10.249.167.8"]
transport.tcp.port: 9300
=======Node9 Config========
cluster.name: elasticsearch
node.name: NODE9
node.data: true
network.host: "10.249.167.9"
network.publish_host: "10.249.167.9"
network.bind: "10.249.167.9"
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["10.249.167.9", "10.249.167.10", "10.249.167.8"]
transport.tcp.port: 9300
I can query both ES nodes individually, but they don't form a cluster:
Node 8 Get : http://10.249.167.8:9200/_cat/nodes?h=ip,port,heapPercent,name
10.249.167.8 9300 2 Cecilia Reyes
Node 9 Get : http://10.249.167.9:9200/_cat/nodes?h=ip,port,heapPercent,name
10.249.167.9 9300 9 Victorius
The startup logs follow. Any help would be appreciated a ton; I have been stuck on this for a while now :(
[2016-02-13 01:08:06,395][WARN ][bootstrap ] unable to install syscall filter: syscall filtering not supported for OS: 'Windows Server 2012 R2'
[2016-02-13 01:08:06,645][INFO ][node ] [NODE8] version[2.1.1], pid[7628], build[40e2c53/2015-12-15T13:05:55Z]
[2016-02-13 01:08:06,645][INFO ][node ] [NODE8] initializing ...
[2016-02-13 01:08:07,020][INFO ][plugins ] [NODE8] loaded [cloud-azure], sites []
[2016-02-13 01:08:07,051][INFO ][env ] [NODE8] using [1] data paths, mounts [[(C:)]], net usable_space [94.6gb], net total_space [126.6gb], spins? [unknown], types [NTFS]
[2016-02-13 01:08:09,170][INFO ][node ] [NODE8] initialized
[2016-02-13 01:08:09,170][INFO ][node ] [NODE8] starting ...
[2016-02-13 01:08:09,357][INFO ][transport ] [NODE8] publish_address {10.249.167.8:9300}, bound_addresses {10.249.167.8:9300}
[2016-02-13 01:08:09,373][INFO ][discovery ] [NODE8] elasticsearch/i42Qv-qNSJaSoLRCt2e5tg
[2016-02-13 01:08:13,936][INFO ][cluster.service ] [NODE8] new_master {NODE8}{i42Qv-qNSJaSoLRCt2e5tg}{10.249.167.8}{10.249.167.8:9300}, reason: zen-disco-join(elected_as_master, [0] joins received)
[2016-02-13 01:08:13,983][INFO ][http ] [NODE8] publish_address {10.249.167.8:9200}, bound_addresses {10.249.167.8:9200}
[2016-02-13 01:08:13,983][INFO ][node ] [NODE8] started
[2016-02-13 01:08:16,715][INFO ][gateway ] [NODE8] recovered [1] indices into cluster_state
Node 9 Log========================================================================
[2016-02-13 01:08:44,988][WARN ][bootstrap ] unable to install syscall filter: syscall filtering not supported for OS: 'Windows Server 2012 R2'
[2016-02-13 01:08:45,237][INFO ][node ] [NODE9] version[2.1.1], pid[6468], build[40e2c53/2015-12-15T13:05:55Z]
[2016-02-13 01:08:45,237][INFO ][node ] [NODE9] initializing ...
[2016-02-13 01:08:45,601][INFO ][plugins ] [NODE9] loaded [cloud-azure], sites []
[2016-02-13 01:08:45,625][INFO ][env ] [NODE9] using [1] data paths, mounts [[(C:)]], net usable_space [113.6gb], net total_space [126.6gb], spins? [unknown], types [NTFS]
[2016-02-13 01:08:47,554][INFO ][node ] [NODE9] initialized
[2016-02-13 01:08:47,554][INFO ][node ] [NODE9] starting ...
[2016-02-13 01:08:47,753][INFO ][transport ] [NODE9] publish_address {10.249.167.9:9300}, bound_addresses {10.249.167.9:9300}
[2016-02-13 01:08:47,763][INFO ][discovery ] [NODE9] elasticsearch/ys7WjfT3QR2DqwLFr-m6Ew
[2016-02-13 01:08:52,292][INFO ][cluster.service ] [NODE9] new_master {NODE9}{ys7WjfT3QR2DqwLFr-m6Ew}{10.249.167.9}{10.249.167.9:9300}, reason: zen-disco-join(elected_as_master, [0] joins received)
[2016-02-13 01:08:52,342][INFO ][http ] [NODE9] publish_address {10.249.167.9:9200}, bound_addresses {10.249.167.9:9200}
[2016-02-13 01:08:52,342][INFO ][node ] [NODE9] started
[2016-02-13 01:08:53,649][INFO ][gateway ] [NODE9] recovered [0] indices into cluster_state
This looks like a split-brain problem: the nodes get separated and each acts as master on its own.
From your logs, it is clearly visible that your nodes have literally formed two different clusters.
One way to avoid split brain is to set the minimum number of master nodes:
discovery.zen.minimum_master_nodes
It can be calculated as follows:
minimum master nodes = (N/2)+1
where N is the number of master-eligible nodes and the division is rounded down.
For example, if you have 3 nodes in your cluster, you can set:
discovery.zen.minimum_master_nodes: 2
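The quorum formula above can be sketched in a few lines of Python. The function name minimum_master_nodes is only illustrative; the calculation is the (N/2)+1 rule with the division rounded down.

```python
def minimum_master_nodes(master_eligible_nodes):
    """Quorum of master-eligible nodes: (N / 2) + 1, rounding the division down."""
    if master_eligible_nodes < 1:
        raise ValueError("a cluster needs at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


for n in (1, 2, 3, 5):
    print(n, "master-eligible nodes -> minimum_master_nodes:", minimum_master_nodes(n))
```

Note that with only two master-eligible nodes the result is 2, so both nodes have to be up before a master can be elected.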

Elasticsearch data path on a Vagrant shared directory

I have Elasticsearch installed on a Vagrant machine with a Windows host running VirtualBox. It runs okay when I use it with the default data path, but if I try to switch the path to a synced Vagrant folder, it throws an ElasticsearchIllegalStateException.
I am running with the command line
elasticsearch -Des.path.logs=/shared/logs -Des.path.data=/shared/data
Where /shared is the mount point for my synced folder in Vagrant.
The error I'm getting is:
[2015-07-28 14:15:05,005][WARN ][bootstrap ] Unable to lock JVM memory (ENOMEM). This can result in part of the JVM being swapped out. Increase RLIMIT_MEMLOCK (ulimit).
[2015-07-28 14:15:05,104][INFO ][node ] [Caliban] version[1.7.0], pid[1832], build[929b973/2015-07-16T14:31:07Z]
[2015-07-28 14:15:05,105][INFO ][node ] [Caliban] initializing ...
[2015-07-28 14:15:05,390][INFO ][plugins ] [Caliban] loaded [], sites []
{1.7.0}: Initialization Failed ...
- ElasticsearchIllegalStateException[Failed to created node environment]
FileSystemException[/shared: Not a directory]
If I run this using regular non-synced directories, it works fine, e.g.
elasticsearch -Des.path.logs=/home/vagrant -Des.path.data=/home/vagrant
Results in:
[2015-07-28 14:20:27,598][WARN ][bootstrap ] Unable to lock JVM memory (ENOMEM). This can result in part of the JVM being swapped out. Increase RLIMIT_MEMLOCK (ulimit).
[2015-07-28 14:20:27,693][INFO ][node ] [Madame MacEvil] version[1.7.0], pid[1989], build[929b973/2015-07-16T14:31:07Z]
[2015-07-28 14:20:27,694][INFO ][node ] [Madame MacEvil] initializing ...
[2015-07-28 14:20:27,958][INFO ][plugins ] [Madame MacEvil] loaded [], sites []
[2015-07-28 14:20:28,020][INFO ][env ] [Madame MacEvil] using [1] data paths, mounts [[/ (/dev/mapper/VolGroup-lv_root)]], net usable_space [33.6gb], net total_space [37.9gb], types [ext4]
[2015-07-28 14:20:31,951][INFO ][node ] [Madame MacEvil] initialized
[2015-07-28 14:20:31,951][INFO ][node ] [Madame MacEvil] starting ...
[2015-07-28 14:20:32,053][INFO ][transport ] [Madame MacEvil] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/10.0.2.15:9300]}
[2015-07-28 14:20:32,074][INFO ][discovery ] [Madame MacEvil] elasticsearch/18ZhO5W8SwWwJve7KBdV5g
[2015-07-28 14:20:35,859][INFO ][cluster.service ] [Madame MacEvil] new_master [Madame MacEvil][18ZhO5W8SwWwJve7KBdV5g][localhost.localdomain][inet[/10.0.2.15:9300]], reason: zen-disco-join (elected_as_master)
[2015-07-28 14:20:35,890][INFO ][http ] [Madame MacEvil] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/10.0.2.15:9200]}
[2015-07-28 14:20:35,894][INFO ][node ] [Madame MacEvil] started
[2015-07-28 14:20:35,919][INFO ][gateway ] [Madame MacEvil] recovered [0] indices into cluster_state
I pulled the logs for a failed initialization and it had the following Java exception:
org.elasticsearch.ElasticsearchIllegalStateException: Failed to created node environment
at org.elasticsearch.node.internal.InternalNode.<init>(InternalNode.java:167)
at org.elasticsearch.node.NodeBuilder.build(NodeBuilder.java:159)
at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:77)
at org.elasticsearch.bootstrap.Bootstrap.main(Bootstrap.java:245)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:32)
Caused by: java.nio.file.FileSystemException: /shared: Not a directory
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at sun.nio.fs.UnixFileStore.readAttributes(UnixFileStore.java:111)
at sun.nio.fs.UnixFileStore.getTotalSpace(UnixFileStore.java:118)
at org.elasticsearch.monitor.fs.JmxFsProbe.getFSInfo(JmxFsProbe.java:61)
at org.elasticsearch.env.NodeEnvironment.maybeLogPathDetails(NodeEnvironment.java:221)
at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:176)
at org.elasticsearch.node.internal.InternalNode.<init>(InternalNode.java:165)
... 4 more
Is this a known issue with VirtualBox + Vagrant and Elasticsearch?
try:
sudo chown -R elasticsearch:elasticsearch /var/lib/elasticsearch/
