Elasticsearch data path on a Vagrant shared directory - elasticsearch

I have Elasticsearch installed on a Vagrant machine with a Windows host running VirtualBox. It runs okay when I use it with the default data path, but if I try to switch the path to a synced Vagrant folder, it throws an ElasticsearchIllegalStateException.
I am running it with the command line:
elasticsearch -Des.path.logs=/shared/logs -Des.path.data=/shared/data
where /shared is the mount point for my synced folder in Vagrant.
The error I'm getting is:
[2015-07-28 14:15:05,005][WARN ][bootstrap ] Unable to lock JVM memory (ENOMEM). This can result in part of the JVM being swapped out. Increase RLIMIT_MEMLOCK (ulimit).
[2015-07-28 14:15:05,104][INFO ][node ] [Caliban] version[1.7.0], pid[1832], build[929b973/2015-07-16T14:31:07Z]
[2015-07-28 14:15:05,105][INFO ][node ] [Caliban] initializing ...
[2015-07-28 14:15:05,390][INFO ][plugins ] [Caliban] loaded [], sites []
{1.7.0}: Initialization Failed ...
- ElasticsearchIllegalStateException[Failed to created node environment]
FileSystemException[/shared: Not a directory]
If I run this using regular non-synced directories, then it works fine, e.g.
elasticsearch -Des.path.logs=/home/vagrant -Des.path.data=/home/vagrant
Results in:
[2015-07-28 14:20:27,598][WARN ][bootstrap ] Unable to lock JVM memory (ENOMEM). This can result in part of the JVM being swapped out. Increase RLIMIT_MEMLOCK (ulimit).
[2015-07-28 14:20:27,693][INFO ][node ] [Madame MacEvil] version[1.7.0], pid[1989], build[929b973/2015-07-16T14:31:07Z]
[2015-07-28 14:20:27,694][INFO ][node ] [Madame MacEvil] initializing ...
[2015-07-28 14:20:27,958][INFO ][plugins ] [Madame MacEvil] loaded [], sites []
[2015-07-28 14:20:28,020][INFO ][env ] [Madame MacEvil] using [1] data paths, mounts [[/ (/dev/mapper/VolGroup-lv_root)]], net usable_space [33.6gb], net total_space [37.9gb], types [ext4]
[2015-07-28 14:20:31,951][INFO ][node ] [Madame MacEvil] initialized
[2015-07-28 14:20:31,951][INFO ][node ] [Madame MacEvil] starting ...
[2015-07-28 14:20:32,053][INFO ][transport ] [Madame MacEvil] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/10.0.2.15:9300]}
[2015-07-28 14:20:32,074][INFO ][discovery ] [Madame MacEvil] elasticsearch/18ZhO5W8SwWwJve7KBdV5g
[2015-07-28 14:20:35,859][INFO ][cluster.service ] [Madame MacEvil] new_master [Madame MacEvil][18ZhO5W8SwWwJve7KBdV5g][localhost.localdomain][inet[/10.0.2.15:9300]], reason: zen-disco-join (elected_as_master)
[2015-07-28 14:20:35,890][INFO ][http ] [Madame MacEvil] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/10.0.2.15:9200]}
[2015-07-28 14:20:35,894][INFO ][node ] [Madame MacEvil] started
[2015-07-28 14:20:35,919][INFO ][gateway ] [Madame MacEvil] recovered [0] indices into cluster_state
I pulled the logs for the failed initialization, and they contained the following Java exception:
org.elasticsearch.ElasticsearchIllegalStateException: Failed to created node environment
at org.elasticsearch.node.internal.InternalNode.<init>(InternalNode.java:167)
at org.elasticsearch.node.NodeBuilder.build(NodeBuilder.java:159)
at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:77)
at org.elasticsearch.bootstrap.Bootstrap.main(Bootstrap.java:245)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:32)
Caused by: java.nio.file.FileSystemException: /shared: Not a directory
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at sun.nio.fs.UnixFileStore.readAttributes(UnixFileStore.java:111)
at sun.nio.fs.UnixFileStore.getTotalSpace(UnixFileStore.java:118)
at org.elasticsearch.monitor.fs.JmxFsProbe.getFSInfo(JmxFsProbe.java:61)
at org.elasticsearch.env.NodeEnvironment.maybeLogPathDetails(NodeEnvironment.java:221)
at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:176)
at org.elasticsearch.node.internal.InternalNode.<init>(InternalNode.java:165)
... 4 more
Is this a known issue with VirtualBox + Vagrant and Elasticsearch?

Try changing the ownership of the Elasticsearch directories so the elasticsearch user can write to them:
sudo chown -R elasticsearch:elasticsearch /var/lib/elasticsearch/
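A note of caution before chasing permissions: the stack trace fails inside UnixFileStore.getTotalSpace on /shared itself, and VirtualBox synced folders (vboxsf) are known not to support all the file-store metadata calls the JVM makes; their ownership is also governed by mount options, so chown may not even take effect there. A quick sketch to confirm what filesystem actually backs the path (using /tmp as a stand-in so it runs anywhere; on the guest, point it at /shared):

```shell
# Print the filesystem type backing a path. On the Vagrant guest, point
# this at /shared -- VirtualBox synced folders report "vboxsf", which the
# JVM's file-store probes (getTotalSpace etc.) may not handle.
path="/tmp"   # stand-in so this runs anywhere; use /shared on the guest
stat --file-system --format='type=%T' "$path"
df -P "$path" | tail -n 1
```

If the type comes back as vboxsf, moving path.data to a native guest directory (as the working /home/vagrant run shows) is the simplest fix.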

Related

Port forwarding with Elastic docker image

I'm trying to test Docker out with this Docker image. Things should be straightforward, but they aren't.
I ran this command to start the container:
sudo docker run -d -p 9200:9200 -p 9300:9300 elasticsearch -Des.node.name="ElasticTestNode"
Then I tried to run this command in my host machine:
# curl -XPUT "http://localhost:9200/movies/movie/3" -d'
{
"title": "To Kill a Mockingbird",
"director": "Robert Mulligan",
"year": 1962,
"genres": ["Crime", "Drama", "Mystery"]
}'
I was expecting to see some kind of success message. Instead, the command simply hung: no output, and it never returned. I had to Ctrl-X to quit.
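As an aside, a timeout makes this kind of hang much easier to diagnose: a sketch of the same request with curl's --max-time, so a dead port-forward fails in seconds instead of blocking forever (the exit codes in the comments are curl's documented ones):

```shell
# --max-time bounds the whole request. If the forward is broken, curl
# exits with code 28 (operation timed out); a closed port exits with
# code 7 (couldn't connect) instead of hanging.
curl -sS --max-time 5 -XPUT "http://localhost:9200/movies/movie/3" \
  -d '{"title": "To Kill a Mockingbird"}' \
  || echo "request failed (curl exit $?)"
```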
Out of ideas, I started a bash shell inside the container and tested:
sudo docker exec -i -t some-docker-id /bin/bash
root@somehash:/usr/share/elasticsearch# curl -XPUT "http://localhost:9200/movies/movie/3" -d'
{
"title": "To Kill a Mockingbird",
"director": "Robert Mulligan",
"year": 1962,
"genres": ["Crime", "Drama", "Mystery"]
}'
{"_index":"movies","_type":"movie","_id":"3","_version":1,"_shards":{"total":2,"successful":1,"failed":0},"created":true}root@somehash:/usr/share/elasticsearch#
And it was a success. What have I done wrong?
Updates: Tried another command on my host machine:
$ curl -XPUT -v "http://localhost:9200/movies/movie/3" -d'
{
"title": "To Kill a Mockingbird",
"director": "Robert Mulligan",
"year": 1962,
"genres": ["Crime", "Drama", "Mystery"]
}'
* Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 9200 (#0)
> PUT /movies/movie/3 HTTP/1.1
> Host: localhost:9200
> User-Agent: curl/7.43.0
> Accept: */*
> Content-Length: 139
> Content-Type: application/x-www-form-urlencoded
>
* upload completely sent off: 139 out of 139 bytes
Stuck here...
# sudo docker logs docker-id
[2016-09-28 11:52:16,630][INFO ][node ] [ElasticTestNode] version[2.4.0], pid[1], build[ce9f0c7/2016-08-29T09:14:17Z]
[2016-09-28 11:52:16,631][INFO ][node ] [ElasticTestNode] initializing ...
[2016-09-28 11:52:17,202][INFO ][plugins ] [ElasticTestNode] modules [reindex, lang-expression, lang-groovy], plugins [], sites []
[2016-09-28 11:52:17,219][INFO ][env ] [ElasticTestNode] using [1] data paths, mounts [[/usr/share/elasticsearch/data (/dev/sda8)]], net usable_space [5.4gb], net total_space [19.5gb], spins? [possibly], types [ext4]
[2016-09-28 11:52:17,219][INFO ][env ] [ElasticTestNode] heap size [990.7mb], compressed ordinary object pointers [true]
[2016-09-28 11:52:18,816][INFO ][node ] [ElasticTestNode] initialized
[2016-09-28 11:52:18,816][INFO ][node ] [ElasticTestNode] starting ...
[2016-09-28 11:52:18,877][INFO ][transport ] [ElasticTestNode] publish_address {172.17.0.22:9300}, bound_addresses {[::]:9300}
[2016-09-28 11:52:18,881][INFO ][discovery ] [ElasticTestNode] elasticsearch/LCo5k0dARimsWFXjN1Yu0A
[2016-09-28 11:52:21,915][INFO ][cluster.service ] [ElasticTestNode] new_master {ElasticTestNode}{LCo5k0dARimsWFXjN1Yu0A}{172.17.0.22}{172.17.0.22:9300}, reason: zen-disco-join(elected_as_master, [0] joins received)
[2016-09-28 11:52:21,924][INFO ][http ] [ElasticTestNode] publish_address {172.17.0.22:9200}, bound_addresses {[::]:9200}
[2016-09-28 11:52:21,925][INFO ][node ] [ElasticTestNode] started
[2016-09-28 11:52:21,960][INFO ][gateway ] [ElasticTestNode] recovered [0] indices into cluster_state
It seems that Docker's port mapping sometimes fails. I have experienced this issue multiple times: the same test script works after one boot but not after another.
One thing that is consistent: if it breaks on a given boot, it stays broken every time I restart the container, even after I ditch the container and start a new one from the image. It seems to be an issue with the Docker daemon.
The way I solve this is to stop all containers and restart the Docker daemon:
sudo docker stop $(sudo docker ps -a -q)
sudo systemctl restart docker
sudo docker start $(sudo docker ps -a -q)
It works for me. Hope someone would find it helpful.

Elasticsearch getting stopped and starting again and again

I am running Elasticsearch on a 4 GB instance with a 2 GB heap. Elasticsearch runs fine, but after some requests it stops and then starts back up on its own, with the following logs.
[2016-07-11 11:11:04,492][INFO ][discovery ] [Ev Teel Urizen] elasticsearch/9amXiXXmTTyV_l_s3t6gnw
[2016-07-11 11:11:07,568][INFO ][cluster.service ] [Ev Teel Urizen] new_master {Ev Teel Urizen}{9amXiXXmTTyV_l_s3t6gnw}{P.P.P.P}{P.P.P.P:9300}, reason: zen-disco-join(elected_as_master, [0] joins received)
[2016-07-11 11:11:07,611][INFO ][http ] [Ev Teel Urizen] publish_address {P.P.P.P:9200}, bound_addresses {P.P.P.P:9200}
[2016-07-11 11:11:07,611][INFO ][node ] [Ev Teel Urizen] started
[2016-07-11 11:11:07,641][INFO ][gateway ] [Ev Teel Urizen] recovered [1] indices into cluster_state
[2016-07-11 11:11:07,912][INFO ][cluster.routing.allocation] [Ev Teel Urizen] Cluster health status changed from [RED] to [GREEN] (reason: [shards started [[user-details-index][0]] ...]).
[2016-07-11 11:13:01,482][INFO ][node ] [Ev Teel Urizen] stopping ...
[2016-07-11 11:13:01,503][INFO ][node ] [Ev Teel Urizen] stopped
[2016-07-11 11:13:01,503][INFO ][node ] [Ev Teel Urizen] closing ...
[2016-07-11 11:13:01,507][INFO ][node ] [Ev Teel Urizen] closed
where P.P.P.P is the private IP of the instance.
EDIT: Logs after changing log-level to DEBUG
[2016-07-11 12:46:34,129][DEBUG][index.shard ] [Cameron Hodge] [user-details-index][0] recovery completed from [shard_store], took [106ms]
[2016-07-11 12:46:34,129][DEBUG][cluster.action.shard ] [Cameron Hodge] [user-details-index][0] sending shard started for target shard [[user-details-index][0], node[TUeznaAxQsqaFq6iDbVUVw], [P], v[149], s[INITIALIZING], a[id=j_DlCimrSoi7kff-9Ah9xw], unassigned_info[[reason=CLUSTER_RECOVERED], at[2016-07-11T12:46:33.774Z]]], indexUUID [tStiDr3PRWaKIKCIHk5h0A], message [after recovery from store]
[2016-07-11 12:46:34,129][DEBUG][cluster.action.shard ] [Cameron Hodge] received shard started for target shard [[user-details-index][0], node[TUeznaAxQsqaFq6iDbVUVw], [P], v[149], s[INITIALIZING], a[id=j_DlCimrSoi7kff-9Ah9xw], unassigned_info[[reason=CLUSTER_RECOVERED], at[2016-07-11T12:46:33.774Z]]], indexUUID [tStiDr3PRWaKIKCIHk5h0A], message [after recovery from store]
[2016-07-11 12:46:34,130][DEBUG][cluster.service ] [Cameron Hodge] processing [shard-started ([user-details-index][0], node[TUeznaAxQsqaFq6iDbVUVw], [P], v[149], s[INITIALIZING], a[id=j_DlCimrSoi7kff-9Ah9xw], unassigned_info[[reason=CLUSTER_RECOVERED], at[2016-07-11T12:46:33.774Z]]), reason [after recovery from store]]: execute
[2016-07-11 12:46:34,131][INFO ][cluster.routing.allocation] [Cameron Hodge] Cluster health status changed from [RED] to [GREEN] (reason: [shards started [[user-details-index][0]] ...]).
[2016-07-11 12:46:34,131][DEBUG][cluster.service ] [Cameron Hodge] cluster state updated, version [4], source [shard-started ([user-details-index][0], node[TUeznaAxQsqaFq6iDbVUVw], [P], v[149], s[INITIALIZING], a[id=j_DlCimrSoi7kff-9Ah9xw], unassigned_info[[reason=CLUSTER_RECOVERED], at[2016-07-11T12:46:33.774Z]]), reason [after recovery from store]]
[2016-07-11 12:46:34,131][DEBUG][cluster.service ] [Cameron Hodge] publishing cluster state version [4]
[2016-07-11 12:46:34,135][DEBUG][cluster.service ] [Cameron Hodge] set local cluster state to version 4
[2016-07-11 12:46:34,135][DEBUG][index.shard ] [Cameron Hodge] [user-details-index][0] state: [POST_RECOVERY]->[STARTED], reason [global state is [STARTED]]
[2016-07-11 12:46:34,153][DEBUG][cluster.service ] [Cameron Hodge] processing [shard-started ([user-details-index][0], node[TUeznaAxQsqaFq6iDbVUVw], [P], v[149], s[INITIALIZING], a[id=j_DlCimrSoi7kff-9Ah9xw], unassigned_info[[reason=CLUSTER_RECOVERED], at[2016-07-11T12:46:33.774Z]]), reason [after recovery from store]]: took 23ms done applying updated cluster_state (version: 4, uuid: G_fovEaNRFOfF0lHjz0h2A)
[2016-07-11 12:47:00,583][DEBUG][indices.memory ] [Cameron Hodge] recalculating shard indexing buffer, total is [203.1mb] with [1] active shards, each shard set to indexing=[203.1mb], translog=[64kb]
[2016-07-11 12:47:01,648][INFO ][node ] [Cameron Hodge] stopping ...
[2016-07-11 12:47:01,660][DEBUG][indices ] [Cameron Hodge] [user-details-index] closing ... (reason [shutdown])
[2016-07-11 12:47:01,661][DEBUG][indices ] [Cameron Hodge] [user-details-index] closing index service (reason [shutdown])
[2016-07-11 12:47:01,662][DEBUG][index ] [Cameron Hodge] [user-details-index] [0] closing... (reason: [shutdown])
[2016-07-11 12:47:01,664][DEBUG][index.shard ] [Cameron Hodge] [user-details-index][0] state: [STARTED]->[CLOSED], reason [shutdown]
[2016-07-11 12:47:01,664][DEBUG][index.shard ] [Cameron Hodge] [user-details-index][0] operations counter reached 0, will not accept any further writes
[2016-07-11 12:47:01,664][DEBUG][index.engine ] [Cameron Hodge] [user-details-index][0] flushing shard on close - this might take some time to sync files to disk
[2016-07-11 12:47:01,666][DEBUG][index.engine ] [Cameron Hodge] [user-details-index][0] close now acquiring writeLock
[2016-07-11 12:47:01,666][DEBUG][index.engine ] [Cameron Hodge] [user-details-index][0] close acquired writeLock
[2016-07-11 12:47:01,668][DEBUG][index.translog ] [Cameron Hodge] [user-details-index][0] translog closed
[2016-07-11 12:47:01,672][DEBUG][index.engine ] [Cameron Hodge] [user-details-index][0] engine closed [api]
[2016-07-11 12:47:01,672][DEBUG][index.store ] [Cameron Hodge] [user-details-index][0] store reference count on close: 0
[2016-07-11 12:47:01,672][DEBUG][index ] [Cameron Hodge] [user-details-index] [0] closed (reason: [shutdown])
[2016-07-11 12:47:01,672][DEBUG][indices ] [Cameron Hodge] [user-details-index] closing index cache (reason [shutdown])
[2016-07-11 12:47:01,672][DEBUG][index.cache.query.index ] [Cameron Hodge] [user-details-index] full cache clear, reason [close]
[2016-07-11 12:47:01,673][DEBUG][index.cache.bitset ] [Cameron Hodge] [user-details-index] clearing all bitsets because [close]
[2016-07-11 12:47:01,673][DEBUG][indices ] [Cameron Hodge] [user-details-index] clearing index field data (reason [shutdown])
[2016-07-11 12:47:01,674][DEBUG][indices ] [Cameron Hodge] [user-details-index] closing analysis service (reason [shutdown])
[2016-07-11 12:47:01,674][DEBUG][indices ] [Cameron Hodge] [user-details-index] closing mapper service (reason [shutdown])
[2016-07-11 12:47:01,674][DEBUG][indices ] [Cameron Hodge] [user-details-index] closing index query parser service (reason [shutdown])
[2016-07-11 12:47:01,680][DEBUG][indices ] [Cameron Hodge] [user-details-index] closing index service (reason [shutdown])
[2016-07-11 12:47:01,680][DEBUG][indices ] [Cameron Hodge] [user-details-index] closed... (reason [shutdown])
[2016-07-11 12:47:01,680][INFO ][node ] [Cameron Hodge] stopped
[2016-07-11 12:47:01,680][INFO ][node ] [Cameron Hodge] closing ...
[2016-07-11 12:47:01,685][INFO ][node ] [Cameron Hodge] closed
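One detail worth reading out of these logs: the orderly stopping, stopped, closing, closed sequence means the JVM received a shutdown signal (such as SIGTERM) and shut down cleanly; an OOM-killer SIGKILL or a JVM crash would not produce it. So the question becomes who sent the signal. A sketch of where to look (the service name and log paths are typical defaults, not taken from the question):

```shell
# A clean "stopping ... closed" sequence implies an external shutdown
# signal, not a crash. Check the system logs around 12:47 for the sender
# (init system, a watchdog, a deploy script, or a user session).
journalctl -u elasticsearch --since "2016-07-11 12:46" --until "2016-07-11 12:48" 2>/dev/null
grep -h elasticsearch /var/log/syslog /var/log/messages 2>/dev/null | tail -n 20
true  # neither log source existing on a given box is fine; this is a checklist
```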

how to configure elasticsearch.yml for the repository-hdfs plugin of elasticsearch

elasticsearch 2.3.2
repository-hdfs 2.3.1
I configured the elasticsearch.yml file as the official Elastic documentation shows:
repositories
  hdfs:
    uri: "hdfs://<host>:<port>/"    # optional - Hadoop file-system URI
    path: "some/path"               # required - path within the file-system where data is stored/loaded
    load_defaults: "true"           # optional - whether to load the default Hadoop configuration (default) or not
    conf_location: "extra-cfg.xml"  # optional - Hadoop configuration XML to be loaded (use commas for multi values)
    conf.<key>: "<value>"           # optional - 'inlined' key=value added to the Hadoop configuration
    concurrent_streams: 5           # optional - the number of concurrent streams (defaults to 5)
    compress: "false"               # optional - whether to compress the metadata or not (default)
    chunk_size: "10mb"              # optional - chunk size (disabled by default)
but it raises an exception; the format is incorrect.
Error info:
Exception in thread "main" SettingsException
[Failed to load settings from [elasticsearch.yml]]; nested: ScannerException[while scanning a simple key'
in 'reader', line 99, column 2:
repositories
^
could not find expected ':'
in 'reader', line 100, column 10:
hdfs:
^];
Likely root cause: while scanning a simple key
in 'reader', line 99, column 2:
repositories
^
could not find expected ':'
in 'reader', line 100, column 10:
hdfs:
I edited it as:
repositories:
  hdfs:
    uri: "hdfs://191.168.4.220:9600/"
but it doesn't work.
I want to know what the correct format is.
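For what it's worth, the scanner error above ("could not find expected ':'" while scanning a simple key at "repositories") is plain YAML syntax: every mapping key needs a colon on the same line, and nested keys need consistent indentation. The shape YAML expects here is:

```yaml
# A colon after every key; two spaces of indentation per nesting level.
repositories:
  hdfs:
    uri: "hdfs://<host>:<port>/"
    path: "some/path"
```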
I found the AWS configuration example for elasticsearch.yml:
cloud:
  aws:
    access_key: AKVAIQBF2RECL7FJWGJQ
    secret_key: vExyMThREXeRMm/b/LRzEB8jWwvzQeXgjqMX+6br

repositories:
  s3:
    bucket: "bucket_name"
    region: "us-west-2"

  private-bucket:
    bucket: <bucket not accessible by default key>
    access_key: <access key>
    secret_key: <secret key>

  remote-bucket:
    bucket: <bucket in other region>
    region: <region>

  external-bucket:
    bucket: <bucket>
    access_key: <access key>
    secret_key: <secret key>
    endpoint: <endpoint>
    protocol: <protocol>
I imitated it, but it still doesn't work.
I tried to install repository-hdfs 2.3.1 in elasticsearch 2.3.2, but it failed:
ERROR: Plugin [repository-hdfs] is incompatible with Elasticsearch [2.3.2]. Was designed for version [2.3.1]
The plugin can only be installed in elasticsearch 2.3.1.
You should specify the uri, path, and conf_location options, and remove the conf.<key> option. Take the following config as an example.
security.manager.enabled: false
repositories.hdfs:
  uri: "hdfs://master:9000"   # optional - Hadoop file-system URI
  path: "/aaa/bbb"            # required - path within the file-system where data is stored/loaded
  load_defaults: "true"       # optional - whether to load the default Hadoop configuration (default) or not
  conf_location: "/home/ec2-user/app/hadoop-2.6.3/etc/hadoop/core-site.xml,/home/ec2-user/app/hadoop-2.6.3/etc/hadoop/hdfs-site.xml"  # optional - Hadoop configuration XML to be loaded (use commas for multi values)
  concurrent_streams: 5       # optional - the number of concurrent streams (defaults to 5)
  compress: "false"           # optional - whether to compress the metadata or not (default)
  chunk_size: "10mb"          # optional - chunk size (disabled by default)
I started ES successfully:
[----#----------- elasticsearch-2.3.1]$ bin/elasticsearch
[2016-05-06 04:40:58,173][INFO ][node ] [Protector] version[2.3.1], pid[17641], build[bd98092/2016-04-04T12:25:05Z]
[2016-05-06 04:40:58,174][INFO ][node ] [Protector] initializing ...
[2016-05-06 04:40:58,830][INFO ][plugins ] [Protector] modules [reindex, lang-expression, lang-groovy], plugins [repository-hdfs], sites []
[2016-05-06 04:40:58,863][INFO ][env ] [Protector] using [1] data paths, mounts [[/ (rootfs)]], net usable_space [8gb], net total_space [9.9gb], spins? [unknown], types [rootfs]
[2016-05-06 04:40:58,863][INFO ][env ] [Protector] heap size [1007.3mb], compressed ordinary object pointers [true]
[2016-05-06 04:40:58,863][WARN ][env ] [Protector] max file descriptors [4096] for elasticsearch process likely too low, consider increasing to at least [65536]
[2016-05-06 04:40:59,192][INFO ][plugin.hadoop.hdfs ] Loaded Hadoop [1.2.1] libraries from file:/home/ec2-user/app/elasticsearch-2.3.1/plugins/repository-hdfs/
[2016-05-06 04:41:01,598][INFO ][node ] [Protector] initialized
[2016-05-06 04:41:01,598][INFO ][node ] [Protector] starting ...
[2016-05-06 04:41:01,823][INFO ][transport ] [Protector] publish_address {xxxxxxxxx:9300}, bound_addresses {xxxxxxx:9300}
[2016-05-06 04:41:01,830][INFO ][discovery ] [Protector] hdfs/9H8wli0oR3-Zp-M9ZFhNUQ
[2016-05-06 04:41:04,886][INFO ][cluster.service ] [Protector] new_master {Protector}{9H8wli0oR3-Zp-M9ZFhNUQ}{xxxxxxx}{xxxxx:9300}, reason: zen-disco-join(elected_as_master, [0] joins received)
[2016-05-06 04:41:04,908][INFO ][http ] [Protector] publish_address {xxxxxxxxx:9200}, bound_addresses {xxxxxxx:9200}
[2016-05-06 04:41:04,908][INFO ][node ] [Protector] started
[2016-05-06 04:41:05,415][INFO ][gateway ] [Protector] recovered [1] indices into cluster_state
[2016-05-06 04:41:06,097][INFO ][cluster.routing.allocation] [Protector] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[website][0], [website][0]] ...]).
But when I try to create a snapshot:
PUT /_snapshot/my_backup
{
  "type": "hdfs",
  "settings": {
    "path": "/aaa/bbb/"
  }
}
I get the following error:
Caused by: java.io.IOException: Mkdirs failed to create file:/aaa/bbb/tests-zTkKRtoZTLu3m3RLascc1w
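A hedged reading of that error: the file:/aaa/bbb/... scheme shows the repository is being created on the node's local filesystem, not on HDFS, which usually means the Hadoop configuration (and with it fs.defaultFS) was not actually picked up. It is worth double-checking that conf_location points at real files readable by the Elasticsearch process, along these lines (the paths below are illustrative, not from the question):

```yaml
# Paths are illustrative; use the actual Hadoop conf files on the node.
repositories.hdfs:
  uri: "hdfs://master:9000"   # must match fs.defaultFS in core-site.xml
  path: "/aaa/bbb"
  conf_location: "/etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml"
```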

ElasticSearch Crashing repeatedly

I run Elasticsearch 1.6.2 on a Debian GNU/Linux 8 server.
Elasticsearch falls over repeatedly, apparently for no reason.
I don't have any error messages in the Elasticsearch logs, but I have some warnings.
Before a crash:
[2016-02-11 05:10:21,893][INFO ][monitor.jvm ] [noeud-0] [gc][young][5554661][67] duration [790ms], collections [1]/[1s], total [790ms]/[1.9m], memory [320mb]->[320mb]/[1.9gb], all_pools {[young] [266.2mb]->[266.2mb]/[266.2mb]}{[survivor] [1.2mb]->[1.2mb]/[33.2mb]}{[old] [52.5mb]->[52.5mb]/[1.6gb]}
[2016-02-11 20:42:08,422][INFO ][monitor.jvm ] [noeud-0] [gc][young][5610361][68] duration [808ms], collections [1]/[1.6s], total [808ms]/[1.9m], memory [319.8mb]->[56.7mb]/[1.9gb], all_pools {[young] [265.8mb]->[3.3mb]/[266.2mb]}{[survivor] [801.6kb]->[361.7kb]/[33.2mb]}{[old] [53.2mb]->[53.2mb]/[1.6gb]}
And after the crash and restart:
[2016-02-12 12:08:22,472][INFO ][node ] [noeud-0] version[1.6.2], pid[833], build[6220391/2015-07-29T09:24:47Z]
[2016-02-12 12:08:22,473][INFO ][node ] [noeud-0] initializing ...
[2016-02-12 12:08:22,609][INFO ][plugins ] [noeud-0] loaded [], sites [head]
[2016-02-12 12:08:22,661][INFO ][env ] [noeud-0] using [1] data paths, mounts [[/ (/dev/simfs)]], net usable_space [17.2gb], net total_space [20gb], types [simfs]
[2016-02-12 12:08:25,480][INFO ][node ] [noeud-0] initialized
[2016-02-12 12:08:25,481][INFO ][node ] [noeud-0] starting ...
[2016-02-12 12:08:25,570][INFO ][transport ] [noeud-0] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/ip.ip.ip.ip:9300]}
[2016-02-12 12:08:25,649][INFO ][discovery ] [noeud-0] my_cluster_name/123abc
[2016-02-12 12:08:29,436][INFO ][cluster.service ] [noeud-0] new_master [noeud-0][123abc][java8][inet[/ip.ip.ip.ip:9300]], reason: zen-disco-join (elected_as_master)
[2016-02-12 12:08:29,462][INFO ][http ] [noeud-0] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/ip.ip.ip.ip:9200]}
[2016-02-12 12:08:29,463][INFO ][node ] [noeud-0] started
[2016-02-12 12:08:29,496][INFO ][gateway ] [noeud-0] recovered [1] indices into cluster_state
[2016-02-12 19:12:05,812][WARN ][monitor.jvm ] [noeud-0] [gc][young][25368][2] duration [1.5s], collections [1]/[2.7s], total [1.5s]/[1.5s], memory [295.2mb]->[29.3mb]/[1.9gb], all_pools {[young] [266.2mb]->[4.9mb]/[266.2mb]}{[survivor] [28.9mb]->[15mb]/[33.2mb]}{[old] [0b]->[9.4mb]/[1.6gb]}
I read that Elasticsearch could be killed by the OOM killer, but there is no /var/log/kern.log file to check that.
What can I do to investigate and find the reason why Elasticsearch crashes?
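On the OOM question specifically: even without /var/log/kern.log, the kernel ring buffer and the journal usually retain OOM-killer events. A sketch of generic Linux checks (not specific to this server):

```shell
# The OOM killer writes to the kernel ring buffer; kern.log is only one
# place distros persist it. Any of these usually shows a kill event:
dmesg 2>/dev/null | grep -iE 'out of memory|oom-kill' || true
journalctl -k --no-pager 2>/dev/null | grep -i oom || true
free -m   # sanity check: does the 2 GB heap leave the box any headroom?
```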

Elastic Search node not forming cluster

I am trying to set up an ES cluster on Windows machines using unicast. I think I have made all the required configuration changes, but my ES nodes still do not form a cluster. Could someone please let me know what I am missing? Please find my elasticsearch.yml configurations below.
=======Node 8=======
cluster.name: elasticsearch
node.name: NODE8
node.data: true
network.host: "10.249.167.8"
network.publish_host: "10.249.167.8"
network.bind: "10.249.167.8"
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["10.249.167.9", "10.249.167.10", "10.249.167.8"]
transport.tcp.port: 9300
=======Node9 Config========
cluster.name: elasticsearch
node.name: NODE9
node.data: true
network.host: "10.249.167.9"
network.publish_host: "10.249.167.9"
network.bind: "10.249.167.9"
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["10.249.167.9", "10.249.167.10", "10.249.167.8"]
transport.tcp.port: 9300
I can query both ES nodes individually, but they don't form a cluster.
Node 8 Get : http://10.249.167.8:9200/_cat/nodes?h=ip,port,heapPercent,name
10.249.167.8 9300 2 Cecilia Reyes
Node 9 Get : http://10.249.167.9:9200/_cat/nodes?h=ip,port,heapPercent,name
10.249.167.9 9300 9 Victorius
Following are the startup logs. Any help would be appreciated; I have been stuck on this for a while now. :(
[2016-02-13 01:08:06,395][WARN ][bootstrap ] unable to install syscall filter: syscall filtering not supported for OS: 'Windows Server 2012 R2'
[2016-02-13 01:08:06,645][INFO ][node ] [NODE8] version[2.1.1], pid[7628], build[40e2c53/2015-12-15T13:05:55Z]
[2016-02-13 01:08:06,645][INFO ][node ] [NODE8] initializing ...
[2016-02-13 01:08:07,020][INFO ][plugins ] [NODE8] loaded [cloud-azure], sites []
[2016-02-13 01:08:07,051][INFO ][env ] [NODE8] using [1] data paths, mounts [[(C:)]], net usable_space [94.6gb], net total_space [126.6gb], spins? [unknown], types [NTFS]
[2016-02-13 01:08:09,170][INFO ][node ] [NODE8] initialized
[2016-02-13 01:08:09,170][INFO ][node ] [NODE8] starting ...
[2016-02-13 01:08:09,357][INFO ][transport ] [NODE8] publish_address {10.249.167.8:9300}, bound_addresses {10.249.167.8:9300}
[2016-02-13 01:08:09,373][INFO ][discovery ] [NODE8] elasticsearch/i42Qv-qNSJaSoLRCt2e5tg
[2016-02-13 01:08:13,936][INFO ][cluster.service ] [NODE8] new_master {NODE8}{i42Qv-qNSJaSoLRCt2e5tg}{10.249.167.8}{10.249.167.8:9300}, reason: zen-disco-join(elected_as_master, [0] joins received)
[2016-02-13 01:08:13,983][INFO ][http ] [NODE8] publish_address {10.249.167.8:9200}, bound_addresses {10.249.167.8:9200}
[2016-02-13 01:08:13,983][INFO ][node ] [NODE8] started
[2016-02-13 01:08:16,715][INFO ][gateway ] [NODE8] recovered [1] indices into cluster_state
Node 9 Log========================================================================
[2016-02-13 01:08:44,988][WARN ][bootstrap ] unable to install syscall filter: syscall filtering not supported for OS: 'Windows Server 2012 R2'
[2016-02-13 01:08:45,237][INFO ][node ] [NODE9] version[2.1.1], pid[6468], build[40e2c53/2015-12-15T13:05:55Z]
[2016-02-13 01:08:45,237][INFO ][node ] [NODE9] initializing ...
[2016-02-13 01:08:45,601][INFO ][plugins ] [NODE9] loaded [cloud-azure], sites []
[2016-02-13 01:08:45,625][INFO ][env ] [NODE9] using [1] data paths, mounts [[(C:)]], net usable_space [113.6gb], net total_space [126.6gb], spins? [unknown], types [NTFS]
[2016-02-13 01:08:47,554][INFO ][node ] [NODE9] initialized
[2016-02-13 01:08:47,554][INFO ][node ] [NODE9] starting ...
[2016-02-13 01:08:47,753][INFO ][transport ] [NODE9] publish_address {10.249.167.9:9300}, bound_addresses {10.249.167.9:9300}
[2016-02-13 01:08:47,763][INFO ][discovery ] [NODE9] elasticsearch/ys7WjfT3QR2DqwLFr-m6Ew
[2016-02-13 01:08:52,292][INFO ][cluster.service ] [NODE9] new_master {NODE9}{ys7WjfT3QR2DqwLFr-m6Ew}{10.249.167.9}{10.249.167.9:9300}, reason: zen-disco-join(elected_as_master, [0] joins received)
[2016-02-13 01:08:52,342][INFO ][http ] [NODE9] publish_address {10.249.167.9:9200}, bound_addresses {10.249.167.9:9200}
[2016-02-13 01:08:52,342][INFO ][node ] [NODE9] started
[2016-02-13 01:08:53,649][INFO ][gateway ] [NODE9] recovered [0] indices into cluster_state
This looks like a split-brain problem: the nodes got separated and each acted as a master on its own.
From your logs it is clearly visible that your nodes have formed two different clusters.
To avoid split brain, set the minimum number of master-eligible nodes:
discovery.zen.minimum_master_nodes
It can be calculated as:
minimum master nodes = (N/2) + 1
where N is the number of nodes.
For example, if you have 3 nodes in your cluster, you can set:
discovery.zen.minimum_master_nodes: 2
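Concretely, for the two-node setup in the question, (2/2) + 1 = 2, so the addition to each node's elasticsearch.yml (alongside the settings already shown) would be:

```yaml
# With N=2, both nodes must agree before a master is elected, so neither
# node can split off and elect itself. Note that a 2-node cluster then
# has no master whenever either node is down; adding a third node is the
# usual recommendation.
discovery.zen.minimum_master_nodes: 2
```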
