Elasticsearch node keeps restarting

I have a simple 3-node cluster here at home used for testing: three master-eligible nodes, two of which are also data nodes. However, one of the two data nodes never seems to initialize its shards properly.
The cluster health and shard overview below illustrate this:
{"cluster_name":"test","status":"yellow","timed_out":false,"number_of_nodes":3,"number_of_data_nodes":2,"active_primary_shards":28,"active_shards":50,"relocating_shards":0,"initializing_shards":2,"unassigned_shards":4,"number_of_pending_tasks":0}
The shard overview indicates that the various shards are being initialized:
.marvel-2015.05.28 0 r INITIALIZING 127.0.0.1 ES2
.marvel-2015.05.28 0 p STARTED 7656 21.3mb 127.0.0.1 ES1
.kibana 0 r STARTED 127.0.0.1 ES2
.kibana 0 p STARTED 16 13.8kb 127.0.0.1 ES1
kibana-int 4 p STARTED 0 115b 127.0.0.1 ES1
kibana-int 2 r INITIALIZING 127.0.0.1 ES2
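(For reference, output like the health document and shard listing above can be retrieved with the cluster health and cat shards APIs; localhost:9200 is assumed here:)
curl -s 'http://localhost:9200/_cluster/health?pretty'
curl -s 'http://localhost:9200/_cat/shards?v'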
However, at some point during this process, the node just dies and restarts.
{"cluster_name":"olympus","status":"yellow","timed_out":false,"number_of_nodes":2,"number_of_data_nodes":1,"active_primary_shards":28,"active_shards":28,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":28,"number_of_pending_tasks":0}
The node eventually comes back up and the process described above starts over, but then the node eventually dies yet again.
The logs on the problematic node don't seem to say much either:
[2015-05-28 11:31:38,984][INFO ][cluster.service ] [ES2] detected_master [ES1][mZUzucXKTnGUdfXNx1OYEQ][es1][inet[/192.168.88.22:9300]], added {[ES3][onaC5Y2bSd-nVTDRXsxjBg][es3pi][inet[/192.168.88.21:9300]]{data=false, master=true},[ES1][mZUzucXKTnGUdfXNx1OYEQ][es1][inet[/192.168.88.22:9300]],}, reason: zen-disco-receive(from master [[ES1][mZUzucXKTnGUdfXNx1OYEQ][es1][inet[/192.168.88.22:9300]]])
[2015-05-28 11:31:40,717][INFO ][http ] [ES2] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/192.168.88.20:9200]}
[2015-05-28 11:31:40,722][INFO ][node ] [ES2] started
[2015-05-28 11:34:21,024][INFO ][node ] [ES2] version[1.5.2], pid[591], build[62ff986/2015-04-27T09:21:06Z]
[2015-05-28 11:34:21,065][INFO ][node ] [ES2] initializing ...
[2015-05-28 11:34:21,826][INFO ][plugins ] [ES2] loaded [marvel], sites [marvel]
[2015-05-28 11:35:41,387][INFO ][node ] [ES2] initialized
[2015-05-28 11:35:41,401][INFO ][node ] [ES2] starting ...
[2015-05-28 11:35:43,174][INFO ][transport ] [ES2] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/192.168.88.20:9300]}
[2015-05-28 11:35:43,538][INFO ][discovery ] [ES2] olympus/FQWotgreS2iTZ_NAJEBmaA
[2015-05-28 11:35:47,758][INFO ][cluster.service ] [ES2] detected_master [ES1][mZUzucXKTnGUdfXNx1OYEQ][es1][inet[/192.168.88.22:9300]], added {[ES3][onaC5Y2bSd-nVTDRXsxjBg][es3pi][inet[/192.168.88.21:9300]]{data=false, master=true},[ES1][mZUzucXKTnGUdfXNx1OYEQ][es1][inet[/192.168.88.22:9300]],}, reason: zen-disco-receive(from master [[ES1][mZUzucXKTnGUdfXNx1OYEQ][es1][inet[/192.168.88.22:9300]]])
[2015-05-28 11:35:49,442][INFO ][http ] [ES2] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/192.168.88.20:9200]}
[2015-05-28 11:35:49,447][INFO ][node ] [ES2] started
[2015-05-28 11:39:40,064][INFO ][node ] [ES2] version[1.5.2], pid[655], build[62ff986/2015-04-27T09:21:06Z]
[2015-05-28 11:39:40,088][INFO ][node ] [ES2] initializing ...
[2015-05-28 11:39:40,880][INFO ][plugins ] [ES2] loaded [marvel], sites [marvel]
[2015-05-28 11:41:03,892][INFO ][node ] [ES2] initialized
[2015-05-28 11:41:03,906][INFO ][node ] [ES2] starting ...
[2015-05-28 11:41:05,665][INFO ][transport ] [ES2] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/192.168.88.20:9300]}
[2015-05-28 11:41:06,042][INFO ][discovery ] [ES2] olympus/5JH1JO7PRICesi0tadHZJw
[2015-05-28 11:41:10,233][INFO ][cluster.service ] [ES2] detected_master [ES1][mZUzucXKTnGUdfXNx1OYEQ][es1][inet[/192.168.88.22:9300]], added {[ES3][onaC5Y2bSd-nVTDRXsxjBg][es3pi][inet[/192.168.88.21:9300]]{data=false, master=true},[ES1][mZUzucXKTnGUdfXNx1OYEQ][es1][inet[/192.168.88.22:9300]],}, reason: zen-disco-receive(from master [[ES1][mZUzucXKTnGUdfXNx1OYEQ][es1][inet[/192.168.88.22:9300]]])
[2015-05-28 11:41:11,844][INFO ][http ] [ES2] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/192.168.88.20:9200]}
[2015-05-28 11:41:11,850][INFO ][node ] [ES2] started
[2015-05-28 11:44:47,113][INFO ][node ] [ES2] version[1.5.2], pid[763], build[62ff986/2015-04-27T09:21:06Z]
[2015-05-28 11:44:47,173][INFO ][node ] [ES2] initializing ...
[2015-05-28 11:44:47,985][INFO ][plugins ] [ES2] loaded [marvel], sites [marvel]
[2015-05-28 11:46:08,003][INFO ][node ] [ES2] initialized
[2015-05-28 11:46:08,017][INFO ][node ] [ES2] starting ...
[2015-05-28 11:46:09,818][INFO ][transport ] [ES2] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/192.168.88.20:9300]}
[2015-05-28 11:46:10,184][INFO ][discovery ] [ES2] olympus/AjDpUJt1SK2ncVah2ZVQJA
[2015-05-28 11:46:14,397][INFO ][cluster.service ] [ES2] detected_master [ES1][mZUzucXKTnGUdfXNx1OYEQ][es1][inet[/192.168.88.22:9300]], added {[ES3][onaC5Y2bSd-nVTDRXsxjBg][es3pi][inet[/192.168.88.21:9300]]{data=false, master=true},[ES1][mZUzucXKTnGUdfXNx1OYEQ][es1][inet[/192.168.88.22:9300]],}, reason: zen-disco-receive(from master [[ES1][mZUzucXKTnGUdfXNx1OYEQ][es1][inet[/192.168.88.22:9300]]])
[2015-05-28 11:46:16,056][INFO ][http ] [ES2] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/192.168.88.20:9200]}
[2015-05-28 11:46:16,061][INFO ][node ] [ES2] started
[2015-05-28 11:49:49,595][INFO ][node ] [ES2] version[1.5.2], pid[824], build[62ff986/2015-04-27T09:21:06Z]
[2015-05-28 11:49:49,628][INFO ][node ] [ES2] initializing ...
[2015-05-28 11:49:50,394][INFO ][plugins ] [ES2] loaded [marvel], sites [marvel]
[2015-05-28 11:51:17,611][INFO ][node ] [ES2] initialized
[2015-05-28 11:51:17,626][INFO ][node ] [ES2] starting ...
[2015-05-28 11:51:19,409][INFO ][transport ] [ES2] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/192.168.88.20:9300]}
[2015-05-28 11:51:19,775][INFO ][discovery ] [ES2] olympus/KVeatmVAR4aGzumHUy4PQA
[2015-05-28 11:51:24,007][INFO ][cluster.service ] [ES2] detected_master [ES1][mZUzucXKTnGUdfXNx1OYEQ][es1][inet[/192.168.88.22:9300]], added {[ES3][onaC5Y2bSd-nVTDRXsxjBg][es3pi][inet[/192.168.88.21:9300]]{data=false, master=true},[ES1][mZUzucXKTnGUdfXNx1OYEQ][es1][inet[/192.168.88.22:9300]],}, reason: zen-disco-receive(from master [[ES1][mZUzucXKTnGUdfXNx1OYEQ][es1][inet[/192.168.88.22:9300]]])
[2015-05-28 11:51:25,739][INFO ][http ] [ES2] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/192.168.88.20:9200]}
[2015-05-28 11:51:25,744][INFO ][node ] [ES2] started
[2015-05-28 11:55:05,150][INFO ][node ] [ES2] version[1.5.2], pid[933], build[62ff986/2015-04-27T09:21:06Z]
[2015-05-28 11:55:05,157][INFO ][node ] [ES2] initializing ...
[2015-05-28 11:55:06,032][INFO ][plugins ] [ES2] loaded [marvel], sites [marvel]
[2015-05-28 11:56:24,370][INFO ][node ] [ES2] initialized
[2015-05-28 11:56:24,384][INFO ][node ] [ES2] starting ...
[2015-05-28 11:56:26,136][INFO ][transport ] [ES2] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/192.168.88.20:9300]}
[2015-05-28 11:56:26,496][INFO ][discovery ] [ES2] olympus/DVO0kHFbQf66qX95DHfzjw
[2015-05-28 11:56:30,717][INFO ][cluster.service ] [ES2] detected_master [ES1][mZUzucXKTnGUdfXNx1OYEQ][es1][inet[/192.168.88.22:9300]], added {[ES3][onaC5Y2bSd-nVTDRXsxjBg][es3pi][inet[/192.168.88.21:9300]]{data=false, master=true},[ES1][mZUzucXKTnGUdfXNx1OYEQ][es1][inet[/192.168.88.22:9300]],}, reason: zen-disco-receive(from master [[ES1][mZUzucXKTnGUdfXNx1OYEQ][es1][inet[/192.168.88.22:9300]]])
[2015-05-28 11:56:32,396][INFO ][http ] [ES2] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/192.168.88.20:9200]}
[2015-05-28 11:56:32,410][INFO ][node ] [ES2] started
Anyone have any ideas on what's going wrong here? It would be much appreciated!

Related

Wagtail Elasticsearch failed on tags

I created a customised Document model for the Wagtail admin by subclassing Document:
class CustomizedDocument(Document):
...
I have also updated settings.WAGTAILDOCS_DOCUMENT_MODEL.
However, I realised that my search on tags fails. I suspect it has something to do with Elasticsearch, but I'm really new to it.
Here is the log output from Elasticsearch:
[INFO ][o.e.n.Node ] initialized
[INFO ][o.e.n.Node ] [oceLPbj] starting ...
[INFO ][o.e.t.TransportService ] [oceLPbj] publish_address {127.0.0.1:9300}, bound_addresses {[fe80::1]:9300}, {[::1]:9300}, {127.0.0.1:9300}
[INFO ][o.e.c.s.ClusterService ] [oceLPbj] new_master {oceLPbj}{oceLPbjSQ7ib2pTjx9gpPg}{WDDcwdlISnu-EW8mjQcOxQ}{127.0.0.1}{127.0.0.1:9300}, reason: zen-disco-elected-as-master ([0] nodes joined)
[INFO ][o.e.h.n.Netty4HttpServerTransport] [oceLPbj] publish_address {127.0.0.1:9200}, bound_addresses {[fe80::1]:9200}, {[::1]:9200}, {127.0.0.1:9200}
[INFO ][o.e.n.Node ] [oceLPbj] started
[INFO ][o.e.g.GatewayService ] [oceLPbj] recovered [4] indices into cluster_state
[INFO ][o.e.c.r.a.AllocationService] [oceLPbj] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[wagtail__wagtaildocs_document][1], [wagtail__wagtaildocs_document][3]] ...]).
[INFO ][o.e.c.m.MetaDataCreateIndexService] [oceLPbj] [wagtail__wagtaildocs_document_awv9f81] creating index, cause [api], templates [], shards [5]/[1], mappings []
[INFO ][o.e.c.m.MetaDataMappingService] [oceLPbj] [wagtail__wagtaildocs_document_awv9f81/CzKReMnsQ9qziLnOQQ5K3g] create_mapping [wagtaildocs_abstractdocument_wagtaildocs_document_distributor_portal_customizeddocument]
[INFO ][o.e.c.m.MetaDataMappingService] [oceLPbj] [wagtail__wagtaildocs_document_awv9f81/CzKReMnsQ9qziLnOQQ5K3g] create_mapping [wagtaildocs_abstractdocument_wagtaildocs_document]
Could anyone help me figure out what's going on here? Any help is appreciated. Let me know what other information I should provide.

Elasticsearch auto discovery with AWS

Hi, I'm running Elasticsearch 1.6.0 and the AWS plugin 2.6.0 on Windows 2008 in Amazon EC2.
I have the AWS plugin set up and I don't get any exceptions in the logs, but the nodes can't seem to discover each other. Here is my configuration:
bootstrap.mlockall: true
cluster.name: my-cluster
node.name: "ES MASTER 01"
node.data: false
node.master: true
plugin.mandatory: "cloud-aws"
cloud.aws.access_key: "AK...Z7Q"
cloud.aws.secret_key: "gKW...nAO"
cloud.aws.region: "us-east"
discovery.zen.minimum_master_nodes: 1
discovery.type: "ec2"
discovery.ec2.groups: "Elastic Search"
discovery.ec2.ping_timeout: "30s"
discovery.ec2.availability_zones: "us-east-1a"
discovery.zen.ping.multicast.enabled: false
Logs:
[2015-07-13 15:02:19,346][INFO ][node ] [ES MASTER 01] version[1.6.0], pid[2532], build[cdd3ac4/2015-06-09T13:36:34Z]
[2015-07-13 15:02:19,346][INFO ][node ] [ES MASTER 01] initializing ...
[2015-07-13 15:02:19,378][INFO ][plugins ] [ES MASTER 01] loaded [cloud-aws], sites []
[2015-07-13 15:02:19,440][INFO ][env ] [ES MASTER 01] using [1] data paths, mounts [[(C:)]], net usable_space [6.8gb], net total_space [29.9gb], types [NTFS]
[2015-07-13 15:02:26,461][INFO ][node ] [ES MASTER 01] initialized
[2015-07-13 15:02:26,461][INFO ][node ] [ES MASTER 01] starting ...
[2015-07-13 15:02:26,851][INFO ][transport ] [ES MASTER 01] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/172.30.0.123:9300]}
[2015-07-13 15:02:26,866][INFO ][discovery ] [ES MASTER 01] my-cluster/SwhSDhiDQzq4pM8jkhIuzw
[2015-07-13 15:02:56,884][WARN ][discovery ] [ES MASTER 01] waited for 30s and no initial state was set by the discovery
[2015-07-13 15:02:56,962][INFO ][http ] [ES MASTER 01] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/172.30.0.123:9200]}
[2015-07-13 15:02:56,962][INFO ][node ] [ES MASTER 01] started
[2015-07-13 15:03:13,455][INFO ][cluster.service ] [ES MASTER 01] new_master [ES MASTER 01][SwhSDhiDQzq4pM8jkhIuzw][WIN-3Q4EH3B8H1O][inet[/172.30.0.123:9300]]{data=false, master=true}, reason: zen-disco-join (elected_as_master)
[2015-07-13 15:03:13,517][INFO ][gateway ] [ES MASTER 01] recovered [0] indices into cluster_state
It can certainly work with private IPs, but only if your node instances are able to query EC2 instance information in the same VPC in order to find out about the cluster they should join.
You can grant the required discovery permissions with a policy like the following, applied to your IAM role:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "whatever",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "ec2:DescribeAvailabilityZones",
        "ec2:DescribeInstances",
        "ec2:DescribeRegions",
        "ec2:DescribeSecurityGroups",
        "ec2:DescribeTags"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}
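If you manage IAM from the command line, a policy like the one above can be attached to the role with the AWS CLI. A minimal sketch, where the role name, policy name, and file path are placeholders:
aws iam put-role-policy --role-name es-node-role --policy-name es-ec2-discovery --policy-document file://es-discovery-policy.json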

Elasticsearch not working with SugarCRM 7.5.0.1

I have upgraded SugarCRM 7.2.2.0 to 7.5.0.1.
I then updated Java to version 7 and Elasticsearch to 1.3.1.
After starting Elasticsearch and launching the indexing, the global search still does not return any results.
Here is the output of Elasticsearch when it is launched:
/usr/local/bin/elasticsearch/bin/elasticsearch
[2014-12-17 09:11:37,057][INFO ][node ] [Wild Child] version[1.3.1], pid[19801], build[2de6dc5/2014-07-28T14:45:15Z]
[2014-12-17 09:11:37,059][INFO ][node ] [Wild Child] initializing...
[2014-12-17 09:11:37,066][INFO ][plugins ] [Wild Child] loaded [], sites []
[2014-12-17 09:11:39,896][WARN ][common.network ] failed to resolve local host, fallback to loopback
java.net.UnknownHostException: sm4.localdomain: sm4.localdomain: Name or service not known
at java.net.InetAddress.getLocalHost(InetAddress.java:1473)
at org.elasticsearch.common.network.NetworkUtils.<clinit>(NetworkUtils.java:54)
at org.elasticsearch.transport.netty.NettyTransport.<init>(NettyTransport.java:204)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
...
Caused by: java.net.UnknownHostException: sm4.localdomain: Name or service not known
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:901)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1293)
at java.net.InetAddress.getLocalHost(InetAddress.java:1469)
... 62 more
[2014-12-17 09:11:40,779][INFO ][node ] [Wild Child] initialized
[2014-12-17 09:11:40,780][INFO ][node ] [Wild Child] starting ...
[2014-12-17 09:11:41,002][INFO ][transport ] [Wild Child] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/92.39.241.87:9300]}
[2014-12-17 09:11:41,046][INFO ][discovery ] [Wild Child] elasticsearch/OeRmy39vTz2WTcnjSXoHHA
[2014-12-17 09:11:44,095][INFO ][cluster.service ] [Wild Child] new_master [Wild Child][OeRmy39vTz2WTcnjSXoHHA][localhost][inet[/92.39.241.87:9300]], reason: zen-disco-join (elected_as_master)
[2014-12-17 09:11:44,133][INFO ][http ] [Wild Child] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/92.39.241.87:9200]}
[2014-12-17 09:11:44,134][INFO ][node ] [Wild Child] started
[2014-12-17 09:11:45,039][INFO ][gateway ] [Wild Child] recovered [1] indices into cluster_state
localhost:9200 is accessible, though.
Then, when I schedule a system index in Sugar, nothing seems to happen in any Elasticsearch log.
Has anyone had this problem before?
Any help will be much appreciated!
Thank you
Cheers, Victor
I don't know if you have this resolved yet. I recently did the same upgrade and had to update my Java version to 1.7; additionally, I had to ensure that my log and data locations were writable by the elasticsearch user and group.
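A minimal sketch of the ownership fix, assuming the usual package-install paths (adjust them to your own layout):
sudo chown -R elasticsearch:elasticsearch /var/log/elasticsearch /var/lib/elasticsearch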
I eventually solved my issue.
Upgrading Java is indeed necessary.
For my part, I also had to add a line in the crontab that calls cron.php to make Elasticsearch work with SugarCRM.
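For reference, the crontab entry looks roughly like this; the Sugar install path here is an assumption, so use your own:
* * * * * cd /var/www/html/sugarcrm && php -f cron.php > /dev/null 2>&1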

Elasticsearch doesn't respond after index recovery

ES stopped responding to any requests after losing an index (for an unknown reason). After a server restart, ES tries to recover the index, but as soon as it has read the entire index (only about 200 MB) it stops responding. The last error I saw was SearchPhaseExecutionException[Failed to execute phase [query_fetch], all shards failed]. I'm running ES on a single-node virtual server. The index has only one shard with about 3 million documents (200 MB).
How can I recover this index?
Here's the ES log:
[2014-06-21 18:43:15,337][WARN ][bootstrap ] jvm uses the client vm, make sure to run `java` with the server vm for best performance by adding `-server` to the command line
[2014-06-21 18:43:15,554][WARN ][common.jna ] Unknown mlockall error 0
[2014-06-21 18:43:15,759][INFO ][node ] [Crimson Cowl] version[1.1.0], pid[1031], build[2181e11/2014-03-25T15:59:51Z]
[2014-06-21 18:43:15,759][INFO ][node ] [Crimson Cowl] initializing ...
[2014-06-21 18:43:15,881][INFO ][plugins ] [Crimson Cowl] loaded [], sites [head]
[2014-06-21 18:43:21,957][INFO ][node ] [Crimson Cowl] initialized
[2014-06-21 18:43:21,958][INFO ][node ] [Crimson Cowl] starting ...
[2014-06-21 18:43:22,275][INFO ][transport ] [Crimson Cowl] bound_address {inet[/10.0.0.13:9300]}, publish_address {inet[/10.0.0.13:9300]}
[2014-06-21 18:43:25,385][INFO ][cluster.service ] [Crimson Cowl] new_master [Crimson Cowl][UJNl8hGgRzeFo-DQ3vk2nA][esubuntu][inet[/10.0.0.13:9300]], reason: zen-disco-join (elected_as_master)
[2014-06-21 18:43:25,438][INFO ][discovery ] [Crimson Cowl] elasticsearch/UJNl8hGgRzeFo-DQ3vk2nA
[2014-06-21 18:43:25,476][INFO ][http ] [Crimson Cowl] bound_address {inet[/10.0.0.13:9200]}, publish_address {inet[/10.0.0.13:9200]}
[2014-06-21 18:43:26,348][INFO ][gateway ] [Crimson Cowl] recovered [2] indices into cluster_state
[2014-06-21 18:43:26,349][INFO ][node ] [Crimson Cowl] started
After deleting another index on the same node, ES responds to requests but still fails to recover the index. Here's the log:
[2014-06-22 08:00:06,651][WARN ][bootstrap ] jvm uses the client vm, make sure to run `java` with the server vm for best performance by adding `-server` to the command line
[2014-06-22 08:00:06,699][WARN ][common.jna ] Unknown mlockall error 0
[2014-06-22 08:00:06,774][INFO ][node ] [Baron Macabre] version[1.1.0], pid[2035], build[2181e11/2014-03-25T15:59:51Z]
[2014-06-22 08:00:06,774][INFO ][node ] [Baron Macabre] initializing ...
[2014-06-22 08:00:06,779][INFO ][plugins ] [Baron Macabre] loaded [], sites [head]
[2014-06-22 08:00:08,766][INFO ][node ] [Baron Macabre] initialized
[2014-06-22 08:00:08,767][INFO ][node ] [Baron Macabre] starting ...
[2014-06-22 08:00:08,824][INFO ][transport ] [Baron Macabre] bound_address {inet[/10.0.0.3:9300]}, publish_address {inet[/10.0.0.3:9300]}
[2014-06-22 08:00:11,890][INFO ][cluster.service ] [Baron Macabre] new_master [Baron Macabre][eWDP4ZSXSGuASJLJ2an1nQ][esubuntu][inet[/10.0.0.3:9300]], reason: zen-disco-join (elected_as_master)
[2014-06-22 08:00:11,975][INFO ][discovery ] [Baron Macabre] elasticsearch/eWDP4ZSXSGuASJLJ2an1nQ
[2014-06-22 08:00:12,000][INFO ][http ] [Baron Macabre] bound_address {inet[/10.0.0.3:9200]}, publish_address {inet[/10.0.0.3:9200]}
[2014-06-22 08:00:12,645][INFO ][gateway ] [Baron Macabre] recovered [1] indices into cluster_state
[2014-06-22 08:00:12,647][INFO ][node ] [Baron Macabre] started
[2014-06-22 08:05:01,284][WARN ][index.engine.internal ] [Baron Macabre] [wordstat][0] failed engine
java.lang.OutOfMemoryError: Java heap space
at org.apache.lucene.index.ParallelPostingsArray.<init>(ParallelPostingsArray.java:35)
at org.apache.lucene.index.FreqProxTermsWriterPerField$FreqProxPostingsArray.<init>(FreqProxTermsWriterPerField.java:254)
at org.apache.lucene.index.FreqProxTermsWriterPerField$FreqProxPostingsArray.newInstance(FreqProxTermsWriterPerField.java:279)
at org.apache.lucene.index.ParallelPostingsArray.grow(ParallelPostingsArray.java:48)
at org.apache.lucene.index.TermsHashPerField$PostingsBytesStartArray.grow(TermsHashPerField.java:307)
at org.apache.lucene.util.BytesRefHash.add(BytesRefHash.java:324)
at org.apache.lucene.index.TermsHashPerField.add(TermsHashPerField.java:185)
at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:171)
at org.apache.lucene.index.DocFieldProcessor.processDocument(DocFieldProcessor.java:248)
at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:253)
at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:453)
at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1529)
at org.elasticsearch.index.engine.internal.InternalEngine.innerIndex(InternalEngine.java:532)
at org.elasticsearch.index.engine.internal.InternalEngine.index(InternalEngine.java:470)
at org.elasticsearch.index.shard.service.InternalIndexShard.performRecoveryOperation(InternalIndexShard.java:744)
at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:228)
at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:197)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
[2014-06-22 08:05:02,168][WARN ][cluster.action.shard ] [Baron Macabre] [wordstat][0] sending failed shard for [wordstat][0], node[eWDP4ZSXSGuASJLJ2an1nQ], [P], s[INITIALIZING], indexUUID [LC3LMLxgS3CkkG_pvfTeSg], reason [engine failure, message [OutOfMemoryError[Java heap space]]]
[2014-06-22 08:05:02,169][WARN ][cluster.action.shard ] [Baron Macabre] [wordstat][0] received shard failed for [wordstat][0], node[eWDP4ZSXSGuASJLJ2an1nQ], [P], s[INITIALIZING], indexUUID [LC3LMLxgS3CkkG_pvfTeSg], reason [engine failure, message [OutOfMemoryError[Java heap space]]]
[2014-06-22 08:53:22,253][INFO ][node ] [Baron Macabre] stopping ...
[2014-06-22 08:53:22,267][INFO ][node ] [Baron Macabre] stopped
[2014-06-22 08:53:22,267][INFO ][node ] [Baron Macabre] closing ...
[2014-06-22 08:53:22,272][INFO ][node ] [Baron Macabre] closed
[2014-06-22 08:53:23,667][WARN ][bootstrap ] jvm uses the client vm, make sure to run `java` with the server vm for best performance by adding `-server` to the command line
[2014-06-22 08:53:23,708][WARN ][common.jna ] Unknown mlockall error 0
[2014-06-22 08:53:23,777][INFO ][node ] [Living Totem] version[1.1.0], pid[2137], build[2181e11/2014-03-25T15:59:51Z]
[2014-06-22 08:53:23,777][INFO ][node ] [Living Totem] initializing ...
[2014-06-22 08:53:23,781][INFO ][plugins ] [Living Totem] loaded [], sites [head]
[2014-06-22 08:53:25,828][INFO ][node ] [Living Totem] initialized
[2014-06-22 08:53:25,828][INFO ][node ] [Living Totem] starting ...
[2014-06-22 08:53:25,885][INFO ][transport ] [Living Totem] bound_address {inet[/10.0.0.3:9300]}, publish_address {inet[/10.0.0.3:9300]}
[2014-06-22 08:53:28,913][INFO ][cluster.service ] [Living Totem] new_master [Living Totem][D-eoRm7fSrCU_dTw_NQipA][esubuntu][inet[/10.0.0.3:9300]], reason: zen-disco-join (elected_as_master)
[2014-06-22 08:53:28,939][INFO ][discovery ] [Living Totem] elasticsearch/D-eoRm7fSrCU_dTw_NQipA
[2014-06-22 08:53:28,964][INFO ][http ] [Living Totem] bound_address {inet[/10.0.0.3:9200]}, publish_address {inet[/10.0.0.3:9200]}
[2014-06-22 08:53:29,433][INFO ][gateway ] [Living Totem] recovered [1] indices into cluster_state
[2014-06-22 08:53:29,433][INFO ][node ] [Living Totem] started
[2014-06-22 08:58:05,268][WARN ][index.engine.internal ] [Living Totem] [wordstat][0] failed engine
java.lang.OutOfMemoryError: Java heap space
at org.apache.lucene.index.FreqProxTermsWriterPerField$FreqProxPostingsArray.<init>(FreqProxTermsWriterPerField.java:261)
at org.apache.lucene.index.FreqProxTermsWriterPerField$FreqProxPostingsArray.newInstance(FreqProxTermsWriterPerField.java:279)
at org.apache.lucene.index.ParallelPostingsArray.grow(ParallelPostingsArray.java:48)
at org.apache.lucene.index.TermsHashPerField$PostingsBytesStartArray.grow(TermsHashPerField.java:307)
at org.apache.lucene.util.BytesRefHash.add(BytesRefHash.java:324)
at org.apache.lucene.index.TermsHashPerField.add(TermsHashPerField.java:185)
at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:171)
at org.apache.lucene.index.DocFieldProcessor.processDocument(DocFieldProcessor.java:248)
at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:253)
at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:453)
at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1529)
at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1199)
at org.elasticsearch.index.engine.internal.InternalEngine.innerIndex(InternalEngine.java:523)
at org.elasticsearch.index.engine.internal.InternalEngine.index(InternalEngine.java:470)
at org.elasticsearch.index.shard.service.InternalIndexShard.performRecoveryOperation(InternalIndexShard.java:744)
at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:228)
at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:197)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
[2014-06-22 08:58:06,046][WARN ][cluster.action.shard ] [Living Totem] [wordstat][0] sending failed shard for [wordstat][0], node[D-eoRm7fSrCU_dTw_NQipA], [P], s[INITIALIZING], indexUUID [LC3LMLxgS3CkkG_pvfTeSg], reason [engine failure, message [OutOfMemoryError[Java heap space]]]
[2014-06-22 08:58:06,047][WARN ][cluster.action.shard ] [Living Totem] [wordstat][0] received shard failed for [wordstat][0], node[D-eoRm7fSrCU_dTw_NQipA], [P], s[INITIALIZING], indexUUID [LC3LMLxgS3CkkG_pvfTeSg], reason [engine failure, message [OutOfMemoryError[Java heap space]]]
In order to recover your Elasticsearch cluster you will need to allocate more memory to the heap. As you are running on a fairly small instance this may be a bit challenging, but here is what you will need to do:
Change the configuration to allocate more memory to the heap. It's not clear what your current settings are, but there are several ways to boost this - the easiest is to set the environment variable ES_HEAP_SIZE (see the sketch below). I'd start with 1GB, try that, and then boost it in small increments, as you are already near the limit of what you can do with a 1.6GB memory instance. Alternatively you may make changes to the files used to launch Elasticsearch - it depends on how you have it installed, but they should be in the bin directory underneath the Elasticsearch home directory. For a Linux installation the files are elasticsearch and elasticsearch.in.sh.
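A minimal sketch of the environment-variable route, assuming you start Elasticsearch from its home directory on the same box:
export ES_HEAP_SIZE=1g
./bin/elasticsearch
The 1.x startup scripts read ES_HEAP_SIZE and use it for both -Xms and -Xmx, so this avoids editing elasticsearch.in.sh directly.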
Move to a larger instance. This would be much easier to recover from on a system with more memory - so if the above step does not work, you could copy all your files to another, larger instance and try the above steps again with a larger heap size.
What has happened here is that your server has become overloaded; possibly there is a bad sector. What you need to do is delete your existing indices and re-index them.
On Linux, the Elasticsearch files are kept in /usr/local/var/elasticsearch/ (the exact location depends on how it was installed).
Delete this folder, then repopulate your index.
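If you would rather drop just the broken index through the API instead of removing files on disk, the delete index call does it; the index name is taken from the log above and localhost:9200 is assumed:
curl -XDELETE 'http://localhost:9200/wordstat'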

Elasticsearch restarts once every 10 min

I'm having a problem where Elasticsearch restarts once every 10 minutes.
Here is the log file:
[2012-01-11 21:55:15,059][INFO ][node ] [Tyrannosaur] {0.18.7}[22401]: stopping ...
[2012-01-11 21:55:15,416][INFO ][node ] [Tyrannosaur] {0.18.7}[22401]: stopped
[2012-01-11 21:55:15,417][INFO ][node ] [Tyrannosaur] {0.18.7}[22401]: closing ...
[2012-01-11 21:55:15,443][INFO ][node ] [Tyrannosaur] {0.18.7}[22401]: closed
[2012-01-11 21:55:22,364][INFO ][node ] [Williams, Eric] {0.18.7}[22961]: initializing ...
[2012-01-11 21:55:22,376][INFO ][plugins ] [Williams, Eric] loaded [], sites []
[2012-01-11 21:55:26,245][INFO ][node ] [Williams, Eric] {0.18.7}[22961]: initialized
[2012-01-11 21:55:26,245][INFO ][node ] [Williams, Eric] {0.18.7}[22961]: starting ...
[2012-01-11 21:55:26,364][INFO ][transport ] [Williams, Eric] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/91.123.195.252:9300]}
[2012-01-11 21:55:29,421][INFO ][cluster.service ] [Williams, Eric] new_master [Williams, Eric][xPA_opsKQGStNtubxelOQQ][inet[/91.123.195.252:9300]], reason: zen-disco-join (elected_as_master)
[2012-01-11 21:55:29,527][INFO ][discovery ] [Williams, Eric] resp/xPA_opsKQGStNtubxelOQQ
[2012-01-11 21:55:29,903][INFO ][http ] [Williams, Eric] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/91.123.195.252:9200]}
[2012-01-11 21:55:29,905][INFO ][node ] [Williams, Eric] {0.18.7}[22961]: started
[2012-01-11 21:55:32,511][INFO ][gateway ] [Williams, Eric] recovered [1] indices into cluster_state
[2012-01-11 21:56:56,137][INFO ][node ] [Williams, Eric] {0.18.7}[22961]: stopping ...
[2012-01-11 21:56:56,236][INFO ][node ] [Williams, Eric] {0.18.7}[22961]: stopped
[2012-01-11 21:56:56,237][INFO ][node ] [Williams, Eric] {0.18.7}[22961]: closing ...
[2012-01-11 21:56:56,262][INFO ][node ] [Williams, Eric] {0.18.7}[22961]: closed
[2012-01-11 21:57:03,026][INFO ][node ] [Carnivore] {0.18.7}[23075]: initializing ...
[2012-01-11 21:57:03,041][INFO ][plugins ] [Carnivore] loaded [], sites []
[2012-01-11 21:57:07,682][INFO ][node ] [Carnivore] {0.18.7}[23075]: initialized
[2012-01-11 21:57:07,683][INFO ][node ] [Carnivore] {0.18.7}[23075]: starting ...
[2012-01-11 21:57:07,841][INFO ][transport ] [Carnivore] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/91.123.195.252:9300]}
[2012-01-11 21:57:10,925][INFO ][cluster.service ] [Carnivore] new_master [Carnivore][qFbBoUEeQEuqH5suILfsww][inet[/91.123.195.252:9300]], reason: zen-disco-join (elected_as_master)
[2012-01-11 21:57:10,987][INFO ][discovery ] [Carnivore] resp/qFbBoUEeQEuqH5suILfsww
[2012-01-11 21:57:11,246][INFO ][http ] [Carnivore] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/91.123.195.252:9200]}
[2012-01-11 21:57:11,248][INFO ][node ] [Carnivore] {0.18.7}[23075]: started
[2012-01-11 21:57:13,001][INFO ][gateway ] [Carnivore] recovered [1] indices into cluster_state
Config file
cluster:
  name: resp
path:
  logs: /tmp/
  data: /opt/www/resp/shared/elasticsearch
This is how Elasticsearch is started:
elasticsearch -f -Des.config=/opt/www/resp/current/config/elasticsearch.yml
Version
ElasticSearch Version: 0.18.7, JVM: 14.0-b16
Does anyone know what the problem might be?
Elasticsearch requires Java 1.6 or greater. Consider upgrading your JVM:
http://www.elasticsearch.org/guide/reference/setup/installation.html
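To check which JVM the node is actually using (Elasticsearch uses JAVA_HOME if it is set, otherwise the java found on the PATH):
java -version
echo $JAVA_HOME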
I'm not sure why, but the problem was related to god (the process monitoring tool).
It was restarting the elasticsearch daemon every 10 minutes.
So the solution was to deactivate god.
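If you want to confirm that god is the culprit before disabling it entirely, its CLI can list and unmonitor watches; the watch name used here is an assumption:
god status
god unmonitor elasticsearch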
