Hi, I'm running Elasticsearch 1.6.0 with the AWS plugin 2.6.0 on Windows Server 2008 in Amazon EC2.
I have the AWS plugin set up and I don't get any exceptions in the logs, but the nodes can't seem to discover each other. Here is my elasticsearch.yml:
bootstrap.mlockall: true
cluster.name: my-cluster
node.name: "ES MASTER 01"
node.data: false
node.master: true
plugin.mandatory: "cloud-aws"
cloud.aws.access_key: "AK...Z7Q"
cloud.aws.secret_key: "gKW...nAO"
cloud.aws.region: "us-east"
discovery.zen.minimum_master_nodes: 1
discovery.type: "ec2"
discovery.ec2.groups: "Elastic Search"
discovery.ec2.ping_timeout: "30s"
discovery.ec2.availability_zones: "us-east-1a"
discovery.zen.ping.multicast.enabled: false
Logs:
[2015-07-13 15:02:19,346][INFO ][node ] [ES MASTER 01] version[1.6.0], pid[2532], build[cdd3ac4/2015-06-09T13:36:34Z]
[2015-07-13 15:02:19,346][INFO ][node ] [ES MASTER 01] initializing ...
[2015-07-13 15:02:19,378][INFO ][plugins ] [ES MASTER 01] loaded [cloud-aws], sites []
[2015-07-13 15:02:19,440][INFO ][env ] [ES MASTER 01] using [1] data paths, mounts [[(C:)]], net usable_space [6.8gb], net total_space [29.9gb], types [NTFS]
[2015-07-13 15:02:26,461][INFO ][node ] [ES MASTER 01] initialized
[2015-07-13 15:02:26,461][INFO ][node ] [ES MASTER 01] starting ...
[2015-07-13 15:02:26,851][INFO ][transport ] [ES MASTER 01] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/172.30.0.123:9300]}
[2015-07-13 15:02:26,866][INFO ][discovery ] [ES MASTER 01] my-cluster/SwhSDhiDQzq4pM8jkhIuzw
[2015-07-13 15:02:56,884][WARN ][discovery ] [ES MASTER 01] waited for 30s and no initial state was set by the discovery
[2015-07-13 15:02:56,962][INFO ][http ] [ES MASTER 01] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/172.30.0.123:9200]}
[2015-07-13 15:02:56,962][INFO ][node ] [ES MASTER 01] started
[2015-07-13 15:03:13,455][INFO ][cluster.service ] [ES MASTER 01] new_master [ES MASTER 01][SwhSDhiDQzq4pM8jkhIuzw][WIN-3Q4EH3B8H1O][inet[/172.30.0.123:9300]]{data=false, master=true}, reason: zen-disco-join (elected_as_master)
[2015-07-13 15:03:13,517][INFO ][gateway ] [ES MASTER 01] recovered [0] indices into cluster_state
Discovery can work over private IPs, but only if your node instances are able to query the EC2 APIs for information about the other instances in the same VPC, so that each node can find the cluster it should join.
You can grant this discovery permission with a policy like the following, attached to the IAM role used by your instances:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "whatever",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "ec2:DescribeAvailabilityZones",
        "ec2:DescribeInstances",
        "ec2:DescribeRegions",
        "ec2:DescribeSecurityGroups",
        "ec2:DescribeTags"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}
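With that role attached to the instances, you should (as far as I understand the cloud-aws plugin) also be able to drop the hard-coded credentials from elasticsearch.yml and let the plugin pick them up from the instance profile. A minimal sketch, reusing the cluster, region and security group names from the question:
cluster.name: my-cluster
plugin.mandatory: "cloud-aws"
# no cloud.aws.access_key / cloud.aws.secret_key here: the plugin should fall back to the
# instance profile credentials (assumption: the IAM role above is attached to the instance)
cloud.aws.region: "us-east"
discovery.type: "ec2"
discovery.ec2.groups: "Elastic Search"
discovery.zen.ping.multicast.enabled: false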
I created a customised Document model in Wagtail with:
class CustomizedDocument(Document):
...
and I have updated settings.WAGTAILDOCS_DOCUMENT_MODEL accordingly.
However, I realised that my search on tags fails. I suspect it has something to do with Elasticsearch, but I'm really new to it.
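For reference, the settings entry looks roughly like this (the app label below is only my guess, based on the mapping names in the log further down), and as far as I know the search index has to be rebuilt afterwards with Wagtail's update_index management command:
# settings.py -- 'distributor_portal' is an assumed app label, adjust to your project
WAGTAILDOCS_DOCUMENT_MODEL = 'distributor_portal.CustomizedDocument'
# then rebuild the Elasticsearch index
python manage.py update_index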
Here is the log output from Elasticsearch:
[INFO ][o.e.n.Node ] initialized
[INFO ][o.e.n.Node ] [oceLPbj] starting ...
[INFO ][o.e.t.TransportService ] [oceLPbj] publish_address {127.0.0.1:9300}, bound_addresses {[fe80::1]:9300}, {[::1]:9300}, {127.0.0.1:9300}
[INFO ][o.e.c.s.ClusterService ] [oceLPbj] new_master {oceLPbj}{oceLPbjSQ7ib2pTjx9gpPg}{WDDcwdlISnu-EW8mjQcOxQ}{127.0.0.1}{127.0.0.1:9300}, reason: zen-disco-elected-as-master ([0] nodes joined)
[INFO ][o.e.h.n.Netty4HttpServerTransport] [oceLPbj] publish_address {127.0.0.1:9200}, bound_addresses {[fe80::1]:9200}, {[::1]:9200}, {127.0.0.1:9200}
[INFO ][o.e.n.Node ] [oceLPbj] started
[INFO ][o.e.g.GatewayService ] [oceLPbj] recovered [4] indices into cluster_state
[INFO ][o.e.c.r.a.AllocationService] [oceLPbj] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[wagtail__wagtaildocs_document][1], [wagtail__wagtaildocs_document][3]] ...]).
[INFO ][o.e.c.m.MetaDataCreateIndexService] [oceLPbj] [wagtail__wagtaildocs_document_awv9f81] creating index, cause [api], templates [], shards [5]/[1], mappings []
[INFO ][o.e.c.m.MetaDataMappingService] [oceLPbj] [wagtail__wagtaildocs_document_awv9f81/CzKReMnsQ9qziLnOQQ5K3g] create_mapping [wagtaildocs_abstractdocument_wagtaildocs_document_distributor_portal_customizeddocument]
[INFO ][o.e.c.m.MetaDataMappingService] [oceLPbj] [wagtail__wagtaildocs_document_awv9f81/CzKReMnsQ9qziLnOQQ5K3g] create_mapping [wagtaildocs_abstractdocument_wagtaildocs_document]
Could anyone help me figure out what's going on here? Any help is appreciated. Let me know what other information I should provide.
I want to run Elasticsearch on two servers in the same cluster.
The problem is: I can't connect the two servers, i.e. I can't get the two nodes to join the same cluster.
Can somebody tell me if my configuration in elasticsearch.yml is right?
Server 1:
cluster.name: MyData
node.name: Node_1
network.host: '192.160.122.4'
http.port: 9200
discovery.zen.ping.unicast.hosts: ["192.160.122.4","192.160.122.3"]
Server 2:
cluster.name: MyData
node.name: Node_2
network.host: '192.160.122.3'
http.port: 9200
discovery.zen.ping.unicast.hosts: ["192.160.122.4","192.160.122.3"]
What do I need to change?
Thanks.
Here are the logs:
Log of server 1:
[2017-03-21T10:38:18,859][INFO ][o.e.n.Node ] [Node_1] initialized
[2017-03-21T10:38:18,906][INFO ][o.e.n.Node ] [Node_1] starting ...
[2017-03-21T10:38:19,764][INFO ][o.e.t.TransportService ] [Node_1] publish_address {192.160.122.4:9300}, bound_addresses {192.160.122.4:9300}
[2017-03-21T10:38:19,764][INFO ][o.e.b.BootstrapChecks ] [Node_1] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[2017-03-21T10:38:22,899][INFO ][o.e.c.s.ClusterService ] [Node_1] new_master {Master-Node_1}{4ZEftg6TRCOJqE0kEv-Mrg}{MYrWUFABQ5OOT0US73j13w}{192.160.122.4}{192.160.122.4:9300}, reason: zen-disco-elected-as-master ([0] nodes joined)
[2017-03-21T10:38:22,950][INFO ][o.e.h.HttpServer ] [Node_1] publish_address {192.160.122.4:9200}, bound_addresses {192.160.122.4:9200}
[2017-03-21T10:38:22,997][INFO ][o.e.n.Node ] [Node_1] started
[2017-03-21T10:38:23,948][INFO ][o.e.g.GatewayService ] [Node_1] recovered [5] indices into cluster_state
[2017-03-21T10:38:32,357][INFO ][o.e.c.r.a.AllocationService] [Node_1] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[node_1][4]] ...]).
Log of Server 2:
[2017-03-21T11:55:57,277][INFO ][o.e.n.Node ] [Node_2] initialized
[2017-03-21T11:55:57,293][INFO ][o.e.n.Node ] [Node_2] starting ...
[2017-03-21T11:56:01,099][INFO ][o.e.t.TransportService ] [Node_2] publish_address {192.160.122.3:9300}, bound_addresses {192.160.122.3:9300}
[2017-03-21T11:56:01,115][INFO ][o.e.b.BootstrapChecks ] [Node_2] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[2017-03-21T11:56:01,115][ERROR][o.e.b.Bootstrap ] [Node_2] node validation exception
bootstrap checks failed
JVM is using the client VM [Java HotSpot(TM) Client VM] but should be using a server VM for the best performance
[2017-03-21T11:56:01,146][INFO ][o.e.n.Node ] [Node_2] stopping ...
[2017-03-21T11:56:01,193][INFO ][o.e.n.Node ] [Node_2] stopped
[2017-03-21T11:56:01,193][INFO ][o.e.n.Node ] [Node_2] closing ...
[2017-03-21T11:56:01,240][INFO ][o.e.n.Node ] [Node_2] closed
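From that second log it looks to me like Node_2 never actually started: it stops right after the bootstrap check about the client VM fails. If that is the real cause, I assume the fix is to run a 64-bit server JDK on Server 2, or to force the server VM with something like:
# config/jvm.options on Server 2 (assumption: the installed JDK actually ships a server VM)
-server
but I'm not sure whether that alone is enough.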
I have upgraded SugarCRM from 7.2.2.0 to 7.5.0.1.
Then I updated Java to v7 and Elasticsearch to v1.3.1.
After starting Elasticsearch and launching the indexing, the global search still doesn't return any results.
Here is the output of Elasticsearch when it is launched:
/usr/local/bin/elasticsearch/bin/elasticsearch
[2014-12-17 09:11:37,057][INFO ][node ] [Wild Child] version[1.3.1], pid[19801], build[2de6dc5/2014-07-28T14:45:15Z]
[2014-12-17 09:11:37,059][INFO ][node ] [Wild Child] initializing...
[2014-12-17 09:11:37,066][INFO ][plugins ] [Wild Child] loaded [], sites []
[2014-12-17 09:11:39,896][WARN ][common.network ] failed to resolve local host, fallback to loopback
java.net.UnknownHostException: sm4.localdomain: sm4.localdomain: Name or service not known
at java.net.InetAddress.getLocalHost(InetAddress.java:1473)
at org.elasticsearch.common.network.NetworkUtils.<clinit>(NetworkUtils.java:54)
at org.elasticsearch.transport.netty.NettyTransport.<init>(NettyTransport.java:204)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
...
Caused by: java.net.UnknownHostException: sm4.localdomain: Name or service not known
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:901)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1293)
at java.net.InetAddress.getLocalHost(InetAddress.java:1469)
... 62 more
[2014-12-17 09:11:40,779][INFO ][node ] [Wild Child] initialized
[2014-12-17 09:11:40,780][INFO ][node ] [Wild Child] starting ...
[2014-12-17 09:11:41,002][INFO ][transport ] [Wild Child] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/92.39.241.87:9300]}
[2014-12-17 09:11:41,046][INFO ][discovery ] [Wild Child] elasticsearch/OeRmy39vTz2WTcnjSXoHHA
[2014-12-17 09:11:44,095][INFO ][cluster.service ] [Wild Child] new_master [Wild Child][OeRmy39vTz2WTcnjSXoHHA][localhost][inet[/92.39.241.87:9300]], reason: zen-disco-join (elected_as_master)
[2014-12-17 09:11:44,133][INFO ][http ] [Wild Child] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/92.39.241.87:9200]}
[2014-12-17 09:11:44,134][INFO ][node ] [Wild Child] started
[2014-12-17 09:11:45,039][INFO ][gateway ] [Wild Child] recovered [1] indices into cluster_state
localhost:9200 is accessible, though.
And when I schedule a system index in Sugar, nothing seems to happen in any Elasticsearch log.
Has anyone had this problem before?
Any help will be much appreciated!
Thank you
Cheers, Victor
Don't know if you have resolved this yet. I recently did the same upgrade and also had to update my Java version to 1.7, and additionally had to ensure that my log and data locations were writable by the elasticsearch user and group.
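Something along these lines, though the exact paths depend on how Elasticsearch was installed (these are just the usual package locations):
chown -R elasticsearch:elasticsearch /var/log/elasticsearch
chown -R elasticsearch:elasticsearch /var/lib/elasticsearch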
I eventually solved my issue.
Indeed, upgrading java is necessary.
For my part, I also had to add a crontab entry that calls cron.php to make Elasticsearch work with SugarCRM.
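The entry looks roughly like this (adjust the SugarCRM path to your own install):
* * * * * cd /var/www/sugarcrm && php -f cron.php > /dev/null 2>&1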
ES stopped responding to any requests after losing an index (for an unknown reason). After a server restart ES tries to recover the index, but as soon as it has read the entire index (only about 200 MB) ES stops responding again. The last error I saw was SearchPhaseExecutionException[Failed to execute phase [query_fetch], all shards failed]. I'm running ES on a single-node virtual server. The index has only one shard with about 3 million documents (200 MB).
How can I recover this index?
Here's the ES log:
[2014-06-21 18:43:15,337][WARN ][bootstrap ] jvm uses the client vm, make sure to run `java` with the server vm for best performance by adding `-server` to the command line
[2014-06-21 18:43:15,554][WARN ][common.jna ] Unknown mlockall error 0
[2014-06-21 18:43:15,759][INFO ][node ] [Crimson Cowl] version[1.1.0], pid[1031], build[2181e11/2014-03-25T15:59:51Z]
[2014-06-21 18:43:15,759][INFO ][node ] [Crimson Cowl] initializing ...
[2014-06-21 18:43:15,881][INFO ][plugins ] [Crimson Cowl] loaded [], sites [head]
[2014-06-21 18:43:21,957][INFO ][node ] [Crimson Cowl] initialized
[2014-06-21 18:43:21,958][INFO ][node ] [Crimson Cowl] starting ...
[2014-06-21 18:43:22,275][INFO ][transport ] [Crimson Cowl] bound_address {inet[/10.0.0.13:9300]}, publish_address {inet[/10.0.0.13:9300]}
[2014-06-21 18:43:25,385][INFO ][cluster.service ] [Crimson Cowl] new_master [Crimson Cowl][UJNl8hGgRzeFo-DQ3vk2nA][esubuntu][inet[/10.0.0.13:9300]], reason: zen-disco-join (elected_as_master)
[2014-06-21 18:43:25,438][INFO ][discovery ] [Crimson Cowl] elasticsearch/UJNl8hGgRzeFo-DQ3vk2nA
[2014-06-21 18:43:25,476][INFO ][http ] [Crimson Cowl] bound_address {inet[/10.0.0.13:9200]}, publish_address {inet[/10.0.0.13:9200]}
[2014-06-21 18:43:26,348][INFO ][gateway ] [Crimson Cowl] recovered [2] indices into cluster_state
[2014-06-21 18:43:26,349][INFO ][node ] [Crimson Cowl] started
After deleting another index on the same node, ES responds to requests again but still fails to recover the index. Here's the log:
[2014-06-22 08:00:06,651][WARN ][bootstrap ] jvm uses the client vm, make sure to run `java` with the server vm for best performance by adding `-server` to the command line
[2014-06-22 08:00:06,699][WARN ][common.jna ] Unknown mlockall error 0
[2014-06-22 08:00:06,774][INFO ][node ] [Baron Macabre] version[1.1.0], pid[2035], build[2181e11/2014-03-25T15:59:51Z]
[2014-06-22 08:00:06,774][INFO ][node ] [Baron Macabre] initializing ...
[2014-06-22 08:00:06,779][INFO ][plugins ] [Baron Macabre] loaded [], sites [head]
[2014-06-22 08:00:08,766][INFO ][node ] [Baron Macabre] initialized
[2014-06-22 08:00:08,767][INFO ][node ] [Baron Macabre] starting ...
[2014-06-22 08:00:08,824][INFO ][transport ] [Baron Macabre] bound_address {inet[/10.0.0.3:9300]}, publish_address {inet[/10.0.0.3:9300]}
[2014-06-22 08:00:11,890][INFO ][cluster.service ] [Baron Macabre] new_master [Baron Macabre][eWDP4ZSXSGuASJLJ2an1nQ][esubuntu][inet[/10.0.0.3:9300]], reason: zen-disco-join (elected_as_master)
[2014-06-22 08:00:11,975][INFO ][discovery ] [Baron Macabre] elasticsearch/eWDP4ZSXSGuASJLJ2an1nQ
[2014-06-22 08:00:12,000][INFO ][http ] [Baron Macabre] bound_address {inet[/10.0.0.3:9200]}, publish_address {inet[/10.0.0.3:9200]}
[2014-06-22 08:00:12,645][INFO ][gateway ] [Baron Macabre] recovered [1] indices into cluster_state
[2014-06-22 08:00:12,647][INFO ][node ] [Baron Macabre] started
[2014-06-22 08:05:01,284][WARN ][index.engine.internal ] [Baron Macabre] [wordstat][0] failed engine
java.lang.OutOfMemoryError: Java heap space
at org.apache.lucene.index.ParallelPostingsArray.<init>(ParallelPostingsArray.java:35)
at org.apache.lucene.index.FreqProxTermsWriterPerField$FreqProxPostingsArray.<init>(FreqProxTermsWriterPerField.java:254)
at org.apache.lucene.index.FreqProxTermsWriterPerField$FreqProxPostingsArray.newInstance(FreqProxTermsWriterPerField.java:279)
at org.apache.lucene.index.ParallelPostingsArray.grow(ParallelPostingsArray.java:48)
at org.apache.lucene.index.TermsHashPerField$PostingsBytesStartArray.grow(TermsHashPerField.java:307)
at org.apache.lucene.util.BytesRefHash.add(BytesRefHash.java:324)
at org.apache.lucene.index.TermsHashPerField.add(TermsHashPerField.java:185)
at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:171)
at org.apache.lucene.index.DocFieldProcessor.processDocument(DocFieldProcessor.java:248)
at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:253)
at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:453)
at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1529)
at org.elasticsearch.index.engine.internal.InternalEngine.innerIndex(InternalEngine.java:532)
at org.elasticsearch.index.engine.internal.InternalEngine.index(InternalEngine.java:470)
at org.elasticsearch.index.shard.service.InternalIndexShard.performRecoveryOperation(InternalIndexShard.java:744)
at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:228)
at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:197)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
[2014-06-22 08:05:02,168][WARN ][cluster.action.shard ] [Baron Macabre] [wordstat][0] sending failed shard for [wordstat][0], node[eWDP4ZSXSGuASJLJ2an1nQ], [P], s[INITIALIZING], indexUUID [LC3LMLxgS3CkkG_pvfTeSg], reason [engine failure, message [OutOfMemoryError[Java heap space]]]
[2014-06-22 08:05:02,169][WARN ][cluster.action.shard ] [Baron Macabre] [wordstat][0] received shard failed for [wordstat][0], node[eWDP4ZSXSGuASJLJ2an1nQ], [P], s[INITIALIZING], indexUUID [LC3LMLxgS3CkkG_pvfTeSg], reason [engine failure, message [OutOfMemoryError[Java heap space]]]
[2014-06-22 08:53:22,253][INFO ][node ] [Baron Macabre] stopping ...
[2014-06-22 08:53:22,267][INFO ][node ] [Baron Macabre] stopped
[2014-06-22 08:53:22,267][INFO ][node ] [Baron Macabre] closing ...
[2014-06-22 08:53:22,272][INFO ][node ] [Baron Macabre] closed
[2014-06-22 08:53:23,667][WARN ][bootstrap ] jvm uses the client vm, make sure to run `java` with the server vm for best performance by adding `-server` to the command line
[2014-06-22 08:53:23,708][WARN ][common.jna ] Unknown mlockall error 0
[2014-06-22 08:53:23,777][INFO ][node ] [Living Totem] version[1.1.0], pid[2137], build[2181e11/2014-03-25T15:59:51Z]
[2014-06-22 08:53:23,777][INFO ][node ] [Living Totem] initializing ...
[2014-06-22 08:53:23,781][INFO ][plugins ] [Living Totem] loaded [], sites [head]
[2014-06-22 08:53:25,828][INFO ][node ] [Living Totem] initialized
[2014-06-22 08:53:25,828][INFO ][node ] [Living Totem] starting ...
[2014-06-22 08:53:25,885][INFO ][transport ] [Living Totem] bound_address {inet[/10.0.0.3:9300]}, publish_address {inet[/10.0.0.3:9300]}
[2014-06-22 08:53:28,913][INFO ][cluster.service ] [Living Totem] new_master [Living Totem][D-eoRm7fSrCU_dTw_NQipA][esubuntu][inet[/10.0.0.3:9300]], reason: zen-disco-join (elected_as_master)
[2014-06-22 08:53:28,939][INFO ][discovery ] [Living Totem] elasticsearch/D-eoRm7fSrCU_dTw_NQipA
[2014-06-22 08:53:28,964][INFO ][http ] [Living Totem] bound_address {inet[/10.0.0.3:9200]}, publish_address {inet[/10.0.0.3:9200]}
[2014-06-22 08:53:29,433][INFO ][gateway ] [Living Totem] recovered [1] indices into cluster_state
[2014-06-22 08:53:29,433][INFO ][node ] [Living Totem] started
[2014-06-22 08:58:05,268][WARN ][index.engine.internal ] [Living Totem] [wordstat][0] failed engine
java.lang.OutOfMemoryError: Java heap space
at org.apache.lucene.index.FreqProxTermsWriterPerField$FreqProxPostingsArray.<init>(FreqProxTermsWriterPerField.java:261)
at org.apache.lucene.index.FreqProxTermsWriterPerField$FreqProxPostingsArray.newInstance(FreqProxTermsWriterPerField.java:279)
at org.apache.lucene.index.ParallelPostingsArray.grow(ParallelPostingsArray.java:48)
at org.apache.lucene.index.TermsHashPerField$PostingsBytesStartArray.grow(TermsHashPerField.java:307)
at org.apache.lucene.util.BytesRefHash.add(BytesRefHash.java:324)
at org.apache.lucene.index.TermsHashPerField.add(TermsHashPerField.java:185)
at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:171)
at org.apache.lucene.index.DocFieldProcessor.processDocument(DocFieldProcessor.java:248)
at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:253)
at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:453)
at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1529)
at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1199)
at org.elasticsearch.index.engine.internal.InternalEngine.innerIndex(InternalEngine.java:523)
at org.elasticsearch.index.engine.internal.InternalEngine.index(InternalEngine.java:470)
at org.elasticsearch.index.shard.service.InternalIndexShard.performRecoveryOperation(InternalIndexShard.java:744)
at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:228)
at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:197)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
[2014-06-22 08:58:06,046][WARN ][cluster.action.shard ] [Living Totem] [wordstat][0] sending failed shard for [wordstat][0], node[D-eoRm7fSrCU_dTw_NQipA], [P], s[INITIALIZING], indexUUID [LC3LMLxgS3CkkG_pvfTeSg], reason [engine failure, message [OutOfMemoryError[Java heap space]]]
[2014-06-22 08:58:06,047][WARN ][cluster.action.shard ] [Living Totem] [wordstat][0] received shard failed for [wordstat][0], node[D-eoRm7fSrCU_dTw_NQipA], [P], s[INITIALIZING], indexUUID [LC3LMLxgS3CkkG_pvfTeSg], reason [engine failure, message [OutOfMemoryError[Java heap space]]]
In order to recover your Elasticsearch cluster you will need to allocate more memory to the heap. As you are running on a fairly small instance this may be a bit challenging, but here is what you will need to do:
1. Change the configuration to allocate more memory to the heap. It's not clear what your current settings are, but there are several ways to boost this - the easiest is to set the environment variable ES_HEAP_SIZE. I'd start with 1GB, try that, and then increase it in small increments, as you are already near the limit of what you can do on an instance with 1.6GB of memory. Alternatively you can change the files used to launch Elasticsearch - it depends on how you have it installed, but they should be in the bin directory underneath the Elasticsearch home directory; for a Linux installation the files are elasticsearch and elasticsearch.in.sh. (See the sketch after these two options.)
2. Move to a larger instance. This would be much easier to recover from on a system with more memory - so if the above step does not work, you could copy all your files to another, larger instance and try the above steps again with a larger heap size.
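As a concrete sketch of the first option (assuming a tarball install started from the bin scripts):
export ES_HEAP_SIZE=1g    # the launch scripts use this to set both -Xms and -Xmx
./bin/elasticsearch
Equivalently, you can set ES_HEAP_SIZE (or ES_MIN_MEM/ES_MAX_MEM) near the top of bin/elasticsearch.in.sh.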
What has happened here is that your server has become overloaded; possibly there is also a bad sector. What you need to do is delete your existing indices and re-index them.
On Linux, the Elasticsearch data files are kept under /usr/local/var/elasticsearch/ (the exact path depends on your installation).
Delete this folder, then repopulate your index.
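If the node is at least responding to HTTP, you can also drop the index through the API instead of touching the filesystem; a sketch, using the index name wordstat from the logs above:
curl -XDELETE 'http://localhost:9200/wordstat/'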
I'm having a problem where Elasticsearch restarts once every 10 minutes.
Here is the log file:
[2012-01-11 21:55:15,059][INFO ][node ] [Tyrannosaur] {0.18.7}[22401]: stopping ...
[2012-01-11 21:55:15,416][INFO ][node ] [Tyrannosaur] {0.18.7}[22401]: stopped
[2012-01-11 21:55:15,417][INFO ][node ] [Tyrannosaur] {0.18.7}[22401]: closing ...
[2012-01-11 21:55:15,443][INFO ][node ] [Tyrannosaur] {0.18.7}[22401]: closed
[2012-01-11 21:55:22,364][INFO ][node ] [Williams, Eric] {0.18.7}[22961]: initializing ...
[2012-01-11 21:55:22,376][INFO ][plugins ] [Williams, Eric] loaded [], sites []
[2012-01-11 21:55:26,245][INFO ][node ] [Williams, Eric] {0.18.7}[22961]: initialized
[2012-01-11 21:55:26,245][INFO ][node ] [Williams, Eric] {0.18.7}[22961]: starting ...
[2012-01-11 21:55:26,364][INFO ][transport ] [Williams, Eric] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/91.123.195.252:9300]}
[2012-01-11 21:55:29,421][INFO ][cluster.service ] [Williams, Eric] new_master [Williams, Eric][xPA_opsKQGStNtubxelOQQ][inet[/91.123.195.252:9300]], reason: zen-disco-join (elected_as_master)
[2012-01-11 21:55:29,527][INFO ][discovery ] [Williams, Eric] resp/xPA_opsKQGStNtubxelOQQ
[2012-01-11 21:55:29,903][INFO ][http ] [Williams, Eric] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/91.123.195.252:9200]}
[2012-01-11 21:55:29,905][INFO ][node ] [Williams, Eric] {0.18.7}[22961]: started
[2012-01-11 21:55:32,511][INFO ][gateway ] [Williams, Eric] recovered [1] indices into cluster_state
[2012-01-11 21:56:56,137][INFO ][node ] [Williams, Eric] {0.18.7}[22961]: stopping ...
[2012-01-11 21:56:56,236][INFO ][node ] [Williams, Eric] {0.18.7}[22961]: stopped
[2012-01-11 21:56:56,237][INFO ][node ] [Williams, Eric] {0.18.7}[22961]: closing ...
[2012-01-11 21:56:56,262][INFO ][node ] [Williams, Eric] {0.18.7}[22961]: closed
[2012-01-11 21:57:03,026][INFO ][node ] [Carnivore] {0.18.7}[23075]: initializing ...
[2012-01-11 21:57:03,041][INFO ][plugins ] [Carnivore] loaded [], sites []
[2012-01-11 21:57:07,682][INFO ][node ] [Carnivore] {0.18.7}[23075]: initialized
[2012-01-11 21:57:07,683][INFO ][node ] [Carnivore] {0.18.7}[23075]: starting ...
[2012-01-11 21:57:07,841][INFO ][transport ] [Carnivore] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/91.123.195.252:9300]}
[2012-01-11 21:57:10,925][INFO ][cluster.service ] [Carnivore] new_master [Carnivore][qFbBoUEeQEuqH5suILfsww][inet[/91.123.195.252:9300]], reason: zen-disco-join (elected_as_master)
[2012-01-11 21:57:10,987][INFO ][discovery ] [Carnivore] resp/qFbBoUEeQEuqH5suILfsww
[2012-01-11 21:57:11,246][INFO ][http ] [Carnivore] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/91.123.195.252:9200]}
[2012-01-11 21:57:11,248][INFO ][node ] [Carnivore] {0.18.7}[23075]: started
[2012-01-11 21:57:13,001][INFO ][gateway ] [Carnivore] recovered [1] indices into cluster_state
Config file:
cluster:
  name: resp
path:
  logs: /tmp/
  data: /opt/www/resp/shared/elasticsearch
This is how Elasticsearch is started:
elasticsearch -f -Des.config=/opt/www/resp/current/config/elasticsearch.yml
Version:
ElasticSearch Version: 0.18.7, JVM: 14.0-b16
Does anyone know what the problem may be?
Elasticsearch requires Java 1.6 or greater.
Consider upgrading your JVM:
http://www.elasticsearch.org/guide/reference/setup/installation.html
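To double-check which JVM the daemon is actually using, run something like this as the user that starts elasticsearch:
java -version    # should report 1.6 or later
which java       # confirm it is the JVM you expect on the PATH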
I'm not sure why, but the problem was related to god.
It was restarting the elasticsearch daemon every 10 minutes.
So the solution was to deactivate god.
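For anyone hitting the same thing, deactivating it was something like this (assuming the watch in the god config is named elasticsearch):
god unmonitor elasticsearch    # stop god from restarting the process
god status                     # confirm the watch is no longer being monitored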