URL to access cluster environment for ElasticSearch2.4.3 - elasticsearch

We have an ElasticSearch2.4.3 cluster environment of two nodes. I want to ask what URL should I provide to access the environment so that it works in High Availability?
We have two master Node1 and Node2. The host name for Node1 is node1.elastic.com and Node2 is node2.elastic.com. Both the nodes are master according to formula (n/2 +1).
We have enabled cluster setting by modifying the elastic.yml file by adding
discovery.zen.ping.unicast.hosts for the two nodes.
From our java application, we are connecting to node1.elastic.com. It works fine till both the nodes are up. Data is getting populated in both the ES servers and everything is good. But as soon as Node1 goes down entire elastic search cluster gets disconnected. And it automatically doesn't switch to Node2 for processing requests.
I feel like the URL which I am giving is not right, and it has to be something else to provide an automatic switch.
Logs from Node1
[2020-02-10 12:15:45,639][INFO ][node ] [Wildpride] initialized
[2020-02-10 12:15:45,639][INFO ][node ] [Wildpride] starting ...
[2020-02-10 12:15:45,769][WARN ][common.network ] [Wildpride] _non_loopback_ is deprecated as it picks an arbitrary interface. specify explicit scope(s), interface(s), address(es), or hostname(s) instead
[2020-02-10 12:15:45,783][WARN ][common.network ] [Wildpride] _non_loopback_ is deprecated as it picks an arbitrary interface. specify explicit scope(s), interface(s), address(es), or hostname(s) instead
[2020-02-10 12:15:45,784][INFO ][transport ] [Wildpride] publish_address {000.00.00.204:9300}, bound_addresses {[fe80::9af2:b3ff:fee9:90ca]:9300}, {000.00.00.204:9300}
[2020-02-10 12:15:45,788][INFO ][discovery ] [Wildpride] XXXX/Hg_5eGZIS0e249KUTQqPPg
[2020-02-10 12:16:15,790][WARN ][discovery ] [Wildpride] waited for 30s and no initial state was set by the discovery
[2020-02-10 12:16:15,799][WARN ][common.network ] [Wildpride] _non_loopback_ is deprecated as it picks an arbitrary interface. specify explicit scope(s), interface(s), address(es), or hostname(s) instead
[2020-02-10 12:16:15,802][WARN ][common.network ] [Wildpride] _non_loopback_ is deprecated as it picks an arbitrary interface. specify explicit scope(s), interface(s), address(es), or hostname(s) instead
[2020-02-10 12:16:15,803][INFO ][http ] [Wildpride] publish_address {000.00.00.204:9200}, bound_addresses {[fe80::9af2:b3ff:fee9:90ca]:9200}, {000.00.00.204:9200}
[2020-02-10 12:16:15,803][INFO ][node ] [Wildpride] started
[2020-02-10 12:16:35,552][INFO ][node ] [Wildpride] stopping ...
[2020-02-10 12:16:35,619][WARN ][discovery.zen.ping.unicast] [Wildpride] [17] failed send ping to {#zen_unicast_1#}{000.00.00.206}{000.00.00.206:9300}
java.lang.IllegalStateException: can't add nodes to a stopped transport
at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:916)
at org.elasticsearch.transport.netty.NettyTransport.connectToNodeLight(NettyTransport.java:906)
at org.elasticsearch.transport.TransportService.connectToNodeLight(TransportService.java:267)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing$3.run(UnicastZenPing.java:395)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
[2020-02-10 12:16:35,620][WARN ][discovery.zen.ping.unicast] [Wildpride] failed to send ping to [{Wildpride}{Hg_5eGZIS0e249KUTQqPPg}{000.00.00.204}{000.00.00.204:9300}]
SendRequestTransportException[[Wildpride][000.00.00.204:9300][internal:discovery/zen/unicast]]; nested: TransportException[TransportService is closed stopped can't send request];
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:340)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPingRequestToNode(UnicastZenPing.java:440)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPings(UnicastZenPing.java:426)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing$2.doRun(UnicastZenPing.java:249)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: TransportException[TransportService is closed stopped can't send request]
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:320)
... 7 more
[2020-02-10 12:16:35,623][INFO ][node ] [Wildpride] stopped
[2020-02-10 12:16:35,623][INFO ][node ] [Wildpride] closing ...
[2020-02-10 12:16:35,642][INFO ][node ] [Wildpride] closed

TL;DR: There is no automatic switch in elasticsearch and you'll need some kind of loadbalancer in front of the elasticsearch cluster.
For an HA setup, you need at least 3 master eligible nodes. In front of the cluster there have to be a loadbalancer (also HA) to distribute the requests across the cluster. Or the client needs to be somehow aware of cluster and in a failure scenario failover to any node left.
If you go with 2 master nodes, the cluster can get into the "split brain" state. If your network gets somehow fragmented and the nodes become invisible to each other, both of them will think it is the last one working and keep serving the read/write requests independently. In that way they drift away from each other and it will become nearly impossible to fix it - at least, when the fragmentation is gone, there is a lot of trouble to fix. With 3 nodes, in a fragmentation scenario the clusternwill only continue to serve requests if there are at least 2 nodes visible to each other.

Related

What is causing elasticsearch to shutdown shortly after starting up?

I'm having an issue with Elasticsearch on EC2 where I'm starting up several new instances from the same AMI, and very occasionally (like < 1% of the time), the Elasticsearch service will stop shortly after starting. I've looked at the log file, but it's not really clear to me why the service is stopping. Are there any clues in this that I'm missing, or is there anywhere else I should look for logs when this happens?
[2020-07-28T18:17:44,251][INFO ][o.e.c.c.ClusterBootstrapService] [ip-10-0-0-68] no discovery configuration found, will perform best-effort cluster bootstrapping after [3s] unless existing master is discovered
[2020-07-28T18:17:44,375][INFO ][o.e.c.s.MasterService ] [ip-10-0-0-68] elected-as-master ([1] nodes joined)[{ip-10-0-0-68}{C1lEYCg6RUWry4avn4isxw}{IjXE3KNOQO2UeZyrX2o3FA}{127.0.0.1}{127.0.0.1:9300}{dilm}{ml.machine_memory=32601837568, xpack.installed=true, ml.max_open_jobs=20} elect leader, _BECOME_MASTER_TASK_, _FINISH_ELECTION_], term: 4, version: 26, delta: master node changed {previous [], current [{ip-10-0-0-68}{C1lEYCg6RUWry4avn4isxw}{IjXE3KNOQO2UeZyrX2o3FA}{127.0.0.1}{127.0.0.1:9300}{dilm}{ml.machine_memory=32601837568, xpack.installed=true, ml.max_open_jobs=20}]}
[2020-07-28T18:17:44,416][INFO ][o.e.c.s.ClusterApplierService] [ip-10-0-0-68] master node changed {previous [], current [{ip-10-0-0-68}{C1lEYCg6RUWry4avn4isxw}{IjXE3KNOQO2UeZyrX2o3FA}{127.0.0.1}{127.0.0.1:9300}{dilm}{ml.machine_memory=32601837568, xpack.installed=true, ml.max_open_jobs=20}]}, term: 4, version: 26, reason: Publication{term=4, version=26}
[2020-07-28T18:17:44,446][INFO ][o.e.h.AbstractHttpServerTransport] [ip-10-0-0-68] publish_address {127.0.0.1:9200}, bound_addresses {[::1]:9200}, {127.0.0.1:9200}
[2020-07-28T18:17:44,447][INFO ][o.e.n.Node ] [ip-10-0-0-68] started
[2020-07-28T18:17:44,595][INFO ][o.e.l.LicenseService ] [ip-10-0-0-68] license [a9a29e21-5167-497e-9e49-ccc785ea2d47] mode [basic] - valid
[2020-07-28T18:17:44,596][INFO ][o.e.x.s.s.SecurityStatusChangeListener] [ip-10-0-0-68] Active license is now [BASIC]; Security is disabled
[2020-07-28T18:17:44,602][INFO ][o.e.g.GatewayService ] [ip-10-0-0-68] recovered [0] indices into cluster_state
[2020-07-28T18:18:29,947][INFO ][o.e.n.Node ] [ip-10-0-0-68] stopping ...
[2020-07-28T18:18:29,962][INFO ][o.e.x.w.WatcherService ] [ip-10-0-0-68] stopping watch service, reason [shutdown initiated]
[2020-07-28T18:18:29,963][INFO ][o.e.x.w.WatcherLifeCycleService] [ip-10-0-0-68] watcher has stopped and shutdown
[2020-07-28T18:18:30,014][INFO ][o.e.x.m.p.l.CppLogMessageHandler] [ip-10-0-0-68] [controller/2184] [Main.cc#150] Ml controller exiting
[2020-07-28T18:18:30,015][INFO ][o.e.x.m.p.NativeController] [ip-10-0-0-68] Native controller process has stopped - no new native processes can be started
[2020-07-28T18:18:30,024][INFO ][o.e.n.Node ] [ip-10-0-0-68] stopped
[2020-07-28T18:18:30,024][INFO ][o.e.n.Node ] [ip-10-0-0-68] closing ...
[2020-07-28T18:18:30,032][INFO ][o.e.n.Node ] [ip-10-0-0-68] closed
[2020-07-28T18:18:29,947][INFO ][o.e.n.Node ] [ip-10-0-0-68] stopping ...
This log line means Elasticsearch shut down gracefully after receiving a shutdown signal (typically SIGTERM) from an external source. It's not possible to say what the external source is, it depends on your system. It could for instance be systemd if that's how you're starting Elasticsearch. If so, hopefully its logs tell you why it's sending that shutdown signal.

elasticsearch: How to reinitialize a node?

elasticsearch 1.7.2 on CentOS
We have a 3 node cluster that has been running fine. A networking problem caused the "B" node to lose network access. (It then turns out that the C node had the "minimum_master_nodes" as 1, not 2.)
So we are now poking along with just the A node.
We fixed the issues on the B and C nodes, but they refuse to come up and join the cluster. On B and C:
# curl -XGET http://localhost:9200/_cluster/health?pretty=true
{
"error" : "MasterNotDiscoveredException[waited for [30s]]",
"status" : 503
}
The elasticsearch.yml is as follows (the name on "b" and "c" nodes are reflected in the node names on those systems, ALSO, the IP addys on each node reflect the other 2 nodes, HOWEVER, on the "c" node, the index.number_of_replicas was mistakenly set to 1.)
cluster.name: elasticsearch-prod
node.name: "PROD-node-3a"
node.master: true
index.number_of_replicas: 2
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["192.168.3.100", "192.168.3.101"]
We have no idea why they won't join. They have network visibility to A, and A can see them. Each node correctly has the other two defined in "discovery.zen.ping.unicast.hosts:"
On B and C, the log is very sparse, and tells us nothing:
# cat elasticsearch.log
[2015-09-24 20:07:46,686][INFO ][node ] [The Profile] version[1.7.2], pid[866], build[e43676b/2015-09-14T09:49:53Z]
[2015-09-24 20:07:46,688][INFO ][node ] [The Profile] initializing ...
[2015-09-24 20:07:46,931][INFO ][plugins ] [The Profile] loaded [], sites []
[2015-09-24 20:07:47,054][INFO ][env ] [The Profile] using [1] data paths, mounts [[/ (rootfs)]], net usable_space [148.7gb], net total_space [157.3gb], types [rootfs]
[2015-09-24 20:07:50,696][INFO ][node ] [The Profile] initialized
[2015-09-24 20:07:50,697][INFO ][node ] [The Profile] starting ...
[2015-09-24 20:07:50,942][INFO ][transport ] [The Profile] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/10.181.3.138:9300]}
[2015-09-24 20:07:50,983][INFO ][discovery ] [The Profile] elasticsearch/PojoIp-ZTXufX_Lxlwvdew
[2015-09-24 20:07:54,772][INFO ][cluster.service ] [The Profile] new_master [The Profile][PojoIp-ZTXufX_Lxlwvdew][elastic-search-3c-prod-centos-case-48307][inet[/10.181.3.138:9300]], reason: zen-disco-join (elected_as_master)
[2015-09-24 20:07:54,801][INFO ][http ] [The Profile] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/10.181.3.138:9200]}
[2015-09-24 20:07:54,802][INFO ][node ] [The Profile] started
[2015-09-24 20:07:54,880][INFO ][gateway ] [The Profile] recovered [0] indices into cluster_state
[2015-09-24 20:42:45,691][INFO ][node ] [The Profile] stopping ...
[2015-09-24 20:42:45,727][INFO ][node ] [The Profile] stopped
[2015-09-24 20:42:45,727][INFO ][node ] [The Profile] closing ...
[2015-09-24 20:42:45,735][INFO ][node ] [The Profile] closed
How do we bring the whole beast to life?
Rebooting B and C makes no difference at all
I am hesitant to cycle A, as that is what our app is hitting...
Well, we do not know what brought it to life, but it kind of magically came back up.
I believe that the shard reroute, (shown here: elasticsearch: Did I lose data when two of my three nodes went down? ) caused the nodes to rejoin the cluster. Our theory is that node A, the only surviving node, was not a "healthy" master, because it knew that one shard (the "p" cut of shard 1, as spelled out here: elasticsearch: Did I lose data when two of my three nodes went down? ) was not allocated.
Since the master knew it was not intact, the other nodes declined to join the cluster, throwing the "MasterNotDiscoveredException"
Once we got all the "p" shards assigned to the surviving A node, the other nodes joined up, and did the whole replicating dance.
HOWEVER Data was lost by allocating the shard like that. We ultimately set up a new cluster, and are rebuilding the index (which takes several days).

Elasticsearch fails to start

I'm trying to implement a 2 node ES cluster using Amazon EC2 instances. After everything is setup and I try to start the ES, it fails to start. Below are the config files:
/etc/elasticsearch/elasticsearch.yml - http://pastebin.com/3Q1qNqmZ
/etc/init.d/elasticsearch - http://pastebin.com/f3aJyurR
Below are the /var/log/elasticsearch/es-cluster.log content -
[2014-06-08 07:06:01,761][WARN ][common.jna ] Unknown mlockall error 0
[2014-06-08 07:06:02,095][INFO ][node ] [logstash] version[0.90.13], pid[29666], build[249c9c5/2014-03-25T15:27:12Z]
[2014-06-08 07:06:02,095][INFO ][node ] [logstash] initializing ...
[2014-06-08 07:06:02,108][INFO ][plugins ] [logstash] loaded [], sites []
[2014-06-08 07:06:07,504][INFO ][node ] [logstash] initialized
[2014-06-08 07:06:07,510][INFO ][node ] [logstash] starting ...
[2014-06-08 07:06:07,646][INFO ][transport ] [logstash] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/10.164.27.207:9300]}
[2014-06-08 07:06:12,177][INFO ][cluster.service ] [logstash] new_master [logstash][vCS_3LzESEKSN-thhGWeGA][inet[/<an_ip_is_here>:9300]], reason: zen-disco-join (elected_as_master)
[2014-06-08 07:06:12,208][INFO ][discovery ] [logstash] es-cluster/vCS_3LzESEKSN-thhGWeGA
[2014-06-08 07:06:12,334][INFO ][http ] [logstash] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/<an_ip_is_here>:9200]}
[2014-06-08 07:06:12,335][INFO ][node ] [logstash] started
[2014-06-08 07:06:12,379][INFO ][gateway ] [logstash] recovered [0] indices into cluster_state
I see several things that you should correct in your configuration files.
1) Need different node names. You are using the same config file for both nodes. You do not want to do this if you are setting node name like you are: node.name: "logstash". Either create separate configuration files with different node.name entries or comment it out and let ES auto assign the node.name.
2) Mlockall setting is throwing an error. I would not start out setting bootstrap.mlockall: True until you've first gotten ES to run without it and then have spent a little time configuring linux to support it. It can cause problems with booting up:
Warning
mlockall might cause the JVM or shell session to exit if it tries to
allocate more memory than is available!
I'd check out the documentation on the configuration variables and be careful about making too many adjustments right out of the gate.
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup-configuration.html
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup-service.html
If you do want to make memory adjustments to ES this previous stackoverflow article should be helpful:
How to change Elasticsearch max memory size

logstash - Exception in thread ">output" org.elasticsearch.discovery.MasterNotDiscoveredException: waited for [30s]

Log stash is 100% a disaster for me. I am using LS 1.4.1 and ES 1.02 in the same machine.
Here is how I start logstash indexer:
/usr/local/share/logstash-1.4.1/bin/logstash -f /usr/local/share/logstash.indexer.config
input {
redis {
host => "redis.queue.do.development.sf.test.com"
data_type => "list"
key => "logstash"
codec => json
}
}
output {
stdout { }
elasticsearch {
bind_host => "127.0.0.1"
port => "9300"
}
}
ES I set:
network.bind_host: 127.0.0.1
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["127.0.0.1:9300"]
And wow..this is what I get:
/usr/local/share/logstash-1.4.1/bin/logstash -f /usr/local/share/logstash.indexer.config
Using milestone 2 input plugin 'redis'. This plugin should be stable, but if you see strange behavior, please let us know! For more information on plugin milestones, see http://logstash.net/docs/1.4.1/plugin-milestones {:level=>:warn}
log4j, [2014-05-29T12:02:29.545] WARN: org.elasticsearch.discovery: [logstash-do-logstash-sf-development-20140527082230-866-2010] waited for 30s and no initial state was set by the discovery
Exception in thread ">output" org.elasticsearch.discovery.MasterNotDiscoveredException: waited for [30s]
at org.elasticsearch.action.support.master.TransportMasterNodeOperationAction$3.onTimeout(org/elasticsearch/action/support/master/TransportMasterNodeOperationAction.java:180)
at org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(org/elasticsearch/cluster/service/InternalClusterService.java:492)
at java.util.concurrent.ThreadPoolExecutor.runWorker(java/util/concurrent/ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(java/util/concurrent/ThreadPoolExecutor.java:615)
at java.lang.Thread.run(java/lang/Thread.java:744)
See http://logstash.net/docs/1.4.1/outputs/elasticsearch
VERSION NOTE: Your Elasticsearch cluster must be running Elasticsearch 1.1.1. If you use any other version of Elasticsearch, you should set protocol => http in this plugin.
So your problem is that logstash doesn't support the older ES version you are using without using an http transport.
Setting 'protocol => "http"' worked for me. I expected the EPEL repo to have complementary versions of logstash and elasticsearch, but ES is used for lots of stuff, thus is not tightly coupled with the logstash rpms.
For me, the problem wasn't with the versions of elasticsearch or logstash. I had just installed them and I was using the latest version of each (1.5.0 & 1.4.2 respectively).
Running the following worked for me as well:
logstash -e 'input { stdin { } } output { elasticsearch { protocol => "http" } }'
But I wanted to get to the bottom of why I wasn't able to connect over the other protocols. Though the documentation doesn't say what the default protocol is, I was pretty sure I was either using transport or node for port 9300 by default because of the following output I got when I started elasticsearch
[2015-04-14 22:21:56,355][INFO ][node ] [Super-Nova] version[1.5.0], pid[10796], build[5448160/2015-03-23T14:30:58Z]
[2015-04-14 22:21:56,355][INFO ][node ] [Super-Nova] initializing ...
[2015-04-14 22:21:56,358][INFO ][plugins ] [Super-Nova] loaded [], sites []
[2015-04-14 22:21:58,186][INFO ][node ] [Super-Nova] initialized
[2015-04-14 22:21:58,187][INFO ][node ] [Super-Nova] starting ...
[2015-04-14 22:21:58,257][INFO ][transport ] [Super-Nova] bound_address {inet[/127.0.0.1:9300]}, publish_address {inet[/127.0.0.1:9300]}
[2015-04-14 22:21:58,273][INFO ][discovery ] [Super-Nova] elasticsearch/KPaTxb9vRnaNXBncN5KN7g
[2015-04-14 22:22:02,053][INFO ][cluster.service ] [Super-Nova] new_master [Super-Nova][KPaTxb9vRnaNXBncN5KN7g][Azads-MBP-2][inet[/127.0.0.1:9300]], reason: zen-disco-join (elected_as_master)
[2015-04-14 22:22:02,069][INFO ][http ] [Super-Nova] bound_address {inet[/127.0.0.1:9200]}, publish_address {inet[/127.0.0.1:9200]}
[2015-04-14 22:22:02,069][INFO ][node ] [Super-Nova] started
At first, I tried opening up port 9300 by following these instructions. That didn't change a thing, so most likely that port wasn't blocked.
Then I stumbled upon this github issue. There wasn't really a solution there that helped, but I did double check to make sure my elasticsearch cluster name was right by checking elasticsearch.yaml (This file is ususally stored where elasticsearch is installed. Run "which elasticsearch" to give you an idea where to look). Lo and behold, my elastisearch cluster.name had my name appended to it. Removing it so that the cluster name was just "elasticsearch" helped logstash discover my elasticsearch instance.

elasticsearch auto-discovery rackspace not working

I'm trying to use ElasticSearch for an application I'm building, and I am hosting it on Rackspace servers. However, the auto-discovery feature is not working. I thought that it was because auto-discovery uses broadcast and multicast to find the other nodes with the matching cluster name. I found this article saying that Rackspace does now support multicast and broadcast with their new Cloud Networks feature. Then following the article's instructions I created a network, and added that network to both of the servers the nodes were running on. I then tried restarting ElasticSearch on both nodes, but they didn't find each other, and each declared themselves as the "master" (here's the output from the logs):
[2013-04-03 22:14:03,516][INFO ][node ] [Nemesis] {0.20.6}[2752]: initializing ...
[2013-04-03 22:14:03,530][INFO ][plugins ] [Nemesis] loaded [], sites []
[2013-04-03 22:14:07,873][INFO ][node ] [Nemesis] {0.20.6}[2752]: initialized
[2013-04-03 22:14:07,873][INFO ][node ] [Nemesis] {0.20.6}[2752]: starting ...
[2013-04-03 22:14:08,052][INFO ][transport ] [Nemesis] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/166.78.177.149:9300]}
[2013-04-03 22:14:11,117][INFO ][cluster.service ] [Nemesis] new_master [Nemesis][3ih_VZsNQem5W4csDk-Ntg][inet[/166.78.177.149:9300]], reason: zen-disco-join (elected_as_master)
[2013-04-03 22:14:11,168][INFO ][discovery ] [Nemesis] elasticsearch/3ih_VZsNQem5W4csDk-Ntg
[2013-04-03 22:14:11,202][INFO ][http ] [Nemesis] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/166.78.177.149:9200]}
[2013-04-03 22:14:11,202][INFO ][node ] [Nemesis] {0.20.6}[2752]: started
[2013-04-03 22:14:11,275][INFO ][gateway ] [Nemesis] recovered [0] indices into cluster_state
The other node's log:
[2013-04-03 22:13:54,538][INFO ][node ] [Jaguar] {0.20.6}[3364]: initializing ...
[2013-04-03 22:13:54,546][INFO ][plugins ] [Jaguar] loaded [], sites []
[2013-04-03 22:13:58,825][INFO ][node ] [Jaguar] {0.20.6}[3364]: initialized
[2013-04-03 22:13:58,826][INFO ][node ] [Jaguar] {0.20.6}[3364]: starting ...
[2013-04-03 22:13:58,977][INFO ][transport ] [Jaguar] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/166.78.63.101:9300]}
[2013-04-03 22:14:02,041][INFO ][cluster.service ] [Jaguar] new_master [Jaguar][WXAO9WOoQDuYQo7Z2GeAOw][inet[/166.78.63.101:9300]], reason: zen-disco-join (elected_as_master)
[2013-04-03 22:14:02,094][INFO ][discovery ] [Jaguar] elasticsearch/WXAO9WOoQDuYQo7Z2GeAOw
[2013-04-03 22:14:02,129][INFO ][http ] [Jaguar] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/166.78.63.101:9200]}
[2013-04-03 22:14:02,129][INFO ][node ] [Jaguar] {0.20.6}[3364]: started
[2013-04-03 22:14:02,211][INFO ][gateway ] [Jaguar] recovered [0] indices into cluster_state
Is adding the network not enough (Rackspace also gave me an IP for this network)? Do I need to somehow specify in the conf file to check that network when using multicast to find other nodes?
I also found this article which offered a different approach. Per the article's instructions I put this into /config/elasticsearch.yml:
cloud:
account: account #
key: account key
compute:
type: rackspace
discovery:
type: cloud
However, then when I tried to restart ElasticSearch I got this:
Stopping ElasticSearch...
Stopped ElasticSearch.
Starting ElasticSearch...
Waiting for ElasticSearch.......
WARNING: ElasticSearch may have failed to start.
And it did fail to start. I checked into the log file for any errors, but this was all that was there:
[2013-04-03 22:31:00,788][INFO ][node ] [Chamber] {0.20.6}[4354]: initializing ...
[2013-04-03 22:31:00,797][INFO ][plugins ] [Chamber] loaded [], sites []
And it stopped there without any errors and without continuing.
Has anyone successfully gotten ElasticSearch to work in the Rackspace cloud before? I know that the unicast option is also available, but I'd prefer to not have to specify each IP address individually, as I would like it to be easy to add other nodes later. Thanks!
UPDATE
I haven't solved the issue yet, but after some searching I found this post that says the "old" cloud plugin was discontinued and replaced with just an Ec2 plugin for Amazon's cloud, which explains why the changes I made to the config file do not work.
Multicast is disabled on public cloud for security reasons(can be verified with ifconfig). Here is an article that should get you what you need:
https://developer.rackspace.com/blog/elasticsearch-autodiscovery-on-the-rackspace-cloud/

Resources