Elastic Search Clustering in the Cloud - elasticsearch

I have 2 Linux VM's (both at same datacenter of Cloud Provider): Elastic1 and Elastic2 (where Elastic 2 is a clone of Elastic 1). Both have same version centos, same cluster name, and same version ES, again - Elastic2 is a clone.
I use the service wrapper to automatically start them both at boot, and introduced each others ip to their respective iptables file, so now I can successfully ping between nodes.
I thought this would be enough to allow ES to form a cluster, but to no avail.
Both Elastic1 and Elastic2 have 1 index each named e1 and e2 respectfully. Each index has 1 shard with no replicas.
I can use the head and paramedic plugins on each server successfully. And use curl -XGET 'http://localhost:9200/_cluster/nodes?pretty=true' to validate the cluster name is the same and each server only has 1 node listed.
Is there anything glaring out at why these nodes arent talking? Ive restarted the ES service and rebooted on both servers to no avail. Could cloning be the problem??

In your elasticsearch.yml:
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ['host1:9300', 'host2:9300']
So, just list your node IPs with the transport port (default is 9300) under unicast hosts. Multicast is enabled by default, but is generally impossible on cloud environments without use of external plugins.
Also, make sure to check your IP rules / security groups! That's easy to forget.

Related

Cluster configuration on MonetDB: cannot discover other nodes

I have installed and configured a 3 monetdb nodes cluster on 3 virtual machines on my MacBook (Using Oracle Virtual Box). I use MonetDB 5 server 11.37.7
I have followed the Cluster Management documentation of MonetDB, but the monetdb discover command only returns the dbfarm of the local instance. Each node still isn't aware of other nodes.
I can connect to any nodes from any other node using monetdb -h [host] -P [passphrase], I can also discover the remote farms of a specific host by using monetdb -h host -P passphrase discover
The answer to this question monetdb cluster management can't setup helped me in setting the listenaddr property to 0.0.0.0, but still, the discover command only returns the local monetdb farm.
EDIT
Thanks to Jennie suggestion below, I noticed that the monetdb log file contains error while sending broadcast message: Network is unreachable.
I used netcat utility to brodcast UDP message from one node to the other 2 and it worked, I can ping, ssh and the 3 nodes are part of the same network configured with virtualbox, but the error is still there.
All your VMs must be in the same LAN environment. monetdb discover basically goes over all IP addresses under the same subnet.
Can you some how verify that's the case?
I got it working, thanks to #Jennie's post. For anyone using VirtualBox:
Use the first network adapter of each configured node with Bridge access instead of NAT
Configure the following property of your dbfarm: listenaddr=0.0.0.0
For testing purpose, it may be worth reducing the property discoveryttl to less than the default 10mns

why do we need to setup a publish address[network.host] value

Looks like elastic search is not discoverable without setting the box's ip address in this property : network.host .
Why cant it just bind to the box's ip address(like it happens in application servers like rest apps).
Why is there even a provision to bind to a particular ip address?
The key property that matters is network.publish_host. You configure this indirectly via network.host. The publish host is the address that nodes advertise to other nodes as the address to be reached on when they join the cluster. So, it needs to be something that is reachable from the other nodes. E.g. 127.0.0.1 would not work for this; likewise a loadbalanced address won't work either.
Also see documentation for these properties
Many servers have multiple network interfaces and a common problem before this change was Elasticsearch picking the wrong one for the publish host and then failing to cluster because the nodes ended up advertising the wrong address to each other. Since Elasticsearch cannot know the right interface, you have to tell it.
This change has been introduced in 2.0 as explained in the breaking changes > network changes documentation:
This change prevents Elasticsearch from trying to connect to other nodes on your network unless you specifically tell it to do so. When moving to production you should configure the network.host parameter
The ES folks also released a blog article back then to explain the underlying reasons for this change, i.e. mainly to prevent your node from accidentally binding to another cluster available on the network.
To run on a local network a single node I added these to my
un-comment or comment
elasticsearch.yml
http.port: 9201
http.bind_host: 192.168.1.172 #works
or
http.port: 9201
http.publish_host: 192.168.1.172 #by itself does not work
http.host: 192.168.1.172 #works alone

Elastic Search : Adding nodes to cluster on the fly

I want to setup Elastic-Search cluster. As it is a distributed system, I should be able to add more nodes on the fly(meaning: adding new nodes after it is deployed once). How is this done and how does Elastic-search manage to do it?
Elasticsearch handles this using Zen Discovery
The zen discovery is the built in discovery module for elasticsearch
and the default. It provides unicast discovery, but can be extended to
support cloud environments and other forms of discovery.
This is done through elasticsearch.yml configuration file. You have two options - multicast and unicast:
Multicast lets your new node to connect to your cluster without specifying IPs, however it's not recommended.
Unicast. You specify a list of nodes in your cluster (their IPs).
Both ways, your started node will try to ping other nodes and if their cluster names are matching, it will join it.
Configuration example:
cluster.name: elasticsearch_media
node.name: "media-dev"
node.master: true
node.data: true
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["153.32.228.250[9300-9400]", "10.122.234.19[9300-9400]"]
All you have to do is to edit the main configuration file on your new node and change the cluster name to the cluster you are currently running. Of course the new node must be discoverable. This depends on your network settings.
Try to write a script which will accept command line arguments about cluster name and IP addresses, authentications etc.,
This script will open and modify elasticsearch.yml file on the remote server.

Elasticsearch Access Log

I'm trying to track down who is issuing queries to an ElasticSearch Cluster. Elastic doesn't appear to have an access log.
Is there a place where I can find out which IP is hitting the cluster?
Elasticsearch doesn't provide any security out of the box, and that is on purpose and by design.
So you have a couple solutions out there:
Don't let your ES cluster exposed to the open world, but put it behind a firewall (i.e. whitelist the hosts that can access ports 9200/9300 on your nodes)
Look into the Shield plugin for Elasticsearch in order to secure your environment.
Put an nginx server in front of your cluster to act as a reverse proxy.
Add simple basic authentication with either the elasticsearch-jetty plugin or simply the elasticsearch-http-basic plugin, which also allowws you to whitelist the client IPs that are allowed to access your cluster.
If you want to have access logs, you need either 2 or 3, but all solutions above will allow you to secure your ES environment.

elasticsearch on Ec2 cannot hit public IP(timeout)

I have elasticsearch running on EC2,
I can hit form local IP address(ex. curl -XGET localhost:9200)
I cannot hit from public IP address, whether on the same machine, or from our network, it always times out,
IPtables are allowing
port is open(to itself as well as private network)
Elasticsearch http.cors is enabled and allows "*"
aside from Iptables, amazon security config, elasticsearch config could there be anything I am overlooking? (we can access 443 and get kibana up, it just times out on the elasticsearch ajax call or if I try to access 9200 directly)
been working on this for over a day so I humbly come to you all!
thank you
I had exactly the same issue.
I managed to solve it as follows:
Do what TJ said in his comment, + restart the instance. I wasn't sure if this was/is necessary, but I did it for good measure.
I made sure that the following is set in the elasticsearch.yml file:
a. http.enabled: true
b. http.cors.enabled: true
c. http.cors.allow-origin: "*"
Restarted elasticsearch (service elasticsearch restart)
Then when I tried to access elasticsearch from the public IP it worked - http://[PUBLIC IP OF INSTANCE]:9200
Hope this helps.
I just spent lots of time trying to get this working and just succeeded.
Setup: Elasticsearch 6.2.4, running on a Windows Server 2012, EC2 instance.
I also installed the discovery-ec2 plugin, not sure now if it is required, my assumption is, yes it is required although some of the settings it allows were not necessary to get it working.
Config (.yml). I tried tons of different .yml config settings which in the end did not help, in the end I think the main setting is:
network.host: 0.0.0.0
I tried setting the network.host to ec2:privateIpv4 and ec2:publicIpv4 (plugin settings) but they didn't help.
I had added the required Custom TCP Rules (allowing 9200 and 9300...not sure if 9300 is needed).
Either it failed to start (usually with a binding to 9300 error) or started but was not publicly accessible.
The Fix. What got it working in the end is you must also open the port in windows firewall. As soon as I added the inbound rule, boom it connected :)
I then stripped out all the extra configs I had been trying, restarted Elasticsearch... and it still worked!

Resources