Elasticsearch Security and Ports

I have set up an Elasticsearch cluster with two data nodes, one master node, and one client node with Kibana.
I was running it with iptables disabled on each node (CC 6). Now I need to enable iptables, and I want to know which of the ports (9200, 9300) I need to open on each node and in which direction (incoming or outgoing). The discovery is using unicast.
I would also like to know on which node I should place authentication, i.e. just the client node?
Cluster: mycluster
data-node1
data-node2
master-node1
client-node1
Thanks.

Port 9200 is used for the HTTP API; port 9300 is used for node-to-node communication within the cluster.
For the above configuration I would:
Bind port 9200 on all hosts to 127.0.0.1.
Bind port 9300 on all hosts to the local LAN, i.e. 192.168.x.x.
Run nginx with basic authentication (htpasswd, for example) as a reverse proxy to 127.0.0.1:5601 (Kibana), assuming you're running your client node on the same machine as Kibana.
In your Kibana configuration, have it connect to localhost:9200 and bind its interface to 127.0.0.1.
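A minimal elasticsearch.yml sketch of that scheme, shown for data-node1, is below; the 192.168.1.x addresses are placeholders for your nodes' LAN IPs, and the exact setting names can differ between Elasticsearch versions, so treat it as a starting point rather than a drop-in config:

http.port: 9200
http.bind_host: 127.0.0.1            # HTTP API reachable only from the node itself
transport.tcp.port: 9300
transport.bind_host: 192.168.1.10    # node-to-node traffic bound to this node's LAN address
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["192.168.1.10:9300", "192.168.1.11:9300", "192.168.1.12:9300", "192.168.1.13:9300"]

With that in place, the only Elasticsearch-related inbound port you need to allow in iptables on each node is 9300/tcp from the other cluster members, plus whatever port nginx listens on for Kibana on the client node.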

Related

Overriding `tcp.publish_port` breaks clustering when elasticsearch is in a container

I'm trying to run an Elasticsearch cluster with each es-node running in its own container. These containers are deployed using ECS across several machines that may be running other, unrelated containers. To avoid port conflicts, each port a container exposes is assigned a random value. These random ports are consistent across all running containers of the same type. In other words, all running es-node containers map port 9300 to the same random number.
Here's the config I'm using:
network:
  host: 0.0.0.0
plugin:
  mandatory: cloud-aws
cluster:
  name: ${ES_CLUSTER_NAME}
discovery:
  type: ec2
  ec2:
    groups: ${ES_SECURITY_GROUP}
    any_group: false
  zen.ping.multicast.enabled: false
transport:
  tcp.port: 9300
  publish_port: ${_INSTANCE_PORT_TRANSPORT}
cloud.aws:
  access_key: ${AWS_ACCESS_KEY}
  secret_key: ${AWS_SECRET_KEY}
  region: ${AWS_REGION}
In this case _INSTANCE_PORT_TRANSPORT is the port that 9300 is bound to on the host machine. I've confirmed that all the environment variables used above are set correctly. I'm also setting network.publish_host to the host machine's local IP via a command line arg.
When I forced _INSTANCE_PORT_TRANSPORT (and in turn transport.publish_port) to be 9300, everything worked great, but as soon as it's given a random value, nodes can no longer connect to each other. I see errors like this using logger.discovery=TRACE:
ConnectTransportException[[][10.0.xxx.xxx:9300] connect_timeout[30s]]; nested: ConnectException[Connection refused: /10.0.xxx.xxx:9300];
at org.elasticsearch.transport.netty.NettyTransport.connectToChannelsLight(NettyTransport.java:952)
at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:916)
at org.elasticsearch.transport.netty.NettyTransport.connectToNodeLight(NettyTransport.java:888)
at org.elasticsearch.transport.TransportService.connectToNodeLight(TransportService.java:267)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing$3.run(UnicastZenPing.java:395)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
It seems like the port a node binds to is the same as the port it pings while trying to connect to other nodes. Is there any way to make them different? If not, what's the point of transport.publish_port?
The way the discovery-ec2 plugin works is that it collects a list of IP addresses using the AWS EC2 API and uses that list as the unicast list of nodes.
But it does not collect any information from the running cluster; obviously, the node is not yet connected.
So it does not know anything about the publish_port of other nodes.
It just adds an IP address, and that's all. Elasticsearch then uses the default port, which is 9300.
So there is nothing you can do, IMO, to fix that in the short term.
But we can imagine adding a new feature close to what has been implemented for Google Compute Engine, where we use specific metadata to get this port from the GCE APIs.
We could do the same for Azure and EC2. Do you want to open an issue so we can track the effort?
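As a rough illustration (the addresses are made up), what the plugin builds is effectively equivalent to configuring the following, with the default transport port assumed for every address, which is why a remapped publish port never comes into play during pinging:

discovery.zen.ping.unicast.hosts: ["10.0.1.15:9300", "10.0.1.27:9300", "10.0.1.42:9300"]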

Elasticsearch cluster configuration is not discovering any nodes under both unicast and multicast

I've been trying to use the lovely ansible-elasticsearch project to set up a nine-node Elasticsearch cluster.
Each node is up and running... but they are not communicating with each other. The master nodes think there are zero data nodes. The data nodes are not connecting to the master nodes.
They all have the same cluster.name. I have tried with multicast enabled (discovery.zen.ping.multicast.enabled: true) and disabled (the previous setting set to false, plus discovery.zen.ping.unicast.hosts: ["host1","host2",...,"host9"]), but in either case the nodes are not communicating.
They have network connectivity to one another - verified via telnet over port 9300.
Sample output:
$ curl host1:9200/_cluster/health
{"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":"waited for [30s]"}],"type":"master_not_discovered_exception","reason":"waited for [30s]"},"status":503}
I cannot think of any more reasons why they wouldn't connect, so I'm looking for more ideas of what to try.
Edit: I finally resolved this issue. The settings that worked were publish_host set to "_non_loopback:ipv4_" and unicast discovery with discovery.zen.ping.unicast.hosts set to ["host1:9300","host2:9300","host3:9300"], listing only the dedicated master nodes. I have a minimum master node count of 2.
The only reasons I can think of that can cause this behavior are:
Connectivity issues - ping is not a good tool to check that nodes can connect to each other. Use telnet and try connecting from host1 to host2 on port 9300.
Your elasticsearch.yml is set to bind 127.0.0.1 or the wrong host - if you're not sure, bind 0.0.0.0 to see whether that solves your connectivity issues, then change it to bind only internal hosts so Elasticsearch isn't exposed directly to the internet.
Your publish_host is incorrect - this usually happens when you run ES inside a Docker container, for example; make sure that publish_host is set to an address that other hosts can reach.
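For reference, a minimal elasticsearch.yml sketch of the combination reported as working in the edit above (host1-host3 stand for the three dedicated master nodes):

network.publish_host: "_non_loopback:ipv4_"
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["host1:9300", "host2:9300", "host3:9300"]
discovery.zen.minimum_master_nodes: 2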

camel elasticsearch access port 80

I have port 9200 proxied via port 80 on my server running Elasticsearch. I have a Camel route that needs to index documents to this server.
Is this supported by the camel-elasticsearch plugin, i.e. accessing Elasticsearch via a port other than 9300?
I understand that port 9300 uses the native Elasticsearch transport protocol.
What are my options here? Can I proxy 9300 via Apache? I'm not sure whether that works.
Or does the camel-elasticsearch plugin support HTTP transport? Please help. Thanks.
The port you are referring to is the HTTP port:
http.port - A bind port range. Defaults to 9200-9300.
http.publish_port - The port that HTTP clients should use when communicating with this node. Useful when a cluster node is behind a proxy or firewall and the http.port is not directly addressable from the outside. Defaults to the actual port assigned via http.port.
http.bind_host - The host address to bind the HTTP service to. Defaults to http.host (if set) or network.bind_host.
http.publish_host - The host address to publish for HTTP clients to connect to. Defaults to http.host (if set) or network.publish_host.
http.host - Used to set the http.bind_host and the http.publish_host. Defaults to network.host.
So you really don't need a proxy; you can have Elasticsearch listen directly on port 80.
If you already have a process running on port 80, then you can proxy the connections to 9200 (and leave Elasticsearch at its defaults).
Java Transport Client ---> Apache HTTP Proxy (80) ----> ES (9300) [Can I do this? As I understand, the Java Transport Client uses a non-HTTP protocol?]
The protocol has nothing to do with the port.
Simply pass 80 to InetSocketTransportAddress. See the documentation for a complete example.
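To make that concrete, here is a minimal elasticsearch.yml sketch of the options discussed above; the values are illustrative and the options are alternatives, not settings to apply all at once:

# Option A: serve the HTTP API directly on port 80 (the ES process must be allowed to bind it)
http.port: 80

# Option B: keep HTTP on 9200 behind an Apache/nginx proxy listening on 80,
# and advertise 80 so clients behind the proxy use the right port
# http.port: 9200
# http.publish_port: 80

# For the Java Transport Client, the native protocol port is just as configurable:
# bind the transport wherever you like and pass that same port to
# InetSocketTransportAddress on the client side
# transport.tcp.port: 80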

How to secure a Digital Ocean Elasticsearch cluster?

I need to expose my ES cluster to the world, and am securing it via Nginx with a proxy *:9201 -> localhost:9200 (working).
However, in order to form a cluster, I am trying to use the private network on DigitalOcean to get the nodes to talk to each other.
How can I expose the node-node transport on the private network interface unsecured, while not exposing port 9200 to the world?
I am trying something like
network.publish_host: 10.128.97.184
http.port: 9200
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: 10.128.97.184,10.128.97.185
in elasticsearch.yml but it's not working, probably because the port 9300 might also be nginx-protected?
my nginx file looks like
root@els-node-1:~# cat /etc/nginx/sites-enabled/elasticsearch
server {
    listen *:9201;
    access_log /var/log/nginx/elasticsearch.access.log;

    location / {
        auth_basic "Restricted";
        auth_basic_user_file /etc/nginx/htpasswd;
        proxy_pass http://localhost:9200;
        proxy_read_timeout 90;
    }
}
And I am able to form the cluster, but I can't see how to secure the external 9200 (disable it to all but 127.0.0.1) while keeping the internal interface open for addresses like 10.x.x.x.
Thanks for help!
Even if you use the private network, your ES cluster is not secure as anybody within the same Digital Ocean private network can still access your nodes through the open ports 9200 and 9300 (and potentially other services).
Your best bet is to secure your boxes via iptables and only whitelist the IPs you know are your own servers.
Drop all incoming and forwarded packets and add explicit rules for the other nodes in the cluster only.
Also, use network.bind_host instead of network.publish_host, and in addition set up ES to use the eth1 interface only; check out the ES network docs for details.
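A rough elasticsearch.yml sketch along those lines (the eth1 interface name and the addresses mirror the question; double-check the exact setting names for your Elasticsearch version):

network.bind_host: "_eth1:ipv4_"     # bind only on the private-network interface
http.port: 9200
transport.tcp.port: 9300
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["10.128.97.184", "10.128.97.185"]

Note that if nginx keeps proxying to http://localhost:9200, you would either need to include the loopback address in the bind host list or point proxy_pass at the eth1 address instead.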

Port 9300 on Elasticsearch

According to the documentation, Elasticsearch reserves ports 9300-9400 for cluster communication and ports 9200-9300 for accessing the Elasticsearch APIs. You get the impression that these ranges are inclusive, so port 9300 would be part of both the first and the second range.
Now, my IT ops department won't like that, so hopefully I got it wrong. Does anyone know?
Elasticsearch binds a single port per node for HTTP and a single port for the node/transport protocol.
For each, it tries the lowest available port in the range first and, if that is already taken, tries the next one. If you run a single node on your machine, it will only bind 9200 and 9300.
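If the overlap at 9300 is still a concern for your IT ops department, you can also pin each service to a single, explicit port; a minimal sketch (these are just the defaults made explicit):

http.port: 9200
transport.tcp.port: 9300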
See also: Elasticsearch Internals: Networking Introduction
