How to run multiple instances of Elasticsearch on one host

I have several machines, each with 128 GB of RAM, and each host runs a single instance of Elasticsearch.
I would like to run another data node on each host and allocate around 30 GB to the JVM heap.
I know I have to create a separate .yml config file, data directory, etc. My question is: do I need to modify the service wrapper so that each node can be started and stopped separately?
I am running ES version 1.3 on CentOS 6.5.
Thank you.

You need to prepare two elasticsearch.yml config files with the appropriate settings and specify these files when starting up the two nodes:
bin/elasticsearch -Des.config=$ES_HOME/config/elasticsearch.1.yml
bin/elasticsearch -Des.config=$ES_HOME/config/elasticsearch.2.yml
At least the following should be set differently for the two nodes:
http.port
transport.tcp.port
path.data
path.logs
the PID file (for example via the -p startup option)
node.name
In each file, the following needs to point to the other node so that the nodes can find each other:
discovery.zen.ping.unicast.hosts: '127.0.0.1:9302'
EDIT: this property is now deprecated; see https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-discovery-settings.html
See this blog and this discussion
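The per-node differences listed above can be sketched as a small generator. This is a minimal sketch and not part of Elasticsearch: the function name render_node_config, the base directory /var/lib/es, the cluster name and the port layout (9200+i for HTTP, 9300+i for transport) are all assumptions chosen for illustration.

```python
def render_node_config(index, total=2, base_dir="/var/lib/es", cluster="mycluster"):
    """Return elasticsearch.yml text for node number `index` (1-based) on one host.

    Hypothetical helper: paths, cluster name and port layout are assumptions.
    """
    http_port = 9200 + index        # 9201 for node 1, 9202 for node 2
    transport_port = 9300 + index   # 9301 for node 1, 9302 for node 2
    # Each node lists the transport addresses of the *other* local nodes.
    peers = [
        "127.0.0.1:%d" % (9300 + other)
        for other in range(1, total + 1)
        if other != index
    ]
    lines = [
        "cluster.name: %s" % cluster,
        "node.name: node-%d" % index,
        "http.port: %d" % http_port,
        "transport.tcp.port: %d" % transport_port,
        "path.data: %s/node-%d/data" % (base_dir, index),
        "path.logs: %s/node-%d/logs" % (base_dir, index),
        "discovery.zen.ping.unicast.hosts: [%s]"
        % ", ".join('"%s"' % p for p in peers),
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    for i in (1, 2):
        print("# elasticsearch.%d.yml" % i)
        print(render_node_config(i))
```

Writing each result to its own file (elasticsearch.1.yml, elasticsearch.2.yml) gives two configs that differ exactly in the settings named above, with each node's unicast list pointing at the other node's transport port.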

elasticsearch.yml-1
cluster.name: test
node.name: node-1
path.data: /Users/musab/Desktop/elasticsearch/data
path.logs: /Users/musab/Desktop/elasticsearch/logs
# lets up to 4 nodes on this host share the same path.data
node.max_local_storage_nodes: 4
elasticsearch.yml-2
cluster.name: test
node.name: node-2
path.data: /Users/musab/Desktop/elasticsearch/data
path.logs: /Users/musab/Desktop/elasticsearch/logs
node.max_local_storage_nodes: 4

Related

How to add a new replica node to a running Elasticsearch cluster without restarting the service?

My cluster has yellow health because it has only a single node, so the replicas remain unassigned simply because no other node is available to hold them.
I have read the help on the homepage.
I tried to add a new data node to my cluster, but it does not appear when I check the cluster's health.
This is the config of the new node:
cluster.name: elasticsearch
node.name: node-data-1
node.master: false
node.data: true
node.ingest: false
node.ml: false
http.port: 9201
I did not edit the old node's config; it is the default.
Can someone explain which files I have to edit and which commands I have to run in order to create another node in my cluster? Do I have to run two ES instances? How can I do this?
Thanks in advance.
Make sure to remove the /data folder from each Elasticsearch install before starting the nodes. Also, on the nodes that won't be the master, set the following property:
cluster.initial_master_nodes: ["machine_running_master"]
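To see whether the new node actually joined, the cluster health response can be inspected. The sketch below assumes a node is reachable at http://localhost:9200 (an assumption, adjust as needed); fetch_health and describe_health are hypothetical helper names, and only the status, number_of_nodes and unassigned_shards fields of the /_cluster/health response are used.

```python
import json
from urllib.request import urlopen


def fetch_health(base_url="http://localhost:9200"):
    """GET /_cluster/health and return the parsed JSON (needs a running node)."""
    with urlopen(base_url + "/_cluster/health", timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


def describe_health(health):
    """Summarise a /_cluster/health response in one line."""
    status = health["status"]
    nodes = health["number_of_nodes"]
    unassigned = health["unassigned_shards"]
    if status == "yellow" and nodes == 1:
        # Yellow means all primaries are assigned but some replicas are not.
        return (
            "yellow: %d replica shard(s) unassigned, only one node in the cluster"
            % unassigned
        )
    if status == "green":
        return "green: all shards assigned across %d node(s)" % nodes
    return "%s: %d node(s), %d unassigned shard(s)" % (status, nodes, unassigned)
```

If print(describe_health(fetch_health())) still reports one node after the second instance has started, the new node has not joined, which points at discovery or cluster.name settings rather than at replica configuration.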

How to set up multi-node Elasticsearch cluster in development mode?

I have an ES cluster (v 5.6.12) up and running in dev mode, config below:
node1.com
cluster.name: elastic-test
node.name: "node-1"
path.data: /path/to/data
path.logs: /path/to/logs
network.host: 127.0.0.1
http.host: 0.0.0.0
discovery.zen.ping.unicast.hosts: ["node1.com", "node2.com"]
node.master: true
I am trying to connect node 2 to the same cluster:
node2.com
cluster.name: elastic-test
node.name: "node-2"
path.data: /path/to/data
path.logs: /path/to/logs
network.host: 127.0.0.1
http.host: 0.0.0.0
discovery.zen.ping.unicast.hosts: ["node1.com", "node2.com"]
node.master: true
I tried changing network.host to the servers' respective addresses, but this takes them out of dev mode. I also tried setting the bind and publish hosts to make each node discoverable by the other nodes:
network.bind_host: 127.0.0.1
network.publish_host: node1.com
But again, this takes the nodes into production mode.
Is it actually possible to have multiple nodes on different servers communicate within development mode?
Short answer: no. For most use cases a single-node cluster suffices for DEV, but there can be scenarios where a multi-node cluster is required in a DEV environment. However, it is currently not possible to form a multi-node cluster across servers without binding to a non-local IP address.
That said, the only difference between development mode and production mode in Elasticsearch is that production mode prevents the node from starting if certain settings are not configured appropriately. So as long as you are able to configure the settings described at the link below, you can form a cluster and name it DEV so that users don't mistake it for a production cluster:
https://www.elastic.co/guide/en/elasticsearch/reference/5.6/system-config.html#dev-vs-prod
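The dev/production switch described above can be modelled roughly as "is any transport bind or publish address non-loopback?". The sketch below is a simplified model of that rule, not Elasticsearch's own code; the function names are hypothetical, and a hostname that is not an IP literal is assumed to resolve to an external interface.

```python
import ipaddress


def is_loopback(address):
    """True for loopback IPs and the usual loopback names.

    Hostnames that are not IP literals are treated as non-loopback (assumption).
    """
    if address in ("localhost", "_local_"):
        return True
    try:
        return ipaddress.ip_address(address).is_loopback
    except ValueError:
        # Not an IP literal, e.g. "node1.com".
        return False


def bootstrap_checks_enforced(transport_addresses):
    """Simplified rule: production mode once any transport address is non-loopback."""
    return any(not is_loopback(a) for a in transport_addresses)
```

Under this model, network.host: 127.0.0.1 combined with http.host: 0.0.0.0 stays in development mode because only HTTP is exposed, while adding network.publish_host: node1.com moves the node to production mode.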

How to configure two app server nodes to connect to the same Elasticsearch cluster?

I have one code base that connects to Elasticsearch (localhost:9200) for the full-text search feature. We deployed this code on two different machines (m1 and m2) behind a load balancer. In this case, how do I configure ES on the two machines so that they connect to each other and the index reflects changes from both sides?
I am using Elasticsearch v 5.1.2
Machine 1
cluster.name: production
node.name: database
Machine 2
cluster.name: production
node.name: app
The above settings worked on ES v1.7.1.
Question:
What configuration do I need to make it work on ES v5.1.2?
Please help me solve this issue.
Thanks in advance.
I'm assuming these nodes aren't part of the same cluster.
Try http://MACHINE_1_IP:9200/_cat/nodes?v and check whether all nodes are listed as part of the cluster.
If they are not, a quick guess: have you looked at the network.host setting? It binds to the loopback interface by default (a default introduced in 2.0).
This can be solved by using the network module settings (ref).
Update elasticsearch.yml on both app servers, keeping the same cluster name and using a different node name on each.
Example:
On Server_1, update elasticsearch.yml:
cluster.name: Production
node.name: APP
network.host: [server_1_IP, _local_]
discovery.zen.ping.unicast.hosts: [server_1_IP, server_2_IP]
On Server_2, update elasticsearch.yml:
cluster.name: Production
node.name: DB
network.host: [server_2_IP, _local_]
discovery.zen.ping.unicast.hosts: [server_1_IP, server_2_IP]

How to configure two nodes to connect to the same cluster in Elasticsearch?

I have 2 separate machines. Port 9200 is already taken by a separate Elasticsearch instance, so I specify 9201 as the http.port in the yml file. I set cluster.name: MyCluster.
When I start ./elasticsearch on machine 1 and machine 2, they do not connect; each becomes a single-node master.
What do I need to do so that they can connect to each other and be part of the same cluster?
I also set network.host: 0.0.0.0 so I know they can see each other. I am using Elasticsearch 2.4.0.
On machine 1:
cluster.name: hello_world
network.host: "hostname_or_ip_1"
http.port: 9201
transport.tcp.port: 9301
discovery.zen.ping.unicast.hosts: ["hostname_or_ip_2:9301"]
On machine 2:
cluster.name: hello_world
network.host: "hostname_or_ip_2"
http.port: 9201
transport.tcp.port: 9301
discovery.zen.ping.unicast.hosts: ["hostname_or_ip_1:9301"]
The cluster name must be the same on both nodes.
discovery.zen.ping.unicast.hosts should point to the other machine's address together with its transport port (not the HTTP port).
Make sure to restart the Elasticsearch node after editing the config file.
Look at unicast discovery with host:port. https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-discovery-zen.html
You might also need to be explicit about the transport.tcp.port in your elasticsearch.yml:
transport.tcp.port: 9301
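The points made in these answers (same cluster name, unicast hosts pointing at the other node's transport address) can be checked mechanically before starting the nodes. This is a minimal sketch under stated assumptions: check_cluster_configs is a hypothetical helper, and the settings are passed in as plain dicts rather than parsed from elasticsearch.yml.

```python
def check_cluster_configs(configs):
    """Return a list of problems found in per-node settings.

    `configs` maps a label (e.g. "machine 1") to a dict with the keys
    "cluster.name", "host", "transport.tcp.port" and
    "discovery.zen.ping.unicast.hosts" (a list of "host:port" strings).
    """
    problems = []
    # Cluster names are compared as exact strings, so case matters.
    names = {c["cluster.name"] for c in configs.values()}
    if len(names) > 1:
        problems.append("cluster.name differs: %s" % sorted(names))
    transport = {
        "%s:%d" % (c["host"], c["transport.tcp.port"]) for c in configs.values()
    }
    for label, c in configs.items():
        own = "%s:%d" % (c["host"], c["transport.tcp.port"])
        others = transport - {own}
        listed = set(c["discovery.zen.ping.unicast.hosts"])
        if others and not (listed & others):
            problems.append("%s lists no other node's transport address" % label)
    return problems
```

An empty list means the basic checks pass; a unicast entry that uses the HTTP port (9201 here) instead of the transport port (9301) is reported for the node that lists it.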

How to set up an ES cluster?

Assume I have 5 machines that I want to run an Elasticsearch cluster on, and they are all connected to a shared drive. I put a single copy of Elasticsearch onto that shared drive so that all of them can see it. Do I just start Elasticsearch from that shared drive on all of my machines and let the clustering automatically work its magic? Or do I have to configure specific settings to make Elasticsearch realize that it is running on 5 machines? If so, what are the relevant settings? Should I worry about configuring replicas, or is that handled automatically?
It's super easy.
You'll need each machine to have its own copy of Elasticsearch (simply copy the one you have now). The reason is that each machine (node) keeps its own files, which are sharded across the cluster.
The only thing you really need to do is edit the config file to include the name of the cluster.
If all machines have the same cluster name, Elasticsearch will do the rest automatically (as long as the machines are all on the same network).
Read here to get you started:
https://www.elastic.co/guide/en/elasticsearch/guide/current/deploy.html
When you create indexes (where the data goes), you define at that time how many replicas you want (they'll be distributed around the cluster).
It is usually handled automatically.
If autodiscovery doesn't work, edit the Elasticsearch config file to enable unicast discovery:
Node 1:
cluster.name: mycluster
node.name: "node1"
node.master: true
node.data: true
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["node1.example.com"]
Node 2:
cluster.name: mycluster
node.name: "node2"
node.master: false
node.data: true
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["node1.example.com"]
And so on for nodes 3, 4 and 5. Make node 1 the master and the rest data-only nodes.
Edit: Please note that a single master-eligible node gives no fail-over. For fail-over, make several nodes master-eligible (they may or may not also be data nodes) and, with N master-eligible nodes, set discovery.zen.minimum_master_nodes to a quorum of N/2+1, which avoids split-brain.
Also, if auto-discovery doesn't work, the most probable reason is that the network doesn't allow it (and it is therefore disabled). If too many auto-discovery pings take place across multiple servers, the resources needed to manage those pings will prevent other services from running correctly.
For example, think of a 10,000-node cluster with all 10,000 nodes sending auto-discovery pings.
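The N/2+1 figure is normally applied as a quorum over the master-eligible nodes, which is the value given to discovery.zen.minimum_master_nodes in pre-7.x versions. A minimal sketch of the arithmetic, using integer division (the function name is hypothetical):

```python
def minimum_master_nodes(master_eligible):
    """Quorum of master-eligible nodes: floor(N / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(n, "master-eligible ->", minimum_master_nodes(n))
```

With three master-eligible nodes the quorum is 2, so the cluster can lose one of them and still elect a master; with two the quorum is also 2, so losing either one blocks the election.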
Elasticsearch 7 changed the configuration for cluster initialisation.
It is important to note that ES instances communicate with each other over the transport layer (TCP), not over the HTTP protocol that is normally used to perform operations on the indices. Below is a sample config for a two-machine cluster.
Machine 1 config:
cluster.name: cluster-new
node.name: node-1
node.master: true
node.data: true
bootstrap.memory_lock: true
network.host: 0.0.0.0
http.port: 9200
transport.host: 102.123.322.211
transport.tcp.port: 9300
discovery.seed_hosts: ["102.123.322.211:9300", "102.123.322.212:9300"]
cluster.initial_master_nodes:
- "node-1"
- "node-2"
Machine 2 config:
cluster.name: cluster-new
node.name: node-2
node.master: true
node.data: true
bootstrap.memory_lock: true
network.host: 0.0.0.0
http.port: 9200
transport.host: 102.123.322.212
transport.tcp.port: 9300
discovery.seed_hosts: ["102.123.322.211:9300", "102.123.322.212:9300"]
cluster.initial_master_nodes:
- "node-1"
- "node-2"
cluster.name: This has to be the same on all machines that are going to be part of the cluster.
node.name: Identifier for the ES instance. Defaults to the machine name if not given.
node.master: Specifies whether this ES instance is master-eligible.
node.data: Specifies whether this ES instance is a data node (holds data).
bootstrap.memory_lock: Disables swapping. You can start the cluster without setting this flag, but setting the lock is recommended. More info: https://www.elastic.co/guide/en/elasticsearch/reference/master/setup-configuration-memory.html
network.host: Set to 0.0.0.0 if you want to expose the ES instance over the network. 0.0.0.0 is different from 127.0.0.1 (aka localhost or the loopback address): it means all IPv4 addresses on the machine. If the machine has multiple IP addresses and a server listening on 0.0.0.0, a client can reach the machine via any of those IPv4 addresses.
http.port: The port on which this ES instance listens for HTTP requests.
transport.host: The IPv4 address of the host (used to communicate with ES instances running on other machines). More info: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-transport.html
transport.tcp.port: 9300 (the port on which the machine accepts TCP connections from other nodes).
discovery.seed_hosts: This was changed in recent versions. List the IPv4 addresses, with the TCP transport port (important), of all ES instances that are going to be part of this cluster. The list is the same on all ES instances in the cluster.
cluster.initial_master_nodes: The node names (node.name) of the ES instances that are going to participate in the master election (quorum-based decision making: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-discovery-quorums.html#modules-discovery-quorums).
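Because discovery.seed_hosts and cluster.initial_master_nodes have to be identical on every machine, it can help to generate the per-machine files from one inventory. The sketch below is only an illustration: render_es7_configs is a hypothetical helper, it emits just the discovery-related subset of the settings described above, and the addresses in the usage line are documentation placeholders.

```python
def render_es7_configs(nodes, cluster="cluster-new", transport_port=9300):
    """Build config text per node from a list of (node_name, ip) tuples.

    Returns a dict {node_name: yaml_text}. Only discovery-related settings
    are rendered; everything else is left out of this sketch.
    """
    seed_hosts = ", ".join('"%s:%d"' % (ip, transport_port) for _, ip in nodes)
    masters = "\n".join('- "%s"' % name for name, _ in nodes)
    # This part is byte-for-byte identical on every machine.
    shared = (
        "discovery.seed_hosts: [%s]\n" % seed_hosts
        + "cluster.initial_master_nodes:\n%s\n" % masters
    )
    configs = {}
    for name, ip in nodes:
        configs[name] = (
            "cluster.name: %s\n" % cluster
            + "node.name: %s\n" % name
            + "transport.host: %s\n" % ip
            + "transport.tcp.port: %d\n" % transport_port
            + shared
        )
    return configs
```

For example, render_es7_configs([("node-1", "192.0.2.11"), ("node-2", "192.0.2.12")]) returns two texts that differ only in node.name and transport.host.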
I tried the steps that @KannarKK suggested on ES 2.0.2; however, I could not get the cluster up and running. Eventually I figured it out: as I had set the TCP port number on the master, discovery.zen.ping.unicast.hosts in the slave's configuration needs the master's TCP port number along with its IP address for discovery. The following configuration works for me.
Node 1
cluster.name: mycluster
node.name: "node1"
node.master: true
node.data: true
http.port: 9200
transport.tcp.port: 9300
discovery.zen.ping.multicast.enabled: false
# I think unicast.host on master is redundant.
discovery.zen.ping.unicast.hosts: ["node1.example.com"]
Node 2
cluster.name: mycluster
node.name: "node2"
node.master: false
node.data: true
http.port: 9201
transport.tcp.port: 9301
discovery.zen.ping.multicast.enabled: false
# The port number of Node 1
discovery.zen.ping.unicast.hosts: ["node1.example.com:9300"]
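Since discovery depends on reaching the other node's TCP transport port, a quick reachability check from the joining machine can rule out firewall or wrong-port problems before digging into the config. This is a minimal sketch using only the standard library; transport_port_open is a hypothetical helper, and a successful connection only shows that something is listening, not that it is Elasticsearch.

```python
import socket


def transport_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Run from node 2, transport_port_open("node1.example.com", 9300) returning False suggests that the unicast host entry cannot work until the port or the network path is fixed.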

Resources