elasticsearch: Possible to change number of replicas after system is running?

elasticsearch 1.7.2 on centos
3 node cluster
This question is about how to manage ES config via changes to elasticsearch.yml plus a restart of the elasticsearch service (not via the API).
Out of the box, the config is:
index.number_of_replicas: 1
So on a 3 node cluster, any 2 nodes have the whole package.
If I want any 1 node to be complete, I would set:
index.number_of_replicas: 2
a) Correct?
b) Can I just walk up to an existing setup and make this change?
c) And can I just walk up and adjust it up to 2, and down to 1, whenever? (Up to make each node a possible standalone, down to save disk space.)

The number of replicas can be changed at any point in time. You can increase or decrease the replica count dynamically. There is a good example shown here.
Also, please note that you can't change the number of shards after index creation, but the number of replicas can be changed via the index settings API.
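For illustration, here is a minimal sketch of what such an index settings call could look like when built from Python's standard library. The host and port come from the curl example further down this page; the index name my_index and the helper name build_replica_update are made up for the example. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_replica_update(base_url, index, replicas):
    """Build (but do not send) a PUT <index>/_settings request.

    `index` and `base_url` are placeholders; adjust them to your cluster.
    """
    if replicas < 0:
        raise ValueError("replicas must be >= 0")
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical index name; 2 replicas as in the question above.
req = build_replica_update("http://127.0.0.1:9200", "my_index", 2)

# To apply it against a live cluster, send the request:
# urllib.request.urlopen(req)
```

Calling the helper again with replicas=1 builds the request for going back down.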

fwiw, another way to do this (which I have now proven out) is to update the yml file (elasticsearch.yml). Change the setting:
index.number_of_replicas: 2
Up or down, as desired, and then restart the elasticsearch service:
service elasticsearch restart
The cluster will go yellow while the replicas are being created/moved, and then will go green.
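For picking the value to set, the arithmetic from the question can be written down as a small sketch. The function names are invented for this example, and the 5 primary shards used below are just an assumed figure; use your index's actual shard count.

```python
def replicas_for_full_copy_on_every_node(node_count):
    """Replica count at which every node can hold one copy of every shard.

    Each shard has 1 primary + N replicas, and copies of the same shard
    are placed on different nodes, so N = node_count - 1.
    """
    if node_count < 1:
        raise ValueError("node_count must be >= 1")
    return node_count - 1


def total_shard_copies(primary_shards, replicas):
    """Total shard copies stored across the cluster (a rough proxy for disk use)."""
    return primary_shards * (1 + replicas)


# 3-node cluster from the question: 2 replicas make each node complete.
print(replicas_for_full_copy_on_every_node(3))

# Assumed 5 primary shards: compare replicas=1 with replicas=2.
print(total_shard_copies(5, 1), total_shard_copies(5, 2))
```

So going from 1 to 2 replicas stores three copies of the data instead of two, which is the disk-space trade-off the question mentions.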

Related

Elasticsearch - Adding node without replication?

I have a master/data Elasticsearch node. It has now reached 90% capacity and I need to provision additional space to continue adding more data.
I have created a new server with 700 GB of disk space, installed ES & Kibana, and now want this second server to provide additional space to, and work with, the master node.
My problem:
As it says on the ES website:
"When you add more nodes to a cluster, it automatically allocates replica shards."
My issue is that I do not wish to replicate the data from the master node, but instead just provide additional space using this second server which can then be queried by the master node.
My question:
What is the best way to achieve this? Is adding a node the incorrect thing to do here?
Using index-level shard allocation filtering, you can constrain a given index (or set of indexes) to stay on a given node (or set of nodes).
Simply run this:
PUT orders,orders_1,orders_2,orders_3,orders_4,orders_5/_settings
{
  "index.routing.allocation.require._name": "your-first-node-name"
}
Note that you can also use ._ip or ._host instead of ._name if you prefer.
Then you can add a new node and let it join the cluster, and nothing will rebalance: all your current shards will stay on your current node.
And if you need to create a new index on the second node and want to make sure that it stays on that node, you can specify the same setting at index creation time:
PUT new_orders
{
  "settings": {
    "index.routing.allocation.require._name": "your-second-node-name"
  }
}
The index called new_orders will be created on the second node and stay there.
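If you would rather generate these two request bodies from a script, a minimal sketch could look like this. The helper names are made up for the example, the node and index names are the placeholders used above, and the code only builds the method, path and body without talking to a cluster.

```python
import json

# Attributes mentioned above for allocation filtering.
ALLOWED_ATTRS = ("_name", "_ip", "_host")


def allocation_require(attr, value):
    """Return the index-level setting that pins an index to matching nodes."""
    if attr not in ALLOWED_ATTRS:
        raise ValueError(f"unsupported attribute: {attr}")
    return {f"index.routing.allocation.require.{attr}": value}


def pin_existing_indices(indices, node_name):
    """(method, path, body) for pinning already existing indices."""
    path = ",".join(indices) + "/_settings"
    return "PUT", path, allocation_require("_name", node_name)


def create_pinned_index(index, node_name):
    """(method, path, body) for creating an index pinned from the start."""
    return "PUT", index, {"settings": allocation_require("_name", node_name)}


method, path, body = pin_existing_indices(["orders", "orders_1"], "your-first-node-name")
print(method, path)
print(json.dumps(body, indent=2))

method, path, body = create_pinned_index("new_orders", "your-second-node-name")
print(method, path)
print(json.dumps(body, indent=2))
```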

ElasticSearch: Starting Multiple Cluster

I started two Elasticsearch clusters with different names, but the second one doesn't show up either in Marvel or when querying for health manually.
curl 'http://127.0.0.1:9200/_cat/health?v'
epoch      timestamp cluster          status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1501062768 15:22:48  Cove_dev_cluster yellow          1         1      8   8    0    0        8             0                  -                 50.0%
But it's running on my screen.
I am assuming you are running both clusters (single nodes, I believe, in this case) on the same machine. In that case the nodes have a default HTTP port range of 9200-9300 and are configured to bind to the first available port in that range. More details are available in the Network Settings documentation.
So in your case the other cluster is most likely running on port 9201. If you check Marvel or query the health manually on port 9201, you should find the other cluster.
However, if you want two nodes to participate in the same cluster, make sure that the cluster name matches in the configuration of both Elasticsearch instances you have running.
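To see which cluster answers on which port, you could probe the first few ports of that range. Below is a rough sketch using only the Python standard library. It assumes the root endpoint (GET /) returns JSON containing a cluster_name field, which may not hold on every version; the function names are invented for this example.

```python
import http.client
import json
import urllib.request


def fetch_root(port, host="127.0.0.1", timeout=1.0):
    """Return the parsed JSON from GET / on the port, or None if nothing usable answers."""
    try:
        with urllib.request.urlopen(f"http://{host}:{port}/", timeout=timeout) as resp:
            return json.load(resp)
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, non-JSON reply, or a non-HTTP service.
        return None


def find_clusters(ports, fetch=fetch_root):
    """Map each responding port to the cluster name it reports."""
    found = {}
    for port in ports:
        info = fetch(port)
        if info and "cluster_name" in info:
            found[port] = info["cluster_name"]
    return found


# Example (performs real HTTP requests against localhost):
# print(find_clusters(range(9200, 9206)))
```

The fetch parameter is there so the lookup can be swapped out, for example with a stub when no cluster is running.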
Hope this helps.

Adding cluster to existing elastic search in elk

Currently I have existing
1. Elastic search
2. Logstash
3. Kibana
I have existing data on them.
Now I have set up an ELK cluster with 3 master nodes, 5 data nodes and 3 client nodes.
But I am not sure how I can get the existing data into it.
Is it possible to make the existing ES node a data node and attach it to the cluster? Will the data then get replicated to the other data nodes as well, so that I can take that node offline afterwards?
Option 1
How about just trying it with fewer nodes first? It is not hard to test whether this is supported: set up one node, feed it some data, then add one more node, configure the two as a cluster, and see whether the data gets synchronized.
Option 2
Another option is to use an Elasticsearch migration tool like https://github.com/taskrabbit/elasticsearch-dump. Basically, you could set up a clean cluster and migrate all the data from your old node to this cluster.

Index creation move elastic search cluster to red

I have set up an Elasticsearch cluster with 1 master node and 1 client node, but the problem is that when I create an index, the cluster moves to red state with 3 initializing_shards on the client node. The master node's shards are working fine.
I don't know how to resolve it.
It was an installation issue. We reinstalled Elasticsearch and that solved our problem.
As you said in the question, you have only 1 master node and 1 client node, but you should have at least 1 data node to store at least the primary shards.

Remove of data folder is not synced in Elasticsearch upon index delete

We have an ES cluster with 2 nodes. When we delete an index, not all of its folders in the cluster (on the filesystem) are deleted, which causes problems when restarting one server.
Our deleted indices then get distributed again in some weird state, and the cluster health is no longer green.
Example: we delete the index named someIndex, and when we check the file system after the deletion we see this:
Node1
ElasticSearch\data\clustername\nodes\0\indices\
ElasticSearch\data\clustername\nodes\1\indices\
Node2
ElasticSearch\data\clustername\nodes\0\indices\
ElasticSearch\data\clustername\nodes\1\indices\someIndex (<-- still present)
Does anyone know what's causing this?
ES-version: 0.90.5
There are two node directories on each machine's filesystem (nodes\0 and nodes\1).
When you start Elasticsearch, you start up a node (in ES lingo). Your machine can host multiple nodes, which happens if you start Elasticsearch multiple times. The default setting for the HTTP port is 9200-9300, which means ES looks for a free port in that range and binds its node to it (the same is true for the transport module with 9300-9400).
So, if you start an ES process while another one is still running (that is, it's already bound to a port), you start a second node and ES will create a new directory for it. Maybe this happened when you issued a restart, but ES couldn't shut down in time before the new node started up.
But now you have a third node in your cluster and ES will assign shards to it. Then you do a cluster restart or something similar and start one node on each of your machines. ES cannot find the shards that were assigned to the third node, because it's not spun up, and it will show you a red or yellow state, depending on which shards live on the third node. If you delete your index, you won't delete its data from this missing node.
If you don't care about the data, you can just shut down ES and delete these directories, or start two ES nodes on each of your machines and then delete the index again.
Then you could change the port settings to one specific port. That would prevent second processes from starting up, since they won't be able to bind to a free port.
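To check for leftover copies before deleting anything by hand, you could walk the data directory. Here is a small sketch that assumes the layout shown in the question (data/<cluster name>/nodes/<n>/indices/<index>); the function names are made up for this example, and the code only reads the filesystem.

```python
from pathlib import Path


def list_node_indices(data_dir, cluster_name):
    """Map each node directory ("0", "1", ...) to the index folders inside it."""
    nodes_root = Path(data_dir) / cluster_name / "nodes"
    result = {}
    if not nodes_root.is_dir():
        return result
    for node_dir in sorted(nodes_root.iterdir()):
        indices_dir = node_dir / "indices"
        if indices_dir.is_dir():
            result[node_dir.name] = sorted(
                p.name for p in indices_dir.iterdir() if p.is_dir()
            )
    return result


def leftover_copies(data_dir, cluster_name, index_name):
    """Node directories that still contain a folder for the given index."""
    return [
        node
        for node, indices in list_node_indices(data_dir, cluster_name).items()
        if index_name in indices
    ]


# Example with the names from the question (adjust the data path to your install):
# print(leftover_copies(r"ElasticSearch\data", "clustername", "someIndex"))
```

More than one entry under nodes on a machine that is supposed to run a single node is itself a hint that a second process was started at some point.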
