Elasticsearch - Restrict primary and replica shard count on nodes

I have an Elasticsearch 7.16.2 cluster running with three nodes (2 master nodes, 1 voting-only node). An index has two primary shards and two replica shards, and after restarting a node, both primary shards move to a single node. How can I restrict each node to holding one primary shard and one replica shard of the index?

You can use the index-level shard allocation settings to achieve that, but it is not straightforward: the settings are somewhat complex and can cause further imbalance when the nodes and indices in the cluster change.
To avoid the issue that occurs on node restart, disable shard allocation and shard rebalancing before restarting nodes in your Elasticsearch cluster.
Command to disable allocation:
PUT /_cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.enable": "none"
  }
}
Command to disable rebalancing:
PUT /_cluster/settings
{
  "persistent": {
    "cluster.routing.rebalance.enable": "none"
  }
}
Set both values back to "all" once the node has rejoined the cluster.
Apart from that, you can use the reroute API to manually move shards to a specific node and fix your current shard allocation.
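As a rough illustration of the reroute suggestion, here is a minimal Python sketch that only builds the request body for a "move" command. The index name and node names are hypothetical placeholders, and nothing is sent to a cluster; you would submit the printed body with POST /_cluster/reroute.

```python
import json


def build_move_command(index, shard, from_node, to_node):
    """Build a body for POST /_cluster/reroute that moves one shard copy."""
    return {
        "commands": [
            {
                "move": {
                    "index": index,
                    "shard": shard,
                    "from_node": from_node,
                    "to_node": to_node,
                }
            }
        ]
    }


# Hypothetical names: replace with your own index and node names.
body = build_move_command("my-index", 0, "node-1", "node-2")
print("POST /_cluster/reroute")
print(json.dumps(body, indent=2))
```

Keep in mind that if rebalancing is enabled, the cluster may later move the shard again.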

The relevant setting is index.routing.allocation.total_shards_per_node, but there is a problem. First, I assume you have three data nodes (if you don't, add more data nodes).
The problem is that the index has 4 shards in total (primaries plus replicas), so with three nodes one node must hold two of them. You therefore cannot set index.routing.allocation.total_shards_per_node to 1; it must be at least 2, and that does not solve your problem.
The setting is dynamic: https://www.elastic.co/guide/en/elasticsearch/reference/master/increase-shard-limit.html
You can also set cluster.routing.allocation.total_shards_per_node for the whole cluster.
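The arithmetic behind the "at least 2" conclusion can be sketched in a few lines of Python. This assumes 2 primaries with 1 replica each and three data nodes, as in the answer above; the function names are made up for illustration.

```python
import math


def total_shard_copies(primaries, replicas_per_primary):
    """Primaries plus all of their replicas."""
    return primaries * (1 + replicas_per_primary)


def min_total_shards_per_node(primaries, replicas_per_primary, data_nodes):
    """Smallest per-node limit that still lets every shard copy be allocated."""
    copies = total_shard_copies(primaries, replicas_per_primary)
    return math.ceil(copies / data_nodes)


limit = min_total_shards_per_node(primaries=2, replicas_per_primary=1, data_nodes=3)

# Body for PUT /<index>/_settings using the computed limit.
settings_body = {"index.routing.allocation.total_shards_per_node": limit}
print(limit, settings_body)
```

With 4 shard copies and 3 data nodes the limit comes out as 2, so one node can still end up holding both primaries.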

Related

Elasticsearch 5.6.8 cluster shards location

I have the following picture of my cluster (I am using cerebro). All shards seem to be on the 3rd node.
When data comes in, I see load > 4 on the 1st node, while the other nodes are fine.
Logstash -> LB -> ES nodes (1,2,3). What am I doing wrong?
Thank you in advance.
The high load on that one particular node could be for a couple of reasons. The ones that initially spring to mind:
If it is the Master Node then the large number of shards could be having an adverse effect.
You could be sending numerous large read requests to that one particular node so it has to deal with all the aggregations. E.g. if you have Kibana connected to that node.
Some general notes:
The shards with the solid box are the primary shards. The shards with the dotted box are replica shards. You currently have primaries = 8 and replicas = 2. This means there are 8 primary shards per index, and each of those has 2 replica shards. There is much more info about shards in the ES guide. It's for an old version of ES but is still valid.
The fact that all your primary shards are on the same node is a coincidence. This will often happen if you have one node start up before the others. All the primary shards will be allocated to it, then the replicas will go onto other nodes once they start up. If you take down your first node you should see the primaries move to other nodes.
To the left of the node name will be a star. The one with the filled-in star is the currently elected Master. Due to your number of shards the master will have a relatively large overhead, because it has to manage so many shards. Try setting "number_of_shards":3, "number_of_replicas":1. Note that number_of_shards is only applied to new indexes (number_of_replicas can be changed on an existing index), so recreate your indexes to see this take effect.
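To put numbers on the overhead, here is a small Python sketch comparing shard copies per index for the current settings (8 primaries, 2 replicas) and the suggested ones (3 primaries, 1 replica). It also builds a settings body you could use when creating a new index; the helper name is hypothetical.

```python
def shard_copies_per_index(number_of_shards, number_of_replicas):
    """Total shard copies (primaries + replicas) the cluster tracks per index."""
    return number_of_shards * (1 + number_of_replicas)


current = shard_copies_per_index(8, 2)
suggested = shard_copies_per_index(3, 1)
print(f"current: {current} copies per index, suggested: {suggested}")

# Body for PUT /<new-index> with the suggested settings.
create_index_body = {
    "settings": {"number_of_shards": 3, "number_of_replicas": 1}
}
```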
Your unicast settings are correct.

Unexpected ElasticSearch shard allocation for a single replica with allocation disabled

We have a two node environment and there is certain data that we only want to store on the master node (as the other node is not highly available).
To do this, I've set the number of replicas to 0 and also set the following properties on the indices for which we do not want shard allocation to occur:
"index.routing.allocation.enable": "none",
"index.routing.allocation.rebalance": "none"
My expectation here is that doing so will keep all 5 shards on the master node. However, as soon as I connect the worker node to the environment, 2 or 3 of the shards from each index are moved over to the worker node! How can I stop this from happening and keep all of the shards for the specified index on the master node? Thank you!
I think you need to use shard allocation filtering to specify which nodes are allowed to host the shards of a particular index.
https://www.elastic.co/guide/en/elasticsearch/reference/current/shard-allocation-filtering.html
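As a minimal sketch of what such a filter could look like, the Python below builds an index settings body that requires shards to live on a node with a given name, using the built-in _name attribute. The node name "master-node-1" and the index name are placeholders I made up; the body would be sent with PUT /<index>/_settings.

```python
import json


def build_require_name_filter(node_name):
    """Index settings that restrict the index's shards to one named node."""
    return {"index.routing.allocation.require._name": node_name}


# Hypothetical node name: use the node.name of the node that should hold the data.
filter_body = build_require_name_filter("master-node-1")
print("PUT /my-index/_settings")
print(json.dumps(filter_body, indent=2))
```

With replicas set to 0, as in the question, every shard of the index would then have only one node it is allowed to be allocated to.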

Can a node have more than one shard in Elasticsearch?

I am reading "Elasticsearch: The Definitive Guide" and I would like to confirm something.
When we create an index, it will be assigned to 5 shards by default (or we can use the "number_of_shards" setting).
But if I am using just one node (one server), will the index be spread into 5 shards in the same node? I guess what I am asking is - can a node have multiple shards?
Yes a node can have multiple shards of one or more indices. You can verify it for yourself by executing the GET _cat/shards?v command. Read more about the command here. The problem with having a single node Elasticsearch cluster is that replica shards for indices will not be allocated (but primary shards will be) as it does not make sense to have both the primary and replica of the same shard on the same machine.
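If you want to summarize that output, here is a rough Python sketch that counts shards per node from the text returned by GET _cat/shards?v. The sample text is invented for illustration (a single-node cluster with unassigned replicas), and the parser assumes the default column order and node names without spaces.

```python
from collections import Counter

# Invented sample in the default _cat/shards?v column layout.
SAMPLE = """\
index    shard prirep state      docs store ip        node
my-index 0     p      STARTED    10   5kb   127.0.0.1 node-1
my-index 0     r      UNASSIGNED
my-index 1     p      STARTED    12   6kb   127.0.0.1 node-1
my-index 1     r      UNASSIGNED
"""


def shards_per_node(cat_output):
    """Count shard copies per node; rows without a node count as UNASSIGNED."""
    counts = Counter()
    for line in cat_output.strip().splitlines()[1:]:  # skip the header row
        cols = line.split()
        if len(cols) < 8:  # unassigned rows have no docs/store/ip/node columns
            counts["UNASSIGNED"] += 1
        else:
            counts[cols[7]] += 1
    return counts


counts = shards_per_node(SAMPLE)
print(dict(counts))
```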

elasticsearch undefined index and how to get rid of it

I am seeing the following index shown as Unassigned, which is very annoying. How do I get rid of it?
Those unassigned shards are replicas of the primary shards held on your master node.
The main purpose of replicas is for failover: if the node holding a primary shard dies, then a replica is promoted to the role of primary.
At index time, a replica shard does the same amount of work as the primary shard. New documents are first indexed on the primary and then on any replicas. Increasing the number of replicas does not change the capacity of the index.
However, replica shards can serve read requests. If, as is often the case, your index is search-heavy, you can increase search performance by increasing the number of replicas, but only if you also add extra hardware.
To assign these shards, you need to run another instance of Elasticsearch, creating a second node to hold the replicas. (The node can be master-eligible or just a workhorse; you can set that in the Elasticsearch config files.)
For more details about it you can refer to the official documentation and the Elasticsearch Definitive Guide (the work on it is still in progress but you will find what you are looking for here)
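To see why a second node clears the unassigned replicas, here is a small Python sketch of the counting rule: copies of the same shard never share a node, so each shard can place at most one copy per data node. It ignores every other allocation rule (disk watermarks, filters and so on), and the function name is hypothetical.

```python
def unassigned_copies(primaries, replicas_per_primary, data_nodes):
    """Shard copies that cannot be placed because copies of a shard need distinct nodes."""
    copies_per_shard = 1 + replicas_per_primary
    placeable_per_shard = min(copies_per_shard, data_nodes)
    return primaries * (copies_per_shard - placeable_per_shard)


one_node = unassigned_copies(primaries=5, replicas_per_primary=1, data_nodes=1)
two_nodes = unassigned_copies(primaries=5, replicas_per_primary=1, data_nodes=2)
print(one_node, two_nodes)
```

With the old default of 5 primaries and 1 replica, a single node leaves 5 replicas unassigned, and adding a second node leaves none.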

Elasticsearch with two nodes and the default 5 shards?

I have set up a cluster with two nodes, but I am confused about shards and replicas.
What I intend is a setup where a master (node A) handles writes and a slave (node B) helps with read and search operations. Ideally, if the master is not functional, I can recover the data from the slave.
I read that the default is 5 shards and 1 replica. Does that mean my primary data would be automatically split between node A and node B? Would that mean that if one node is down I would lose half the data?
Given the description of my needs above, am I doing it right?
The only config I have changed at this point is the following
cluster:
  name: maincluster
node:
  name: masternode
  master: true
I am really new to elasticsearch and please kindly point out if I am missing anything.
5 shards and 1 replica means that your data will be split into 5 shards per index.
Each shard will have one replica (5 more backup shards) for a total of 10 shards spread across your set of nodes.
The replica shard will be placed onto a different node than the primary shard (so that if one node fails you have redundancy).
With 2 nodes and replication set to 1 or more, losing a node will still give you access to all of your data, since a primary shard and its replica shard will never be on the same node.
I would install the elasticsearch head plugin; it provides a very graphical view of nodes and shards (primary and replica).
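The redundancy argument can be checked with a toy model in Python. This is not Elasticsearch's allocator, only a simplified round-robin placement that respects the one rule described above (a primary and its replica go to different nodes); node names "A" and "B" are taken from the question.

```python
def place_shards(primaries, nodes):
    """Toy placement: primary i on one node, its single replica on the next node."""
    n = len(nodes)
    return {
        shard: {"primary": nodes[shard % n], "replica": nodes[(shard + 1) % n]}
        for shard in range(primaries)
    }


def surviving_copies(placement, failed_node):
    """For each shard, list the nodes that still hold a copy after one node fails."""
    return {
        shard: [node for node in copies.values() if node != failed_node]
        for shard, copies in placement.items()
    }


placement = place_shards(primaries=5, nodes=["A", "B"])
for failed in ["A", "B"]:
    survivors = surviving_copies(placement, failed)
    all_available = all(len(nodes_left) >= 1 for nodes_left in survivors.values())
    print(f"node {failed} down -> every shard still available: {all_available}")
```

In this model each of the two nodes ends up holding a copy of all 5 shards, which is why losing either one does not lose data.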
