Consistency and Partition-Tolerance in Elasticsearch

Consistency and Partition-Tolerance in Elasticsearch - elasticsearch

As I am new to elasticsearch using [elasticsearch version 7.4] and with a lot of studies it is not clear till now how much shards / Nodes are preferred in particular index.  As of now, I have configured 3 shards and 2 replicas with 3 Nodes(each having 8GB RAM, 500GB HDD). and having 55GB of Data in One Index. 
So I need your views/suggestions in the following points.
Is above given no of shards, Nodes, replicas is sufficient.  
For CAP theorem I will prefer CP i.e: Consistency and Partition-tolerance for this in 3 Node cluster 
For Consistency configured write_consistency=all 
For Partition-tolerance set master-eligible node to (N/2) + 1 in my case it is 3.

I can hopefully give you some useful advice from my time running Elasticsearch clusters :)
1)
Shards: See this blog post for more information, but your average shard will be 55gb/3 = 18gb which is a good shard size (in my experience it's best to keep
shards between 5gb-25gb, the ES docs recommend this as well).
Replicas: 2 replicas is my go-to for a good balance between failure tolerance and performance, so this is good.
Nodes: Those 3 nodes should be sufficient, and you won't need that much disk. With 2 replicas you'll have roughly 55gb * 3 = 165gb of data stored (could be more depending on your mapping) across 1500gb of hard drive, so perhaps you could save some money by using nodes with 100gb disks.
2)
For partition tolerance I might suggest setting write_consistency=quorum. That way, even if you lose a node and therefore 1 replica shard you'll still be able to write with 1 primary and 1 replica left. Otherwise, you'd need to reboot/recreate that node to start writing again. See https://www.elastic.co/guide/en/elasticsearch/reference/2.4/docs-index_.html#index-consistency for more details.
Master-eligible: Yes I recommend a minimum of 3 master nodes, so you'll want to set all 3 of these nodes to be master and data nodes.

Related

ElasticSearch replica and nodes

I am testing now clustering with ElasticSearch and have question about the replicas between the nodes.
As you can see in the screenshot from Head I have 2 indexes.
movies has 5 shards and 2 replica
students has 5 shards and 1 replica
Which one is better and which one is faster with 3 active nodes and why?

Costs of having more number of replicas would be
more storage space required(Obviously)
less indexing performance
while the advantage from it would be
better search performance
better resiliency
Note that even though you have 2 replicas, it does not mean that your cluster can endure 2 nodes going down since all indexing request would fail if only one out of 3 copies of shards is available.(because of indexing quorum)
For detailed explanation please refer to this official document

"Better" is subjective.
With two replicas, you can handle two of the three machines in your cluster going down, though at the price of writing all the data to every machine. Read performance should also be higher as the cluster has more nodes from which to request the data.
With one replica, you can only survive the outage of one machine in your cluster, but you'll get a performance boost by writing 2 copies of the data across 3 servers (less IO on each server).
So it comes down to risk and performance. Hope that helps.

Elasticsearch - Add one node to a running cluster

I have Elasticsearch cluster ( it's not a real cluster cause i have only 1 node )
The cluster have 1.8 TB of data something like 350 indexes .
I would like to add new node to the cluster BUT i don't want to run replication for all the data .
I want to split my shards into 2 nodes ( Each node will be with 1 shard) .
for each index i have 2 shards 0 & 1 and i would like to split my data .
This is possible ? how this will effect Kibana performance ?
Thanks a lot
Amit

for each index i have 2 shards 0 & 1 and i would like to split my data . This is possible ?
When you add your second node to your cluster, then your data will be automatically "rebalanced" and your data will be replicated.
Presumably, if you were to run
$ curl -XGET localhost:9200/_cluster/health?pretty
then you would see that your current cluster health is probably yellow because there is no replication taking place. You can tell because you presumably have an equal number of assigned primary shards as you have unassigned shards (unallocated replicas).
What will happen when you start the second node in the cluster? It will immediately begin to copy shards from the original node. Once complete, the data will exist on both nodes and this is what you want to happen. As you scale further by eventually adding a third node, you will actually spread the shards across the cluster in a less predictable way that does not result in a 1:1 relationship between the nodes. It's only once you add a third node that you can reasonably avoid copying all of the shard data to every node.
Other considerations:
Be sure to set discovery.zen.minimum_master_nodes to 2. This should always be set to M / 2 + 1 using integer division (truncated division) where M is the number of master eligible nodes in your cluster. If you don't set this setting, then you will eventually cause yourself data loss.
You want replication because it gives you higher availability in the event of hardware failure on either node. Due to the above setting with a two node cluster, your cluster would be readonly until you added a second node or the setting was unset, but at least the data would still exist.
how this will effect Kibana performance ?
It's hard to say whether this will really improve performance, but it most likely will simply by spreading the workload across two machines.

How to configure number of shards per cluster in elasticsearch

I don't understand the configuration of shards in ES.
I have few questions about sharding in ES:
The number of primary shards is configured through index.number_of_shards parameter, right?
So, it means that the number of shards are configured per index.
If so, if I have 2 indexes, then I will have 10 shards on the node ?
Assuming I have one node (Node 1) that configured with 3 shards and 1 replica.
Then, I create a new node (Node 2), in the same cluster, with 2 shards.
So, I assume I will have replica only to two shards, right?
In addition, what happens in case Node 1 is down, how the cluster "knows" that it should have 3 shards instead of 2? Since I have only 2 shards on Node 2, then it means that I lost the data of one of the shards in Node 1 ?

So first off I'd start reading about indexes, primary shards, replica shards and nodes to understand the differences:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/glossary.html
This is a pretty good description:
2.3 Index Basics
The largest single unit of data in elasticsearch is an index. Indexes
are logical and physical partitions of documents within elasticsearch.
Documents and document types are unique per-index. Indexes have no
knowledge of data contained in other indexes. From an operational
standpoint, many performance and durability related options are set
only at the per-index level. From a query perspective, while
elasticsearch supports cross-index searches, in practice it usually
makes more organizational sense to design for searches against
individual indexes.
Elasticsearch indexes are most similar to the ‘database’ abstraction
in the relational world. An elasticsearch index is a fully partitioned
universe within a single running server instance. Documents and type
mappings are scoped per index, making it safe to re-use names and ids
across indexes. Indexes also have their own settings for cluster
replication, sharding, custom text analysis, and many other concerns.
Indexes in elasticsearch are not 1:1 mappings to Lucene indexes, they
are in fact sharded across a configurable number of Lucene indexes, 5
by default, with 1 replica per shard. A single machine may have a
greater or lesser number of shards for a given index than other
machines in the cluster. Elasticsearch tries to keep the total data
across all indexes about equal on all machines, even if that means
that certain indexes may be disproportionately represented on a given
machine. Each shard has a configurable number of full replicas, which
are always stored on unique instances. If the cluster is not big
enough to support the specified number of replicas the cluster’s
health will be reported as a degraded ‘yellow’ state. The basic dev
setup for elasticsearch, consequently, always thinks that it’s
operating in a degraded state given that by default indexes, a single
running instance has no peers to replicate its data to. Note that this
has no practical effect on its operation for development purposes. It
is, however, recommended that elasticsearch always run on multiple
servers in production environments. As a clustered database, many of
data guarantees hinge on multiple nodes being available.
From here: http://exploringelasticsearch.com/modeling_data.html#sec-modeling-index-basics
When you create an index it you tell it how many primary and replica shards http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-create-index.html. ES defaults to 5 primary shard and 1 replica shard per primary for a total of 10 shards.
These shards will be balanced over how many nodes you have in the cluster, provided that a primary and it's replica(s) cannot reside on the same node. So if you start with 2 nodes and the default 5 primary shards and 1 replica per primary you will get 5 shards per node. Add more nodes and the number of shards per node drops. Add more indexes and the number of shards per node increases.
In all cases the number of nodes must be 1 greater than the configured number of replicas. So if you configure 1 replica you should have 2 nodes so that the primary can be on one and the single replica on the other, otherwise the replicas will not be assigned and your cluster status will be Yellow. If you have it configured for 2 replicas which means 1 primary shard and 2 replica shards you need at least 3 nodes to keep them all separate. And so on.
Your questions can't be answered directly because they are based on incorrect assumptions about how ES works. You don't add a node with shards - you add a node and then ES will re-balance the existing shards across the entire cluster. Yes, you do have some control over this if you want but I would not do so until you are much more familiar with ES operations. I'd read up on it here: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-allocation.html and here: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-reroute.html and here: http://exploringelasticsearch.com/advanced_techniques.html#advanced-routing
From the last link:
8.1.1 How Elasticsearch Routing Works
Understanding routing is important in large elasticsearch clusters. By
exercising fine-grained control over routing the quantity of cluster
resources used can be severely reduced, often by orders of magnitude.
The primary mechanism through which elasticsearch scales is sharding.
Sharding is a common technique for splitting data and computation
across multiple servers, where a property of a document has a function
returning a consistent value applied to it in order to determine which
server it will be stored on. The value used for this in elasticsearch
is the document’s _id field by default. The algorithm used to convert
a value to a shard id is what’s known as a consistent hashing
algorithm.
Maintaining good cluster performance is contingent upon even shard
balancing. If data is unevenly distributed across a cluster some
machines will be over-utilized while others will remain mostly idle.
To avoid this, we want as even a distribution of numbers coming out of
our consistent hashing algorithm as possible. Document ids hash well
generally because they are evenly distributed if they are either UUIDs
or monotonically increasing ids (1,2,3,4 …).
This is the default approach, and it generally works well as it solves
the problem of evening out data across the cluster. It also means that
fetches for a single document only need to be routed to the shard that
document hashes to. But what about routing queries? If, for instance,
we are storing user history in elasticsearch, and are using UUIDs for
each piece of user history data, user data will be stored evenly
across the cluster. There’s some waste here, however, in that this
means that our searches for that user’s data have poor data locality.
Queries must be run on all shards within the index, and run against
all possible data. Assuming that we have many users we can likely
improve query performance by consistently routing all of a given
user’s data to a single shard. Once the user’s data has been
so-segmented, we’ll only need to execute across a single shard when
performing operations on that user’s data.

Yes, the number of shards is per index. So if you had 2 indexes, each with 5 shards, then yes, you would have a total of 10 shards distributed across all your nodes.
The number of shards is unrelated to the number of nodes in the cluster. If you have 3 shards and one node, obviously all 3 shards will reside on that one node. However, if you then add an additional node, more shards are not magically created and you can't specify that a certain number of shards should reside on that new node. Rather, the existing shards are distributed as evenly as possible across all nodes resulting in one node with 2 shards and one node with 1 shard, for a total of 3. If you added a third node, then each node would house 1 shard for a total of 3. In other words, the number of shards is fixed and doesn't scale as you add more nodes.
As to your final question, it's based on a false premise, so it's difficult to answer. Rather, lets take the example of above of three shards and two nodes. In that setup, one node will house 2 shards and one node will house 1 shard. If either of those nodes go down, your cluster goes down, because neither has a complete set of shards. The first node is missing 1 shard and the second node is missing 2. This is where replicas come in. By adding replicas, you can ensure that each node ends up with a full set of shards. For example, if you added 1 replica in the above scenario, then the first node would have 2 active shards and 1 replica of the third that lives on the other node. The second node would have 1 active shard and 1 replica each of the two that live on the first. As a result, if either node went down, the cluster can merely activate the replicas and still continue working.

1) Yes, the number of shards is configured per index. It is a static operation and should be done while creating an index. If you want to change the number of shards at a later point of time, you have to reindex the document again and takes time.
2) The number of shards don't depend on number of nodes in you cluster. Lets say you are a book seller website. You have 100 books to sell. your website have an elastic cluster with 3 nodes. you create a book index with 5 shards. Each and very shard contains 20 books. 2 shards will reside on node1, 2 shards will reside in node2 and 1 shard will reside in node3. now let's say node 2 has gone down. But, still we have 2 shards in node 1 and 1 shard in node 3. Querying elastic search will still return the data on node 1 and node 3. i.e, 60 books data will still be available. 40 books data is lost.
But, the overall cluster status will be red meaning cluster is partially functioning, but somedata is not available.
To make the system fault tolerant you can configure replicas. By default elasticsearch makes one replica of each shard. So in this case if the default configuration is not over written the copy of 2 shards on node 2 will be replicated either on node 1 or node 3 and they become the primary shards when node 2 is not available. So all the data is available even when node 2 is down.
in this case the overall cluster health will be yellow, meaning cluster is still functional. But some nodes are lost.

Answer 1) yes you will have 10 shards fr 2 index with 5 shards.
Answer 2) I think you confused with shards and index.
Shards are split piece for index not for node.
If you create a index with 3 shards and 1 replica.
You will get 3 primary shard and 3 replica shards.
If you start new node the shards will be balanced with new node.So you will have 3 shard in old node and 3 shards in new node.
If old node fails you will survive with new node data.It will have exact copy of old node.
To understand basic concepts of elasticsearch refer
HOpe it helps..!

Shards / Replicas settings for high availability

We have java application with embedded Elasticsearch in a cluster of 14 nodes. All the data resides in a central database, and they are indexed in elasticsearch for querying. A full reindex can be done at any time.
The system are very query-heavy, the amount of writes are small. The number of documents will not be higher than, say, 300.000.
The size of each document varies greatly, from just a couple of ids, to extracted text from e.g word-documents of several pages.
I want to make sure that in case of a total breakdown, it should be sufficient that one or two nodes are available for the system to work.
Write consistency should not be a problem since the master copy of the data is in the database, and it seems that ES is capable of resolving conflicting data by using the newest version (which should be all right in our case)
My first though is to use 1 shard, and 13 replicas. This will naturally ensure that all nodes have access to all data. This could also be accomplished by having 2 shards / 13 replicas, so this yield that to ensure that all data is available, the number of replicas should be the number of nodes - 1, not depending on the number of shards (which could be anything).
If the requirement of number of nodes are reduced to "2 nodes should be up at any time", then a shards / replica distribution of "x/number of nodes - 2" should be sufficient.
So, for the question:
Asserting the above setup and that my thoughts is correct, would a setup with 1 shard / 13 replicas make sense or would there be anything to gain by adding more shards and run e.g a 4 shards/13 replicas setup?

After a good bit of research and talking to ES-gurus;
As long as the shard size is small enough, the most efficient way of setting up this cluster would indeed be 1 shard only, with 13 replicas. I have not been able to pinpoint the threshold size of the shard for this starting to perform worse.

If the index is big... you will need more than one shard (if you want perfomance). Do You really need 13 replica? When you put only 2 replicas, ES manage that to keep it that way, if the principal node fail, ES will create a new reply. May be you will need a balancer node too.

Understanding the write_consistency and quorum rule of Elasticsearch

According to the elasticsearch documentation, the rule for write_consistency level quorum is:
quorum (>replicas/2+1)
Using ES 0.19.10, on a setup with 16 shards / 3 replicas we will get
16 primary shards
48 replicas
Running 2 nodes, we will have 16(primary) + 16(replicas) = 32 active shards.
For the quorum rule to be met, quorum > 48/2 + 1 = 25 active shards.
Now, testing this proves otherwise, write_consistency level is not met (write operations times out) until we have 3 nodes running. This kind of makes sense, since we could get a split-brain between groups of 2 nodes each in this setup, but I dont quite understand how this rule is supposed to work? Am I using the wrong numbers here?

Primary shard count doesn't actually matter, so I'm going to replace it with N.
If you have an index with N shards and 2 replicas, there are three shards in the replication group. This means the quorum is two: the primary plus one of the replicas. You need two active shards, which usually means two active machines, to satisfy the write consistency parameter
An index with N shards and 3 replicas has four shards in the replication group (primary + 3 replicas), so a quorum is three.
An index with N shards and 1 replica is a special case, since you can't really have a quorum with only two shards. With only one replica, Elasticsearch only requires a single active shard (e.g. the primary), so the quorum setting is identical to the one setting for this particular arrangement.
A few notes:
0.19 is really old, you should definitely, absolutely, positively upgrade. I can't even count how many bugfixes and performance improvements have been added since that release :)
Write consistency is merely a gateway check. Before executing the indexing request, the node will do a straw-poll to see if write_consistency is met. If it is, it tries to execute the index and push the replication. This doesn't guarantee that the replicas will succeed...they could easily fail and you'll see it in the response. It is simply a mechanism to halt the indexing process if the consistency setting is not satisfied.
A "fully replicated" setup with two nodes is 1 primary shard + 1 replica. Each node has a complete set of data. There is no reason to have more replicas, since ES refuses to put the copies of the same data on the same machine (doesn't make sense, doesn't help HA). The inability to index is just a side effect of write consistency, but it's pointing out a bigger problem with your setup :)

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio