Does every node in a CouchDB cluster have a copy of the data? - couchdb-2.x

The more I read the documentation, the more confused I get.
http://docs.couchdb.org/en/2.1.1/cluster/theory.html
Let me put my question simply:
I have a 3-node cluster with the following configuration:
n=3, q=8
Would this configuration ensure that, if 2 nodes fail, the 3rd node still has all the documents/data available for reads and writes?
Do the values of r and w have any impact?
If I configure r=1 and w=3, will that ensure that every node in the cluster holds all the documents/data stored across the different shards?

Answer received from Joan Touzet: 1) Yes. 2) Not in the config file. 3) No; n=3, q=8 ensures all 3 nodes have all docs, split across 24 shard files, 8 per node.
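To see those knobs in action, here is a minimal sketch with curl (the database name mydb, document id doc1, and the admin credentials are placeholders; q and n are passed at database creation, while r and w are the per-request parameters discussed in the cluster theory docs):

# Create a database with q=8 shards and n=3 copies of each shard
curl -X PUT 'http://admin:password@localhost:5984/mydb?q=8&n=3'

# Write a document, waiting until all 3 copies acknowledge (w=3)
curl -X PUT 'http://admin:password@localhost:5984/mydb/doc1?w=3' \
     -H 'Content-Type: application/json' -d '{"hello": "world"}'

# Read it back, accepting the first copy that answers (r=1)
curl 'http://admin:password@localhost:5984/mydb/doc1?r=1'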

Number of nodes in AWS Elasticsearch

I've read the documentation, but unfortunately I still don't understand one thing. When creating an AWS Elasticsearch domain, I need to choose "Number of nodes" in the "Data nodes" section.
If I specify 3 data nodes and 3 AZs, what does that actually mean?
I have two guesses:
I'll get 3 nodes, each with its own storage (EBS). One node is the master and the other 2 are replicas in different AZs, just copies of the master, so that data isn't lost if the master node breaks.
I'll get 3 nodes, each with its own storage (EBS). All of them work independently and hold different data on their storage, so data can be processed by different nodes and stored on different volumes at the same time.
It looks like the other AZs should hold replicas, but then I don't understand why I see different amounts of free space on different nodes.
Please, explain how it works.
Many thanks for any info or links.
I haven't used AWS Elasticsearch, but I've used the Cloud Elasticsearch service.
Using 3 AZs (availability zones) means that your cluster is spread over 3 zones to make it resilient. If one zone has problems, the nodes in that zone will have problems as well.
As the description section mentions, you need to specify a multiple of 3 nodes if you choose 3 AZs. If you have 3 nodes, every AZ will have one node. If one zone has problems, that node is out, and the two remaining ones will have to pick up from there.
Now, to answer your question of what you actually get with this configuration: you can check for yourself. Run this via Kibana or any HTTP client:
GET _nodes
Check for the sections:
nodes.roles
nodes.attributes
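For instance, a sketch with curl, assuming the cluster answers on localhost:9200 (filter_path just trims the response down to those two fields):

curl -s 'http://localhost:9200/_nodes?filter_path=nodes.*.roles,nodes.*.attributes&pretty'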
In the various documentation, blog posts, etc. you will see that 3 nodes across 3 AZs is a good starting point for a resilient production cluster.
So let's take it step by step:
You need an odd number of master-eligible nodes in order to avoid the split-brain problem.
You need more than one node in your cluster in order to make it resilient (in case a node becomes unavailable).
By combining these two you get the minimum requirement of 3 nodes (no mention of zones yet).
But having one master and two data nodes will not cut it. You need 3 master-eligible nodes, so that if one node is out, the other two can still form a quorum and elect a new master, and your cluster remains operational with two nodes. But in order for this to work, you need to set your primary and replica shards in a way that any two of your nodes can hold your entire data.
Examples (for simplicity we have only one index):
1 primary, 2 replicas. Every node holds one shard which is 100% of the data
3 primaries, 1 replica. Every node will hold one primary and one replica (33% primary, 33% replica). Two nodes combined (which is the minimum to form a quorum as well) will hold all your data (and some more)
You can have more combinations but you get the idea.
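As a concrete sketch of the second example (the index name my-index is made up):

curl -X PUT 'http://localhost:9200/my-index' \
     -H 'Content-Type: application/json' -d '
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  }
}'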
As you can see, the shard configuration needs to go along with your number and type of nodes (master-eligible, data only etc).
Now, if you add the availability zones, you take care of the problem of one zone being problematic. If your whole cluster were in one zone (3 nodes in one zone) and that zone had problems, your whole cluster would be out.
If you set up one master node and two data nodes (which are not master-eligible), having 3 AZs (or even 3 nodes) doesn't do much for resiliency: if the master goes out, your cluster cannot elect a new one, and it will be out until a master node is up again. With the same setup, if a data node goes out and your shards are configured with redundancy (meaning the two remaining nodes hold all the data between them), things will keep working fine.
Your questions should be covered by the following points.
If I specify 3 data nodes and 3 AZs, what does that actually mean?
This means that your data and replicas will be spread across 3 AZs, with no replica placed in the same AZ as its primary. Check this link. For example, say you have 2 data nodes in 2 AZs: DN1 will be stored in (let's say) AZ1 and its replica in AZ2, while DN2 will be in AZ2 and its replica in AZ1.
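For context, this per-AZ placement is ordinary Elasticsearch shard allocation awareness, which AWS configures for you. On a self-managed cluster the equivalent would look roughly like the sketch below (the attribute name zone is an assumption; each node would also need to be started with a matching node attribute, e.g. node.attr.zone):

curl -X PUT 'http://localhost:9200/_cluster/settings' \
     -H 'Content-Type: application/json' -d '
{ "persistent": { "cluster.routing.allocation.awareness.attributes": "zone" } }'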
It looks like the other AZs should hold replicas, but then I don't understand why I see different amounts of free space on different nodes.
It is because when you give your AWS Elasticsearch domain some amount of storage, the cluster divides the specified storage space across all data nodes. If you specify 100G of storage on a cluster with 2 data nodes, it divides the space equally, i.e. two data nodes with 50G of available storage each.
Sometimes you will see more nodes than you specified on the cluster. It took me a while to understand this behaviour: when you update these configs on AWS ES, it takes some time for the cluster to stabilize. So if you see more data or master nodes than expected, hold on for a while and wait for it to stabilize.
Thanks everyone for the help. To see how much space is available/allocated, run the following queries:
GET /_cat/allocation?v
GET /_cat/indices?v
GET /_cat/shards?v
So, if I create 3 nodes, I get 3 different nodes with separate storage; they are not replicas. Some data is stored on one node, some on another.

Scaling down an Elasticsearch cluster

First of all, I would like to mention that I am not an Elasticsearch expert.
I have a 3-node Elasticsearch cluster. Resource utilization is not proportional to cost, so I have decided to remove 2 of the nodes.
Now I am wondering: what is the graceful way to take 2 nodes out of 3 without downtime? What could the consequences be?
I cannot completely shut down the whole cluster. Running Elasticsearch version: 5.6.8.
Any help or suggestion will be really appreciated.
For high availability you need at least 3 nodes for the master election. Be sure to set discovery.zen.minimum_master_nodes correctly:
2 (= majority) for 3 nodes — only this is highly available
2 for 2 nodes (also the majority), but you are losing HA because as soon as one node is down you will not be able to elect a master any more
1 for 1 node
If you are removing data nodes, be sure that the data is replicated to at least one other node. Either set the replication factor number_of_replicas to 2 (= 3 copies, so on all nodes in your case) if you want to kill 2 of the 3 nodes. Or slightly more gracefully, set "index.routing.allocation.require._name": "A" to ensure that data must be allocated on the node with the name A. Ensure with the cat shards API that the surviving node has all the required data.
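A rough sketch of both options with curl (this assumes the surviving node is named A and applies the settings to every index; adjust to your index names):

# Option 1: two replicas of every shard, i.e. a full copy on all 3 nodes
curl -X PUT 'http://localhost:9200/_all/_settings' \
     -H 'Content-Type: application/json' -d '
{ "index": { "number_of_replicas": 2 } }'

# Option 2: require every shard to be allocated on node A
curl -X PUT 'http://localhost:9200/_all/_settings' \
     -H 'Content-Type: application/json' -d '
{ "index.routing.allocation.require._name": "A" }'

# Verify that the surviving node holds all the data before killing the others
curl 'http://localhost:9200/_cat/shards?v'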

Multiple datacenter replication and local quorum?

I created a cluster of 6 nodes:
3 nodes in EU west1 and 3 nodes in EU west2.
I set the locality for every group of nodes like: --locality=region=europe,datacenter=west1
I also set the replication factor to 6 to have all ranges and all data on every node.
What will happen if the connection between the data centers is lost? Does the whole cluster go down?
I tried killing 3 nodes in one of the datacenters, and the cluster is not operational because the majority of the nodes are down and the quorum is less than 4.
Is it possible to make the 2 datacenters work with their local quorum of 2/3?
I also played a bit with the replication settings; sometimes the cluster stays healthy when I kill 3 of the 6 nodes and I am able to write to it, and sometimes I can only read from it. The cluster keeps working with a replication factor of 5 and 3 of the 6 nodes killed. I'm still playing with this, but if someone can give me more information it would be very helpful.
Being able to replicate across datacenters is a very cool feature, but losing the whole cluster when one of the datacenters is down ruins the whole idea, at least for me.
CockroachDB requires a majority of replicas to be fully operational, which means > half, not >= half. In order to survive the loss of a full datacenter or region, you must have three DCs/regions, not two. Try running two nodes in each of three regions instead of three nodes in two regions.
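A sketch of that topology (host names, store paths, and region names are placeholders; one start command per node, two nodes per region):

# With the default replication factor of 3 and locality set per region,
# each range gets one replica per region, so losing any single region
# still leaves a 2-of-3 majority.
cockroach start --insecure --locality=region=eu-west1 --store=/data/n1 --join=host1,host2,host3
cockroach start --insecure --locality=region=eu-west2 --store=/data/n2 --join=host1,host2,host3
cockroach start --insecure --locality=region=eu-west3 --store=/data/n3 --join=host1,host2,host3
# ...plus a second node per region, each with its own --store path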
Is it possible to make the 2 datacenters work with their local quorum of 2/3?
Not for a single table (because it would be impossible to guarantee consistency if each datacenter were able to act in isolation from the other). You've configured the data to be replicated across all six replicas, which means four replicas are required to make a quorum. If you want each datacenter to be able to operate independently of the other, you would need two separate tables, with each one configured to be located within one of the datacenters.
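If you do go the two-table route, a sketch of pinning each table to one datacenter (table names are hypothetical; the CONFIGURE ZONE syntax is from newer CockroachDB versions, and the constraint keys match the --locality flags above):

cockroach sql --insecure -e \
  "ALTER TABLE mydb.west1_data CONFIGURE ZONE USING num_replicas = 3, constraints = '[+datacenter=west1]';"
cockroach sql --insecure -e \
  "ALTER TABLE mydb.west2_data CONFIGURE ZONE USING num_replicas = 3, constraints = '[+datacenter=west2]';"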
Thanks for the answer; just to clear up a few things. It looks like you got my point and what I want to accomplish.
As far as I understand, if I have 2x3 nodes in 2 different DCs and one DC goes down, I have 3 live nodes, but for quorum I need at least 4 (N/2 + 1).
So if I have 3x3, I can lose one DC, because with 2 DCs live I will still have a quorum.
And one last question: if I don't set the replication factor to 9 and I lose 3 nodes in one DC, some ranges will be unavailable, right?

discovery.zen.minimum_master_nodes value for a cluster of two nodes

I have two dedicated Windows servers (Windows Server 2012 R2, 128GB memory on each) for ES (2.2.0). If I have one node on each server and the two nodes form a cluster, what is the proper value for
discovery.zen.minimum_master_nodes
I read this general rule in elasticsearch.yml:
Prevent the "split brain" by configuring the majority of nodes (total
number of nodes / 2 + 1):
I saw this SO thread:
Proper value of ES_HEAP_SIZE for a dedicated machine with two nodes in a cluster
There is an answer saying:
As described in Elasticsearch Pre-Flight Checklist, you can set
discovery.zen.minimum_master_nodes to at least (N/2)+1 on clusters
with N > 2 nodes.
Please note "N > 2". What is the proper value in my case?
N is the number of ES nodes (not physical machines but ES processes) that can be part of the cluster.
In your case, with one node on each of two machines, N = 2 (note that it was 4 in the linked thread), so the formula N/2 + 1 yields 2, which means that both of your nodes MUST be master-eligible if you want to prevent split-brain situations.
If you set that value to 1 (which is the default value!) and you experience networking issues and both of your nodes can't see each other for a brief moment, each node will think it is alone in the cluster and both will elect themselves as master. You end up in a situation where you have two masters and that's not a good thing. Whereas if you set that value to 2 and you experience networking issues, the current master node will stay elected and the second node will never decide to elect itself as new master. Whenever network is back up, the second node will rejoin the cluster and continue serving requests.
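For what it's worth, this setting is dynamic in ES 2.x, so you can also apply it to a running cluster (a sketch, assuming it answers on localhost:9200):

curl -X PUT 'http://localhost:9200/_cluster/settings' \
     -H 'Content-Type: application/json' -d '
{ "persistent": { "discovery.zen.minimum_master_nodes": 2 } }'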
The ideal topology is to have 3 dedicated master nodes (i.e. with node.master: true and node.data: false) and discovery.zen.minimum_master_nodes set to 2. That way you'll never have to change the setting, however many data nodes are part of your cluster.
So the N > 2 constraint should indeed be N >= 2, but I guess it was somehow implied, because otherwise you're creating a fertile ground for split brain situations.
Interestingly, in ES 7 discovery.zen.minimum_master_nodes no longer needs to be defined.
https://www.elastic.co/blog/a-new-era-for-cluster-coordination-in-elasticsearch

How to explicitly define which datanodes store a particular file in HDFS?

I want to write a script or something like an .xml file that explicitly defines which datanodes in a Hadoop cluster store a particular file's blocks.
For example:
Suppose there are 4 slave nodes and 1 master node (5 nodes total in the Hadoop cluster).
There are two files, file01 (size = 120 MB) and file02 (size = 160 MB). The default block size is 64 MB.
Now I want to store one of the two blocks of file01 on slave node1 and the other on slave node2.
Similarly, one of the three blocks of file02 on slave node1, the second on slave node3, and the third on slave node4.
So, my question is: how can I do this?
Actually, there is one method: change the conf/slaves file every time a file is stored.
But I don't want to do that, so is there another solution?
I hope I made my point clear. Waiting for your kind response!
There is no method to achieve what you are asking here - the NameNode will replicate blocks to datanodes based upon rack configuration, replication factor and node availability, so even if you did manage to get a block onto two particular datanodes, if one of those nodes went down, the NameNode would replicate the block to another node.
Your requirement is also assuming a replication factor of 1, which doesn't give you any data redundancy (which is a bad thing if you lose a data node).
Let the namenode manage block assignments and use the balancer periodically if you want to keep your cluster evenly distributed.
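For reference, you can inspect where a file's blocks actually landed, and rebalance, from the command line (the path is a placeholder):

# List every block of file01 and the datanodes holding its replicas
hadoop fsck /user/hadoop/file01 -files -blocks -locations

# Even out block distribution across datanodes (threshold = max % deviation
# of a datanode's disk usage from the cluster average)
hadoop balancer -threshold 10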
The NameNode is the ultimate authority on block placement.
There is a Jira issue about making this algorithm pluggable:
https://issues.apache.org/jira/browse/HDFS-385
but unfortunately it is in version 0.21, which is not production-ready (although it works reasonably well).
I would suggest plugging your algorithm into 0.21 if you are at the research stage and then waiting for 0.23 to become production-ready, or backporting the code to 0.20 if you need it now.
