CAP: Could we sacrifice Availability to gain Partition tolerance?

My understanding: in essence, A and P are the same thing, because from the perspective of the entire multi-node cluster they are always positively correlated: there is basically no way to make a design choice such as "sacrifice A to gain P" or "sacrifice P to gain A".
For example: can you design a high-availability multi-node cluster that does not tolerate network partitions (that is, one that becomes unavailable the moment a network partition appears)?
Single-node systems are not considered here, because CAP is a law for distributed clusters.

Therefore, the CAP theorem can only be interpreted as: when the multi-node cluster suffers a network partition (P) severe enough that no partition contains a majority of the nodes, you must choose between maintaining consistency (C) and maintaining availability (A).
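To make that C-versus-A choice concrete, here is a toy sketch (the policy names and logic are illustrative, not from any real system) of how a node that can no longer reach a majority behaves under each choice:

```python
# Toy sketch of the C-vs-A choice during a partition.
# Policy names and logic are illustrative, not from any real system.

def handle_read(reachable_nodes: int, total_nodes: int, policy: str, last_value):
    has_majority = 2 * reachable_nodes > total_nodes
    if has_majority:
        return last_value  # normal operation: serve the request
    if policy == "CP":
        # preserve consistency: refuse to answer rather than risk stale data
        raise RuntimeError("partitioned without majority: unavailable")
    # policy == "AP": preserve availability, possibly returning stale data
    return last_value

# A 6-node cluster split 3/3: neither side has a majority, so a CP
# system refuses requests while an AP system keeps answering.
```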

Related

How do I set up failover on my NetApp clusters?

I have two NetApp clusters (main and DR), each with two nodes.
If one of the nodes in either cluster goes down, the other node kicks in and acts as a one-node cluster.
Now my question is: what happens when a whole cluster goes down due to a power-supply problem?
I've heard about "MetroCluster", but I want to ask if there is another option for this.
It depends on what RPO you need. MetroCluster does synchronous replication of every write and thus provides an RPO of zero (no data loss).
On the other hand, you could use SnapMirror, which basically takes periodic snapshots and stores them on the other cluster. As you can imagine, you should expect some data loss.
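As a back-of-the-envelope illustration of that trade-off (a sketch, not NetApp tooling): with synchronous replication every acknowledged write already exists on the remote cluster, so the worst-case data loss is zero, while with periodic snapshot shipping the worst case is roughly one replication interval of writes.

```python
# Rough worst-case RPO comparison (illustrative sketch only).

def worst_case_rpo_minutes(mode: str, snapshot_interval_min: float = 0.0) -> float:
    if mode == "synchronous":   # MetroCluster-style: every write acked on both sites
        return 0.0
    if mode == "snapshot":      # SnapMirror-style: periodic transfers
        return snapshot_interval_min
    raise ValueError(f"unknown mode: {mode}")

print(worst_case_rpo_minutes("synchronous"))      # 0.0
print(worst_case_rpo_minutes("snapshot", 15.0))   # up to ~15 minutes of writes lost
```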

How to deal with split-brain in a cluster that has two nodes?

I am learning some basic concepts of cluster computing and I have some questions to ask.
According to this article:
If a cluster splits into two (or more) groups of nodes that can no longer communicate with each other (a.k.a. partitions), quorum is used to prevent resources from starting on more nodes than desired, which would risk data corruption.
A cluster has quorum when more than half of all known nodes are online in the same partition, or for the mathematically inclined, whenever the following equation is true:
total_nodes < 2 * active_nodes
For example, if a 5-node cluster split into 3- and 2-node partitions, the 3-node partition would have quorum and could continue serving resources. If a 6-node cluster split into two 3-node partitions, neither partition would have quorum; Pacemaker's default behavior in such cases is to stop all resources, in order to prevent data corruption.
Two-node clusters are a special case.
By the above definition, a two-node cluster would only have quorum when both nodes are running. This would make the creation of a two-node cluster pointless.
Questions:
From the above I am confused: why can we not just stop all cluster resources, as in the "6-node cluster" case? What is special about the two-node cluster?
You are correct that a two-node cluster can only have quorum when both nodes are in communication. Thus, if the cluster were to split, using the default behavior, the resources would stop.
The solution is to not use the default behavior. Simply set Pacemaker to no-quorum-policy=ignore. This will instruct Pacemaker to continue to run resources even when quorum is lost.
...But wait, now what happens if the cluster communication is broken but both nodes are still operational? Will they not consider their peers dead and both become active nodes? Then I have two primaries, and potentially diverging data or conflicts on my network, right? This issue is addressed via STONITH. Properly configured STONITH will ensure that only one node is ever active at a given time, essentially preventing split-brain from ever occurring.
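To make the quoted rule concrete, here is a minimal sketch of the quorum check (total_nodes < 2 * active_nodes) applied to the examples above; the function name is my own, not Pacemaker's:

```python
# The quorum rule quoted above: a partition has quorum when
# total_nodes < 2 * active_nodes, i.e. it holds a strict majority.

def has_quorum(active_nodes: int, total_nodes: int) -> bool:
    return total_nodes < 2 * active_nodes

print(has_quorum(3, 5))  # True:  the 3-node side of a 5-node split keeps quorum
print(has_quorum(2, 5))  # False: the 2-node side stops its resources
print(has_quorum(3, 6))  # False: a 6-node cluster split 3/3 -> neither side has quorum
print(has_quorum(1, 2))  # False: why a two-node cluster is a special case
```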
An excellent article further explaining STONITH and its importance was written by LMB back in 2010: http://advogato.org/person/lmb/diary/105.html

DataStax Cassandra - Spanning cluster nodes across Amazon regions

I am planning to launch three EC2 instances across Amazon regions, say Region-A, Region-B, and Region-C.
Based on the above plan, each region acts as a cluster (or datacenter) and has one node (correct me if I am wrong).
Using this infrastructure, can I attain the configuration below?
Replication factor: 2
Write and read consistency level: QUORUM
My basic intention in doing this is to achieve: "if two regions go down, I can survive with the remaining one region".
Please help me with your inputs.
Note: I am very new to Cassandra, so whatever inputs you give will be useful to me.
Thanks
If you have a replication factor of 2 and use a CL of QUORUM, you will not tolerate failure; i.e., if a node goes down, you only get 1 ack, and that's not a majority of responses.
If you deploy across multiple regions, each region is, as you mention, a DC in your cluster. Each individual DC is a complete replica of all your data, i.e. it will hold all the data for your keyspace. If you read/write at a LOCAL_* consistency level (e.g. LOCAL_ONE, LOCAL_QUORUM) within each region, then you can tolerate the loss of the other regions.
The number of replicas in each DC/region and the consistency level you use to read/write in that DC determine how much failure you can tolerate. QUORUM is a cross-DC consistency level: it requires a majority of acks from ALL replicas in your cluster, across all DCs. If you lose 2 regions, it is unlikely that you will get a quorum of responses.
Also, it's worth remembering that Cassandra can be made aware of the AZs it is deployed in within a region and will do its best to ensure replicas of your data are placed in multiple AZs. This will give you even better tolerance to failure.
If this were me and I didn't need a strong cross-DC consistency level (like QUORUM), I would have 4 nodes in each region, deployed across the AZs, with a replication factor of 3 in each region. I would then read/write at LOCAL_QUORUM or LOCAL_ONE (preferably). If you go with LOCAL_ONE, then you could have fewer replicas in each DC, e.g. a replication factor of 2 with LOCAL_ONE means you could tolerate the loss of 1 replica.
However, this would be more expensive than what you're initially suggesting, but (for me) that is the minimum setup I would need if I wanted to be in multiple regions and tolerate the loss of two. You could go with 3 nodes in each region if you really wanted to save costs.
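To see why RF=2 with QUORUM cannot survive a lost replica, here is a small sketch of Cassandra's quorum arithmetic (quorum = floor(RF/2) + 1); the helper names below are mine for illustration, not part of any driver API.

```python
# Cassandra-style quorum arithmetic (sketch; helper names are mine).

def quorum(replication_factor: int) -> int:
    return replication_factor // 2 + 1

def tolerable_replica_failures(replication_factor: int) -> int:
    return replication_factor - quorum(replication_factor)

print(quorum(2), tolerable_replica_failures(2))  # 2 0 -> RF=2 + QUORUM survives no failure
print(quorum(3), tolerable_replica_failures(3))  # 2 1 -> RF=3 + QUORUM survives one replica
# Cross-DC: with 3 regions at RF=2 each (6 replicas total), QUORUM
# needs 4 acks, so losing two whole regions (4 replicas) cannot succeed.
```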

CAP with distributed systems

When we talk about NoSQL distributed database systems, we know that all of them fall under the two-out-of-three of the CAP theorem. For a distributed cluster where network failures and node failures are inevitable, partition tolerance is a necessity, leaving us to choose one of availability and consistency. So it's basically CP or AP.
My questions are
Which category does Hadoop fall into?
Let's say I have a cluster with 6 nodes, A, B, C, D, E, F. During a network failure, let's say nodes A, B, C and nodes D, E, F are divided into two independent clusters.
Now, in a consistent and partition-tolerant (CP) model, since an update on node A won't replicate to node D, the consistency of the system won't allow users to update or read data until the network is up and running again, hence making the database unavailable.
Whereas an available and partition-tolerant (AP) system would allow the user of node D to see the old data when an update is made at node A, it doesn't guarantee the user of node D the latest data. But after some time, when the network is up and running again, it replicates the latest data of node A to node D, and hence allows the user of node D to view the latest data.
From the above two scenarios we can conclude that in an AP model there is no scope for the database going down, allowing users to write and read even during failure, with the promise of the latest data once the network is up again. So why do people go for the consistent and partition-tolerant (CP) model? From my perspective, during a network failure AP has an advantage over CP, since it allows users to read and write data while a database under CP is down.
Is there any system that can provide all of CAP together, excluding the concept of Cassandra's eventual consistency?
When does a user choose availability over consistency and vice versa? Is there any database out there that allows the user to switch between CP and AP accordingly?
Thanks in advance :)
HDFS has a unique central decision point, the NameNode. As such, it can only fall on the CP side, since taking down the NameNode takes down the entire HDFS system (no availability). Hadoop does not try to hide this:
The NameNode is a Single Point of Failure for the HDFS Cluster. HDFS is not currently a High Availability system. When the NameNode goes down, the file system goes offline. There is an optional SecondaryNameNode that can be hosted on a separate machine. It only creates checkpoints of the namespace by merging the edits file into the fsimage file and does not provide any real redundancy.
Since the decision where to place data and where it can be read from is always handled by the NameNode, which maintains a consistent view in memory, HDFS is always consistent (C). It is also partition tolerant in that it can handle losing data nodes, subject to the replication factor and data topology strategies.
Is there any system that can provide CAP together?
Yes, such systems are often mentioned in Marketing and other non-technical publications.
When does a user choose availability over consistency and vice versa?
This is a business use-case decision. When availability is more important, they choose AP. When consistency is more important, they choose CP. In general, when money changes hands, consistency takes precedence. Almost every other case favors availability.
Is there any database out there that allows the user to switch between CP and AP?
Systems that allow you to modify both the write and the read quorums can be tuned to be either CP or AP, depending on the needs.
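As a sketch of that tuning rule: with n replicas, a read quorum r and a write quorum w overlap on at least one replica whenever r + w > n, which is the CP-leaning setting; smaller quorums trade that overlap for availability. The names below are illustrative.

```python
# Tunable-quorum rule of thumb (sketch): with n replicas, a read
# quorum r and write quorum w always overlap when r + w > n, so
# reads are guaranteed to see the latest acknowledged write.

def is_strongly_consistent(n: int, r: int, w: int) -> bool:
    return r + w > n

print(is_strongly_consistent(3, 2, 2))  # True  -> CP-leaning tuning
print(is_strongly_consistent(3, 1, 1))  # False -> AP-leaning: faster, may read stale data
```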

Minimal HA cluster for five nines

I am trying to find out whether a 3-node HA cluster is common practice. Most of the references on Google point to 2-node clusters, but I am not able to convince myself that an application that requires five nines can be implemented as a 2-node HA cluster on commodity hardware.
The reason behind this is simple: if the machine hosting one node goes offline, there will be only one node left, without any backup.
To reduce dependency on the node that went offline, I think a 3-node cluster is a minimum requirement.
In order to give a factual answer, much more data would be required.
But from an anecdotal perspective, two nodes of commodity hardware are not nearly enough to give you five-nines with any level of reliability (or at least sleep-at-night comfort).
Most cluster diagrams are likely drawn with only two nodes for ease of explanation, "If A fails, B keeps working".
Given your five-nines requirement, however, and "commodity hardware", I would consider more than three nodes a requirement; perhaps as many as five or more.
Remember to allow for network, power and perhaps even geographical diversity if you are really after that kind of reliability.
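As a rough back-of-the-envelope check (a sketch with made-up per-node numbers, not a measurement): if n independent nodes each offer availability a and the service fails only when all of them are down at once, the naive estimate is 1 - (1 - a)^n. Real clusters share power, network, and software failure modes, so treat this as an optimistic upper bound.

```python
# Naive availability estimate (sketch): the service is up as long as
# at least one of n independent nodes is up. Real failures are
# correlated, so this is an optimistic upper bound.

def cluster_availability(node_availability: float, n: int) -> float:
    return 1 - (1 - node_availability) ** n

for n in (2, 3, 5):
    print(n, f"{cluster_availability(0.99, n):.10f}")
# 2 -> 0.9999000000  (four nines on paper)
# 3 -> 0.9999990000  (better than five nines, before shared failure modes)
# 5 -> 0.9999999999
```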
