Cluster in OpenDaylight - cluster-computing

I am trying to set up a cluster according to this manual:
https://docs.opendaylight.org/en/stable-magnesium/getting-started-guide/clustering.html
But I would also like to know how it works.
The manual seems to say that I choose which node/controller is the leader, and which one takes over after
the leader goes down, using the member roles (member-1, member-2) in akka.conf.
But in some work I have read that it uses the RAFT algorithm to elect the leader. Am I mixing something up?
Can someone explain it to me, please?

The nodes you specify when setting up an OpenDaylight cluster are all equal; there is no pre-selection of a leader. When the cluster starts up, the controller nodes participate in a RAFT election to choose the leader.

Related

Hashicorp Raft consensus deadlock state

I am implementing a Raft service using Hashicorp Raft library for distributed consensus.
https://github.com/hashicorp/raft
I have a simple layout with Raft: 2 followers and a leader.
I bootstrap the cluster with the leader and add 2 follower Raft nodes to the Raft cluster, and things look fine. When I knock one of the followers offline, the leader gets:
failed to contact quorum of nodes, stepping down. The problem is that there are now 0 nodes in the leader state and no one can be promoted to leader, because a majority of votes is required from a quorum of nodes. Because the previous leader is now a follower, even my service discovery tools can't remove the old IP address, because doing that requires leader power.
My cluster enters an infinite loop (deadlock) of trying to connect to a node that is offline forever, and no one can get promoted to leader. Any ideas?
Edit: After some thought, I guess I need a system with an odd number of nodes to reach quorum (i.e. 3 nodes: if 1 gets knocked offline, I can tell the new leader to remove the old IP address).
I don't have experience with that library, but:
"The problem with this is now there are 0 nodes in leader state and no one can promote to leader because majority of votes required from quorum of nodes".
With one of three nodes out, you still have a quorum/majority. Raft would promote one of the followers to leader.
The fact that your system stalled after you removed one of the two followers tells me that one of the followers was not added correctly in the first place.
You had one leader and one follower initially. Since that was working, it means that the follower and the leader can communicate; all good here.
You then added a second follower; how do you know that it was done correctly? Can you try to do it again, and knock out this second follower? The system should keep working, as the leader and the first follower are OK.
So I conclude that the second follower did not join the cluster, and when you knocked out the first follower, the system stopped, as it should: a majority of correctly configured nodes was no longer available.
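The majority arithmetic behind this answer can be sketched in a few lines of Python. This is only an illustration of the general Raft majority rule, not code from the Hashicorp library; the function names quorum_size and tolerated_failures are made up for the example.

```python
def quorum_size(voters: int) -> int:
    """Smallest number of voting nodes that forms a strict majority."""
    return voters // 2 + 1


def tolerated_failures(voters: int) -> int:
    """How many voters can be offline while a majority is still reachable."""
    return voters - quorum_size(voters)


if __name__ == "__main__":
    # Print quorum size and tolerated failures for a few cluster sizes.
    for n in (2, 3, 4, 5):
        print(f"voters={n} quorum={quorum_size(n)} "
              f"tolerated_failures={tolerated_failures(n)}")
```

With 3 voters the quorum is 2, so one node can fail. With only 2 voters (which is what the cluster effectively has if the second follower never joined) the quorum is also 2, so losing a single node stalls the cluster.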

etcd - is the leader node automatically and dynamically elected? newbie question

I have been actively looking into etcd for several hours and I have a newbie question:
Is the leader node dynamically and automatically elected, or must the human operator of the cluster take action?
If yes, then what are the conditions that must be met in order to begin the election of a new leader node?
If no, then what should be done?

How to deal with split brain in a cluster that has two nodes?

I am learning some basic concepts of cluster computing and I have some questions to ask.
According to this article:
If a cluster splits into two (or more) groups of nodes that can no longer communicate with each other (a.k.a. partitions), quorum is used to prevent resources from starting on more nodes than desired, which would risk data corruption.
A cluster has quorum when more than half of all known nodes are online in the same partition, or for the mathematically inclined, whenever the following equation is true:
total_nodes < 2 * active_nodes
For example, if a 5-node cluster split into 3- and 2-node partitions, the 3-node partition would have quorum and could continue serving resources. If a 6-node cluster split into two 3-node partitions, neither partition would have quorum; pacemaker’s default behavior in such cases is to stop all resources, in order to prevent data corruption.
Two-node clusters are a special case.
By the above definition, a two-node cluster would only have quorum when both nodes are running. This would make the creation of a two-node cluster pointless.
Questions:
From the above, I am confused: why can't we simply stop all cluster resources, as in the 6-node cluster case? What is special about a two-node cluster?
You are correct that a two-node cluster can only have quorum when the nodes are in communication. Thus, if the cluster were to split, using the default behavior, the resources would stop.
The solution is to not use the default behavior. Simply set Pacemaker to no-quorum-policy=ignore. This will instruct Pacemaker to continue to run resources even when quorum is lost.
...But wait, what happens if cluster communication is broken but both nodes are still operational? Will they not consider their peers dead and both become active nodes? Now I have two primaries, and potentially diverging data or conflicts on my network, right? This issue is addressed via STONITH. Properly configured STONITH will ensure that only one node is ever active at a given time and essentially prevent split-brains from even occurring.
An excellent article further explaining STONITH and its importance was written by LMB back in 2010 here: http://advogato.org/person/lmb/diary/105.html
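The quorum rule quoted in the question (total_nodes < 2 * active_nodes) is easy to check by hand or in code. The following is a minimal Python sketch of that inequality only; has_quorum is a hypothetical helper name, not a Pacemaker API.

```python
def has_quorum(total_nodes: int, active_nodes: int) -> bool:
    """Quoted rule: a partition has quorum when total_nodes < 2 * active_nodes."""
    return total_nodes < 2 * active_nodes


if __name__ == "__main__":
    # (total nodes in the cluster, nodes visible in this partition)
    cases = [(5, 3), (5, 2), (6, 3), (2, 2), (2, 1)]
    for total, active in cases:
        print(f"total={total} active={active} quorum={has_quorum(total, active)}")
```

The 5-node cluster keeps quorum in its 3-node partition, the 6-node cluster split 3/3 has quorum in neither half, and a two-node cluster has quorum only while both nodes see each other, which is why it needs the special handling described in the answer.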

bootstrap-expect=1 in Consul results in weird behavior in cluster

I am trying to launch a cluster of nodes one at a time, and I'm a bit confused about the bootstrap-expect value.
The way it is set up is that consul is launched with bootstrap-expect, and after it starts, consul join is run.
Currently, the deployment sets bootstrap-expect to the number of nodes in the cluster, and a leader is elected once that many nodes have joined.
However, when bootstrap-expect is set to 1 (the thought process being that we can have a cluster without waiting for all the nodes), something strange happens.
First, each node thinks it is the leader, which is expected since bootstrap-expect is set to 1. But after the nodes do consul join to each other, a new cluster leader isn't elected. Instead, each node in the cluster still thinks of itself as the cluster leader.
Why don't the nodes, when joining a cluster, elect a new leader? Or at least respect the pre-existing leader?
This is a condition called Split Brain that you've "intentionally" created. Each server thinks it's the leader and has its own version of the log, and these versions cannot be reconciled with each other. Split Brain is famously hard to recover from. Since the servers cannot agree on what the cluster state should be, they can't decide who the new leader should be, and they continue without a successful election. You can read up on Raft to learn more about why.

What cluster node should be active?

There is a cluster and a Unix network daemon. This daemon is started on each cluster node, but only one instance can be active.
When the active daemon breaks (whether the program or the node fails), another node should become active.
I could think of a few possible algorithms, but I assume there is already some research on this and some ready-to-go algorithms? Am I right? Can you point me to the answer?
Thanks.
JGroups is a Java network stack which includes DistributedLockManager-type support and cluster voting capabilities. These allow any number of Unix daemons to agree on who should be active. All of the nodes could be trying to obtain a lock (for example) and only one will succeed, holding it until the application or the node fails.
JGroups also has the concept of the coordinator of a specific communication channel. Only one node can be coordinator at a time, and when a node fails, another node becomes coordinator. It is simple to test whether you are the coordinator, in which case you would be active.
See: http://www.jgroups.org/javadoc/org/jgroups/blocks/DistributedLockManager.html
If you are going to implement this yourself, there are a number of things to keep in mind:
Each node needs to have a consistent view of the cluster.
All nodes will need to inform all of the other nodes that they are online, maybe with multicast.
Nodes that go offline (because of application or node failure) will need to be removed from all other nodes' "view".
You can then have the node with the lowest IP address (or something similar) be the active node.
If this isn't appropriate, then you will need some sort of voting exchange so the nodes can agree on who is active. Something like: http://en.wikipedia.org/wiki/Two-phase_commit_protocol
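The "lowest IP is active" rule from the list above could look roughly like this in Python. It is only a sketch under assumptions: every node already holds the same view (membership discovery and failure detection are not shown), all addresses are IPv4, and the names pick_active and view are invented for the example.

```python
import ipaddress


def pick_active(view):
    """Return the member with the numerically lowest IP address.

    `view` is an iterable of IPv4 address strings that this node currently
    believes are online. Returns None when the view is empty.
    """
    members = list(view)
    if not members:
        return None
    # Parse the strings so that 10.0.0.3 sorts before 10.0.0.12;
    # plain string comparison would get this wrong.
    return min(members, key=ipaddress.ip_address)


if __name__ == "__main__":
    view = {"10.0.0.12", "10.0.0.3", "10.0.0.7"}
    print(pick_active(view))   # node that should be active now
    view.discard("10.0.0.3")   # simulate that node going offline
    print(pick_active(view))   # next node takes over
```

Each daemon would run the same function over its own view and become active only if the result is its own address. This is safe only as long as all nodes really do share the same view, which is the hard part the first three points describe.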
