This happens from time to time, one of my nodes goes into an 'unknown' state. Where can I get technical information on what the cluster is? specifically ...
what controls the state in the cluster?
how does discovery and health information flow?
and what is the mechanism for consensus?
My cluster is made of two machines around a shared Oracle database.
The status of the cluster depends of connectivity between the nodes.
The status in Runtime Manager depends of connectivity between the Runtime Manager Agent, installed in each node, to Runtime Manager in the Anypoint Platform.Unknown status probably means the later. There are several possible causes, like network connectivity issues, bugs in older versions of the agent, expired certificates, etc.
I'm not quite sure to what consensus are you referring but I don't think there is a consensus mechanism that applies here. There is a quorum mechanism but with only 2 nodes I don't think it is applicable.
Related
There are n nodes in a web cluster. Files may be uploaded to any node and then must be distributed to every other node. This distribution does not have to happen in a transaction (in fact it must not, distributed transactions don't scale) and some latency is acceptable, although must be minimal. Conflicts can be resolved arbitrarily (typically last write wins) provided that the resolution is also distributed to all nodes so that eventually all nodes have the same set of files. Nodes can be added and removed dynamically without having to reconfigure existing nodes. There must be no single point of failure and no additional boxes required to solve this (such as RabbitMQ)
I am thinking along the lines of using consul.io for dynamic configuration so that each node can refer to consul to determine what other nodes are available and writing a daemon (Golang) that monitors the relevant folders and communicates with other nodes using ZeroMQ.
Feels like I would be re-inventing the wheel though. This is a common problem and I expect there are solutions available already that I don't know about? Or perhaps my approach is wrong and there is another way to solve this?
Yes, there has been some stuff going on with distributed synchronization lately:
You could use syncthing (open source) or BitTorrent Sync.
Syncthing is node-based, i.e. you add nodes to a cluster and choose which folders to synchronize.
BTSync is folder-based, i.e. you obtain a "secret" for a folder and can synchronize with everyone in the swarm for that folder.
From my experience, BTSync has a better discovery and connectivity, but the whole synchronization process is closed source and nobody really knows what happens. Syncthing is written in go, but sometimes has trouble discovering peers.
Both syncthing and BTSync use LAN discovery via broadcast and a tracker for discovery, AFAIK.
EDIT: Or, if you're really cool, use IPFS to host the latest version, IPNS to "name" that and mount the IPNS on the servers. You can set the IPFS bootstrap list to some of your servers, which would even make you independent of external trackers. :)
According to my reading on jboss documentation it says,
We define high availability as the ability for the system to continue
functioning after failure of one or more of the servers. A part of
high availability is failover which we define as the ability for
client connections to migrate from one server to another in event of
server failure so client applications can continue to operate.
Is failover part of high availability? How can we differentiate failover vs high availability?
Failover is a means of achieving high availability (HA). Think of HA as a feature and failover as one possible implementation of that feature. Failover is not always the only consideration when achieving HA.
For example, Cassandra achieves HA through replication, but the degree of availability is determined by data consistency settings. In essence, these settings dictate how many nodes need to respond for an action (a read or a write) to succeed. Requiring more nodes to respond means less availability, and requiring fewer nodes means more availability. That's an example of HA that has nothing to do with failover, strictly speaking.
High Availability
Refers to the fact that the server system is in some way tolerant to failure.
Most of the time this is done with hardware redundancy. Assume a machine has redundant power supplies, if one fails the machine will keep running.
Failover
Then you have application redundancy (failover), which usually refers to the ability for an application running on multiple hardware installations to respond to clients in a consistent manner from any of those hardware installations. That way, if the hardware does totally fail, or the O/S dies on a particular machine, another machine can carry on.
SQL Server deals with application redundancy in four ways:
Clustering
Mirroring
Replication
Log Shipping
High-availability (HA for short) is a broad term, so when I think about it I tend to think as HA clusters.
From Wikipedia High-availability cluster:
High-availability clusters are groups of computers that
support server applications that can be reliably utilized with a
minimum amount of down-time. They operate by using high availability
software to harness redundant computers in groups or clusters that
provide continued service when system components fail. Without
clustering, if a server running a particular application crashes, the
application will be unavailable until the crashed server is fixed.
So the takeaway from the description above is that HA clusters will provide you with the minimum amount of down-time during a failover. Let me explain the two types of failover that HA clusters can provide you:
Hot-Hot / Active-Active: The redundant computers are truly operating in parallel, producing the exact same state, and the exact same output. They are all active nodes, operating as a perfect mirror of each other. In this scenario, your failover down-time is zero, and you can simply pull the power plug from any machine in the cluster without any downtime or disruption to your service.
Hot-Warn / Active-Passive: Only one primary computer is the active one, while the other computers in the cluster are passively rebuilding the same state as the primary. When the primary computer fails, it has to be disabled or killed (automatically or by an operator) and then a passive computer from the cluster needs to be made active (automatically or by an operator).
So what is the catch? The catch is that applications that can operate in a HA cluster are not trivial to design as they need to be true deterministic finite-state machines. A classic problem is when your application needs to use the clock to build state based on time, as clocks are very non-deterministic by nature.
Disclaimer: I am one of the developers of CoralSequencer.
Is it safe to use etcd across multiple data centers? As it expose etcd port to public internet.
Do I have to use client certificates in this case or etcd has some sort of authification?
Yes, but there are two big issues you need to tackle:
Security. This all depends on what type of info you are storing in etcd. Using a point to point VPN is probably preferred over exposing the entire cluster to the internet. Client certificates can also be used.
Tuning. etcd relies on replication between machines for two things, aliveness and consensus. Since a successful write must be committed to at majority of the cluster before it returns as successful, your write performance will degrade as the distance between the machines increases. Aliveness is measured with periodic heartbeats between the machines. By default, etcd has a fairly aggressive 50ms heartbeat timeout, which is optimized for bare metal servers running on a local network. Without tuning this timeout value, your cluster will constantly think that members have disappeared and trigger frequent master elections. This gets worse if both of your environments are on cloud providers that have variable networks plus disk writes that traverse the network, a double whammy.
More info on etcd tuning: https://etcd.io/docs/latest/tuning/
I have a Websphre 7 cluster with nodes running on different servers.
When a server with one node loses connection to the network, it takes about a minute, after which the Websphre knows that the member is unavailable.
How can I speed up the status updates?
UPD. The cluster is used only for EJB. EJB called from the local network.
I think this is always going to be a tradeoff between performance during normal operations and how quickly a down cluster member is detected.
See this article, Understanding HTTP plug-in failover in a clustered environment and this plugin-cfg.xml reference in the WebSphere 7 InfoCenter.
From the article, the answer will involve the ConnectTimeout, ServerIOTimeout, and RetryInterval settings, but note the warning that:
In an environment with busy workload or a slow network connection, setting this value too low could make the HTTP plug-in mark a cluster member down falsely. Therefore, caution should be used whenever choosing a value for ConnectTimeout.
We just tested an AppFabric cluster of 2 servers where we removed the "lead" server. The second server timeouts on any request to it with the error:
Microsoft.ApplicationServer.Caching.DataCacheException: ErrorCode<ERRCA0017>:SubStatus<ES0006>:
There is a temporary failure. Please retry later.
(One or more specified Cache servers are unavailable, which could be caused by busy network or servers. Ensure that security permission has been granted for this client account on the cluster and that the AppFabric Caching Service is allowed through the firewall on all cache hosts. Retry later.)
In practive this means that if one server in the cluster goes down then they all go down. (Note we are not using Windows cluster, only linking multiple AppFabric cache servers to each other.)
I need the cluster to continue operating even if a single server goes down. How do I do this?
(I realize this question is borderlining Serverfault, but imho developers should know this.)
You'll have to install the AppFabric cache on at least three lead servers for the cache to survive a single server crash. The docs state that the cluster will only go down if the "majority" of the lead servers go down, but in the fine print, they explain that 1 out of 2 constitutes a majority. I've verified that removing a server from a three lead-node cluster works as advertised.
Typical distributed systems concept. For a write or read quorum to occur in an ensemble you need to have 2f + 1 servers up where f is number of servers failing. I think appfabric or any CP (as in CAP theorem) consensus based systems need this to happen for working of the cluster.
--Sai
Thats actually a problem with the Appfabric architecture and it is rather confusing in terms of the "lead-host" concept. The idea is that the majority of lead hosts should be running so that the cluster remains up and running. So if you had three servers you'd have to have at least two lead hosts constantly communicating with each other and eating up server resources and if both go down then the whole cluster fails. The idea is to have a peer-to-peer architecture where all servers act as peers meaning that even if two servers go down the cluster remains functioning with no application downtimes. Try NCache:
http://www.alachisoft.com/ncache/