I am trying to adopt a service discovery mechanism for my system. I have a bunch of nodes that communicate with each other via gRPC. In some frameworks, like Mesos, a node that is brought back up after a failure may have a different IP address and a different port, so I am thinking of using service discovery so that each node can have a cluster config that is agnostic to node failure.
My current options are to use DNS or a strongly consistent key-value store like etcd or ZooKeeper. My problem is understanding how the cache of name mappings in healthy nodes gets invalidated and updated when a node goes down and comes back up.
The possible ways I can think of are:
When healthy nodes detect a connection problem, they invalidate their cache entry immediately and keep polling the DNS registry until the node is reachable again.
When a node goes down and comes back up, the DNS registry broadcasts the events to all healthy nodes. This seems to require heartbeats from the DNS registry.
The cache in each node has a TTL field. Within a TTL interval, each node has to live with the node failure until the cache entry expires and is refreshed from the DNS registry.
My question is: which option (you can name more) is used in practice, and why is it better than the alternatives?
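To make the trade-off concrete, here is a minimal sketch that combines the first and third options: a client-side cache whose entries expire after a TTL and can also be dropped early when a connection fails. The class name `DiscoveryCache` and the `lookup` callable are hypothetical; `lookup` stands in for whatever query the registry (DNS, etcd, ZooKeeper) actually offers.

```python
import time
from typing import Callable, Dict, Tuple

Address = Tuple[str, int]


class DiscoveryCache:
    """Client-side cache of name -> address lookups with a TTL (illustrative only)."""

    def __init__(self, lookup: Callable[[str], Address], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._lookup = lookup      # hypothetical call into DNS/etcd/ZooKeeper
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[Address, float]] = {}

    def resolve(self, name: str) -> Address:
        """Return the cached address, querying the registry once the TTL has expired."""
        now = self._clock()
        entry = self._entries.get(name)
        if entry is not None and now < entry[1]:
            return entry[0]
        address = self._lookup(name)
        self._entries[name] = (address, now + self._ttl)
        return address

    def invalidate(self, name: str) -> None:
        """Drop an entry early, e.g. after a connection error to that address."""
        self._entries.pop(name, None)
```

A caller would typically wrap each request: on a connection error, call `invalidate(name)` and then `resolve(name)` again, ideally with some backoff so that a node that is still down does not cause a tight polling loop against the registry.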
Related
I have a cluster of two database servers set up for high availability (HA). My application connects to one of them (the active one) at a time. The other one remains passive, always ready to take over when the active one fails.
It’s a typical Windows cluster mechanism. My challenge is handling these two servers: how can I let my app know which one to connect to, given that both (active and passive) need to be registered in Consul?
I am seeing some errors in my NiFi cluster. I have a 3-node secured NiFi cluster, and I am seeing the errors below on 2 of the nodes.
ERROR [main] org.apache.nifi.web.server.JettyServer Unable to load flow due to:
java.io.IOException: org.apache.nifi.cluster.ConnectionException:
Failed to connect node to cluster due to: java.io.IOException:
Could not begin listening for incoming connections in order to load balance data across the cluster.
Please verify the values of the 'nifi.cluster.load.balance.port' and 'nifi.cluster.load.balance.host'
properties as well as the 'nifi.security.*' properties
See the clustering configuration guide for the list of clustering options you have to configure. For load balancing, you'll need to specify ports that are open in your firewall so that the nodes can communicate. You'll also need to make sure that each host has its node hostname property set and its host ports set, and that there are no firewall restrictions between the nodes and your Apache ZooKeeper cluster.
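One quick way to rule out a firewall problem is to test, from one node, whether a TCP connection to another node's port can be opened at all. The sketch below uses only the Python standard library; the function name `port_is_reachable` is made up for illustration, and the host and port you pass in should be whatever you configured (for example, the value of 'nifi.cluster.load.balance.port' on the target node).

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

This only confirms that something is listening and reachable on that port while the target process is running. It says nothing about TLS settings, so a secured cluster can still fail on the 'nifi.security.*' properties even when the port check succeeds.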
If you want to simplify the setup to play around, you can use the information in the clustering configuration section of the admin guide to set up an embedded ZooKeeper node within each NiFi instance. However, I would recommend setting up an external ZooKeeper cluster. A little more work, but ultimately worth it.
Consider a Redis Sentinel setup with 5 machines. Each machine runs a Sentinel process (s1, s2, s3, s4, s5) and a Redis instance (r1, r2, r3, r4, r5). One instance is the master (r1) and the others are slaves (r2...r5). During failover of master r1, the slaveof setting in the Redis configuration must be overridden to point to the new master, r3.
Who overrides the Redis configuration of the slave instances (r2, r4, r5)? Does the Sentinel elected to handle the failover (say s2) override the configuration on r2, r4, and r5, or does the Sentinel running on each machine override its local Redis configuration (sn overrides the configuration of rn)?
The elected Sentinel updates the configuration. This is the full list of Sentinel capabilities at a high level:
Monitoring: Sentinel constantly checks if your master and slave instances are working as expected.
Notification: Sentinel can notify the system administrator, or other computer programs, via an API, that something is wrong with one of the monitored Redis instances.
Automatic failover: If a master is not working as expected, Sentinel can start a failover process where a slave is promoted to master, the other additional slaves are reconfigured to use the new master, and the applications using the Redis server are informed about the new address to use when connecting.
Configuration provider: Sentinel acts as a source of authority for clients service discovery: clients connect to Sentinels in order to ask for the address of the current Redis master responsible for a given service. If a failover occurs, Sentinels will report the new address.
For more details, refer to the Redis Sentinel documentation.
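On the client side, the "configuration provider" role usually comes down to a simple loop: ask each known Sentinel for the current master address and use the first answer. Here is a minimal sketch of that loop. The function `find_master` and the `ask` callable are hypothetical; `ask` stands in for sending `SENTINEL get-master-addr-by-name <master-name>` to one Sentinel and returning the reported address, or `None` if that Sentinel does not know the master.

```python
from typing import Callable, Iterable, Optional, Tuple

Address = Tuple[str, int]


def find_master(sentinels: Iterable[Address],
                ask: Callable[[Address], Optional[Address]]) -> Address:
    """Ask each Sentinel in turn and return the first master address reported."""
    for sentinel in sentinels:
        try:
            master = ask(sentinel)
        except OSError:
            continue  # this Sentinel is unreachable; try the next one
        if master is not None:
            return master
    raise RuntimeError("no Sentinel reported a master address")
```

After a failover, a client that loses its connection would run the same loop again and connect to whatever address the Sentinels now report. Existing client libraries already implement this; for example, redis-py ships a `Sentinel` class with a `discover_master` method, so in practice you would rarely write the loop yourself.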
For example, I have 3 NiFi nodes in a NiFi cluster. Example addresses of these nodes:
192.168.12.50:8080(primary)
192.168.54.60:8080
192.168.95.70:8080
I know that I can access the NiFi REST API from all NiFi nodes. I have a GetHTTP processor that gets the cluster summary from the REST API, and this processor runs only on the primary node. I set the "URL" property of this processor to 192.168.12.50:8080/nifi-api/controller/cluster.
But if the primary node goes down, a new primary node will be elected. The new primary node will then be unable to reach 192.168.12.50:8080, because that node is down, so I will not be able to get the cluster summary from the REST API.
In this case, can I use "localhost:8080/nifi-api/controller/cluster" instead of "192.168.12.50:8080/nifi-api/controller/cluster" on each node in the NiFi cluster?
It depends on a few things. If you are running securely, then each node has a certificate generated specifically for its hostname, and the host in the web requests needs to match the host in the certificate, so you can't use localhost in that case.
It also depends on how NiFi's web server is configured. If nifi.web.http.host or nifi.web.https.host has a specific hostname specified, then the web server is only bound to that hostname and may not accept connections with a different hostname. In a default unsecured setup, if you leave nifi.web.http.host blank, it binds to all interfaces.
You may be able to use the expression language function to obtain the hostname of the current node. So you could make the URL something like "http://${hostname()}/nifi-api/controller/cluster".
We have an 8-node cluster. Our applications point to one node in this cluster using the Transport Client. The issue is that if that node is down, the applications won't work. We've resolved this by adding the IPs of the other 7 nodes to the Transport Client object.
My question is: is there any concept like a global node that connects to the cluster internally and that I can point our applications to, so that we don't have to restart all our applications whenever we add a new node to the cluster?
The Transport Client does not join the ES cluster; it connects remotely to the node addresses it is given. You can set "client.transport.sniff" to true in the Transport Client, which lets it discover the other nodes in the cluster, including newly added ones.
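The idea behind sniffing can be illustrated without any Elasticsearch code: the client starts from a few seed addresses and periodically replaces its node list with the membership reported by the first node that answers. The sketch below is only a model of that idea, not the Transport Client's implementation; `SniffingClient` and the `list_nodes` callable are hypothetical names.

```python
from typing import Callable, Iterable, List, Set


class SniffingClient:
    """Toy model of a client that refreshes its node list from the cluster."""

    def __init__(self, seeds: Iterable[str],
                 list_nodes: Callable[[str], List[str]]) -> None:
        self._nodes: Set[str] = set(seeds)
        self._list_nodes = list_nodes  # hypothetical: asks one node for the cluster members

    def sniff(self) -> Set[str]:
        """Adopt the membership reported by the first reachable known node."""
        for node in sorted(self._nodes):
            try:
                members = self._list_nodes(node)
            except OSError:
                continue  # node is down; try the next known node
            if members:
                self._nodes = set(members)
                break
        return set(self._nodes)
```

With this approach the applications only need a handful of seed addresses at startup. Nodes added later are picked up on the next sniff, so no restart is needed as long as at least one known node is still reachable.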