I am currently running one cluster with two nodes on one VM. The nodes are listening on different ports: one on port 80 and one on port 81. My firewall is configured to allow port 80 traffic through. If I disconnect the node on port 80, the UI shows me this message: “This node is currently not connected to the cluster. Any modifications to the data flow made here will not replicate across the cluster.” The process in the background connects to the other node and keeps running normally, but the canvas (UI) bugs out and I get a “disconnection” message in the top left of the screen, where it would usually show how many nodes are running. If I disconnect the node on port 81 instead, everything runs smoothly. I am not sure whether both nodes need to be on the same port, or on the same port but on different VMs. Can anyone help?
Apache NiFi 1.x clustering follows a zero-master design. Each cluster node runs an active NiFi process, and each serves the web UI and API on its own port (80 and 81 here). Because you are running the two processes on the same machine, they must use different ports.
As you communicate with the NiFi process on port 80 (changing the flow, starting/stopping processors, etc.), it coordinates these changes with the NiFi process on port 81. If you connected to the UI on port 81 instead, you would see your changes reflected, and you would also be able to make updates that are coordinated across the cluster.
If you remove a node from the cluster, this coordination no longer involves that node.
Typically, you would expose the web UI/API port of each of the cluster nodes, so that if one node fails or is disconnected, you can continue to administer the cluster through any other active, healthy node.
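As a sketch, the relevant nifi.properties entries for two instances on one VM would differ only in their ports (the protocol ports below are illustrative):

# Node 1 (nifi.properties)
nifi.web.http.port=80
nifi.cluster.is.node=true
nifi.cluster.node.address=localhost
nifi.cluster.node.protocol.port=11443

# Node 2 (nifi.properties)
nifi.web.http.port=81
nifi.cluster.is.node=true
nifi.cluster.node.address=localhost
nifi.cluster.node.protocol.port=11444

In your case, that also means opening port 81 in the firewall alongside port 80, so that you can still reach a UI when the node on port 80 is disconnected.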
Related
I have a cluster of two database servers (HA / high availability). My application connects to one of them (the active node) at a time; the other remains passive, ready to take over when the active one fails.
It’s a typical Windows cluster mechanism. Now I have the challenge of handling these two servers: how can I let my app know which one to connect to, since both (active and passive) need to be registered in Consul?
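One common pattern (a sketch only, not specific to your setup; the service name and port are hypothetical): register both servers under the same Consul service name, each with a health check that only passes on the node that is currently active, for example a TCP check against the database port:

{
  "service": {
    "name": "db",
    "port": 1433,
    "check": {
      "tcp": "localhost:1433",
      "interval": "10s"
    }
  }
}

Consul's DNS and health APIs only return healthy instances, so an app resolving db.service.consul would be directed to the active node without knowing which physical server that is.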
I am seeing some errors in my NiFi cluster. I have a 3-node secured NiFi cluster, and I am seeing the errors below on 2 of the nodes:
ERROR [main] org.apache.nifi.web.server.JettyServer Unable to load flow due to:
java.io.IOException: org.apache.nifi.cluster.ConnectionException:
Failed to connect node to cluster due to: java.io.IOException:
Could not begin listening for incoming connections in order to load balance data across the cluster.
Please verify the values of the 'nifi.cluster.load.balance.port' and 'nifi.cluster.load.balance.host'
properties as well as the 'nifi.security.*' properties
See the clustering configuration guide for the list of clustering options you have to configure. For load balancing, you'll need to specify ports that are open in your firewall so that the nodes can communicate. You'll also need to make sure that each host has its node hostname and port properties set, and that there are no firewall restrictions between the nodes and your Apache ZooKeeper cluster.
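The properties named in the error live in nifi.properties; for example (the hostname is a placeholder, 6342 is the default load-balance port):

nifi.cluster.node.address=nifi-node1.example.com
nifi.cluster.load.balance.host=nifi-node1.example.com
nifi.cluster.load.balance.port=6342

Each node needs its own hostname in these values, and the load-balance port must be open between all three nodes.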
If you want to simplify the setup to play around, you can use the information in the clustering configuration section of the admin guide to set up an embedded ZooKeeper node within each NiFi instance. However, I would recommend setting up an external ZooKeeper cluster. A little more work, but ultimately worth it.
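For the embedded option, the key nifi.properties entries are the following (node hostnames are placeholders; each node also needs a server list in zookeeper.properties and a myid file, as the guide describes):

nifi.state.management.embedded.zookeeper.start=true
nifi.zookeeper.connect.string=node1:2181,node2:2181,node3:2181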
We have an 8-node cluster. Our applications point to one node in this cluster using the Transport Client. The issue is that if that node is down, the applications won't work. We've worked around this by adding the other 7 nodes' IPs to the TransportClient object.
My question is: is there any concept like a global node that internally connects to the cluster, to which I can point our applications, so that we don't have to restart all our applications whenever we add a new node to the cluster?
The Transport Client is itself a participant in the ES cluster. You can consider setting "client.transport.sniff" to true in the Transport Client, which will detect new nodes in the cluster.
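A minimal sketch of enabling sniffing, assuming the 5.6+ Transport Client API (the cluster name and seed host are placeholders):

import java.net.InetAddress;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.TransportAddress;
import org.elasticsearch.transport.client.PreBuiltTransportClient;

public class SniffingClientExample {
    public static void main(String[] args) throws Exception {
        Settings settings = Settings.builder()
                .put("cluster.name", "my-cluster")
                .put("client.transport.sniff", true) // sample cluster state and add the other nodes
                .build();

        // Only a seed address is needed; sniffing discovers the rest of the cluster,
        // including nodes added later, so no application restart is required.
        TransportClient client = new PreBuiltTransportClient(settings)
                .addTransportAddress(new TransportAddress(InetAddress.getByName("host1"), 9300));

        // ... use the client ...
        client.close();
    }
}

With sniffing on, the client re-samples the cluster state periodically, so a newly added ninth node would be picked up automatically.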
I am trying to adopt a service discovery mechanism for my system. I have a bunch of nodes that communicate with each other via gRPC. Because in some frameworks, such as Mesos, a node that is brought back up after a failure may have a different IP address and a different port, I am thinking of using service discovery so that each node can have a cluster config that is agnostic to node failures.
My current options are DNS or a strongly consistent key-value store such as etcd or ZooKeeper. My problem is understanding how the cached name mappings in healthy nodes get invalidated and updated when a node goes down and comes back up.
The possible ways I can think of are:
1. When healthy nodes detect a connection problem, they invalidate their cache entry immediately and keep polling the DNS registry until the node is reachable again.
2. When a node goes down and comes back up, the DNS registry broadcasts the event to all healthy nodes. This seems to require heartbeats from the DNS registry.
3. The cache entry in each node has a TTL, and within that TTL interval each node has to live with the node failure until the entry expires and is pulled from the DNS registry again.
My question is: which option (you can name more) is used in practice, and why is it better than the alternatives?
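To make options 1 and 3 concrete, here is a minimal sketch of a discovery cache combining them (the Resolver interface and all names are hypothetical stand-ins for whatever registry client you use):

import java.net.InetSocketAddress;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class DiscoveryCache {
    // Stand-in for a DNS / etcd / ZooKeeper lookup.
    public interface Resolver {
        InetSocketAddress resolve(String serviceName);
    }

    private record Entry(InetSocketAddress address, long expiresAtMillis) {}

    private final Resolver resolver;
    private final long ttlMillis;
    private final Map<String, Entry> cache = new ConcurrentHashMap<>();

    public DiscoveryCache(Resolver resolver, long ttlMillis) {
        this.resolver = resolver;
        this.ttlMillis = ttlMillis;
    }

    // Option 3: entries expire after a TTL and are re-resolved from the registry.
    public InetSocketAddress lookup(String serviceName) {
        Entry entry = cache.get(serviceName);
        if (entry == null || System.currentTimeMillis() >= entry.expiresAtMillis()) {
            InetSocketAddress address = resolver.resolve(serviceName);
            entry = new Entry(address, System.currentTimeMillis() + ttlMillis);
            cache.put(serviceName, entry);
        }
        return entry.address();
    }

    // Option 1: callers invoke this on a connection error, so the next lookup
    // re-resolves immediately instead of waiting out the TTL.
    public void invalidate(String serviceName) {
        cache.remove(serviceName);
    }
}

In practice the combination is common: a TTL bounds staleness in the steady state, while invalidate-on-error avoids hammering a dead address for the rest of the TTL window.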
I've been trying to use the lovely ansible-elasticsearch project to set up a nine-node Elasticsearch cluster.
Each node is up and running... but they are not communicating with each other. The master nodes think there are zero data nodes, and the data nodes are not connecting to the master nodes.
They all have the same cluster.name. I have tried with multicast enabled (discovery.zen.ping.multicast.enabled: true) and disabled (the previous setting set to false, plus discovery.zen.ping.unicast.hosts: ["host1","host2",..."host9"]), but in either case the nodes are not communicating.
They have network connectivity to one another - verified via telnet over port 9300.
Sample output:
$ curl host1:9200/_cluster/health
{"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":"waited for [30s]"}],"type":"master_not_discovered_exception","reason":"waited for [30s]"},"status":503}
I cannot think of any more reasons why they wouldn't connect; I'm looking for any more ideas of what to try.
Edit: I finally resolved this issue. The settings that worked were publish_host set to "_non_loopback:ipv4_" and unicast discovery with discovery.zen.ping.unicast.hosts set to ["host1:9300","host2:9300","host3:9300"], listing only the dedicated master nodes. I have a minimum master node count of 2.
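For reference, that resolution corresponds roughly to the following elasticsearch.yml snippet (1.x-era settings; hostnames are placeholders):

network.publish_host: _non_loopback:ipv4_
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["host1:9300", "host2:9300", "host3:9300"]
discovery.zen.minimum_master_nodes: 2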
The only reasons I can think of that could cause that behavior are:
Connectivity issues - Ping is not a good tool to check that nodes can connect to each other. Use telnet and try connecting from host1 to host2 on port 9300.
Your elasticsearch.yml is set to bind 127.0.0.1 or the wrong host. If you're not sure, bind 0.0.0.0 to see whether that solves your connectivity issues; afterwards it's important to bind only internal hosts, to avoid exposing Elasticsearch directly to the internet.
Your publish_host is incorrect. This usually happens when you run ES inside a Docker container, for example; you need to make sure that publish_host is set to an address that other hosts can reach.
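For example (addresses are hypothetical), in elasticsearch.yml:

network.bind_host: 0.0.0.0      # listen on all interfaces inside the container
network.publish_host: 10.0.0.5  # the address other nodes use to reach this one

The bind address controls where the node listens; the publish address is what it advertises to the rest of the cluster, so it must be routable from the other hosts.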