I want to replace my current 3 ZooKeeper servers with 3 new ZooKeeper servers. I have:
added the new ZooKeeper servers to Ambari,
added the new ZooKeeper servers to these properties:
hbase.zookeeper.quorum
ha.zookeeper.quorum
zookeeper.connect
hadoop.registry.zk.quorum
yarn.resourcemanager.zk-address
I restarted the services, including the ResourceManager, but I still can't connect to any new ZooKeeper server when I turn off all the old ZooKeeper servers.
zookeeper-client -server zoo-new1
I get the following error:
"Unable to read additional data from server sessionid 0x0, likely server has closed socket"
And on the new ZooKeeper server, in the logs (zookeeper.out):
"Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running"
When I start one of the old ZooKeeper servers, everything works again, and I can also connect to the new ZooKeeper servers.
My best guess is that this has to do with one of the most important mechanisms in ZooKeeper, namely leader election. If you start with a ZooKeeper ensemble of 3 servers and add 3 more servers to it, you end up with an ensemble of 6, and at least 4 servers must be running for the quorum to be available. When a ZooKeeper node is unable to take part in electing a leader, it will look as if it's down.
This is also the reason your setup works when you start one of the old ZooKeepers: 4 of the 6 possible servers are then alive. If you want the new setup to work, you need to remove the old servers from the configuration so that the ensemble only knows about the three new ones (see the sketch below). Simply shutting a ZooKeeper server down will not remove it from the quorum.
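As a minimal sketch, after the old servers are retired, the zoo.cfg on each of the three new servers would list only the new ensemble members (zoo-new2 and zoo-new3 are assumed hostnames by analogy with the zoo-new1 in the question; each server also needs a matching myid file), followed by a restart of each ZooKeeper:

# zoo.cfg excerpt on each new server -- hostnames are placeholders
server.1=zoo-new1:2888:3888
server.2=zoo-new2:2888:3888
server.3=zoo-new3:2888:3888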
I am seeing some errors in my NiFi cluster. I have a 3-node secured NiFi cluster, and I am seeing the errors below on 2 of the nodes:
ERROR [main] org.apache.nifi.web.server.JettyServer Unable to load flow due to:
java.io.IOException: org.apache.nifi.cluster.ConnectionException:
Failed to connect node to cluster due to: java.io.IOException:
Could not begin listening for incoming connections in order to load balance data across the cluster.
Please verify the values of the 'nifi.cluster.load.balance.port' and 'nifi.cluster.load.balance.host'
properties as well as the 'nifi.security.*' properties
See the clustering configuration guide for the list of clustering options you have to configure. For load balancing, you'll need to specify ports that are open in your firewall so that the nodes can communicate. You'll also need to make sure that each host has its node hostname property set and its host ports set, and that there are no firewall restrictions between the nodes and your Apache ZooKeeper cluster (see the property sketch below).
If you want to simplify the setup to play around, you can use the information in the clustering configuration section of the admin guide to set up an embedded ZooKeeper node within each NiFi instance. However, I would recommend setting up an external ZooKeeper cluster. A little more work, but ultimately worth it.
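As a rough illustration only (the hostnames and some port values below are assumptions, not taken from the question), the relevant clustering and load-balancing entries in nifi.properties for one node could look like this:

# nifi.properties excerpt on one node -- illustrative values
nifi.cluster.is.node=true
nifi.cluster.node.address=nifi-node1.example.com
nifi.cluster.node.protocol.port=11443
nifi.cluster.load.balance.host=nifi-node1.example.com
nifi.cluster.load.balance.port=6342
nifi.zookeeper.connect.string=zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181

Each node gets its own hostname in the node.address and load.balance.host properties, and the listed ports must be reachable between all nodes.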
Consider a Redis Sentinel setup with 5 machines. Each machine runs a Sentinel process (s1, s2, s3, s4, s5) and a Redis instance (r1, r2, r3, r4, r5). One is the master (r1) and the others are slaves (r2...r5). During a failover of master r1, the slaveof setting in the Redis configuration of the slaves must be overridden to point at the new master, say r3.
Who overrides the Redis configuration of the remaining slaves (r2, r4, r5)? Does the Sentinel elected to handle the failover (assuming s2 is the elected Sentinel) override the Redis configuration at r2, r4, and r5, or does the Sentinel running on each respective machine override its local Redis configuration (sn overrides the configuration of rn)?
The elected Sentinel updates the configuration. This is the full list of Sentinel capabilities at a high level:
Monitoring: Sentinel constantly checks if your master and slave instances are working as expected.
Notification: Sentinel can notify the system administrator, or another computer program, via an API, that something is wrong with one of the monitored Redis instances.
Automatic failover: If a master is not working as expected, Sentinel can start a failover process where a slave is promoted to master, the other slaves are reconfigured to use the new master, and the applications using the Redis server are informed about the new address to use when connecting.
Configuration provider: Sentinel acts as a source of authority for client service discovery: clients connect to Sentinels in order to ask for the address of the current Redis master responsible for a given service. If a failover occurs, Sentinels will report the new address (see the example after this list).
For more details, refer to the Redis Sentinel docs.
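To illustrate the configuration-provider role, any Sentinel can be asked for the current master address from the command line; the Sentinel port 26379 and the master name mymaster below are placeholder values, and the reply shown is only an example:

redis-cli -p 26379 SENTINEL get-master-addr-by-name mymaster
1) "10.0.0.3"
2) "6379"

After a failover, the same query returns the address of the newly promoted master.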
We have an 8-node cluster. Our applications point to one node in this cluster using the Transport Client. The issue here is that if that node is down, the applications won't work. We've resolved this by adding the other 7 nodes' IPs to the TransportClient object.
My question is: do we have any concept like a global node that internally connects to the cluster, to which I can point our applications, so that we don't have to restart all our applications whenever we add a new node to the cluster?
The Transport Client itself is a participant in the ES cluster. You can consider setting "client.transport.sniff" to true in the Transport Client, which will detect new nodes in the cluster.
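As a hedged sketch (the cluster name, node hostname, and the 6.x-style PreBuiltTransportClient API below are assumptions, since the question does not state the Elasticsearch version), enabling sniffing looks roughly like this:

import java.net.InetAddress;

import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.TransportAddress;
import org.elasticsearch.transport.client.PreBuiltTransportClient;

public class SniffingClientExample {
    public static void main(String[] args) throws Exception {
        // Sniffing lets the client discover the other data nodes in the cluster on its own.
        Settings settings = Settings.builder()
                .put("cluster.name", "my_cluster")       // assumed cluster name
                .put("client.transport.sniff", true)
                .build();

        // Seed the client with a single known node; sniffing adds the rest.
        try (TransportClient client = new PreBuiltTransportClient(settings)
                .addTransportAddress(new TransportAddress(
                        InetAddress.getByName("es-node1"), 9300))) {
            System.out.println(client.connectedNodes());
        }
    }
}

With sniffing enabled, newly added data nodes are picked up automatically, so the application does not need a restart just because the cluster grew.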
I'm new to Cassandra, and I'm trying to set up a simple 2-node cluster on two test EC2 Ubuntu instances, but replication is not working: nodetool ring doesn't show both instances. What could I be doing wrong?
I'm using Cassandra version 2.0.11.
Here's what my config looks like on both machines:
listen_address: <private_ip>
rpc_address: <private_ip>
broadcast_address: <public_ip>
seeds: <private_ip_of_other_machine>
endpoint_snitch: Ec2Snitch
I have configured the EC2 security group to allow all traffic on all ports between these instances. What am I doing wrong here? I can provide the Cassandra logs if required.
Thank you.
EDIT: The error I'm currently getting is this:
java.lang.RuntimeException: Unable to gossip with any seeds
at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1340)
at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:543)
at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:766)
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:693)
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:585)
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:300)
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:516)
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:625)
ERROR 15:08:03 Exception encountered during startup
java.lang.RuntimeException: Unable to gossip with any seeds
at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1340) ~[apache-cassandra-2.2.5.jar:2.2.5]
at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:543) ~[apache-cassandra-2.2.5.jar:2.2.5]
at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:766) ~[apache-cassandra-2.2.5.jar:2.2.5]
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:693) ~[apache-cassandra-2.2.5.jar:2.2.5]
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:585) ~[apache-cassandra-2.2.5.jar:2.2.5]
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:300) [apache-cassandra-2.2.5.jar:2.2.5]
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:516) [apache-cassandra-2.2.5.jar:2.2.5]
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:625) [apache-cassandra-2.2.5.jar:2.2.5]
WARN 15:08:03 No local state or state is in silent shutdown, not announcing shutdown
The first thing I see is that your seeds: list is wrong: each node is pointing at the other node as its seed. Both nodes should have the same seeds: list. For a simple 2-node test setup, you only need 1 seed (pick either node). If the nodes are in the same AZ, you can use the private IP.
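A minimal sketch, assuming node1's private IP is 172.31.0.10 (a placeholder value), of the seed section in cassandra.yaml, which would then be identical on both machines:

# cassandra.yaml excerpt -- identical on both nodes
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "172.31.0.10"    # node1's private IP, used as the single seed on both nodes

The other settings (listen_address, rpc_address, broadcast_address, endpoint_snitch) stay per-node as in your question.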
I am setting up my first Redis framework, and so far I have the following:
Server1:
- Redis master
- 3 Redis Sentinels (quorum set to 2)
Server2:
- Redis slave
- 3 Redis Sentinels (quorum set to 2)
The master and slave appear to be working properly and data is syncing from the master to the slave. When I install and start the Sentinels, they too seem to run OK, in that if I connect to any of them and run sentinel masters, it shows the Sentinel pointed at my Redis master along with the various properties.
However, the actual failover doesn't seem to work. For example, if I connect to my Redis master and run debug segfault to get it to fail, the failover to the slave does not occur. None of the sentinels log anything so it appears they are not actually connected. Here is the configuration for my sentinels:
port 26381
sentinel monitor redismaster ServerName 26380 2
sentinel down-after-milliseconds redismaster 10000
sentinel failover-timeout redismaster 180000
sentinel parallel-syncs redismaster 1
logfile "nodes/sentinel1/sentinel.log"
As you can see, this sentinel runs on 26381 (and subsequent sentinels run on 26382 and 26383). My Redis master runs on 26380. All of the ports are open, names/IPs resolve correctly, etc., so I don't think it is an infrastructure issue. In case it is useful, I am running Redis (2.8.17) which I downloaded from the MS Open Tech page.
Does anyone have any thoughts on what might be the problem, or suggestions on how to troubleshoot? I am having a hard time finding accurate documentation for setting up an H.A. instance of Redis on Windows, so any commands useful for troubleshooting these types of issues would be greatly appreciated.
I figured this out. One thing I neglected to mention in my question is that I have the masterauth configuration specified in my Redis master config file, so my clients have to provide a password to connect. I missed this in my sentinel configuration, and did not provide a password. The sentinel logging does not indicate this, so it was not obvious to me. Once I added this:
sentinel auth-pass redismaster <myPassword>
to my Sentinel configuration file, everything started working as it should.
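For reference, a sketch of the corrected Sentinel configuration, combining the settings from the question with the auth-pass line (the password is a placeholder):

port 26381
sentinel monitor redismaster ServerName 26380 2
sentinel auth-pass redismaster <myPassword>
sentinel down-after-milliseconds redismaster 10000
sentinel failover-timeout redismaster 180000
sentinel parallel-syncs redismaster 1
logfile "nodes/sentinel1/sentinel.log"

The same auth-pass line needs to go into each of the Sentinel configuration files.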