I followed this article:
https://medium.com/@vipin.pratap18/rabbitmq-cluster-on-aws-ec2-with-high-availability-1bcd3f8a6404
But while doing sudo rabbitmqctl join_cluster rabbit@ I am facing this issue:
DIAGNOSTICS
attempted to contact: [rabbit@rabbitmqnode1]
rabbit@rabbitmqnode1:
connected to epmd (port 4369) on rabbitmqnode1
epmd reports node 'rabbit' uses port 25672 for inter-node and CLI tool traffic
TCP connection succeeded but Erlang distribution failed
Node name (or hostname) mismatch: node "rabbit@ip-xxxxx" believes its node name is not "rabbit@ip-xxxxx" but something else.
All nodes and CLI tools must refer to node "rabbit@ip-10-0-1-122" using the same name the node itself uses (see its logs to find out what it is)
I solved it by following the diagnostics output itself.
It clearly says: CLI tools must refer to node "rabbit@ip-10-0-x-xxx" using the same name the node itself uses.
While passing the cluster node name, I was doing it wrong.
NOTE: I took the hostname from cat /etc/hosts.
I did it like below:
sudo rabbitmqctl join_cluster rabbit@ip-10-0-x-yyy
and it worked, with the success message "Clustering node rabbit@ip-10-0-x-xxx with rabbit@ip-10-0-x-yyy".
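For completeness, a minimal sketch of the usual join sequence on the joining node, using the target node name from above; the application has to be stopped before joining and restarted afterwards:
sudo rabbitmqctl stop_app
sudo rabbitmqctl join_cluster rabbit@ip-10-0-x-yyy
sudo rabbitmqctl start_app
sudo rabbitmqctl cluster_status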
I'm trying to follow the MonetDB docs on Cluster Management to set up a 3-node cluster using 3 CentOS machines. I created the 3 dbfarms using monetdbd create /path/to/mydbfarm. From the first node, I run monetdb discover and it returns nothing, where it should discover the other nodes, and when I try to run monetdb -h [second node IP] -P mypasshphrase status it returns the following error:
status: cannot connect: Connection refused
PS: I have a passwordless connection between these 3 nodes; ssh [any node IP] works just fine.
Thank you
By default, MonetDB listens only for local connections, for security reasons.
To also listen for remote connections, run
monetdbd set listenaddr=0.0.0.0 .../path/to/dbfarm
on each of the nodes and restart monetdbd.
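A sketch of the full sequence on each node, reusing the dbfarm path from the question; the restart is what makes the new listenaddr take effect:
monetdbd set listenaddr=0.0.0.0 /path/to/mydbfarm
monetdbd stop /path/to/mydbfarm
monetdbd start /path/to/mydbfarm
After that, monetdb discover from the first node should list the databases on the other nodes.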
I am trying to configure a two-node (node1 and node2) HA cluster using Pacemaker on CentOS 7. I executed the steps below on both nodes:
yum install pcs
systemctl enable pcsd.service pacemaker.service corosync.service
systemctl start pcsd.service
passwd hacluster
After that, I executed the command below on node1:
pcs cluster auth node1 node2
and I am getting the error below:
Error: Unable to communicate with node2
Error: Unable to communicate with node1
I have also verified that both nodes are listening on port 2224, and used telnet to confirm that each node can connect to the other on port 2224.
Need help.
The issue got resolved after using the FQDNs instead of the short hostnames (node1.demo.in, node2.demo.in). The command below worked fine:
pcs cluster auth node1.demo.in node2.demo.in
I don't know the exact cause of this. Any idea?
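For reference, a sketch of how the bootstrap might continue once auth succeeds, assuming the FQDNs above and a hypothetical cluster name demo_cluster (pcs 0.9 syntax, as shipped with CentOS 7):
pcs cluster auth node1.demo.in node2.demo.in -u hacluster
pcs cluster setup --name demo_cluster node1.demo.in node2.demo.in
pcs cluster start --all
pcs status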
I installed Hadoop (HDP 2.5.3) on 4 VMs with Ambari (1 Ambari Server and 3 Ambari Clients, with the DNS entries server, node0, node1, node2), with HDFS, YARN, MapReduce and ZooKeeper.
However, YARN doesn't want to start. When starting the ResourceManager on node1, I get the following error:
resource_management.core.exceptions.ExecutionFailed: Execution of 'curl -sS -L -w '%{http_code}' -X GET 'http://node0:50070/webhdfs/v1/ats/done/?op=GETFILESTATUS&user.name=hdfs' 1>/tmp/tmpgsiRLj 2>/tmp/tmpMENUFa' returned 7. curl: (7) Failed to connect to node0 port 50070: connection refused 000
The App Timeline Server and History Server on node1 don't want to start either. ZooKeeper, the NameNode, DataNode and NodeManager on node0 are up. The nodes can reach each other (tried with ping), so that shouldn't be the problem.
Hopefully someone can help me. I'm really new to this topic and not really familiar with the system.
You should check the hosts file (/etc/hosts): look at the host names and FQDNs, and check whether there are any duplicate names or IP addresses.
Could you also confirm the firewall status:
sudo ufw status
And also check the ports in iptables (or allow the needed ports, UDP and TCP, in the firewall).
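Since the error is a refused connection to node0:50070 (the default NameNode web UI port in HDP 2.x), a quick round of checks from node1 might look like this; note that on RHEL/CentOS the firewall frontend is usually firewalld rather than ufw:
curl -sS 'http://node0:50070/webhdfs/v1/?op=GETFILESTATUS&user.name=hdfs'
sudo firewall-cmd --list-all
sudo iptables -L -n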
I am using a 3-node cluster: one master node and the rest are slave nodes. My master node became inactive; it isn't responding to ping. How do I make it up and running again? The error it shows is: No Route to Host from Hadoop-slave2/ip address to Hadoop-master:9000 failed on socket timeout exception: java.net.NoRouteToHostException: No route to host
Restart your NameNode system and then try to ping it from a slave node (if possible).
Otherwise, do this and post the results of each command:
ping hostname
ping 127.0.0.1
telnet hostname
telnet hostname port
ps -aux | grep "port-number"
Note: Run all of these on the NameNode system and post the results.
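For example, instantiated with the hostname and port from the question (netstat is an alternative way to see whether the NameNode is actually listening on 9000):
ping Hadoop-master
ping 127.0.0.1
telnet Hadoop-master 9000
sudo netstat -tlnp | grep 9000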
I'm using the Cloudera distribution of Hadoop and recently had to change the IP addresses of a few nodes in the cluster. After the change, on one of the nodes (old IP: 10.88.76.223, new IP: 10.88.69.31), the following error comes up when I try to start the DataNode service:
Initialization failed for block pool Block pool BP-77624948-10.88.65.174-13492342342 (storage id DS-820323624-10.88.76.223-50010-142302323234) service to hadoop-name-node-01/10.88.65.174:6666
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException): Datanode denied communication with namenode: DatanodeRegistration(10.88.69.31, storageID=DS-820323624-10.88.76.223-50010-142302323234, infoPort=50075, ipcPort=50020, storageInfo=lv=-40;cid=cluster25;nsid=1486084428;c=0)
at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:656)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:3593)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:899)
at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:91)
Has anyone had success with changing the IP address of a hadoop data node and join it back to the cluster without data loss?
CHANGE HOST IP IN CLOUDERA MANAGER
Change the host IP on all nodes:
sudo nano /etc/hosts
Edit the IP in the Cloudera agent config.ini on all nodes if the master node IP changed (see the example after these steps):
sudo nano /etc/cloudera-scm-agent/config.ini
Change the IP in the PostgreSQL database.
To find the password, open the Cloudera SCM database properties:
cat /etc/cloudera-scm-server/db.properties
Find the password line, for example:
com.cloudera.cmf.db.password=gUHHwvJdoE
Open PostgreSQL
psql -h localhost -p 7432 -U scm
Query the hosts table in PostgreSQL:
select name,host_id,ip_address from hosts;
Update the IP in the hosts table:
update hosts set ip_address = 'xxx.xxx.xxx.xxx' where host_id=x;
Exit the tool
\q
Restart the agent service on all nodes:
service cloudera-scm-agent restart
Restart the server service on the master node:
service cloudera-scm-server restart
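For the config.ini step above, the line that has to point at the new master IP is server_host; the address below is a hypothetical example:
# /etc/cloudera-scm-agent/config.ini
server_host=10.88.69.40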
It turns out it's better to:
Decommission the server from the cluster to ensure that all blocks are replicated to other nodes in the cluster.
Remove the server from the cluster.
Connect to the server, change the IP address, then restart the Cloudera agent.
Notice that Cloudera Manager now shows two entries for this server. Delete the entry with the old IP and the longest heartbeat time.
Add the server back to the required cluster and add the required roles back to it (e.g. HDFS DataNode, HBase RegionServer, YARN).
HDFS will read all data disks and recognize the block pool and cluster IDs, then register the datanode.
All data will be available and the process will be transparent to any client.
NOTE: If you run into name resolution errors from HDFS clients, the application has likely cached the old IP and will most likely need to be restarted. In particular, Java clients that previously referenced this server (e.g. HBase clients) must be restarted, because the JVM caches IPs indefinitely; until restarted they will keep throwing connectivity errors against the old IP.
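One way to soften that, as a sketch: the JVM's positive DNS cache TTL can be capped via the networkaddress.cache.ttl security property (the path below is the Java 8 layout; 60 seconds is an arbitrary example value):
# $JAVA_HOME/jre/lib/security/java.security
networkaddress.cache.ttl=60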