HBase region server is not able to communicate with HMaster - hadoop

I am not able to set up HBase in distributed mode. It works fine when I set it up on one machine (standalone mode). My ZooKeeper, HMaster and region server start properly.
But when I go to the HBase shell and check the status, it shows 0 region servers. I am attaching my region server logs, plus the hosts files of my master (namenode) and slave (datanode). I have tried every permutation and combination given on Stack Overflow for changing the hosts file, but nothing worked for me.
2013-06-24 15:03:45,844 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server naresh-pc/192.168.0.108:2181. Will not attempt to authenticate using SASL (unknown error)
2013-06-24 15:03:45,845 WARN org.apache.zookeeper.ClientCnxn: Session 0x13f75807d960001 for server null, unexpected error, closing socket connection and attempting to reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:692)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
Slave /etc/hosts:
127.0.0.1 localhost
127.0.1.1 ubuntu-pc
#ip for hadoop
192.168.0.108 master
192.168.0.126 slave
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
Master /etc/hosts:
127.0.0.1 localhost
127.0.1.1 naresh-pc
#ip for hadoop
192.168.0.108 master
192.168.0.126 slave
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
hbase-site.xml:
<configuration>
<property>
<name>hbase.master</name>
<value>master:60000</value>
<description>The host and port that the HBase master runs at.
A value of 'local' runs the master and a regionserver
in a single process.
</description>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://master:54310/hbase</value>
<description>The directory shared by region servers.</description>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
<description>The mode the cluster will be in. Possible values are
false: standalone and pseudo-distributed setups with managed
Zookeeper true: fully-distributed with unmanaged Zookeeper
Quorum (see hbase-env.sh)
</description>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
<description>Property from ZooKeeper's config zoo.cfg.
The port at which the clients will connect.
</description>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>master</value>
<description>Comma separated list of servers in the ZooKeeper Quorum.
For example,
"host1.mydomain.com,host2.mydomain.com".
By default this is set to localhost for local and
pseudo-distributed modes of operation. For a
fully-distributed setup, this should be set to a full
list of ZooKeeper quorum servers. If
HBASE_MANAGES_ZK is set in hbase-env.sh
this is the list of servers which we will start/stop
ZooKeeper on.
</description>
</property>
</configuration>
ZooKeeper log:
2013-06-28 18:22:26,781 WARN org.apache.zookeeper.server.NIOServerCnxn: caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x13f8ac0b91b0002, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:722)
2013-06-28 18:22:26,858 INFO org.apache.zookeeper.server.NIOServerCnxn: Closed socket connection for client /192.168.0.108:57447 which had sessionid 0x13f8ac0b91b0002
2013-06-28 18:25:21,001 INFO org.apache.zookeeper.server.ZooKeeperServer: Expiring session 0x13f8ac0b91b0002, timeout of 180000ms exceeded
2013-06-28 18:25:21,002 INFO org.apache.zookeeper.server.PrepRequestProcessor: Processed session termination for sessionid: 0x13f8ac0b91b0002
Master Log:
2013-06-28 18:22:41,932 INFO org.apache.hadoop.hbase.master.ServerManager: Waiting for region servers count to settle; currently checked in 0, slept for 1502022 ms
2013-06-28 18:22:43,457 INFO org.apache.hadoop.hbase.master.ServerManager: Waiting for region servers count to settle; currently checked in 0, slept for 1503547 ms

Remove 127.0.1.1 from the hosts file and turn off IPv6. That should fix the problem.
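For example, a one-liner to comment that line out in place (a sketch; back up the file first):
sudo sed -i 's/^127\.0\.1\.1/#127.0.1.1/' /etc/hosts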

Your RegionServer is looking for HMaster at naresh-pc, but you do not have any such entry in your /etc/hosts file. Please make sure your configuration is proper.
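For example (an assumption based on the hosts files above, where 192.168.0.108 is the master's address), appending the master's hostname to that entry on both nodes would let the region server resolve it:
192.168.0.108 master naresh-pc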

Can you try all of this:
1. Make sure your /conf/regionservers file has just one entry: slave
2. Not sure what HBase version you are using, but instead of using port 54310 for the hbase.rootdir property in your hbase-site.xml, use port 9000.
3. Your /etc/hosts file, on BOTH master and slave, should only have these custom entries:
127.0.0.1 localhost
192.168.0.108 master
192.168.0.126 slave
I am concerned that your logs state: Opening socket connection to server naresh-pc/192.168.0.108:2181
Clearly the system thinks that ZooKeeper is on host naresh-pc, but in your config you are setting the ZooKeeper quorum at host master, which HBase will bind to. That's a problem right there. In my experience, HBase is EXTREMELY fussy about hostnames, so make sure they are all in sync in all your configs and in your /etc/hosts file.
Also, this may be a minor issue, but it wouldn't hurt to specify the ZooKeeper data directory in your .xml file, to have the minimum set of settings that should make the cluster work: hbase.zookeeper.property.dataDir
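A minimal sketch of that property (the path /var/zookeeper is an assumed placeholder; use any directory writable by the HBase user):
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/var/zookeeper</value>
</property>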

Related

Hadoop installation: Namenode cannot be started

Currently I am trying to install hadoop-2.6.0 on my Ubuntu 14.10 (32-bit Utopic). I followed the instructions from here:
http://www.itzgeek.com/how-tos/linux/ubuntu-how-tos/install-apache-hadoop-ubuntu-14-10-centos-7-single-node-cluster.html#axzz3X2DuWaxQ
However, the namenode cannot be started when I try to format it.
This is what I keep receiving when I try to run hdfs namenode -format or hadoop namenode -format:
15/04/11 16:32:13 FATAL namenode.NameNode: Failed to start namenode
java.lang.IllegalArgumentException: URI has an authority component
at java.io.File.<init>(File.java:423)
at org.apache.hadoop.hdfs.server.namenode.NNStorage.getStorageDirectory(NNStorage.java:329)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:270)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournalsForWrite(FSEditLog.java:241)
at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:935)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1379)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1504)
15/04/11 16:32:13 INFO util.ExitUtil: Exiting with status 1
15/04/11 16:32:14 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ThinkPad-Edge-E540/127.0.1.1
************************************************************/
I am new to Linux and Hadoop. Please help me out with this issue. Also, when I first tried to install Hadoop, I was receiving an error message like this:
java.net.ConnectException: Call From ThinkPad-Edge-E540/127.0.1.1 to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
Later, I uninstalled Hadoop 2.6.0 and now I'm trying to follow the current instructions as shown in the above link.
Update
I have removed all the previously installed Java (jdk1.7.0) from the earlier attempt. But the error message is still there.
Update
This is what is showing in my /etc/hosts:
127.0.0.1 localhost
127.0.1.1 myname-mycomputer (I have commented out this line per suggestion)
#The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
This problem arises when, by mistake, you specify the wrong paths for the namenode and datanode in hdfs-site.xml and the wrong tmp dir path in core-site.xml.
Paths should be well formatted, for example:
<property>
<name>dfs.namenode.edits.dir</name>
<value>file:///home/hadoop/hadoop-content/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///home/hadoop/hadoop-content/hdfs/datanode</value>
</property>
and for the temp dir in core-site.xml it is like:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/hadoop-content/tmp</value>
</property>
</configuration>
Sometimes we make a mistake in specifying file:///.
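After correcting the paths, re-running the format from the question should succeed (assuming the Hadoop binaries are on the PATH):
hdfs namenode -format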
In /etc/hosts:
1. Add this line:
your-ip-address your-host-name
example: 192.168.1.8 master
2. Delete the line with 127.0.1.1 (it causes a loopback problem)
3. In your core-site.xml, change localhost to your IP or your hostname
Now, restart the cluster.
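A minimal restart sequence would be something like this (a sketch, assuming the standard Hadoop 2.x sbin scripts are on the PATH):
stop-dfs.sh
start-dfs.sh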

Hadoop slave cannot connect to master, even when service is running and ports are open

I'm running Hadoop 2.5.1 and I'm having a problem when slaves connect to the master. My goal is to set up a Hadoop cluster. I hope someone can help; I've been pondering this too long already! :)
This is what comes up to the log file of slave:
2014-10-18 22:14:07,368 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Problem connecting to server: master/192.168.0.104:8020
This is my core-site.xml file (same on master and slave):
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://master/</value>
</property>
</configuration>
This is my hosts file ((almost) the same on master and slave). I have hard-coded the addresses there without any success:
127.0.0.1 localhost
192.168.0.104 xubuntu: xubuntu
192.168.0.104 master
192.168.0.194 slave
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
Netstat from master:
xubuntu#xubuntu:/usr/local/hadoop/logs$ netstat -atnp | grep 8020
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp 0 0 192.168.0.104:8020 0.0.0.0:* LISTEN 26917/java
tcp 0 0 192.168.0.104:52114 192.168.0.104:8020 ESTABLISHED 27046/java
tcp 0 0 192.168.0.104:8020 192.168.0.104:52114 ESTABLISHED 26917/java
Nmap from master to master:
Starting Nmap 6.40 ( http://nmap.org ) at 2014-10-18 22:36 EEST
Nmap scan report for master (192.168.0.104)
Host is up (0.000072s latency).
rDNS record for 192.168.0.104: xubuntu:
PORT STATE SERVICE
8020/tcp open unknown
...and nmap from slave to master (even though the port is open, the slave doesn't connect to it):
ubuntu#ubuntu:/usr/local/hadoop/logs$ nmap master -p 8020
Starting Nmap 6.40 ( http://nmap.org ) at 2014-10-18 22:35 EEST
Nmap scan report for master (192.168.0.104)
Host is up (0.14s latency).
PORT STATE SERVICE
8020/tcp open unknown
What is this all about? The problem is not the firewall. I have also read every thread there is on this, without any success. I'm frustrated by this. :(
At least one of your problems is that you are using the old configuration name for HDFS. For version 2.5.1 the configuration name should be fs.defaultFS instead of fs.default.name. I also suggest defining the port in the value, so the value would be hdfs://master:8020.
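In other words, the corrected core-site.xml would look something like this (a sketch based on the suggestion above):
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:8020</value>
</property>
</configuration>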
Sorry, I'm not a Linux guru, so I don't know about nmap, but does telnetting from the slave to the master on that port work?
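For example, from the slave (a successful connection rules out a pure network problem):
telnet master 8020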

Fully Distributed HBase Error

I'm trying to set up HBase 0.96 to run on top of my Hadoop 2.2.0 cluster. I run start-hbase.sh and the master, along with the region servers, starts up. I can log into each region server and see the processes running. However, when I check how many region servers are up, either through the web UI or a shell command, I get a response of 0. Based on the logs, it looks like the region servers are starting up but are unable to notify the master that they are running. I confirmed that the master is listening on port 60000, and ports 60000 and 60020 are both open. I've included my hbase-site file along with the logs from a region server.
<property>
<name>hbase.rootdir</name>
<value>hdfs://master:9000/hbase</value>
<description>The directory shared by RegionServers.
</description>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
<description>The mode the cluster will be in. Possible values are
false: standalone and pseudo-distributed setups with managed Zookeeper
true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)
</description>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>master</value>
</property>
<property>
<name>zookeeper.znode.parent</name>
<value>/master</value>
</property>
Log File:
2013-11-08 20:08:58,357 INFO [regionserver60020] regionserver.HRegionServer: reportForDuty to master=10.119.102.58,60000,1383941300240 with port=60020, startcode=1383941300420
2013-11-08 20:09:18,636 WARN [regionserver60020] regionserver.HRegionServer: error telling master we are up
com.google.protobuf.ServiceException: org.apache.hadoop.net.ConnectTimeoutException: 20000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connec$
at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1667)
at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1708)
at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$BlockingStub.regionServerStartup(RegionServerStatusProtos.java:5402)
at org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:1924)
at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:790)
at java.lang.Thread.run(Thread.java:724)
Caused by: org.apache.hadoop.net.ConnectTimeoutException: 20000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending local=/100.65.$
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:532)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:573)
at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:858)
at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1532)
at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1421)
at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1650)
... 5 more
2013-11-08 20:09:18,676 WARN [regionserver60020] regionserver.HRegionServer: reportForDuty failed; sleeping and then retrying.
I don't think hbase.zookeeper.quorum is set correctly, which may cause the connection timeout. If you just want to test 0.96, start it in standalone mode, and then make sure the ZooKeeper cluster is running before you change to distributed mode.
The HRegionServer complains that it can't connect to the HMaster in order to report that it is up.
It's probable that the HMaster process is not running, so you may want to start it; or, if you already started it, check the master log file.
Check that your master server is listening on port 60000 by using the following command:
netstat -l
tcp6 0 0 Vostro-350:60000 :::* LISTEN
If the server is listening on IPv6, then disable it.
To disable it, append the following to /etc/sysctl.conf:
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
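To apply these settings without waiting for a reboot, reloading sysctl should also work (assuming a stock Ubuntu/Debian-like setup):
sudo sysctl -p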
After a reboot, you should validate that IPv6 is really off with:
cat /proc/sys/net/ipv6/conf/all/disable_ipv6
(0 = IPv6 on; 1 = IPv6 off)
Ref link: Connecting and Persisting to HBase

hbase cannot connect to hadoop

I am running an HDFS instance in pseudo-distributed mode, and tried to get another HBase instance connected to it on the same server. The Hadoop logs are fine, but I constantly get this connection failure in HBase's log:
==================================================================================
2012-05-01 10:49:07,212 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181
2012-05-01 10:49:07,213 WARN org.apache.zookeeper.ClientCnxn: Session 0x13708dc552d0001 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
2012-05-01 10:49:08,882 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181
2012-05-01 10:49:08,882 WARN org.apache.zookeeper.ClientCnxn: Session 0x13708dc552d0001 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
==================================================================================
Configuration of core-site.xml (Hadoop):
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
Configuration of hbase-site.xml (HBase):
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:9000/hbase</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
I also tried to replace localhost with the actual IP of the server, but got the same error.
First, you need to make sure your HBase master node is running; you can use jps to check.
If it is not running, you can start it with the start-hbase.sh command or hbase master start.
And then check its status with other commands, like netstat -an | grep 9000.
Second, if the previous method does not work, check your firewall configuration, such as iptables and SELinux.
Use sudo iptables -L to check your iptables configuration. You can disable iptables with the sudo service iptables stop command on Red Hat-based Linux systems.
Use getenforce to check if SELinux is in enforcing mode.
Third, check the system configuration, for example ssh, etc.
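For reference, on a healthy pseudo-distributed node the jps check from the first step typically lists something like the following (the PIDs are illustrative, and HQuorumPeer only appears when HBase manages ZooKeeper):
12345 NameNode
12421 DataNode
12577 SecondaryNameNode
13001 HMaster
13105 HRegionServer
13210 HQuorumPeer
13342 Jps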
You need to replace the core Hadoop jar in $HBASE_HOME/lib (hadoop-{{version}}-core.jar) with the one from $HADOOP_HOME (hadoop-{{version}}-core.jar).
I was running into the same problem when I tried to install HBase 0.92 coming from HBase 0.90-xxx, which was working fine: I copied hbase-env.sh and hbase-site.xml from the old HBase to the new one but forgot to copy the Hadoop core jar.
I'm always suspicious when I see localhost in a config. Also, when you use localhost, it becomes very difficult (to impossible) to access any services of the pseudo-distributed system from a host other than the one you are running on.
You did say you tried the IP address, but you might want to make sure it's the IP address that the node really thinks it's at.
Check the ZooKeeper logs from when it starts up and see what IP address it "thinks" it's at. There should be a line like:
2012-01-31 09:32:46,083 - INFO [main:Environment#97] - Server environment:host.name=ip-10-8-127-58.ec2.internal
Then use the value of host.name as the value in ALL places in ALL Hadoop, HBase, Hive, ZooKeeper, etc. config files that need a hostname (assuming they are all on the same machine, as you said you are in pseudo-distributed mode).
Also, you did not show your hbase.zookeeper.quorum setting in hbase-site.xml. That is where HBase gets its knowledge of ZooKeeper's address:
<property>
<name>hbase.zookeeper.quorum</name>
<value>ip-10-8-127-58.ec2.internal</value>
</property>
I think HBase cannot find the ZooKeeper quorum; you have to set the hbase.zookeeper.quorum property in hbase-site.xml. Also check whether the classpath is set properly; check this doc out: http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/package-summary.html#classpath
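As a quick sanity check, the hbase launcher script can print the classpath it will actually use:
hbase classpath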

Hbase: How to specify hostname for Hbase master

I'm struggling to set up an HBase distributed cluster with 2 nodes: one is my machine and one is a VM, using the "host-only" adapter in VirtualBox.
My problem is that the region server (on the VM) can't connect to the HBase master running on the host machine. Although in the HBase shell I can list, create tables, etc., on the region server on the VM ('slave') the log always shows:
org.apache.hadoop.hbase.regionserver.HRegionServer: Unable to connect to master. Retrying. Error was:
java.net.ConnectException: Connection refused
Previously, I successfully set up Hadoop, HDFS and MapReduce on this cluster with 2 nodes named 'master' and 'slave': 'master' is the master node, and both 'master' and 'slave' work as slave nodes. These names are bound to the vboxnet0 interface of VirtualBox (the hostnames in /etc/hostname are different). I also specified the "slave.host.name" property for each node as 'master' and 'slave'.
It seems that the HBase master on 'master' always runs with the 'localhost' hostname; from the slave machine, I can't telnet to the HBase master with the 'master' hostname. So is there any way to specify the hostname for the HBase master as 'master'? I've tried specifying some DNS-interface properties for ZooKeeper, Master, and RegionServer to use the internal interface between master and slave, but it still does not work at all.
/etc/hosts for both is something like:
127.0.0.1 localhost
127.0.0.1 ubuntu.mymachine
# For Hadoop
192.168.56.1 master
192.168.56.101 slave
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
The answer that @Infinity provided seems to apply to version ~0.9.4.
For version 1.1.4, according to the source code of org.apache.hadoop.hbase.master.HMaster, the configuration should be:
<property>
<name>hbase.master.hostname</name>
<value>master.local</value>
<!-- master.local is the DNS name in my network pointing to hbase master -->
</property>
After setting this value, region servers are able to connect to the HBase master; however, in my environment, the region server complained about:
com.google.protobuf.ServiceException: java.net.SocketException: Invalid argument
The problem disappeared after I installed Oracle JDK 8 instead of OpenJDK 7 on all of my nodes.
So in conclusion, here is my solution:
1. Use a DNS name server instead of setting /etc/hosts, as HBase is very picky about hostnames and seems to require DNS lookup as well as reverse DNS lookup.
2. Upgrade the JDK to Oracle JDK 8.
3. Use the setting item mentioned above.
My hosts file is like:
127.0.0.1 localhost
192.168.2.118 shashwat.machine.com shashwat
Make your hosts file as follows:
127.0.0.1 localhost
# For Hadoop
192.168.56.1 master
192.168.56.101 slave
and in the HBase conf put the following entries:
<property>
<name>hbase.rootdir</name>
<value>hdfs://master:9000/hbase</value>
</property>
<property>
<name>hbase.master</name>
<value>master:60000</value>
<description>The host and port that the HBase master runs at.</description>
</property>
<property>
<name>hbase.regionserver.port</name>
<value>60020</value>
<description>The port the HBase RegionServer binds to.</description>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.tmp.dir</name>
<value>/home/cluster/Hadoop/hbase-0.90.4/temp</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>master</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
<description>Property from ZooKeeper's config zoo.cfg.
The port at which the clients will connect.
</description>
</property>
If you are using localhost anywhere, remove that and replace it with "master", which is the name for the namenode in your hosts file.
One more thing you can do:
sudo gedit /etc/hostname
This will open the hostname file; by default "ubuntu" will be there, so change it to master and restart your system.
For HBase, in the "regionservers" file inside the conf dir, put these entries:
master
slave
and restart everything.
There are two things that fix this class of problem for me:
1) Remove all "localhost" names; only have 127.0.0.1 pointing to the name of the hmaster node.
2) Run "hostname X" on your HBase master node, to make sure the hostname matches what is in /etc/hosts.
Not being a networking expert, I can't say why this is important, but it is :)
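A minimal sketch of that check (assuming 'master' is the desired name, as in the configs above):
hostname                  # should print the name used in your configs
sudo hostname master      # set it for the current session if it does not
grep master /etc/hosts    # confirm the name maps to the node's real IP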
Most of the time, the error comes from ZooKeeper sending a wrong hostname.
You can check what ZooKeeper sends as the HBase master host. Find the ZooKeeper bin folder and run:
bin/zkCli.sh -server 127.0.0.1:2181
get /hbase/master
This should give you the HBase master address that ZooKeeper answers with, so this address must be accessible.
