Getting the error "Datanode denied communication with namenode" while configuring Hadoop 0.23.8

I am trying to configure Hadoop 0.23.8 on my MacBook and am running into the following exception:
org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException: Datanode denied communication with namenode: 192.168.1.13:50010
at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:549)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:2548)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:784)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:394)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1571)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1567)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1262)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1565)
My core-site.xml looks like this
<configuration>
<property>
<name>dfs.federation.nameservices</name>
<value>ns1</value>
</property>
<property>
<name>dfs.namenode.rpc-address.ns1</name>
<value>192.168.1.13:54310</value>
</property>
<property>
<name>dfs.namenode.http-address.ns1</name>
<value>192.168.1.13:50070</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address.ns1</name>
<value>192.168.1.13:50090</value>
</property>
</configuration>
Any ideas on what I may be doing wrong?

Had the same problem with 2.6.0, and shamouda's answer solved it (I was not using dfs.hosts at all, so that could not be the answer). I added
<property>
<name>dfs.namenode.datanode.registration.ip-hostname-check</name>
<value>false</value>
</property>
to hdfs-site.xml and that was enough to fix the issue.

I got the same problem with Hadoop 2.6.0, and the solution in my case was different from Tariq's answer.
I couldn't list the IP-to-host mapping in /etc/hosts because I use DHCP to assign IPs dynamically.
The problem was that my DNS does not allow reverse DNS lookup (i.e. looking up the hostname given the IP), and HDFS by default uses reverse DNS lookup whenever a datanode tries to register with a namenode. Luckily, this behaviour can be disabled by setting the property "dfs.namenode.datanode.registration.ip-hostname-check" to false in hdfs-site.xml.
How do you know whether your DNS allows reverse lookup? On Ubuntu, use the command "host" followed by the IP address. If it can resolve the hostname, then reverse lookup is enabled. If it fails, then reverse lookup is disabled.
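The same reverse-lookup check can be sketched in Python with the standard socket module. This is only an illustration of the idea, not what HDFS does internally; the function name reverse_lookup and its resolver parameter are made up for this example.

```python
import socket


def reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return the hostname the resolver reports for ip, or None on failure.

    The resolver parameter is a hypothetical hook so the check can be
    exercised without a real DNS server; by default the system resolver
    is used.
    """
    try:
        hostname, _aliases, _addresses = resolver(ip)
        return hostname
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError
        return None
```

Calling reverse_lookup("192.168.1.13") on the namenode host returns a name when reverse lookup works and None when it does not (the address is the one from the question's log).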
References:
1. http://rrati.github.io/blog/2014/05/07/apache-hadoop-plus-docker-plus-fedora-running-images/
2. https://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml

Looks like a name resolution issue to me. Possible reasons:
Machine is listed in the file defined by dfs.hosts.exclude
dfs.hosts is used and the machine is not listed within that file
Also make sure you have IP+hostname of the machine listed in your hosts file.
HTH
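The two list-based reasons above can be checked by hand with a small script. This is a rough sketch under stated assumptions: it assumes the files named by dfs.hosts and dfs.hosts.exclude contain whitespace-separated entries with optional # comments, and it only does exact string matching, whereas the NameNode's own matching is more involved. The function names are hypothetical.

```python
def parse_host_list(text):
    """Return the set of entries in a hosts list, ignoring # comments."""
    entries = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        entries.update(line.split())
    return entries


def registration_blockers(node, include_text="", exclude_text=""):
    """List the reasons a node would be refused, based only on the two lists.

    include_text / exclude_text are the contents of the files named by
    dfs.hosts and dfs.hosts.exclude (empty string when the file is unused).
    """
    reasons = []
    include = parse_host_list(include_text)
    exclude = parse_host_list(exclude_text)
    if node in exclude:
        reasons.append("listed in dfs.hosts.exclude file")
    if include and node not in include:
        reasons.append("dfs.hosts is in use but node is not listed")
    return reasons
```

An empty result means neither list explains the rejection, which points back at name resolution.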

I had this problem.
My earlier configuration in core-site.xml looked like this:
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:12345</value>
</property>
Later I replaced localhost with my hostname (the PC name):
<property>
<name>fs.default.name</name>
<value>hdfs://cnu:12345</value>
</property>
It worked for me.

Just for information: I had the same problem and realized that there was a typo in the hostname of my slaves. Conversely, the node itself may have the wrong hostname.

Related

hadoop datanode unable to start. "does not contain a valid host:port authority"

I'm currently using Hadoop 1.2.1 (because I need to run spatial processing software that only supports this version). I'm trying to deploy in multi-node mode with one master and three slaves.
I'm sure I can ssh between the master and all slaves without a password (including each node to itself). The hostname on each node is also correct.
All nodes share the same hosts file:
192.168.56.101 master
192.168.56.102 slave1
192.168.56.103 slave2
192.168.56.104 slave3
I keep having problems on the slave nodes; the error log is as follows:
2015-05-21 23:39:16,841 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.lang.IllegalArgumentException: Does not contain a valid host:port authority: file:///
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:212)
at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:244)
at org.apache.hadoop.hdfs.server.namenode.NameNode.getServiceAddress(NameNode.java:236)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:359)
at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:321)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1712)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1651)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1669)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1795)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:181
Configurations in core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
</property>
</configuration>
In mapred-site.xml:
<configuration>
<property>
<name>mapred.job.tracter</name>
<value>master:8012</value>
</property>
</configuration>
In hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
</configuration>
There could be a problem with the naming convention of your node hostnames.
Make sure they do not contain symbols like "_".
Check Wikipedia for restrictions.
Try changing "master" to the actual IP address in all your config files.
Your configuration looks OK. You need to run the command "$HADOOP_HOME/bin/hdfs namenode -format master", and after that run the command "$HADOOP_HOME/sbin/start-dfs".
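The error in the question reports file:/// as the value lacking a host:port authority; file:/// is what Hadoop 1.x falls back to when fs.default.name is not set. The sketch below uses Python's urllib.parse (not Hadoop's own NetUtils) purely to show why that URI has no host or port while the configured hdfs://master:9000 has both. The function name host_and_port is hypothetical.

```python
from urllib.parse import urlsplit


def host_and_port(uri):
    """Return the (host, port) pair from a URI; either part may be None."""
    parts = urlsplit(uri)
    return parts.hostname, parts.port
```

Here host_and_port("file:///") yields (None, None), while host_and_port("hdfs://master:9000") yields ("master", 9000). If a datanode logs file:///, it may be worth checking that the core-site.xml shown above is actually the one being read on that node.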

Error: E0902: Exception occured: [User: root is not allowed to impersonate root]

I am trying to follow the steps given at http://www.rohitmenon.com/index.php/apache-oozie-installation/
Note: I am not using the Cloudera distribution of Hadoop.
The above link is similar to http://oozie.apache.org/docs/4.0.1/DG_QuickStart.html but seems more descriptive to me.
However, while running the command below as the root user, I get an exception:
./bin/oozie-setup.sh sharelib create -fs
Note: I have two live nodes shown at dfshealth.jsp, and I have updated core-site.xml on all three machines (including the namenode) with the properties below:
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
I understand this is the point where I am making a mistake. Could someone please guide me?
Stacktrace
org.apache.oozie.service.HadoopAccessorException: E0902: Exception occured: [User: root is not allowed to impersonate root]
at org.apache.oozie.service.HadoopAccessorService.createFileSystem(HadoopAccessorService.java:430)
at org.apache.oozie.tools.OozieSharelibCLI.run(OozieSharelibCLI.java:144)
at org.apache.oozie.tools.OozieSharelibCLI.main(OozieSharelibCLI.java:52)
Caused by: org.apache.hadoop.ipc.RemoteException: User: root is not allowed to impersonate root
at org.apache.hadoop.ipc.Client.call(Client.java:1107)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
at com.sun.proxy.$Proxy5.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:411)
at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:135)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:276)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:241)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:100)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1411)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1429)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
at org.apache.oozie.service.HadoopAccessorService$2.run(HadoopAccessorService.java:422)
at org.apache.oozie.service.HadoopAccessorService$2.run(HadoopAccessorService.java:420)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
at org.apache.oozie.service.HadoopAccessorService.createFileSystem(HadoopAccessorService.java:420)
... 2 more
--------------------------------------
Note: I have also followed the link "Getting E0902: Exception occured: [User: oozie is not allowed to impersonate oozie]", but was not able to solve my problem.
If I change core-site.xml as below, only on the NameNode:
<property>
<name>hadoop.proxyuser.hadoop.hosts</name>
<value>[NAMENODE IP]</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.groups</name>
<value>hadoop</value>
</property>
I get the exception as
Unauthorized connection for super-user: hadoop
After adding the properties to core-site.xml, restart Hadoop and try again. If it still does not work, format the namenode and start Hadoop; it should then work.
You need to add these properties to core-site.xml for impersonation in order to solve your whitelist error:
<property>
<name>hadoop.proxyuser.oozie.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.oozie.groups</name>
<value>*</value>
</property>
Hope this fixes your issue.
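To confirm which proxyuser entries a given core-site.xml really contains, a small check like the following can help. It is a minimal sketch using Python's standard XML parser; the function name is hypothetical, and the user argument should be whichever account the Oozie server runs as (oozie in this answer, root in the question).

```python
import xml.etree.ElementTree as ET


def missing_proxyuser_settings(core_site_xml, user):
    """Return the hadoop.proxyuser.<user>.* keys absent from the XML text."""
    root = ET.fromstring(core_site_xml)
    # Collect every <name> found inside a <property> element
    present = {
        prop.findtext("name", default="").strip()
        for prop in root.iter("property")
    }
    wanted = [
        f"hadoop.proxyuser.{user}.hosts",
        f"hadoop.proxyuser.{user}.groups",
    ]
    return [key for key in wanted if key not in present]
```

An empty list means both entries are present in that file; the NameNode still has to be restarted before it picks them up.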
Follow the advice in the article below. Hadoop before 1.1.0 doesn't support wildcards, so you have to specify the hosts and the groups explicitly:
http://mail-archives.apache.org/mod_mbox/oozie-user/201212.mbox/%3CCAOcnVr1TZZ5X0Mrb7fFA8JdW6rO6PgoJ9u0=2UYbfXf_o8r=DA@mail.gmail.com%3E
I solved the problem by adding these lines to the core-site.xml file:
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
After that it works perfectly; all my databases and tables are shown.
./oozie-setup.sh sharelib create -fs hdfs://localhost:9000
Try running this command with sudo.
Check in HDFS whether the path /user/user_name/share/lib already exists; if it does, remove it using
hadoop fs -rmr /user/user_name
After that, run sudo ./oozied.sh and Oozie will start. Then check localhost:11000.

Getting "ERROR: Can't get master address from ZooKeeper; znode data == null" when using HBase shell

I installed Hadoop 2.2.0 and HBase 0.98.0, and here is what I do:
$ ./bin/start-hbase.sh
$ ./bin/hbase shell
2.0.0-p353 :001 > list
then I got this:
ERROR: Can't get master address from ZooKeeper; znode data == null
Why am I getting this error? Another question:
Do I need to run ./sbin/start-dfs.sh and ./sbin/start-yarn.sh before I run HBase?
Also, what are ./sbin/start-dfs.sh and ./sbin/start-yarn.sh used for?
Here are some of my configuration files:
hbase-sites.xml
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://127.0.0.1:9000/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.tmp.dir</name>
<value>/Users/apple/Documents/tools/hbase-tmpdir/hbase-data</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>localhost</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/Users/apple/Documents/tools/hbase-zookeeper/zookeeper</value>
</property>
</configuration>
core-sites.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
<description>The name of the default file system.</description>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/Users/micmiu/tmp/hadoop</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>io.native.lib.available</name>
<value>false</value>
</property>
</configuration>
yarn-sites.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
If you just want to run HBase without going into Zookeeper management for standalone HBase, then remove all the property blocks from hbase-site.xml except the property block named hbase.rootdir.
Now run /bin/start-hbase.sh. HBase comes with its own Zookeeper, which gets started when you run /bin/start-hbase.sh, which will suffice if you are trying to get around things for the first time. Later you can put distributed mode configurations for Zookeeper.
You only need to run /sbin/start-dfs.sh to run HBase, since the value of hbase.rootdir is set to hdfs://127.0.0.1:9000/hbase in your hbase-site.xml. If you change it to a location on the local filesystem using file:///some_location_on_local_filesystem, then you don't even need to run /sbin/start-dfs.sh.
hdfs://127.0.0.1:9000/hbase says it's a location on HDFS, and /sbin/start-dfs.sh starts the namenode and datanode, which provide the underlying API for accessing the HDFS file system. To learn about YARN, please look at http://hadoop.apache.org/docs/r2.3.0/hadoop-yarn/hadoop-yarn-site/YARN.html.
This can also happen if the VM or the host machine is put to sleep; ZooKeeper will not stay alive.
Restarting the VM should solve the problem.
You need to start ZooKeeper and then run the HBase shell:
{HBASE_HOME}/bin/hbase-daemons.sh {start,stop} zookeeper
and you may want to check this property in hbase-env.sh
# Tell HBase whether it should manage its own instance of Zookeeper or not.
export HBASE_MANAGES_ZK=false
Refer to Source - Zookeeper
One quick solution could be to restart HBase:
1) stop-hbase.sh
2) start-hbase.sh
I had the exact same error. The Linux firewall was blocking connectivity. One can test ports via telnet. A quick fix is to turn off the firewall and see if it fixes it:
Completely disable the firewall on all of your nodes. Note: this command will not survive a reboot of your machines.
systemctl stop firewalld
The long-term fix is to configure the firewall to allow the HBase ports.
Note, your version of hbase may use different ports:
https://issues.apache.org/jira/browse/HBASE-10123
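As an alternative to telnet, the port test mentioned above can be scripted. This is a minimal sketch using Python's standard socket module; the function name port_is_open is hypothetical, and the ports worth testing depend on your HBase and ZooKeeper configuration.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts
        return False
```

For example, port_is_open("localhost", 2181) checks ZooKeeper's usual client port. A False result from another node while the service is running locally suggests that a firewall is in the way.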
The output from the HBase shell is so high-level that many different misconfigurations can cause this message. To help yourself debug, it is much better to look into the HBase log in
/var/log/hbase
to figure out the root cause of the issue.
I had the same problem too. In my case, the root cause was hadoop-kms using the same port number as my hbase-master. Both were using port 16000, so my HMaster didn't even start when I invoked hbase shell. After I fixed that, HBase worked.
Again, a KMS port conflict might not be your root cause. I strongly suggest looking into /var/log/hbase to find it.
In my case, with the same error when running HBase, I had not included the ZooKeeper properties in hbase-site.xml and still got the above error message (according to the Apache HBase guide, only two properties, rootdir and distributed, are essential).
From the output of the jps command I could also see that my HRegionServer and HMaster were indeed not up and running properly.
After a stop and start (like a reset), I had these two up and running and could run HBase properly.
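The jps check described above can be automated along these lines. This is a rough sketch that assumes the usual "pid name" layout of jps output; the function name and the sample listing in the usage note are made up for illustration.

```python
def missing_daemons(jps_output, required=("HMaster", "HRegionServer")):
    """Return the required process names that do not appear in jps output."""
    running = set()
    for line in jps_output.splitlines():
        fields = line.split()
        # Each jps line is expected to be "<pid> <main class name>"
        if len(fields) >= 2:
            running.add(fields[1])
    return [name for name in required if name not in running]
```

Given a hypothetical listing such as "4021 NameNode" and "5377 HMaster" on separate lines, the function reports HRegionServer as missing, which matches the situation described in this answer.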
If this is happening in VMware or VirtualBox, restart Cloudera with the command init1 (check that you have root privileges), then retry. Hope it helps :)
hbase shell

Accessing HBase from the Eclipse IDE: java.net.UnknownHostException

When I write Java code to access HBase in the Eclipse IDE, the message "java.net.UnknownHostException" is always shown, but the HBase shell works well.
I installed Hadoop and HBase on a single Linux node in pseudo-distributed mode. My hostname is yzd. Here are /etc/hosts and hbase-site.xml:
/etc/hosts:
127.0.0.1 localhost yzd
hbase-site.xml:
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:9000/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
Error message:
INFO [main] (HBaseRPC.java:117) - Using org.apache.hadoop.hbase.ipc.WritableRpcEngine for org.apache.hadoop.hbase.ipc.HMasterInterface
INFO [main] (HConnectionManager.java:596) - getMaster attempt 0 of 10 failed; retrying after sleep of 1000
java.net.UnknownHostException: unknown host: � 13846#yzdlocalhost
at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.<init>(HBaseClient.java:224)
at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:954)
at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:816)
at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:141)
at com.sun.proxy.$Proxy4.getProtocolVersion(Unknown Source)
at org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:174)
at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:295)
at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:272)
at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:324)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:579)
at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:94)
at com.hbasebook.hush.schema.SchemaManager.process(SchemaManager.java:126)
at com.hbasebook.hush.HushMain.main(HushMain.java:57)
Check the version of your local hbase matches the one you are using as a dependency in your pom. This should solve your issue. I was facing the same issue, I was using hbase in standalone mode. I hope this helps you.
First of all, yzd is not a host name, it's a domain name (you should prefer an FQDN). Now this line
java.net.UnknownHostException: unknown host: � 13846#yzdlocalhost
clearly says that the host 13846#yzdlocalhost cannot be found. You can do the following:
Use the IP address instead of the hostname in both hbase-site.xml and core-site.xml, and check whether that works.
Then use the FQDN in the /etc/hosts file with tab-separated values; after that you can replace the IP with the FQDN.
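To see what a given hosts file maps each address to, a small parser is enough. This is a sketch, not the system resolver: it assumes the standard layout of one address followed by whitespace-separated names (tabs or spaces) with optional # comments, and the function name is hypothetical.

```python
def names_by_address(hosts_text):
    """Map each address in hosts-file text to the list of names given for it."""
    mapping = {}
    for line in hosts_text.splitlines():
        # Drop anything after a # comment marker, then split on whitespace
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2:
            mapping.setdefault(fields[0], []).extend(fields[1:])
    return mapping
```

For the question's file, names_by_address("127.0.0.1 localhost yzd") shows both localhost and yzd attached to 127.0.0.1.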

HBase binding to an incorrect address

I'm attempting to run HBase in pseudo-distributed mode. I have followed all of the steps in the tutorial.
My hbase-site.xml looks like this:
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:9000/hbase</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
My regionservers file looks like this (default):
localhost
In the logs, Zookeeper starts OK, MiniZK starts OK, then I get a BindException with this being the culprit:
Caused by: java.net.BindException: Problem binding to /192.168.0.1:0 : Cannot assign requested address
Where in the world did it get the address 192.168.0.1? And why is it trying to bind to port 0? That IP is my NAT gateway. The IP address of the machine it's on is 192.168.0.200.
I have looked in all of the config files but don't see anywhere that I would specify that address.
** UPDATE **
It looks like the problem was that HBase was trying to look up my IP address from my hostname, which (because I'm using my router as a DNS server) resolved to ... my router.
When I add an "alias" in the /etc/hosts file to 127.0.0.1 it resolves just fine.
@arnon-rotem-gal-oz, I just installed whatever came in the HBase tarball. I'm assuming miniZK is a scaled-down version of Zookeeper? I'm not running a separate instance of it.
The code you posted did the trick to resolve the next problem that came up.
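The situation in the update, where the machine's own hostname resolves to an unexpected address, can be checked with a few lines of Python. This is a minimal sketch using the standard socket module; the function name and its two hook parameters are hypothetical and exist only so the check can be tried without touching real DNS.

```python
import socket


def own_hostname_address(get_name=socket.gethostname,
                         resolve=socket.gethostbyname):
    """Return (hostname, address) for this machine; address is None on failure."""
    name = get_name()
    try:
        return name, resolve(name)
    except OSError:
        # socket.gaierror (name not resolvable) is a subclass of OSError
        return name, None
```

If the address returned is not one assigned to the machine (in the question it was the router's 192.168.0.1), a bind error like the one above is the likely outcome, and an /etc/hosts entry for the hostname is one way to correct it.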
Check the zookeeper configuration file (zoo.cfg in the zookeeper/conf directory)
Also why do you have both zookeeper and miniZK?
Also (not directly related to your question), you need to tell HBase where to find ZooKeeper, e.g. by adding the following to your hbase-site.xml:
<property>
<name>hbase.zookeeper.quorum</name>
<value>localhost</value>
</property>

Resources