Incorrect configuration: namenode address dfs.namenode.rpc-address is not configured - hadoop

I am getting this error when I try and boot up a DataNode. From what I have read, the RPC paramters are only used for a HA configuration, which I am not setting up (I think).
2014-05-18 18:05:00,589 INFO [main] impl.MetricsSystemImpl (MetricsSystemImpl.java:shutdown(572)) - DataNode metrics system shutdown complete.
2014-05-18 18:05:00,589 INFO [main] datanode.DataNode (DataNode.java:shutdown(1313)) - Shutdown complete.
2014-05-18 18:05:00,614 FATAL [main] datanode.DataNode (DataNode.java:secureMain(1989)) - Exception in secureMain
java.io.IOException: Incorrect configuration: namenode address dfs.namenode.servicerpc-address or dfs.namenode.rpc-address is not configured.
at org.apache.hadoop.hdfs.DFSUtil.getNNServiceRpcAddresses(DFSUtil.java:840)
at org.apache.hadoop.hdfs.server.datanode.BlockPoolManager.refreshNamenodes(BlockPoolManager.java:151)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:745)
at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:278)
My files look like:
[root#datanode1 conf.cluster]# cat core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://namenode:8020</value>
</property>
</configuration>
cat hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>dfs.datanode.data.dir</name>
<value>/hdfs/data</value>
</property>
<property>
<name>dfs.permissions.superusergroup</name>
<value>hadoop</value>
</property>
</configuration>
I am using the latest CDH5 distro.
Installed Packages
Name : hadoop-hdfs-datanode
Arch : x86_64
Version : 2.3.0+cdh5.0.1+567
Release : 1.cdh5.0.1.p0.46.el6
Any helpful advice on how to get past this?
EDIT: Just use Cloudera manager.

I too was facing the same issue and finally found that there was a space in fs.default.name value. truncating the space fixed the issue. The above core-site.xml doesn't seem to have space so the issue may be different from what i had. my 2 cents

These steps solved the problem for me:
export HADOOP_CONF_DIR = $HADOOP_HOME/etc/hadoop
echo $HADOOP_CONF_DIR
hdfs namenode -format
hdfs getconf -namenodes
./start-dfs.sh

check the core-site.xml under $HADOOP_INSTALL/etc/hadoop dir. Verify that the property fs.default.name is configured correctly

Obviously,your core-site.xml has configure error.
<property>
<name>fs.defaultFS</name>
<value>hdfs://namenode:8020</value>
</property>
Your <name>fs.defaultFS</name> setting as <value>hdfs://namenode:8020</value>,but your machine hostname is datanode1.So you just need change namenode to datanode1 will be OK.

I had the exact same issue. I found a resolution by checking the environment on the Data Node:
$ sudo update-alternatives --install /etc/hadoop/conf hadoop-conf /etc/hadoop/conf.my_cluster 50
$ sudo update-alternatives --set hadoop-conf /etc/hadoop/conf.my_cluster
Make sure that the alternatives are set correctly on the Data Nodes.

Configuring the full host name in core-site.xml, masters and slaves solved the issue for me.
Old: node1 (failed)
New: node1.krish.com (Succeed)

creating dfs.name.dir and dfs.data.dir directories and configuring full hostname in core-site.xml, masters & slaves is solved my issue

In my situation, I fixed by change /etc/hosts config to lower case.

in my case, I have wrongly set HADOOP_CONF_DIR to an other Hadoop installation.
Add to hadoop-env.sh:
export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop/

This type of problem mainly arises is there is a space in the value or name of the property in any one of the following files-
core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml
just make sure you did not put any spaces or (changed the line) in between the opening and closing name and value tags.
Code:
<property>
<name>dfs.name.dir</name> <value>file:///home/hadoop/hadoop_tmp/hdfs/namenode</value>
<final>true</final>
</property>

I was facing the same issue, formatting HDFS solved my issue. Don't format HDFS if you have important meta data.
Command for formatting HDFS: hdfs namenode -format
When namenode was not working
After formatting HDFS

Check your '/etc/hosts' file:
There must be a line like below: (if not, so add that)
namenode 127.0.0.1
Replace 127.0.01 with your namenode IP.

Add the below line in hadoop-env.cmd
set HADOOP_HOME_WARN_SUPPRESS=1

Related

org.apache.hadoop.ipc.RpcException: RPC response exceeds maximum data length

I have set up hadoop cluster on 2 machines.
One machine has both master and slave-1.
2nd machine has slave-2.
When I started the cluster with start-all.sh, I got following error in secondarynamenode's .out file:
java.io.IOException: Failed on local exception: org.apache.hadoop.ipc.RpcException: RPC response exceeds maximum data length; Host Details : local host is: "ip-10-179-185-169/10.179.185.169"; destination host is: "hadoop-master":9000;
Following is my JPS output
98366 Jps
96704 DataNode
97284 NodeManager
97148 ResourceManager
96919 SecondaryNameNode
Can someone help me tackle this error ?
I also had this problem.
Please check core-site.xml
(this should be under the dir where you downloaded Hadoop, for me the path is: /home/algo/hadoop/etc/hadoop/core-site.xml)
The file should look like this:
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/algo/hdfs/tmp</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
Solution: using hdfs://localhost:9000 as ip:port.
Might be a problem with the port number you are using. Try this : https://stackoverflow.com/a/60701948/8504709

Namenode daemon not starting properly

I have just started learning hadoop from the book Hadoop: The definitive guide.
I followed the tutorial for Hadoop installation in Pseudodistribution mode. I enabled the passwordless login to ssh.
Formatted the hdfs filesystem before using it for the first time. It started successfully for the first time.
After that I copied a text file using copyFromLocal to HDFS and everything went fine. But if I restart the system and start the daemons again and look at the web UI , only YARN is started successfully.
When I issue the stop-dfs.sh commmand I get
Stopping namenodes on [localhost]
localhost: no namenode to stop
localhost: stopping datanode
Stopping secondary namenodes [0.0.0.0]
0.0.0.0: stopping secondarynamenode
If I format the hdfs file system again and then try starting the daemons then they all start successfully.
Here are my configuration files.Exactly as what is told in hadoop definitive guide book.
hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost/</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>localhost</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
This is the error in the namenode log file
WARN org.apache.hadoop.hdfs.server.common.Storage: Storage directory /tmp/hadoop/dfs/name does not exist
WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception loading fsimage
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /tmp/hadoop/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:327)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:215)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:975)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:681)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:585)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:645)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:812)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:796)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1493)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1559)
This is from mapred log
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:614)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:712)
at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1528)
at org.apache.hadoop.ipc.Client.call(Client.java:1451)
... 33 more
I visited apache hadoop : connection refused which says
Check that there isn't an entry for your hostname mapped to 127.0.0.1 or 127.0.1.1 in /etc/hosts (Ubuntu is notorious for this).
I found there is an entry in my /etc/hosts, but if I remove it my sudo breaks causing error sudo: unable to resolve host . What should I append in /etc/hosts if not remove my hostname mapped to 127.0.1.1
I cannot understand what is the root cause of this problem.
Well it says in your Namenode log file that default storage of your namenode directory is /tmp/hadoop. The /tmp directory is formatted in linux on reboot by some systems. So it must be the problem.
You need to change your default namenode and datanode directory by changing your hdfs-site.xml configuration file.
Add this in your hdfs-site.xml
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///home/"your-user-name"/hadoop</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///home/"your-user-name"/datanode</value>
</property>
After this format your namenode by hdfs namenode -format command.
I think this will end your problem.
If configuration file is not a problem, please try following:
1.first delete all contents from temporary folder:
rm -Rf <tmp dir> (my was /usr/local/hadoop/tmp)
2.format the namenode:
bin/hadoop namenode -format
3.start all processes again:
bin/start-all.sh

hadoop datanode unable to start. "does not contain a valid host:port authority"

I'm currently using hadoop 1.2.1 (because I need to run a spatial processing software only support this version). I'm trying to deploy in multinode mode with one master and three slaves.
I'm sure I'm able to ssh between all master and slaves without password (including themselves). Also the hostname on each node is correct.
Each node shares the same host file:
192.168.56.101 master
192.168.56.102 slave1
192.168.56.103 slave2
192.168.56.104 slave3
I keep having problems in the slaves node, error log info is as follows,
2015-05-21 23:39:16,841 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.lang.IllegalArgumentException: Does not contain a valid host:port authority: file:///
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:212)
at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:244)
at org.apache.hadoop.hdfs.server.namenode.NameNode.getServiceAddress(NameNode.java:236)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:359)
at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:321)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1712)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1651)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1669)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1795)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:181
Configurations in core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
</property>
</configuration>
In mapred-site.xml:
<configuration>
<property>
<name>mapred.job.tracter</name>
<value>master:8012</value>
</property>
</configuration>
In hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
</configuration
There could be a problem with the naming convention of your node hostnames.
Make sure they do not contain symbols like "_".
Check Wikipedia for restrictions.
Try to change the "master" to the actual ip address, in all your config files.
You configed OK. You need run command "$HADOOP_HOME/bin/hdfs namenode -format master", after run command "$HADOOP_HOME/sbin/start-dfs"

How to add an hard disk to hadoop

I installed Hadoop 2.4 on Ubuntu 14.04 and now I am trying to add an internal sata HD to the existing cluster.
I have mounted the new hd in /mnt/hadoop and assigned its ownership to the hadoop user
Then I tried to add it to the configuration file as follow:
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>file:///home/hadoop/hadoopdata/hdfs/namenode, file:///mnt/hadoop/hadoopdata/hdfs/namenode</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>file:///home/hadoop/hadoopdata/hdfs/datanode, file:///mnt/hadoop/hadoopdata/hdfs/datanode</value>
</property>
</configuration>
Afterwards, I started the hdfs:
Starting namenodes on [localhost]
localhost: starting namenode, logging to /home/hadoop/hadoop/logs/hadoop-hadoop-namenode-hadoop-Datastore.out
localhost: starting datanode, logging to /home/hadoop/hadoop/logs/hadoop-hadoop-datanode-hadoop-Datastore.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /home/hadoop/hadoop/logs/hadoop-hadoop-secondarynamenode-hadoop-Datastore.out
It seems that it does not fire up the second hd
This is my core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
In addition I tried to refresh the namenode and I get a connection problem:
Refreshing namenode [localhost:9000]
refreshNodes: Call From hadoop-Datastore/127.0.1.1 to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
Error: refresh of namenodes failed, see error messages above.
In addition, I can't connect to the Hadoop web interface.
It seems that I have two related problems:
1) A connection problem
2) I cannot connect to the new installed hd
Are these problem related?
How can I fix these issues?
Thanks
EDIT
I can ping the localhost and I can access localhost:50090/status.jsp
However, I cannot access 50030 and 50070
<property>
<name>dfs.name.dir</name>
<value>file:///home/hadoop/hadoopdata/hdfs/namenode, file:///mnt/hadoop/hadoopdata/hdfs/namenode</value>
</property>
This is documented as:
Determines where on the local filesystem the DFS name node should store the name table(fsimage). If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy.
Are you sure you need this? Do you want your fsimage to be copied in both locations, for redundancy? And if yes, did you actually copy the fsimage on the new HDD before starting the namenode? See Adding a new namenode data directory to an existing cluster.
The new data directory (dfs.data.dir) is OK, the datanode should pick it up and start using it for placing blocks.
Also, as a general troubleshooting advice, look into the namenode and datanode logs for more clues.
Regarding your comment: "sudo chown -R hadoop.hadoop /usr/local/hadoop_store."
The owner has to be hdfs user. Try:
sudo chown -R hdfs.hadoop /usr/local/hadoop_store.

get "ERROR: Can't get master address from ZooKeeper; znode data == null" when using Hbase shell

I installed Hadoop2.2.0 and Hbase0.98.0 and here is what I do :
$ ./bin/start-hbase.sh
$ ./bin/hbase shell
2.0.0-p353 :001 > list
then I got this:
ERROR: Can't get master address from ZooKeeper; znode data == null
Why am I getting this error ? Another question:
do I need to run ./sbin/start-dfs.sh and ./sbin/start-yarn.sh before I run base ?
Also, what are used ./sbin/start-dfs.sh and ./sbin/start-yarn.sh for ?
Here is some of my conf doc :
hbase-sites.xml
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://127.0.0.1:9000/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.tmp.dir</name>
<value>/Users/apple/Documents/tools/hbase-tmpdir/hbase-data</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>localhost</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/Users/apple/Documents/tools/hbase-zookeeper/zookeeper</value>
</property>
</configuration>
core-sites.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
<description>The name of the default file system.</description>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/Users/micmiu/tmp/hadoop</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>io.native.lib.available</name>
<value>false</value>
</property>
</configuration>
yarn-sites.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
If you just want to run HBase without going into Zookeeper management for standalone HBase, then remove all the property blocks from hbase-site.xml except the property block named hbase.rootdir.
Now run /bin/start-hbase.sh. HBase comes with its own Zookeeper, which gets started when you run /bin/start-hbase.sh, which will suffice if you are trying to get around things for the first time. Later you can put distributed mode configurations for Zookeeper.
You only need to run /sbin/start-dfs.sh for running HBase since the value of hbase.rootdir is set to hdfs://127.0.0.1:9000/hbase in your hbase-site.xml. If you change it to some location on local the filesystem using file:///some_location_on_local_filesystem, then you don't even need to run /sbin/start-dfs.sh.
hdfs://127.0.0.1:9000/hbase says it's a place on HDFS and /sbin/start-dfs.sh starts namenode and datanode which provides underlying API to access the HDFS file system. For knowing about Yarn, please look at http://hadoop.apache.org/docs/r2.3.0/hadoop-yarn/hadoop-yarn-site/YARN.html.
This could also happen if the vm or the host machine is put to sleep ,Zookeeper will not stay live.
Restarting the VM should solve the problem.
You need to start zookeeper and then run Hbase-shell
{HBASE_HOME}/bin/hbase-daemons.sh {start,stop} zookeeper
and you may want to check this property in hbase-env.sh
# Tell HBase whether it should manage its own instance of Zookeeper or not.
export HBASE_MANAGES_ZK=false
Refer to Source - Zookeeper
One quick solution could be to Restart hbase:
1) Stop-hbase.sh
2) Start-hbase.sh
I had the exact same error. The Linux firewall was blocking connectivity. One can test ports via telnet. A quick fix is to turn off the firewall and see if it fixes it:
Completely disable the firewall on all of your nodes. Note: this command will not survive a reboot of your machines.
systemctl stop firewalld
Long term fix is that you must configure the firewall to allow the hbase ports.
Note, your version of hbase may use different ports:
https://issues.apache.org/jira/browse/HBASE-10123
The output from Hbase shell is quite high level that many misconfiguration would cause this message. To help yourself debug, it would be much better to look into the hbase log in
/var/log/hbase
to figure out the root cause of the issue.
I had the same problem too. For me, my root cause was due to hadoop-kms having a conflicting port number with my hbase-master. Both of them are using port 16000 so my HMaster didn't even get started when I invoke hbase shell. After I fixed that, my hbase worked.
Again, kms port conflict might not be your root-cause. Strongly suggest looking into /var/log/hbase to find the root cause.
In my case with same error in running hbase - I did not include the zookeeper properties in the hbase-site.xml and still get the above error messages (as based in Apache hbase guide, only the two properites: rootdir, and distributed are essential).
I can also trace back my output of jps command that find out that indeed my Hregion server and Hmaster were not properly up and running.
After stop and start (like a reset), I did have these two up and running and can run hbase properly.
if it's happening in VMWare or virtual box please restart Cloudera by command init1 please check you have root privilege and retry hope it will help :)
hbase shell

Resources