Hadoop UnknownHostException when formatting the NameNode - hadoop

I'm trying to install Hadoop on a CentOS server.
I'm using this tutorial.
I got stuck at this line:
bin/hadoop namenode -format
12/12/05 15:07:06 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = java.net.UnknownHostException: karpov: karpov
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 1.1.1
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.1 -r 1411108; compiled by 'hortonfo' on Mon Nov 19 10:48:11 UTC 2012
************************************************************/
Re-format filesystem in /tmp/hadoop-root/dfs/name ? (Y or N)
Here, karpov is my machine name!
At the "Re-format filesystem in /tmp/hadoop-root/dfs/name ? (Y or N)" prompt, neither Y nor N input changes anything.
How can I solve this? Please help me!

Check /etc/hosts and make sure you have your host added.
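For example, a minimal /etc/hosts sketch (the IP address below is only an assumption for illustration; use your machine's real address, or 127.0.0.1 for a purely local setup):
127.0.0.1      localhost
192.168.1.10   karpov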

There may be some special character in your /etc/hosts file. Clean it up and write it again; the error should go away.

Related

Why can't I start the NameNode in this Hadoop 1.2.1 installation?

I am new to Apache Hadoop and I am following a video course on Udemy.
The course is based on Hadoop 1.2.1; is that too old a version? Is it better to start my study with another course based on a more recent version, or is this one OK?
So I have installed Hadoop 1.2.1 on an Ubuntu 12.04 system and I have configured it in pseudo-distributed mode.
Following the tutorial, I did it with the following settings in the following configuration files:
conf/core-site.xml:
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
conf/hdfs-site.xml:
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
conf/mapred-site.xml:
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
Then in the Linux shell I do:
ssh localhost
So I am connected through SSH to my local system.
Then I go into the Hadoop bin directory, /home/andrea/hadoop/hadoop-1.2.1/bin/, and there I run the command that performs the format of the NameNode (what exactly does that mean?):
bin/hadoop namenode –format
And this is the output I obtained:
andrea@andrea-virtual-machine:~/hadoop/hadoop-1.2.1/bin$ ./hadoop namenode –format
16/01/17 12:55:25 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = andrea-virtual-machine/127.0.1.1
STARTUP_MSG: args = [–format]
STARTUP_MSG: version = 1.2.1
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
STARTUP_MSG: java = 1.7.0_79
************************************************************/
Usage: java NameNode [-format [-force ] [-nonInteractive]] | [-upgrade] | [-rollback] | [-finalize] | [-importCheckpoint] | [-recover [ -force ] ]
16/01/17 12:55:25 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at andrea-virtual-machine/127.0.1.1
************************************************************/
andrea@andrea-virtual-machine:~/hadoop/hadoop-1.2.1/bin$
Then I try to start all the nodes by running this command:
./start-all.sh
and now I obtain:
andrea@andrea-virtual-machine:~/hadoop/hadoop-1.2.1/bin$ ./start-all.sh
starting namenode, logging to /home/andrea/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-andrea-namenode-andrea-virtual-machine.out
localhost: starting datanode, logging to /home/andrea/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-andrea-datanode-andrea-virtual-machine.out
localhost: starting secondarynamenode, logging to /home/andrea/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-andrea-secondarynamenode-andrea-virtual-machine.out
starting jobtracker, logging to /home/andrea/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-andrea-jobtracker-andrea-virtual-machine.out
localhost: starting tasktracker, logging to /home/andrea/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-andrea-tasktracker-andrea-virtual-machine.out
andrea@andrea-virtual-machine:~/hadoop/hadoop-1.2.1/bin$
Now I try to open the following URLs in the browser:
http://localhost:50070/
and I can't open it (page not found),
and:
http://localhost:50030/
this one opens correctly and redirects to this JSP page:
http://localhost:50030/jobtracker.jsp
So, in the shell I run the jps command, which lists all the running Java processes for the user:
andrea@andrea-virtual-machine:~/hadoop/hadoop-1.2.1/bin$ jps
6247 Jps
5720 DataNode
5872 SecondaryNameNode
6116 TaskTracker
5965 JobTracker
andrea@andrea-virtual-machine:~/hadoop/hadoop-1.2.1/bin$
As you can see, it seems that the NameNode has not started.
The tutorial that I am following says:
If NameNode or DataNode is not listed, it might be that the
namenode's or datanode's root directory, which is set by the property
'dfs.name.dir', is getting messed up. By default it points to the /tmp
directory, which the operating system cleans up from time to time. Thus, when HDFS
comes up after some changes by the OS, it gets confused and the namenode
doesn't start.
To solve this problem, the tutorial provides this solution (which doesn't work for me).
First stop all nodes with the stop-all.sh script.
Then I have to explicitly set 'dfs.name.dir' and 'dfs.data.dir'.
So I have created a dfs directory in the Hadoop path, and inside it I have created two directories (at the same level): data and name (the idea is to make two folders inside it to be used by the DataNode daemon and the NameNode daemon).
So I have something like this:
andrea@andrea-virtual-machine:~/hadoop/hadoop-1.2.1/dfs$ tree
.
├── data
└── name
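(For reference, a sketch of the commands that create this layout, using the paths from above:)
mkdir -p /home/andrea/hadoop/hadoop-1.2.1/dfs/name
mkdir -p /home/andrea/hadoop/hadoop-1.2.1/dfs/data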
Then I use this configuration for hdfs-site.xml, where I explicitly set the two directories above:
<configuration>
<property>
<name>dfs.data.dir</name>
<value>/home/andrea/hadoop/hadoop-1.2.1/dfs/data/</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>/home/andrea/hadoop/hadoop-1.2.1/dfs/name/</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
So, after this change, I run the command to format the NameNode again:
hadoop namenode –format
And I obtain this output:
andrea@andrea-virtual-machine:~/hadoop/hadoop-1.2.1/dfs$ hadoop namenode –format
16/01/17 13:14:53 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = andrea-virtual-machine/127.0.1.1
STARTUP_MSG: args = [–format]
STARTUP_MSG: version = 1.2.1
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
STARTUP_MSG: java = 1.7.0_79
************************************************************/
Usage: java NameNode [-format [-force ] [-nonInteractive]] | [-upgrade] | [-rollback] | [-finalize] | [-importCheckpoint] | [-recover [ -force ] ]
16/01/17 13:14:53 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at andrea-virtual-machine/127.0.1.1
************************************************************/
andrea@andrea-virtual-machine:~/hadoop/hadoop-1.2.1/dfs$
So I start all the nodes again with start-all.sh, and this is the output:
andrea@andrea-virtual-machine:~/hadoop/hadoop-1.2.1/bin$ start-all.sh
starting namenode, logging to /home/andrea/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-andrea-namenode-andrea-virtual-machine.out
localhost: starting datanode, logging to /home/andrea/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-andrea-datanode-andrea-virtual-machine.out
localhost: starting secondarynamenode, logging to /home/andrea/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-andrea-secondarynamenode-andrea-virtual-machine.out
starting jobtracker, logging to /home/andrea/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-andrea-jobtracker-andrea-virtual-machine.out
localhost: starting tasktracker, logging to /home/andrea/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-andrea-tasktracker-andrea-virtual-machine.out
andrea@andrea-virtual-machine:~/hadoop/hadoop-1.2.1/bin$
Then I run the jps command to see if all the nodes started correctly, but this is what I get:
andrea@andrea-virtual-machine:~/hadoop/hadoop-1.2.1/bin$ jps
8041 SecondaryNameNode
8310 TaskTracker
8406 Jps
8139 JobTracker
andrea@andrea-virtual-machine:~/hadoop/hadoop-1.2.1/bin$
The situation has worsened, because now I have two daemons that are not starting: the NameNode and the DataNode.
What am I missing? How can I try to solve this issue and start all my nodes?
Would you try turning off iptables once and reformatting, along with exporting the Java path?
If you have configured this in hdfs-site.xml,
<property>
<name>dfs.name.dir</name>
<value>/home/andrea/hadoop/hadoop-1.2.1/dfs/name/</value>
</property>
then while formatting the name node you should see a
> successfully formatted /home/andrea/hadoop/hadoop-1.2.1/dfs/name/
message if the name node format was successful. From your logs I cannot see that success message, so check for permission issues; they may be the cause.
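One way to check (a sketch only, assuming the dfs paths from the question and that Hadoop is started as the 'andrea' user):
ls -ld /home/andrea/hadoop/hadoop-1.2.1/dfs/name
# the directory must be owned and writable by the user that starts the NameNode
chown -R andrea:andrea /home/andrea/hadoop/hadoop-1.2.1/dfs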
If it still didn't start, try this other command:
hadoop-daemon.sh start namenode
Hope it works...
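If it still fails, the corresponding .log file next to the .out file shown in the start-all.sh output usually contains the actual exception, for example:
tail -n 50 /home/andrea/hadoop/hadoop-1.2.1/logs/hadoop-andrea-namenode-andrea-virtual-machine.log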

Data-Node Does Not Start

I have trouble starting my Hadoop DataNode. I did all the research that I could, and none of the methods were helpful in solving my issue. Here's my terminal output when I try to start it using
hadoop datanode -start
This is what happens:
root@Itanium:~/Desktop/hadoop# hadoop datanode -start
Warning: $HADOOP_HOME is deprecated.
13/09/29 22:11:42 INFO datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = Itanium/127.0.1.1
STARTUP_MSG: args = [-start]
STARTUP_MSG: version = 1.2.1
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
STARTUP_MSG: java = 1.7.0_25
************************************************************/
Usage: java DataNode
[-rollback]
13/09/29 22:11:42 INFO datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at Itanium/127.0.1.1
************************************************************/
root@Itanium:~/Desktop/hadoop# jps
31438 SecondaryNameNode
32013 Jps
31818 TaskTracker
1146 Bootstrap
31565 JobTracker
30930 NameNode
root@Itanium:~/Desktop/hadoop#
As we can see, the DataNode attempts to start but then shuts down. All the while I have been having trouble with the NameNode starting up. I used to fix this by manually starting it using
start-dfs.sh
Now the problem is with the DataNode. I would really appreciate your help in resolving this issue.
And one more general question: why is Hadoop displaying such inconsistent behavior? I am sure I did not change any of the *-site.xml settings.
Use this command: hadoop datanode -rollback
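(If the intent is simply to start the daemon rather than roll anything back, the daemon script used elsewhere in this thread for the NameNode also covers the DataNode; this is just a sketch of that alternative:)
hadoop-daemon.sh start datanode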
I had a similar issue as well. Looking at the comment posted by Anup ("seems to be an issue with namespaceIDs not matching"), I was able to find a reference that showed me how to solve my issue.
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/#caveats
I took a look at the logfile on the slave nodes where the DataNodes did not start. They both had the following exception :
2014-11-05 10:26:14,289 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /scratch/hdfs/data/srinivasand: namenode namespaceID = 1296690356; datanode namespaceID = 1228298945
at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:232)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:147)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:385)
at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
Fixing this exception solved the issue.
The fix is to either:
a) delete the dfs data directory and reformat using namenode -format, or
b) update the VERSION file so that the two namespace IDs match.
I was able to use option b) and the datanodes started successfully after that.
The bug report that led to this issue is recorded at: https://issues.apache.org/jira/browse/HDFS-107
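For option b), a sketch of the edit, using the paths and IDs from the exception above (your own values will differ):
# the datanode storage dir from the log is /scratch/hdfs/data/srinivasand
vi /scratch/hdfs/data/srinivasand/current/VERSION
# change namespaceID=1228298945 to the namenode's ID from the log, namespaceID=1296690356,
# then restart the datanode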
I once got the same issue; it turned out that port 50010 was occupied by another application. Stop that application and restart Hadoop.
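One way to check whether something else already holds the DataNode port (50010 by default), assuming a Linux box with the usual tools installed:
netstat -tlnp | grep 50010
# or: lsof -i :50010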

Incompatible build versions error in Hadoop slave node

My Hadoop cluster was running without any problems. I don't know what has changed, but when I try to start the Hadoop components with the start-all.sh command from the master and check the running processes with the jps command, I see that the DataNode does not work on the slave node.
The DataNode log is below. The installed Hadoop version (1.0.4) is the same on all the machines in the cluster. I could not find out how to solve the problem.
2013-09-18 09:35:21,638 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = noon101/10.240.20.30
STARTUP_MSG: args = []
STARTUP_MSG: version = 1.0.4-SNAPSHOT
STARTUP_MSG: build = -r ; compiled by 'hduser' on Wed May 29 10:55:16 EEST 2013
************************************************************/
2013-09-18 09:35:21,752 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2013-09-18 09:35:21,761 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2013-09-18 09:35:21,762 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2013-09-18 09:35:21,762 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2013-09-18 09:35:21,867 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2013-09-18 09:35:21,869 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2013-09-18 09:35:26,964 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Incompatible build versions: namenode BV = 1393290; datanode BV =
2013-09-18 09:35:27,070 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible build versions: namenode BV = 1393290; datanode BV =
at org.apache.hadoop.hdfs.server.datanode.DataNode.handshake(DataNode.java:566)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:362)
at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
2013-09-18 09:35:27,071 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at noon101/10.240.20.30
************************************************************/
Below are parts of the DataNode logs.
Slave node:
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = noon101/10.240.20.30
STARTUP_MSG: args = []
STARTUP_MSG: version = 1.0.4-SNAPSHOT
STARTUP_MSG: build = -r ; compiled by 'hduser' on Wed May 29 10:55:16 EEST 2013
Master node:
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = noon102/10.240.20.32
STARTUP_MSG: args = []
STARTUP_MSG: version = 1.0.4
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1393290; compiled by 'hortonfo' on Wed Oct 3 05:13:58 UTC 2012
Looking at the nodes, I see the version and build values are different. Even if this is the problem, I still do not know how to solve it.
This generally happens if there was some kind of change to the conf directory on the master NameNode. Are you sure an ant script or something similar didn't run and mess with the jar files in Hadoop's 'lib' directory?
This issue seems to be fairly common. The following link has clear step-by-step instructions for salvaging the situation. The gist of it is that you copy the conf directory of the DataNode to the master node after backing it up.
http://permalink.gmane.org/gmane.comp.jakarta.lucene.hadoop.user/22499
See if this helps.
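A sketch of that copy step, assuming the hostnames from the logs above (noon101 is the slave, noon102 the master), that it is run on the master, and that Hadoop lives at the same path on both machines:
# back up the master's current conf first
cp -r $HADOOP_HOME/conf $HADOOP_HOME/conf.bak
# pull the slave's conf directory over to the master
scp -r noon101:$HADOOP_HOME/conf $HADOOP_HOME/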
I had the same problem and it's solved! I received the error "Incompatible build versions: namenode BV = ; datanode BV = 985326" many times because I had many <property> ... </property> tags under the <configuration> tag in mapred-site.xml. I had to make sure all configurations are the same on all nodes (master + slaves). If a single value in a <property> ... </property> tag differs from one node to another, you are likely to see the Incompatible build versions error.
So here is the summary:
1) Make sure you have the same Hadoop version on all nodes.
2) Make sure all the *-site.xml files are the same on all nodes, i.e. you use the same configuration in core-site.xml, mapred-site.xml, and hdfs-site.xml (a quick check is sketched below).
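A quick way to verify point 2), assuming the master can ssh to the slave (hostnames taken from the logs above) and that HADOOP_HOME is also set in the remote shell:
for f in core-site.xml mapred-site.xml hdfs-site.xml; do
  ssh noon101 "cat \$HADOOP_HOME/conf/$f" | diff - "$HADOOP_HOME/conf/$f" && echo "$f matches"
done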
Good Luck!

Hadoop: Format aborted in /mnt/hdfs/1/namenode

I've created a few EBS filesystems on EC2 to use with Hadoop. I've set JAVA_HOME in the Hadoop environment. But when I go to format the first volume, it aborts with the following message:
[root@hadoop-node01 conf]# sudo -u hdfs hadoop namenode -format
13/02/06 15:33:22 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = hadoop-node01.mydomain.com/10.xx.xx.201
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 0.20.2-cdh3u5
STARTUP_MSG: build = file:///data/1/tmp/topdir/BUILD/hadoop-0.20.2-cdh3u5 -r 30233064aaf5f2492bc687d61d72956876102109; compiled by 'root' on Fri Oct 5 18:45:46 PDT 2012
************************************************************/
Re-format filesystem in /mnt/hdfs/1/namenode ? (Y or N) y
Format aborted in /mnt/hdfs/1/namenode
13/02/06 15:33:27 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hadoop-node01.mydomain.com/10.xx.xx.201
************************************************************/
This is my namenode configuration:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.name.dir</name> <value>/mnt/hdfs/1/namenode,/mnt/hdfs/2/namenode,/mnt/hdfs/3/namenode,/mnt/hdfs/4/namenode</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/mnt/hdfs/1/datanode,/mnt/hdfs/2/datanode,/mnt/hdfs/3/datanode,/mnt/hdfs/4/datanode</value>
</property>
</configuration>
Does anyone have any idea why this error is happening or how to get around the problem?
Unfortunately in 1.x the format command's prompt is case-sensitive. Answer with a capital Y instead and it won't abort.
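So, repeating the session above, the prompt should be answered with an uppercase Y (a lowercase y is treated like any other non-matching input and aborts the format):
Re-format filesystem in /mnt/hdfs/1/namenode ? (Y or N) Y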

Unable to identify the .xml file to update the host after installing Hadoop

I encountered the error shown below after installing Hadoop and executing the hadoop namenode -format command.
Based on the displayed logs, I figured out that I need to update the "host" in the configuration. But I am unable to find the exact location of the configuration file (.xml) that needs to be updated.
I am installing on Fedora on a single node. I am looking for your help in addressing this issue. Please point me to any specific link or documentation that could be helpful while debugging.
[hadoop@hadoop ~]$ hadoop namenode -format
Warning: $HADOOP_HOME is deprecated.
13/02/03 11:33:09 INFO namenode.NameNode: STARTUP_MSG:
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = java.net.UnknownHostException: hadoop: hadoop
13/02/03 11:34:27 INFO namenode.NameNode: SHUTDOWN_MSG: :
SHUTDOWN_MSG: Shutting down NameNode at java.net.UnknownHostException: hadoop: hadoop
Configuration files are in $HADOOP_HOME/etc/hadoop, where $HADOOP_HOME is probably /usr/local/hadoop. There can be several files containing your hostname; you should check there.
Also, you should check /etc/hosts.
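A quick way to check whether the hostname resolves at all (a sketch; the 127.0.1.1 value is only the usual single-node convention, not taken from your logs):
hostname                # prints the machine name, here 'hadoop'
getent hosts hadoop     # should print an address; if it prints nothing,
                        # add a line such as "127.0.1.1   hadoop" to /etc/hosts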
