Datanode failing in Hadoop on single Machine - hadoop

I set up and configured sudo node hadoop environment on ubuntu 12.04 LTS using following tutorial
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/#formatting-the-hdfs-filesystem-via-the-namenode
After typing hadoop/bin $ start-all.sh
everything going fine then i checked the Jps
then NameNode, JobTracker ,TaskTracker,SecondaryNode have been started but DataNode not started ...
If any know how to resolve this issue please let me know..

ya i resolved it...
java.io.IOException: Incompatible namespaceIDs
If you see the error java.io.IOException: Incompatible namespaceIDs in the logs of a DataNode (logs/hadoop-hduser-datanode-.log), chances are you are affected by issue HDFS-107 (formerly known as HADOOP-1212).
The full error looked like this on my machines:
... ERROR org.apache.hadoop.dfs.DataNode: java.io.IOException: Incompatible namespaceIDs in /app/hadoop/tmp/dfs/data: namenode namespaceID = 308967713; datanode namespaceID = 113030094
at org.apache.hadoop.dfs.DataStorage.doTransition(DataStorage.java:281)
at org.apache.hadoop.dfs.DataStorage.recoverTransitionRead(DataStorage.java:121)
at org.apache.hadoop.dfs.DataNode.startDataNode(DataNode.java:230)
at org.apache.hadoop.dfs.DataNode.(DataNode.java:199)
at org.apache.hadoop.dfs.DataNode.makeInstance(DataNode.java:1202)
at org.apache.hadoop.dfs.DataNode.run(DataNode.java:1146)
at org.apache.hadoop.dfs.DataNode.createDataNode(DataNode.java:1167)
at org.apache.hadoop.dfs.DataNode.main(DataNode.java:1326)
t the moment, there seem to be two workarounds as described below.
Workaround 1: Start from scratch
I can testify that the following steps solve this error, but the side effects won’t make you happy (me neither). The crude workaround I have found is to:
Stop the cluster
Delete the data directory on the problematic DataNode: the directory is specified by dfs.data.dir in conf/hdfs-site.xml; if you followed this tutorial, the relevant directory is /app/hadoop/tmp/dfs/data
Reformat the NameNode (NOTE: all HDFS data is lost during this process!)
Restart the cluster
When deleting all the HDFS data and starting from scratch does not sound like a good idea (it might be ok during the initial setup/testing), you might give the second approach a try.
Workaround 2: Updating namespaceID of problematic DataNodes
Big thanks to Jared Stehler for the following suggestion. I have not tested it myself yet, but feel free to try it out and send me your feedback. This workaround is “minimally invasive” as you only have to edit one file on the problematic DataNodes:
Stop the DataNode
Edit the value of namespaceID in /current/VERSION to match the value of the current NameNode
Restart the DataNode
If you followed the instructions in my tutorials, the full path of the relevant files are:
NameNode: /app/hadoop/tmp/dfs/name/current/VERSION
DataNode: /app/hadoop/tmp/dfs/data/current/VERSION (background: dfs.data.dir is by default set to ${hadoop.tmp.dir}/dfs/data, and we set hadoop.tmp.dir in this tutorial to /app/hadoop/tmp).
The solution for the problem is clearly given in the following site:
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/#java-io-ioexception-incompatible-namespaceids

Related

How to restore version file for datanode after formatting namenode

I have executed hadoop namenode -format command. Namenode runs just fine, but the datanode cannot start. Version file for datanode shows that it's a NAME_NODE. Before formatting I have deleting everything from /hadoop/hdfs/namenode/* and /hadoop/hdfs/data/*
Now everytime I try to delete everything and re-format the namenode, datanode doesn't get start because of the incorrectly generated VERSION file. Googling didn't yield much.
It sounds like your version number doesn't match the name node. The version file is used to see what namenode that the datanode belongs to. Once they no longer match the datanode doesn't want to join the name node as it believe it belongs to a different namenode.
Here's a good explanation here of what you need to do.

Secondary name node is not displaying when I hit JPS command

I have Hadoop-3.1.3 and I can upload a file in hadoop pseudo distributed mode, also can display the contents of file.
but when I call jps command i am getting the following output
10912 DataNode
13072 ResourceManager
4480 NodeManager
6584 Jps
664 Namenode
I am unable to find secondary name node, is there a problem with any configuration or hadoop installation?
You're assuming that secondary namenode is started with psuedo-distributed?
If the basic commands work, then its fine.
You need to look at log files to know if something is broken, before asking elsewhere....
In general, I always suggest you use Apache Ambari to provision a Hadoop cluster
You can start the Secondary NameNode manually and observe the start up logs to see if there's anything wrong:
hdfs secondarynamenode
If there's no error, run jps again and hopefully you see SecondaryNameNode listed.
I'd suggest running hdfs --help and checking out all of the options, there's a lot of good stuff there.

Cannot start running on browser the namenode for Hadoop

It is my first time in installing Hadoop on my Linux (Fedora distro) running on VM (using Parallel on my Mac). And I followed every step on this video and including the textual version of it.And then when I run it on localhost (or the equivalent value from hostname) in port 50070, I got the following message.
...can't establish a connection to the server at localhost:50070
When I run the jps by the way command I don't have the datanode and namenode unlike at the end of the textual version tutorial which has the following:
While mine has only the following processes running:
6021 NodeManager
3947 SecondaryNameNode
5788 ResourceManager
8941 Jps
When I run the hadoop namenode command I have some of the following [redacted] error:
Cannot access storage directory /usr/local/hadoop_store/hdfs/namenode
16/10/11 21:52:45 WARN namenode.FSNamesystem: Encountered exception loading fsimage
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /usr/local/hadoop_store/hdfs/namenode is in an inconsistent state: storage directory does not exist or is not accessible.
I tried to access by the way the above mentioned directories and it existed.
Any hint for this newbie? ;-)
You would need to give read and write permission to user with which you are running the services on directory /usr/local/hadoop_store/hdfs/namenode.
Once done, you should run format command using hadoop namenode -format
Then try to start your services.
delete files /app/hadoop/tmp/*
and try again formatting the namenode and then start-dfs.sh & start-yarn.sh

Need help adding multiple DataNodes in pseudo-distributed mode (one machine), using Hadoop-0.18.0

I am a student, interested in Hadoop and started to explore it recently.
I tried adding an additional DataNode in the pseudo-distributed mode but failed.
I am following the Yahoo developer tutorial and so the version of Hadoop I am using is hadoop-0.18.0
I tried to start up using 2 methods I found online:
Method 1 (link)
I have a problem with this line
bin/hadoop-daemon.sh --script bin/hdfs $1 datanode $DN_CONF_OPTS
--script bin/hdfs doesn't seem to be valid in the version I am using. I changed it to --config $HADOOP_HOME/conf2 with all the configuration files in that directory, but when the script is ran it gave the error:
Usage: Java DataNode [-rollback]
Any idea what does the error mean? The log files are created but DataNode did not start.
Method 2 (link)
Basically I duplicated conf folder to conf2 folder, making necessary changes documented on the website to hadoop-site.xml and hadoop-env.sh. then I ran the command
./hadoop-daemon.sh --config ..../conf2 start datanode
it gives the error:
datanode running as process 4190. stop it first.
So I guess this is the 1st DataNode that was started, and the command failed to start another DataNode.
Is there anything I can do to start additional DataNode in the Yahoo VM Hadoop environment? Any help/advice would be greatly appreciated.
Hadoop start/stop scripts use /tmp as a default directory for storing PIDs of already started daemons. In your situation, when you start second datanode, startup script finds /tmp/hadoop-someuser-datanode.pid file from the first datanode and assumes that the datanode daemon is already started.
The plain solution is to set HADOOP_PID_DIR env variable to something else (but not /tmp). Also do not forget to update all network port numbers in conf2.
The smart solution is start a second VM with hadoop environment and join them in a single cluster. It's the way hadoop is intended to use.

Hadoop jobtracker UI not accessible

I've configured hadoop 1.0.4 in pseudo-distributed mode. Everything's good, I can put local files in HDFS and run wordcount task. But I just can't access the jobtracker web UI through localhost:50030, localhost:50070 doesn't work neither.
HTTP ERROR 404
Problem accessing /jobtracker.jsp. Reason:
/jobtracker.jsp Powered by Jetty://
I look at the log files, but there's no error...
I used to have some problem with datanode, and jobtracker complained about replication, but that is solved and now all daemons are good (namenode, datanode, jobtracker, tasktracker, secondarynamenode) and no error in any of the log files.
Any suggestions?
Ok finally I solved it myself, I had to re-install the system then re-install hadoop. I think the problem should be that I've previously installed the CDH4 on my system, which is hadoop 2.0.0 and even if I uninstalled all of its packages (debian system) and change the tmp folder of HDFS, but maybe there's still something left. The only way is to restart over.

Resources