NameNode, DataNode and Secondary NameNode not starting when I hit the jps command - hadoop

I was using Hadoop 1.2.1 in pseudo-distributed mode on Ubuntu and everything was working fine. But then I had to restart my system. Now when I hit the jps command after running start-all.sh, I can see only the TaskTracker and JobTracker running. Could anyone tell me the possible reason for this problem and guide me in getting it resolved?
2017-03-13 18:41:16,733 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = keerthana-VirtualBox/127.0.1.1
STARTUP_MSG: args = []
STARTUP_MSG: version = 1.2.1
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
STARTUP_MSG: java = 1.8.0_121
************************************************************/
2017-03-13 18:41:19,383 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2017-03-13 18:41:19,628 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2017-03-13 18:41:19,653 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2017-03-13 18:41:19,653 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2017-03-13 18:41:21,947 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2017-03-13 18:41:22,117 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2017-03-13 18:41:23,564 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Invalid directory in dfs.data.dir: Incorrect permission for /home/keerthana/hadoop/dfs/data, expected: rwxr-xr-x, while actual: rwxrwxr-x
2017-03-13 18:41:23,564 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: All directories in dfs.data.dir are invalid.
2017-03-13 18:41:23,564 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2017-03-13 18:41:23,630 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at keerthana-VirtualBox/127.0.1.1
************************************************************/

As the logs suggest, there is a problem with the dfs.data.dir location "/home/keerthana/hadoop/dfs/data".
Check whether the folder exists and whether it has the proper permissions. The DataNode expects rwxr-xr-x (755) but found rwxrwxr-x (775), so if the folder exists, give it 755 access:
chmod -R 755 /home/keerthana/hadoop/dfs/data
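To double-check the fix, a quick sketch (assuming a standard Hadoop 1.x layout with $HADOOP_HOME set):
# The DataNode expects mode 755 (rwxr-xr-x); confirm what the directory actually has
ls -ld /home/keerthana/hadoop/dfs/data
# After the chmod, start only the DataNode and verify it stays up
$HADOOP_HOME/bin/hadoop-daemon.sh start datanode
jps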

Related

Failure to start Hadoop after having stopped a running (and working) instance before, because Datanode says that the directory is locked

I have a cluster running Hadoop 1.2.1 with Giraph on top. The server runs ok, but when I stop it, I am unable to make it run again. In the datanode log I get the following error: ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Cannot lock storage /pathToFolder/data/datanode. The directory is already locked.
I have tried many solutions that I found online:
Checking permissions of folders.
Checking equal versions of VERSION file for namenode and datanode.
Checking configuration files (core-site, hdfs-site, mapred-site, master, slaves, ...)
Deleting / Changing the namenode and datanode data folders
Removing hadoop temporary files
Bottom line: everything seems fine, but the datanode still fails to start. The complete datanode log is the following:
2020-06-24 11:23:46,624 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/********************
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = XXXX
STARTUP_MSG: args = []
STARTUP_MSG: version = 1.2.1
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
STARTUP_MSG: java = 1.8.0_212
********************/
2020-06-24 11:23:46,719 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2020-06-24 11:23:46,725 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2020-06-24 11:23:46,726 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2020-06-24 11:23:46,726 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2020-06-24 11:23:46,791 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2020-06-24 11:23:46,794 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2020-06-24 11:23:46,903 INFO org.apache.hadoop.hdfs.server.common.Storage: Cannot lock storage /users/lahdak/rojas/AppHadoop/data/datanode. The directory is already locked.
2020-06-24 11:23:47,004 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Cannot lock storage /users/lahdak/rojas/AppHadoop/data/datanode. The directory is already locked.
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:599)
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:452)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:111)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:414)
at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:321)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1712)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1651)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1669)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1795)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1812)
2020-06-24 11:23:47,004 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/********************
SHUTDOWN_MSG: Shutting down DataNode at XXXX
********************/
Still haven't managed to get rid of the problem (the DataNode not shutting down correctly), but I found a workaround for the situation. I used lsof +D /path to detect active processes and killed them. The weird part is that this process was invisible to the top and jps commands.
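For reference, a sketch of that workaround with the storage path from the log above (double-check the PID from the lsof output before killing anything):
# Find the process that still holds files open under the locked storage directory
lsof +D /users/lahdak/rojas/AppHadoop/data/datanode
# Kill the stale process by the PID reported by lsof (use -9 only if SIGTERM is ignored)
kill <PID>
# Then start the DataNode again
$HADOOP_HOME/bin/hadoop-daemon.sh start datanode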

hadoop Datanode shutting down

I have configured a Hadoop cluster with:
namenode 192.168.56.101
secondarynode 192.168.56.102
datanode1 192.168.56.103
After running start-dfs.sh and start-mapred.sh,
all daemons are up except datanode1, and I don't know why!
Here is the log from datanode1:
2015-03-17 09:35:58,224 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = datanode1/192.168.56.103
STARTUP_MSG: args = []
STARTUP_MSG: version = 1.2.0
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1479473; compiled by 'hortonfo' on Mon May 6 06:59:37 UTC 2013
STARTUP_MSG: java = 1.7.0_75
************************************************************/
2015-03-17 09:35:58,572 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2015-03-17 09:35:58,596 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2015-03-17 09:35:58,600 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2015-03-17 09:35:58,600 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2015-03-17 09:35:58,901 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2015-03-17 09:35:58,921 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2015-03-17 09:36:04,816 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /usr/local/hadoop/tmp/dfs/data: namenode namespaceID = 941733068; datanode namespaceID = 1890117295
at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:232)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:147)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:412)
at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:319)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1698)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1637)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1655)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1781)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1798)
2015-03-17 09:36:04,820 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at datanode1/192.168.56.103
************************************************************/
The namespaceID of datanode1 does not match the current namenode's. Maybe you copied the /usr/local/hadoop/tmp/dfs/data dir from another cluster. If the data on datanode1 is irrelevant, you can delete /usr/local/hadoop/tmp/dfs/* on datanode1.
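As a sketch (run these on datanode1; option 1 discards its stored blocks, while option 2, editing the VERSION file, is a commonly used alternative not mentioned above that keeps them):
# Option 1: wipe the data directory and let the datanode re-register with a fresh namespaceID
$HADOOP_HOME/bin/hadoop-daemon.sh stop datanode
rm -rf /usr/local/hadoop/tmp/dfs/data/*
$HADOOP_HOME/bin/hadoop-daemon.sh start datanode
# Option 2: keep the blocks; edit namespaceID in
# /usr/local/hadoop/tmp/dfs/data/current/VERSION to the namenode's value (941733068 per the log)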

Unable to start TaskTracker. Says: Can not start task tracker because java.lang.IllegalArgumentException: Does not contain a valid host:port authority:

Edited mapred-site.xml, core-site.xml, hadoop-env.sh, hdfs-site.xml, masters and slaves.
I have 1 DataNode and 2 NameNodes. Both of them started successfully and I am able to see them in the browser.
Ran start-mapred.sh and it started the JobTracker and TaskTracker on the NameNode, but it was unable to start the TaskTracker on the DataNode.
Started the TaskTracker manually and the following is the output:
->hadoop tasktracker
Warning: $HADOOP_HOME is deprecated.
13/10/17 03:21:55 INFO mapred.TaskTracker: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting TaskTracker
STARTUP_MSG: host = tintin/10.193.184.157
STARTUP_MSG: args = []
STARTUP_MSG: version = 1.1.2
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.1 -r 1440782; compiled by 'hortonfo' on Thu Jan 31 02:03:24 UTC 2013
************************************************************/
13/10/17 03:21:55 INFO impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
13/10/17 03:21:55 INFO impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
13/10/17 03:21:55 INFO impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
13/10/17 03:21:55 INFO impl.MetricsSystemImpl: TaskTracker metrics system started
13/10/17 03:21:55 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/10/17 03:21:55 INFO impl.MetricsSourceAdapter: MBean for source ugi registered.
13/10/17 03:21:55 WARN impl.MetricsSystemImpl: Source name ugi already exists!
13/10/17 03:21:55 ERROR mapred.TaskTracker: Can not start task tracker because java.lang.IllegalArgumentException: Does not contain a valid host:port authority:
10.193.184.132:54311
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:149)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:130)
at org.apache.hadoop.mapred.JobTracker.getAddress(JobTracker.java:2312)
at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1532)
at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3906)
13/10/17 03:21:55 INFO mapred.TaskTracker: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down TaskTracker at tintin/10.193.184.157
************************************************************/
Though quite late to answer this: does your mapred-site.xml contain the mapred.job.tracker property?
My JobTracker runs on port 8021, so the configuration is as follows:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>http://localhost:8021</value>
  </property>
</configuration>
I faced such an issue due to missing properties.
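A quick way to check for the property on the failing node (a sketch, assuming the configs live under $HADOOP_HOME/conf):
# Print the property name and the line after it if present; no output means it is missing
grep -A 1 "mapred.job.tracker" $HADOOP_HOME/conf/mapred-site.xml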

Hadoop component is not starting

I'm new to Hadoop. I've followed Michael Noll's install guide ( http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/ ).
When I run the command /usr/local/hadoop/bin/start-all.sh, this should normally start a NameNode, DataNode, JobTracker and a TaskTracker on the machine.
I get only the TaskTracker started. Here is the trace:
hduser#srv591 ~ $ /usr/local/hadoop/bin/start-all.sh
starting namenode, logging to /usr/local/hadoop/libexec/../logs/hadoop-hduser-namenode-srv591.out
localhost: starting datanode, logging to /usr/local/hadoop/libexec/../logs/hadoop-hduser-datanode-srv591.out
localhost: starting secondarynamenode, logging to /usr/local/hadoop/libexec/../logs/hadoop-hduser-secondarynamenode-srv591.out
localhost: Exception in thread "main" org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /app/hadoop/tmp/dfs/namesecondary is in an inconsistent state: checkpoint directory does not exist or is not accessible.
localhost: at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.recoverCreate(SecondaryNameNode.java:729)
localhost: at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:208)
localhost: at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.<init>(SecondaryNameNode.java:150)
localhost: at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:676)
starting jobtracker, logging to /usr/local/hadoop/libexec/../logs/hadoop-hduser-jobtracker-srv591.out
localhost: starting tasktracker, logging to /usr/local/hadoop/libexec/../logs/hadoop-hduser-tasktracker-srv591.out
hduser#srv591 ~ $ /usr/local/java/bin/jps
19469 TaskTracker
19544 Jps
Tariq's solution works, but the JobTracker and NameNode still won't start. Here is the content of the logs:
hduser#srv591 ~ $ cat /usr/local/hadoop/libexec/../logs/hadoop-hduser-namenode-srv591.log
2013-09-21 00:30:13,765 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = srv591.sd-france.net/46.21.207.111
STARTUP_MSG: args = []
STARTUP_MSG: version = 1.2.1
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
STARTUP_MSG: java = 1.7.0_40
************************************************************/
2013-09-21 00:30:13,904 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2013-09-21 00:30:13,913 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2013-09-21 00:30:13,914 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2013-09-21 00:30:13,914 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system started
2013-09-21 00:30:14,140 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2013-09-21 00:30:14,144 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2013-09-21 00:30:14,148 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
2013-09-21 00:30:14,149 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source NameNode registered.
2013-09-21 00:30:14,164 INFO org.apache.hadoop.hdfs.util.GSet: Computing capacity for map BlocksMap
2013-09-21 00:30:14,164 INFO org.apache.hadoop.hdfs.util.GSet: VM type = 64-bit
2013-09-21 00:30:14,164 INFO org.apache.hadoop.hdfs.util.GSet: 2.0% max memory = 932184064
2013-09-21 00:30:14,164 INFO org.apache.hadoop.hdfs.util.GSet: capacity = 2^21 = 2097152 entries
2013-09-21 00:30:14,164 INFO org.apache.hadoop.hdfs.util.GSet: recommended=2097152, actual=2097152
2013-09-21 00:30:14,178 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=hduser
2013-09-21 00:30:14,178 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
2013-09-21 00:30:14,178 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=true
2013-09-21 00:30:14,185 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.block.invalidate.limit=100
2013-09-21 00:30:14,185 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
2013-09-21 00:30:14,335 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemStateMBean and NameNodeMXBean
2013-09-21 00:30:14,370 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: dfs.namenode.edits.toleration.length = 0
2013-09-21 00:30:14,370 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names occuring more than 10 times
2013-09-21 00:30:14,373 INFO org.apache.hadoop.hdfs.server.common.Storage: Cannot access storage directory /app/hadoop/tmp/dfs/name
2013-09-21 00:30:14,374 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /app/hadoop/tmp/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:304)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:104)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:427)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:395)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:299)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:569)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1479)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1488)
2013-09-21 00:30:14,404 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /app/hadoop/tmp/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:304)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:104)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:427)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:395)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:299)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:569)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1479)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1488)
2013-09-21 00:30:14,405 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at srv591.sd-france.net/46.21.207.111
************************************************************/
2013-09-21 00:31:08,869 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = srv591.sd-france.net/46.21.207.111
STARTUP_MSG: args = []
STARTUP_MSG: version = 1.2.1
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
STARTUP_MSG: java = 1.7.0_40
************************************************************/
2013-09-21 00:31:09,012 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2013-09-21 00:31:09,021 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2013-09-21 00:31:09,022 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2013-09-21 00:31:09,022 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system started
2013-09-21 00:31:09,240 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2013-09-21 00:31:09,244 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2013-09-21 00:31:09,248 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
2013-09-21 00:31:09,249 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source NameNode registered.
2013-09-21 00:31:09,264 INFO org.apache.hadoop.hdfs.util.GSet: Computing capacity for map BlocksMap
2013-09-21 00:31:09,264 INFO org.apache.hadoop.hdfs.util.GSet: VM type = 64-bit
2013-09-21 00:31:09,264 INFO org.apache.hadoop.hdfs.util.GSet: 2.0% max memory = 932184064
2013-09-21 00:31:09,264 INFO org.apache.hadoop.hdfs.util.GSet: capacity = 2^21 = 2097152 entries
2013-09-21 00:31:09,264 INFO org.apache.hadoop.hdfs.util.GSet: recommended=2097152, actual=2097152
2013-09-21 00:31:09,278 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=hduser
2013-09-21 00:31:09,278 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
2013-09-21 00:31:09,278 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=true
2013-09-21 00:31:09,286 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.block.invalidate.limit=100
2013-09-21 00:31:09,286 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
2013-09-21 00:31:09,457 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemStateMBean and NameNodeMXBean
2013-09-21 00:31:09,496 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: dfs.namenode.edits.toleration.length = 0
2013-09-21 00:31:09,496 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names occuring more than 10 times
2013-09-21 00:31:09,501 INFO org.apache.hadoop.hdfs.server.common.Storage: Cannot lock storage /app/hadoop/tmp/dfs/name. The directory is already locked.
2013-09-21 00:31:09,501 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
java.io.IOException: Cannot lock storage /app/hadoop/tmp/dfs/name. The directory is already locked.
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:599)
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:452)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:299)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:104)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:427)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:395)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:299)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:569)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1479)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1488)
2013-09-21 00:31:09,508 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.io.IOException: Cannot lock storage /app/hadoop/tmp/dfs/name. The directory is already locked.
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:599)
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:452)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:299)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:104)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:427)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:395)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:299)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:569)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1479)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1488)
2013-09-21 00:31:09,509 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at srv591.sd-france.net/46.21.207.111
************************************************************/
Here is the log for the datanode:
2013-09-21 01:01:24,622 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = srv591.sd-france.net/46.21.207.111
STARTUP_MSG: args = []
STARTUP_MSG: version = 1.2.1
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
STARTUP_MSG: java = 1.7.0_40
************************************************************/
2013-09-21 01:01:24,855 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2013-09-21 01:01:24,870 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2013-09-21 01:01:24,871 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2013-09-21 01:01:24,871 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2013-09-21 01:01:25,204 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2013-09-21 01:01:25,224 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2013-09-21 01:01:25,499 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /app/hadoop/tmp/dfs/data: namenode namespaceID = 1590050521; datanode namespaceID = 1863017904
at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:232)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:147)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:414)
at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:321)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1712)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1651)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1669)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1795)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1812)
2013-09-21 01:01:25,500 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
Please make sure you have created the directory /app/hadoop/tmp/dfs/namesecondary and that it has the proper permissions. Also, inspecting the logs (NN, DN, SNN, JT) would be helpful. If you are still facing the issue, show us the logs along with the configuration files.
In response to your comment:
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /app/hadoop/tmp/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
It looks like you haven't created any of the directories you are using in your configuration files; this exception shows that clearly. Please make sure you have created, with the proper permissions, all the directories that you use as values for your configuration properties.
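A minimal sketch, assuming the /app/hadoop/tmp path from the logs and the hduser:hadoop owner used by the linked Michael Noll tutorial (adjust both to your setup):
# Create the checkpoint directory the SecondaryNameNode complained about
sudo mkdir -p /app/hadoop/tmp/dfs/namesecondary
# Hand the whole tree to the user that runs Hadoop and give it sane permissions
sudo chown -R hduser:hadoop /app/hadoop/tmp
sudo chmod -R 755 /app/hadoop/tmp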
A default Hadoop installation puts the datanode and namenode data directories under some /tmp directory on the local disk. The correct way is to manually create local data directories for the datanode and namenode and put their paths in hdfs-site.xml. Add the following properties to /etc/hadoop/hdfs-site.xml, reformat the name node, and restart; that should fix it.
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:///data/1/dfs/nn,file:///data/2/dfs/nn</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:///data/1/dfs/dn,file:///data/2/dfs/dn</value>
</property>
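The directories in those values must exist before the NameNode is formatted. A rough sketch of the remaining steps, assuming the example paths above and an hduser:hadoop Hadoop user (adjust both to your setup):
# Create the local directories referenced in hdfs-site.xml
sudo mkdir -p /data/1/dfs/nn /data/2/dfs/nn /data/1/dfs/dn /data/2/dfs/dn
sudo chown -R hduser:hadoop /data/1/dfs /data/2/dfs
# Reformat the NameNode (this wipes any existing HDFS metadata), then restart
$HADOOP_HOME/bin/hadoop namenode -format
$HADOOP_HOME/bin/start-all.sh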

Incompatible build versions error in Hadoop slave node

My Hadoop cluster was running without any problems. I don't know what has changed, but when I try to start the Hadoop components with the start-all.sh command from the master and check the running processes with the jps command, I see that the DataNode is not running on the slave node.
The datanode log is below. The installed Hadoop version (1.0.4) is the same on all machines in the cluster. I could not figure out how to solve the problem.
2013-09-18 09:35:21,638 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = noon101/10.240.20.30
STARTUP_MSG: args = []
STARTUP_MSG: version = 1.0.4-SNAPSHOT
STARTUP_MSG: build = -r ; compiled by 'hduser' on Wed May 29 10:55:16 EEST 2013
************************************************************/
2013-09-18 09:35:21,752 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2013-09-18 09:35:21,761 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2013-09-18 09:35:21,762 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2013-09-18 09:35:21,762 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2013-09-18 09:35:21,867 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2013-09-18 09:35:21,869 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2013-09-18 09:35:26,964 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Incompatible build versions: namenode BV = 1393290; datanode BV =
2013-09-18 09:35:27,070 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible build versions: namenode BV = 1393290; datanode BV =
at org.apache.hadoop.hdfs.server.datanode.DataNode.handshake(DataNode.java:566)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:362)
at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
2013-09-18 09:35:27,071 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at noon101/10.240.20.30
************************************************************/
Below are part of datanode logs.
Slave node:
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = noon101/10.240.20.30
STARTUP_MSG: args = []
STARTUP_MSG: version = 1.0.4-SNAPSHOT
STARTUP_MSG: build = -r ; compiled by 'hduser' on Wed May 29 10:55:16 EEST 2013
Master node:
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = noon102/10.240.20.32
STARTUP_MSG: args = []
STARTUP_MSG: version = 1.0.4
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1393290; compiled by 'hortonfo' on Wed Oct 3 05:13:58 UTC 2012
When looking at the nodes, I see version and build values are different. Even if this is the problem, I still do not know how to solve it.
This generally happens if there was some kind of change to the conf directory on the master NameNode. Are you sure an ant script didn't run, or something else that messed with the jar files in Hadoop's 'lib' directory?
This issue seems to be fairly prevalent. The following link has clear step-by-step instructions on how to salvage the situation; the gist is that you copy the conf directory from the DataNode to the master node after backing it up:
http://permalink.gmane.org/gmane.comp.jakarta.lucene.hadoop.user/22499
See if this helps.
I had the same problem and solved it! I received the error "Incompatible build versions: namenode BV = ; datanode BV = 985326" many times because I had many <property> ... </property> tags under the <configuration> tag in mapred-site.xml. I had to make sure all configurations were the same on all nodes (master + slaves). If a single value in a <property> ... </property> tag differs from one node to another, you are likely to see the Incompatible Build Versions error.
So here is the summary:
1) Make sure you have the same Hadoop version on all nodes.
2) Make sure all *-site.xml files are the same on all nodes, i.e. you use the same configuration in core-site.xml, mapred-site.xml, and hdfs-site.xml (one way to sync them is sketched below).
Good luck!
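One way to enforce point 2 is to push the master's conf directory to every slave, as a sketch (assuming passwordless SSH and the same $HADOOP_HOME path on all nodes):
# Sync the master's configuration to each host listed in the slaves file
for host in $(cat $HADOOP_HOME/conf/slaves); do
  rsync -av $HADOOP_HOME/conf/ $host:$HADOOP_HOME/conf/
done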
