ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Call to localhost/127.0.0.1:54310 failed on local exception - hadoop

I am getting an error in starting the data node while initiating the single node cluster set up on my machine
************************************************************/
2013-02-18 20:21:32,300 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = somnath-laptop/127.0.1.1
STARTUP_MSG: args = []
STARTUP_MSG: version = 1.0.4
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1393290; compiled by 'hortonfo' on Wed Oct 3 05:13:58 UTC 2012
************************************************************/
2013-02-18 20:21:32,593 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2013-02-18 20:21:32,618 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2013-02-18 20:21:32,620 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2013-02-18 20:21:32,620 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2013-02-18 20:21:33,052 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2013-02-18 20:21:33,056 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2013-02-18 20:21:37,890 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Call to localhost/127.0.0.1:54310 failed on local exception: java.io.IOException: Connection reset by peer
at org.apache.hadoop.ipc.Client.wrapException(Client.java:1107)
at org.apache.hadoop.ipc.Client.call(Client.java:1075)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
at sun.proxy.$Proxy5.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:370)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:429)
at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:331)
at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:296)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:356)
at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
Caused by: java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcher.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:251)
at sun.nio.ch.IOUtil.read(IOUtil.java:224)
Any idea about how to resolve this error?

Ok got the problem solved.
Since I was using my single-node cluster through a network proxy, I had added the following property line to $HADOOP_HOME/conf/mapred-site.xml to by-pass proxy server while communicating across Hadoop daemons.
However, this time I was trying out on a direct internet connection, so had to comment out the property that I added in mapred-site.xml.
Below is the property from mapred-site.xml that I commented out:
<!--
<property>
<name>hadoop.rpc.socket.factory.class.default</name>
<value>org.apache.hadoop.net.StandardSocketFactory</value>
<final>true</final>
<description>
Prevent proxy settings set up by clients in their job configs from affecting our connectivity.
</description>
</property>
-->

Related

Failure to start Hadoop after having stopped a running (and working) instance before, because Datanode says that the directory is locked

I have a cluster running Hadoop 1.2.1 with Giraph on top. The server runs ok, but when I stop it, I am unable to make it run again. In the datanode log I get the following error: ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Cannot lock storage /pathToFolder/data/datanode. The directory is already locked.
I have tried many solutions that I found online:
Checking permissions of folders.
Checking equal versions of VERSION file for namenode and datanode.
Checking configuration files (core-site, hdfs-site, mapred-site, master, slaves, ...)
Deleting / Changing the namenode and datanode data folders
Removing hadoop temporary files
Bottomline is, everything seems fine, but it is still failing to start the datanode. A complete log file for the datanode is the following:
2020-06-24 11:23:46,624 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/********************
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = XXXX
STARTUP_MSG: args = []
STARTUP_MSG: version = 1.2.1
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
STARTUP_MSG: java = 1.8.0_212
********************/
2020-06-24 11:23:46,719 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2020-06-24 11:23:46,725 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2020-06-24 11:23:46,726 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2020-06-24 11:23:46,726 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2020-06-24 11:23:46,791 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2020-06-24 11:23:46,794 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2020-06-24 11:23:46,903 INFO org.apache.hadoop.hdfs.server.common.Storage: Cannot lock storage /users/lahdak/rojas/AppHadoop/data/datanode. The directory is already locked.
2020-06-24 11:23:47,004 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Cannot lock storage /users/lahdak/rojas/AppHadoop/data/datanode. The directory is already locked.
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:599)
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:452)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:111)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:414)
at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:321)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1712)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1651)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1669)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1795)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1812)
2020-06-24 11:23:47,004 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/********************
SHUTDOWN_MSG: Shutting down DataNode at XXXX
********************/
Still haven't managed to get rid of the problem (Datanode not shutting down correctly), but I found a workaround to the situation. I used lsof +D /pathto detect active processes and killed them. The weird part is that this process was invisible to top and jps commands.

namenode datanode and sec name node not getting started when i hit jps command

I was using Hadoop 1.2.1 in a pseudo-distributed mode in Ubuntu and everything was working fine. But then I had to restart my system . And now when I am hit jps command after giving start-all.sh i was able to see only tasktracker and jobtracker running. Could anyone tell me the possible reason of this problem? and guide me getting this resolved?
************************************************************/
2017-03-13 18:41:16,733 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = keerthana-VirtualBox/127.0.1.1
STARTUP_MSG: args = []
STARTUP_MSG: version = 1.2.1
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
STARTUP_MSG: java = 1.8.0_121
************************************************************/
2017-03-13 18:41:19,383 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2017-03-13 18:41:19,628 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2017-03-13 18:41:19,653 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2017-03-13 18:41:19,653 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2017-03-13 18:41:21,947 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2017-03-13 18:41:22,117 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2017-03-13 18:41:23,564 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Invalid directory in dfs.data.dir: Incorrect permission for /home/keerthana/hadoop/dfs/data, expected: rwxr-xr-x, while actual: rwxrwxr-x
2017-03-13 18:41:23,564 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: All directories in dfs.data.dir are invalid.
2017-03-13 18:41:23,564 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2017-03-13 18:41:23,630 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at keerthana-VirtualBox/127.0.1.1
************************************************************/
As suggested from the logs, there are some problems with the dfs.data.dir "/home/keerthana/hadoop/dfs/data" location.
Provide check if the folder exists or not and whether they have the proper permission or not. If folder exists , then provide 755 access for that folder.
chmod -R 755 /home/keerthana/hadoop/dfs/data

hadoop Datanode shutting down

I have configured hadoop cluster with
namenode 192.168.56.101
secondarynode 192.168.56.102
datanode1 192.168.56.103
after running start-dfs.sh and start-mapred.sh
all demons are up except the datanode1 and I don't know why !
here is the log from the datanode1
2015-03-17 09:35:58,224 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = datanode1/192.168.56.103
STARTUP_MSG: args = []
STARTUP_MSG: version = 1.2.0
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1479473; compiled by 'hortonfo' on Mon May 6 06:59:37 UTC 2013
STARTUP_MSG: java = 1.7.0_75
************************************************************/
2015-03-17 09:35:58,572 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2015-03-17 09:35:58,596 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2015-03-17 09:35:58,600 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2015-03-17 09:35:58,600 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2015-03-17 09:35:58,901 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2015-03-17 09:35:58,921 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2015-03-17 09:36:04,816 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /usr/local/hadoop/tmp/dfs/data: namenode namespaceID = 941733068; datanode namespaceID = 1890117295
at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:232)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:147)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:412)
at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:319)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1698)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1637)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1655)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1781)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1798)
2015-03-17 09:36:04,820 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at datanode1/192.168.56.103
************************************************************/
the namespaceID of datanode1 do not match the current namenode's.Maybe you copied the /usr/local/hadoop/tmp/dfs/data dir from another cluster.If the data of datanode1 is irrelevant, you can delete the /usr/local/hadoop/tmp/dfs/* of datanode1

Unable to start TaskTracker.Says Can not start task tracker because java.lang.IllegalArgumentException: Does not contain a valid host:port authority:

Edited mapred-site.xml,core-site.xml,hadoop-env.sh,hdfs-site.xml,masters and slaves.
I have 1 DataNode and 2 Namenodes.Both of them started successfully and I am able to see it in browser.
Started start-mapred.sh and it started JobTracker and TaskTracker on the Namenode but unable to start Tasktracker on the datanaode.
Started Tasktracker and the following is the output.
->hadoop tasktracker
Warning: $HADOOP_HOME is deprecated.
13/10/17 03:21:55 INFO mapred.TaskTracker: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting TaskTracker
STARTUP_MSG: host = tintin/10.193.184.157
STARTUP_MSG: args = []
STARTUP_MSG: version = 1.1.2
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.1 -r 1440782; compiled by 'hortonf o' on Thu Jan 31 02:03:24 UTC 2013
************************************************************/
13/10/17 03:21:55 INFO impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
13/10/17 03:21:55 INFO impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
13/10/17 03:21:55 INFO impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
13/10/17 03:21:55 INFO impl.MetricsSystemImpl: TaskTracker metrics system started
13/10/17 03:21:55 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/10/17 03:21:55 INFO impl.MetricsSourceAdapter: MBean for source ugi registered.
13/10/17 03:21:55 WARN impl.MetricsSystemImpl: Source name ugi already exists!
13/10/17 03:21:55 ERROR mapred.TaskTracker: Can not start task tracker because java.lang.IllegalArgumentException: Does no t contain a valid host:port authority:
10.193.184.132:54311
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:149)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:130)
at org.apache.hadoop.mapred.JobTracker.getAddress(JobTracker.java:2312)
at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1532)
at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3906)
13/10/17 03:21:55 INFO mapred.TaskTracker: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down TaskTracker at tintin/10.193.184.157
************************************************************/
Though quite late to answer this, still does your mapred-site.xml contain mapred.job.tracker property ?
My job tracker runs on 8021, so the configuration is as follows:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>http://localhost:8021</value>
</property>
</configuration>
I faced such an issue due to missing properties.

Hadoop: Datanode process killed

I am currently using Hadoop-2.0.3-alpha and after I could work perfectly with HDFS (copying files into HDFS, getting success from an external framework, using the webfrontend), after a new start of my VM, the datanode process is stopping after a while. The namenode process and all yarn processes work without a problem. I installed Hadoop in a folder under an additional user, as I also still have installed Hadoop 0.2, which worked fine too.
Taking a look at the log-file of all datanode processes I got the following information:
2013-04-11 16:23:50,475 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2013-04-11 16:24:17,451 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2013-04-11 16:24:23,276 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2013-04-11 16:24:23,279 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2013-04-11 16:24:23,480 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Configured hostname is user-VirtualBox
2013-04-11 16:24:28,896 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Opened streaming server at /0.0.0.0:50010
2013-04-11 16:24:29,239 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Balancing bandwith is 1048576 bytes/s
2013-04-11 16:24:38,348 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2013-04-11 16:24:44,627 INFO org.apache.hadoop.http.HttpServer: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer$QuotingIn putFilter)
2013-04-11 16:24:45,163 INFO org.apache.hadoop.http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFil ter$StaticUserFilter) to context datanode
2013-04-11 16:24:45,164 INFO org.apache.hadoop.http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFil ter$StaticUserFilter) to context logs
2013-04-11 16:24:45,164 INFO org.apache.hadoop.http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFil ter$StaticUserFilter) to context static
2013-04-11 16:24:45,355 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Opened info server at 0.0.0.0:50075
2013-04-11 16:24:45,508 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: dfs.webhdfs.enabled = false
2013-04-11 16:24:45,536 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50075
2013-04-11 16:24:45,576 INFO org.mortbay.log: jetty-6.1.26
2013-04-11 16:25:18,416 INFO org.mortbay.log: Started SelectChannelConnector#0.0.0.0:50075
2013-04-11 16:25:42,670 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 50020
2013-04-11 16:25:44,955 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Opened IPC server at /0.0.0.0:50020
2013-04-11 16:25:45,483 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Refresh request received for nameservices: null
2013-04-11 16:25:47,079 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Starting BPOfferServices for nameservices: <default>
2013-04-11 16:25:47,660 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool <registering> (storage id unknown) service to localhost/127.0.0.1:8020 starting to offer service
2013-04-11 16:25:50,515 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2013-04-11 16:25:50,631 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 50020: starting
2013-04-11 16:26:15,068 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /home/hadoop/workspace/hadoop_space/hadoop23/dfs/data/in_use.lock acquired by nodename 3099#user-VirtualBox
2013-04-11 16:26:15,720 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool BP-474150866-127.0.1.1-1365686732002 (storage id DS-317990214-127.0.1.1-50010-1365505141363) service to localhost/127.0.0.1:8020
java.io.IOException: Incompatible clusterIDs in /home/hadoop/workspace/hadoop_space/hadoop23/dfs/data: namenode clusterID = CID-1745a89c-fb08-40f0-a14d-d37d01f199c3; datanode clusterID = CID-bb3547b0-03e4-4588-ac25-f0299ff81e4f
at org.apache.hadoop.hdfs.server.datanode.DataStorage .doTransition(DataStorage.java:391)
at org.apache.hadoop.hdfs.server.datanode.DataStorage .recoverTransitionRead(DataStorage.java:191)
at org.apache.hadoop.hdfs.server.datanode.DataStorage .recoverTransitionRead(DataStorage.java:219)
at org.apache.hadoop.hdfs.server.datanode.DataNode.in itStorage(DataNode.java:850)
at org.apache.hadoop.hdfs.server.datanode.DataNode.in itBlockPool(DataNode.java:821)
at org.apache.hadoop.hdfs.server.datanode.BPOfferServ ice.verifyAndSetNamespaceInfo(BPOfferService.java: 280)
at org.apache.hadoop.hdfs.server.datanode.BPServiceAc tor.connectToNNAndHandshake(BPServiceActor.java:22 2)
at org.apache.hadoop.hdfs.server.datanode.BPServiceAc tor.run(BPServiceActor.java:664)
at java.lang.Thread.run(Thread.java:722)
2013-04-11 16:26:16,212 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool BP-474150866-127.0.1.1-1365686732002 (storage id DS-317990214-127.0.1.1-50010-1365505141363) service to localhost/127.0.0.1:8020
2013-04-11 16:26:16,276 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool BP-474150866-127.0.1.1-1365686732002 (storage id DS-317990214-127.0.1.1-50010-1365505141363)
2013-04-11 16:26:18,396 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2013-04-11 16:26:18,940 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0
2013-04-11 16:26:19,668 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************** **********
SHUTDOWN_MSG: Shutting down DataNode at user-VirtualBox/127.0.1.1
************************************************** **********/
Any ideas? May be I made a mistake during the installation process? But it is strange, that it worked once. I also have to say, that if I am logged in as my additional user to execute the commands ./hadoop-daemon.sh start namenode and the same with the datanode, I need to add sudo.
I used this installation guide: http://jugnu-life.blogspot.ie/2012/0...rial-023x.html
By the way, I use the Oracle Java-7 version.
The problem could be that the namenode was formatted after the cluster was set up and the datanodes were not, so the slaves are still referring to the old namenode.
We have to delete and recreate the folder /home/hadoop/dfs/data on the local filesystem for the datanode.
Check your hdfs-site.xml file to see where dfs.data.dir is pointing to
and delete that folder
and then restart the datanode daemon on the machine
The steps above should recreate the folder and resolve the problem.
Please share your config info if the instructions above do not work.
DataNode dies because of incompatible Clusterids. To fix this problem
If you are using hadoop 2.X, then you have to delete everything in the folder that you have specified in hdfs-site.xml - "dfs.datanode.data.dir" (but NOT the folder itself).
The ClusterID will be maintained in that folder. Delete and restart dfs.sh. This should work!!!
You need to delete both
C:\hadoop\data\dfs\datanode and
C:\hadoop\data\dfs\namenode folders.
If you don't have this folders - open your C:\hadoop\etc\hadoop\hdfs-site.xml file and get paths for this folders for next deletion. For me it says:
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/hadoop/data/dfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/hadoop/data/dfs/datanode</value>
</property>
Run command for Format namenodec:\hadoop\bin>hdfs namenode -format
Now it should work!
I think the recommended way of doing this without deleting the data directory is to simply change the clusterID variable in the datanode's VERSION file.
If you look in your daemons directory, you will see the datanode directory exmaple
data/hadoop/daemons/datanode
The VERSION file should look like this.
cat current/VERSION
#Tue Oct 14 17:31:58 CDT 2014
storageID=DS-23bf7f3a-085c-4531-808f-801ff6d52d14
clusterID=CID-bb3547b0-03e4-4588-ac25-f0299ff81e4f
cTime=0
datanodeUuid=63154929-ae68-4149-9f75-9a6558545041
storageType=DATA_NODE
layoutVersion=-55
You need to change the clusterId to the first value in the output of the message so in your case that would be CID-1745a89c-fb08-40f0-a14d-d37d01f199c3 instead of CID-bb3547b0-03e4-4588-ac25-f0299ff81e4f
The updated version should appear like this with the altered clusterId
cat current/VERSION
#Tue Oct 14 17:31:58 CDT 2014
storageID=DS-23bf7f3a-085c-4531-808f-801ff6d52d14
clusterID=CID-1745a89c-fb08-40f0-a14d-d37d01f199c3
cTime=0
datanodeUuid=63154929-ae68-4149-9f75-9a6558545041
storageType=DATA_NODE
layoutVersion=-55
Restart hadoop and the datanode should start just fine.

Resources