How to run Hadoop NodeManager as privileged account for Windows - windows

I am working on the following versions:
OS- Windows 8
Hadoop- 2.3 (Single node cluster), Pig- 0.12.1, Hive- 0.13
I could start Namenode, Datanode, ResourceManager and NodeManager using windows command prompt (without admin privileges). When I try to run mapreduce program, Mapping task is not start, It show always as "Mapping-0%".
note:
If I start 'NodeManager' from windows command prompt with Administrator privileges then MapReduce program works fine.
NodeManager Log:
14/10/07 14:44:18 INFO http.HttpServer2: adding path spec: /node/*
14/10/07 14:44:18 INFO http.HttpServer2: adding path spec: /ws/*
14/10/07 14:44:18 INFO http.HttpServer2: Jetty bound to port 8042
14/10/07 14:44:18 INFO mortbay.log: jetty-6.1.26
14/10/07 14:44:18 INFO mortbay.log: Extract jar:file:/C:/kaveen/BDSDK/1
.1.0.8/SDK/Hadoop/share/hadoop/yarn/hadoop-yarn-common-2.3.0.jar!/webapps/node t
o C:\Users\kaveen\AppData\Local\Temp\Jetty_localhost_8042_node____j463i9\webapp
14/10/07 14:44:19 INFO mortbay.log: Started SelectChannelConnector#localhost:804
2
14/10/07 14:44:19 INFO webapp.WebApps: Web app /node started at 8042
14/10/07 14:44:19 INFO webapp.WebApps: Registered webapp guice modules
14/10/07 14:44:19 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0
:8031
14/10/07 14:44:19 INFO nodemanager.NodeStatusUpdaterImpl: Registering with RM us
ing finished containers :[]
14/10/07 14:44:19 INFO security.NMContainerTokenSecretManager: Rolling master-ke
y for container-tokens, got key with id 1336526610
14/10/07 14:44:19 INFO security.NMTokenSecretManagerInNM: Rolling master-key for
nm-tokens, got key with id :-93940432
14/10/07 14:44:19 INFO nodemanager.NodeStatusUpdaterImpl: Registered with Resour
ceManager as 127.0.0.1:49344 with total resource of <memory:8192, vCores:8>
14/10/07 14:44:19 INFO nodemanager.NodeStatusUpdaterImpl: Notifying ContainerMan
ager to unblock new container-requests
ResourceManager Log:
14/10/07 14:44:19 INFO util.RackResolver: Resolved 127.0.0.1 to /default-rack
14/10/07 14:44:19 INFO resourcemanager.ResourceTrackerService: NodeManager from
node 127.0.0.1(cmPort: 49344 httpPort: 8042) registered with capability: <memory
:8192, vCores:8>, assigned nodeId 127.0.0.1:49344
14/10/07 14:44:19 INFO rmnode.RMNodeImpl: 127.0.0.1:49344 Node Transitioned from
NEW to RUNNING
14/10/07 14:44:19 INFO capacity.CapacityScheduler: Added node 127.0.0.1:49344 cl
usterResource: <memory:16384, vCores:16>
14/10/07 14:44:19 INFO rmnode.RMNodeImpl: Node 127.0.0.1:49344 reported UNHEALTH
Y with details: 1/1 local-dirs turned bad: /tmp/hadoop-kaveen/nm-local-dir;1/1 l
og-dirs turned bad: C:/kaveen/BDSDK/1.1.0.8/SDK/Hadoop/logs/userlogs
14/10/07 14:44:19 INFO rmnode.RMNodeImpl: 127.0.0.1:49344 Node Transitioned from
RUNNING to UNHEALTHY
14/10/07 14:44:19 INFO capacity.CapacityScheduler: Removed node 127.0.0.1:49344
clusterResource: <memory:8192, vCores:8>
At cluster Node "http://localhost:8088/cluster/nodes" Look like below sceenshots.
No Nodes are created.
Do anyone know how to run 'NodeManager' without Admin privileges on windows?
Thanks in Advance...

A quick fix would be to run C:hadoop\sbin\start-dfs and C:hadoop\sbin\start-yarn under Administrator privileges.

local-dirs turned bad: /tmp/hadoop-kaveen/nm-local-dir;1/1
log-dirs turned bad: C:/kaveen/BDSDK/1.1.0.8/SDK/Hadoop/logs/userlogs
Your NM does not have a valid local-dirs, you have left Linux style local-dirs that are obviously not going to work on Windows. log-dirs are also incorrect, NM probably does not have the required privileges on it.
There is absolutely no need to elevate NM. Instead, fix local-dirs in yarn-site.xml:
<property>
<name>yarn.nodemanager.local-dirs</name>
<value>C:/kaveen/BDSDK/1.1.0.8/SDK/Hadoop/local</value>
</property>
Then make sure the dirs exists and grant full control to NM:
c:\>md C:\kaveen\BDSDK\1.1.0.8\SDK\Hadoop\local
c:\>md C:\kaveen\BDSDK\1.1.0.8\SDK\Hadoop\logs
c:\>cacls C:\kaveen\BDSDK\1.1.0.8\SDK\Hadoop\local /E /T /G <nm account>:F
c:\>cacls C:\kaveen\BDSDK\1.1.0.8\SDK\Hadoop\logs/E /T /G <nm account>:F
Substitute with the appropriate account.

Related

Hadoop Mapreduce job map 100% reduce 0% - rduction is not running and node manager is shutting down

I am trying to run the hadoop wordcount example after installation:
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount /input /output
I tried to run with default memory settings. The moment map job finishes nodemanager shutsdown and reduce job cannot start. Find the logs below:
2022-03-08 16:18:22,557 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 14844 for container-id container_1646774131562_0001_01_00
0001: 320.0 MB of 2 GB physical memory used; 2.7 GB of 4.2 GB virtual memory used
2022-03-08 16:18:25,266 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Done waiting for Applications to be Finished. Still alive: [application_1646774131562_0001]
2022-03-08 16:18:25,266 INFO org.apache.hadoop.ipc.Server: Stopping server on 38731
2022-03-08 16:18:25,270 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2022-03-08 16:18:25,271 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 38731
2022-03-08 16:18:25,275 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl is interrupted. Exiting.
2022-03-08 16:18:25,300 INFO org.apache.hadoop.ipc.Server: Stopping server on 8040
2022-03-08 16:18:25,302 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 8040
2022-03-08 16:18:25,302 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2022-03-08 16:18:25,303 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Public cache exiting
2022-03-08 16:18:25,304 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NodeManager metrics system...
2022-03-08 16:18:25,306 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NodeManager metrics system stopped.
2022-03-08 16:18:25,306 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NodeManager metrics system shutdown complete.
2022-03-08 16:18:25,307 INFO org.apache.hadoop.yarn.server.nodemanager.NodeManager: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NodeManager at ubuntu2110/127.0.1.1
************************************************************/

Hodoop namenode not starting

When I use start-all.cmd, then datanode, resourcemanager, nodemanager are working properly but namenode is not working!
19/11/04 22:09:14 WARN namenode.FSNamesystem: Encountered exception loading fsimage
java.io.IOException: NameNode is not formatted.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:220)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1012)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:691)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:634)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:695)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:898)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:877)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1603)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1671)
19/11/04 22:09:14 INFO mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup#0.0.0.0:50070
19/11/04 22:09:14 INFO impl.MetricsSystemImpl: Stopping NameNode metrics system...
19/11/04 22:09:14 INFO impl.MetricsSystemImpl: NameNode metrics system stopped.
19/11/04 22:09:14 INFO impl.MetricsSystemImpl: NameNode metrics system shutdown complete.
19/11/04 22:09:14 ERROR namenode.NameNode: Failed to start namenode.
java.io.IOException: NameNode is not formatted.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:220)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1012)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:691)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:634)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:695)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:898)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:877)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1603)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1671)
19/11/04 22:09:14 INFO util.ExitUtil: Exiting with status 1
19/11/04 22:09:14 INFO namenode.NameNode: SHUTDOWN_MSG:
Open cmd as administrator.
Type and run stop-all.cmd
Then run hadoop namenode –format
Finally run start-all.cmd
Hope it will work for you.

Failed to start namenode.java.lang.IllegalStateException

iam using hadoop apache 2.7.1 high availability cluster that consists of
two name nodes mn1,mn2 and 3 journal nodes
but while i was working on cluster i faced the following error
when i issue start-dfs.sh mn1 is standby and mn2 is active
but after that if one of theses two namenodes are off there is no possibility
to turn it on again
and here are the last lines of log of one of these two name nodes
2017-08-05 09:37:21,063 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Need to save fs image? false (staleImage=true, haEnabled=true, isRollingUpgrade=false)
2017-08-05 09:37:21,063 INFO org.apache.hadoop.hdfs.server.namenode.NameCache: initialized with 3 entries 72 lookups
2017-08-05 09:37:21,088 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Finished loading FSImage in 7052 msecs
2017-08-05 09:37:21,300 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: RPC server is binding to mn2:8020
2017-08-05 09:37:21,304 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2017-08-05 09:37:21,316 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 8020
2017-08-05 09:37:21,353 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemState MBean
2017-08-05 09:37:21,354 WARN org.apache.hadoop.hdfs.server.common.Util: Path /opt/hadoop/metadata_dir should be specified as a URI in configuration files. Please update hdfs configuration.
2017-08-05 09:37:21,361 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
java.lang.IllegalStateException
at com.google.common.base.Preconditions.checkState(Preconditions.java:129)
at org.apache.hadoop.hdfs.server.namenode.LeaseManager.getNumUnderConstructionBlocks(LeaseManager.java:119)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getCompleteBlocksTotal(FSNamesystem.java:5741)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startCommonServices(FSNamesystem.java:1063)
at org.apache.hadoop.hdfs.server.namenode.NameNode.startCommonServices(NameNode.java:678)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:664)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:811)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:795)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1488)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1554)
2017-08-05 09:37:21,364 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2017-08-05 09:37:21,365 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at mn2/192.168.25.22
************************************************************/
This may be
1.Namenode PORT may be Change for each NODE.
This is a particularly vexing problem.
Swallow IllegalStateExceptions thrown by removeShutdownHook in FileSystem. The javadoc states:
public boolean removeShutdownHook(Thread hook)
Throws:
IllegalStateException - If the virtual machine is already in the process of shutting down
So if we are getting this exception, it MEANS we are already in the process of shutdown, so we CANNOT, try what we may, removeShutdownHook. If Runtime had a method Runtime.isShutdownInProgress(), we could have checked for it before the removeShutdownHook call. As it stands, there is no such method. In my opinion, this would be a good patch regardless of the needs for this JIRA.
Not send SIGTERMs from the NM to the MR-AM in the first place. Rather we should expose a mechanism for the NM to politely tell the AM its no longer needed and should shutdown asap. Even after this, if an admin were to kill the MRAppMaster with a SIGTERM, the JobHistory would be lost defeating the purpose of 3614
i discovered that my problem was in journal node and not in namenode
even though the log of namenode shows the error mentioned in question
jps shows journal node but it is fake because journal node service is shut down
even though it is found in jps output
so as a solution i issue hadoop-daemon.sh stop journalnode
then hadoop-daemon.sh start journalnode
and then namenode starts to work again

CDH upgrade from 5.1 to 5.3

After I finished all distribution, activation steps on manager website,
I got the error as below when I restart the cluster:
2016-07-14 14:51:12,335 INFO org.mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup#UT190320.shis.uth.tmc.edu:50070
2016-07-14 14:51:12,436 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode metrics system...
2016-07-14 14:51:12,436 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system stopped.
2016-07-14 14:51:12,436 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system shutdown complete.
2016-07-14 14:51:12,436 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
java.io.IOException:
File system image contains an old layout version -55.
An upgrade to version -59 is required.
Please restart NameNode with the "-rollingUpgrade started" option if a rolling upgrade is already started; or restart NameNode with the "-upgrade" option to start a new upgrade.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:232)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1006)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:736)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:553)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:609)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:776)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:760)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1466)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1534)
2016-07-14 14:51:12,439 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
You will need to perform the upgrade as suggested error messages. It is not clear what exactly you did but I suggest you follow the documentation at http://www.cloudera.com/documentation/enterprise/latest/topics/cdh_ig_earlier_cdh5_upgrade.html
sudo service hadoop-hdfs-namenode upgrade is possibly what you need.

Hadoop: Datanode process killed

I am currently using Hadoop-2.0.3-alpha and after I could work perfectly with HDFS (copying files into HDFS, getting success from an external framework, using the webfrontend), after a new start of my VM, the datanode process is stopping after a while. The namenode process and all yarn processes work without a problem. I installed Hadoop in a folder under an additional user, as I also still have installed Hadoop 0.2, which worked fine too.
Taking a look at the log-file of all datanode processes I got the following information:
2013-04-11 16:23:50,475 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2013-04-11 16:24:17,451 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2013-04-11 16:24:23,276 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2013-04-11 16:24:23,279 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2013-04-11 16:24:23,480 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Configured hostname is user-VirtualBox
2013-04-11 16:24:28,896 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Opened streaming server at /0.0.0.0:50010
2013-04-11 16:24:29,239 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Balancing bandwith is 1048576 bytes/s
2013-04-11 16:24:38,348 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2013-04-11 16:24:44,627 INFO org.apache.hadoop.http.HttpServer: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer$QuotingIn putFilter)
2013-04-11 16:24:45,163 INFO org.apache.hadoop.http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFil ter$StaticUserFilter) to context datanode
2013-04-11 16:24:45,164 INFO org.apache.hadoop.http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFil ter$StaticUserFilter) to context logs
2013-04-11 16:24:45,164 INFO org.apache.hadoop.http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFil ter$StaticUserFilter) to context static
2013-04-11 16:24:45,355 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Opened info server at 0.0.0.0:50075
2013-04-11 16:24:45,508 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: dfs.webhdfs.enabled = false
2013-04-11 16:24:45,536 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50075
2013-04-11 16:24:45,576 INFO org.mortbay.log: jetty-6.1.26
2013-04-11 16:25:18,416 INFO org.mortbay.log: Started SelectChannelConnector#0.0.0.0:50075
2013-04-11 16:25:42,670 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 50020
2013-04-11 16:25:44,955 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Opened IPC server at /0.0.0.0:50020
2013-04-11 16:25:45,483 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Refresh request received for nameservices: null
2013-04-11 16:25:47,079 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Starting BPOfferServices for nameservices: <default>
2013-04-11 16:25:47,660 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool <registering> (storage id unknown) service to localhost/127.0.0.1:8020 starting to offer service
2013-04-11 16:25:50,515 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2013-04-11 16:25:50,631 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 50020: starting
2013-04-11 16:26:15,068 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /home/hadoop/workspace/hadoop_space/hadoop23/dfs/data/in_use.lock acquired by nodename 3099#user-VirtualBox
2013-04-11 16:26:15,720 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool BP-474150866-127.0.1.1-1365686732002 (storage id DS-317990214-127.0.1.1-50010-1365505141363) service to localhost/127.0.0.1:8020
java.io.IOException: Incompatible clusterIDs in /home/hadoop/workspace/hadoop_space/hadoop23/dfs/data: namenode clusterID = CID-1745a89c-fb08-40f0-a14d-d37d01f199c3; datanode clusterID = CID-bb3547b0-03e4-4588-ac25-f0299ff81e4f
at org.apache.hadoop.hdfs.server.datanode.DataStorage .doTransition(DataStorage.java:391)
at org.apache.hadoop.hdfs.server.datanode.DataStorage .recoverTransitionRead(DataStorage.java:191)
at org.apache.hadoop.hdfs.server.datanode.DataStorage .recoverTransitionRead(DataStorage.java:219)
at org.apache.hadoop.hdfs.server.datanode.DataNode.in itStorage(DataNode.java:850)
at org.apache.hadoop.hdfs.server.datanode.DataNode.in itBlockPool(DataNode.java:821)
at org.apache.hadoop.hdfs.server.datanode.BPOfferServ ice.verifyAndSetNamespaceInfo(BPOfferService.java: 280)
at org.apache.hadoop.hdfs.server.datanode.BPServiceAc tor.connectToNNAndHandshake(BPServiceActor.java:22 2)
at org.apache.hadoop.hdfs.server.datanode.BPServiceAc tor.run(BPServiceActor.java:664)
at java.lang.Thread.run(Thread.java:722)
2013-04-11 16:26:16,212 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool BP-474150866-127.0.1.1-1365686732002 (storage id DS-317990214-127.0.1.1-50010-1365505141363) service to localhost/127.0.0.1:8020
2013-04-11 16:26:16,276 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool BP-474150866-127.0.1.1-1365686732002 (storage id DS-317990214-127.0.1.1-50010-1365505141363)
2013-04-11 16:26:18,396 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2013-04-11 16:26:18,940 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0
2013-04-11 16:26:19,668 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************** **********
SHUTDOWN_MSG: Shutting down DataNode at user-VirtualBox/127.0.1.1
************************************************** **********/
Any ideas? May be I made a mistake during the installation process? But it is strange, that it worked once. I also have to say, that if I am logged in as my additional user to execute the commands ./hadoop-daemon.sh start namenode and the same with the datanode, I need to add sudo.
I used this installation guide: http://jugnu-life.blogspot.ie/2012/0...rial-023x.html
By the way, I use the Oracle Java-7 version.
The problem could be that the namenode was formatted after the cluster was set up and the datanodes were not, so the slaves are still referring to the old namenode.
We have to delete and recreate the folder /home/hadoop/dfs/data on the local filesystem for the datanode.
Check your hdfs-site.xml file to see where dfs.data.dir is pointing to
and delete that folder
and then restart the datanode daemon on the machine
The steps above should recreate the folder and resolve the problem.
Please share your config info if the instructions above do not work.
DataNode dies because of incompatible Clusterids. To fix this problem
If you are using hadoop 2.X, then you have to delete everything in the folder that you have specified in hdfs-site.xml - "dfs.datanode.data.dir" (but NOT the folder itself).
The ClusterID will be maintained in that folder. Delete and restart dfs.sh. This should work!!!
You need to delete both
C:\hadoop\data\dfs\datanode and
C:\hadoop\data\dfs\namenode folders.
If you don't have this folders - open your C:\hadoop\etc\hadoop\hdfs-site.xml file and get paths for this folders for next deletion. For me it says:
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/hadoop/data/dfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/hadoop/data/dfs/datanode</value>
</property>
Run command for Format namenodec:\hadoop\bin>hdfs namenode -format
Now it should work!
I think the recommended way of doing this without deleting the data directory is to simply change the clusterID variable in the datanode's VERSION file.
If you look in your daemons directory, you will see the datanode directory exmaple
data/hadoop/daemons/datanode
The VERSION file should look like this.
cat current/VERSION
#Tue Oct 14 17:31:58 CDT 2014
storageID=DS-23bf7f3a-085c-4531-808f-801ff6d52d14
clusterID=CID-bb3547b0-03e4-4588-ac25-f0299ff81e4f
cTime=0
datanodeUuid=63154929-ae68-4149-9f75-9a6558545041
storageType=DATA_NODE
layoutVersion=-55
You need to change the clusterId to the first value in the output of the message so in your case that would be CID-1745a89c-fb08-40f0-a14d-d37d01f199c3 instead of CID-bb3547b0-03e4-4588-ac25-f0299ff81e4f
The updated version should appear like this with the altered clusterId
cat current/VERSION
#Tue Oct 14 17:31:58 CDT 2014
storageID=DS-23bf7f3a-085c-4531-808f-801ff6d52d14
clusterID=CID-1745a89c-fb08-40f0-a14d-d37d01f199c3
cTime=0
datanodeUuid=63154929-ae68-4149-9f75-9a6558545041
storageType=DATA_NODE
layoutVersion=-55
Restart hadoop and the datanode should start just fine.

Resources