DataNode shuts down automatically with error "WARN datanode.DataNode: Exiting Datanode" - hadoop

I am receiving the error below for the data node; the resource manager also shuts down automatically.
2021-05-05 01:13:32,029 WARN common.Storage: Failed to add storage directory [DISK]file:/C:/hadoop/data/datanode
java.io.IOException: Incompatible clusterIDs in C:\hadoop\data\datanode: namenode clusterID = CID-c6716736-36ba-454d-840a-ef8d77ac52c3; datanode clusterID = CID-b5548c69-4ac5-46ab-a413-11436c51ffad
at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:736)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadStorageDirectory(DataStorage.java:294)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadDataStorage(DataStorage.java:407)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:387)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:551)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1705)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1665)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:390)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:280)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:816)
at java.lang.Thread.run(Thread.java:748)
2021-05-05 01:13:32,029 ERROR datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid c9a3733b-f995-4ed3-8441-b3dd2fd03b7c) service to localhost/127.0.0.1:9000. Exiting.
java.io.IOException: All specified directories have failed to load.
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:552)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1705)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1665)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:390)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:280)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:816)
at java.lang.Thread.run(Thread.java:748)
2021-05-05 01:13:32,029 WARN datanode.DataNode: Ending block pool service for: Block pool <registering> (Datanode Uuid c9a3733b-f995-4ed3-8441-b3dd2fd03b7c) service to localhost/127.0.0.1:9000
2021-05-05 01:13:32,129 INFO datanode.DataNode: Removed Block pool <registering> (Datanode Uuid c9a3733b-f995-4ed3-8441-b3dd2fd03b7c)
2021-05-05 01:13:34,135 WARN datanode.DataNode: Exiting Datanode
2021-05-05 01:13:34,150 INFO datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
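The line that matters in this log is the `Incompatible clusterIDs` exception: the namenode and the datanode storage directory carry different cluster IDs, so the datanode rejects the directory and exits. As a rough sketch (Python, not part of Hadoop; the function name `find_cluster_id_mismatch` and the regular expression are assumptions based only on the message format shown above), the two IDs can be pulled out of a datanode log like this:

```python
import re
from typing import Optional, Tuple

# Pattern for the "Incompatible clusterIDs" message as it appears in the log above.
# This is an assumption derived from that one message, not an official format.
CLUSTER_ID_MISMATCH = re.compile(
    r"Incompatible clusterIDs in (?P<dir>.+?): "
    r"namenode clusterID = (?P<namenode>CID-[0-9a-f-]+); "
    r"datanode clusterID = (?P<datanode>CID-[0-9a-f-]+)"
)


def find_cluster_id_mismatch(log_text: str) -> Optional[Tuple[str, str, str]]:
    """Return (storage_dir, namenode_id, datanode_id) for the first mismatch, or None."""
    match = CLUSTER_ID_MISMATCH.search(log_text)
    if match is None:
        return None
    return match.group("dir"), match.group("namenode"), match.group("datanode")
```

Passing the contents of the datanode log to `find_cluster_id_mismatch` returns the storage directory and both IDs, which makes it easy to see which `VERSION` file disagrees.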

Related

Hadoop Exception: All specified directories are failed to load

When I started the Hadoop cluster, the following exception was thrown. I have no idea how to solve it. Can anyone help me? Thanks.
2017-07-10 09:40:58,960 WARN org.apache.hadoop.hdfs.server.common.Storage: java.io.IOException: Incompatible clusterIDs in /tools/hadoop/hadoop_storage/hdfs/datanode: namenode clusterID = CID-47191263-b5b7-4a4d-b8b5-a78b782e66bb; datanode clusterID = CID-79a53373-9652-4c08-9735-b5972e0450ca
2017-07-10 09:40:58,960 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid unassigned) service to localhost/127.0.0.1:54310. Exiting.
java.io.IOException: All specified directories are failed to load.
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:478)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1358)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1323)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:317)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:223)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:802)
at java.lang.Thread.run(Thread.java:745)
2017-07-10 09:40:58,961 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool <registering> (Datanode Uuid unassigned) service to localhost/127.0.0.1:54310
2017-07-10 09:40:58,962 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool <registering> (Datanode Uuid unassigned)
2017-07-10 09:41:00,962 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2017-07-10 09:41:00,964 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0
2017-07-10 09:41:00,966 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
Perhaps you formatted your cluster more than once, which generated different cluster IDs on the master node and the data node.
Your namenode and datanode cluster IDs do not match; you need to make them the same.
On the name node, you can change the cluster ID in the file located at:
$ nano HADOOP_FILE_SYSTEM/namenode/current/VERSION
On the data node, the cluster ID is stored in the file:
$ nano HADOOP_FILE_SYSTEM/datanode/current/VERSION
However you change the ID, make sure it is the same on all nodes of the cluster.
@VanThaoNguyen is correct.
In my case the directories are:
/installation directory/hdata/dfs/name/current
/installation directory/hdata/dfs/data/current
The value of
clusterID=xxxx-xxxx-xxxx-xxxx
should be the same for the name node and the data node.
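Before editing anything by hand, the comparison described in the answers above can be sketched in a few lines of Python. This is only an illustration: the helper names `read_version_file` and `cluster_ids_match` are made up here, and the sketch assumes the `VERSION` files are plain `key=value` text with `#` comment lines.

```python
from pathlib import Path
from typing import Dict


def read_version_file(path: str) -> Dict[str, str]:
    """Parse a storage VERSION file, assumed to hold key=value lines and '#' comments."""
    properties: Dict[str, str] = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        # Skip blank lines, comment lines, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        properties[key.strip()] = value.strip()
    return properties


def cluster_ids_match(namenode_version: str, datanode_version: str) -> bool:
    """Return True only if both files define clusterID and the values are equal."""
    nn_id = read_version_file(namenode_version).get("clusterID")
    dn_id = read_version_file(datanode_version).get("clusterID")
    return nn_id is not None and nn_id == dn_id
```

For example, `cluster_ids_match("HADOOP_FILE_SYSTEM/namenode/current/VERSION", "HADOOP_FILE_SYSTEM/datanode/current/VERSION")` would return `False` in the situation shown in the log, where the two IDs differ.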

Datanode is Not Starting in Hadoop after Kerberos Auth

I have given permission to /app/hadoop/tmp/dfs/data.
WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Invalid dfs.datanode.data.dir /app/hadoop/tmp/dfs/data :
EPERM: Operation not permitted
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmodImpl(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmod(NativeIO.java:230)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:727)
at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:502)
at org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:140)
at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:156)
at org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:2341)
at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2383)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2365)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2257)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2304)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2481)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2505)
2017-03-14 20:10:51,169 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in secureMain
java.io.IOException: All directories in dfs.datanode.data.dir are invalid: "/app/hadoop/tmp/dfs/data/"
at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2392)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2365)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2257)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2304)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2481)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2505)
2017-03-14 20:10:51,172 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2017-03-14 20:10:51,174 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
Does the owner of that directory on the local file system match the service name in the Kerberos principal the datanode is now using? If the principal is hdfs/, then the directory (and everything under it) should be owned by hdfs.
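One way to check the ownership described above is to walk the data directory and list everything that is not owned by the expected user. The following is a minimal sketch for Unix-like systems; the function name `paths_not_owned_by` is hypothetical, and the assumption that the expected owner is the local `hdfs` user comes from the answer above, not from the log.

```python
import os
from typing import List


def paths_not_owned_by(root: str, expected_uid: int) -> List[str]:
    """Return root and every path under it whose owner uid differs from expected_uid."""
    mismatched: List[str] = []
    if os.lstat(root).st_uid != expected_uid:
        mismatched.append(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat inspects the entry itself and does not follow symlinks.
            if os.lstat(path).st_uid != expected_uid:
                mismatched.append(path)
    return mismatched
```

The uid for a user name can be looked up with the standard `pwd` module, for example `paths_not_owned_by("/app/hadoop/tmp/dfs/data", pwd.getpwnam("hdfs").pw_uid)`. An empty list means the whole tree is owned by that user.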

hadoop can't start ./sbin/start-dfs.sh

I started the shell script ( ./sbin/start-dfs.sh )
jps
3098 Jps
2492 NameNode
2700 SecondaryNameNode
hadoop-datanode-log
2017-02-15 15:55:12,787 WARN org.apache.hadoop.hdfs.server.common.Storage: Failed to add storage directory [DISK]file:/usr/local/Cellar/hadoop/2.7.3/libexec/%3E/data/hadoop/hdfs/datanode/
java.io.IOException: Incompatible clusterIDs in /usr/local/Cellar/hadoop/2.7.3/libexec/>/data/hadoop/hdfs/datanode: namenode clusterID = CID-4c9d5df1-10c6-45cb-9fe0-e1631e4d13e2; datanode clusterID = CID-6dc3d755-f713-4bec-a62a-c47e96dcbc0d
at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:775)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadStorageDirectory(DataStorage.java:300)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadDataStorage(DataStorage.java:416)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:395)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:573)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1362)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1327)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:317)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:223)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:802)
at java.lang.Thread.run(Thread.java:745)
2017-02-15 15:55:12,792 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid unassigned) service to localhost/127.0.0.1:9000. Exiting.
java.io.IOException: All specified directories are failed to load.
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:574)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1362)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1327)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:317)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:223)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:802)
at java.lang.Thread.run(Thread.java:745)
2017-02-15 15:55:12,793 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool <registering> (Datanode Uuid unassigned) service to localhost/127.0.0.1:9000
2017-02-15 15:55:12,799 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool <registering> (Datanode Uuid unassigned)
2017-02-15 15:55:14,800 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2017-02-15 15:55:14,802 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0
2017-02-15 15:55:14,803 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
It looks like you formatted the namenode on a working cluster.
Delete the data directories and start the datanode process again on all nodes.
rm -rf <dfs.datanode.data.dir>
./sbin/hadoop-daemon.sh start datanode
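The `<dfs.datanode.data.dir>` placeholder above stands for whatever that property is set to in `hdfs-site.xml`. As a hedged sketch, the value can be read with Python's standard XML parser; the function name `datanode_data_dirs` is made up here, and the sketch assumes the usual `<configuration>/<property>/<name>/<value>` layout of Hadoop configuration files. Printing the value first is worthwhile before running `rm -rf`, since deleting the directory discards the blocks stored on that datanode. Note also that the path in the log contains a literal `>` (`libexec/>/data/...`), which may point to a stray character in the configured value.

```python
import xml.etree.ElementTree as ET
from typing import List


def datanode_data_dirs(hdfs_site_xml: str) -> List[str]:
    """Return the comma-separated entries of dfs.datanode.data.dir, whitespace-stripped."""
    root = ET.parse(hdfs_site_xml).getroot()
    for prop in root.findall("property"):
        if prop.findtext("name", default="").strip() == "dfs.datanode.data.dir":
            value = prop.findtext("value", default="")
            return [entry.strip() for entry in value.split(",") if entry.strip()]
    # Property not set in this file; Hadoop would fall back to its default.
    return []
```

Calling `datanode_data_dirs` with the path to your own `hdfs-site.xml` returns the configured directories as a list, so an unexpected entry is easy to spot.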

Namenode starting but Datanode not starting

I get the following exception when I start the distributed file system. I am using hadoop 2.6.0.
2015-08-26 23:10:58,222 FATAL datanode.DataNode (DataNode.java:secureMain(2385)) - Exception in secureMain
java.net.UnknownHostException: IM1948-X0: IM1948-X0
at java.net.InetAddress.getLocalHost(InetAddress.java:1475)
at org.apache.hadoop.security.SecurityUtil.getLocalHostName(SecurityUtil.java:187)
at org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:207)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2153)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2202)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2378)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2402)
Caused by: java.net.UnknownHostException: IM1948-X0
at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:901)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1295)
at java.net.InetAddress.getLocalHost(InetAddress.java:1471)
2015-08-26 23:10:58,227 INFO util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
2015-08-26 23:10:58,229 INFO datanode.DataNode (StringUtils.java:run(659)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at java.net.UnknownHostException: IM1948-X0: IM1948-X0
Even deleting the hadoop/hdfs-data/current directory does not help; I also tried formatting the namenode, without success. This generally happens when I restart Hadoop.
To sum up, the datanode process is not running at all for the Hadoop cluster.
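The trace above fails inside `java.net.InetAddress.getLocalHost`, which means the machine could not resolve its own hostname (`IM1948-X0`) at that moment. A rough equivalent check can be made from Python, shown here as a sketch only: the helper names are hypothetical, and Python's resolver is not guaranteed to behave exactly like Java's, so treat the result as a hint rather than proof.

```python
import socket
from typing import Optional


def resolve_hostname(name: str) -> Optional[str]:
    """Return the IPv4 address that name resolves to, or None if the lookup fails."""
    try:
        return socket.gethostbyname(name)
    except socket.gaierror:
        return None


def local_hostname_resolves() -> bool:
    """Check whether this machine can resolve its own hostname."""
    return resolve_hostname(socket.gethostname()) is not None
```

If `local_hostname_resolves()` returns `False`, the hostname has no usable entry in the hosts file or DNS as seen by this machine, which would be consistent with the `UnknownHostException` in the log.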

DataNode daemon not running on CDH 4.2.1 in pseudo-distributed mode

I am running hadoop-2.0.0-cdh4.2.1 on CentOS in pseudo-distributed mode. When I run sudo jps, I don't see the datanode daemon up and running.
Below is the error log that I found in the log file http://localhost:50070/logs/hadoop-hdfs-datanode-localhost.localdomain.log
on the NameNode:
2015-05-12 04:35:26,319 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool BP-539882958-127.0.0.1-1386722652683 (storage id DS-1842390259-127.0.0.1-50010-1431419699539) service to /0.0.0.0:8020 beginning handshake with NN
2015-05-12 04:35:28,573 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool BP-539882958-127.0.0.1-1386722652683 (storage id DS-1842390259-127.0.0.1-50010-1431419699539) service to 0.0.0.0/0.0.0.0:8020
java.io.IOException: Failed on local exception: java.io.IOException: Connection reset by peer; Host Details : local host is: "localhost.localdomain/127.0.0.1"; destination host is: "0.0.0.0":8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:760)
at org.apache.hadoop.ipc.Client.call(Client.java:1229)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
at $Proxy10.registerDatanode(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
at $Proxy10.registerDatanode(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.registerDatanode(DatanodeProtocolClientSideTranslatorPB.java:149)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.register(BPServiceActor.java:619)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:221)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:660)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcher.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:198)
at sun.nio.ch.IOUtil.read(IOUtil.java:171)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:243)
at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:56)
at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:143)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:156)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:129)
at java.io.FilterInputStream.read(FilterInputStream.java:116)
at java.io.FilterInputStream.read(FilterInputStream.java:116)
at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:409)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
at java.io.FilterInputStream.read(FilterInputStream.java:66)
at com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:276)
at com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:760)
at com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:288)
at com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:752)
at org.apache.hadoop.ipc.protobuf.RpcPayloadHeaderProtos$RpcResponseHeaderProto.parseDelimitedFrom(RpcPayloadHeaderProtos.java:985)
at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:938)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:836)
2015-05-12 04:35:28,578 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool BP-539882958-127.0.0.1-1386722652683 (storage id DS-1842390259-127.0.0.1-50010-1431419699539) service to 0.0.0.0/0.0.0.0:8020
2015-05-12 04:35:28,595 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool BP-539882958-127.0.0.1-1386722652683 (storage id DS-1842390259-127.0.0.1-50010-1431419699539)
2015-05-12 04:35:28,595 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Removed bpid=BP-539882958-127.0.0.1-1386722652683 from blockPoolScannerMap
2015-05-12 04:35:28,595 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Removing block pool BP-539882958-127.0.0.1-1386722652683
2015-05-12 04:35:30,597 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2015-05-12 04:35:30,600 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0
2015-05-12 04:35:30,603 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at localhost.localdomain/127.0.0.1
************************************************************/
