java.io.IOException: Incompatible clusterIDs - hadoop

I am installing Hadoop 2.7.2 (1 master NameNode, 1 secondary NameNode, 3 DataNodes) and cannot start the DataNodes.
After troubleshooting the logs (see below), the fatal error turns out to be a clusterID mismatch. Easy, I thought: just change the IDs.
Wrong: when I check the VERSION files on the NameNode and the DataNodes, they are identical.
So the question is simple: where does the NameNode clusterID shown in the log file come from?
LOG FILE:
WARN org.apache.hadoop.hdfs.server.common.Storage: java.io.IOException: Incompatible clusterIDs in /home/hduser/mydata/hdfs/datanode: namenode clusterID = CID-8e09ff25-80fb-4834-878b-f23b3deb62d0; datanode clusterID = CID-cd85e59a-ed4a-4516-b2ef-67e213cfa2a1
org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid unassigned) service to master/172.XX.XX.XX:9000. Exiting.
java.io.IOException: All specified directories are failed to load.
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:478)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1358)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1323)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:317)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:223)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:802)
at java.lang.Thread.run(Thread.java:745)
WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool <registering> (Datanode Uuid unassigned) service to master/172.XX.XX.XX:9000
INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool <registering> (Datanode Uuid unassigned)
WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
Copy of the VERSION files
On the master:
storageID=DS-f72f5710-a869-489d-9f52-40dadc659937
clusterID=CID-cd85e59a-ed4a-4516-b2ef-67e213cfa2a1
cTime=0
datanodeUuid=54bc8b80-b84f-4893-8b96-36568acc5d4b
storageType=DATA_NODE
layoutVersion=-56
On the DataNode:
storageID=DS-f72f5710-a869-489d-9f52-40dadc659937
clusterID=CID-cd85e59a-ed4a-4516-b2ef-67e213cfa2a1
cTime=0
datanodeUuid=54bc8b80-b84f-4893-8b96-36568acc5d4b
storageType=DATA_NODE
layoutVersion=-56

To summarize (and close) this issue, here is how I fixed it.
On the master and the secondary NameNode, the NameNode VERSION file is under ~/.../namenode/current/VERSION.
For the DataNodes the path is different; it should look something like ~/.../datanode/current/VERSION.
The clusterID in these two VERSION files must be identical.
Hope it helps!
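As a quick check of the point above, here is a minimal sketch that reads the clusterID from two VERSION files and reports whether they match. The function names cluster_id_of and same_cluster_id are made up for illustration, and the sketch assumes each VERSION file contains a single clusterID=... line, as in the copies shown in the question.

```shell
# Print the clusterID stored in a Hadoop VERSION file.
# Assumes the file holds one line of the form "clusterID=CID-...".
cluster_id_of() {
  sed -n 's/^clusterID=//p' "$1"
}

# Compare the clusterID of a NameNode VERSION file ($1) with that of a
# DataNode VERSION file ($2). Prints both IDs and returns 0 only on a match.
same_cluster_id() {
  local nn_id dn_id
  nn_id=$(cluster_id_of "$1")
  dn_id=$(cluster_id_of "$2")
  echo "namenode: ${nn_id}"
  echo "datanode: ${dn_id}"
  [ -n "$nn_id" ] && [ "$nn_id" = "$dn_id" ]
}
```

Call it with the NameNode file first, for example same_cluster_id ~/.../namenode/current/VERSION ~/.../datanode/current/VERSION. A file whose storageType is DATA_NODE is a DataNode VERSION file, even if it sits on the master.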

I also faced the same issue while installing 2.7.2: the DataNode would not come up. The error shown in the DataNode log file is:
java.io.IOException: Incompatible clusterIDs in
/home/prassanna/usr/local/hadoop/yarn_data/hdfs/datanode: namenode
clusterID = CID-XXX; datanode
clusterID = CID-YYY
What I did was:
HADOOP_DIR/bin/hadoop namenode -format -clusterID CID-YYY
(No quotes are required around the cluster ID.)

Just to add one more thing.
First, stop DFS and delete the namenode and datanode directories specified in hdfs-site.xml.
After that, open the ../namenode/current/VERSION file, copy the clusterID, and replace the clusterID in the ../datanode/current/VERSION file with the copied value.

Related

Hadoop Exception: All specified directories are failed to load

When I started the Hadoop cluster, the following exception was thrown. I have no idea how to solve it. Can anyone help? Thanks.
2017-07-10 09:40:58,960 WARN org.apache.hadoop.hdfs.server.common.Storage: java.io.IOException: Incompatible clusterIDs in /tools/hadoop/hadoop_storage/hdfs/datanode: namenode clusterID = CID-47191263-b5b7-4a4d-b8b5-a78b782e66bb; datanode clusterID = CID-79a53373-9652-4c08-9735-b5972e0450ca
2017-07-10 09:40:58,960 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid unassigned) service to localhost/127.0.0.1:54310. Exiting.
java.io.IOException: All specified directories are failed to load.
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:478)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1358)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1323)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:317)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:223)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:802)
at java.lang.Thread.run(Thread.java:745)
2017-07-10 09:40:58,961 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool <registering> (Datanode Uuid unassigned) service to localhost/127.0.0.1:54310
2017-07-10 09:40:58,962 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool <registering> (Datanode Uuid unassigned)
2017-07-10 09:41:00,962 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2017-07-10 09:41:00,964 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0
2017-07-10 09:41:00,966 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
Perhaps you formatted your cluster more than once, which generated different cluster IDs on the master node and the data node.
Your namenode and datanode cluster IDs do not match; you need to make them the same.
On the name node, the cluster ID is stored in the file:
$ nano HADOOP_FILE_SYSTEM/namenode/current/VERSION
On the data node, the cluster ID is stored in the file:
$ nano HADOOP_FILE_SYSTEM/datanode/current/VERSION
Whichever way you change the ID, make sure it is the same on all of the cluster's nodes.
@VanThaoNguyen is correct.
In my case:
/installation directory/hdata/dfs/name/current
/installation directory/hdata/dfs/data/current
clusterID=xxxx-xxxx-xxxx-xxxx
should be the same for the name node and the data node.
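To see which two IDs the DataNode actually compared, both can be pulled straight out of the "Incompatible clusterIDs" warning. The following is a small sketch, not an official tool; the function name incompatible_ids is invented here, and it assumes the message format shown in the log above (namenode clusterID = ...; datanode clusterID = ...).

```shell
# Read DataNode log lines on stdin and, for every "Incompatible clusterIDs"
# message, print "<namenode clusterID> <datanode clusterID>" on one line.
incompatible_ids() {
  sed -n 's/.*namenode clusterID = \([^;]*\); datanode clusterID = \([^ ]*\).*/\1 \2/p'
}
```

For example, feed it the DataNode log file on stdin: incompatible_ids < your-datanode.log (the log file name depends on your installation). The first ID printed should be the one in the NameNode's current/VERSION file, the second the one in the DataNode's.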

ha hdfs : Initialization failed for Block pool <registering> (Datanode Uuid unassigned)

I get the following error trying to start datanodes in HA HDFS cluster
2016-01-06 22:54:58,064 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory [DISK]file:/home/data/hdfs/dn/ has already been used.
2016-01-06 22:54:58,082 INFO org.apache.hadoop.hdfs.server.common.Storage: Analyzing storage directories for bpid BP-1354640905-10.146.52.232-1452117061014
2016-01-06 22:54:58,083 WARN org.apache.hadoop.hdfs.server.common.Storage: Failed to analyze storage directories for block pool BP-1354640905-10.146.52.232-1452117061014
java.io.IOException: BlockPoolSliceStorage.recoverTransitionRead: attempt to load an used block storage: /home/data/hdfs/dn/current/BP-1354640905-10.146.52.232-1452117061014
at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadBpStorageDirectories(BlockPoolSliceStorage.java:210)
at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.recoverTransitionRead(BlockPoolSliceStorage.java:242)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:396)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:477)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1338)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1304)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:314)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:226)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:867)
at java.lang.Thread.run(Thread.java:745)
2016-01-06 22:54:58,084 WARN org.apache.hadoop.hdfs.server.common.Storage: Failed to add storage for block pool: BP-1354640905-10.146.52.232-1452117061014 : BlockPoolSliceStorage.recoverTransitionRead: attempt to load an used block storage: /home/data/hdfs/dn/current/BP-1354640905-10.146.52.232-1452117061014
2016-01-06 22:54:58,084 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool (Datanode Uuid unassigned) service to master3/10.146.52.232:8020. Exiting.
java.io.IOException: All specified directories are failed to load.
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:478)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1338)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1304)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:314)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:226)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:867)
at java.lang.Thread.run(Thread.java:745)
2016-01-06 22:54:58,084 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool (Datanode Uuid unassigned) service to master3/10.146.52.232:8020
2016-01-06 22:54:58,084 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool (Datanode Uuid unassigned) service to master2/10.146.52.231:8020. Exiting.
org.apache.hadoop.util.DiskChecker$DiskErrorException: Invalid volume failure config value: 3
at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.<init>(FsDatasetImpl.java:261)
at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetFactory.newInstance(FsDatasetFactory.java:34)
at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetFactory.newInstance(FsDatasetFactory.java:30)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1351)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1304)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:314)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:226)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:867)
at java.lang.Thread.run(Thread.java:745)
2016-01-06 22:54:58,085 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool (Datanode Uuid unassigned) service to master2/10.146.52.231:8020
2016-01-06 22:54:58,185 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool (Datanode Uuid unassigned)
I have already checked the cluster IDs on the namenode and the datanodes, and they match.
I have tried reformatting everything several times.
Thanks for your help !
I have seen messages like this in the log file when the file system for the DataNode is corrupt. Try running fsck -y on each of the disks used by the DataNode. In your case:
fsck -y /home/data/hdfs
Once the disks are clean, you should be able to start the DataNode. The NameNode will then ensure that the replication factor is restored for any lost blocks.
I had a similar problem (I can't be sure without more logs, and mine didn't say "Datanode Uuid unassigned"), and fsck didn't solve it.
In my case, I had moved a subset of disks from one node to another node that already had disks, and disabled the old node, so the moved disks did not match the DatanodeUuid of the new machine.
Above those lines in the log, there were entries like:
2016-04-11 19:32:02,991 WARN org.apache.hadoop.hdfs.server.common.Storage: org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /archive14/dfs/data is in an inconsistent state: Root /archive14/dfs/data: DatanodeUuid=5ba6418e-2c24-4582-8225-3e7f7fff9feb, does not match 519c1e34-a573-41f7-9e80-dca606fce704 from other StorageDirectory.
To solve that, I ran:
sed -i -r "s/${olduuid}/${newuuid}/" /mountpoints*/dfs/data/current/VERSION
This replaces the old UUID in the VERSION files with the new one. After that, starting the datanode works.
Maybe in your case, you had a missing UUID rather than an incorrect one.
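Before rewriting anything, it can help to list which datanodeUuid each storage directory carries. The sketch below does that for any set of VERSION files; the function names datanode_uuids and distinct_uuid_count are hypothetical, and it assumes each file has one datanodeUuid=... line like the DataNode VERSION files quoted earlier.

```shell
# Print "<datanodeUuid> <file>" for each VERSION file passed as an argument,
# so a disk that came from another node stands out.
datanode_uuids() {
  local f
  for f in "$@"; do
    printf '%s %s\n' "$(sed -n 's/^datanodeUuid=//p' "$f")" "$f"
  done
}

# Print how many distinct datanodeUuid values appear across the files.
# A healthy DataNode should report 1.
distinct_uuid_count() {
  datanode_uuids "$@" | cut -d' ' -f1 | sort -u | wc -l | tr -d ' '
}
```

With the layout from this answer, that would be something like datanode_uuids /mountpoints*/dfs/data/current/VERSION.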
Deleting the name node directory and the data node directory and then creating new directories worked for me. Use this technique only if you accept that you will lose the data.
In my case, I reinstalled HDFS with CM 6.2.0 and set up two namenodes for HA.
I then reformatted each of the namenodes separately, which caused the error below.
Initialization failed for Block pool BP-666417012-10.253.76.213-1557044865448 (Datanode Uuid 5132035c-8d6a-4617-af7e-7d07355a905b) service to hzd-t-vbdl-02/10.253.76.222:8022 Blockpool ID mismatch: previously connected to Blockpool ID BP-666417012-10.253.76.213-1557044865448 but now connected to Blockpool ID BP-1262695848-10.253.76.222-1557045124181
How I handled it:
ansible all -m shell -a " more /XXX/hdfs/dfs/nn/current/VERSION "
hzd-t-vbdl-01 | CHANGED | rc=0 >>
Sun May 05 16:27:45 CST 2019
namespaceID=732385684
clusterID=cluster54
cTime=1557044865448
storageType=NAME_NODE
blockpoolID=BP-666417012-10.253.76.213-1557044865448
layoutVersion=-64
hzd-t-vbdl-02 | CHANGED | rc=0 >>
Sun May 05 16:32:04 CST 2019
namespaceID=892287385
clusterID=cluster54
cTime=1557045124181
storageType=NAME_NODE
blockpoolID=BP-1262695848-10.253.76.222-1557045124181
layoutVersion=-64
Finally, copy the contents from hzd-t-vbdl-01 (the one formatted first) to hzd-t-vbdl-02, and restart the namenodes and datanodes.
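The two VERSION listings above differ in namespaceID and blockpoolID, which is what the "Blockpool ID mismatch" message reflects. As a rough sketch, the comparison can be scripted as follows; compare_nn_versions is a made-up helper name, and it assumes both files use the key=value layout shown above.

```shell
# Compare the identity fields of two NameNode VERSION files ($1 and $2)
# and print one line per field saying whether the values agree.
compare_nn_versions() {
  local key a b
  for key in namespaceID clusterID blockpoolID; do
    a=$(sed -n "s/^${key}=//p" "$1")
    b=$(sed -n "s/^${key}=//p" "$2")
    if [ "$a" = "$b" ]; then
      echo "${key}: match"
    else
      echo "${key}: MISMATCH (${a} vs ${b})"
    fi
  done
}
```

In an HA pair, all three fields are expected to match on both namenodes.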

Hadoop Multinode cluster. Data node not working properly

I'm deploying Hadoop as a multi-node cluster (distributed mode), but each data node has a different cluster ID.
On slave1,
java.io.IOException: Incompatible clusterIDs in /home/pushuser1/hadoop/tmp/dfs/data: namenode clusterID = CID-c72a7d30-ec64-4e4f-9a80-e6f9b6b1d78c; datanode clusterID = CID-2ecca585-6672-476e-9931-4cfef9946c3b
On slave2,
java.io.IOException: Incompatible clusterIDs in /home/pushuser1/hadoop/tmp/dfs/data: namenode clusterID = CID-c72a7d30-ec64-4e4f-9a80-e6f9b6b1d78c; datanode clusterID = CID-e24b0548-2d8d-4aa4-9b8c-a336193c006e
I also followed the linked question "Datanode not starts correctly", but I don't know which cluster ID I should pick. If I pick one of them, the data node starts on that machine but not on the other one. Also, when I format the namenode using the basic command (hadoop namenode -format), the datanodes on each slave node start, but then the namenode on the master machine doesn't start.
The clusterIDs of the datanodes and the namenode must match; only then can the datanodes communicate with the namenode. If you format the namenode, a new clusterID is assigned to it, and the clusterIDs stored on the datanodes no longer match.
You can find a VERSION file containing the clusterID in /home/pushuser1/hadoop/tmp/dfs/data/current/ (the datanode directory) as well as in the namenode directory (/home/pushuser1/hadoop/tmp/dfs/name/current/, based on the value you specified for dfs.namenode.name.dir).
If you are ready to format your HDFS namenode, stop all HDFS services and clear out all files inside the following directories:
rm -rf /home/pushuser1/hadoop/tmp/dfs/data/* (run this on all data nodes)
rm -rf /home/pushuser1/hadoop/tmp/dfs/name/*
Then format HDFS again (hadoop namenode -format).
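The order of those steps matters, so here is a hedged sketch that strings them together behind a dry-run switch. The helpers run and reset_hdfs and the DRY_RUN variable are inventions for this example; it uses hdfs namenode -format (the newer spelling of the same command), assumes stop-dfs.sh and start-dfs.sh are on the PATH, and only clears directories on the machine it runs on, so the data directory step still has to be repeated on every data node. Running it for real destroys all HDFS data.

```shell
# run CMD...: print the command when DRY_RUN=1 (the default), else execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# reset_hdfs NAME_DIR DATA_DIR: stop HDFS, empty both directories, reformat
# the namenode, and start HDFS again. Stops at the first failing step.
reset_hdfs() {
  local name_dir="$1" data_dir="$2"
  run stop-dfs.sh &&
  run find "$data_dir" -mindepth 1 -delete &&   # repeat on every data node
  run find "$name_dir" -mindepth 1 -delete &&
  run hdfs namenode -format &&
  run start-dfs.sh
}
```

With the paths from this answer, reset_hdfs /home/pushuser1/hadoop/tmp/dfs/name /home/pushuser1/hadoop/tmp/dfs/data prints the plan; setting DRY_RUN=0 would execute it.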

invalid last txid in stream

I am trying to configure hadoop namenode HA and resourcemanager HA as well. However, when I start namenode as standby, I got IllegalArgumentException as below:
=====================================================
About to bootstrap Standby ID nn2 from:
Nameservice ID: mycluster
Other Namenode ID: nn1
Other NN's HTTP address: http://my1.namenode.com:50070
Other NN's IPC address: my1.namenode.com/xxx.xxx.xxx.xxx:8020
Namespace ID: 1915209867
Block pool ID: BP-740716617-xxx.xxx.xxx.xxx-1409206617148
Cluster ID: CID-51cea219-ffe7-4a52-8a6c-fb83d501ccaa
Layout version: -56
=====================================================
Data exists in Storage Directory /hadoop1/hadoop/hdfs/nn. Formatting anyway.
14/11/05 16:41:20 INFO common.Storage: Storage directory /hadoop1/hadoop/hdfs/nn has been successfully formatted.
14/11/05 16:41:20 WARN common.Util: Path /hadoop1/hadoop/hdfs/nn should be specified as a URI in configuration files. Please update hdfs configuration.
14/11/05 16:41:20 WARN common.Util: Path /hadoop1/hadoop/hdfs/nn should be specified as a URI in configuration files. Please update hdfs configuration.
14/11/05 16:41:21 FATAL namenode.NameNode: Exception in namenode join
java.io.IOException: java.lang.IllegalArgumentException: invalid last txid in stream: http://my3.namenode.com:8480/getJournal?jid=mycluster&segmentTxId=74823&storageInfo=-56%3A1915209867%3A0%3ACID-51cea219-ffe7-4a52-8a6c-fb83d501ccaa
at org.apache.hadoop.hdfs.server.namenode.ha.BootstrapStandby.run(BootstrapStandby.java:317)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1306)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1395)
Caused by: java.lang.IllegalArgumentException: invalid last txid in stream: http://my3.namenode.com:8480/getJournal?jid=mycluster&segmentTxId=74823&storageInfo=-56%3A1915209867%3A0%3ACID-51cea219-ffe7-4a52-8a6c-fb83d501ccaa
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:115)
at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.<init>(RedundantEditLogInputStream.java:101)
at org.apache.hadoop.hdfs.server.namenode.JournalSet.chainAndMakeRedundantStreams(JournalSet.java:300)
at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.selectInputStreams(QuorumJournalManager.java:494)
at org.apache.hadoop.hdfs.server.namenode.JournalSet.selectInputStreams(JournalSet.java:260)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1399)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1418)
at org.apache.hadoop.hdfs.server.namenode.ha.BootstrapStandby.checkLogsAvailableForRead(BootstrapStandby.java:236)
at org.apache.hadoop.hdfs.server.namenode.ha.BootstrapStandby.doRun(BootstrapStandby.java:203)
at org.apache.hadoop.hdfs.server.namenode.ha.BootstrapStandby.access$000(BootstrapStandby.java:69)
at org.apache.hadoop.hdfs.server.namenode.ha.BootstrapStandby$1.run(BootstrapStandby.java:106)
at org.apache.hadoop.hdfs.server.namenode.ha.BootstrapStandby$1.run(BootstrapStandby.java:102)
at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:415)
at org.apache.hadoop.hdfs.server.namenode.ha.BootstrapStandby.run(BootstrapStandby.java:102)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.hdfs.server.namenode.ha.BootstrapStandby.run(BootstrapStandby.java:312)
... 2 more
14/11/05 16:41:21 INFO util.ExitUtil: Exiting with status 1
14/11/05 16:41:21 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at
************************************************************/
All others are working well and I've checked hdfs-site.xml configuration for the problem but I couldn't find anything.
Please help me...
Thank you
Restart the journalnode:
./sbin/hadoop-daemon.sh start journalnode

Datanode not starts correctly

I am trying to install Hadoop 2.2.0 in pseudo-distributed mode. When I try to start the datanode services, it shows the following error. Can anyone please tell me how to resolve this?
2014-03-11 08:48:15,916 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool <registering> (storage id unknown) service to localhost/127.0.0.1:9000 starting to offer service
2014-03-11 08:48:15,922 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2014-03-11 08:48:15,922 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 50020: starting
2014-03-11 08:48:16,406 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /home/prassanna/usr/local/hadoop/yarn_data/hdfs/datanode/in_use.lock acquired by nodename 3627#prassanna-Studio-1558
2014-03-11 08:48:16,426 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool BP-611836968-127.0.1.1-1394507838610 (storage id DS-1960076343-127.0.1.1-50010-1394127604582) service to localhost/127.0.0.1:9000
java.io.IOException: Incompatible clusterIDs in /home/prassanna/usr/local/hadoop/yarn_data/hdfs/datanode: namenode clusterID = CID-fb61aa70-4b15-470e-a1d0-12653e357a10; datanode clusterID = CID-8bf63244-0510-4db6-a949-8f74b50f2be9
at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:391)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:191)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:219)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:837)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:808)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:280)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:222)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:664)
at java.lang.Thread.run(Thread.java:662)
2014-03-11 08:48:16,427 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool BP-611836968-127.0.1.1-1394507838610 (storage id DS-1960076343-127.0.1.1-50010-1394127604582) service to localhost/127.0.0.1:9000
2014-03-11 08:48:16,532 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool BP-611836968-127.0.1.1-1394507838610 (storage id DS-1960076343-127.0.1.1-50010-1394127604582)
2014-03-11 08:48:18,532 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2014-03-11 08:48:18,534 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0
2014-03-11 08:48:18,536 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
You can use the following method.
Copy the datanode clusterID (in your example, CID-8bf63244-0510-4db6-a949-8f74b50f2be9)
and run the following command from the HADOOP_HOME/bin directory:
./hdfs namenode -format -clusterId CID-8bf63244-0510-4db6-a949-8f74b50f2be9
This formats the namenode with the datanode's cluster ID.
You must do the following:
bin/stop-all.sh
rm -Rf /home/prassanna/usr/local/hadoop/yarn_data/hdfs/*
bin/hadoop namenode -format
I had the same problem until I found an answer in this web site.
Whenever you get the error below when trying to start a DN on a slave machine:
java.io.IOException: Incompatible clusterIDs in /home/hadoop/dfs/data: namenode clusterID= ****; datanode clusterID = ****
It is because after you set up your cluster, you, for whatever reason, decided to reformat
your NN. Your DNs on slaves still bear reference to the old NN.
To resolve this, simply delete and recreate the data folder on that machine in the local Linux FS, namely /home/hadoop/dfs/data.
Restarting the DN daemon on that machine will recreate the content of the data/ folder and resolve
the problem.
Follow these simple steps:
Clear the Hadoop data directory
Format the namenode again
Start the cluster
After this, your cluster will start normally, provided you don't have any other configuration issue.
The DataNode dies because its clusterID is incompatible with the NameNode's. To fix this problem, delete the directory /tmp/hadoop-[user]/hdfs/data and restart Hadoop.
rm -r /tmp/hadoop-[user]/hdfs/data
I got a similar issue in my pseudo-distributed environment. I stopped the cluster first, then copied the cluster ID from the NameNode's VERSION file into the DataNode's VERSION file; after restarting the cluster, everything was fine.
My data paths are /usr/local/hadoop/hadoop_store/hdfs/datanode and /usr/local/hadoop/hadoop_store/hdfs/namenode.
FYI: the VERSION file is under /usr/local/hadoop/hadoop_store/hdfs/datanode/current/; likewise for the NameNode.
Here, the datanode stops immediately because the clusterIDs of the datanode and the namenode are different. So you have to format the namenode with the clusterID of the datanode.
Copy the datanode clusterID (in your example, CID-8bf63244-0510-4db6-a949-8f74b50f2be9) and run the following command from your home directory. You can go to your home directory by typing cd in your terminal.
From your home directory, type the command:
hdfs namenode -format -clusterId CID-8bf63244-0510-4db6-a949-8f74b50f2be9
Delete the namenode and datanode directories specified in core-site.xml.
After that, create the new directories and restart DFS and YARN.
I also had a similar issue.
I deleted the namenode and datanode folders from all the nodes, and reran:
$HADOOP_HOME/bin> hdfs namenode -format -force
$HADOOP_HOME/sbin> ./start-dfs.sh
$HADOOP_HOME/sbin> ./start-yarn.sh
To check the health report from the command line (which I would recommend):
$HADOOP_HOME/bin> hdfs dfsadmin -report
and I got all the nodes working correctly.
I had the same issue with Hadoop 2.7.7.
I removed the namenode/current and datanode/current directories on the namenode and all the datanodes,
removed the files under /tmp/hadoop-ubuntu/*,
then formatted the namenode and datanode
and restarted all the nodes.
After that, things worked fine.
Steps:
Stop all nodes/managers, then perform the steps below:
rm -rf /tmp/hadoop-ubuntu/* (all nodes)
rm -r /usr/local/hadoop/data/hdfs/namenode/current (namenode: check hdfs-site.xml for path)
rm -r /usr/local/hadoop/data/hdfs/datanode/current (datanode:check hdfs-site.xml for path)
hdfs namenode -format (on namenode)
hdfs datanode -format (on namenode)
Reboot the namenode and the data nodes
There have been different solutions to this problem, but I tested another easy one and it worked like a charm.
If you get the same error, you just need to replace the clusterID in the datanodes' VERSION file with the clusterID of the namenode.
In your case, here is where you can change it on the datanode side:
namenode clusterID = CID-fb61aa70-4b15-470e-a1d0-12653e357a10; datanode clusterID = CID-8bf63244-0510-4db6-a949-8f74b50f2be9
Back up the current VERSION file: cp /home/prassanna/usr/local/hadoop/yarn_data/hdfs/datanode/current/VERSION /home/prassanna/usr/local/hadoop/yarn_data/hdfs/datanode/current/VERSION.BK
Then run vim /home/prassanna/usr/local/hadoop/yarn_data/hdfs/datanode/current/VERSION and replace
clusterID=CID-8bf63244-0510-4db6-a949-8f74b50f2be9
with
clusterID=CID-fb61aa70-4b15-470e-a1d0-12653e357a10
Restart the datanode and it should work.
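The backup-and-edit steps above can also be scripted. This is only a sketch under a few assumptions: sync_cluster_id is a hypothetical helper name, the datanode is stopped while it runs, the clusterID contains no characters that are special to sed (true for the CID-... values shown here), and the NameNode's VERSION file is readable from the machine where the DataNode file lives.

```shell
# sync_cluster_id NN_VERSION DN_VERSION
# Copy the clusterID from the NameNode VERSION file into the DataNode VERSION
# file, keeping the previous DataNode file next to it as VERSION.BK.
sync_cluster_id() {
  local nn_version="$1" dn_version="$2" nn_id
  nn_id=$(sed -n 's/^clusterID=//p' "$nn_version")
  if [ -z "$nn_id" ]; then
    echo "no clusterID found in ${nn_version}" >&2
    return 1
  fi
  cp "$dn_version" "${dn_version}.BK" || return 1
  # Rewrite only the clusterID line; every other line is copied unchanged.
  sed "s/^clusterID=.*/clusterID=${nn_id}/" "${dn_version}.BK" > "$dn_version"
}
```

If the result is not what you expected, copying VERSION.BK back over VERSION restores the original file.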

Resources