Fatal disk error on datanode DataNode failed volumes: - hadoop

i am getting the following log on my namenode and its removing my datanode from execution
2013-02-08 03:25:54,345 WARN namenode.NameNode (NameNodeRpcServer.java:errorReport(825)) - Fatal disk error on xxx.xxx.xxx.xxx:50010: DataNode failed volumes:/home/srikmvm/hadoop-0.23.0/tmp/current;
2013-02-08 03:25:54,349 INFO net.NetworkTopology (NetworkTopology.java:remove(367)) - Removing a node: /default-rack/xxx.xxx.xxx.xxx:50010
Can anyone suggest how to rectify this ?
Data Node Logs:
2013-02-08 03:25:54,718 WARN datanode.DataNode (FSDataset.java:checkDirs(871)) - Removing failed volume /home/srikmvm/hadoop-0.23.0/tmp/current:
org.apache.hadoop.util.DiskChecker$DiskErrorException: can not create directory: /home/srikmvm/hadoop-0.23.0/tmp/current/BP-876979163-137.132.153.125-13602411944‌​23/current/finalized
at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:87)

Things that make be causing this error message:
Does the directory / path to the directory exist
Does the user under which the datanode process is running, have permissions to create / write to this directory
/home/srikmvm/hadoop-0.23.0/tmp/current/BP-876979163-137.132.153.125-13602411944‌​23/current/finalized

Related

ERROR namenode.NameNode (NameNode.java:main(1715)) - Failed to start namenode

I am trying to restart one of the namenode (nn2) but i get the following error in the logs:
2021-12-17 10:23:53,676 ERROR namenode.NameNode (NameNode.java:main(1715)) - Failed to start namenode.
org.apache.hadoop.hdfs.server.namenode.EditLogInputException: Error replaying edit log at offset 0. Expected transaction ID was 274488049
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:226)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:160)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:890)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1090)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:714)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:632)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:694)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:937)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:910)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1643)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1710)
Caused by: org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream$PrematureEOFException: got premature end-of-file at txid 274488048; expected file to go up to 274488109
at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:197)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.skipUntil(EditLogInputStream.java:151)
at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:179)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:213)
... 12 more
2021-12-17 10:23:53,678 INFO util.ExitUtil (ExitUtil.java:terminate(210)) - Exiting with status 1: org.apache.hadoop.hdfs.server.namenode.EditLogInputException: Error replaying edit log at offset 0. Expected transaction ID was 274488049
2021-12-17 10:23:53,681 INFO namenode.NameNode (LogAdapter.java:info(51)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at XX-XXX-XX-XXXX.XXXXX.XX/XX.X.XX.XX
************************************************************/
i tryied to do the following steps in order to solve the issue:
i copied from nn01 to the NameNode directories of nn02 the following logs
edits_0000000000274487928-0000000000274488048
edits_0000000000274488049-0000000000274488109
So far the nn02 is still not starting and i get the same error.
Can you please help?
If that is an HA setup, and your NN1 is working properly. Format your NN2(hdfs namenode -format) and do a bootstrap (hdfs namenode -bootstrapStandby)
Then try restarting the NN2.

HDFS No valid image files found

we have a old hadoop cluster machine hadoop - version 2.6
all machines in the cluster are redhat version - 7.3
we have a problem to start the standby name-node on the last master machine
from the logs ( under /var/log/hadoop/hdf ) , we can see the error - No valid image files found
I am not sure about my solution , but is its mean that we need to delete the files - ( edits_inprogress_XXXXX ) under /hadoop/hdfs/journal/hdfsha/current , and then restart the standby name-node service ?
2018-01-24 16:10:27,826 ERROR namenode.NameNode (NameNode.java:main(1774)) - Failed to start namenode.
java.io.FileNotFoundException: No valid image files found
at org.apache.hadoop.hdfs.server.namenode.FSImageTransactionalStorageInspector.getLatestImages(FSImageTransactionalStorageInspector.java:165)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:618)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:289)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1045)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:703)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:688)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:752)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:992)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:976)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1701)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1769)
2018-01-24 16:10:27,829 INFO util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
2018-01-24 16:10:27,845 INFO namenode.NameNode (LogAdapter.java:info(47)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master03.sys573.com/102.14.22.29
************************************************************/
I also try with hadoop namenode -format
but its return with the following:
Could not format one or more JournalNodes. 1 exceptions thrown:
:8485: Directory /data/hadoop/hdfs/journal/hdfsha is in an inconsistent state: Can't format the storage directory because the current directory is not empty.
all output:
18/01/24 17:36:19 ERROR namenode.NameNode: Failed to start namenode.
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Could not format one or more JournalNodes. 1 exceptions thrown:
:8485: Directory /data/hadoop/hdfs/journal/hdfsha is in an inconsistent state: Can't format the storage directory because the current directory is not empty.
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.checkEmptyCurrent(Storage.java:482)
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:558)
at org.apache.hadoop.hdfs.qjournal.server.JNStorage.format(JNStorage.java:185)
at org.apache.hadoop.hdfs.qjournal.server.Journal.format(Journal.java:217)
at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.format(JournalNodeRpcServer.java:145)
at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.format(QJournalProtocolServerSideTranslatorPB.java:145)
at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25419)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2345)

ha hdfs : Initialization failed for Block pool <registering> (Datanode Uuid unassigned)

I get the following error trying to start datanodes in HA HDFS cluster
2016-01-06 22:54:58,064 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory [DISK]file:/home/data/hdfs/dn/ has already been used.
2016-01-06 22:54:58,082 INFO org.apache.hadoop.hdfs.server.common.Storage: Analyzing storage directories for bpid BP-1354640905-10.146.52.232-1452117061014
2016-01-06 22:54:58,083 WARN org.apache.hadoop.hdfs.server.common.Storage: Failed to analyze storage directories for block pool BP-1354640905-10.146.52.232-1452117061014
java.io.IOException: BlockPoolSliceStorage.recoverTransitionRead: attempt to load an used block storage: /home/data/hdfs/dn/current/BP-1354640905-10.146.52.232-1452117061014
at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadBpStorageDirectories(BlockPoolSliceStorage.java:210)
at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.recoverTransitionRead(BlockPoolSliceStorage.java:242)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:396)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:477)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1338)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1304)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:314)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:226)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:867)
at java.lang.Thread.run(Thread.java:745)
2016-01-06 22:54:58,084 WARN org.apache.hadoop.hdfs.server.common.Storage: Failed to add storage for block pool: BP-1354640905-10.146.52.232-1452117061014 : BlockPoolSliceStorage.recoverTransitionRead: attempt to load an used block storage: /home/data/hdfs/dn/current/BP-1354640905-10.146.52.232-1452117061014
2016-01-06 22:54:58,084 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool (Datanode Uuid unassigned) service to master3/10.146.52.232:8020. Exiting.
java.io.IOException: All specified directories are failed to load.
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:478)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1338)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1304)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:314)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:226)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:867)
at java.lang.Thread.run(Thread.java:745)
2016-01-06 22:54:58,084 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool (Datanode Uuid unassigned) service to master3/10.146.52.232:8020
2016-01-06 22:54:58,084 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool (Datanode Uuid unassigned) service to master2/10.146.52.231:8020. Exiting.
org.apache.hadoop.util.DiskChecker$DiskErrorException: Invalid volume failure config value: 3
at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.(FsDatasetImpl.java:261)
at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetFactory.newInstance(FsDatasetFactory.java:34)
at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetFactory.newInstance(FsDatasetFactory.java:30)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1351)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1304)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:314)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:226)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:867)
at java.lang.Thread.run(Thread.java:745)
2016-01-06 22:54:58,085 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool (Datanode Uuid unassigned) service to master2/10.146.52.231:8020
2016-01-06 22:54:58,185 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool (Datanode Uuid unassigned)
I have already check the clusters ID in namenode and datanode and they are similar...
I tried to reformat everything several times...
Thanks for your help !
I have seen messages like this in the log file when the file system for the DataNode is corrupt. Perhaps, try running fsck -y on each of the disks used by the DataNode. In your case:
fsck -y /home/data/hdfs
Once the disk(s) is(are) clean you should be able to start the DataNode. The NameNode will work ensure that the replication factor is fixed for any lost blocks.
I had a similar problem (but don't know without more logs, but mine didn't say "Datanode Uuid unassigned"), and fsck didn't solve it.
In my case, I had moved a subset of disks from one node to another node that already had disks, and disabled the old node, so there was a problem with the disks not matching the DatanodeUuid of the new machine.
Above those lines in the log, there were entries like:
2016-04-11 19:32:02,991 WARN org.apache.hadoop.hdfs.server.common.Storage: org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /archive14/dfs/data is in an inconsistent state: Root /archive14/dfs/data: DatanodeUuid=5ba6418e-2c24-4582-8225-3e7f7fff9feb, does not match 519c1e34-a573-41f7-9e80-dca606fce704 from other StorageDirectory.
To solve that, I ran:
sed -i -r "s/${olduuid}/${olduuid}/' /mountpoints*/dfs/data/current/VERSION
This replaces the old UUID in the VERSION file with the new one. Then starting the datanode works.
Maybe in your case, you had a missing UUID rather than an incorrect one.
Deleting the name node directory and the data node directory and then creating the new directories worked for me. Use this technique assuming that you will lost the data.
For my case,I reinstall hdfs by CM6.2.0 and instance two namenodes for HA.
Then reformat these namenode each other,but this option cause the error below.
Initialization failed for Block pool BP-666417012-10.253.76.213-1557044865448 (Datanode Uuid 5132035c-8d6a-4617-af7e-7d07355a905b) service to hzd-t-vbdl-02/10.253.76.222:8022 Blockpool ID mismatch: previously connected to Blockpool ID BP-666417012-10.253.76.213-1557044865448 but now connected to Blockpool ID BP-1262695848-10.253.76.222-1557045124181
Process method:
ansible all -m shell -a " more /XXX/hdfs/dfs/nn/current/VERSION "
hzd-t-vbdl-01 | CHANGED | rc=0 >>
Sun May 05 16:27:45 CST 2019
namespaceID=732385684
clusterID=cluster54
cTime=1557044865448
storageType=NAME_NODE
blockpoolID=BP-666417012-10.253.76.213-1557044865448
layoutVersion=-64
hzd-t-vbdl-02 | CHANGED | rc=0 >>
Sun May 05 16:32:04 CST 2019
namespaceID=892287385
clusterID=cluster54
cTime=1557045124181
storageType=NAME_NODE
blockpoolID=BP-1262695848-10.253.76.222-1557045124181
layoutVersion=-64
Finally copy the context from hzd-t-vbdl-01(early formated) to hzd-t-vbdl-02,and restart namenodes and datanodes

invalid last txid in stream

I am trying to configure hadoop namenode HA and resourcemanager HA as well. However, when I start namenode as standby, I got IllegalArgumentException as below:
=====================================================
About to bootstrap Standby ID nn2 from:
Nameservice ID: mycluster
Other Namenode ID: nn1
Other NN's HTTP address: http://my1.namenode.com:50070
Other NN's IPC address: my1.namenode.com/xxx.xxx.xxx.xxx:8020
Namespace ID: 1915209867
Block pool ID: BP-740716617-xxx.xxx.xxx.xxx-1409206617148
Cluster ID: CID-51cea219-ffe7-4a52-8a6c-fb83d501ccaa
Layout version: -56
=====================================================
Data exists in Storage Directory /hadoop1/hadoop/hdfs/nn. Formatting anyway.
14/11/05 16:41:20 INFO common.Storage: Storage directory /hadoop1/hadoop/hdfs/nn has been successfully formatted.
14/11/05 16:41:20 WARN common.Util: Path /hadoop1/hadoop/hdfs/nn should be specified as a URI in configuration files. Please update hdfs configuration.
14/11/05 16:41:20 WARN common.Util: Path /hadoop1/hadoop/hdfs/nn should be specified as a URI in configuration files. Please update hdfs configuration.
14/11/05 16:41:21 FATAL namenode.NameNode: Exception in namenode join
java.io.IOException: java.lang.IllegalArgumentException: invalid last txid in stream: http://my3.namenode.com:8480/getJournal?jid=mycluster&segmentTxId=74823&storageInfo=-56%3A1915209867%3A0%3ACID-51cea219-ffe7-4a52-8a6c-fb83d501ccaa
at org.apache.hadoop.hdfs.server.namenode.ha.BootstrapStandby.run(BootstrapStandby.java:317)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1306)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1395)
Caused by: java.lang.IllegalArgumentException: invalid last txid in stream: http://my3.namenode.com:8480/getJournal?jid=mycluster&segmentTxId=74823&storageInfo=-56%3A1915209867%3A0%3ACID-51cea219-ffe7-4a52-8a6c-fb83d501ccaa
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:115)
at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.<init>(RedundantEditLogInputStream.java:101)
at org.apache.hadoop.hdfs.server.namenode.JournalSet.chainAndMakeRedundantStreams(JournalSet.java:300)
at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.selectInputStreams(QuorumJournalManager.java:494)
at org.apache.hadoop.hdfs.server.namenode.JournalSet.selectInputStreams(JournalSet.java:260)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1399)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1418)
at org.apache.hadoop.hdfs.server.namenode.ha.BootstrapStandby.checkLogsAvailableForRead(BootstrapStandby.java:236)
at org.apache.hadoop.hdfs.server.namenode.ha.BootstrapStandby.doRun(BootstrapStandby.java:203)
at org.apache.hadoop.hdfs.server.namenode.ha.BootstrapStandby.access$000(BootstrapStandby.java:69)
at org.apache.hadoop.hdfs.server.namenode.ha.BootstrapStandby$1.run(BootstrapStandby.java:106)
at org.apache.hadoop.hdfs.server.namenode.ha.BootstrapStandby$1.run(BootstrapStandby.java:102)
at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:415)
at org.apache.hadoop.hdfs.server.namenode.ha.BootstrapStandby.run(BootstrapStandby.java:102)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.hdfs.server.namenode.ha.BootstrapStandby.run(BootstrapStandby.java:312)
... 2 more
14/11/05 16:41:21 INFO util.ExitUtil: Exiting with status 1
14/11/05 16:41:21 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at
************************************************************/
All others are working well and I've checked hdfs-site.xml configuration for the problem but I couldn't find anything.
Please help me...
Thank you
Restart :
./sbin/hadoop-daemon.sh start journalnode

ERROR namenode.FSNamesystem: FSNamesystem initialization failed

I'm running hadoop in pseudo distributed mode on an ubuntu VM. I recently decided to increase the RAM and number of cores available to my VM, and that seems to have completely screwed hdfs. First, it was in safemode and I manually released that using:
hadoop dfsadmin -safemode leave
Then I ran:
hadoop fsck -blocks
and practically every block was corrupt or missing. So I figured, this is just for my learning, I deleted everything in "/user/msknapp" and everything in "/var/lib/hadoop-0.20/cache/mapred/mapred/.settings". So the block errors were gone. Then I try:
hadoop fs -put myfile myfile
and get (abridged):
12/01/07 15:05:29 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/msknapp/myfile could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1490)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:653)
at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
12/01/07 15:05:29 WARN hdfs.DFSClient: Error Recovery for block null bad datanode[0] nodes == null
12/01/07 15:05:29 WARN hdfs.DFSClient: Could not get block locations. Source file "/user/msknapp/myfile" - Aborting...
put: java.io.IOException: File /user/msknapp/myfile could only be replicated to 0 nodes, instead of 1
12/01/07 15:05:29 ERROR hdfs.DFSClient: Exception closing file /user/msknapp/myfile : org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/msknapp/myfile could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1490)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:653)
at ...
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/msknapp/myfile could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1490)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:653)
at ...
So I tried to stop and restart the namenode and datanode. No luck:
hadoop namenode
12/01/07 15:13:47 ERROR namenode.FSNamesystem: FSNamesystem initialization failed.
java.io.FileNotFoundException: /var/lib/hadoop-0.20/cache/hadoop/dfs/name/image/fsimage (Permission denied)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)
at org.apache.hadoop.hdfs.server.namenode.FSImage.isConversionNeeded(FSImage.java:683)
at org.apache.hadoop.hdfs.server.common.Storage.checkConversionNeeded(Storage.java:690)
at org.apache.hadoop.hdfs.server.common.Storage.access$000(Storage.java:60)
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:469)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:297)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:99)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:358)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:327)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:271)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:465)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1239)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1248)
12/01/07 15:13:47 ERROR namenode.NameNode: java.io.FileNotFoundException: /var/lib/hadoop-0.20/cache/hadoop/dfs/name/image/fsimage (Permission denied)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)
at org.apache.hadoop.hdfs.server.namenode.FSImage.isConversionNeeded(FSImage.java:683)
at org.apache.hadoop.hdfs.server.common.Storage.checkConversionNeeded(Storage.java:690)
at org.apache.hadoop.hdfs.server.common.Storage.access$000(Storage.java:60)
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:469)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:297)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:99)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:358)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:327)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:271)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:465)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1239)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1248)
12/01/07 15:13:47 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ubuntu/127.0.1.1
************************************************************/
Would somebody please help me out here? I have been trying to fix this for hours.
Go into where you have configured the hdfs. delete everything there, format namenode and you are good to go. It usually happens if you don't shut down your cluster properly!
Following error means, fsimage file not have permission
namenode.NameNode: java.io.FileNotFoundException: /var/lib/hadoop-0.20/cache/hadoop/dfs/name/image/fsimage (Permission denied)
so give permission to fsimage file,
$chmod -R 777 fsimage

Resources