Hadoop pseudo-distributed mode error

I have set up Hadoop on an openSUSE 11.2 VM using VirtualBox. I have made the prerequisite configs. I ran this example in standalone mode successfully.
But in pseudo-distributed mode I get the following error:
$./bin/hadoop fs -put conf input
10/04/13 15:56:25 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.net.SocketException: Protocol not available
10/04/13 15:56:25 INFO hdfs.DFSClient: Abandoning block blk_-8490915989783733314_1003
10/04/13 15:56:31 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.net.SocketException: Protocol not available
10/04/13 15:56:31 INFO hdfs.DFSClient: Abandoning block blk_-1740343312313498323_1003
10/04/13 15:56:37 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.net.SocketException: Protocol not available
10/04/13 15:56:37 INFO hdfs.DFSClient: Abandoning block blk_-3566235190507929459_1003
10/04/13 15:56:43 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.net.SocketException: Protocol not available
10/04/13 15:56:43 INFO hdfs.DFSClient: Abandoning block blk_-1746222418910980888_1003
10/04/13 15:56:49 WARN hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2845)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)
10/04/13 15:56:49 WARN hdfs.DFSClient: Error Recovery for block blk_-1746222418910980888_1003 bad datanode[0] nodes == null
10/04/13 15:56:49 WARN hdfs.DFSClient: Could not get block locations. Source file "/user/max/input/core-site.xml" - Aborting...
put: Protocol not available
10/04/13 15:56:49 ERROR hdfs.DFSClient: Exception closing file /user/max/input/core-site.xml : java.net.SocketException: Protocol not available
java.net.SocketException: Protocol not available
at sun.nio.ch.Net.getIntOption0(Native Method)
at sun.nio.ch.Net.getIntOption(Net.java:178)
at sun.nio.ch.SocketChannelImpl$1.getInt(SocketChannelImpl.java:419)
at sun.nio.ch.SocketOptsImpl.getInt(SocketOptsImpl.java:60)
at sun.nio.ch.SocketOptsImpl.sendBufferSize(SocketOptsImpl.java:156)
at sun.nio.ch.SocketOptsImpl$IP$TCP.sendBufferSize(SocketOptsImpl.java:286)
at sun.nio.ch.OptionAdaptor.getSendBufferSize(OptionAdaptor.java:129)
at sun.nio.ch.SocketAdaptor.getSendBufferSize(SocketAdaptor.java:328)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2873)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2826)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)

It seems like there are no live DataNodes in the cluster. Did you check whether the status page at http://localhost:50070/ shows any live nodes?
Start all Hadoop daemons with $ bin/start-all.sh .
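A quick way to confirm this, assuming you run the commands from the Hadoop install directory on the same 0.20/1.x line:
$ bin/start-all.sh            # starts NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker
$ jps                         # each daemon should show up as its own Java process
$ bin/hadoop dfsadmin -report # if it reports 0 available DataNodes, that matches the failed block writes above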

Did you start the Hadoop daemons? Unlike standalone mode, this needs to be done in pseudo-distributed mode. You start them using something like:
$ bin/start-all.sh
Documentation for the required steps can be found here.
Did you follow all of these steps? Can you browse the NameNode and JobTracker web interfaces?
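If the daemons are up, the web interfaces should answer on the default ports of that Hadoop generation (assuming you did not change them):
$ curl -sf http://localhost:50070/ > /dev/null && echo "NameNode UI reachable"
$ curl -sf http://localhost:50030/ > /dev/null && echo "JobTracker UI reachable"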

Maybe try using a preconfigured virtual machine? http://www.cloudera.com/developers/downloads/virtual-machine/ I think this is probably the best way to start learning Hadoop, and those problems should not happen there.

Related

Hortonworks Data Platform 2.5: HBase services stopped due to failed connection

I set up a new Hadoop cluster (containing 6 machines) with HDP 2.5. The installation went well, and for the first few minutes everything seemed to work properly. But after a few minutes, two of the HBase services stopped working:
HBase / host1.mydomain.de
HBase Master Process: Connection failed [Errno 111] Connection refused to host1.mydomain.de
HBase / host6.mydomain.de
HBase RegionServer Process: Connection failed [Errno 111] Connection refused to host6.mydomain.de
As I googled around for this issue, I found these tips (a quick way to verify them on each host is sketched below):
check and enable NTPD (enabled before installation, still enabled)
check and disable the firewall (disabled before installation, still disabled)
check and disable SELinux (disabled before installation, still disabled)
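A quick, non-destructive way to verify those three items (sketch assumes CentOS/RHEL 7-style hosts, which is typical for HDP 2.5):
$ systemctl is-active ntpd        # expected: active
$ systemctl is-active firewalld   # expected: inactive (or the unit may not exist at all)
$ getenforce                      # expected: Disabled or Permissive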
The point is that all services were running at the beginning, so the items listed above should be configured correctly!
I can say the following to my cluster configuration:
the Ambari server host (host1) can reach all other hosts by ping and can connect password-less via SSH
installed components are HDFS, YARN, MR2, Tez, Hive, HBase, Pig, ZooKeeper, AmbariMetrics, Knox, Spark, Slider
I left all the default settings during installation, and I ignored the following warning:
The log file /var/log/hbase/hbase-hbase-master-host1.domain.de.log contains the following snippet (IP addresses are masked as a.a.a.a / b.b.b.b / x.x.x.x / y.y.y.y / z.z.z.z):
2016-11-22 18:50:53,007 INFO [master/host1.xxx.de/xxx.xxx.xxx.xxx:16000] client.ZooKeeperRegistry: ClusterId read in ZooKeeper is null
2016-11-22 18:51:59,581 INFO [Thread-70] hdfs.DFSClient: Exception in createBlockOutputStream
java.io.IOException: Got error, status message , ack with firstBadLink as bbb.bbb.bbb.bbb:50010
at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:140)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1393)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1295)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:463)
2016-11-22 18:51:59,584 INFO [Thread-70] hdfs.DFSClient: Abandoning BP-90489822-xxx.xxx.xxx.xxx-1479836232259:blk_1073741825_1001
2016-11-22 18:51:59,597 INFO [Thread-70] hdfs.DFSClient: Excluding datanode DatanodeInfoWithStorage[bbb.bbb.bbb.bbb:50010,DS-7691d8f6-0c76-4780-9836-85f20f935dd6,DISK]
2016-11-22 18:52:33,674 INFO [Thread-70] hdfs.DFSClient: Exception in createBlockOutputStream
java.io.IOException: Got error, status message , ack with firstBadLink as zzz.zzz.zzz.zzz:50010
at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:140)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1393)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1295)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:463)
2016-11-22 18:52:33,675 INFO [Thread-70] hdfs.DFSClient: Abandoning BP-90489822-xxx.xxx.xxx.xxx-1479836232259:blk_1073741837_1013
2016-11-22 18:52:33,683 INFO [Thread-70] hdfs.DFSClient: Excluding datanode DatanodeInfoWithStorage[zzz.zzz.zzz.zzz:50010,DS-15d2586e-09a9-41ed-898d-689e15cd6596,DISK]
2016-11-22 18:52:33,771 WARN [host1:16000.activeMasterManager] hdfs.DFSClient: Slow waitForAckedSeqno took 100584ms (threshold=30000ms)
2016-11-22 18:52:33,797 INFO [host1:16000.activeMasterManager] util.FSUtils: Created version file at hdfs://host1.xxx.de:8020/apps/hbase/data with version=8
2016-11-22 18:52:36,820 INFO [Thread-76] hdfs.DFSClient: Exception in createBlockOutputStream
java.io.IOException: Got error, status message , ack with firstBadLink as yyy.yyy.yyy.yyy:50010
at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:140)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1393)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1295)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:463)
2016-11-22 18:52:36,821 INFO [Thread-76] hdfs.DFSClient: Abandoning BP-90489822-xxx.xxx.xxx.xxx-1479836232259:blk_1073741843_1019
2016-11-22 18:52:36,828 INFO [Thread-76] hdfs.DFSClient: Excluding datanode DatanodeInfoWithStorage[yyy.yyy.yyy.yyy:50010,DS-439b87d1-f08d-464c-b0e2-728987cd211d,DISK]
2016-11-22 18:52:37,567 INFO [Thread-76] hdfs.DFSClient: Exception in createBlockOutputStream
java.io.IOException: Got error, status message , ack with firstBadLink as zzz.zzz.zzz.zzz:50010
at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:140)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1393)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1295)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:463)
2016-11-22 18:52:37,567 INFO [Thread-76] hdfs.DFSClient: Abandoning BP-90489822-xxx.xxx.xxx.xxx-1479836232259:blk_1073741845_1021
2016-11-22 18:52:37,575 INFO [Thread-76] hdfs.DFSClient: Excluding datanode DatanodeInfoWithStorage[zzz.zzz.zzz.zzz:50010,DS-15d2586e-09a9-41ed-898d-689e15cd6596,DISK]
2016-11-22 18:52:40,589 INFO [Thread-76] hdfs.DFSClient: Exception in createBlockOutputStream
java.io.IOException: Got error, status message , ack with firstBadLink as aaa.aaa.aaa.aaa:50010
at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:140)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1393)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1295)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:463)
2016-11-22 18:52:40,589 INFO [Thread-76] hdfs.DFSClient: Abandoning BP-90489822-xxx.xxx.xxx.xxx-1479836232259:blk_1073741846_1022
2016-11-22 18:52:40,593 INFO [Thread-76] hdfs.DFSClient: Excluding datanode DatanodeInfoWithStorage[aaa.aaa.aaa.aaa:50010,DS-15ea9223-2b1b-4f86-8797-ff0e2aaa6787,DISK]
2016-11-22 18:52:40,694 INFO [host1:16000.activeMasterManager] master.MasterFileSystem: BOOTSTRAP: creating hbase:meta region
2016-11-22 18:52:40,699 INFO [host1:16000.activeMasterManager] regionserver.HRegion: creating HRegion hbase:meta HTD == 'hbase:meta', {TABLE_ATTRIBUTES => {IS_META => 'true', coprocessor$1 => '|org.apache.hadoop.hbase.coprocessor.Mul$
2016-11-22 18:52:43,741 INFO [Thread-79] hdfs.DFSClient: Exception in createBlockOutputStream
java.io.IOException: Got error, status message , ack with firstBadLink as yyy.yyy.yyy.yyy:50010
at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:140)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1393)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1295)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:463)
2016-11-22 18:52:43,742 INFO [Thread-79] hdfs.DFSClient: Abandoning BP-90489822-xxx.xxx.xxx.xxx-1479836232259:blk_1073741848_1024
2016-11-22 18:52:43,744 INFO [Thread-79] hdfs.DFSClient: Excluding datanode DatanodeInfoWithStorage[yyy.yyy.yyy.yyy:50010,DS-fc167096-246b-4215-b344-be786d98c472,DISK]
2016-11-22 18:52:46,760 INFO [Thread-79] hdfs.DFSClient: Exception in createBlockOutputStream
java.io.IOException: Got error, status message , ack with firstBadLink as zzz.zzz.zzz.zzz:50010
at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:140)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1393)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1295)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:463)
2016-11-22 18:52:46,760 INFO [Thread-79] hdfs.DFSClient: Abandoning BP-90489822-xxx.xxx.xxx.xxx-1479836232259:blk_1073741849_1025
2016-11-22 18:52:46,766 INFO [Thread-79] hdfs.DFSClient: Excluding datanode DatanodeInfoWithStorage[zzz.zzz.zzz.zzz:50010,DS-c1dc059d-8f0b-4971-88cc-ebc76dd8659a,DISK]
Can someone give me a clue why these (and only these) two services stop after a few minutes?
I got the same issue before: ZooKeeperRegistry: ClusterId read in ZooKeeper is null.
In HBase -> Configs -> hbase-site.xml I changed the value of zookeeper.znode.parent to /hbase-unsecure and restarted the services.
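For reference, the resulting setting on disk would look roughly like this (a sketch; /etc/hbase/conf is the usual HDP client-config path, and the value is normally changed through the Ambari UI rather than by hand):
$ grep -A1 zookeeper.znode.parent /etc/hbase/conf/hbase-site.xml
      <name>zookeeper.znode.parent</name>
      <value>/hbase-unsecure</value>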

Namenode starting but Datanode not starting

I get the following exception when I start the distributed file system. I am using Hadoop 2.6.0.
2015-08-26 23:10:58,222 FATAL datanode.DataNode (DataNode.java:secureMain(2385)) - Exception in secureMain
java.net.UnknownHostException: IM1948-X0: IM1948-X0
at java.net.InetAddress.getLocalHost(InetAddress.java:1475)
at org.apache.hadoop.security.SecurityUtil.getLocalHostName(SecurityUtil.java:187)
at org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:207)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2153)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2202)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2378)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2402)
Caused by: java.net.UnknownHostException: IM1948-X0
at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:901)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1295)
at java.net.InetAddress.getLocalHost(InetAddress.java:1471)
2015-08-26 23:10:58,227 INFO util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
2015-08-26 23:10:58,229 INFO datanode.DataNode (StringUtils.java:run(659)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at java.net.UnknownHostException: IM1948-X0: IM1948-X0
Even deleting the hadoop/hdfs-data/current directory does not help; I tried formatting the NameNode, but without success. This generally happens to me when I restart Hadoop.
To sum up, the DataNode process is not running at all for the Hadoop cluster.
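The UnknownHostException in the trace means the JVM cannot resolve the machine's own hostname (IM1948-X0). A minimal check, assuming a Linux host, would be:
$ hostname                    # the name the DataNode tries to resolve, here IM1948-X0
$ getent hosts "$(hostname)"  # empty output means neither /etc/hosts nor DNS has an entry for that name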

HADOOP copyFromLocal DataStreamer exception

I'm using Hadoop 0.20.203 and I have a cluster with nodes 0 ~ 24. cluster0 is used as the NameNode, and all the others are currently used as DataNodes.
I'm currently trying to execute the WordCount example; however, when I try to -copyFromLocal into DFS, the following message is shown:
aqjune#cluster0:~>> $HADOOP_HOME/bin/hadoop dfs -copyFromLocal pg132.txt /user/aqjune/input/pg132.txt
14/06/17 19:54:01 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.net.ConnectException: Connection refused
14/06/17 19:54:01 INFO hdfs.DFSClient: Abandoning block blk_-7530678618792869516_1003
14/06/17 19:54:07 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.net.ConnectException: Connection refused
14/06/17 19:54:07 INFO hdfs.DFSClient: Abandoning block blk_-7462751912508683911_1003
14/06/17 19:54:13 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.net.ConnectException: Connection refused
14/06/17 19:54:13 INFO hdfs.DFSClient: Abandoning block blk_252255837066920011_1003
14/06/17 19:54:19 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.net.ConnectException: Connection refused
14/06/17 19:54:19 INFO hdfs.DFSClient: Abandoning block blk_4030900909035905642_1003
14/06/17 19:54:25 WARN hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3002)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2255)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2446)
14/06/17 19:54:25 WARN hdfs.DFSClient: Error Recovery for block blk_4030900909035905642_1003 bad datanode[0] nodes == null
14/06/17 19:54:25 WARN hdfs.DFSClient: Could not get block locations. Source file "/user/aqjune/input/pg132.txt" - Aborting...
copyFromLocal: Connection refused
14/06/17 19:54:25 ERROR hdfs.DFSClient: Exception closing file /user/aqjune/input/pg132.txt : java.net.ConnectException: Connection refused
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:406)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:3028)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2983)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2255)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2446)
And then only an empty file was created:
aqjune#grey0:~/hadoop>> bin/hadoop dfs -lsr /
drwxr-xr-x - aqjune supergroup 0 2014-06-17 19:45 /user
drwxr-xr-x - aqjune supergroup 0 2014-06-17 19:45 /user/aqjune
drwxr-xr-x - aqjune supergroup 0 2014-06-17 19:54 /user/aqjune/input
-rw-r--r-- 1 aqjune supergroup 0 2014-06-17 19:54 /user/aqjune/input/pg132.txt
I can't figure out the cause of this problem. Can I get some hints?
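The repeated "Connection refused" in createBlockOutputStream suggests the client cannot reach any DataNode on its data-transfer port. A hedged first check with the same 0.20.x tooling would be:
$ $HADOOP_HOME/bin/hadoop dfsadmin -report   # how many DataNodes does the NameNode actually see?
$ jps                                        # run on cluster1..cluster24: is a DataNode process running there?
$ telnet cluster1 50010                      # 50010 is the default data-transfer port; "Connection refused" here mirrors the error above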

Hadoop : java.io.IOException: No live nodes contain current block. Will get new block locations from namenode and retry

I am using Hadoop in pseudo-distributed mode and everything was working fine. But whenever I restart my computer, the NameNode goes into safemode. To force the NameNode to leave safemode, I am using the $ bin/hadoop dfsadmin -safemode leave command. But after that I have a situation that I'm wondering where it's coming from: when I use -ls I can see the files, but when I try to get a file I'm not able to retrieve its block. I am getting the following error:
$ hadoop fs -cat /user/op/part-r-00000
13/11/21 12:45:12 INFO hdfs.DFSClient: No node available for block: blk_-4538200827997952429_1071 file=/user/op/part-r-00000
13/11/21 12:45:12 INFO hdfs.DFSClient: Could not obtain block blk_-4538200827997952429_1071 from any node: java.io.IOException: No live nodes contain current block. Will get new block locations from namenode and retry...
13/11/21 12:45:15 INFO hdfs.DFSClient: No node available for block: blk_-4538200827997952429_1071 file=/user/op/part-r-00000
13/11/21 12:45:15 INFO hdfs.DFSClient: Could not obtain block blk_-4538200827997952429_1071 from any node: java.io.IOException: No live nodes contain current block. Will get new block locations from namenode and retry...
13/11/21 12:45:18 INFO hdfs.DFSClient: No node available for block: blk_-4538200827997952429_1071 file=/user/op/part-r-00000
13/11/21 12:45:18 INFO hdfs.DFSClient: Could not obtain block blk_-4538200827997952429_1071 from any node: java.io.IOException: No live nodes contain current block. Will get new block locations from namenode and retry...
13/11/21 12:45:21 WARN hdfs.DFSClient: DFS Read: java.io.IOException: Could not obtain block: blk_-4538200827997952429_1071 file=/user/op/part-r-00000
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:2426)
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:2218)
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:2381)
at java.io.DataInputStream.read(DataInputStream.java:100)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:68)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:47)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:100)
at org.apache.hadoop.fs.FsShell.printToStdout(FsShell.java:114)
at org.apache.hadoop.fs.FsShell.access$100(FsShell.java:49)
at org.apache.hadoop.fs.FsShell$1.process(FsShell.java:349)
at org.apache.hadoop.fs.FsShell$DelayedExceptionThrowing.globAndProcess(FsShell.java:1913)
at org.apache.hadoop.fs.FsShell.cat(FsShell.java:346)
at org.apache.hadoop.fs.FsShell.doall(FsShell.java:1557)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:1776)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:1895)
cat: Could not obtain block: blk_-4538200827997952429_1071 file=/user/op/part-r-00000
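A hedged way to see how the NameNode itself views that block, using the same Hadoop 1.x-era commands as in the question:
$ bin/hadoop dfsadmin -safemode get                                  # check the safemode state instead of forcing it off
$ bin/hadoop fsck /user/op/part-r-00000 -files -blocks -locations    # lists each block and which DataNodes, if any, hold a replica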

ERROR namenode.FSNamesystem: FSNamesystem initialization failed

I'm running Hadoop in pseudo-distributed mode on an Ubuntu VM. I recently decided to increase the RAM and the number of cores available to my VM, and that seems to have completely screwed up HDFS. First, it was in safemode, which I manually released using:
hadoop dfsadmin -safemode leave
Then I ran:
hadoop fsck -blocks
and practically every block was corrupt or missing. So I figured, since this is just for my learning, I deleted everything in "/user/msknapp" and everything in "/var/lib/hadoop-0.20/cache/mapred/mapred/.settings". After that the block errors were gone. Then I tried:
hadoop fs -put myfile myfile
and get (abridged):
12/01/07 15:05:29 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/msknapp/myfile could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1490)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:653)
at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
12/01/07 15:05:29 WARN hdfs.DFSClient: Error Recovery for block null bad datanode[0] nodes == null
12/01/07 15:05:29 WARN hdfs.DFSClient: Could not get block locations. Source file "/user/msknapp/myfile" - Aborting...
put: java.io.IOException: File /user/msknapp/myfile could only be replicated to 0 nodes, instead of 1
12/01/07 15:05:29 ERROR hdfs.DFSClient: Exception closing file /user/msknapp/myfile : org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/msknapp/myfile could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1490)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:653)
at ...
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/msknapp/myfile could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1490)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:653)
at ...
So I tried to stop and restart the namenode and datanode. No luck:
hadoop namenode
12/01/07 15:13:47 ERROR namenode.FSNamesystem: FSNamesystem initialization failed.
java.io.FileNotFoundException: /var/lib/hadoop-0.20/cache/hadoop/dfs/name/image/fsimage (Permission denied)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)
at org.apache.hadoop.hdfs.server.namenode.FSImage.isConversionNeeded(FSImage.java:683)
at org.apache.hadoop.hdfs.server.common.Storage.checkConversionNeeded(Storage.java:690)
at org.apache.hadoop.hdfs.server.common.Storage.access$000(Storage.java:60)
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:469)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:297)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:99)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:358)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:327)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:271)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:465)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1239)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1248)
12/01/07 15:13:47 ERROR namenode.NameNode: java.io.FileNotFoundException: /var/lib/hadoop-0.20/cache/hadoop/dfs/name/image/fsimage (Permission denied)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)
at org.apache.hadoop.hdfs.server.namenode.FSImage.isConversionNeeded(FSImage.java:683)
at org.apache.hadoop.hdfs.server.common.Storage.checkConversionNeeded(Storage.java:690)
at org.apache.hadoop.hdfs.server.common.Storage.access$000(Storage.java:60)
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:469)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:297)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:99)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:358)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:327)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:271)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:465)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1239)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1248)
12/01/07 15:13:47 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ubuntu/127.0.1.1
************************************************************/
Would somebody please help me out here? I have been trying to fix this for hours.
Go into where you have configured HDFS, delete everything there, format the NameNode, and you are good to go. This usually happens if you don't shut down your cluster properly!
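Spelled out as commands, that suggestion looks roughly like this; it is destructive (it erases everything stored in HDFS), and the directory is a guess based on the dfs name path shown in the error above:
# stop all Hadoop daemons first (bin/stop-all.sh, or the service scripts your packaging uses)
$ rm -rf /var/lib/hadoop-0.20/cache/hadoop/dfs/*   # adjust to your configured dfs.name.dir / dfs.data.dir
$ hadoop namenode -format
# then start the daemons again and re-create /user/msknapp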
The following error means the fsimage file does not have the proper permissions:
namenode.NameNode: java.io.FileNotFoundException: /var/lib/hadoop-0.20/cache/hadoop/dfs/name/image/fsimage (Permission denied)
so give permission to the fsimage file:
$ chmod -R 777 fsimage
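Before opening the file up to everyone, it may be worth checking who owns it; the path is taken from the error message above:
$ ls -l /var/lib/hadoop-0.20/cache/hadoop/dfs/name/image/fsimage   # if it is owned by another user (e.g. hdfs), running "hadoop namenode" as yourself hits exactly this error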
