NameNode and DataNode not starting in Hadoop on start-dfs.cmd

I am trying to set up Hadoop, using a guide on Towardsdatascience.com as reference.
Running start-dfs.cmd produces the following errors:
E:\RIYA\hadoop-env\hadoop-3.2.1\sbin>start-dfs.cmd
Two cmd windows pop up, one for the DataNode and one for the NameNode.
DataNode Error:
************************************************************/
2022-03-11 23:44:42,810 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2022-03-11 23:44:48,486 INFO checker.ThrottledAsyncChecker: Scheduling a check for [DISK]file:/E:/hadoop-env/hadoop-3.2.1/data/dfs/datanode
2022-03-11 23:44:48,687 WARN checker.StorageLocationChecker: Exception checking StorageLocation [DISK]file:/E:/hadoop-env/hadoop-3.2.1/data/dfs/datanode
java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:645)
at org.apache.hadoop.fs.FileUtil.canRead(FileUtil.java:1230)
at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:160)
at org.apache.hadoop.util.DiskChecker.checkDirInternal(DiskChecker.java:142)
at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:116)
at org.apache.hadoop.hdfs.server.datanode.StorageLocation.check(StorageLocation.java:239)
at org.apache.hadoop.hdfs.server.datanode.StorageLocation.check(StorageLocation.java:52)
at org.apache.hadoop.hdfs.server.datanode.checker.ThrottledAsyncChecker$1.call(ThrottledAsyncChecker.java:142)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2022-03-11 23:44:48,691 ERROR datanode.DataNode: Exception in secureMain
org.apache.hadoop.util.DiskChecker$DiskErrorException: Too many failed volumes - current valid volumes: 0, volumes configured: 1, volumes failed: 1, volume failures tolerated: 0
at org.apache.hadoop.hdfs.server.datanode.checker.StorageLocationChecker.check(StorageLocationChecker.java:231)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2799)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2714)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2756)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2900)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2924)
2022-03-11 23:44:48,701 INFO util.ExitUtil: Exiting with status 1: org.apache.hadoop.util.DiskChecker$DiskErrorException: Too many failed volumes - current valid volumes: 0, volumes configured: 1, volumes failed: 1, volume failures tolerated: 0
2022-03-11 23:44:48,707 INFO datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at LAPTOP-7DCG00HD/192.168.56.1
************************************************************/
NameNode Error:
2022-03-11 23:44:53,048 ERROR namenode.NameNode: Failed to start namenode.
java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:645)
at org.apache.hadoop.fs.FileUtil.canWrite(FileUtil.java:1249)
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:690)
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:642)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:386)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:242)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1105)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:720)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:648)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:710)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:953)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:926)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1692)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1759)
2022-03-11 23:44:53,064 INFO util.ExitUtil: Exiting with status 1: java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
2022-03-11 23:44:53,090 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at LAPTOP-7DCG00HD/192.168.56.1
************************************************************/
Files inside my directory: E:\RIYA\hadoop-env\hadoop-3.2.1\etc\hadoop
core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
<description>MapReduce framework name</description>
</property>
</configuration>
yarn-site.xml
<?xml version="1.0"?>
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
<description>Yarn Node Manager Aux Service</description>
</property>
</configuration>
hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/E:/hadoop-env/hadoop-3.2.1/data/dfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/E:/hadoop-env/hadoop-3.2.1/data/dfs/datanode</value>
</property>
</configuration>
I have put my hadoop.dll and winutils.exe files in the windows32 folder, and I have done almost everything suggested on the internet. I also downloaded a jar file and moved it to my directory E:\RIYA\hadoop-env\hadoop-3.2.1\share\hadoop\hdfs with the name hadoop-hdfs-3.2.1.bak. I don't know which steps remain.
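The WARN line "Unable to load native-hadoop library" together with the UnsatisfiedLinkError on NativeIO$Windows.access0 suggests that the JVM did not load hadoop.dll. A minimal Python sketch for a first check, assuming the usual convention that winutils.exe and hadoop.dll live in the bin folder under HADOOP_HOME (the function name is hypothetical, and a file that is present but built for a different Hadoop version would not be detected):

```python
import os

# Assumption: Hadoop on Windows looks for these native helpers in <HADOOP_HOME>\bin.
REQUIRED_NATIVE_FILES = ("winutils.exe", "hadoop.dll")


def missing_native_files(hadoop_home):
    """Return the required native files that are absent from <hadoop_home>/bin."""
    bin_dir = os.path.join(hadoop_home, "bin")
    return [
        name
        for name in REQUIRED_NATIVE_FILES
        if not os.path.isfile(os.path.join(bin_dir, name))
    ]
```

It could be called as missing_native_files(os.environ.get("HADOOP_HOME", "")); an empty list only means both files exist, not that they match the installed Hadoop build.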

Related

Datanode not working on Hadoop single node cluster on windows

There are many similar questions on Stack Overflow, but none of them solves my problem.
I'm trying to start my NameNode and DataNode. The NameNode starts, but the DataNode fails, along with the ResourceManager and NodeManager. Here is the error that shows up:
2021-06-17 15:44:09,513 ERROR datanode.DataNode: Exception in secureMain
org.apache.hadoop.util.DiskChecker$DiskErrorException: Too many failed volumes - current valid volumes: 0, volumes configured: 1, volumes failed: 1, volume failures tolerated: 0
at org.apache.hadoop.hdfs.server.datanode.checker.StorageLocationChecker.check(StorageLocationChecker.java:231)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2799)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2714)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2756)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2900)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2924)
2021-06-17 15:44:09,518 INFO util.ExitUtil: Exiting with status 1: org.apache.hadoop.util.DiskChecker$DiskErrorException: Too many failed volumes - current valid volumes: 0, volumes configured: 1, volumes failed: 1, volume failures tolerated: 0
2021-06-17 15:44:09,522 INFO datanode.DataNode: SHUTDOWN_MSG:
Here is my hdfs-site.xml:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>C:\Users\username\Documents\hadoop-3.2.1\data\dfs\namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>C:\Users\username\Documents\hadoop-3.2.1\data\dfs\datanode</value>
</property>
<property>
<name>dfs.datanode.failed.volumes.tolerated</name>
<value>0</value>
</property>
</configuration>
What could be the solution?
The question is answered here:
https://stackoverflow.com/a/58924939/14194692
That answer is not accepted on its question, but I tried it and it worked. Tada.
I am not deleting my question because I believe none of the similar questions is asked as clearly as this one. I hope it helps other people.
Cheers.
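Separately from the linked answer, the storage directories in hdfs-site.xml can be written as file URIs rather than raw Windows paths. The sketch below shows one way to derive such a URI; the helper name is hypothetical, and whether this form alone would have fixed the error above is not established here.

```python
from urllib.parse import quote


def to_file_uri(windows_path):
    """Turn an absolute Windows path such as C:\\data\\dfs into a file URI."""
    forward = windows_path.replace("\\", "/")
    # Keep "/" and the drive colon, percent-encode anything unsafe (e.g. spaces).
    return "file:///" + quote(forward, safe="/:")
```

The result can be pasted as the value of dfs.namenode.name.dir or dfs.datanode.data.dir.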

Hadoop job stuck due to connection timeout

I am new to Hadoop. I have set up Hadoop on my Mac, and I am trying to run the following:
hadoop jar wordcount.jar /usr/joy/input /usr/joy/output
In response to the command, the following messages were printed in the terminal:
16/03/18 17:13:20 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/03/18 17:13:20 INFO client.RMProxy: Connecting to ResourceManager at localhost/127.0.0.1:8032
16/03/18 17:13:21 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
16/03/18 17:13:21 INFO input.FileInputFormat: Total input paths to process : 1
16/03/18 17:13:21 INFO mapreduce.JobSubmitter: number of splits:1
16/03/18 17:13:21 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1458279089418_0002
16/03/18 17:13:21 INFO impl.YarnClientImpl: Submitted application application_1458279089418_0002
16/03/18 17:13:21 INFO mapreduce.Job: The url to track the job: http://EN-AbhishekM:8088/proxy/application_1458279089418_0002/
When I check the status of the job in the browser, I find the following error in the logs:
Application application_1458279089418_0001 failed 2 times due to Error launching appattempt_1458279089418_0001_000002. Got exception: org.apache.hadoop.net.ConnectTimeoutException: Call From EN-AbhishekM/192.168.0.102 to 192.168.43.66:61029 failed on socket timeout exception: org.apache.hadoop.net.ConnectTimeoutException: 20000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=192.168.43.66/192.168.43.66:61029];....
I am pasting configuration files here:
core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
yarn-site.xml
<?xml version="1.0"?>
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
I formatted the filesystem with:
bin/hdfs namenode -format
started the NameNode and DataNode daemons with:
sbin/start-dfs.sh
and started the ResourceManager and NodeManager daemons with:
sbin/start-yarn.sh
Can anyone please suggest what mistake I am making here?
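In the error text, the call goes from 192.168.0.102 to 192.168.43.66, which are addresses on two different networks. One possible reading (an assumption, not something the logs prove) is that a node registered under an address from another interface or an outdated hostname mapping. The following Python sketch, with hypothetical function names, pulls both addresses out of such a message and checks whether they share a /24 network:

```python
import ipaddress
import re

# Matches "Call From <host>/<ip> to <ip>:<port>" as printed in the exception text.
CALL_PATTERN = re.compile(
    r"Call From\s+(?P<host>\S+)/(?P<src>\d+(?:\.\d+){3})\s+to\s+"
    r"(?P<dst>\d+(?:\.\d+){3}):(?P<port>\d+)"
)


def parse_call(message):
    """Return (host, source_ip, target_ip, port), or None if nothing matches."""
    flat = " ".join(message.split())  # undo line wrapping in a pasted log
    match = CALL_PATTERN.search(flat)
    if match is None:
        return None
    return (
        match.group("host"),
        match.group("src"),
        match.group("dst"),
        int(match.group("port")),
    )


def same_network(ip_a, ip_b, prefix=24):
    """True when ip_b lies in the /prefix network that contains ip_a."""
    network = ipaddress.ip_network(f"{ip_a}/{prefix}", strict=False)
    return ipaddress.ip_address(ip_b) in network
```

The /24 prefix is only a default chosen for illustration; the real netmask of the machine may differ.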

the directory is already locked hadoop

I am getting the error below while starting Hadoop:
2015-09-04 08:49:05,648 ERROR org.apache.hadoop.hdfs.server.common.Storage: It appears that another node 854@ip-1-2-3-4 has already locked the storage directory: /mnt/xvdb/tmp/dfs/namesecondary
java.nio.channels.OverlappingFileLockException
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.tryLock(Storage.java:712)
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:678)
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:499)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.recoverCreate(SecondaryNameNode.java:962)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:243)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.<init>(SecondaryNameNode.java:192)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:671)
2015-09-04 08:49:05,650 INFO org.apache.hadoop.hdfs.server.common.Storage: Cannot lock storage /mnt/xvdb/tmp/dfs/namesecondary. The directory is already locked
2015-09-04 08:49:05,650 FATAL org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Failed to start secondary namenode
java.io.IOException: Cannot lock storage /mnt/xvdb/tmp/dfs/namesecondary. The directory is already locked
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:683)
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:499)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.recoverCreate(SecondaryNameNode.java:962)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:243)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.<init>(SecondaryNameNode.java:192)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:671)
2015-09-04 08:49:05,652 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2015-09-04 08:49:05,653 INFO org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down SecondaryNameNode at ip-#ip-1-2-3-4/#ip-1-2-3-4
************************************************************/
Hadoop version: 2.7.1(3 node cluster)
hdfs-site.xml configuration file:
<configuration>
<property>
<name>dfs.data.dir</name>
<value>/mnt/xvdb/hadoop/dfs/data</value>
<final>true</final>
</property>
<property>
<name>dfs.name.dir</name>
<value>/mnt/xvdb/hadoop/dfs/name</value>
<final>true</final>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
</configuration>
I have tried formatting the NameNode as well, but it didn't help. Can anyone help me with this?
I found a solution to the above problem here: http://misconfigurations.blogspot.in/2014/10/hadoop-initialization-failed-for-block.html
If there is any other solution, I would like to see it.
P.S.: I deleted the directory pointed to by "dfs.datanode.data.dir". This erased all data on HDFS, but it fixed the issue. If there is an alternative way to fix this issue, you can use that instead.
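Before deleting anything, it may help to see which storage directories currently carry a lock. HDFS marks a storage directory as in use with a file named in_use.lock; the read-only Python sketch below (the function name is hypothetical) lists those files under a root such as /mnt/xvdb so you can check whether the daemon that owns the lock is still running.

```python
import os

LOCK_FILE_NAME = "in_use.lock"  # lock file HDFS keeps in a storage directory


def find_lock_files(root):
    """Return the sorted paths of every in_use.lock file below root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if LOCK_FILE_NAME in filenames:
            found.append(os.path.join(dirpath, LOCK_FILE_NAME))
    return sorted(found)
```

The sketch only reports paths and removes nothing, so it is safe to run against a live cluster.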

secondarynamenode on master and Datanode not start on slave hadoop 2.6.0

When I start Hadoop using start-all.sh, the DataNode and SecondaryNameNode do not come up on the master, and the DataNode does not start on the slave.
When I troubleshoot by running hdfs datanode, I get this error:
15/06/29 11:06:34 INFO datanode.DataNode: registered UNIX signal handlers for [TERM, HUP, INT]
15/06/29 11:06:35 WARN common.Util: Path /var/lib/hadoop/hdfs/datanode should be specified as a URI in configuration files. Please update hdfs configuration.
15/06/29 11:06:35 FATAL datanode.DataNode: Exception in secureMain
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
at org.apache.hadoop.security.Groups.<init>(Groups.java:70)
at org.apache.hadoop.security.Groups.<init>(Groups.java:66)
at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:280)
at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:271)
at org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:299)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2152)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2202)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2378)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2402)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:129)
... 9 more
Caused by: java.lang.UnsatisfiedLinkError: org.apache.hadoop.security.JniBasedUnixGroupsMapping.anchorNative()V
at org.apache.hadoop.security.JniBasedUnixGroupsMapping.anchorNative(Native Method)
at org.apache.hadoop.security.JniBasedUnixGroupsMapping.<clinit>(JniBasedUnixGroupsMapping.java:49)
at org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback.<init>(JniBasedUnixGroupsMappingWithFallback.java:39)
... 14 more
15/06/29 11:06:35 INFO util.ExitUtil: Exiting with status 1
15/06/29 11:06:35 INFO datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at localserver39/10.200.208.28
What is the issue with the DataNode on my slave and the SecondaryNameNode on my master?
Running start-dfs.sh on the master
gives this output:
hadoop@10.200.208.29's password: 10.200.208.28: starting datanode, logging to /home/hadoop/hadoop/logs/hadoop-hadoop-datanode-localserver39.out
10.200.208.28: nice: /usr/libexec/../bin/hdfs: No such file or directory
Starting secondary namenodes [0.0.0.0]
hadoop@0.0.0.0's password:
0.0.0.0: starting secondarynamenode, logging to /home/hadoop/hadoop/logs/hadoop-hadoop-secondarynamenode-MC-RND-1.out
After that, jps shows this:
bash-3.2$ jps
8103 Jps
7437 DataNode
7309 NameNode
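A small Python sketch for reading such jps output: given the text and a list of daemon names you expect on that machine (the expected list is an assumption you supply, and the function name is hypothetical), it returns the names that are not running.

```python
def missing_daemons(jps_output, expected):
    """Return the expected daemon names that do not appear in jps output."""
    running = set()
    for line in jps_output.splitlines():
        parts = line.split()
        # A regular jps line is "<pid> <main class name>".
        if len(parts) == 2 and parts[0].isdigit():
            running.add(parts[1])
    return [name for name in expected if name not in running]
```

For the output above and an expected list of NameNode, DataNode and SecondaryNameNode, it reports SecondaryNameNode as missing.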
core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://10.200.208.29:9000/</value>
</property>
</configuration>
hdfs-site.xml
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/Backup-HDD/hadoop/datanode</value>
</property>
<property>
<name>dfs.namenode.data.dir</name>
<value>/Backup-HDD/hadoop/namenode</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>/Backup-HDD/hadoop/namenode</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/Backup-HDD/hadoop/datanode</value>
</property>
Remove the following properties from hdfs-site.xml:
<property>
<name>dfs.datanode.data.dir</name>
<value>/Backup-HDD/hadoop/datanode</value>
</property>
<property>
<name>dfs.namenode.data.dir</name>
<value>/Backup-HDD/hadoop/namenode</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>/Backup-HDD/hadoop/namenode</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/Backup-HDD/hadoop/datanode</value>
</property>
Add the following two properties to hdfs-site.xml:
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/user/Backup-HDD/hadoop/datanode</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/user/Backup-HDD/hadoop/namenode</value>
</property>
Make sure the paths specified for the name and data directories exist on your system.
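The same kind of review can be sketched in Python. The snippet below flags the old property names dfs.name.dir, dfs.data.dir and fs.default.name, which current Hadoop releases list as deprecated, and directory values that are not written as file URIs. The function name is hypothetical, and the input is assumed to be wrapped in a single <configuration> root element.

```python
import xml.etree.ElementTree as ET

# Old property names and the names that replaced them.
DEPRECATED = {
    "dfs.name.dir": "dfs.namenode.name.dir",
    "dfs.data.dir": "dfs.datanode.data.dir",
    "fs.default.name": "fs.defaultFS",
}


def review_properties(xml_text):
    """Return notes about deprecated names and non-URI directory values."""
    notes = []
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name in DEPRECATED:
            notes.append(f"{name}: deprecated, use {DEPRECATED[name]}")
        elif name.endswith(".dir") and not value.startswith("file:"):
            notes.append(f"{name}: value is not a file URI")
    return notes
```

An empty result means only that none of these two checks fired; it does not validate the rest of the configuration.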
Problem solved after a search on Google.
Update .bashrc and .bash_profile:
cat .bashrc
#!/bin/bash
# unset all HADOOP environment variables
env | grep HADOOP | sed 's/.*\(HADOOP[^=]*\)=.*/\1/' > un_var
while read line; do unset "$line"; done < un_var
rm un_var
export JAVA_HOME="/usr/java/latest/"
export HADOOP_PREFIX="/home/hadoop/hadoop"
export HADOOP_YARN_USER="hadoop"
export HADOOP_HOME="$HADOOP_PREFIX"
export HADOOP_CONF_DIR="$HADOOP_PREFIX/etc/hadoop"
export HADOOP_PID_DIR="$HADOOP_PREFIX"
export HADOOP_LOG_DIR="$HADOOP_PREFIX/logs"
export HADOOP_OPTS="$HADOOP_OPTS -Djava.io.tmpdir=$HADOOP_PREFIX/tmp"
export YARN_HOME="$HADOOP_PREFIX"
export YARN_CONF_DIR="$HADOOP_PREFIX/etc/hadoop"
export YARN_PID_DIR="$HADOOP_PREFIX"
export YARN_LOG_DIR="$HADOOP_PREFIX/logs"
export YARN_OPTS="$YARN_OPTS -Djava.io.tmpdir=$HADOOP_PREFIX/tmp"
cat .bash_profile
#!/bin/bash
if [ -f ~/.bashrc ]; then
source ~/.bashrc
fi
The issue was with the Bash profile.

Hadoop 0.23.9 How to Start datanodes

It seems like I can't get hadoop to start properly. I'm using hadoop 0.23.9:
[msknapp@localhost sbin]$ hadoop namenode -format
...
[msknapp@localhost sbin]$ ./start-dfs.sh
Starting namenodes on [localhost]
localhost: starting namenode, logging to /usr/local/cloud/hadoop-0.23.9/logs/hadoop-msknapp-namenode-localhost.localdomain.out
localhost: starting datanode, logging to /usr/local/cloud/hadoop-0.23.9/logs/hadoop-msknapp-datanode-localhost.localdomain.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/cloud/hadoop-0.23.9/logs/hadoop-msknapp-secondarynamenode-localhost.localdomain.out
[msknapp@localhost sbin]$ ./start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/local/cloud/hadoop-0.23.9/logs/yarn-msknapp-resourcemanager-localhost.localdomain.out
localhost: starting nodemanager, logging to /usr/local/cloud/hadoop-0.23.9/logs/yarn-msknapp-nodemanager-localhost.localdomain.out
[msknapp@localhost sbin]$ cd /var/local/stock/data
[msknapp@localhost data]$ hadoop fs -ls /
[msknapp@localhost data]$ hadoop fs -mkdir /stock
[msknapp@localhost data]$ ls
companies.csv raw slf_series.txt
[msknapp@localhost data]$ hadoop fs -put companies.csv /stock/companies.csv
13/12/08 11:10:40 WARN hdfs.DFSClient: DataStreamer Exception
java.io.IOException: File /stock/companies.csv._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1180)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1536)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:414)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:394)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1571)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1567)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1262)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1565)
at org.apache.hadoop.ipc.Client.call(Client.java:1094)
at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:195)
at com.sun.proxy.$Proxy6.addBlock(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:102)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:67)
at com.sun.proxy.$Proxy6.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1130)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1006)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:458)
put: File /stock/companies.csv._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
13/12/08 11:10:40 ERROR hdfs.DFSClient: Failed to close file /stock/companies.csv._COPYING_
java.io.IOException: File /stock/companies.csv._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1180)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1536)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:414)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:394)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1571)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1567)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1262)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1565)
at org.apache.hadoop.ipc.Client.call(Client.java:1094)
at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:195)
at com.sun.proxy.$Proxy6.addBlock(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:102)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:67)
at com.sun.proxy.$Proxy6.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1130)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1006)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:458)
Here is my core-site.xml:
<property>
<name>fs.default.name</name>
<value>hdfs://localhost/</value>
</property>
and my hdfs-site.xml:
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
and mapred-site.xml:
<property>
<name>mapred.job.tracker</name>
<value>localhost:8021</value>
</property>
I have looked through all the documentation I have, but I cannot figure out how to start Hadoop correctly. I can't find any documentation online about hadoop-0.23.9. My Hadoop book is written for 0.22. The online documentation is for 2.1.1, which coincidentally I could not get to work.
Can somebody please tell me how to get my hadoop started correctly?
Specify a port for fs.default.name, like:
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
</property>
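To check whether a filesystem URI carries an explicit port, a short Python sketch using the standard library URL parser (the function name is hypothetical):

```python
from urllib.parse import urlparse


def namenode_port(fs_uri):
    """Return the explicit port of a filesystem URI, or None when it has none."""
    return urlparse(fs_uri).port
```

For the value in the question, hdfs://localhost/, this returns None, while hdfs://localhost:54310 yields 54310.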
After that, create a tmp directory for hdfs:
sudo mkdir -p /app/hadoop/tmp
sudo chown you /app/hadoop/tmp
and add to core-site.xml:
<property>
<name>hadoop.tmp.dir</name>
<value>/app/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
Make sure to restart your cluster.
$HADOOP_HOME/bin/stop-all.sh
$HADOOP_HOME/bin/start-all.sh
Try deleting all the data with Hadoop stopped. The datanode command has no -format option, so manually delete the contents of
/app/hadoop/tmp/dfs/data/
and then start hadoop again:
$HADOOP_HOME/bin/start-all.sh
The key problem in your configuration is shown by this error:
java.io.IOException: File /stock/companies.csv._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
Make sure your HDFS-specific configuration contains at least the following.
hdfs-site.xml:
As the XML shows, the /tmp/hdfs23/namenode and /tmp/hdfs23/datanode folders must already exist. You can configure any other folder as the HDFS root and then place the namenode and datanode folders inside it.
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///tmp/hdfs23/namenode</value>
</property>
<property>
<name>fs.checkpoint.dir</name>
<value>file:///tmp/hdfs23/secnamenode</value>
</property>
<property>
<name>fs.checkpoint.edits.dir</name>
<value>file:///tmp/hdfs23/secnamenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///tmp/hdfs23/datanode</value>
</property>
</configuration>
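Since the folders named in this file have to exist before the daemons start, they can be created from the configuration itself. A Python sketch under two assumptions: every directory property has a name ending in .dir, and each value is a single file: URI rather than a comma-separated list. Both function names are hypothetical.

```python
import os
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def storage_dirs(xml_text):
    """Return the local path of each file: URI held by a '*.dir' property."""
    paths = []
    for prop in ET.fromstring(xml_text).iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name.endswith(".dir") and value.startswith("file:"):
            paths.append(urlparse(value).path)
    return paths


def ensure_dirs(paths):
    """Create every listed directory, leaving existing ones untouched."""
    for path in paths:
        os.makedirs(path, exist_ok=True)
```

Calling ensure_dirs(storage_dirs(text)) with the contents of hdfs-site.xml creates the namenode, secnamenode and datanode folders if they are missing.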
core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>hadoop.http.staticuser.user</name>
<value>hdfs</value>
</property>
</configuration>
Then you need to format your namenode as you already did:
$ hadoop namenode -format
After that you can start HDFS as below:
[Hadoop023_ROOT]/sbin/start-dfs.sh
