Couldn't start hadoop datanode normally - hadoop

i am trying to install hadoop 2.2.0 i am getting following kind of error while starting dataenode services please help me resolve this issue.Thanks in Advance.
2014-03-11 08:48:16,406 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /home/prassanna/usr/local/hadoop/yarn_data/hdfs/datanode/in_use.lock acquired by nodename 3627#prassanna-Studio-1558
2014-03-11 08:48:16,426 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool BP-611836968-127.0.1.1-1394507838610 (storage id DS-1960076343-127.0.1.1-50010-1394127604582) service to localhost/127.0.0.1:9000
java.io.IOException: Incompatible clusterIDs in /home/prassanna/usr/local/hadoop/yarn_data/hdfs/datanode: namenode clusterID = CID-fb61aa70-4b15-470e-a1d0-12653e357a10; datanode clusterID = CID-8bf63244-0510-4db6-a949-8f74b50f2be9
at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:391)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:191)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:219)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:837)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:808)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:280)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:222)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:664)
at java.lang.Thread.run(Thread.java:662)
2014-03-11 08:48:16,427 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool BP-611836968-127.0.1.1-1394507838610 (storage id DS-1960076343-127.0.1.1-50010-1394127604582) service to localhost/127.0.0.1:9000
2014-03-11 08:48:16,532 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool BP-611836968-127.0.1.1-1394507838610 (storage id DS-1960076343-127.0.1.1-50010-1394127604582)
2014-03-11 08:48:18,532 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2014-03-11 08:48:18,534 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0
2014-03-11 08:48:18,536 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/**********************************
SHUTDOWN_MSG: Shutting down DataNode at prassanna-Studio-1558/127.0.1.1

Make sure you are ready with correct configuration and right path.
This is a link for Running Hadoop on ubuntu.
I have used this link to setup hadoop in my machine and it works fine.

That simply shows that the datanode tried to startup but took some exception and died.
Please check the datanode log under the logs folder in the hadoop installation folder (unless you changed that config) for exceptions. It usually points to a configuration issue of some kind, esp. network settings (/etc/hosts) related but there are quite a few possibilities.

Refer this,
1.Check JAVA_HOME---
readlink -f $(which java)
/usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java
2.If JAVA is not available install by command
sudo apt-get install defalul-jdk
than run 1. and check on terminal
java -version
javac -version
3.Configure SSH
Hadoop requires SSH access to manage its nodes, i.e. remote machines plus your local machine if you want to use Hadoop on it (which is what we want to do in this short tutorial). For our single-node setup of Hadoop, we therefore need to configure SSH access to localhost for the user .
sudo apt-get install ssh
sudo su hadoop
ssh-keygen -t rsa -P “”
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
ssh localhost
Download and extract hadoop-2.7.3(Chosse dirrectory having read write permisson)
Set Environment Variable
sudo gedit .bashrc
source .bashrc
Setup Configuration Files
The following files will have to be modified to complete the Hadoop setup:
~/.bashrc (Already done)
(PATH)/etc/hadoop/hadoop-env.sh
(PATH)/etc/hadoop/core-site.xml
(PATH)/etc/hadoop/mapred-site.xml.template
(PATH)/etc/hadoop/hdfs-site.xm
gedit (PATH)/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
gedit (PATH)/etc/hadoop/core-site.xml:
The (HOME)/etc/hadoop/core-site.xml file contains configuration properties that Hadoop uses when starting up.
This file can be used to override the default settings that Hadoop starts with.
($ sudo mkdir -p /app/hadoop/tmp)
Open the file and enter the following in between the <configuration></configuration> tag:
gedit /usr/local/hadoop/etc/hadoop/core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/app/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>
</configuration>
(PATH)/etc/hadoop/mapred-site.xml
By default, the (PATH)/etc/hadoop/ folder contains (PATH)/etc/hadoop/mapred-site.xml.template file which has to be renamed/copied with the name mapred-site.xml:
cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml
The mapred-site.xml file is used to specify which framework is being used for MapReduce.
We need to enter the following content in between the <configuration></configuration> tag:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>
</configuration>
(PATH)/etc/hadoop/hdfs-site.xml
The (PATH)/etc/hadoop/hdfs-site.xml file needs to be configured for each host in the cluster that is being used.
It is used to specify the directories which will be used as the namenode and the datanode on that host.
Before editing this file, we need to create two directories which will contain the namenode and the datanode for this Hadoop installation.
This can be done using the following commands:
sudo mkdir -p /usr/local/hadoop_store/hdfs/namenode
sudo mkdir -p /usr/local/hadoop_store/hdfs/datanode
Open the file and enter the following content in between the <configuration></configuration> tag:
gedit (PATH)/etc/hadoop/hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/datanode</value>
</property>
</configuration>
Format the New Hadoop Filesystem
Now, the Hadoop file system needs to be formatted so that we can start to use it. The format command should be issued with write permission since it creates current directory under /usr/local/hadoop_store/ folder:
bin/hadoop namenode -format
or
bin/hdfs namenode -format
HADOOP SETUP IS DONE
Now start the hdfs
start-dfs.sh
start-yarn.sh
CHECK URL: http://localhost:50070/
FOR STOPPING HDFS
stop-dfs.sh
stop-yarn.sh

Related

Hadoop: Secondary NameNode Permission Denied

I'm attempting to run Hadoop in pseudo-distributed mode to learn how the system work. To install it, I've downloaded Hadoop-3.0.0 from the site, untarred it. I've done my configurations as follows (leaving out the configuration tags for brevity):
core-site.xml
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost/</value>
</property>
hdsf-site.xml
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
yarn-site.xml
<property>
<name>yarn.resourcemanager.hostname</name>
<value>localhost</value> </property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
After doing this, I've formatted my hdfs using
hdfs namenode -format
I've also setup passwordless ssh using the following:
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa2
cat ~/.ssh/id_rsa2.pub >> ~/.ssh/authorized_keys
(I've also added id_rsa2.pub as the default for localhost using a config file, since I already was using id_rsa.pub for something else and didn't want to mix-and-match in case I broke something)
I'm able to ssh into localhost. All looks well.
Then I run start-dfs.sh, and I see this error:
Starting namenodes on [localhost]
Starting datanodes
Starting secondary namenodes [zm.local]
zm.local: zm#zm.local: Permission denied (publickey,password,keyboard-interactive).
2018-01-16 17:31:35,807 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
If I run jps (after starting yarn and mapreduce history server), I have the following:
37921 NodeManager
38070 Jps
37434 NameNode
38060 JobHistoryServer
37821 ResourceManager
Noticeably, the SecondaryNameNode is missing, my assumption being it's due to the error above.
I can then try to use hadoop's fs command and I'm able to create a folder and look it up. But if I try to copy any data over, I get notified that the NameNode is in SAFEmode. If I turn off save mode using:
hdfs dfsadmin -safemode leave
It immediately turns back on. By going to the namenode port on localhost, I see the following message:
Safe mode is ON. Resources are low on NN. Please add or free up more resourcesthen turn off safe mode manually. NOTE: If you turn off safe mode before adding resources, the NN will immediately return to safe mode. Use "hdfs dfsadmin -safemode leave" to turn safe mode off.
However, I have plenty of resources. The single datanode is using less than 8% of it's allotted space, the namenode as almost 100GB of space. The datanode and namenode are both reporting as healthy. Thus, I think the problem is the lack of a secondary namenode. With that in mind, is anyone aware what might be causing the SecondaryNameNode to have different permission issues from the PrimaryNameNode? It seems to be trying to put the sNN somewhere on the local machine instead - but when I check in /tmp/hadoop*, all of the file permissions seem to be normal.
Thanks for any help.

Hadoop - Mkdirs failed to create C:\Users\acer\AppData\Local\Temp\hadoop-unjar778 7707269774970262\META-INF\license

I installed Hadoop on a Windows machine in pseudo-distributed mode and tried to run a MapReduce job on it. The Namenode and Datanode ran without any problems, however, the MapReduce job kept failing with the error:
Exception in thread "main" java.io.IOException: Mkdirs failed to create C:\Users\acer\AppData\Local\Temp\hadoop-unjar778
7707269774970262\META-INF\license
at org.apache.hadoop.util.RunJar.ensureDirectory(RunJar.java:128)
at org.apache.hadoop.util.RunJar.unJar(RunJar.java:104)
at org.apache.hadoop.util.RunJar.unJar(RunJar.java:81)
at org.apache.hadoop.util.RunJar.run(RunJar.java:209)
I've checked that I already have full permission to that folder, and I also tried using maven-shade-plugin with no success.
Not sure what the issue but there are some todo
verify the folder permission with proper user for Temp\hadoop-unjar778
7707269774970262\META-INF (Can use chmod -R 777)
Check Namenode is running while executing MR
Node Managger service is running
Check the configuration:
For Hadoop 1.x:
<property>
<name>mapred.job.tracker</name>
<value>localhost:9101</value>
</property
For Hadoop 2.x:
<property>
<name>mapreduce.jobtracker.address</name>
<value>localhost:9101</value>
</property>

Hadoop - java.net.ConnectException: Connection refused

I want connect to hdfs (in localhost) and i have a error:
Call From despubuntu-ThinkPad-E420/127.0.1.1 to localhost:54310 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
I follow all the steps in other posts, but i dont solve my problem. I use hadoop 2.7 and this is configurations:
core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/despubuntu/hadoop/name/data</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
I type /usr/local/hadoop/bin/hdfs namenode -format and
/usr/local/hadoop/sbin/start-all.sh
But when i type "jps" the result is:
10650 Jps
4162 Main
5255 NailgunRunner
20831 Launcher
I need help...
Make sure that DFS which is set to port 9000 in core-site.xml is actually started. You can check with jps command. You can start it with sbin/start-dfs.sh
I guess that you didn't set up your hadoop cluster correctly please follow these steps :
Step1: begin with setting up .bashrc:
vi $HOME/.bashrc
put the following lines at the end of the file: (change the hadoop home as yours)
# Set Hadoop-related environment variables
export HADOOP_HOME=/usr/local/hadoop
# Set JAVA_HOME (we will also configure JAVA_HOME directly for Hadoop later on)
export JAVA_HOME=/usr/lib/jvm/java-6-sun
# Some convenient aliases and functions for running Hadoop-related commands
unalias fs &> /dev/null
alias fs="hadoop fs"
unalias hls &> /dev/null
alias hls="fs -ls"
# If you have LZO compression enabled in your Hadoop cluster and
# compress job outputs with LZOP (not covered in this tutorial):
# Conveniently inspect an LZOP compressed file from the command
# line; run via:
#
# $ lzohead /hdfs/path/to/lzop/compressed/file.lzo
#
# Requires installed 'lzop' command.
#
lzohead () {
hadoop fs -cat $1 | lzop -dc | head -1000 | less
}
# Add Hadoop bin/ directory to PATH
export PATH=$PATH:$HADOOP_HOME/bin
step 2 : edit hadoop-env.sh as following:
# The java implementation to use. Required.
export JAVA_HOME=/usr/lib/jvm/java-6-sun
step 3 : Now create a directory and set the required ownerships and permissions
$ sudo mkdir -p /app/hadoop/tmp
$ sudo chown hduser:hadoop /app/hadoop/tmp
# ...and if you want to tighten up security, chmod from 755 to 750...
$ sudo chmod 750 /app/hadoop/tmp
step 4 : edit core-site.xml
<property>
<name>hadoop.tmp.dir</name>
<value>/app/hadoop/tmp</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
</property>
step 5 : edit mapred-site.xml
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
</property>
step 6 : edit hdfs-site.xml
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
finally format your hdfs (You need to do this the first time you set up a Hadoop cluster)
$ /usr/local/hadoop/bin/hadoop namenode -format
hope this will help you
I got the same issue. You can see Name node, DataNode, Resource manager and Task manager daemons are running when you type. So just do start-all.sh then all daemons start running and now you can access HDFS.
First check is if java processes are working or not by typing jps command on command line. On running jps command following processes are mandatory to run-->>
DataNode
jps
NameNode
SecondaryNameNode
If following processes are not running then first start the name node by using following command-->>
start-dfs.sh
This worked out for me and removed the error you stated.
I was getting similar error. Upon checking I found that my namenode service was in stopped state.
check status of the namenode sudo status hadoop-hdfs-namenode
if its not in started/running state
start namenode service sudo start hadoop-hdfs-namenode
Do keep in mind that it takes time before name node service becomes fully functional after restart. It reads all the hdfs edits in memory. You can check progress of this in /var/log/hadoop-hdfs/ using command tail -f /var/log/hadoop-hdfs/{Latest log file}

How to add an hard disk to hadoop

I installed Hadoop 2.4 on Ubuntu 14.04 and now I am trying to add an internal sata HD to the existing cluster.
I have mounted the new hd in /mnt/hadoop and assigned its ownership to the hadoop user
Then I tried to add it to the configuration file as follow:
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>file:///home/hadoop/hadoopdata/hdfs/namenode, file:///mnt/hadoop/hadoopdata/hdfs/namenode</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>file:///home/hadoop/hadoopdata/hdfs/datanode, file:///mnt/hadoop/hadoopdata/hdfs/datanode</value>
</property>
</configuration>
Afterwards, I started the hdfs:
Starting namenodes on [localhost]
localhost: starting namenode, logging to /home/hadoop/hadoop/logs/hadoop-hadoop-namenode-hadoop-Datastore.out
localhost: starting datanode, logging to /home/hadoop/hadoop/logs/hadoop-hadoop-datanode-hadoop-Datastore.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /home/hadoop/hadoop/logs/hadoop-hadoop-secondarynamenode-hadoop-Datastore.out
It seems that it does not fire up the second hd
This is my core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
In addition I tried to refresh the namenode and I get a connection problem:
Refreshing namenode [localhost:9000]
refreshNodes: Call From hadoop-Datastore/127.0.1.1 to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
Error: refresh of namenodes failed, see error messages above.
In addition, I can't connect to the Hadoop web interface.
It seems that I have two related problems:
1) A connection problem
2) I cannot connect to the new installed hd
Are these problem related?
How can I fix these issues?
Thanks
EDIT
I can ping the localhost and I can access localhost:50090/status.jsp
However, I cannot access 50030 and 50070
<property>
<name>dfs.name.dir</name>
<value>file:///home/hadoop/hadoopdata/hdfs/namenode, file:///mnt/hadoop/hadoopdata/hdfs/namenode</value>
</property>
This is documented as:
Determines where on the local filesystem the DFS name node should store the name table(fsimage). If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy.
Are you sure you need this? Do you want your fsimage to be copied in both locations, for redundancy? And if yes, did you actually copy the fsimage on the new HDD before starting the namenode? See Adding a new namenode data directory to an existing cluster.
The new data directory (dfs.data.dir) is OK, the datanode should pick it up and start using it for placing blocks.
Also, as a general troubleshooting advice, look into the namenode and datanode logs for more clues.
Regarding your comment: "sudo chown -R hadoop.hadoop /usr/local/hadoop_store."
The owner has to be hdfs user. Try:
sudo chown -R hdfs.hadoop /usr/local/hadoop_store.

Installing Hadoop on NFS

As a start, I've installed Hadoop (0.15.2) and setup a cluster of 3 nodes: one each for NameNode, DataNode and the JobTracker. All the daemons are up and running. But when I issue any command I get the above error. For instance, when I do a copyFromLocal, I get the following error:
Am I missing something?
More details:
I am trying to install Hadoop on an NFS file system. I've installed 1.0.4 version and tried running it but to of no avail. The 1.0.4 version doesn't start the datanode. And the log files for the datanode are empty. Hence I switched back to 0.15 version which started all the daemons atleast.
I believe the problem is due to the underlying NFS file system i.e. all the datanodes and masters using the same files and folders. But I am not sure if that is actually the case.
But I don't see any reason why I shouldn't be able to run Hadoop on NFS (after appropriately setting the configuration parameters).
Currently I am trying and figuring out if I could set the name and data directories differently for different machines based on the individual machine names.
Configuration file: (hadoop-site.xml)
<property>
<name>fs.default.name</name>
<value>mumble-12.cs.wisc.edu:9001</value>
</property>
<property>
<name>mapred.job.tracker</name>
<value>mumble-13.cs.wisc.edu:9001</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.secondary.info.port</name>
<value>9002</value>
</property>
<property>
<name>dfs.info.port</name>
<value>9003</value>
</property>
<property>
<name>mapred.job.tracker.info.port</name>
<value>9004</value>
</property>
<property>
<name>tasktracker.http.port</name>
<value>9005</value>
</property>
Error using Hadoop 1.0.4 (DataNode doesn't get started):
2013-04-22 18:50:50,438 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 9001, call addBlock(/tmp/hadoop-akshar/mapred/system/jobtracker.info, DFSClient_502734479, null) from 128.105.112.13:37204: error: java.io.IOException: File /tmp/hadoop-akshar/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
java.io.IOException: File /tmp/hadoop-akshar/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
Error using Hadoop 0.15.2:
[akshar#mumble-12] (38)$ bin/hadoop fs -copyFromLocal lib/junit-3.8.1.LICENSE.txt input
13/04/17 03:22:11 WARN fs.DFSClient: Error while writing.
java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:189)
at java.net.SocketInputStream.read(SocketInputStream.java:121)
at java.net.SocketInputStream.read(SocketInputStream.java:203)
at java.io.DataInputStream.readShort(DataInputStream.java:312)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.endBlock(DFSClient.java:1660)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.close(DFSClient.java:1733)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:49)
at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:64)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:55)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:83)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:140)
at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:826)
at org.apache.hadoop.fs.FsShell.copyFromLocal(FsShell.java:120)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:1360)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:1478)
13/04/17 03:22:12 WARN fs.DFSClient: Error while writing.
java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:189)
at java.net.SocketInputStream.read(SocketInputStream.java:121)
at java.net.SocketInputStream.read(SocketInputStream.java:203)
at java.io.DataInputStream.readShort(DataInputStream.java:312)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.endBlock(DFSClient.java:1660)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.close(DFSClient.java:1733)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:49)
at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:64)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:55)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:83)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:140)
at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:826)
at org.apache.hadoop.fs.FsShell.copyFromLocal(FsShell.java:120)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:1360)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:1478)
13/04/17 03:22:12 WARN fs.DFSClient: Error while writing.
java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:189)
at java.net.SocketInputStream.read(SocketInputStream.java:121)
at java.net.SocketInputStream.read(SocketInputStream.java:203)
at java.io.DataInputStream.readShort(DataInputStream.java:312)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.endBlock(DFSClient.java:1660)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.close(DFSClient.java:1733)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:49)
at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:64)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:55)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:83)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:140)
at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:826)
at org.apache.hadoop.fs.FsShell.copyFromLocal(FsShell.java:120)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:1360)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:1478)
copyFromLocal: Connection reset
I was able to get Hadoop to run over NFS using version 1.1.2. It might work for other versions, but I can't guarantee anything.
If you have an NFS file system then each node should have access to the filesystem. The fs.default.name tells Hadoop the filesystem URI to use, so it should be pointed to the local disk. I'll assume that your NFS directory is mounted to each node at /nfs.
In core-site.xml you should define:
<property>
<name>fs.default.name</name>
<value>file:///</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/nfs/tmp</value>
</property>
In mapred-site.xml you should define:
<property>
<name>mapred.job.tracker</name>
<value>node1:8021</value>
</property>
<property>
<name>mapred.local.dir</name>
<value>/tmp/mapred-local</value>
</property>
Since hadoop.tmp.dir is pointed to the nfs drive then the default locations of mapred.system.dir and mapreduce.jobtracker.staging.root.dir point to locations on the nfs drive. It might run with leaving the default value for mapred.local.dir, but it is supposed to point to the local filesystem so to be safe you can put that in /tmp.
You don't have to worry about hdfs-site.xml. This configuration file is used when you start the namenode, but with everything being distributed on the nfs drive you shouldn't run HDFS.
Now you can run start-mapred.sh on the jobtracker node and run a hadoop job. Don't run start-all.sh or start-dfs.sh because those will start HDFS. If you run multiple DataNodes that point to the same NFS directory, then one DataNode will lock that directory and the others will shutdown because they are unable to obtain a lock.
I tested the configuration with:
bin/hadoop jar hadoop-examples-1.1.2.jar wordcount /nfs/data/test.text /nfs/out
Note that you need to specify full paths to the input and output locations.
I also tried:
bin/hadoop jar hadoop-examples-1.1.2.jar grep /nfs/data/loremIpsum.txt /nfs/out2 lorem
It gave me the same output as when I run it in Standalone, so I assume it is performing correctly.
Here is more information on fs.default.name:
http://www.greenplum.com/blog/dive-in/usage-and-quirks-of-fs-default-name-in-hadoop-filesystem

Resources