Error: E0902: Exception occured: [User: Root is not allowed to impersonate root - hadoop

I am trying to follow the steps given at http://www.rohitmenon.com/index.php/apache-oozie-installation/
Note: I am not using cloudera distibution of hadoop
The above link is similar to http://oozie.apache.org/docs/4.0.1/DG_QuickStart.html
but with more descriptive seems to me
however while running the below command as a root user i am getting exception
./bin/oozie-setup.sh sharelib create -fs
Note: i have two live node shown at dfshealth.jsp . and i have updated the core-site.xml for all three(including namenode) with property as below
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
i understand this is point where i am making mistake Could someone please guide me
Stacktrace
org.apache.oozie.service.HadoopAccessorException: E0902: Exception occured: [User: root is not allowed to impersonate root]
at
org.apache.oozie.service.HadoopAccessorService.createFileSystem(HadoopAccessorService.java:430)
at org.apache.oozie.tools.OozieSharelibCLI.run(OozieSharelibCLI.java:144)
at org.apache.oozie.tools.OozieSharelibCLI.main(OozieSharelibCLI.java:52)
Caused by: org.apache.hadoop.ipc.RemoteException: User: root is not allowed to impersonate root
at org.apache.hadoop.ipc.Client.call(Client.java:1107)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
at com.sun.proxy.$Proxy5.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:411)
at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:135)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:276)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:241)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:100)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1411)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1429)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
at org.apache.oozie.service.HadoopAccessorService$2.run(HadoopAccessorService.java:422)
at org.apache.oozie.service.HadoopAccessorService$2.run(HadoopAccessorService.java:420)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
at org.apache.oozie.service.HadoopAccessorService.createFileSystem(HadoopAccessorService.java:420)
... 2 more
--------------------------------------
Note: Getting E0902: Exception occured: [User: oozie is not allowed to impersonate oozie] i have followed this link as well but not able to solve my problem
if i change the core-site.xml as below only for NameNode
<property>
<name>hadoop.proxyuser.hadoop.hosts</name>
<value>[NAMENODE IP]</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.groups</name>
<value>hadoop</value>
</property>
I get the exception as
Unauthorized connection for super-user: hadoop

After adding the property files into core-site.xml restart your hadoop and try. Even though if it not works format the namenode and start hadoop it will work.

You need to add these properties in core-site.xml for impersonation in order to solve your whitelist error
<property>
<name>hadoop.proxyuser.oozie.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.oozie.groups</name>
<value>*</value>
</property>
Hope this fixes your issue.

Follow the advice in the article below. Hadoop before 1.1.0 doesn't support wildcard so you have to explicitly specified the hosts and the groups
http://mail-archives.apache.org/mod_mbox/oozie-user/201212.mbox/%3CCAOcnVr1TZZ5X0Mrb7fFA8JdW6rO6PgoJ9u0=2UYbfXf_o8r=DA#mail.gmail.com%3E

I solved the problem by adding those lines in the core-site.xml-file
hadoop.proxyuser.root.hosts
value = *
hadoop.proxyuser.root.groups
value = *
and it works perfectly all my databases and tables are shown.

./oozie-setup.sh sharelib create -fs hdfs://localhost:9000
try to run this command using sudo.
check for hdfs if this path already exits i.e., /user/user_name/share/lib, if it exists remove it using
hadoop fs -rmr /user/user_name
After that run sudo ./oozied.sh. oozie will be started. Then check for your localhost:11000.

Related

Hbase configuration with Hdfs HA

I am trying to setup hbase ha with hadoop ha. I have done setting hadoop ha setup and tested it.
But in hbase setup, while starting, I am getting this following error
java.lang.RuntimeException: Failed construction of Master: class org.apache.hadoop.hbase.master.HMaster.
at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2426)
at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:231)
at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:137)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2436)
Caused by: java.lang.IllegalArgumentException: java.net.UnknownHostException: hdfs-nameservice
at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:373)
at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:258)
at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:153)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:602)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:547)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:139)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2625)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2607)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
at org.apache.hadoop.hbase.util.FSUtils.getRootDir(FSUtils.java:1003)
at org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:570)
at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:381)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2419)
... 5 more
Caused by: java.net.UnknownHostException: hdfs-nameservice
... 25 more
I think my hbase setup doesn't recognize my nameservice hdfs-nameservice.
I am using Hbase 1.2.4 & Hadoop 2.7.3.
My hbase-site.xml has
<property>
<name>hbase.rootdir</name>
<value>hdfs://hdfs-nameservice/hbase</value>
</property>
core-site.xml has
<property>
<name>fs.defaultFS</name>
<value>hdfs://hdfs-nameservice:8020</value>
</property>
and hdfs-site.xml has
<property>
<name>dfs.nameservices</name>
<value>hdfs-nameservice</value>
</property>
<property>
<name>dfs.ha.namenodes.hdfs-nameservice</name>
<value>namenode1,namenode2</value>
</property>
//with rpc, http address confs for both namenode1&2
Tried:
1. Copied both the core and hdfs-site xmls in hbase conf directory.
2. Added Hadoop conf path in both environmental variable and hbase-env.sh
Still couldn't figure out how to make my hbase recognize hdfs-nameservice.
Any help would be much appreciated.

Replace HDFS form local disk to s3 getting error (org.apache.hadoop.service.AbstractService)

We are trying to setup Cloudera 5.5 where HDFS will be working on s3 only for that we have already configured the necessory properties in Core-site.xml
<property>
<name>fs.s3a.access.key</name>
<value>################</value>
</property>
<property>
<name>fs.s3a.secret.key</name>
<value>###############</value>
</property>
<property>
<name>fs.default.name</name>
<value>s3a://bucket_Name</value>
</property>
<property>
<name>fs.defaultFS</name>
<value>s3a://bucket_Name</value>
</property>
After setting it up we were able to browse the files for s3 bucket from command
hadoop fs -ls /
And it shows the files available on s3 only.
But when we start the yarn services JobHistory server fails to start with below error and on launching pig jobs we are getting same error
PriviledgedActionException as:mapred (auth:SIMPLE) cause:org.apache.hadoop.fs.UnsupportedFileSystemException: No AbstractFileSystem for scheme: s3a
ERROR org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils
Unable to create default file context [s3a://kyvosps]
org.apache.hadoop.fs.UnsupportedFileSystemException: No AbstractFileSystem for scheme: s3a
at org.apache.hadoop.fs.AbstractFileSystem.createFileSystem(AbstractFileSystem.java:154)
at org.apache.hadoop.fs.AbstractFileSystem.get(AbstractFileSystem.java:242)
at org.apache.hadoop.fs.FileContext$2.run(FileContext.java:337)
at org.apache.hadoop.fs.FileContext$2.run(FileContext.java:334)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
On serching on Internet we found that we need to set following properties as well in core-site.xml
<property>
<name>fs.s3a.impl</name>
<value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
<description>The implementation class of the S3A Filesystem</description>
</property>
<property>
<name>fs.AbstractFileSystem.s3a.impl</name>
<value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
<description>The FileSystem for S3A Filesystem</description>
</property>
After setting the above properties we are getting following error
org.apache.hadoop.service.AbstractService
Service org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager failed in state INITED; cause: java.lang.RuntimeException: java.lang.NoSuchMethodException: org.apache.hadoop.fs.s3a.S3AFileSystem.<init>(java.net.URI, org.apache.hadoop.conf.Configuration)
java.lang.RuntimeException: java.lang.NoSuchMethodException: org.apache.hadoop.fs.s3a.S3AFileSystem.<init>(java.net.URI, org.apache.hadoop.conf.Configuration)
at org.apache.hadoop.fs.AbstractFileSystem.newInstance(AbstractFileSystem.java:131)
at org.apache.hadoop.fs.AbstractFileSystem.createFileSystem(AbstractFileSystem.java:157)
at org.apache.hadoop.fs.AbstractFileSystem.get(AbstractFileSystem.java:242)
at org.apache.hadoop.fs.FileContext$2.run(FileContext.java:337)
at org.apache.hadoop.fs.FileContext$2.run(FileContext.java:334)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.fs.FileContext.getAbstractFileSystem(FileContext.java:334)
at org.apache.hadoop.fs.FileContext.getFileContext(FileContext.java:451)
at org.apache.hadoop.fs.FileContext.getFileContext(FileContext.java:473)
at org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils.getDefaultFileContext(JobHistoryUtils.java:247)
The jars needed for this is in place but still getting the error any help will be great. Thanks in advance
Update
I tried to remove the property fs.AbstractFileSystem.s3a.impl but it give me the same first exception the one i was getting previously which is
org.apache.hadoop.security.UserGroupInformation
PriviledgedActionException as:mapred (auth:SIMPLE) cause:org.apache.hadoop.fs.UnsupportedFileSystemException: No AbstractFileSystem for scheme: s3a
ERROR org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils
Unable to create default file context [s3a://bucket_name]
org.apache.hadoop.fs.UnsupportedFileSystemException: No AbstractFileSystem for scheme: s3a
at org.apache.hadoop.fs.AbstractFileSystem.createFileSystem(AbstractFileSystem.java:154)
at org.apache.hadoop.fs.AbstractFileSystem.get(AbstractFileSystem.java:242)
at org.apache.hadoop.fs.FileContext$2.run(FileContext.java:337)
at org.apache.hadoop.fs.FileContext$2.run(FileContext.java:334)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.fs.FileContext.getAbstractFileSystem(FileContext.java:334)
at org.apache.hadoop.fs.FileContext.getFileContext(FileContext.java:451)
at org.apache.hadoop.fs.FileContext.getFileContext(FileContext.java:473)
The problem is not with the location of the jars.
The problem is with the setting:
<property>
<name>fs.AbstractFileSystem.s3a.impl</name>
<value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
<description>The FileSystem for S3A Filesystem</description>
</property>
This setting is not needed. Because of this setting, it is searching for following constructor in S3AFileSystem class and there is no such constructor:
S3AFileSystem(URI theUri, Configuration conf);
Following exception clearly tells that it is unable to find a constructor for S3AFileSystem with URI and Configuration parameters.
java.lang.RuntimeException: java.lang.NoSuchMethodException: org.apache.hadoop.fs.s3a.S3AFileSystem.<init>(java.net.URI, org.apache.hadoop.conf.Configuration)
To resolve this problem, remove fs.AbstractFileSystem.s3a.impl setting from core-site.xml. Just having fs.s3a.impl setting in core-site.xml should solve your problem.
EDIT:
org.apache.hadoop.fs.s3a.S3AFileSystem just implements FileSystem.
Hence, you cannot set value of fs.AbstractFileSystem.s3a.impl to org.apache.hadoop.fs.s3a.S3AFileSystem, since org.apache.hadoop.fs.s3a.S3AFileSystem does not implement AbstractFileSystem.
I am using Hadoop 2.7.0 and in this version s3A is not exposed as AbstractFileSystem.
There is JIRA ticket: https://issues.apache.org/jira/browse/HADOOP-11262 to implement the same and the fix is available in Hadoop 2.8.0.
Assuming, your jar has exposed s3A as AbstractFileSystem, you need to set the following for fs.AbstractFileSystem.s3a.impl:
<property>
<name>fs.AbstractFileSystem.s3a.impl</name>
<value>org.apache.hadoop.fs.s3a.S3A</value>
</property>
That will solve your problem.

Oozie on YARN - oozie is not allowed to impersonate hadoop

I'm trying to use Oozie from Java to start a job on a Hadoop cluster. I have very limited experience with Oozie on Hadoop 1 and now I'm struggling trying out the same thing on YARN.
I'm given a machine that doesn't belong to the cluster, so when I try to start my job I get the following exception:
E0501 : E0501: Could not perform authorization operation, User: oozie is not allowed to impersonate hadoop
Why is that and what to do?
I read a bit about core-site properties that need to be set
<property>
<name>hadoop.proxyuser.oozie.groups</name>
<value>users</value>
</property>
<property>
<name>hadoop.proxyuser.oozie.hosts</name>
<value>master</value>
</property>
Does it seem that this is the problem? Should I contact people responsible for cluster to fix that?
Could there be problems because I'm using same code for YARN as I did for Hadoop 1? Should something be changed? For example, I'm setting nameNode and jobTracker in workflow.xml, should jobTracker exist, since there is now ResourceManager? I have set the address of ResourceManager, but left the property name as jobTracker, could that be the error?
Maybe I should also mention that Ambari is used...
Hi please update the core-site.xml
<property>
<name>hadoop.proxyuser.hadoop.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.hosts</name>
<value>*</value>
</property>
and jobTracker address is the Resourcemananger address that will not be the case . once update the core-site.xml file it will works.
Reason:
Cause of this type of error is- You run oozie server as a hadoop user but you define oozie as a proxy user in core-site.xml file.
Solution:
change the ownership of oozie installation directory to oozie user and run oozie server as a oozie user and problem will be solved.

Getting the following error "Datanode denied communication with namenode" while configuring hadoop 0.23.8

I am trying to configure hadoop 0.23.8 on my macbook and am running in with the following exception
org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException: Datanode denied communication with namenode: 192.168.1.13:50010
at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:549)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:2548)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:784)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:394)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1571)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1567)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1262)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1565)
My core-site.xml looks like this
<configuration>
<property>
<name>dfs.federation.nameservices</name>
<value>ns1</value>
</property>
<property>
<name>dfs.namenode.rpc-address.ns1</name>
<value>192.168.1.13:54310</value>
</property>
<property>
<name>dfs.namenode.http-address.ns1</name>
<value>192.168.1.13:50070</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address.ns1</name>
<value>192.168.1.13:50090</value>
</property>
</configuration>
Any ideas on what I may be doing wrong?
Had the same problem with 2.6.0, and shamouda's answer solved it (I was not using dfs.hosts at all so that could not be the answer. I did add
<property>
<name>dfs.namenode.datanode.registration.ip-hostname-check</name>
<value>false</value>
</property>
to hdfs-site.xml and that was enough to fix the issue.
I got the same problem with Hadoop 2.6.0 and the solution for my case was different than Tariq's answer.
I couldn't list the IP-Host mapping in /etc/hosts because I use DHCP for setting the IPs dynamically.
The problem was that my DNS does not allow Reverse DNS lookup (i.e. looking up the hostname given the IP), and HDFS by default use reverse DNS lookup whenever a datanode tries to register with a namenode. Luckily, this behaviour can be disabled by setting this property "dfs.namenode.datanode.registration.ip-hostname-check" to false in hdfs-site.xml
How to know that your DNS does not allow Reverse lookup? the answer in ubuntu is to use the command "host ". If it can resolve the hostname, then reverse lookup is enabled. If it fails, then reverse lookup is disabled.
References:
1. http://rrati.github.io/blog/2014/05/07/apache-hadoop-plus-docker-plus-fedora-running-images/
2. https://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
Looks like name resolution issue to me. Possible reasons :
Machine is listed in the file defined by dfs.hosts.exclude
dfs.hosts is used and the machine is not listed within that file
Also make sure you have IP+hostname of the machine listed in your hosts file.
HTH
I got this problem.
earlier configuration in core-site.xml is like this.
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:12345</value>
</property>
Later I've modified the localhost name with my HOSTNAME (PC name)
<property>
<name>fs.default.name</name>
<value>hdfs://cnu:12345</value>
</property>
It worked for me.
Just for information. I have had the same problem and i have recognized, that there was a typo in the hostname of my slaves. Vise versa there the node itself can have the wrong hostname.

Installing Hadoop on NFS

As a start, I've installed Hadoop (0.15.2) and setup a cluster of 3 nodes: one each for NameNode, DataNode and the JobTracker. All the daemons are up and running. But when I issue any command I get the above error. For instance, when I do a copyFromLocal, I get the following error:
Am I missing something?
More details:
I am trying to install Hadoop on an NFS file system. I've installed 1.0.4 version and tried running it but to of no avail. The 1.0.4 version doesn't start the datanode. And the log files for the datanode are empty. Hence I switched back to 0.15 version which started all the daemons atleast.
I believe the problem is due to the underlying NFS file system i.e. all the datanodes and masters using the same files and folders. But I am not sure if that is actually the case.
But I don't see any reason why I shouldn't be able to run Hadoop on NFS (after appropriately setting the configuration parameters).
Currently I am trying and figuring out if I could set the name and data directories differently for different machines based on the individual machine names.
Configuration file: (hadoop-site.xml)
<property>
<name>fs.default.name</name>
<value>mumble-12.cs.wisc.edu:9001</value>
</property>
<property>
<name>mapred.job.tracker</name>
<value>mumble-13.cs.wisc.edu:9001</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.secondary.info.port</name>
<value>9002</value>
</property>
<property>
<name>dfs.info.port</name>
<value>9003</value>
</property>
<property>
<name>mapred.job.tracker.info.port</name>
<value>9004</value>
</property>
<property>
<name>tasktracker.http.port</name>
<value>9005</value>
</property>
Error using Hadoop 1.0.4 (DataNode doesn't get started):
2013-04-22 18:50:50,438 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 9001, call addBlock(/tmp/hadoop-akshar/mapred/system/jobtracker.info, DFSClient_502734479, null) from 128.105.112.13:37204: error: java.io.IOException: File /tmp/hadoop-akshar/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
java.io.IOException: File /tmp/hadoop-akshar/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
Error using Hadoop 0.15.2:
[akshar#mumble-12] (38)$ bin/hadoop fs -copyFromLocal lib/junit-3.8.1.LICENSE.txt input
13/04/17 03:22:11 WARN fs.DFSClient: Error while writing.
java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:189)
at java.net.SocketInputStream.read(SocketInputStream.java:121)
at java.net.SocketInputStream.read(SocketInputStream.java:203)
at java.io.DataInputStream.readShort(DataInputStream.java:312)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.endBlock(DFSClient.java:1660)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.close(DFSClient.java:1733)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:49)
at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:64)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:55)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:83)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:140)
at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:826)
at org.apache.hadoop.fs.FsShell.copyFromLocal(FsShell.java:120)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:1360)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:1478)
13/04/17 03:22:12 WARN fs.DFSClient: Error while writing.
java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:189)
at java.net.SocketInputStream.read(SocketInputStream.java:121)
at java.net.SocketInputStream.read(SocketInputStream.java:203)
at java.io.DataInputStream.readShort(DataInputStream.java:312)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.endBlock(DFSClient.java:1660)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.close(DFSClient.java:1733)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:49)
at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:64)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:55)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:83)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:140)
at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:826)
at org.apache.hadoop.fs.FsShell.copyFromLocal(FsShell.java:120)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:1360)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:1478)
13/04/17 03:22:12 WARN fs.DFSClient: Error while writing.
java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:189)
at java.net.SocketInputStream.read(SocketInputStream.java:121)
at java.net.SocketInputStream.read(SocketInputStream.java:203)
at java.io.DataInputStream.readShort(DataInputStream.java:312)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.endBlock(DFSClient.java:1660)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.close(DFSClient.java:1733)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:49)
at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:64)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:55)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:83)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:140)
at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:826)
at org.apache.hadoop.fs.FsShell.copyFromLocal(FsShell.java:120)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:1360)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:1478)
copyFromLocal: Connection reset
I was able to get Hadoop to run over NFS using version 1.1.2. It might work for other versions, but I can't guarantee anything.
If you have an NFS file system then each node should have access to the filesystem. The fs.default.name tells Hadoop the filesystem URI to use, so it should be pointed to the local disk. I'll assume that your NFS directory is mounted to each node at /nfs.
In core-site.xml you should define:
<property>
<name>fs.default.name</name>
<value>file:///</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/nfs/tmp</value>
</property>
In mapred-site.xml you should define:
<property>
<name>mapred.job.tracker</name>
<value>node1:8021</value>
</property>
<property>
<name>mapred.local.dir</name>
<value>/tmp/mapred-local</value>
</property>
Since hadoop.tmp.dir is pointed to the nfs drive then the default locations of mapred.system.dir and mapreduce.jobtracker.staging.root.dir point to locations on the nfs drive. It might run with leaving the default value for mapred.local.dir, but it is supposed to point to the local filesystem so to be safe you can put that in /tmp.
You don't have to worry about hdfs-site.xml. This configuration file is used when you start the namenode, but with everything being distributed on the nfs drive you shouldn't run HDFS.
Now you can run start-mapred.sh on the jobtracker node and run a hadoop job. Don't run start-all.sh or start-dfs.sh because those will start HDFS. If you run multiple DataNodes that point to the same NFS directory, then one DataNode will lock that directory and the others will shutdown because they are unable to obtain a lock.
I tested the configuration with:
bin/hadoop jar hadoop-examples-1.1.2.jar wordcount /nfs/data/test.text /nfs/out
Note that you need to specify full paths to the input and output locations.
I also tried:
bin/hadoop jar hadoop-examples-1.1.2.jar grep /nfs/data/loremIpsum.txt /nfs/out2 lorem
It gave me the same output as when I run it in Standalone, so I assume it is performing correctly.
Here is more information on fs.default.name:
http://www.greenplum.com/blog/dive-in/usage-and-quirks-of-fs-default-name-in-hadoop-filesystem

Resources