hdfs put fails from laptop to remote hadoop cluster

I have my hadoop cluster set up on a different network. Because of this, hdfs put is failing when I run it from my laptop.
Is there a port I should forward, or something else I need to set up, to access the datanodes remotely? I can see the cluster's local IP addresses being used in the error message.
Here is the command: hdfs dfs -put ~/Documents/reddit-streaming/redditStreaming/target/redditStreaming-1.0-SNAPSHOT.jar hdfs://mydns.asuscomm.com:8021/user/me/jars/
and here is the error message:
2021-10-14 18:04:55,704 WARN hdfs.DataStreamer: Exception in createBlockOutputStream blk_1073742036_1212
java.net.UnknownHostException
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:591)
at org.apache.hadoop.hdfs.DataStreamer.createSocketForPipeline(DataStreamer.java:253)
at org.apache.hadoop.hdfs.DataStreamer.createBlockOutputStream(DataStreamer.java:1757)
at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1711)
at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:707)
2021-10-14 18:04:55,708 WARN hdfs.DataStreamer: Abandoning BP-668799564-192.168.50.7-1633461871664:blk_1073742036_1212
2021-10-14 18:04:55,752 WARN hdfs.DataStreamer: Excluding datanode DatanodeInfoWithStorage[192.168.50.31:9866,DS-60974173-31d6-4dcb-a2ba-05ab6431db66,DISK]
2021-10-14 18:05:00,801 WARN hdfs.DataStreamer: Exception in createBlockOutputStream blk_1073742037_1213
java.net.UnknownHostException
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:591)
at org.apache.hadoop.hdfs.DataStreamer.createSocketForPipeline(DataStreamer.java:253)
at org.apache.hadoop.hdfs.DataStreamer.createBlockOutputStream(DataStreamer.java:1757)
at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1711)
at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:707)
2021-10-14 18:05:00,801 WARN hdfs.DataStreamer: Abandoning BP-668799564-192.168.50.7-1633461871664:blk_1073742037_1213
2021-10-14 18:05:00,833 WARN hdfs.DataStreamer: Excluding datanode DatanodeInfoWithStorage[192.168.50.19:9866,DS-aeaca5a1-562c-4f35-b2fb-6f0b51c5f695,DISK]
2021-10-14 18:05:00,869 WARN hdfs.DataStreamer: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/me/jars/redditStreaming-1.0-SNAPSHOT.jar._COPYING_ could only be written to 0 of the 1 minReplication nodes. There are 2 datanode(s) running and 2 node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2329)
at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2942)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:915)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:593)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:600)
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:568)
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:552)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1093)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1035)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:963)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2966)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1573)
at org.apache.hadoop.ipc.Client.call(Client.java:1519)
at org.apache.hadoop.ipc.Client.call(Client.java:1416)
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:242)
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:129)
at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:530)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream.addBlock(DFSOutputStream.java:1084)
at org.apache.hadoop.hdfs.DataStreamer.locateFollowingBlock(DataStreamer.java:1898)
at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1700)
at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:707)
I have this property in my hdfs-site.xml file on my laptop:
<property>
<name>dfs.client.use.datanode.hostname</name>
<value>true</value>
</property>
I can also see in the UI that both datanodes are running.

I assume you've forwarded the namenode port (8021) since it can see that 2 datanodes exist?
Yes, the datanodes have their own ports that need to be available to the client for data to actually be written
Check the value for dfs.datanode.address and make sure you can establish a connection to the port listed there for each datanode.
If you look at the error, you can see this is 9866
Excluding datanode DatanodeInfoWithStorage[192.168.50.31:9866
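For example, a quick way to verify this (a sketch; it assumes nc is available on the laptop and that the datanode port is forwarded on the same dynamic-DNS name as the namenode):
# On a cluster node: print the configured datanode transfer address (default 0.0.0.0:9866)
hdfs getconf -confKey dfs.datanode.address
# From the laptop: check that the port is actually reachable through the router / port-forward
nc -vz mydns.asuscomm.com 9866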
And also, IIUC, the use.datanode.hostname config needs to actually be in the cluster, not your local laptop config, for the protocol to return the hostnames rather than the IPs
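For instance, a rough sketch of the relevant entries in the cluster-side hdfs-site.xml (both property names are standard Hadoop settings; which ones you actually need depends on your setup):
<property>
<name>dfs.client.use.datanode.hostname</name>
<value>true</value>
</property>
<property>
<name>dfs.datanode.use.datanode.hostname</name>
<value>true</value>
</property>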
There is also an HTTP port you can open if you want to see each datanode's web portal (it should also be accessible from the namenode UI).
The alternative, more secure / less exposed option is to establish an edge node between the networks that you can only SSH to and SFTP files into (assuming you don't otherwise have a shared fileserver), then run your hdfs commands from there. You can set up a SOCKS proxy if you need to access a web UI in that network.
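A sketch of that workflow (edgenode is a hypothetical host sitting on the cluster's network):
# Copy the jar to the edge node, then run the HDFS command from inside the network
scp ~/Documents/reddit-streaming/redditStreaming/target/redditStreaming-1.0-SNAPSHOT.jar me@edgenode:/tmp/
ssh me@edgenode "hdfs dfs -put /tmp/redditStreaming-1.0-SNAPSHOT.jar /user/me/jars/"
# Optional: open a SOCKS proxy on local port 1080 to reach the web UIs in that network
ssh -D 1080 -N me@edgenode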
To reiterate, you should not expose a Hadoop cluster without Kerberos & TLS over dynamic DNS through any internet-facing router.

Related

Namenode not starting without formatting

I have a Hadoop cluster set up manually in high availability with one primary namenode, one standby namenode and one datanode. I formatted the namenode in the initial startup process, but if all the servers shut down due to an electricity outage, I have to restart all the services again, such as the journal nodes, zookeeper, namenodes, datanode and failover daemons.
However, when I start the failover daemons (dfszkfailover) on both namenodes, the namenode stops on both of them. For it to start, I have to format the namenode (on the primary namenode) and delete the temp files.
This is my local setup and I have to integrate it in production as well. I cannot keep formatting the namenode in the production environment, as I will lose all the data.
I read some of the answers explaining to point the namenode and datanode directories (in hdfs-site.xml) to some specific folder other than /tmp. I am already pointing them to my home directory.
Please suggest some method so that the namenode and failover daemons both start without formatting the namenode.
EDIT -- I am sharing the last 100 lines of my namenode nn1 logs:
192.168.12.12:8485: Journal Storage Directory /tmp/hadoop/dfs/journalnode/ha-cluster not formatted
at org.apache.hadoop.hdfs.qjournal.server.Journal.checkFormatted(Journal.java:479)
at org.apache.hadoop.hdfs.qjournal.server.Journal.getLastPromisedEpoch(Journal.java:247)
at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.getJournalState(JournalNodeRpcServer.java:139)
at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.getJournalState(QJournalProtocolServerSideTranslatorPB.java:118)
at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25415)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:503)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:871)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:817)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2606)
at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81)
at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:286)
at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:142)
at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createNewUniqueEpoch(QuorumJournalManager.java:180)
at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.recoverUnfinalizedSegments(QuorumJournalManager.java:438)
at org.apache.hadoop.hdfs.server.namenode.JournalSet$8.apply(JournalSet.java:624)
at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393)
at org.apache.hadoop.hdfs.server.namenode.JournalSet.recoverUnfinalizedSegments(JournalSet.java:621)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.recoverUnclosedStreams(FSEditLog.java:1521)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:1180)
at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1919)
at org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
at org.apache.hadoop.hdfs.server.namenode.ha.HAState.setStateInternal(HAState.java:64)
at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.setState(StandbyState.java:49)
at org.apache.hadoop.hdfs.server.namenode.NameNode.transitionToActive(NameNode.java:1777)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.transitionToActive(NameNodeRpcServer.java:1649)
at org.apache.hadoop.ha.protocolPB.HAServiceProtocolServerSideTranslatorPB.transitionToActive(HAServiceProtocolServerSideTranslatorPB.java:107)
at org.apache.hadoop.ha.proto.HAServiceProtocolProtos$HAServiceProtocolService$2.callBlockingMethod(HAServiceProtocolProtos.java:4460)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:503)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:871)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:817)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2606)
2019-12-06 10:57:57,541 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1: Error: recoverUnfinalizedSegments failed for required journal (JournalAndStream(mgr=QJM to [192.168.12.10:8485, 192.168.12.11:8485, 192.168.12.12:8485], stream=null))
2019-12-06 10:57:57,546 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at nn1/192.168.12.10
************************************************************/
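Note that the log above shows the journal storage directory sitting under /tmp (/tmp/hadoop/dfs/journalnode/ha-cluster), which is typically wiped on reboot. If that is the cause, the relevant hdfs-site.xml setting on each journal node would look roughly like this (the path here is only illustrative):
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/home/hadoop/data/journalnode</value>
</property>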

Apache Spark Ec2: could only be replicated to 0 nodes, instead of 1

I have a 2-node cluster running on EC2 d2.xlarge instances, and I have a 10 GB file to be processed through Spark. I have mounted a local disk on the Spark machine and generated the 10 GB dataset there, but when I try to put it into HDFS it throws the error "could only be replicated to 0 nodes, instead of 1" as follows:
16/03/09 21:44:25 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /vinit/inputfile.txt could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1558)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:696)
at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
Since your HDFS datanodes are not working (which you have verified with the jps command), you can configure the datanodes (slaves) by adding hostname or IP entries for the datanodes to the <HADOOP_HOME>/etc/hadoop/slaves file, e.g.:
192.168.64.11
192.168.64.12
and similarly for the namenode (master) you can add entries to the <HADOOP_HOME>/etc/hadoop/masters file, e.g. (assuming 192.168.64.11 is the master):
192.168.64.11
Then run stop-all.sh and start-all.sh on the master node to restart the cluster. This will automatically start the datanodes. Check with the jps command.
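A minimal sketch of that sequence, assuming HADOOP_HOME points at the Hadoop install on the master (in older releases the stop/start scripts live under bin/ rather than sbin/):
# Register the worker and master IPs, then restart every daemon from the master
printf "192.168.64.11\n192.168.64.12\n" > $HADOOP_HOME/etc/hadoop/slaves
echo "192.168.64.11" > $HADOOP_HOME/etc/hadoop/masters
$HADOOP_HOME/sbin/stop-all.sh
$HADOOP_HOME/sbin/start-all.sh
jps   # run jps on each node; the workers should now show a DataNode process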

datanode runs but not detected

I configured hadoop on 2 machines and everything works fine. On the first one, if I run the command jps I have
16406 NameNode
16774 ResourceManager
16619 SecondaryNameNode
17037 Jps
And in the second machine I have
2641 Jps
2445 NodeManager
2141 DataNode
but if I go to master:50070 in the browser I see Live DataNodes 0 (why?).
In fact, if I try to put a file (hdfs dfs -put /input/file.txt),
I get this error:
15/03/08 01:43:25 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/03/08 01:43:26 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /input/file.txt._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1549)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3200)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:641)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:482)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
at org.apache.hadoop.ipc.Client.call(Client.java:1468)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:399)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1532)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1349)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588)
put: File /input/reduced.txt._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
Why does it say that there are no datanodes?
On the other hand, if I run the command hdfs dfs -mkdir /input it works. Why does it behave differently?
Your datanode is not able to connect to the namenode; that is the reason your datanode is not reflected in the namenode web UI. Please update the core-site.xml file on both nodes and check the datanode log.
Your core-site.xml should have the namenode address:
<property>
<name>fs.default.name</name>
<value>hdfs://$namenode.full.hostname:8020</value>
<description>Enter your NameNode hostname</description>
</property>
It seems the problem is in your hostname only. Use the FULL COMPUTER NAME (FCN) instead of the short hostname. If your system is joined to a domain then you have to specify the FCN, which looks like hostname.domainname.
Then use the FCN in all Hadoop configuration files. The FCN should be reachable (pingable) from all other machines. If you cannot ping it, edit the /etc/hosts file and map the FCN to the specific IP address.
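For example, a hypothetical /etc/hosts entry mapping an FCN to an address (the address and domain here are placeholders):
192.168.1.10 master.example.com master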
Some of the hadoop configuration to be set with FCN are listed below.
Core-site.xml
<property>
<name>fs.defaultFS</name>
<value>hdfs://FCN:8020</value>
</property>
Hdfs-site.xml
<property>
<name>dfs.namenode.http-address</name>
<value>FCN:50070</value>
</property>
Let me know if you have any further queries.
I had the same problem. Fixed it by ensuring all the slaves have their /etc/hosts file updated with the current IP.

Mapreduce returns error when accessing to datanode on master machine

I set up a Hadoop 2.4.0 cluster with three machines. One master machine is deployed with the namenode, resource manager, datanode and node manager. The other two worker machines are deployed with a datanode and node manager. When I run a Hive query, the job fails and the error is
2014-06-11 13:40:13,364 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.net.ConnectException: Call From master/127.0.0.1 to master:43607 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
at org.apache.hadoop.ipc.Client.call(Client.java:1414)
at org.apache.hadoop.ipc.Client.call(Client.java:1363)
at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:231)
at com.sun.proxy.$Proxy9.getTask(Unknown Source)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:136)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:604)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:699)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:367)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1462)
at org.apache.hadoop.ipc.Client.call(Client.java:1381)
... 4 more
If I disable the datanode on the master machine, everything works well. I'm wondering if it's allowed to deploy a datanode on the master machine. Thank you for your kind help in advance.
BTW, my /etc/hosts on the three machines are the same:
127.0.0.1 localhost
10.1.154.231 master
10.1.153.220 slave1
10.1.153.133 slave2
Please set up passwordless ssh on your master to itself.
You can achieve this by
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys2
Make sure the permissions are correct
chmod 0600 ~/.ssh/authorized_keys2
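Putting it together, a minimal sketch of the whole key setup on the master (skip ssh-keygen if you already have a key pair):
# Generate a passphrase-less key, authorize it locally, and test the loopback login
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys2
chmod 0600 ~/.ssh/authorized_keys2
ssh master hostname   # should print the hostname without asking for a password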
In this case you may check whether the namenode started correctly on the master by looking at the logs at yourhadoopfolder/logs/hadoop-[hadoop-user]-namenode-master.log
This is often caused by HDFS not having been formatted before. Run
hadoop namenode -format
Of course you will then need to put your data onto the cluster again.

Error in copying files to HDFS

I tried installing hadoop on two nodes. Both nodes are up and running. The namenode runs on Ubuntu 10.10 and the datanode on Fedora 13. While copying a file from the local file system to HDFS I encountered the following errors.
The terminal showed:
12/04/12 02:19:15 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 10.211.87.162:9200
12/04/12 02:19:15 INFO hdfs.DFSClient: Abandoning block blk_-1069539184735421145_1014
The log file in namenode showed:
2012-10-16 16:17:56,723 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.6.2.26:50010, storageID=DS-880164535-10.18.13.10-50010-1349721715148, infoPort=50075, ipcPort=50020):DataXceiver
java.net.NoRouteToHostException: No route to host
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:404)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:282)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103)
at java.lang.Thread.run(Thread.java:662)
Datanodes available are indicated as 2. I've disabled the firewall and selinux.
The following changes have also been made in the hdfs-site.xml
dfs.socket.timeout -> 360000
dfs.datanode.socket.write.timeout -> 3600000
dfs.datanode.max.xcievers -> 1048576
Both nodes run sun-java6-jdk. The datanode machine also has OpenJDK installed, but the path settings point to Sun Java.
Yet the same error persists.
What might be the solution?
That's because your firewall is on.
try
sudo /etc/init.d/iptables stop
If you are on Ubuntu, do
sudo ufw disable
this should solve the issue.
The exception log mentions that the failure reason is No route to host.
Try ping 10.6.2.26 to test your network connection.
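A quick sketch of checking both basic reachability and the specific port from the client-side error (the addresses and ports below come from the logs above):
ping -c 3 10.6.2.26
# firstBadLink in the client error pointed at 10.211.87.162:9200; check that this port is open
nc -vz 10.211.87.162 9200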
