datanode runs but not detected - hadoop

I configured hadoop on 2 machine. everything works fine. In the first one if I run the command jps I have
16406 NameNode
16774 ResourceManager
16619 SecondaryNameNode
17037 Jps
And in the second machine I have
2641 Jps
2445 NodeManager
2141 DataNode
but if I go in the browser master:50070 I see Live DataNodes 0(why?)
in fact if I try to put a file (hdfs dfs -put /input/file.txt)
I get this error
15/03/08 01:43:25 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/03/08 01:43:26 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /input/file.txt._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1549)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3200)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:641)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:482)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
at org.apache.hadoop.ipc.Client.call(Client.java:1468)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:399)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1532)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1349)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588)
put: File /input/reduced.txt._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
Why it say me that there are no datanodes?
Instead if I run the command hdfs dfs -mkdir /input it works, why it behaves differently?

Your datanode is not able to connect to Namenode that is the reason for your datanode not being reflected namenode web UI. Please update the core-site.xml file present in both nodes along with datanode log.
Your core-site should have namenode address.
<property>
<name>fs.default.name</name>
<value>hdfs://$namenode.full.hostname:8020</value>
<description>Enter your NameNode hostname</description>
</property>

It seems the problem is in your hostname only. Use FULL COMPUTER NAME instead of hostname. If your system connected with a domain then you have to specify FCN which likes hostname.domainname
Then use FCN in all hadoop configuration files. FCN should be access/ping from all other machines. If you cannot ping it. Then edit /etc/hosts file and set the FCN for the specific ip address.
Some of the hadoop configuration to be set with FCN are listed below.
Core-site.xml
<property>
<name>fs.defaultFS</name>
<value>hdfs://FCN:8020</value>
</property>
Hdfs-site.xml
<property>
<name>dfs.namenode.http-address</name>
<value>FCN:50070</value>
</property>
Let me know if you have any further queries.

I had the same problem. Fixed it by ensuring all the slaves has "etc\hosts" file updated with the current ip.

Related

hdfs put fails from laptop to remote hadoop cluster

I have my hadoop cluster set up on a different network. Because of this, hdfs put is failing when I run it from my laptop.
Is there a port I should forward or something to access the datanodes remotely? I see it's using the local ip address in the error message.
Here is the command: hdfs dfs -put ~/Documents/reddit-streaming/redditStreaming/target/redditStreaming-1.0-SNAPSHOT.jar hdfs://mydns.asuscomm.com:8021/user/me/jars/
and here is the error message:
2021-10-14 18:04:55,704 WARN hdfs.DataStreamer: Exception in createBlockOutputStream blk_1073742036_1212
java.net.UnknownHostException
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:591)
at org.apache.hadoop.hdfs.DataStreamer.createSocketForPipeline(DataStreamer.java:253)
at org.apache.hadoop.hdfs.DataStreamer.createBlockOutputStream(DataStreamer.java:1757)
at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1711)
at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:707)
2021-10-14 18:04:55,708 WARN hdfs.DataStreamer: Abandoning BP-668799564-192.168.50.7-1633461871664:blk_1073742036_1212
2021-10-14 18:04:55,752 WARN hdfs.DataStreamer: Excluding datanode DatanodeInfoWithStorage[192.168.50.31:9866,DS-60974173-31d6-4dcb-a2ba-05ab6431db66,DISK]
2021-10-14 18:05:00,801 WARN hdfs.DataStreamer: Exception in createBlockOutputStream blk_1073742037_1213
java.net.UnknownHostException
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:591)
at org.apache.hadoop.hdfs.DataStreamer.createSocketForPipeline(DataStreamer.java:253)
at org.apache.hadoop.hdfs.DataStreamer.createBlockOutputStream(DataStreamer.java:1757)
at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1711)
at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:707)
2021-10-14 18:05:00,801 WARN hdfs.DataStreamer: Abandoning BP-668799564-192.168.50.7-1633461871664:blk_1073742037_1213
2021-10-14 18:05:00,833 WARN hdfs.DataStreamer: Excluding datanode DatanodeInfoWithStorage[192.168.50.19:9866,DS-aeaca5a1-562c-4f35-b2fb-6f0b51c5f695,DISK]
2021-10-14 18:05:00,869 WARN hdfs.DataStreamer: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/me/jars/redditStreaming-1.0-SNAPSHOT.jar._COPYING_ could only be written to 0 of the 1 minReplication nodes. There are 2 datanode(s) running and 2 node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2329)
at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2942)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:915)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:593)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:600)
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:568)
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:552)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1093)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1035)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:963)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2966)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1573)
at org.apache.hadoop.ipc.Client.call(Client.java:1519)
at org.apache.hadoop.ipc.Client.call(Client.java:1416)
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:242)
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:129)
at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:530)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream.addBlock(DFSOutputStream.java:1084)
at org.apache.hadoop.hdfs.DataStreamer.locateFollowingBlock(DataStreamer.java:1898)
at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1700)
at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:707)
I have this property in my hdfs-site.xml file on my laptop:
<property>
<name>dfs.client.use.datanode.hostname</name>
<value>true</value>
</property>
I can also see in the UI that both datanodes are running.
I assume you've forwarded the namenode port (8021) since it can see that 2 datanodes exist?
Yes, the datanodes have their own ports that need to be available to the client for data to actually be written
Check the value for dfs.datanode.address and make sure you can establish a connection to the port listed there for each datanode.
If you look at the error, you can see this is 9866
Excluding datanode DatanodeInfoWithStorage[192.168.50.31:9866
And also, IIUC, the use.datanode.hostname config needs to actually be in the cluster, not your local laptop config, for the protocol to return the hostnames rather than the IPs
There is also an HTTP Port you can open if you want to see each Datanode's web-portal (should be available to be accessed from the Namenode UI as well)
The alternative, more secure / less exposed, option is to establish an edge-node between the networks that you can only SSH to & SFTP files into (assuming you don't otherwise have a shared fileserver), then run your hdfs commands from there. You can setup a SOCKS proxy if you needed to access a Web UI in that network
To re-iterate, you should not expose a Hadoop cluster without Kerberos & TLS over dynamic DNS through any internet-facing router

All services working in a Multinode cluster with JPS, but the "hdfs dfsadmin -report" shows nothing

[root#master ~]# jps
10197 SecondaryNameNode
10805 Jps
10358 ResourceManager
9998 NameNode
[root#slave1 ~]# jps
5872 NodeManager
5767 DataNode
6186 Jps
[root#slave2 ~]# jps
5859 Jps
5421 DataNode
5534 NodeManager
As you can see all the services are running when I run the JPS command on the namenode and the respective slave nodes "slave1" and "slave2" .
However, when I check the hdfs dfsadmin -report command, here is what I get:
[root#master ~]# hdfs dfsadmin -report
17/09/01 12:11:29 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Configured Capacity: 0 (0 B)
Present Capacity: 0 (0 B)
DFS Remaining: 0 (0 B)
DFS Used: 0 (0 B)
DFS Used%: NaN%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
-------------------------------------------------
This is the issue. I am aware that there are number of articles over this particular topic to which I have already referred and disabled my firewall, formatted the cluster with the datanode cluster ID as well and resolved the IP issue in the Virtual Box where i was getting duplicate packets while pinging the slaves from the master.
The data-nodes just don't seem to start. Once even if luckily they do, i get the following error while copying files on the HDFS.
[root#master ~]# hdfs dfs -moveFromLocal /home/master/Downloads /citibike.tar /user/citibike
17/09/01 12:17:33 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/09/01 12:17:34 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/citibike._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1628)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3121)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3045)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:725)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:493)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2217)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2213)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2213)
at org.apache.hadoop.ipc.Client.call(Client.java:1476)
at org.apache.hadoop.ipc.Client.call(Client.java:1413)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1588)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1373)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:554)
moveFromLocal: File /user/citibike._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
The fsck command is working fine but to no use because it too is as empty as the dfsadmin -report.

How to copy files into HDFS?

I am trying to start a hadoop single node cluster on my local machine. I have configured the following files according to https://amodernstory.com/2014/09/23/installing-hadoop-on-mac-osx-yosemite/: hadoop-env.sh, core-site.xml, mapred-site.xml and hdfs-site.xml. When I run the script start-dfs.sh and then the command jps (right after running start-dfs.sh) I see that the datanode is up and running:
15735 Jps
15548 DataNode
15660 SecondaryNameNode
15453 NameNode
A few seconds later, I re-run the command jps and I see that the datanode is not running. Why? How to resolve this?
After that I run the script start-yarn.sh and then the command jps. I see this:
15955 NodeManager
16011 Jps
15660 SecondaryNameNode
15453 NameNode
15854 ResourceManager
My ultimate aim is to copy files into the HDFS from my local filesystem. For doing so, I run the command hdfs dfs -copyFromLocal /source-file-path/filename /destination-file-path/. I get the following error:
17/07/10 17:09:00 WARN hdfs.DataStreamer: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /pay/txnlinking/redshift.yml._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1733)
at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:265)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2496)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:828)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:506)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:447)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:845)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:788)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2455)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1481)
at org.apache.hadoop.ipc.Client.call(Client.java:1427)
at org.apache.hadoop.ipc.Client.call(Client.java:1337)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:440)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:398)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:335)
at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DataStreamer.locateFollowingBlock(DataStreamer.java:1733)
at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1536)
at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:658)
copyFromLocal: File /pay/txnlinking/redshift.yml._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
How to avoid the above error and copy files into the HDFS?
P.S: I created the destination path folders in HDFS explicitly before doing the copy.
Do
hadoop namenode -format
then stop all services using
stop-all.sh
then restart all services using
start-all.sh
start-all.sh and stop-all.sh are deprecated use start-dfs.sh and stop-dfs.sh instead
First delete the contents of the hadoop.tmp.dir folder that you specified in core-site.xml. Then do a namenode format using hdfs namenode -format. Your datanode should be up and running properly after which all the copy operations will be successfully executed.

Apache Spark Ec2: could only be replicated to 0 nodes, instead of 1

I have a 2Node cluster running on Ec2 d2.xlarge instance, and i have a 10 Gb of file to be processed through Spark, I have mounted a local Disk on spark and generated the Dataset o 10gb over there, but when I am trying to put that into Hdfs its throwing me the error of "could only be replicated to 0 nodes, instead of 1" as follows
16/03/09 21:44:25 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /vinit/inputfile.txt could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1558)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:696)
at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
Since your HDFS datanodes are not working and you have checked with jps command. You can configure the datanodes(slaves) by adding hostname or IP entries of datanodes to <HADOOP_HOME>/etc/hadoop/slaves file e.g.:
192.168.64.11
192.168.64.12
and similary for namenode (master) you can add entires to <HADOOP_HOME>/etc/hadoop/masters file e.g.: (assuming 192.168.64.11 is the master)
192.168.64.11
then run stop-all.sh and star-all.sh on master-node to restart the cluster. This will automatically start the datanodes. Check with jps command.

Hadoop DataStreamer Exception: File could only be replicated to 0 nodes instead of minReplication (=1)

I tried to load json data from my local into hadoop hdfs, I use these commands, and it throwing exception:
hadoop fs -copyFromLocal path/files/file.json input/
hadoop fs -put path/files/file.json input/
I've check using jps command, found out that hadoop is running.
26039 ResourceManager
30858 SecondaryNameNode
35605 Jps
26147 NodeManager
30714 DataNode
This is detail of the exception:
WARN hdfs.DFSClient: DataStreamer Exception org.apache.hadoop.ipc.RemoteException(java.io.IOException): File path/files/file.json._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1471)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2791)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:606)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:455)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
at org.apache.hadoop.ipc.Client.call(Client.java:1411)
at org.apache.hadoop.ipc.Client.call(Client.java:1364)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:368)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1449)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1270)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:526)
copyFromLocal: File path/files/file.json._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
How should I do? I need the data to be loaded into hdfs.
Thanks in advance.
This recently happened to me and it was because the HDFS utilization was close to 100%. You can check your usage by running:
hdfs dfs -df -h
Increasing the HDFS size by increasing the number of cluster nodes solved the problem.
Check jps, you need 6 process after start-all.sh. Here you start fail NameNode process.
please format NameNode:
hadoop namenode -format
In my case the hard disk mentioned in hdfs-site.xml was overfilled with files and Hadoop couldn't just create even minor files for locks.
Config in master node or namenode:
--> /etc/hosts:
masterIP masterNodeName
slave1IP slave1Name
slave2IP slave2Name
Data-node should know master node to connect with.
You should define in each data node in /etc/hosts this:
node_IP masterNodeName
Or you could easily copy master's etc/hosts config to slave nodes using scp from master node.
scp /etc/hosts hadoopUser#slaveiName:/etc/hosts
Master Node must be able to connect with slaves through ssh without passwords.
Finally, you should have this config in core-site.xml:
<property>
<name>fs.defaultFS</name>
<value>hdfs://masterNodeName:9000</value>
</property>

Resources