HDFS file upload failed in the middle - hadoop

I am building a prototype using CDH 5.1 on a 3-node cluster.
While I was loading data from the database into HDFS, the ResourceManager server aborted unexpectedly.
I now see a 3 GB file sitting in HDFS.
Issue:
The query returns 42,600,000 rows in total; the server stopped after 7,641,242 rows had been transferred. I am using Talend for the ETL. I know that this time I cannot do much other than start the process all over again.
Is there a way to mitigate this issue in the future? (A cleanup sketch follows the report below.)
Update after running the dfsadmin command:
sudo -u hdfs hadoop dfsadmin -report
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
Configured Capacity: 338351861760 (315.11 GB)
Present Capacity: 314764926976 (293.15 GB)
DFS Remaining: 303901577216 (283.03 GB)
DFS Used: 10863349760 (10.12 GB)
DFS Used%: 3.45%
Under replicated blocks: 1
Blocks with corrupt replicas: 1
Missing blocks: 0
-------------------------------------------------
Live datanodes (3):
Name: 10.215.204.196:50010 (txwlcloud3)
Hostname: txwlcloud3
Rack: /default
Decommission Status : Normal
Configured Capacity: 112783953920 (105.04 GB)
DFS Used: 3623538688 (3.37 GB)
Non DFS Used: 2971234304 (2.77 GB)
DFS Remaining: 106189180928 (98.90 GB)
DFS Used%: 3.21%
DFS Remaining%: 94.15%
Configured Cache Capacity: 522190848 (498 MB)
Cache Used: 0 (0 B)
Cache Remaining: 522190848 (498 MB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Last contact: Tue Sep 30 10:55:11 CDT 2014
Name: 10.215.204.203:50010 (txwlcloud2)
Hostname: txwlcloud2
Rack: /default
Decommission Status : Normal
Configured Capacity: 112783953920 (105.04 GB)
DFS Used: 3645382656 (3.40 GB)
Non DFS Used: 2970497024 (2.77 GB)
DFS Remaining: 106168074240 (98.88 GB)
DFS Used%: 3.23%
DFS Remaining%: 94.13%
Configured Cache Capacity: 815792128 (778 MB)
Cache Used: 0 (0 B)
Cache Remaining: 815792128 (778 MB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Last contact: Tue Sep 30 10:55:10 CDT 2014
Name: 10.215.204.213:50010 (txwlcloud1)
Hostname: txwlcloud1
Rack: /default
Decommission Status : Normal
Configured Capacity: 112783953920 (105.04 GB)
DFS Used: 3594428416 (3.35 GB)
Non DFS Used: 17645203456 (16.43 GB)
DFS Remaining: 91544322048 (85.26 GB)
DFS Used%: 3.19%
DFS Remaining%: 81.17%
Configured Cache Capacity: 3145728 (3 MB)
Cache Used: 0 (0 B)
Cache Remaining: 3145728 (3 MB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Last contact: Tue Sep 30 10:55:10 CDT 2014
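Since the report above shows one block with a corrupt replica, one way to confirm whether the aborted transfer left a damaged file behind, and to remove it before re-running the load, is the sketch below. The path /user/talend/staging/export.csv is purely illustrative, not taken from the question; substitute the actual target of the Talend job.
# List files with corrupt or missing blocks across the filesystem
sudo -u hdfs hdfs fsck / -list-corruptfileblocks
# Inspect the partial file itself: block count, replication, and which DataNodes hold each block
# (illustrative path; replace with the real target directory)
sudo -u hdfs hdfs fsck /user/talend/staging/export.csv -files -blocks -locations
# If the file is confirmed to be the broken partial upload, delete it and re-run the export
sudo -u hdfs hdfs dfs -rm /user/talend/staging/export.csv
To limit the damage from a mid-transfer failure in the future, one generic approach (independent of any Talend-specific retry settings) is to load the table in smaller chunks, for example one partition or date range per run, and write each chunk to a temporary HDFS directory that is renamed into place only after it completes. A crash then costs only the current chunk rather than all 42.6 million rows.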

Related

How can I add automatic failover to my existing cluster?

I need help with automatic failover in Hadoop. The requirement is to transfer control from one node to another when a failure occurs. I already have a running cluster and want to add automatic failover to it.
hdfs dfsadmin -report
Configured Capacity: 4393174024192 (4.00 TB)
Present Capacity: 4101312667648 (3.73 TB)
DFS Remaining: 4100850401280 (3.73 TB)
DFS Used: 462266368 (440.85 MB)
DFS Used%: 0.01%
Replicated Blocks:
Under replicated blocks: 20
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
Low redundancy blocks with highest priority to recover: 0
Pending deletion blocks: 0
Erasure Coded Block Groups:
Low redundancy block groups: 0
Block groups with corrupt internal blocks: 0
Missing block groups: 0
Low redundancy blocks with highest priority to recover: 0
Pending deletion blocks: 0
Live datanodes (4):
Name: 192.168.5.250:9866 (odin)
Hostname: odin
Decommission Status : Normal
Configured Capacity: 1098293506048 (1022.87 GB)
DFS Used: 222879744 (212.55 MB)
Non DFS Used: 16838836224 (15.68 GB)
DFS Remaining: 1025325965312 (954.91 GB)
DFS Used%: 0.02%
DFS Remaining%: 93.36%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Mon May 20 10:44:50 IST 2019
Last Block Report: Mon May 20 09:12:14 IST 2019
Num of Blocks: 129
Name: 192.168.5.251:9866 (loki)
Hostname: loki
Decommission Status : Normal
Configured Capacity: 1098293506048 (1022.87 GB)
DFS Used: 145424384 (138.69 MB)
Non DFS Used: 15433760768 (14.37 GB)
DFS Remaining: 1026808496128 (956.29 GB)
DFS Used%: 0.01%
DFS Remaining%: 93.49%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Mon May 20 10:44:50 IST 2019
Last Block Report: Mon May 20 09:12:14 IST 2019
Num of Blocks: 106
Name: 192.168.5.252:9866 (thor)
Hostname: thor
Decommission Status : Normal
Configured Capacity: 1098293506048 (1022.87 GB)
DFS Used: 8003584 (7.63 MB)
Non DFS Used: 16954404864 (15.79 GB)
DFS Remaining: 1025425272832 (955.00 GB)
DFS Used%: 0.00%
DFS Remaining%: 93.37%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Mon May 20 10:44:50 IST 2019
Last Block Report: Mon May 20 10:08:38 IST 2019
Num of Blocks: 102
Name: 192.168.5.253:9866 (hela)
Hostname: hela
Decommission Status : Normal
Configured Capacity: 1098293506048 (1022.87 GB)
DFS Used: 85958656 (81.98 MB)
Non DFS Used: 19011055616 (17.71 GB)
DFS Remaining: 1023290667008 (953.01 GB)
DFS Used%: 0.01%
DFS Remaining%: 93.17%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Mon May 20 10:44:50 IST 2019
Last Block Report: Mon May 20 09:12:14 IST 2019
Num of Blocks: 109
I found these two links: https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithNFS.html and https://www.edureka.co/blog/how-to-set-up-hadoop-cluster-with-hdfs-high-availability/
Both help with setting up this environment.
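For reference, once the HA properties from those guides are in place (dfs.nameservices, dfs.ha.namenodes.<nameservice>, dfs.ha.automatic-failover.enabled=true and ha.zookeeper.quorum, plus the shared-edits configuration), the remaining steps look roughly like the sketch below. The NameNode IDs nn1 and nn2 are placeholders following the documentation's convention, not names from this cluster.
# Initialise the failover controller's znode in ZooKeeper (run once, on one NameNode host)
hdfs zkfc -formatZK
# Copy the active NameNode's metadata onto the new standby NameNode host
hdfs namenode -bootstrapStandby
# After starting HDFS (which now also starts the ZKFC daemons), verify the roles
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2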

Unable to write to HDFS: WARN hdfs.DataStreamer - Unexpected EOF

I'm following a tutorial, and while running in a single-node cluster test environment I suddenly cannot run any MR jobs or write data to HDFS. It worked fine before, and now I keep getting the error below (rebooting didn't help).
I can read and delete files from HDFS, but not write.
$ hdfs dfs -put war-and-peace.txt /user/hands-on/
19/03/25 18:28:29 WARN hdfs.DataStreamer: Exception for BP-1098838250-127.0.0.1-1516469292616:blk_1073742374_1550
java.io.EOFException: Unexpected EOF while trying to read response from server
at org.apache.hadoop.hdfs.protocolPB.PBHelperClient.vintPrefixed(PBHelperClient.java:399)
at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:213)
at org.apache.hadoop.hdfs.DataStreamer$ResponseProcessor.run(DataStreamer.java:1020)
put: All datanodes [DatanodeInfoWithStorage[127.0.0.1:50010,DS-b90326de-a499-4a43-a66a-cc3da83ea966,DISK]] are bad. Aborting...
"hdfs dfsadmin -report" shows me everything is fine, enough disk space. I barely ran any jobs, just some test MRs and little test data.
$ hdfs dfsadmin -report
Configured Capacity: 52710469632 (49.09 GB)
Present Capacity: 43335585007 (40.36 GB)
DFS Remaining: 43334025216 (40.36 GB)
DFS Used: 1559791 (1.49 MB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
Pending deletion blocks: 0
-------------------------------------------------
Live datanodes (1):
Name: 127.0.0.1:50010 (localhost)
Hostname: localhost
Decommission Status : Normal
Configured Capacity: 52710469632 (49.09 GB)
DFS Used: 1559791 (1.49 MB)
Non DFS Used: 6690530065 (6.23 GB)
DFS Remaining: 43334025216 (40.36 GB)
DFS Used%: 0.00%
DFS Remaining%: 82.21%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 2
Last contact: Mon Mar 25 18:30:45 EDT 2019
The NameNode web UI (port 50070) also shows that everything is fine, and the logs do not report any errors either. What could it be, and how can I properly troubleshoot it?
CentOS Linux 6.9 minimal
Apache Hadoop 2.8.1
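Since the EOF is coming back from the DataNode side of the write pipeline while reads still work, a reasonable first step is to look at the DataNode's own log around the timestamp of the failed put, rather than only at the NameNode UI. A hedged sketch (the log path depends on how Hadoop was installed; the one below assumes a tarball install under $HADOOP_HOME):
# Confirm the DataNode JVM is actually running
jps
# Check the DataNode log around the time of the failed write for the matching exception
tail -n 200 $HADOOP_HOME/logs/hadoop-*-datanode-*.log
# Rule out a full disk or exhausted inodes on the DataNode's data directory
df -h
df -i
If the DataNode log shows a corresponding error, its message (for example a disk, permissions, or xceiver problem) usually points at the real cause, and restarting only the DataNode afterwards is often enough to test a fix.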

Hadoop: new DataNode fails to join when building the cluster

I'm building a two-node Hadoop cluster, step by step from the official documentation.
However, the newly added DataNode does not join the cluster in the web UI at http://{host address}:50070/dfshealth.html#tab-datanode
With this command:
[az-user@AZ-TEST1-SPARK-SLAVE ~]$ yarn node --list
17/11/27 09:16:04 INFO client.RMProxy: Connecting to ResourceManager at /10.0.4.12:8032
Total Nodes:2
Node-Id Node-State Node-Http-Address Number-of-Running-Containers
AZ-TEST1-SPARK-MASTER:37164 RUNNING AZ-TEST1-SPARK-MASTER:8042 0
AZ-TEST1-SPARK-SLAVE:42608 RUNNING AZ-TEST1-SPARK-SLAVE:8042 0
It shows there are two nodes, but another command shows only one live DataNode:
[az-user@AZ-TEST1-SPARK-SLAVE ~]$ hdfs dfsadmin -report
Configured Capacity: 1081063493632 (1006.82 GB)
Present Capacity: 1026027008000 (955.56 GB)
DFS Remaining: 1026026967040 (955.56 GB)
DFS Used: 40960 (40 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
Pending deletion blocks: 0
-------------------------------------------------
Live datanodes (1):
Name: 10.0.4.12:50010 (10.0.4.12)
Hostname: AZ-TEST1-SPARK-MASTER
Decommission Status : Normal
Configured Capacity: 1081063493632 (1006.82 GB)
DFS Used: 40960 (40 KB)
Non DFS Used: 97816576 (93.29 MB)
DFS Remaining: 1026026967040 (955.56 GB)
DFS Used%: 0.00%
DFS Remaining%: 94.91%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Mon Nov 27 09:22:36 UTC 2017
The command shows the same result on the master node.
Thanks for any advice.
Other notes:
The problem is similar to number-of-nodes-in-hadoop-cluster, but that solution did not work in my case.
I'm using bare IP addresses and have not configured the hosts file as usual.
Fixed:
Use host names on every node and in their configuration files.
In cluster mode, you must use host names rather than bare IP addresses.
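To make the fix concrete, here is a minimal sketch of what "use host names everywhere" means in this setup (the slave's IP address is a placeholder, since it is not shown in the question):
# /etc/hosts on every node: map the addresses to the names used in the Hadoop configs
10.0.4.12      AZ-TEST1-SPARK-MASTER
<slave-ip>     AZ-TEST1-SPARK-SLAVE
# etc/hadoop/slaves (called workers in Hadoop 3): list DataNode hosts by name, not by IP
AZ-TEST1-SPARK-MASTER
AZ-TEST1-SPARK-SLAVE
Likewise, fs.defaultFS in core-site.xml and any dfs.namenode.* addresses in hdfs-site.xml should refer to AZ-TEST1-SPARK-MASTER rather than 10.0.4.12, so that each DataNode registers under its own host name.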

How to read `hadoop dfsadmin -report` output

Command:
[hdfs@sandbox oozie]$ hadoop dfsadmin -report | head -n 100
Output:
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
Configured Capacity: 44716605440 (41.65 GB)
Present Capacity: 31614091245 (29.44 GB)
DFS Remaining: 30519073792 (28.42 GB)
DFS Used: 1095017453 (1.02 GB)
DFS Used%: 3.46%
Under replicated blocks: 657
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
-------------------------------------------------
Live datanodes (1):
Name: 10.0.2.15:50010 (sandbox.hortonworks.com)
Hostname: sandbox.hortonworks.com
Decommission Status : Normal
Configured Capacity: 44716605440 (41.65 GB)
DFS Used: 1095017453 (1.02 GB)
Non DFS Used: 13102514195 (12.20 GB)
DFS Remaining: 30519073792 (28.42 GB)
DFS Used%: 2.45%
DFS Remaining%: 68.25%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 4
Last contact: Thu Aug 11 23:12:04 UTC 2016
What exactly are Cache Used% and Non DFS Used?
The hdfs dfsadmin -report command:
"Reports basic filesystem information and statistics. Optional flags may be used to filter the list of displayed DataNodes."
(from the official Hadoop documentation)
Cache Used%:
This is relative to Configured Cache Capacity: it is the percentage of the configured cache space that is in use. Since you have not configured any space for caching, it is shown as 100% (0 B out of 0 B).
Non DFS Used:
This is the space on the DataNode's disks occupied by data that is not part of HDFS, and it is calculated with the following formula:
Non DFS Used = Configured Capacity - DFS Used - DFS Remaining
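Plugging the numbers from the report above into this formula checks out: 44716605440 - 1095017453 - 30519073792 = 13102514195, which is exactly the Non DFS Used value (12.20 GB) shown for the sandbox DataNode. A quick shell check:
echo $((44716605440 - 1095017453 - 30519073792))   # prints 13102514195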

HDFS: file is not distributed after upload

I've deployed Hadoop (0.20.203.0rc1) on an 8-node cluster. After uploading a file to HDFS, I found the file on only one of the nodes instead of it being distributed uniformly across all nodes. What could the issue be?
$HADOOP_HOME/bin/hadoop dfs -copyFromLocal ../data/rmat-20.0 /user/frolo/input/rmat-20.0
$HADOOP_HOME/bin/hadoop dfs -stat "%b %o %r %n" /user/frolo/input/rmat-*
1220222968 67108864 1 rmat-20.0
$HADOOP_HOME/bin/hadoop dfsadmin -report
Configured Capacity: 2536563998720 (2.31 TB)
Present Capacity: 1642543419392 (1.49 TB)
DFS Remaining: 1641312030720 (1.49 TB)
DFS Used: 1231388672 (1.15 GB)
DFS Used%: 0.07%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Datanodes available: 8 (8 total, 0 dead)
Name: 10.10.1.15:50010
Decommission Status : Normal
Configured Capacity: 317070499840 (295.29 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 131536928768 (122.5 GB)
DFS Remaining: 185533546496(172.79 GB)
DFS Used%: 0%
DFS Remaining%: 58.51%
Last contact: Fri Feb 07 12:10:27 MSK 2014
Name: 10.10.1.13:50010
Decommission Status : Normal
Configured Capacity: 317070499840 (295.29 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 131533377536 (122.5 GB)
DFS Remaining: 185537097728(172.79 GB)
DFS Used%: 0%
DFS Remaining%: 58.52%
Last contact: Fri Feb 07 12:10:27 MSK 2014
Name: 10.10.1.17:50010
Decommission Status : Normal
Configured Capacity: 317070499840 (295.29 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 120023924736 (111.78 GB)
DFS Remaining: 197046550528(183.51 GB)
DFS Used%: 0%
DFS Remaining%: 62.15%
Last contact: Fri Feb 07 12:10:27 MSK 2014
Name: 10.10.1.18:50010
Decommission Status : Normal
Configured Capacity: 317070499840 (295.29 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 78510628864 (73.12 GB)
DFS Remaining: 238559846400(222.18 GB)
DFS Used%: 0%
DFS Remaining%: 75.24%
Last contact: Fri Feb 07 12:10:24 MSK 2014
Name: 10.10.1.14:50010
Decommission Status : Normal
Configured Capacity: 317070499840 (295.29 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 131537530880 (122.5 GB)
DFS Remaining: 185532944384(172.79 GB)
DFS Used%: 0%
DFS Remaining%: 58.51%
Last contact: Fri Feb 07 12:10:27 MSK 2014
Name: 10.10.1.11:50010
Decommission Status : Normal
Configured Capacity: 317070499840 (295.29 GB)
DFS Used: 1231216640 (1.15 GB)
Non DFS Used: 84698116096 (78.88 GB)
DFS Remaining: 231141167104(215.27 GB)
DFS Used%: 0.39%
DFS Remaining%: 72.9%
Last contact: Fri Feb 07 12:10:24 MSK 2014
Name: 10.10.1.16:50010
Decommission Status : Normal
Configured Capacity: 317070499840 (295.29 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 131537494016 (122.5 GB)
DFS Remaining: 185532981248(172.79 GB)
DFS Used%: 0%
DFS Remaining%: 58.51%
Last contact: Fri Feb 07 12:10:27 MSK 2014
Name: 10.10.1.12:50010
Decommission Status : Normal
Configured Capacity: 317070499840 (295.29 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 84642578432 (78.83 GB)
DFS Remaining: 232427896832(216.47 GB)
DFS Used%: 0%
DFS Remaining%: 73.3%
Last contact: Fri Feb 07 12:10:27 MSK 2014
Your file has been written with a replication factor of 1, as evidenced by the output of your hadoop dfs -stat command. This means only one replica will exist for each block of the file.
The default replication factor for writes is governed by the dfs.replication property in $HADOOP_HOME/conf/hdfs-site.xml. If it is not specified there, the default is 3, but it is likely that you have an override specified with a value of 1. Changing its value back to 3, or removing it altogether (to fall back to the default), will make all new file writes use 3 replicas by default.
You may also pass a specific replication factor with each write command using the -D property passing method supported by the hadoop fs utility, such as:
hadoop fs -Ddfs.replication=3 -copyFromLocal ../data/rmat-20.0 /user/frolo/input/rmat-20.0
And you may alter an existing file's replication factor by using the hadoop fs -setrep utility, such as:
hadoop fs -setrep 3 -w /user/frolo/input/rmat-20.0
Files with an HDFS replication factor greater than 1 will automatically be distributed across multiple nodes, because HDFS never writes more than one replica of a block to the same DataNode.
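After changing the replication factor (or re-writing the file), one way to verify that the blocks really are spread over several DataNodes is fsck. A sketch using the same file path as above; on a 0.20-era release the command is hadoop fsck rather than hdfs fsck:
$HADOOP_HOME/bin/hadoop fsck /user/frolo/input/rmat-20.0 -files -blocks -locations
Each block line should now list three DataNode addresses, and hadoop dfs -stat "%r" on the file should report 3.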
