HDFS: file is not distributed after upload - hadoop

I've deployed Hadoop (0.20.203.0rc1) on an 8-node cluster. After uploading a file onto HDFS, the file ended up on only one of the nodes instead of being uniformly distributed across all of them. What could be the issue?
$HADOOP_HOME/bin/hadoop dfs -copyFromLocal ../data/rmat-20.0 /user/frolo/input/rmat-20.0
$HADOOP_HOME/bin/hadoop dfs -stat "%b %o %r %n" /user/frolo/input/rmat-*
1220222968 67108864 1 rmat-20.0
$HADOOP_HOME/bin/hadoop dfsadmin -report
Configured Capacity: 2536563998720 (2.31 TB)
Present Capacity: 1642543419392 (1.49 TB)
DFS Remaining: 1641312030720 (1.49 TB)
DFS Used: 1231388672 (1.15 GB)
DFS Used%: 0.07%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Datanodes available: 8 (8 total, 0 dead)
Name: 10.10.1.15:50010
Decommission Status : Normal
Configured Capacity: 317070499840 (295.29 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 131536928768 (122.5 GB)
DFS Remaining: 185533546496(172.79 GB)
DFS Used%: 0%
DFS Remaining%: 58.51%
Last contact: Fri Feb 07 12:10:27 MSK 2014
Name: 10.10.1.13:50010
Decommission Status : Normal
Configured Capacity: 317070499840 (295.29 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 131533377536 (122.5 GB)
DFS Remaining: 185537097728(172.79 GB)
DFS Used%: 0%
DFS Remaining%: 58.52%
Last contact: Fri Feb 07 12:10:27 MSK 2014
Name: 10.10.1.17:50010
Decommission Status : Normal
Configured Capacity: 317070499840 (295.29 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 120023924736 (111.78 GB)
DFS Remaining: 197046550528(183.51 GB)
DFS Used%: 0%
DFS Remaining%: 62.15%
Last contact: Fri Feb 07 12:10:27 MSK 2014
Name: 10.10.1.18:50010
Decommission Status : Normal
Configured Capacity: 317070499840 (295.29 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 78510628864 (73.12 GB)
DFS Remaining: 238559846400(222.18 GB)
DFS Used%: 0%
DFS Remaining%: 75.24%
Last contact: Fri Feb 07 12:10:24 MSK 2014
Name: 10.10.1.14:50010
Decommission Status : Normal
Configured Capacity: 317070499840 (295.29 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 131537530880 (122.5 GB)
DFS Remaining: 185532944384(172.79 GB)
DFS Used%: 0%
DFS Remaining%: 58.51%
Last contact: Fri Feb 07 12:10:27 MSK 2014
Name: 10.10.1.11:50010
Decommission Status : Normal
Configured Capacity: 317070499840 (295.29 GB)
DFS Used: 1231216640 (1.15 GB)
Non DFS Used: 84698116096 (78.88 GB)
DFS Remaining: 231141167104(215.27 GB)
DFS Used%: 0.39%
DFS Remaining%: 72.9%
Last contact: Fri Feb 07 12:10:24 MSK 2014
Name: 10.10.1.16:50010
Decommission Status : Normal
Configured Capacity: 317070499840 (295.29 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 131537494016 (122.5 GB)
DFS Remaining: 185532981248(172.79 GB)
DFS Used%: 0%
DFS Remaining%: 58.51%
Last contact: Fri Feb 07 12:10:27 MSK 2014
Name: 10.10.1.12:50010
Decommission Status : Normal
Configured Capacity: 317070499840 (295.29 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 84642578432 (78.83 GB)
DFS Remaining: 232427896832(216.47 GB)
DFS Used%: 0%
DFS Remaining%: 73.3%
Last contact: Fri Feb 07 12:10:27 MSK 2014

Your file has been written with a replication factor of 1, as shown by your hadoop fs -stat output. This means only one replica exists for each block of the file.
The default replication factor for writes is governed by the property dfs.replication in $HADOOP_HOME/conf/hdfs-site.xml. If unspecified there, the default is 3, but it's likely that you have an override specified whose value is 1. Changing its value back to 3, or removing it altogether (to fall back to the default), will make all new file writes use 3 replicas.
You may also pass a specific replication factor with each write command using the -D property passing method supported by the hadoop fs utility, such as:
hadoop fs -Ddfs.replication=3 -copyFromLocal ../data/rmat-20.0 /user/frolo/input/rmat-20.0
And you may alter an existing file's replication factor by using the hadoop fs -setrep utility, such as:
hadoop fs -setrep 3 -w /user/frolo/input/rmat-20.0
Files with an HDFS replication factor greater than 1 will automatically show up distributed across multiple nodes; HDFS never writes more than one replica of a block onto the same DataNode.
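As a sanity check, the block count implied by the -stat output above can be computed directly; a quick sketch, with the byte values taken from that output:

```shell
# Values from the `hadoop dfs -stat "%b %o %r %n"` output above
FILE_BYTES=1220222968    # %b: file length
BLOCK_BYTES=67108864     # %o: block size (64 MB)
REPLICATION=3            # target replication after -setrep 3

# Ceiling division gives the number of HDFS blocks in the file
BLOCKS=$(( (FILE_BYTES + BLOCK_BYTES - 1) / BLOCK_BYTES ))
REPLICAS=$(( BLOCKS * REPLICATION ))
echo "$BLOCKS blocks, $REPLICAS replicas"   # 19 blocks, 57 replicas
```

With 57 block replicas to place across 8 DataNodes, every node should end up holding several blocks once the replication factor is raised.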

Related

Add automatic failover to my existing cluster?

I need help with automatic failover in Hadoop. The requirement is to transfer control from one node to another during a failure. I already have a running cluster and want to add this capability to it.
hdfs dfsadmin -report
Configured Capacity: 4393174024192 (4.00 TB)
Present Capacity: 4101312667648 (3.73 TB)
DFS Remaining: 4100850401280 (3.73 TB)
DFS Used: 462266368 (440.85 MB)
DFS Used%: 0.01%
Replicated Blocks:
Under replicated blocks: 20
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
Low redundancy blocks with highest priority to recover: 0
Pending deletion blocks: 0
Erasure Coded Block Groups:
Low redundancy block groups: 0
Block groups with corrupt internal blocks: 0
Missing block groups: 0
Low redundancy blocks with highest priority to recover: 0
Pending deletion blocks: 0
Live datanodes (4):
Name: 192.168.5.250:9866 (odin)
Hostname: odin
Decommission Status : Normal
Configured Capacity: 1098293506048 (1022.87 GB)
DFS Used: 222879744 (212.55 MB)
Non DFS Used: 16838836224 (15.68 GB)
DFS Remaining: 1025325965312 (954.91 GB)
DFS Used%: 0.02%
DFS Remaining%: 93.36%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Mon May 20 10:44:50 IST 2019
Last Block Report: Mon May 20 09:12:14 IST 2019
Num of Blocks: 129
Name: 192.168.5.251:9866 (loki)
Hostname: loki
Decommission Status : Normal
Configured Capacity: 1098293506048 (1022.87 GB)
DFS Used: 145424384 (138.69 MB)
Non DFS Used: 15433760768 (14.37 GB)
DFS Remaining: 1026808496128 (956.29 GB)
DFS Used%: 0.01%
DFS Remaining%: 93.49%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Mon May 20 10:44:50 IST 2019
Last Block Report: Mon May 20 09:12:14 IST 2019
Num of Blocks: 106
Name: 192.168.5.252:9866 (thor)
Hostname: thor
Decommission Status : Normal
Configured Capacity: 1098293506048 (1022.87 GB)
DFS Used: 8003584 (7.63 MB)
Non DFS Used: 16954404864 (15.79 GB)
DFS Remaining: 1025425272832 (955.00 GB)
DFS Used%: 0.00%
DFS Remaining%: 93.37%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Mon May 20 10:44:50 IST 2019
Last Block Report: Mon May 20 10:08:38 IST 2019
Num of Blocks: 102
Name: 192.168.5.253:9866 (hela)
Hostname: hela
Decommission Status : Normal
Configured Capacity: 1098293506048 (1022.87 GB)
DFS Used: 85958656 (81.98 MB)
Non DFS Used: 19011055616 (17.71 GB)
DFS Remaining: 1023290667008 (953.01 GB)
DFS Used%: 0.01%
DFS Remaining%: 93.17%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Mon May 20 10:44:50 IST 2019
Last Block Report: Mon May 20 09:12:14 IST 2019
Num of Blocks: 109
I found these links: https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithNFS.html and https://www.edureka.co/blog/how-to-set-up-hadoop-cluster-with-hdfs-high-availability/
Both will help with setting up this environment.
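For reference, once NameNode HA itself is configured (nameservice, two NameNodes, shared edits, as described in the linked docs), automatic failover comes down to two extra settings plus running a ZKFC daemon on each NameNode host. A minimal sketch; the ZooKeeper hostnames below are placeholders:

```xml
<!-- hdfs-site.xml: enable automatic failover for the nameservice -->
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>

<!-- core-site.xml: ZooKeeper quorum used by the ZKFCs for leader election -->
<property>
  <name>ha.zookeeper.quorum</name>
  <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
</property>
```

After adding these, `hdfs zkfc -formatZK` initializes the HA state in ZooKeeper, and a ZKFC process must be started on each NameNode host.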

hadoop dfs used 100% with little data

Recently I formatted the NameNode (hadoop namenode -format), but when I started HDFS I couldn't upload any data to it, so I deleted the DataNode directories to make sure they have the same namespace ID.
But when I run hdfs dfsadmin -report, something strange shows up:
Live datanodes (3):
Name: 192.168.0.30:50010 (hadoop1)
Hostname: hadoop1
Rack: /default
Decommission Status : Normal
Configured Capacity: 0 (0 B)
DFS Used: 8192 (8 KB)
Non DFS Used: 0 (0 B)
DFS Remaining: 0 (0 B)
DFS Used%: 100.00%
DFS Remaining%: 0.00%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 2
Last contact: Thu Mar 22 14:10:41 CST 2018
All the DataNodes report DFS Used% of 100% and DFS Remaining% of 0%,
but there is still free space available on the disk:
[root@hadoop1 nn]# df -h /dfs/
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/cl-root 70G 55G 16G 78% /
When I open the NameNode web page, the capacity is zero.
Any ideas?
Cheers

How to read `hadoop dfsadmin -report` output

Command:
[hdfs@sandbox oozie]$ hadoop dfsadmin -report | head -n 100
Output:
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
Configured Capacity: 44716605440 (41.65 GB)
Present Capacity: 31614091245 (29.44 GB)
DFS Remaining: 30519073792 (28.42 GB)
DFS Used: 1095017453 (1.02 GB)
DFS Used%: 3.46%
Under replicated blocks: 657
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
-------------------------------------------------
Live datanodes (1):
Name: 10.0.2.15:50010 (sandbox.hortonworks.com)
Hostname: sandbox.hortonworks.com
Decommission Status : Normal
Configured Capacity: 44716605440 (41.65 GB)
DFS Used: 1095017453 (1.02 GB)
Non DFS Used: 13102514195 (12.20 GB)
DFS Remaining: 30519073792 (28.42 GB)
DFS Used%: 2.45%
DFS Remaining%: 68.25%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 4
Last contact: Thu Aug 11 23:12:04 UTC 2016
What are Cache Used% and Non DFS Used, specifically?
hdfs dfsadmin -report command:
Reports basic filesystem information and statistics. Optional flags
may be used to filter the list of displayed DataNodes.
...from the official Hadoop documentation.
About Cache Used%:
It depends on "Configured Cache Capacity"; it's a percentage of the configured value. As you have not configured any space for the cache, it is shown as 100% (0 B out of 0 B).
Non DFS Used:
It is calculated using the following formula:
Non DFS Used = Configured Capacity - DFS Used - DFS Remaining
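Plugging in the DataNode numbers from the report above confirms the formula; all values are in bytes:

```shell
# Values from the sandbox DataNode section of the report above
CONFIGURED=44716605440     # Configured Capacity
DFS_USED=1095017453        # DFS Used
DFS_REMAINING=30519073792  # DFS Remaining

NON_DFS_USED=$(( CONFIGURED - DFS_USED - DFS_REMAINING ))
echo "$NON_DFS_USED"       # 13102514195 bytes (12.20 GB), matching the report
```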

HDFS file upload failed in the middle

I am building a prototype using CDH 5.1. I am using a 3 node cluster.
While uploading the data from the database to the HDFS the resource manager server aborted unexpectedly.
Now I see a file of size 3 GB in the HDFS.
Issue :
The total row count in the DB for the query is 42600000. The server stopped after 7641242 rows were transferred. I am using Talend for the ETL. I know that this time I won't be able to do much other than start the process all over again.
Is there a way we can mitigate this issue in future?
Update after running the dfsadmin command :
sudo -u hdfs hadoop dfsadmin -report
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
Configured Capacity: 338351861760 (315.11 GB)
Present Capacity: 314764926976 (293.15 GB)
DFS Remaining: 303901577216 (283.03 GB)
DFS Used: 10863349760 (10.12 GB)
DFS Used%: 3.45%
Under replicated blocks: 1
Blocks with corrupt replicas: 1
Missing blocks: 0
-------------------------------------------------
Live datanodes (3):
Name: 10.215.204.196:50010 (txwlcloud3)
Hostname: txwlcloud3
Rack: /default
Decommission Status : Normal
Configured Capacity: 112783953920 (105.04 GB)
DFS Used: 3623538688 (3.37 GB)
Non DFS Used: 2971234304 (2.77 GB)
DFS Remaining: 106189180928 (98.90 GB)
DFS Used%: 3.21%
DFS Remaining%: 94.15%
Configured Cache Capacity: 522190848 (498 MB)
Cache Used: 0 (0 B)
Cache Remaining: 522190848 (498 MB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Last contact: Tue Sep 30 10:55:11 CDT 2014
Name: 10.215.204.203:50010 (txwlcloud2)
Hostname: txwlcloud2
Rack: /default
Decommission Status : Normal
Configured Capacity: 112783953920 (105.04 GB)
DFS Used: 3645382656 (3.40 GB)
Non DFS Used: 2970497024 (2.77 GB)
DFS Remaining: 106168074240 (98.88 GB)
DFS Used%: 3.23%
DFS Remaining%: 94.13%
Configured Cache Capacity: 815792128 (778 MB)
Cache Used: 0 (0 B)
Cache Remaining: 815792128 (778 MB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Last contact: Tue Sep 30 10:55:10 CDT 2014
Name: 10.215.204.213:50010 (txwlcloud1)
Hostname: txwlcloud1
Rack: /default
Decommission Status : Normal
Configured Capacity: 112783953920 (105.04 GB)
DFS Used: 3594428416 (3.35 GB)
Non DFS Used: 17645203456 (16.43 GB)
DFS Remaining: 91544322048 (85.26 GB)
DFS Used%: 3.19%
DFS Remaining%: 81.17%
Configured Cache Capacity: 3145728 (3 MB)
Cache Used: 0 (0 B)
Cache Remaining: 3145728 (3 MB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Last contact: Tue Sep 30 10:55:10 CDT 2014

HBase: Newly added regionserver is not serving requests

I'm setting up an HBase cluster on a cloud infrastructure.
HBase version: 0.94.11
Hadoop version: 1.0.4
Currently I have 4 nodes in my cluster (1 master, 3 regionservers) and I'm using YCSB (Yahoo benchmarks) to create a table (500,000 rows) and send asynchronous READ requests.
Everything works fine with this setup (I'm monitoring the whole process with Ganglia and collecting lambda, throughput, and latency combined with YCSB's output), but the problem occurs when I add a new regionserver on the fly: it doesn't get any requests.
What "on-the-fly" means:
While YCSB is sending requests to the cluster, I'm adding new regionservers using Python scripts.
Addition Process (while the cluster is serving requests):
I'm creating a new VM which will act as the new regionserver and configuring every needed aspect (hbase, hadoop, /etc/hosts, connecting to the private network, etc.)
Stopping the **hbase** balancer
Configuring every node in the cluster with the new node's information: adding the hostname to the regionservers file, adding the hostname to hadoop's slaves file, adding the hostname and IP to the /etc/hosts file of every node, etc.
Executing on the master node:
`hadoop/bin/start-dfs.sh`
`hadoop/bin/start-mapred.sh`
`hbase/bin/start-hbase.sh`
(I've also tried running `hbase start regionserver` on the newly added node; it does exactly the same as the last command, i.e. starts the regionserver)
Once the newly added node is up and running, I'm executing the **hadoop** load balancer
When the hadoop load balancer stops, I'm starting the **hbase** load balancer again
I'm connecting over SSH to the master node and checking that the load balancers (hbase/hadoop) did their job: both the blocks and the regions are uniformly spread across all the regionservers/slaves, including the new one.
But when I run status 'simple' in the hbase shell, I see that the new regionservers are not getting any requests. (Below is the output of the command after adding 2 new regionservers, "okeanos-nodes-4/5".)
hbase(main):008:0> status 'simple'
5 live servers
okeanos-nodes-1:60020 1380865800330
requestsPerSecond=5379, numberOfOnlineRegions=4, usedHeapMB=175, maxHeapMB=3067
okeanos-nodes-2:60020 1380865800738
requestsPerSecond=5674, numberOfOnlineRegions=4, usedHeapMB=161, maxHeapMB=3067
okeanos-nodes-5:60020 1380867725605
requestsPerSecond=0, numberOfOnlineRegions=3, usedHeapMB=27, maxHeapMB=3067
okeanos-nodes-3:60020 1380865800162
requestsPerSecond=3871, numberOfOnlineRegions=5, usedHeapMB=162, maxHeapMB=3067
okeanos-nodes-4:60020 1380866702216
requestsPerSecond=0, numberOfOnlineRegions=3, usedHeapMB=29, maxHeapMB=3067
0 dead servers
Aggregate load: 14924, regions: 19
The fact that they don't serve any requests is also evidenced by the CPU usage: on a serving regionserver it is about 70%, while on these 2 regionservers it is about 2%.
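The imbalance is also visible directly in the numbers: summing the per-server request rates from the status 'simple' output reproduces the reported aggregate load, with the two new regionservers contributing nothing:

```shell
# requestsPerSecond per regionserver, from the `status 'simple'` output above
TOTAL=0
for RATE in 5379 5674 0 3871 0; do
  TOTAL=$(( TOTAL + RATE ))
done
echo "$TOTAL"   # 14924, the reported aggregate load
```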
Below is the output of hadoop dfsadmin -report; as you can see, the blocks are evenly distributed (according to hadoop balancer -threshold 2).
root@okeanos-nodes-master:~# /opt/hadoop-1.0.4/bin/hadoop dfsadmin -report
Configured Capacity: 105701683200 (98.44 GB)
Present Capacity: 86440648704 (80.5 GB)
DFS Remaining: 84188446720 (78.41 GB)
DFS Used: 2252201984 (2.1 GB)
DFS Used%: 2.61%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Datanodes available: 5 (5 total, 0 dead)
Name: 10.0.0.11:50010
Decommission Status : Normal
Configured Capacity: 21140336640 (19.69 GB)
DFS Used: 309166080 (294.84 MB)
Non DFS Used: 3851579392 (3.59 GB)
DFS Remaining: 16979591168(15.81 GB)
DFS Used%: 1.46%
DFS Remaining%: 80.32%
Last contact: Fri Oct 04 11:30:31 EEST 2013
Name: 10.0.0.3:50010
Decommission Status : Normal
Configured Capacity: 21140336640 (19.69 GB)
DFS Used: 531652608 (507.02 MB)
Non DFS Used: 3852300288 (3.59 GB)
DFS Remaining: 16756383744(15.61 GB)
DFS Used%: 2.51%
DFS Remaining%: 79.26%
Last contact: Fri Oct 04 11:30:32 EEST 2013
Name: 10.0.0.5:50010
Decommission Status : Normal
Configured Capacity: 21140336640 (19.69 GB)
DFS Used: 502910976 (479.61 MB)
Non DFS Used: 3853029376 (3.59 GB)
DFS Remaining: 16784396288(15.63 GB)
DFS Used%: 2.38%
DFS Remaining%: 79.4%
Last contact: Fri Oct 04 11:30:32 EEST 2013
Name: 10.0.0.4:50010
Decommission Status : Normal
Configured Capacity: 21140336640 (19.69 GB)
DFS Used: 421974016 (402.43 MB)
Non DFS Used: 3852365824 (3.59 GB)
DFS Remaining: 16865996800(15.71 GB)
DFS Used%: 2%
DFS Remaining%: 79.78%
Last contact: Fri Oct 04 11:30:29 EEST 2013
Name: 10.0.0.10:50010
Decommission Status : Normal
Configured Capacity: 21140336640 (19.69 GB)
DFS Used: 486498304 (463.96 MB)
Non DFS Used: 3851759616 (3.59 GB)
DFS Remaining: 16802078720(15.65 GB)
DFS Used%: 2.3%
DFS Remaining%: 79.48%
Last contact: Fri Oct 04 11:30:29 EEST 2013
I've tried stopping YCSB, restarting the HBase master, and restarting YCSB, but with no luck: these 2 nodes don't serve any requests!
As there are many log and conf files, I have created a zip file with the logs and confs (both hbase and hadoop) of the master, of a healthy regionserver serving requests, and of a regionserver not serving requests.
https://dl.dropboxusercontent.com/u/13480502/hbase_hadoop_logs__conf.zip
Thank you in advance!!
I found what was going on, and it had nothing to do with HBase... I had forgotten to add the hostname and IP of the new RS to the YCSB server VM (the /etc/hosts file)... :-(