Which nodes a Hadoop file is stored on
Is there a way to find which datanodes a particular HDFS file is stored on, or to list the blocks that make up an HDFS file?
For example, if I have hdfs://user/person/file.csv, is there a way to find the list of ext4 paths corresponding to the blocks that make up this file on the datanodes?
Yes, you can find out which datanodes hold each block of a file in HDFS. Here is the command:
hdfs fsck /user/hduser/file.txt -files -blocks -locations
This prints information about each individual block of the file "/user/hduser/file.txt". The output generally looks like this:
[hduser@node001 ~]$ hdfs fsck /user/hduser/file.txt -files -blocks -locations
Connecting to namenode via http://node001.morado.com:50070
FSCK started by hduser (auth:SIMPLE) from /192.168.2.169 for path /user/hduser/file.txt at Mon Jul 11 23:14:27 PDT 2016
/user/hduser/file.txt 1073839694 bytes, 9 block(s): OK
0. BP-778802867-192.168.2.147-1465886958278:blk_1080847742_7107323 len=134217728 repl=3 [DatanodeInfoWithStorage[192.168.2.169:50010,DS-25d2b73a-2dc2-48c1-9aad-f0f5ca8d302a,DISK], DatanodeInfoWithStorage[192.168.2.147:50010,DS-293a7f8d-ad31-4bc1-98d8-0c0822eda305,DISK], DatanodeInfoWithStorage[192.168.2.20:50010,DS-8efb7a6e-08f0-4f2d-aee2-bc5a102277bd,DISK]]
1. BP-778802867-192.168.2.147-1465886958278:blk_1080847748_7107329 len=134217728 repl=3 [DatanodeInfoWithStorage[192.168.2.169:50010,DS-6881b609-1473-48d5-a07c-f111e0bdcf2f,DISK], DatanodeInfoWithStorage[192.168.2.15:50010,DS-060c75ff-5632-4f6f-a73b-fb2a68927c63,DISK], DatanodeInfoWithStorage[192.168.2.147:50010,DS-3e108776-d3bd-4b84-b68a-59e1ca755331,DISK]]
2. BP-778802867-192.168.2.147-1465886958278:blk_1080847753_7107334 len=134217728 repl=3 [DatanodeInfoWithStorage[192.168.2.169:50010,DS-25d2b73a-2dc2-48c1-9aad-f0f5ca8d302a,DISK], DatanodeInfoWithStorage[192.168.2.177:50010,DS-b7a33931-8917-4fe2-b2ec-2e4c3d5b6b01,DISK], DatanodeInfoWithStorage[192.168.2.135:50010,DS-5efb0813-7e4e-4d27-8fa4-7f8f3b2e6e3c,DISK]]
3. BP-778802867-192.168.2.147-1465886958278:blk_1080847760_7107341 len=134217728 repl=3 [DatanodeInfoWithStorage[192.168.2.169:50010,DS-6881b609-1473-48d5-a07c-f111e0bdcf2f,DISK], DatanodeInfoWithStorage[192.168.2.20:50010,DS-b8a5ceaf-6953-4842-8930-29b286ccb7cf,DISK], DatanodeInfoWithStorage[192.168.2.134:50010,DS-c6418fbb-6e30-447e-b507-bf19e0f28fd1,DISK]]
4. BP-778802867-192.168.2.147-1465886958278:blk_1080847764_7107345 len=134217728 repl=3 [DatanodeInfoWithStorage[192.168.2.169:50010,DS-95636645-c59e-4bca-8478-c15b3c16d514,DISK], DatanodeInfoWithStorage[192.168.2.147:50010,DS-293a7f8d-ad31-4bc1-98d8-0c0822eda305,DISK], DatanodeInfoWithStorage[192.168.2.20:50010,DS-8efb7a6e-08f0-4f2d-aee2-bc5a102277bd,DISK]]
5. BP-778802867-192.168.2.147-1465886958278:blk_1080847768_7107349 len=134217728 repl=3 [DatanodeInfoWithStorage[192.168.2.169:50010,DS-6881b609-1473-48d5-a07c-f111e0bdcf2f,DISK], DatanodeInfoWithStorage[192.168.2.20:50010,DS-276bd2c8-ee3d-4cd3-b655-17a83917c45b,DISK], DatanodeInfoWithStorage[192.168.2.134:50010,DS-5e128658-c876-46df-b10e-5962baf73db2,DISK]]
6. BP-778802867-192.168.2.147-1465886958278:blk_1080847772_7107353 len=134217728 repl=3 [DatanodeInfoWithStorage[192.168.2.169:50010,DS-fa57d5e9-a187-4856-8bf2-6933e63b3afe,DISK], DatanodeInfoWithStorage[192.168.2.135:50010,DS-f5f1e2a0-186b-4c70-844f-7e7ebe389f50,DISK], DatanodeInfoWithStorage[192.168.2.15:50010,DS-8cfb8ffb-77b6-40bb-930e-81c7198166ad,DISK]]
7. BP-778802867-192.168.2.147-1465886958278:blk_1080847776_7107357 len=134217728 repl=3 [DatanodeInfoWithStorage[192.168.2.169:50010,DS-95636645-c59e-4bca-8478-c15b3c16d514,DISK], DatanodeInfoWithStorage[192.168.2.15:50010,DS-060c75ff-5632-4f6f-a73b-fb2a68927c63,DISK], DatanodeInfoWithStorage[192.168.2.147:50010,DS-3e108776-d3bd-4b84-b68a-59e1ca755331,DISK]]
8. BP-778802867-192.168.2.147-1465886958278:blk_1080847780_7107361 len=97870 repl=3 [DatanodeInfoWithStorage[192.168.2.169:50010,DS-25d2b73a-2dc2-48c1-9aad-f0f5ca8d302a,DISK], DatanodeInfoWithStorage[192.168.2.15:50010,DS-de3cffa6-cef4-4f47-9bbf-5f44214b3a5a,DISK], DatanodeInfoWithStorage[192.168.2.177:50010,DS-847ec520-bc14-4ca4-af94-21140a3b20f6,DISK]]
Status: HEALTHY
Total size: 1073839694 B
Total dirs: 0
Total files: 1
Total symlinks: 0
Total blocks (validated): 9 (avg. block size 119315521 B)
Minimally replicated blocks: 9 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 3.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 7
Number of racks: 1
FSCK ended at Mon Jul 11 23:14:27 PDT 2016 in 1 milliseconds
The filesystem under path '/user/hduser/file.txt' is HEALTHY
Look at the information after "repl" on each block line: the entries tagged "DatanodeInfoWithStorage". Each entry gives the IP address and port of a datanode that holds a replica of that block, followed by the storage ID and the storage type.
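To make the block lines easier to work with, here is a minimal Python sketch that extracts the block name, length, and datanode IPs from saved fsck output. It assumes the line format shown in the listing above (other Hadoop versions may print it differently); the names `parse_block_locations`, `blocks_per_datanode`, and `SAMPLE` are made up for this illustration.

```python
import re
from collections import defaultdict

# One block line of `hdfs fsck <path> -files -blocks -locations`, in the
# format shown in the listing above (an assumption for other versions).
BLOCK_LINE = re.compile(
    r"^\s*(?P<index>\d+)\.\s+(?P<pool>BP-[^:]+):(?P<block>blk_-?\d+)_(?P<genstamp>\d+)"
    r"\s+len=(?P<length>\d+)"
)
# One replica entry: DatanodeInfoWithStorage[ip:port,storage-id,storage-type]
REPLICA = re.compile(r"DatanodeInfoWithStorage\[([^:\]]+):(\d+),([^,\]]+),([^\]]+)\]")


def parse_block_locations(fsck_output):
    """Return a list of (block_name, length, [datanode_ip, ...]) tuples."""
    blocks = []
    for line in fsck_output.splitlines():
        match = BLOCK_LINE.match(line)
        if not match:
            continue  # skip header and summary lines
        hosts = [ip for ip, _port, _storage_id, _type in REPLICA.findall(line)]
        blocks.append((match.group("block"), int(match.group("length")), hosts))
    return blocks


def blocks_per_datanode(blocks):
    """Group block names by the datanode IP that holds a replica."""
    per_node = defaultdict(list)
    for name, _length, hosts in blocks:
        for host in hosts:
            per_node[host].append(name)
    return dict(per_node)


# Two block lines copied from the listing above (blocks 0 and 8).
SAMPLE = """\
0. BP-778802867-192.168.2.147-1465886958278:blk_1080847742_7107323 len=134217728 repl=3 [DatanodeInfoWithStorage[192.168.2.169:50010,DS-25d2b73a-2dc2-48c1-9aad-f0f5ca8d302a,DISK], DatanodeInfoWithStorage[192.168.2.147:50010,DS-293a7f8d-ad31-4bc1-98d8-0c0822eda305,DISK], DatanodeInfoWithStorage[192.168.2.20:50010,DS-8efb7a6e-08f0-4f2d-aee2-bc5a102277bd,DISK]]
8. BP-778802867-192.168.2.147-1465886958278:blk_1080847780_7107361 len=97870 repl=3 [DatanodeInfoWithStorage[192.168.2.169:50010,DS-25d2b73a-2dc2-48c1-9aad-f0f5ca8d302a,DISK], DatanodeInfoWithStorage[192.168.2.15:50010,DS-de3cffa6-cef4-4f47-9bbf-5f44214b3a5a,DISK], DatanodeInfoWithStorage[192.168.2.177:50010,DS-847ec520-bc14-4ca4-af94-21140a3b20f6,DISK]]
"""

for name, length, hosts in parse_block_locations(SAMPLE):
    print(name, length, hosts)
```

As for the local (ext4) paths asked about in the question: on a datanode, each replica is stored as an ordinary file named after the block (for example blk_1080847742, with a companion .meta file) somewhere under the directories configured in dfs.datanode.data.dir. Searching those directories for the block name on one of the listed hosts should locate the file; the exact subdirectory layout depends on the Hadoop version.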
Related
HDFS: results from hdfs fsck / differ from hdfs dfsadmin -report
We have a Hadoop cluster (Ambari platform, HDP version 2.6.4), and we ran a verification step to find out whether we have under-replicated blocks. The first check was:
su hdfs
hdfs fsck /
It gives these results:
Total size: 17653549013347 B (Total open files size: 854433698229 B)
Total dirs: 843714
Total files: 11752836
Total symlinks: 0 (Files currently being written: 16)
Total blocks (validated): 11792203 (avg. block size 1497052 B) (Total open file blocks (not validated): 6381)
Minimally replicated blocks: 11792203 (100.00001 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 3.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 6
Number of racks: 1
As we can see above, "Under-replicated blocks" is 0. But when we run the next check:
hdfs dfsadmin -report
we get:
Configured Capacity: 141275429535744 (128.49 TB)
Present Capacity: 140886991802565 (128.14 TB)
DFS Remaining: 84748655941292 (77.08 TB)
DFS Used: 56138335861273 (51.06 TB)
DFS Used%: 39.85%
Under replicated blocks: 4212067
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
Here "Under replicated blocks" is 4212067. We want to know which under-replicated count is correct: why do hdfs fsck / and hdfs dfsadmin -report give different results? By the way, Ambari shows approximately the same results as hdfs dfsadmin -report.
hdfs jmxget vs hdfs fsck
I have 2 namenodes with several datanodes, and today I noticed that I have some corrupt blocks. What is odd is this:
hdfs jmxget -server namenode02 -port 8006 | grep CorruptBlocks
CorruptBlocks=27
But when I checked with hdfs fsck /, I got:
Total size: 734930879995888 B (Total open files size: 537967073 B)
Total dirs: 1501316
Total files: 113743394
Total symlinks: 0 (Files currently being written: 137)
Total blocks (validated): 109063040 (avg. block size 6738587 B) (Total open file blocks (not validated): 133)
Minimally replicated blocks: 109063040 (100.00001 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 3.001944
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 103
Number of racks: 1
FSCK ended at Mon Feb 12 10:09:10 CET 2018 in 1608344 milliseconds
So fsck reports nothing wrong with the blocks. How is this check made? Thanks in advance!
The hdfs jmxget command reports the overall status of the blocks in Hadoop, and it seems that a few of them might be corrupted (I don't know the reason). The fsck command reports the status of the files, which are safe because of the replication factor that is set. To conclude, this is normal behavior; there are no anomalies here.
CDH HDFS node decommission never ends
We have a 12-server Hadoop cluster (CDH). Recently we wanted to decommission three of the servers, but the process has already been running for more than 2 days and never ends. In particular, I see only 94G of data available on the three datanodes, and that size has not changed in the past 24 hours, even though the number of under-replicated blocks is already zero. The replication factor is 3 for all data in HDFS. Below is the result of the hadoop fsck command:
Total size: 5789534135468 B (Total open files size: 94222879072 B)
Total dirs: 42458
Total files: 5494378
Total symlinks: 0 (Files currently being written: 133)
Total blocks (validated): 5506578 (avg. block size 1051385 B) (Total open file blocks (not validated): 822)
Minimally replicated blocks: 5506578 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 2.999584
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 13
Number of racks: 1
FSCK ended at Mon Oct 17 16:36:09 KST 2016 in 781094 milliseconds
You can try stopping the Cloudera agent on the datanode:
sudo service cloudera-scm-agent hard_stop_confirmed
After the agent is stopped, you can delete that datanode from the HDFS instances page. Hope this works.
HDFS data got corrupted; the corrupted folder cannot be deleted because it shows "No such file or directory"
My HDFS data got corrupted. On running fsck, I got the following result:
/siva: CORRUPT block blk_-1910702044505537827
/siva: CORRUPT block blk_6483992593913191763
/siva: MISSING 2 blocks of total size 82009995 B.Status: CORRUPT
Total size: 82009995 B
Total dirs: 8
Total files: 1
Total blocks (validated): 2 (avg. block size 41004997 B)
CORRUPT FILES: 1
MISSING BLOCKS: 2
MISSING SIZE: 82009995 B
CORRUPT BLOCKS: 2
Minimally replicated blocks: 0 (0.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 2
Average block replication: 0.0
Corrupt blocks: 2
Missing replicas: 0
Number of data-nodes: 1
Number of racks: 1
FSCK ended at Tue Feb 23 12:21:03 IST 2016 in 2 milliseconds
The filesystem under path '/' is CORRUPT
Then I tried to remove the /siva folder, but I got the following output:
rmr: cannot remove /siva: No such file or directory.
Please help.
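Use hdfs fsck / -delete to remove corrupted files. Note that this deletes the affected files from HDFS, so use it only when their data cannot be recovered.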
Run the command below on any head node (HN0 or HN1):
hdfs fsck -D "fs.default.name=hdfs://mycluster/" /
In the report, the filesystem is shown as corrupt because some blocks are corrupted:
The filesystem under path '/' is CORRUPT
Run the command below to fix the issue:
hdfs fsck -D "fs.default.name=hdfs://mycluster/" / -delete
Afterwards, run the first command again to check the filesystem status:
hdfs fsck -D "fs.default.name=hdfs://mycluster/" /
This time the filesystem status should be reported as healthy:
The filesystem under path '/' is HEALTHY
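Since fsck with -delete removes the affected files, it can help to see exactly which files and blocks are reported as corrupt first. The following is a rough Python sketch that groups the "CORRUPT block" lines of saved fsck output by file path. It assumes the line format quoted in the question above; `corrupt_blocks_by_file` and `SAMPLE` are hypothetical names used only for this example.

```python
import re
from collections import defaultdict

# Matches lines such as "/siva: CORRUPT block blk_-1910702044505537827",
# the format quoted in the question (other versions may differ).
CORRUPT_LINE = re.compile(r"^(?P<path>/.*?): CORRUPT block (?P<block>blk_-?\d+)")


def corrupt_blocks_by_file(fsck_output):
    """Map each affected HDFS path to the list of its corrupt block names."""
    affected = defaultdict(list)
    for line in fsck_output.splitlines():
        match = CORRUPT_LINE.match(line.strip())
        if match:
            affected[match.group("path")].append(match.group("block"))
    return dict(affected)


# Lines copied from the fsck output in the question.
SAMPLE = """\
/siva: CORRUPT block blk_-1910702044505537827
/siva: CORRUPT block blk_6483992593913191763
/siva: MISSING 2 blocks of total size 82009995 B.Status: CORRUPT
Total size: 82009995 B
"""

for path, blocks in corrupt_blocks_by_file(SAMPLE).items():
    print(path, blocks)
```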
HDFS blocks issue
When I run the fsck command, it shows the total number of blocks as 68 (avg. block size 286572 B). How can I have only 68 blocks? I recently installed CDH5 with Hadoop 2.6.0.
[hdfs@cluster1 ~]$ hdfs fsck /
Connecting to namenode via http://cluster1.abc:50070
FSCK started by hdfs (auth:SIMPLE) from /192.168.101.241 for path / at Fri Sep 25 09:51:56 EDT 2015
....................................................................Status: HEALTHY
Total size: 19486905 B
Total dirs: 569
Total files: 68
Total symlinks: 0
Total blocks (validated): 68 (avg. block size 286572 B)
Minimally replicated blocks: 68 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 1.9411764
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 3
Number of racks: 1
FSCK ended at Fri Sep 25 09:51:56 EDT 2015 in 41 milliseconds
The filesystem under path '/' is HEALTHY
This is what I get when I run the hdfs dfsadmin -report command:
[hdfs@cluster1 ~]$ hdfs dfsadmin -report
Configured Capacity: 5715220577895 (5.20 TB)
Present Capacity: 5439327449088 (4.95 TB)
DFS Remaining: 5439303270400 (4.95 TB)
DFS Used: 24178688 (23.06 MB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 504
Also, my Hive query does not start a MapReduce job. Could that be caused by the issue above? Any suggestions? Thank you!
Blocks are the chunks of data that HDFS distributes across the nodes of the file system. For example, a 200 MB file is actually stored as 2 blocks, one of 128 MB and one of 72 MB. You do not need to worry about the blocks, as the framework takes care of them. As the fsck report shows, you have 68 files in HDFS with a total size of only about 19 MB, so every file is smaller than one block (128 MB by default), each file occupies exactly one block, and there are 68 blocks in total.
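The block arithmetic in the answer above can be sketched in a few lines of Python. This is only an illustration, assuming a block size of 128 MB (134217728 bytes, the length of the full blocks in the first fsck listing on this page); `split_into_blocks` is a made-up helper, not part of any Hadoop API.

```python
MB = 1024 * 1024


def split_into_blocks(file_size, block_size=128 * MB):
    """Return the lengths in bytes of the blocks a file of file_size bytes occupies.

    Assumes plain fixed-size blocks: every block is full except possibly the last.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    full_blocks, remainder = divmod(file_size, block_size)
    return [block_size] * full_blocks + ([remainder] if remainder else [])


# The 200 MB example from the answer: one 128 MB block and one 72 MB block.
print([size // MB for size in split_into_blocks(200 * MB)])

# The 1073839694-byte file from the first fsck listing on this page.
lengths = split_into_blocks(1073839694)
print(len(lengths), lengths[-1])

# A file smaller than the block size occupies a single, partially filled block.
print(split_into_blocks(286572))
```

A file smaller than the block size therefore counts as one block, which is why a cluster holding 68 small files reports 68 blocks.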