Hadoop error while using this command hadoop fs -mkdir /in - windows

I tried the Hadoop command hadoop fs -mkdir /in from the folder C:\hwork, but it did not work properly. Please help me to solve this.

Before doing any filesystem operations, you should be able to visit the NameNode UI to verify that your HDFS cluster is actually working. The error message you have given indicates that it is not.
The NameNode UI is typically available at http://localhost:50070/dfshealth.jsp.
You can also verify that HDFS is working properly by running the following command:
hdfs dfsadmin -report
If "Safe Mode is ON", then things are not running properly. You should also have a non-zero configured capacity and at least one datanode which is available.
A healthy pseudo-distributed HDFS environment should give a report that looks something like this:
Configured Capacity: 41746268160 (38.88 GB)
Present Capacity: 34658451456 (32.28 GB)
DFS Remaining: 34655678464 (32.28 GB)
DFS Used: 2772992 (2.64 MB)
DFS Used%: 0.01%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Datanodes available: 1 (1 total, 0 dead)
Live datanodes:
Name: 192.168.10.129:50010 (datanode)
Hostname: datanode
Decommission Status : Normal
Configured Capacity: 41746268160 (38.88 GB)
DFS Used: 2772992 (2.64 MB)
Non DFS Used: 7087816704 (6.60 GB)
DFS Remaining: 34655678464 (32.28 GB)
DFS Used%: 0.01%
DFS Remaining%: 83.02%
Last contact: Thu May 07 16:51:50 UTC 2015
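If you cannot open the UI in a browser, the same checks can be run from a shell; a minimal sketch, assuming a default pseudo-distributed Hadoop 2.x setup (the UI port moved to 9870 in Hadoop 3.x) and that curl is available:
hdfs dfsadmin -safemode get        # should print "Safe mode is OFF"
curl -s http://localhost:50070/    # should return the NameNode UI page if the NameNode is up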

Related

Unable to write to HDFS: WARN hdfs.DataStreamer - Unexpected EOF

I'm following a tutorial, and while running in a single-cluster test environment I suddenly cannot run any MR jobs or write data to HDFS. It worked fine before, but now I keep getting the error below (rebooting didn't help).
I can read and delete files from HDFS, but not write.
$ hdfs dfs -put war-and-peace.txt /user/hands-on/
19/03/25 18:28:29 WARN hdfs.DataStreamer: Exception for BP-1098838250-127.0.0.1-1516469292616:blk_1073742374_1550
java.io.EOFException: Unexpected EOF while trying to read response from server
at org.apache.hadoop.hdfs.protocolPB.PBHelperClient.vintPrefixed(PBHelperClient.java:399)
at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:213)
at org.apache.hadoop.hdfs.DataStreamer$ResponseProcessor.run(DataStreamer.java:1020)
put: All datanodes [DatanodeInfoWithStorage[127.0.0.1:50010,DS-b90326de-a499-4a43-a66a-cc3da83ea966,DISK]] are bad. Aborting...
"hdfs dfsadmin -report" shows me everything is fine, enough disk space. I barely ran any jobs, just some test MRs and little test data.
$ hdfs dfsadmin -report
Configured Capacity: 52710469632 (49.09 GB)
Present Capacity: 43335585007 (40.36 GB)
DFS Remaining: 43334025216 (40.36 GB)
DFS Used: 1559791 (1.49 MB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
Pending deletion blocks: 0
-------------------------------------------------
Live datanodes (1):
Name: 127.0.0.1:50010 (localhost)
Hostname: localhost
Decommission Status : Normal
Configured Capacity: 52710469632 (49.09 GB)
DFS Used: 1559791 (1.49 MB)
Non DFS Used: 6690530065 (6.23 GB)
DFS Remaining: 43334025216 (40.36 GB)
DFS Used%: 0.00%
DFS Remaining%: 82.21%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 2
Last contact: Mon Mar 25 18:30:45 EDT 2019
Also, the NameNode WebUI (port 50070) shows everything is fine, and the logs do not report any errors either. What could it be, and how can I properly troubleshoot it?
CentOS Linux 6.9 minimal
Apache Hadoop 2.8.1
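The client-side stack trace only says the write pipeline died, so the DataNode side usually holds the real reason. A troubleshooting sketch (the log path and the limit shown are assumptions that depend on your install):
# inspect the DataNode log around the time of the failed put
tail -n 200 $HADOOP_HOME/logs/hadoop-*-datanode-*.log
# check the open-file limit for the user running the DataNode;
# a low value (e.g. 1024) is a common cause of "All datanodes ... are bad"
ulimit -n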

How to read `hadoop dfsadmin -report` output

Command:
[hdfs@sandbox oozie]$ hadoop dfsadmin -report | head -n 100
Output:
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
Configured Capacity: 44716605440 (41.65 GB)
Present Capacity: 31614091245 (29.44 GB)
DFS Remaining: 30519073792 (28.42 GB)
DFS Used: 1095017453 (1.02 GB)
DFS Used%: 3.46%
Under replicated blocks: 657
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
-------------------------------------------------
Live datanodes (1):
Name: 10.0.2.15:50010 (sandbox.hortonworks.com)
Hostname: sandbox.hortonworks.com
Decommission Status : Normal
Configured Capacity: 44716605440 (41.65 GB)
DFS Used: 1095017453 (1.02 GB)
Non DFS Used: 13102514195 (12.20 GB)
DFS Remaining: 30519073792 (28.42 GB)
DFS Used%: 2.45%
DFS Remaining%: 68.25%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 4
Last contact: Thu Aug 11 23:12:04 UTC 2016
What exactly are Cache Used% and Non DFS Used?
The hdfs dfsadmin -report command:
Reports basic filesystem information and statistics. Optional flags
may be used to filter the list of displayed DataNodes.
...from the official Hadoop documentation.
Regarding:
Cache Used%:
It depends on the "Configured Cache Capacity"; it is the percentage of that configured value. As you have not configured any space for the cache, it is shown as 100% (0 B out of 0 B).
Non DFS Used:
It is calculated using the following formula:
Non DFS Used = Configured Capacity - DFS Used - DFS Remaining
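Plugging the numbers from the report above into that formula as a sanity check:
Non DFS Used = 44716605440 - 1095017453 - 30519073792
             = 13102514195 bytes (12.20 GB)
which matches the "Non DFS Used" line reported for the datanode.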

Hadoop namenode can't get out of safemode

All.
I use Hadoop 2.6.0.
When I force Hadoop to leave safe mode using hdfs dfsadmin -safemode leave, it shows Safe mode is OFF, but I still can't delete the file in the directory; the result shows:
rm: Cannot delete /mei/app-20151013055617-0001-614d554c-cc04-4800-9be8-7d9b3fd3fcef. Name node is in safe mode.
I have tried to solve this problem using the methods listed on the Internet, but they don't work...
When I run the command 'hdfs dfsadmin -report', it shows:
Safe mode is ON
Configured Capacity: 52710469632 (49.09 GB)
Present Capacity: 213811200 (203.91 MB)
DFS Remaining: 0 (0 B)
DFS Used: 213811200 (203.91 MB)
DFS Used%: 100.00%
Under replicated blocks: 39
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Live datanodes (1):
Name: 127.0.0.1:50010 (bdrhel6)
Hostname: bdrhel6
Decommission Status : Normal
Configured Capacity: 52710469632 (49.09 GB)
DFS Used: 213811200 (203.91 MB)
Non DFS Used: 52496658432 (48.89 GB)
DFS Remaining: 0 (0 B)
DFS Used%: 0.41%
DFS Remaining%: 0.00%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed Oct 14 03:30:33 EDT 2015
Does anyone have the same problem?
Any help on this please.
Safemode is an HDFS state in which the file system is mounted read-only; no replication is performed, nor can files be created or deleted. This is automatically entered as the NameNode starts, to allow all DataNodes time to check in with the NameNode and announce which blocks they hold, before the NameNode determines which blocks are under-replicated, etc. The NameNode waits until a specific percentage of the blocks are present and accounted-for; this is controlled in the configuration by the dfs.safemode.threshold.pct parameter. After this threshold is met, safemode is automatically exited, and HDFS allows normal operations.
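To see the threshold in effect and the current safemode state on a running cluster, a small sketch (note: Hadoop 2.x renamed the property to dfs.namenode.safemode.threshold-pct and keeps dfs.safemode.threshold.pct as a deprecated alias):
hdfs getconf -confKey dfs.namenode.safemode.threshold-pct   # default is 0.999
hdfs dfsadmin -safemode get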
1. The command below forces the NameNode to exit safemode:
hdfs dfsadmin -safemode leave
2. Run hdfs fsck / -move or hdfs fsck / -delete to move or delete corrupted files.
Based on the report, it seems that resources are low on the NameNode. Add or free up more resources, then turn off safe mode manually. If you turn off safe mode before adding more resources or freeing up space, the NameNode will immediately return to safe mode.
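In this particular report the cause is visible: DFS Remaining is 0 B, so the volume backing HDFS is effectively full. A quick way to check where the space went (a sketch; the directories depend on your hdfs-site.xml):
hdfs getconf -confKey dfs.datanode.data.dir   # where block data is stored
hdfs getconf -confKey dfs.namenode.name.dir   # where the NameNode metadata is stored
df -h                                         # free space on the mount points holding those directories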
Reference:
Hadoop Tutorial-YDN
fsck
Running:
hdfs dfsadmin -safemode forceExit
did the trick for me.
I faced the same problem. It was occurring because there was no disk space left for Hadoop to run new commands that manipulate files.
Since Hadoop was in safemode, I could not even delete files inside HDFS.
I am using the Cloudera distribution of Hadoop, so I first deleted a few files in the Cloudera file system. This freed up some space. Then I executed the following command:
[cloudera@quickstart ~]$ hdfs dfsadmin -safemode leave && hadoop fs -rm -r <file on hdfs to be deleted>
This worked for me!
HTH

Hadoop datanodes cannot find namenode in standalone setup

There are no errors in any log but I believe my datanode cannot find my namenode.
This is the log message that leads me to this conclusion (based on what I've found online):
[INFO ]: org.apache.hadoop.ipc.Client - Retrying connect to server: /hadoop.server:9000. Already tried 4 time(s).
jps output:
7554 Jps
7157 NameNode
7419 SecondaryNameNode
7251 DataNode
Please can someone offer some advice?
Result of dfsadmin
Configured Capacity: 13613391872 (12.68 GB)
Present Capacity: 9255071744 (8.62 GB)
DFS Remaining: 9254957056 (8.62 GB)
DFS Used: 114688 (112 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Datanodes available: 1 (1 total, 0 dead)
Live datanodes:
Name: 192.172.1.49:50010 (Hadoop)
Hostname: Hadoop
Decommission Status : Normal
Configured Capacity: 13613391872 (12.68 GB)
DFS Used: 114688 (112 KB)
Non DFS Used: 4358320128 (4.06 GB)
DFS Remaining: 9254957056 (8.62 GB)
DFS Used%: 0.00%
DFS Remaining%: 67.98%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Last contact: Fri Aug 08 17:25:57 SAST 2014
Give a hostname to your machines and add their entries to the /etc/hosts file, like this:
#hostname hdserver.example.com
#vim /etc/hosts
192.168.0.25 hdserver.example.com
192.168.0.30 hdclient.example.com
and save it. (Use the correct IP addresses.)
On the client, also set the hostname to hdclient.example.com and make the same entries in /etc/hosts. This will allow the machines to locate each other by hostname.
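As a sanity check (a sketch; the property is fs.defaultFS on Hadoop 2.x, fs.default.name on older releases), confirm that the NameNode address the DataNode tries to reach matches the configuration and resolves through /etc/hosts:
hdfs getconf -confKey fs.defaultFS    # should print something like hdfs://hdserver.example.com:9000
getent hosts hdserver.example.com     # should return the IP you added to /etc/hosts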
Delete all contents from the tmp folder: rm -Rf path/of/tmp/directory
Format the namenode: bin/hadoop namenode -format
Start all processes again: bin/start-all.sh

HBase: Newly added regionserver is not serving requests

I'm setting up a Hbase cluster on a cloud infrastructure.
HBase version: 0.94.11
Hadoop version: 1.0.4
Currently I have 4 nodes in my cluster (1 master, 3 regionservers) and I'm using YCSB (Yahoo benchmarks) to create a table (500,000 rows) and send READ requests (asynchronous READ requests).
Everything works fine with this setup (I'm monitoring the whole process with Ganglia and getting lambda, throughput, and latency combined with YCSB's output), but the problem occurs when I add a new regionserver on-the-fly: it doesn't get any requests.
What "on-the-fly" means:
While YCSB is sending requests to the cluster, I'm adding new regionservers using Python scripts.
Addition Process (while the cluster is serving requests):
I'm creating a new VM which will act as the new regionserver and configure every needed aspect (hbase, hadoop, /etc/host, connect to private network, etc)
Stopping the **hbase** balancer (the balancer commands are sketched after these steps)
Configuring every node in the cluster with the new node's information: adding the hostname to the regionservers file, adding the hostname to Hadoop's slaves file, adding the hostname and IP to the /etc/hosts file of every node, etc.
Executing on the master node:
`hadoop/bin/start-dfs.sh`
`hadoop/bin/start-mapred.sh`
`hbase/bin/start-hbase.sh`
(I've also tried running `hbase start regionserver` on the newly added node; it does exactly the same as the last command - starts the regionserver.)
Once the newly added node is up and running, I'm executing the **hadoop** load balancer.
When the hadoop load balancer finishes, I start the **hbase** load balancer again.
I'm connecting over ssh to the master node and checking that the load balancers (hbase/hadoop) did their job: both the blocks and the regions are uniformly spread across all the regionservers/slaves, including the new one.
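The balancer toggling in the steps above can be scripted roughly like this (a sketch; syntax assumed for HBase 0.94 / Hadoop 1.0, with the same threshold mentioned further down):
# disable the HBase region balancer before adding the node
echo "balance_switch false" | hbase shell
# ... add and start the new regionserver ...
# rebalance HDFS blocks, then re-enable the HBase balancer
hadoop balancer -threshold 2
echo "balance_switch true" | hbase shell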
But when I run status 'simple' in the hbase shell I see that the new regionservers are not getting any requests. (below is the output of the command after adding 2 new regionserver "okeanos-nodes-4/5")
hbase(main):008:0> status 'simple'
5 live servers
okeanos-nodes-1:60020 1380865800330
requestsPerSecond=5379, numberOfOnlineRegions=4, usedHeapMB=175, maxHeapMB=3067
okeanos-nodes-2:60020 1380865800738
requestsPerSecond=5674, numberOfOnlineRegions=4, usedHeapMB=161, maxHeapMB=3067
okeanos-nodes-5:60020 1380867725605
requestsPerSecond=0, numberOfOnlineRegions=3, usedHeapMB=27, maxHeapMB=3067
okeanos-nodes-3:60020 1380865800162
requestsPerSecond=3871, numberOfOnlineRegions=5, usedHeapMB=162, maxHeapMB=3067
okeanos-nodes-4:60020 1380866702216
requestsPerSecond=0, numberOfOnlineRegions=3, usedHeapMB=29, maxHeapMB=3067
0 dead servers
Aggregate load: 14924, regions: 19
The fact that they don't serve any requests is also evidenced by CPU usage: on a serving regionserver it is about 70%, while on these 2 regionservers it is about 2%.
Below is the output of hadoop dfsadmin -report; as you can see, the blocks are evenly distributed (according to hadoop balancer -threshold 2).
root@okeanos-nodes-master:~# /opt/hadoop-1.0.4/bin/hadoop dfsadmin -report
Configured Capacity: 105701683200 (98.44 GB)
Present Capacity: 86440648704 (80.5 GB)
DFS Remaining: 84188446720 (78.41 GB)
DFS Used: 2252201984 (2.1 GB)
DFS Used%: 2.61%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Datanodes available: 5 (5 total, 0 dead)
Name: 10.0.0.11:50010
Decommission Status : Normal
Configured Capacity: 21140336640 (19.69 GB)
DFS Used: 309166080 (294.84 MB)
Non DFS Used: 3851579392 (3.59 GB)
DFS Remaining: 16979591168(15.81 GB)
DFS Used%: 1.46%
DFS Remaining%: 80.32%
Last contact: Fri Oct 04 11:30:31 EEST 2013
Name: 10.0.0.3:50010
Decommission Status : Normal
Configured Capacity: 21140336640 (19.69 GB)
DFS Used: 531652608 (507.02 MB)
Non DFS Used: 3852300288 (3.59 GB)
DFS Remaining: 16756383744(15.61 GB)
DFS Used%: 2.51%
DFS Remaining%: 79.26%
Last contact: Fri Oct 04 11:30:32 EEST 2013
Name: 10.0.0.5:50010
Decommission Status : Normal
Configured Capacity: 21140336640 (19.69 GB)
DFS Used: 502910976 (479.61 MB)
Non DFS Used: 3853029376 (3.59 GB)
DFS Remaining: 16784396288(15.63 GB)
DFS Used%: 2.38%
DFS Remaining%: 79.4%
Last contact: Fri Oct 04 11:30:32 EEST 2013
Name: 10.0.0.4:50010
Decommission Status : Normal
Configured Capacity: 21140336640 (19.69 GB)
DFS Used: 421974016 (402.43 MB)
Non DFS Used: 3852365824 (3.59 GB)
DFS Remaining: 16865996800(15.71 GB)
DFS Used%: 2%
DFS Remaining%: 79.78%
Last contact: Fri Oct 04 11:30:29 EEST 2013
Name: 10.0.0.10:50010
Decommission Status : Normal
Configured Capacity: 21140336640 (19.69 GB)
DFS Used: 486498304 (463.96 MB)
Non DFS Used: 3851759616 (3.59 GB)
DFS Remaining: 16802078720(15.65 GB)
DFS Used%: 2.3%
DFS Remaining%: 79.48%
Last contact: Fri Oct 04 11:30:29 EEST 2013
I've tried stopping YCSB, restarting the HBase master, and restarting YCSB, but with no luck: these 2 nodes don't serve any requests!
As there are many log and conf files, I have created a zip file with logs and confs (both hbase and hadoop) of the master, a healthy regionserver serving requests and a regionserver not serving requests.
https://dl.dropboxusercontent.com/u/13480502/hbase_hadoop_logs__conf.zip
Thank you in advance!!
I found what was going on, and it had nothing to do with HBase... I had forgotten to add the hostname and IP of the new RS to the YCSB server VM (/etc/hosts file).... :-(
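For anyone hitting the same thing, the fix is just the usual /etc/hosts entries on the client machine, e.g. (the IP addresses below are placeholders):
# /etc/hosts on the YCSB client VM
10.0.0.12   okeanos-nodes-4
10.0.0.13   okeanos-nodes-5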
