Hadoop dfs -ls returns list of files in my hadoop/ dir

I've set up a single-node Hadoop configuration running via Cygwin under Win7. After starting Hadoop with bin/start-all.sh I run bin/hadoop dfs -ls, which returns a list of files in my hadoop directory. Then I run bin/hadoop datanode -format and bin/hadoop namenode -format, but -ls still returns the contents of my hadoop directory. As far as I understand, it should return nothing (an empty folder). What am I doing wrong?

Did you edit core-site.xml and mapred-site.xml under the conf folder?
It seems like your Hadoop cluster is in local mode.
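One quick way to test the local-mode suspicion is to look for a default filesystem entry in core-site.xml. This is only a rough sketch: the has_default_fs function is a made-up helper, and it assumes the property is called fs.default.name (older releases) or fs.defaultFS (newer ones). When neither is set, Hadoop falls back to file:///, which would explain why -ls lists local files.

```shell
# Hypothetical helper: succeeds if the given core-site.xml sets a default filesystem.
# Assumes the property is named fs.default.name (older) or fs.defaultFS (newer).
has_default_fs() {
    grep -Eq '<name>[[:space:]]*(fs\.default\.name|fs\.defaultFS)[[:space:]]*</name>' "$1"
}

# Example call; conf/core-site.xml is relative to the Hadoop install directory.
if [ -f conf/core-site.xml ]; then
    if has_default_fs conf/core-site.xml; then
        echo "default filesystem is configured"
    else
        echo "no default filesystem set: Hadoop will use the local filesystem"
    fi
fi
```

The check only confirms that the property is present, not that the NameNode address in it is reachable.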

I know this question is quite old, but the directory structure in Hadoop has changed a bit (as of version 2.5).
The current equivalent of Jeroen's command would be:
hdfs dfs -ls hdfs://localhost:9000/users/smalldata
Also, for information: start-all.sh and stop-all.sh have been deprecated; use start-dfs.sh and start-yarn.sh instead.

I had the same problem and solved it by explicitly specifying the URL to the NameNode.
To list all directories in the root of your hdfs space do the following:
./bin/hadoop dfs -ls hdfs://<ip-of-your-server>:9000/
The documentation says something about a default hdfs point in the configuration, but I cannot find it. If someone knows what they mean please enlighten us.
This is where I got the info: http://hadoop.apache.org/common/docs/r0.20.0/hdfs_shell.html#Overview
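The "default hdfs point" in the documentation is most likely the fs.default.name property in conf/core-site.xml. A minimal sketch of such a file follows, assuming the NameNode listens on port 9000 as in the command above. The print_core_site function and the localhost host name are illustrative assumptions, not part of Hadoop.

```shell
# Hypothetical helper: print a minimal core-site.xml that makes HDFS the default filesystem.
# Assumes the NameNode listens on the given host ($1) and port ($2).
print_core_site() {
    cat <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://$1:$2</value>
  </property>
</configuration>
EOF
}

# Review the output before saving it as conf/core-site.xml.
print_core_site localhost 9000
```

With that property in place, ./bin/hadoop dfs -ls / should resolve against HDFS without the explicit hdfs:// prefix.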

Or you could just do:
Run stop-all.sh.
Remove the DFS data and name directories.
Run hadoop namenode -format.
Run start-all.sh.
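The four steps above could be scripted roughly like this. Be aware that it wipes everything stored in HDFS. The sketch assumes it is run from the Hadoop install directory, and the run and reset_hdfs helpers as well as the NAME_DIR, DATA_DIR and DRY_RUN variables are made up for illustration. The default paths mirror the stock dfs.name.dir and dfs.data.dir locations under /tmp/hadoop-<user>; adjust them if your configuration differs.

```shell
# Sketch of the reset steps. WARNING: this destroys all data stored in HDFS.
# NAME_DIR and DATA_DIR are assumptions; use the values of dfs.name.dir and
# dfs.data.dir from your own configuration.
NAME_DIR="${NAME_DIR:-/tmp/hadoop-$(id -un)/dfs/name}"
DATA_DIR="${DATA_DIR:-/tmp/hadoop-$(id -un)/dfs/data}"
DRY_RUN="${DRY_RUN:-1}"   # defaults to printing the commands only

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reset_hdfs() {
    run bin/stop-all.sh
    run rm -rf "$NAME_DIR" "$DATA_DIR"
    run bin/hadoop namenode -format
    run bin/start-all.sh
}
```

Calling reset_hdfs with the default DRY_RUN=1 only prints the commands; set DRY_RUN=0 once the paths look right.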

Related

error while running any hadoop hdfs file system command

I am very new to Hadoop. I am following the book "Hadoop for Dummies".
I have set up a VM with the following specs:
hadoop version 2.0.6-alpha
bigtop
os centos
The problem is that any HDFS file system command fails with the following error.
Example command: hadoop hdfs dfs -ls
Error: Could not find or load main class hdfs
Please advise.
Regards,
Try running:
hadoop fs -ls
or
hdfs dfs -ls
What do they return?
fs and dfs are the same commands.
Difference between `hadoop dfs` and `hadoop fs`
Remove either hadoop or hdfs and the command should run.

NameNode Does Not Start with start-all.sh

The NameNode does not start when I run start-all.sh after stop-all.sh. If I run hadoop namenode -format and then hadoop-daemon.sh start namenode, everything works, but my data in HDFS is lost.
I do not want to lose data, so hadoop namenode -format is not an acceptable solution for me. How can I start the NameNode with start-all.sh?
Thanks
First of all, stop-all.sh and start-all.sh are deprecated. Use start-dfs.sh and start-yarn.sh instead of start-all.sh, and likewise stop-dfs.sh and stop-yarn.sh instead of stop-all.sh (the script itself already says so).
Secondly, hadoop namenode -format formats your HDFS and should therefore be used only once, at installation time.
By default, Hadoop sets the hadoop.tmp.dir property to a directory under /tmp, where files are deleted after every restart. Set hadoop.tmp.dir in core-site.xml ($HADOOP_HOME/conf in Hadoop 1.x, $HADOOP_HOME/etc/hadoop in 2.x) to a location where files are not routinely deleted. Then run hadoop namenode -format one last time (in Hadoop 2 the current form is hdfs namenode -format; the hadoop namenode form is deprecated as well) and start the daemons.
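As a rough illustration of the hadoop.tmp.dir change, the snippet below prints a property block that can be pasted inside the configuration element of core-site.xml. The print_tmp_dir_property function is a hypothetical helper, and /var/lib/hadoop/tmp is just an example path; any directory that survives reboots and is writable by the Hadoop user should do.

```shell
# Hypothetical helper: print a hadoop.tmp.dir property for core-site.xml.
# The directory passed in is an assumption; pick any location that survives reboots.
print_tmp_dir_property() {
    case "$1" in
        /tmp|/tmp/*) echo "warning: $1 is under /tmp and may be wiped on reboot" >&2 ;;
    esac
    cat <<EOF
  <property>
    <name>hadoop.tmp.dir</name>
    <value>$1</value>
  </property>
EOF
}

# /var/lib/hadoop/tmp is only an example path.
print_tmp_dir_property /var/lib/hadoop/tmp
```

Since the NameNode metadata directory defaults to a path below hadoop.tmp.dir, changing it means the existing metadata has to be moved to the new location or the NameNode formatted again.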
PS: If you can post the log file or the terminal screenshot of the error, it will be easier to help you.
I had written hadoop.temp.dir.
"temp" should be "tmp", i.e. hadoop.tmp.dir.
My only mistake was the extra "e".

Hadoop filesystem reads linux filesystem instead of hdfs?

Something strange is happening: when I read the Hadoop filesystem, it shows me the Linux filesystem instead of the Hadoop one. Is anyone familiar with this issue?
Thanks,
Mika
This will happen if a valid hadoop configuration is not found.
e.g. if you do:
hadoop fs -ls
and no configuration is found at the default location, then you will see the Linux filesystem. You can test this by passing the configuration directory with the --config option right after the "hadoop" command, e.g.
hadoop --config <path-to-conf-files> fs -ls

SafeModeException : Name node is in safe mode

I tried copying files from my local disk to HDFS. At first it gave a SafeModeException. While searching for a solution, I read that the problem does not appear if one executes the same command again. So I tried again, and it did not throw the exception.
hduser@saket:/usr/local/hadoop$ bin/hadoop dfs -copyFromLocal /tmp/gutenberg/ /user/hduser/gutenberg
copyFromLocal: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create directory /user/hduser/gutenberg. Name node is in safe mode.
hduser@saket:/usr/local/hadoop$ bin/hadoop dfs -copyFromLocal /tmp/gutenberg/ /user/hduser/gutenberg
Why is this happening? Should I turn safe mode off with this command?
hadoop dfsadmin -safemode leave
The NameNode stays in safe mode until the configured percentage of blocks has been reported as available by the DataNodes. This threshold is set with the dfs.namenode.safemode.threshold-pct parameter in hdfs-site.xml.
For small or development clusters with very few blocks, it makes sense to set this parameter lower than its default value of 0.999f. Otherwise a single missing block can leave the system stuck in safe mode.
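To see why one missing block matters on a tiny cluster, here is a simplified model of the threshold check. It is not the NameNode's actual code: it simply assumes safe mode ends once the ratio of reported to total blocks reaches the threshold. The would_leave_safemode function and the 0.8 value are hypothetical.

```shell
# Simplified model of the safe mode check, NOT the NameNode's actual code:
# assumes safe mode ends once reported/total reaches the threshold.
# Pass the threshold as a plain number (0.999, without the trailing "f").
would_leave_safemode() {
    awk -v reported="$1" -v total="$2" -v pct="$3" \
        'BEGIN { exit !(total == 0 || reported / total >= pct) }'
}

# 10 blocks, 1 missing, threshold 0.999: the ratio 0.9 is too low.
would_leave_safemode 9 10 0.999 || echo "still in safe mode"
# Same cluster with a lower, hypothetical threshold of 0.8: safe mode can end.
would_leave_safemode 9 10 0.8 && echo "safe mode can end"
```

On a large cluster the same single missing block barely moves the ratio, which is why a strict threshold is less of a problem there.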
Go to the bin directory of your Hadoop installation (on my system /usr/local/hadoop/bin/):
cd /usr/local/hadoop/bin/
Check that it contains a file named hadoop:
hadoopuser@arul-PC:/usr/local/hadoop/bin$ ls
The output will be:
hadoop hadoop-daemons.sh start-all.sh start-jobhistoryserver.sh stop-balancer.sh stop-mapred.sh
hadoop-config.sh rcc start-balancer.sh start-mapred.sh stop-dfs.sh task-controller
hadoop-daemon.sh slaves.sh start-dfs.sh stop-all.sh stop-jobhistoryserver.sh
Then turn safe mode off with the command ./hadoop dfsadmin -safemode leave:
hadoopuser@arul-PC:/usr/local/hadoop/bin$ ./hadoop dfsadmin -safemode leave
You will get the response:
Safe mode is OFF
Note: I created a Hadoop user named hadoopuser.
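Before forcing safe mode off, it may be worth checking the current state first. The sketch below assumes that hadoop dfsadmin -safemode get prints a line containing "Safe mode is ON" or "Safe mode is OFF"; the safemode_state function is a hypothetical helper for reading that text.

```shell
# Hypothetical helper: interpret the text printed by "hadoop dfsadmin -safemode get".
# Assumes the output contains "Safe mode is ON" or "Safe mode is OFF".
safemode_state() {
    case "$1" in
        *"Safe mode is ON"*)  echo on ;;
        *"Safe mode is OFF"*) echo off ;;
        *)                    echo unknown ;;
    esac
}

# Only talk to the cluster when the hadoop command is available.
if command -v hadoop >/dev/null 2>&1; then
    state=$(safemode_state "$(hadoop dfsadmin -safemode get 2>/dev/null)")
    echo "safe mode: $state"
    # "-safemode wait" blocks until the NameNode leaves safe mode on its own,
    # which is usually preferable to forcing it off right after startup.
    # hadoop dfsadmin -safemode wait
fi
```

If the state stays on long after startup, the DataNode logs are a better place to look than -safemode leave.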

running hadoop wordcount example

I'm running a single-node Hadoop cluster.
When I run
hadoop dfs -copyFromLocal <source> <destination>
only one file from the source directory is copied.
And then there is the next source directory.
Furthermore, I get neither an error nor any output when running the wordcount example from hadoop-0.20.2-examples.jar.
Could you please help me?
Just follow this tutorial:
running-a-mapreduce-job by Michael Noll
I'm quite sure your NameNode or DataNode is not up. What are the logs saying?
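A quick way to check whether the daemons are up is the jps tool that ships with the JDK. The following is a minimal sketch, assuming jps lists each daemon by its class name (for example "1234 NameNode"); the missing_daemons function is a made-up helper.

```shell
# Hypothetical helper: report which HDFS daemons are missing from "jps" output.
# Assumes jps lists daemons by class name, e.g. "1234 NameNode".
missing_daemons() {
    for daemon in NameNode DataNode; do
        # -w matches whole words, so SecondaryNameNode does not count as NameNode.
        printf '%s\n' "$1" | grep -qw "$daemon" || echo "$daemon"
    done
}

# Only run the check when jps is available on this machine.
if command -v jps >/dev/null 2>&1; then
    missing=$(missing_daemons "$(jps)")
    [ -z "$missing" ] || echo "not running: $missing"
fi
```

If a daemon is reported as not running, its log file under the Hadoop logs directory usually states why it failed to start.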
