Navigate file system in Hadoop

When I run hadoop fs -ls I get:
drwxr-xr-x - chiki supergroup 0 2019-01-14 17:03 Party_output
drwxr-xr-x - chiki supergroup 0 2018-01-22 18:25 party_uploads
but when I try to access the directory with
hadoop fs -ls /Party_output
it shows the output
`/Party_output': No such file or directory

That's because hadoop fs -ls with no path argument shows the contents of your HDFS home directory (by default /user/chiki).
You need to run hadoop fs -ls Party_output to see inside that directory, because it lives under your home directory and not at the root as /Party_output.
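To make the relative-versus-absolute distinction concrete, here is a minimal sketch; /user/chiki is the usual HDFS default home location and is an assumption about this cluster:
hadoop fs -ls                           # lists your HDFS home directory
hadoop fs -ls Party_output              # relative path, resolved against your home directory
hadoop fs -ls /user/chiki/Party_output  # the equivalent absolute path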

Related

hdfs command: ls cannot find files in the HDFS namespace after implementing Hadoop federation

After implementing Hadoop federation, the command below works fine.
> hdfs dfs -ls /
-r-xr-xr-x - hdfs hadoop 0 2016-11-02 00:13 /home
-r-xr-xr-x - hdfs hadoop 0 2016-11-02 00:13 /projects
-r-xr-xr-x - hdfs hadoop 0 2016-11-02 00:13 /user
But when I run the command below
> hdfs dfs -ls /home
ls: `/home': No such file or directory
What is the reason? Any help would be appreciated.
The particular user doesn't have access to /home.
Try running the command as a user that does, or change the permissions on the /home path.
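A minimal sketch of both options; the hdfs superuser account name is an assumption about this cluster:
sudo -u hdfs hdfs dfs -ls /home          # retry as the HDFS superuser
sudo -u hdfs hdfs dfs -chmod 755 /home   # or open up the permissions on the path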

Confusion on HDFS 'pwd' equivalents

First, I have read this post: Is there an equivalent to `pwd` in hdfs?. It says there is no such 'pwd' in HDFS.
However, as I progressed with the instructions of Hadoop: Setting up a Single Node Cluster, I failed on this command:
$ bin/hdfs dfs -put etc/hadoop input
put: 'input': No such file or directory
It's weird that I succeeded with this command the first time I went through the instructions, but failed the second time. It's also weird that I succeeded with this command on my friend's computer, which has the same system (Ubuntu 14.04) and Hadoop version (2.7.1) as mine.
Can anyone explain what happened here? Is there some 'pwd' in HDFS after all?
Firstly, you are trying to run the command $ bin/hdfs dfs -put etc/hadoop input as a user that has no home directory in HDFS.
Let me explain clearly with the following example in the HDP VM.
[root@sandbox hadoop-hdfs-client]# bin/hdfs dfs -put /etc/hadoop input
put: `input': No such file or directory
Here I executed the command as the root user, which has no home directory in HDFS on the HDP VM. Use the following command to list the existing user home directories:
[root@sandbox hadoop-hdfs-client]# hadoop fs -ls /user
Found 8 items
drwxrwx--- - ambari-qa hdfs 0 2015-08-20 08:33 /user/ambari-qa
drwxr-xr-x - guest guest 0 2015-08-20 08:47 /user/guest
drwxr-xr-x - hcat hdfs 0 2015-08-20 08:36 /user/hcat
drwx------ - hive hdfs 0 2015-09-04 09:52 /user/hive
drwxr-xr-x - hue hue 0 2015-08-20 09:05 /user/hue
drwxrwxr-x - oozie hdfs 0 2015-08-20 08:37 /user/oozie
drwxr-xr-x - solr hdfs 0 2015-08-20 08:41 /user/solr
drwxrwxr-x - spark hdfs 0 2015-08-20 08:34 /user/spark
In HDFS, if you copy a file without giving an absolute path for the destination argument, the destination is resolved against the home directory of the logged-in user (/user/<username>) and the file is placed there. Here /user/root was not found.
Now let's switch to the hive user and test:
[root@sandbox hadoop-hdfs-client]# su hive
[hive@sandbox hadoop-hdfs-client]$ bin/hdfs dfs -put /etc/hadoop input
[hive@sandbox hadoop-hdfs-client]$ hadoop fs -ls /user/hive
Found 1 items
drwxr-xr-x - hive hdfs 0 2015-09-04 10:07 /user/hive/input
Yay, successfully copied!
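An alternative to switching users is to create a home directory for root; a sketch, assuming hdfs is the superuser account on this VM:
[root@sandbox hadoop-hdfs-client]# su hdfs -c 'hdfs dfs -mkdir -p /user/root'
[root@sandbox hadoop-hdfs-client]# su hdfs -c 'hdfs dfs -chown root /user/root'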
Hope it helps!
It means that you need to move the input files into HDFS first.
Suppose you have an input file named input.txt that needs to go to HDFS; then follow the commands below.
Command: hdfs dfs -put /input_location /hdfs_location
If there is no specific target directory in HDFS:
hdfs dfs -put /home/Desktop/input.txt /
If there is a specific target directory in HDFS (note: we need to create the directory before proceeding; see the sketch after the next command):
hdfs dfs -put /home/Desktop/input.txt /MR_input
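A minimal sketch of that directory-creation step; /MR_input is just the example name used above:
hdfs dfs -mkdir /MR_input
hdfs dfs -put /home/Desktop/input.txt /MR_input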
After that you can run the examples:
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount /input /output
Here /input and /output are paths that should be in HDFS.
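Once the job finishes, you can inspect the result; the part file name below is the usual MapReduce default and may differ on your setup:
hdfs dfs -cat /output/part-r-00000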
Hope this helps.

hadoop copy a local file system folder to HDFS

I need to copy a folder from the local file system to HDFS. I could not find any example of moving a folder (including all its subfolders) to HDFS.
$ hadoop fs -copyFromLocal /home/ubuntu/Source-Folder-To-Copy HDFS-URI
You could try:
hadoop fs -put /path/in/linux /hdfs/path
or even
hadoop fs -copyFromLocal /path/in/linux /hdfs/path
By default, both put and copyFromLocal upload directories recursively to HDFS.
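For the folder from the question, either form works; the HDFS destination below is illustrative:
hadoop fs -put /home/ubuntu/Source-Folder-To-Copy /user/ubuntu/
hadoop fs -copyFromLocal /home/ubuntu/Source-Folder-To-Copy /user/ubuntu/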
In Short
hdfs dfs -put <localsrc> <dest>
In detail with an example:
Checking source and target before placing files into HDFS
[cloudera@quickstart ~]$ ll files/
total 132
-rwxrwxr-x 1 cloudera cloudera 5387 Nov 14 06:33 cloudera-manager
-rwxrwxr-x 1 cloudera cloudera 9964 Nov 14 06:33 cm_api.py
-rw-rw-r-- 1 cloudera cloudera 664 Nov 14 06:33 derby.log
-rw-rw-r-- 1 cloudera cloudera 53655 Nov 14 06:33 enterprise-deployment.json
-rw-rw-r-- 1 cloudera cloudera 50515 Nov 14 06:33 express-deployment.json
[cloudera@quickstart ~]$ hdfs dfs -ls
Found 1 items
drwxr-xr-x - cloudera cloudera 0 2017-11-14 00:45 .sparkStaging
Copy files to HDFS using the -put or -copyFromLocal command
[cloudera@quickstart ~]$ hdfs dfs -put files/ files
Verify the result in HDFS
[cloudera@quickstart ~]$ hdfs dfs -ls
Found 2 items
drwxr-xr-x - cloudera cloudera 0 2017-11-14 00:45 .sparkStaging
drwxr-xr-x - cloudera cloudera 0 2017-11-14 06:34 files
[cloudera@quickstart ~]$ hdfs dfs -ls files
Found 5 items
-rw-r--r-- 1 cloudera cloudera 5387 2017-11-14 06:34 files/cloudera-manager
-rw-r--r-- 1 cloudera cloudera 9964 2017-11-14 06:34 files/cm_api.py
-rw-r--r-- 1 cloudera cloudera 664 2017-11-14 06:34 files/derby.log
-rw-r--r-- 1 cloudera cloudera 53655 2017-11-14 06:34 files/enterprise-deployment.json
-rw-r--r-- 1 cloudera cloudera 50515 2017-11-14 06:34 files/express-deployment.json
If you copy a folder from local, it will copy the folder with all its subfolders to HDFS.
For copying a folder from local to hdfs, you can use
hadoop fs -put localpath
or
hadoop fs -copyFromLocal localpath
or
hadoop fs -put localpath hdfspath
or
hadoop fs -copyFromLocal localpath hdfspath
Note:
If you do not specify an HDFS path, the folder will be copied to your HDFS home directory under the same name as the local folder.
To copy from hdfs to local
hadoop fs -get hdfspath localpath
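A concrete round trip using the generic forms above; both paths are illustrative:
hadoop fs -put /tmp/mydata /user/me/mydata
hadoop fs -get /user/me/mydata /tmp/mydata-copy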
You can use:
1. Loading data from a local file to HDFS
Syntax: $ hadoop fs -copyFromLocal <local source> <HDFS directory>
EX: $ hadoop fs -copyFromLocal localfile1 HDIR
2. Copying data from HDFS to local
Syntax: $ hadoop fs -copyToLocal <HDFS source> <new file name>
EX: $ hadoop fs -copyToLocal hdfs/filename myunx
To copy a folder from local to HDFS, you can use the command below:
hadoop fs -put /path/localpath /path/hdfspath
or
hadoop fs -copyFromLocal /path/localpath /path/hdfspath
Navigate to your "/install/hadoop/datanode/bin" folder, or whatever path lets you execute your hadoop commands:
To place the files in HDFS:
Format: hadoop fs -put "Local system path"/filename.csv "HDFS destination path"
eg) ./hadoop fs -put /opt/csv/load.csv /user/load
Here /opt/csv/load.csv is the source file path on my local Linux system.
/user/load is the HDFS cluster destination path, i.e. "hdfs://hacluster/user/load".
To get the files from HDFS to local system:
Format: hadoop fs -get "/HDFS source file path" "/local path"
eg) hadoop fs -get /user/load/a.csv /opt/csv/
After executing the above command, a.csv from HDFS will be downloaded to the /opt/csv folder on the local Linux system.
The uploaded files can also be seen through the HDFS NameNode web UI (typically on port 50070 for Hadoop 2.x, or 9870 for Hadoop 3.x).
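You can also verify the upload from the command line; this assumes the same destination path used above:
hadoop fs -ls /user/load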
Use the following commands:
hadoop fs -copyFromLocal <local-nonhdfs-path> <hdfs-target-path>
hadoop fs -copyToLocal <hdfs-input-path> <local-nonhdfs-path>
Or you can use the Hadoop FileSystem API (for example from Spark) to get or put HDFS files.
Hope this is helpful.

Why does my hadoop command not work?

I have my hadoop cluster set up with one master and two slaves.
when I type
hadoop fs -ls
ls: Cannot access .: No such file or directory.
But when I type the following:
hadoop fs -ls /
Found 1 items
drwxr-xr-x - Mike supergroup 0 2014-06-24 00:24 /usr
I get the same output on both the master and the slaves. Why does hadoop fs -ls not work?
Thanks!
hadoop fs -ls
This tries to list the current user's home directory on HDFS. Since the /user/{username} directory doesn't exist in your case, you get the error.
hadoop fs -ls /
Here you are specifically telling it to list the root directory, which it does successfully because that directory exists.
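A sketch of the usual fix, assuming a Hadoop 2.x client and an hdfs superuser account on this cluster:
sudo -u hdfs hadoop fs -mkdir -p /user/Mike
sudo -u hdfs hadoop fs -chown Mike /user/Mike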

HDFS path changing when trying to update files in HDFS

I am new to Hadoop and HDFS, so maybe it is something I am doing wrong when I copy from local (Ubuntu 10.04) to HDFS on a single node on localhost. The initial copy works fine, but when I modify my local input folder and try to copy back to HDFS, the HDFS path changes.
~$ $HADOOP_HOME/bin/hadoop dfs -copyFromLocal /tmp/anagram /user/hduser/anagram
~$ $HADOOP_HOME/bin/hadoop dfs -ls /user/hduser/anagram
Found 1 items
-rw-r--r-- 1 hduser supergroup 4067675 2011-08-29 05:44 /user/hduser/anagram/SINGLE.TXT
After adding another file (COMMON.TXT) to the same local directory, I ran the same copy of the local directory to HDFS, but this time it copied to a different location than the first time (into /user/hduser/anagram/anagram rather than /user/hduser/anagram).
~$ $HADOOP_HOME/bin/hadoop dfs -copyFromLocal /tmp/anagram /user/hduser/anagram
~$ $HADOOP_HOME/bin/hadoop dfs -ls /user/hduser/anagram
Found 2 items
-rw-r--r-- 1 hduser supergroup 4067675 2011-08-29 05:44 /user/hduser/anagram/SINGLE.TXT
drwxr-xr-x - hduser supergroup 0 2011-08-29 05:48 /user/hduser/anagram/anagram
~$ $HADOOP_HOME/bin/hadoop dfs -ls /user/hduser/anagram/anagram
Found 2 items
-rw-r--r-- 1 hduser supergroup 805232 2011-08-29 05:48 /user/hduser/anagram/anagram/COMMON.TXT
-rw-r--r-- 1 hduser supergroup 4067675 2011-08-29 05:48 /user/hduser/anagram/anagram/SINGLE.TXT
Has anyone run into this? I found that to resolve it, you need to remove the first directory and then copy over again:
~$ $HADOOP_HOME/bin/hadoop dfs -rmr /user/hduser/anagram/anagram
Deleted hdfs://localhost:54310/user/hduser/anagram/anagram
~$ $HADOOP_HOME/bin/hadoop dfs -rmr /user/hduser/anagram
Deleted hdfs://localhost:54310/user/hduser/anagram
~$ $HADOOP_HOME/bin/hadoop dfs -copyFromLocal /tmp/anagram /user/hduser/anagram
~$ $HADOOP_HOME/bin/hadoop dfs -ls /user/hduser/anagram
Found 2 items
-rw-r--r-- 1 hduser supergroup 805232 2011-08-29 05:55 /user/hduser/anagram/COMMON.TXT
-rw-r--r-- 1 hduser supergroup 4067675 2011-08-29 05:55 /user/hduser/anagram/SINGLE.TXT
Does anyone know how to do this without having to delete the directory every time?
It seems to me that this is a side effect (check FileUtil.java, static method FileUtil.checkDest(String srcName, FileSystem dstFS, Path dst, boolean overwrite)): when the destination directory already exists, copying a directory onto it nests the source directory inside it rather than merging the contents.
Try this for updating the directory:
hadoop dfs -copyFromLocal /tmp/anagram/*.TXT /user/hduser/anagram
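Note that copying individual files this way can still fail if a file of the same name already exists at the destination. Newer Hadoop releases accept an -f flag to overwrite; a sketch, assuming a Hadoop 2.x-era client:
hdfs dfs -put -f /tmp/anagram/*.TXT /user/hduser/anagram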
