hadoop dfs -ls gives list of folders not present in the local file system - hadoop

I have just installed a standalone cluster on my laptop. On running the hdfs dfs -ls command in a terminal, I get to see a list of folders. Upon searching the local file system through the File Explorer window I couldn't locate those files in my file system.
rishirich#localhost:/$ hdfs dfs -ls
Found 1 items
drwxr-xr-x - rishirich supergroup 0 2017-11-09 03:32 user
This folder named 'user' was nowhere to be seen on the local filesystem. Is it that the folder is hidden?
If so, then what terminal command should I use in order to find this folder?
If not, then how do I locate it?

You can't browse the HDFS directory structure in your local file explorer; to view it, use the terminal:
hdfs dfs -ls /
To see the local directory structure in the terminal, use
ls <path>
cd <path>
where cd changes the current directory.

When you installed Hadoop, you set up a core-site.xml file that defines the fs.defaultFS property. Unless you set it to a file:// URI, the default filesystem is not the local filesystem.
If you set it to an hdfs:// URI, then the default locations for the namenode and datanode directories are in your local /tmp folder.
Note - those directories hold HDFS blocks, not the whole, readable files stored in HDFS.
If you want to list your local filesystem, you're welcome to use hadoop fs -ls file:///
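To check which case applies on a given machine, the configuration can be queried directly. The following is a minimal sketch, assuming the hdfs command is on the PATH; show_hdfs_layout is a hypothetical helper name, not part of Hadoop.

```shell
# Sketch: print the default filesystem and the local storage directories.
# show_hdfs_layout is a hypothetical helper; it assumes `hdfs` is on PATH.
show_hdfs_layout() {
    # Default filesystem URI (a file:// URI means the local filesystem)
    echo "fs.defaultFS = $(hdfs getconf -confKey fs.defaultFS)"
    # Local directories holding NameNode metadata and DataNode blocks
    echo "dfs.namenode.name.dir = $(hdfs getconf -confKey dfs.namenode.name.dir)"
    echo "dfs.datanode.data.dir = $(hdfs getconf -confKey dfs.datanode.data.dir)"
}
```

Calling show_hdfs_layout prints three lines; the two directory values are where the block and metadata files live on the local disk.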

Related

How do I get the path to my user's home directory in HDFS

I want to see the absolute path to my home directory so that my code can pick up those files and process them. But I find myself having to run hdfs dfs -ls / and then explore from there until I come across my user's directory.
Effectively I want an hdfs dfs -pwd, but of course this does not exist. If I can get a command to list a file on HDFS that shows its full path, that will also work.
Safe lockdown everyone
$ hdfs getconf -confKey dfs.user.home.dir.prefix
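That command prints only the prefix under which home directories live. A minimal sketch of turning it into a full path follows, assuming the HDFS user name matches the local account name (the usual case without Kerberos); hdfs_home is a hypothetical helper name.

```shell
# Sketch: build the HDFS home directory path for the current user.
# Assumes the HDFS user name equals the local account name.
hdfs_home() {
    local prefix
    # Prefix under which HDFS home directories are created
    prefix="$(hdfs getconf -confKey dfs.user.home.dir.prefix)" || return 1
    echo "${prefix}/$(whoami)"
}
```

The result can then be passed to other commands, for example hdfs dfs -ls "$(hdfs_home)".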

`No such file or directory` while copying from local filesystem to hadoop

I have installed a local single-node Hadoop on Windows 10 and it apparently works.
Unfortunately, when I try to copy files to Hadoop from the local filesystem, it complains:
λ hadoop fs -copyFromLocal ../my_models/*.model hdfs://localhost/tmp
copyFromLocal: `../my_models/aaa.model': No such file or directory
copyFromLocal: `../my_models/bbb.model': No such file or directory
copyFromLocal: `../my_models/ccc.model': No such file or directory
copyFromLocal: `../my_models/ddd.model': No such file or directory
As you see, it lists all model files in local directory, which proves it sees them. Unfortunately, it doesn't copy them.
At the same time, I can create directories:
λ hadoop fs -mkdir -p hdfs://localhost/tmp/
λ hadoop fs -ls hdfs://localhost/
Found 1 items
drwxr-xr-x - dims supergroup 0 2018-04-22 22:16 hdfs://localhost/tmp
What can be the problem?
You're probably getting this error for one of these reasons:
1) You can't use an asterisk (*) as a wildcard for the files you want to copy. You can only give the path to a file or directory. (In your case this is the likely cause.)
2) The folder you're copying from on the local filesystem is in the root directory, or in some other directory that the HDFS user can't access.
Try changing into the folder that contains your files (with cd) as the HDFS user. If a permission-denied error persists, copy the files to the /tmp folder first.
Why not use a for loop for this, something like below:
for file in aaa.model bbb.model ccc.model; do hadoop fs -copyFromLocal "../my_models/$file" hdfs://localhost/tmp; done
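A variation on that loop lets the shell expand the pattern, so the file names do not have to be typed out. This is a sketch that assumes a POSIX-style shell (for example Git Bash on Windows); copy_models is a hypothetical helper name.

```shell
# Sketch: copy every *.model file from a local directory, one at a time.
# copy_models is a hypothetical helper; it assumes `hadoop` is on PATH.
copy_models() {
    src_dir="$1"
    dest="$2"
    for f in "$src_dir"/*.model; do
        # Skip the literal pattern when no file matches
        [ -e "$f" ] || continue
        hadoop fs -copyFromLocal "$f" "$dest" || echo "failed: $f" >&2
    done
}
```

For the case above it would be called as copy_models ../my_models hdfs://localhost/tmp.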

Hadoop copyFromLocal: '.': No such file or directory

I use Windows 8 with a cloudera-quickstart-vm-5.4.2-0 virtual box.
I downloaded a text file as words.txt into the Downloads folder.
I changed directory to Downloads and used hadoop fs -copyFromLocal words.txt
I get the no such file or directory error.
Can anyone explain to me why this is happening and how to solve this issue?
Someone told me this error occurs when Hadoop is in safe mode, but I have made sure that the safe mode is OFF.
It's happening because hdfs:///user/cloudera doesn't exist.
Running hdfs dfs -ls probably gives you a similar error.
Without a specified destination folder, it looks for ., the current HDFS directory (the home directory) of the UNIX account running the command.
You must run hdfs dfs -mkdir "/user/$(whoami)" before your current UNIX account can use HDFS, or you can specify some other existing HDFS location to copy to.
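The two steps can be combined as in the sketch below. It assumes home directories live under /user and that the current account is allowed to create its own directory there; on some installations the mkdir has to be run as the HDFS superuser instead. put_in_home is a hypothetical helper name.

```shell
# Sketch: make sure the HDFS home directory exists, then copy a file into it.
# Assumes home directories live under /user and that this account may create one.
put_in_home() {
    home="/user/$(whoami)"
    # -p creates missing parents and does not fail if the directory exists
    hdfs dfs -mkdir -p "$home" || return 1
    hdfs dfs -copyFromLocal "$1" "$home/"
}
```

From the Downloads folder, put_in_home words.txt would then copy the file to the home directory in HDFS.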

How files or directories are getting stored in hadoop hdfs

I created a file in HDFS using the command below:
hdfs dfs -touchz /hadoop/dir1/file1.txt
I can see the created file using the command below:
hdfs dfs -ls /hadoop/dir1/
But I could not find the location itself using Linux commands (find or locate). I searched the internet and found the following link:
How to access files in Hadoop HDFS? It says HDFS is virtual storage. In that case, how does it decide which partition to use and how much of it, and where is the metadata stored?
Does it use the datanode location that I specified in hdfs-site.xml as the virtual storage for all the data?
I looked into the datanode location and there are files there, but I could not find anything related to the file or folder I created.
(I am using hadoop 2.6.0)
HDFS is a distributed storage system in which the storage location is virtual and is built from the disk space of all the DataNodes. While installing Hadoop, you must have specified paths for dfs.namenode.name.dir and dfs.datanode.data.dir. These are the locations at which all the HDFS-related files are stored on the individual nodes.
Data stored in HDFS is kept as blocks of a specified size (128 MB by default in Hadoop 2.x). When you use hdfs dfs commands you see complete files, but internally HDFS stores these files as blocks. If you check the paths mentioned above on your local file system, you will see a bunch of files that correspond to files on your HDFS. But again, you will not see them as the actual files, because they are split into blocks.
Check the output of the command below to see how much space from each DataNode is used to make up the virtual HDFS storage.
hdfs dfsadmin -report #Or
sudo -u hdfs hdfs dfsadmin -report
HTH
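To see which blocks back a particular HDFS file, and on which DataNodes they sit, fsck can be used. Note that a file created with -touchz is zero-length, so it has no blocks at all and exists only as metadata on the NameNode, which is one reason nothing matching it appears in the datanode directory. A minimal sketch, with show_blocks as a hypothetical helper name:

```shell
# Sketch: list the blocks of an HDFS file and the DataNodes that hold them.
# show_blocks is a hypothetical helper; it assumes `hdfs` is on PATH.
show_blocks() {
    hdfs fsck "$1" -files -blocks -locations
}
```

For the file above this would be show_blocks /hadoop/dir1/file1.txt. For a non-empty file, the block IDs in the report correspond to the blk_ files under dfs.datanode.data.dir.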
To put a file into HDFS, first create it on the local file system (LFS). Create a directory, for example
mkdir MITHUN90
enter it with cd MITHUN90, and create a new file in it with nano file1.log.
Now create a directory in HDFS, for example hdfs dfs -mkdir /mike90, where "mike90" is the directory name. After creating the directory, send the file from the LFS to HDFS with this command:
hdfs dfs -copyFromLocal /home/gopalkrishna/file1.log /mike90
Here '/home/gopalkrishna/file1.log' is the path of the local file (in the present working directory) and '/mike90' is the directory in HDFS. Running hdfs dfs -ls /mike90 then shows the list of files.

Reading files from hdfs vs local directory

I am a beginner in Hadoop. I have two questions:
1) How do I access files stored in HDFS? Is it the same as using a FileReader from java.io and giving the local path, or is it something else?
2) I have created a folder where I copied the file to be stored in HDFS and the jar file of the MapReduce program. When I run the command in any directory
${HADOOP_HOME}/bin/hadoop dfs -ls
it just shows me all the files in the current dir. So does that mean all the files got added without me explicitly adding it?
Yes, it's pretty much the same. Read this post to read files from HDFS.
You should keep in mind that HDFS is different than your local file system. With hadoop dfs you access the HDFS, not the local file system. So, hadoop dfs -ls /path/in/HDFS shows you the contents of the /path/in/HDFS directory, not the local one. That's why it's the same, no matter where you run it from.
If you want to "upload" / "download" files to/from HDFS you should use the commands:
hadoop dfs -copyFromLocal /local/path /path/in/HDFS and
hadoop dfs -copyToLocal /path/in/HDFS /local/path, respectively.
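For a quick look at a file's contents without copying it to the local disk first, the file can be streamed to standard output. A minimal sketch, assuming the hdfs command is on the PATH; hdfs_head is a hypothetical helper name and /path/in/HDFS is the placeholder used above.

```shell
# Sketch: print the first lines of a file stored in HDFS.
# hdfs_head is a hypothetical helper; the second argument defaults to 10 lines.
hdfs_head() {
    path="$1"
    n="${2:-10}"
    # -cat streams the file contents to standard output
    hdfs dfs -cat "$path" | head -n "$n"
}
```

For example, hdfs_head /path/in/HDFS/file.txt 5 prints the first five lines of that file.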

Resources