How to get absolute path for directory in hadoop

I have created a directory in Hadoop and copied a file to that directory.
Now I want to create an external Hive table that will refer to the file created above.
Is there a way to find out the root directory under which the prvys directory was created?

By default, hadoop fs -ls will look at /user/$(whoami)
If you echo that path and then -ls it, you should find the prvys directory there, e.g. hdfs:///user/liftadmin/.
If you're using Kerberos, the user directory depends on the ticket you initialized the session with.
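For example, a quick way to resolve the absolute path (assuming the directory was created with a relative path, so it landed under the default /user prefix):

echo /user/$(whoami)              # the default home directory prefix is /user
hadoop fs -ls /user/$(whoami)     # the prvys directory should show up here
klist                             # with Kerberos, shows which principal is active

The absolute path printed by -ls is what you would point the external Hive table's LOCATION clause at.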

Related

Hadoop/HDFS: put command fails - No such file or directory

I do not know why I cannot move a file from one directory to another. I can view the content of the file but I cannot move the same file into another directory.
WORKS FINE:
hadoop fs -cat /user/hadoopusr/project-data.txt
DOES NOT WORK:
hadoop fs -put /user/hadoopusr/project-data.txt /user/hadoopusr/Projects/MarketAnalysis
I got a No such file or directory error message. What is wrong? Please help. Thank you!
As we can read from here about the -put command:
This command is used to copy files from the local file system to the HDFS filesystem. This command is similar to the -copyFromLocal command. This command will not work if the file already exists, unless the -f flag is given to the command. This overwrites the destination if the file already exists before the copy.
Which makes it clear why it doesn't work and throws the No such file or directory message: it can't find any file named project-data.txt in the current directory of your local filesystem.
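To actually use -put, the source has to be a path on the local filesystem. A minimal sketch (the local path /home/hadoopusr/project-data.txt is hypothetical):

hadoop fs -put /home/hadoopusr/project-data.txt /user/hadoopusr/Projects/MarketAnalysis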
You plan on moving a file between directories inside HDFS, so instead of using -put, we can simply use the -mv command as we would in our local filesystem!
Tested it out on my own HDFS as follows:
Create the source and destination directories in HDFS
hadoop fs -mkdir source_dir dest_dir
Create an empty (for the sake of the test) file under the source directory
hadoop fs -touch source_dir/test.txt
Move the empty file to the destination directory
hadoop fs -mv source_dir/test.txt dest_dir/test.txt
(Notice how the /user/username/ part of the path is not needed for either the file or the destination directory, because HDFS resolves relative paths against the home directory you are working in. Also note that you have to write the full destination path, with the name of the file included.)
You can then verify that the empty text file has been moved to the destination directory, either in the HDFS browser or from the terminal.
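For example (a minimal check; exact output will vary):

hadoop fs -ls source_dir    # test.txt is no longer listed here
hadoop fs -ls dest_dir      # shows dest_dir/test.txt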

How do I get the path to my user's home directory in HDFS

I want to see the absolute path to my home directory so that my code can pick up those files and process. But I find myself having to hdfs dfs -ls / and then explore from there until I come across my user's directory.
Effectively I want an hdfs dfs -pwd, but of course this does not exist. If I can get a command to list a file on HDFS that shows its full path, that will also work.
Safe lockdown everyone
$ hdfs getconf -confKey dfs.user.home.dir.prefix
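That returns the home-directory prefix (/user by default). A sketch of an hdfs dfs -pwd substitute, assuming the home directory is simply <prefix>/<username>:

echo "$(hdfs getconf -confKey dfs.user.home.dir.prefix)/$(whoami)"
# e.g. /user/liftadmin

Listing that absolute path with hdfs dfs -ls then shows every entry with its full path.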

hadoop dfs -ls gives list of folders not present in the local file system

I have just installed a standalone cluster on my laptop. On running the hdfs dfs -ls command in a terminal, I get to see a list of folders. Upon searching the local file system through the File Explorer window I couldn't locate those files in my file system.
rishirich@localhost:/$ hdfs dfs -ls
Found 1 items
drwxr-xr-x - rishirich supergroup 0 2017-11-09 03:32 user
This folder named 'user' was nowhere to be seen on the local filesystem. Is it that the folder is hidden?
If so, then what terminal command should I use in order to find this folder?
If not, then how do I locate it?
You can't browse the HDFS directory structure with a graphical file explorer; to view it you have to use the terminal:
hdfs dfs -ls /
To see the local file directory structure in the terminal, use:
ls <path>
cd <path>
cd is used to change the directory in the terminal.
When you installed Hadoop, you set up a core-site.xml file to establish the fs.defaultFS property. If you did not set this to file://, it will not be the local filesystem.
If you set it to hdfs://, then the default locations for the namenode and datanode directories are in your local /tmp folder.
Note - those are HDFS blocks, not whole, readable files stored in HDFS.
If you want to list your local filesystem, you're welcome to use hadoop fs -ls file:///
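To confirm which filesystem the shell commands default to, and to compare the two views (a minimal check, assuming the standard configuration keys):

hdfs getconf -confKey fs.defaultFS    # e.g. hdfs://localhost:9000
hadoop fs -ls /                       # HDFS root
hadoop fs -ls file:///                # local filesystem root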

Can't put file from local directory to HDFS

I have created a file named "file.txt" in the local directory; now I want to put it in HDFS by using:
]$ hadoop fs -put file.txt abcd
I am getting a response like
put: 'abcd': no such file or directory
I have never worked on Linux. Please help me out - How do I put the file "file.txt" into HDFS?
If you don't specify an absolute path in Hadoop (HDFS or whatever other file system is used), it will prepend your user directory to create an absolute path.
By default, in HDFS your default folder should be /user/<user name>.
Then in your case you are trying to create the file /user/<user name>/abcd and put the content of your local file.txt inside it.
The user name is your operating system user on your local machine. You can get it using the whoami command.
The problem is that your user folder doesn't exist in HDFS, and you need to create it.
BTW, according to the Hadoop documentation, the correct command to work with HDFS is hdfs dfs instead of hadoop fs (https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html). But for now both should work.
Then:
If you don't know your user name in your local operating system, open a terminal and run the whoami command.
Execute the following command, replacing <user name> with your user name:
hdfs dfs -mkdir -p /user/<user name>
And then you should be able to execute your PUT command.
NOTE: The -p parameter is to create the /user folder if it doesn't exist.
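Putting it together (assuming whoami prints hadoopusr; replace it with your own user name):

hdfs dfs -mkdir -p /user/hadoopusr    # creates /user as well if it doesn't exist
hdfs dfs -put file.txt abcd           # copies local file.txt to /user/hadoopusr/abcd
hdfs dfs -ls /user/hadoopusr          # verify the file is there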

How to view the hadoop data directory structure?

I have a partitioned table in Hive, so I want to see its directory structure in HDFS.
From the documentation, I found the following command:
hadoop fs -ls /app/hadoop/tmp/dfs/data/
and /app/hadoop/tmp/dfs/data/ is my data path. But this command returns:
ls: Cannot access /app/hadoop/tmp/dfs/data/: No such file or directory
Am I missing something there?
Unless I'm mistaken, it seems you are looking for a temporary directory that you probably defined in the property hadoop.tmp.dir. This is a local directory, but when you do hadoop fs -ls you are looking at what files are available in HDFS, so you won't see anything there.
Since you're looking for the Hive directories, you want the following property in your hive-site.xml:
hive.metastore.warehouse.dir
The default is /user/hive/warehouse, so if you haven't changed this property you should be able to do:
hadoop fs -ls /user/hive/warehouse
And this should show you your table directories.
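For a partitioned table, each partition shows up as its own subdirectory. A sketch, assuming the default warehouse location and a hypothetical table my_table partitioned by a dt column:

hadoop fs -ls /user/hive/warehouse/my_table
# drwxr-xr-x ... /user/hive/warehouse/my_table/dt=2017-11-09
# drwxr-xr-x ... /user/hive/warehouse/my_table/dt=2017-11-10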
Check whether the tmp directory is correctly set in your core-site.xml and hdfs-site.xml files.
If it is not set, the operating system's temporary directory (/tmp on Ubuntu, %temp% on Windows) will be used as the Hadoop tmp folder, due to which you may lose your data after restarting your computer. Set hadoop.tmp.dir in both XML files and restart your cluster; it should work fine then.
If it is still not resolved even after this, please give more details about the partitioned-table code and the table data too.
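To check where the temporary directory currently points (assuming the standard hadoop.tmp.dir key):

hdfs getconf -confKey hadoop.tmp.dir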
