I use Windows 8 with the cloudera-quickstart-vm-5.4.2-0 VirtualBox image.
I downloaded a text file as words.txt into the Downloads folder.
I changed directory to Downloads and ran hadoop fs -copyFromLocal words.txt, but I get a "No such file or directory" error.
Can anyone explain to me why this is happening and how to solve this issue?
Someone told me this error occurs when Hadoop is in safe mode, but I have made sure that safe mode is OFF.
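Safe mode can be checked (and, if needed, left) with the standard dfsadmin command; in my case the status is reported as OFF:

$ hdfs dfsadmin -safemode get
Safe mode is OFF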
It's happening because hdfs:///user/cloudera doesn't exist.
Running hdfs dfs -ls probably gives you a similar error.
Without a specified destination folder, it looks for ., the HDFS home directory of the UNIX account running the command.
You must run hdfs dfs -mkdir "/user/$(whoami)" before your current UNIX account can use HDFS this way, or you can specify some other existing HDFS location to copy to.
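For the Cloudera quickstart VM, a minimal fix would look roughly like the sketch below, assuming you are logged in as the cloudera user and the hdfs superuser is available via sudo:

$ sudo -u hdfs hdfs dfs -mkdir -p /user/cloudera
$ sudo -u hdfs hdfs dfs -chown cloudera:cloudera /user/cloudera
$ hadoop fs -copyFromLocal words.txt

After that, the copy lands in /user/cloudera/words.txt.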
Related
I'm new to Apache Hadoop and I'm trying to copy a simple text file from my local directory to HDFS on Hadoop, which is up and running. However, Hadoop is installed in D: while my file is in C:.
If I use the -put or copyFromLocal command in cmd with the file on the C: drive, it doesn't allow me to do that. However, if I place the text file in the same D: drive, the file is correctly uploaded to Hadoop and can be seen on Hadoop localhost. The command that works when the file and Hadoop are on the same drive is as follows:
hadoop fs -put /test.txt /user/testDirectory
If my file is on a separate drive, I get the error '/test.txt': No such file or directory. I've tried variations such as /C/pathOfFile/test.txt but to no avail, so in short, I need to know how to access a local file on another drive, specifically with respect to the -put command. Any help with this probably amateurish question will be appreciated.
If your current cmd session is in D:\, then your command would look for /test.txt at the root of that drive.
You could try prefixing the path:
file:/C:/test.txt
Otherwise, cd to the path containing your file first, then just -put test.txt or -put .\test.txt
Note: HDFS doesn't know about the difference between C and D unless you actually set fs.defaultFS to be something like file:/D:/hdfs
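Putting the two options together, a session might look something like this (C:\pathOfFile\test.txt is the placeholder path from the question):

D:\> hadoop fs -put file:/C:/test.txt /user/testDirectory

D:\> C:
C:\> cd \pathOfFile
C:\pathOfFile> hadoop fs -put test.txt /user/testDirectory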
From your question I assume that you have installed Hadoop in a Virtual Machine (VM) on a Windows installation. Please provide more details if this assumption is incorrect. The issue is that your VM treats drive D: as the local directory, which is where -put and -copyFromLocal look for files. C: is currently not visible to these commands.
You need to mount drive C: in your VM in order to make its files available to Hadoop as local files. There are guides for this depending on your VM software. Be careful while doing so, in order not to mishandle any Windows installation files.
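As a rough sketch, with VirtualBox this usually means adding C:\ as a shared folder in the VM settings and then mounting it inside the guest. The share name C_DRIVE and mount point /mnt/c below are placeholders, and the Guest Additions must be installed for the vboxsf filesystem to exist:

$ sudo mkdir -p /mnt/c
$ sudo mount -t vboxsf C_DRIVE /mnt/c
$ hadoop fs -put /mnt/c/pathOfFile/test.txt /user/testDirectory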
I am trying to upload a file in HDFS with:
sudo -u hdfs hdfs dfs -put /home/hive/warehouse/sample.csv hdfs://[ip_redacted]:9000/data
I can confirm that HDFS works, as I managed to create the /data directory just fine.
Even giving the full path to the .csv file gives the same error:
put: `/home/hive/warehouse/sample.csv': No such file or directory
Why is it giving this error?
I encountered this problem, too.
The user hdfs had no permission to access one of the file's ancestor directories, so the command reported No such file or directory.
As crystyxn commented, using the environment variable HADOOP_USER_NAME instead of sudo -u hdfs worked.
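In practice that looks something like the sketch below (note that HADOOP_USER_NAME is only honored on clusters without Kerberos authentication; the paths are the ones from the question):

$ export HADOOP_USER_NAME=hdfs
$ hdfs dfs -put /home/hive/warehouse/sample.csv hdfs://[ip_redacted]:9000/data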
Is the csv file on your local system or in HDFS? You can use the -put command (or the -copyFromLocal command) ONLY to move a LOCAL file into the distributed file system.
I have just installed a standalone cluster on my laptop. On running the hdfs dfs -ls command in a terminal, I get to see a list of folders. Upon searching the local file system through the File Explorer window I couldn't locate those files in my file system.
rishirich@localhost:/$ hdfs dfs -ls
Found 1 items
drwxr-xr-x - rishirich supergroup 0 2017-11-09 03:32 user
This folder named 'user' was nowhere to be seen on the local filesystem. Is it that the folder is hidden?
If so, then what terminal command should I use in order to find this folder?
If not, then how do I locate it?
You can't see the HDFS directory structure in a graphical view; to browse it you have to use the terminal:
hdfs dfs -ls /
To see the local directory structure in the terminal, use
ls <path>
cd <path>
cd is used to change the directory in the terminal.
When you installed Hadoop, you set up a core-site.xml file to establish the fs.defaultFS property. If you did not set this to file://, it will not be the local filesystem.
If you set it to hdfs://, then the default locations for the namenode and datanode directories are in your local /tmp folder.
Note - those are HDFS blocks, not whole, readable files stored in HDFS.
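For example, with the default hadoop.tmp.dir of /tmp/hadoop-<username> (an assumption; your core-site.xml or hdfs-site.xml may override it), the metadata and block storage would sit roughly here:

$ ls /tmp/hadoop-$(whoami)/dfs/name/current/   # namenode metadata (fsimage, edit logs)
$ ls /tmp/hadoop-$(whoami)/dfs/data/current/   # datanode block files, not readable as whole files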
If you want to list your local filesystem, you're welcome to use hadoop fs -ls file://
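To make the difference explicit, you can qualify the URI scheme on the command line:

$ hadoop fs -ls file:///   # root of the local filesystem
$ hadoop fs -ls /          # root of whatever fs.defaultFS points at (HDFS, if configured that way)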
I am using Hadoop 2.6.4 and I have files stuck in the 'openforwrite' status. I found a suggested solution: run hdfs debug recoverLease to recover the lease on the HDFS files (a feature added in Hadoop 2.7.0).
But in my Hadoop version (2.6.4) I can't execute the recoverLease command. Are there any ideas for how to fix that?
Thanks a lot.
In some rare cases, files can be stuck in the OPENFORWRITE state in HDFS for longer than the default expiration time. If this happens, the data needs to be moved to a new inode to clear up the OPENFORWRITE status.
Solution
1) Stop all applications writing to HDFS.
2) Move the file temporarily to some other location.
$ hdfs dfs -mv /Path_to_file /tmp/
3) Copy the file back to its original location. This will force a new inode to be created and will clear up the OPENFORWRITE state.
$ hdfs dfs -cp /tmp/Path_to_file /Original/destination
4) Once you have confirmed that the file is working correctly, remove the copied file from the temp location.
$ hdfs dfs -rm /tmp/Path_to_file
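To confirm that the file no longer shows up as open for write, you can run an fsck check on it (the path below is the placeholder from step 3):

$ hdfs fsck /Original/destination -files -openforwrite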
I am a beginner in Hadoop and I have two questions:
1) How do I access files stored in HDFS? Is it the same as using a FileReader from java.io and giving a local path, or is it something else?
2) I have created a folder into which I copied the file to be stored in HDFS and the jar file of the MapReduce program. When I run the following command in any directory
${HADOOP_HOME}/bin/hadoop dfs -ls
it just shows me all the files in the current directory. So does that mean all the files got added without me explicitly adding them?
Yes, it's pretty much the same. Read this post to read files from HDFS.
You should keep in mind that HDFS is different from your local file system. With hadoop dfs you access HDFS, not the local file system. So, hadoop dfs -ls /path/in/HDFS shows you the contents of the /path/in/HDFS directory in HDFS, not the local one. That's why the output is the same no matter where you run the command from.
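For example, these two listings look similar but target completely different file systems (the paths are placeholders):

hadoop dfs -ls /user/hadoop    # contents of the HDFS directory /user/hadoop
ls /home/hadoop                # contents of the local directory /home/hadoop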
If you want to "upload" / "download" files to/from HDFS, you should use the commands:
hadoop dfs -copyFromLocal /local/path /path/in/HDFS and
hadoop dfs -copyToLocal /path/in/HDFS /local/path, respectively.