How to remove all files inside a Hadoop directory at once? - hadoop

I want to remove all the files contained in a Hadoop directory, without removing the directory itself. I've tried using rm -r,
but it removed the whole directory.

Include a wildcard character * after the folder whose contents you want to delete, so the parent folder itself is not removed. See the example below:
hdfs dfs -rm -r '/home/user/folder/*'

Referring to the previous answer: you need to quote the asterisk so your local shell does not expand it:
hdfs dfs -rm -r "/home/user/folder/*"

Use an hdfs command to delete all the files inside it. For example, if your Hadoop path is /user/your_user_name, append an asterisk to delete all files inside that folder:
hdfs dfs -rm -r '/user/your_user_name/*'
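One more thing worth knowing: if HDFS trash is enabled, -rm only moves the files into the .Trash directory, so they keep consuming space and quota until trash is emptied. The -rm command accepts a -skipTrash flag to delete permanently in one step:
# Assuming trash is enabled and you want the space back immediately:
hdfs dfs -rm -r -skipTrash '/user/your_user_name/*'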

Related

Hadoop: delete all files ending with a name in a folder

How can I delete all files ending with a specific name on HDFS? I'm trying to type hadoop fs -rm -R /path/*<end_of_file_name>, where * is passed as a wildcard, but I receive an error saying no such file or directory can be found.
The asterisk is expanded by your local shell; you need to quote the argument so that the full glob is passed through to the NameNode:
hadoop fs -rm "/path/*end"

How to delete all the directories with a word in their names in HDFS? (using the command line)

I want to delete all the directories in HDFS with a specific word in their names. Note that the directories are in different locations under a common directory. Is there a way to do this?
I have tried the following but it didn't work:
hdfs dfs -rm -r /user/myUser/*toFind*
The output was:
rm: `/user/myUser/toFind': No such file or directory
This command works fine for me; my cluster is CDH 5.6.0 with Apache Hadoop 2.6.0.
Are you sure there is a file or directory whose name contains "toFind"?
Besides, the glob only matches direct children; it does not recurse into subdirectories.
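Since the glob in the question only matches direct children of /user/myUser, here is a sketch that walks the whole tree instead (assuming /user/myUser is the common parent and "toFind" is the word):
# List everything recursively, keep directories whose path contains "toFind",
# and delete each match. A match nested inside an already-deleted match will
# just report "No such file or directory", which can be ignored.
hdfs dfs -ls -R /user/myUser | awk '$1 ~ /^d/ && $NF ~ /toFind/ {print $NF}' | while read -r dir; do
  hdfs dfs -rm -r "$dir"
done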

HDFS Directory with '.' in the name

I accidentally created a directory in HDFS named 'again.' and I am trying to delete it. I have tried everything I can think of, without success. I tried 'hdfs dfs -rm -r /user/[username]/*' and 'hdfs dfs -rm -r '/user/[username]/again.''. None of these worked! Even the first, which deleted every directory except the one I wanted to delete.
Hadoop 2.7.3
Any thoughts?
You could try the ? wildcard, which matches any single character:
hdfs dfs -rm -r /user/[username]/again?
That could theoretically match other files too, but if you have only one matching file it should work tolerably well.
Try escaping the dot:
hdfs dfs -rm -r "/user/[username]/again\."
Note: if you have Hue, doing this through the Hue file browser will make life easier.
None of the responses worked, but thank you all for responding. I just dropped the entire directory structure and refreshed the environment from an existing instance.
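For future readers, one more shell-level sketch (untried in this thread; it assumes the obstacle was shell quoting or globbing rather than HDFS itself): take the exact path from a listing and feed it verbatim to -rm:
# Grab the literal path ending in "again." from the parent listing
# and delete it without any shell re-interpretation:
hdfs dfs -ls /user/[username] | awk '$NF ~ /again\.$/ {print $NF}' | xargs -I{} hdfs dfs -rm -r {}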

Not able to create subdir under dir in hdfs

I am able to create a directory using the command below, but not a subdirectory under the already created directory. May I know what the reason could be? I have set up HDFS on my Mac in pseudo-distributed mode and am trying to create these directories. Any help would be appreciated.
hadoop fs -mkdir /test/subdir
The above command doesn't create the subdirectory; however, the command below does create a directory.
hadoop fs -mkdir test
To create nested subdirectories under a parent that doesn't exist yet, you have to provide the -p option; otherwise you can only create one directory level at a time. (Your hadoop fs -mkdir test succeeded because a relative path resolves under your HDFS home directory, /user/<username>, which already exists, whereas /test did not.)
hdfs dfs -mkdir -p /test/subdir
will work in your case.
Try giving it the parent creation flag.
hadoop fs -mkdir -p /test/subdir
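To make the failure mode concrete, a quick sequence (assuming /test does not exist yet):
hadoop fs -mkdir /test/subdir      # fails with "No such file or directory" because /test is missing
hadoop fs -mkdir -p /test/subdir   # creates /test and /test/subdir in one go
hadoop fs -ls /test                # should now list subdir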

Reorganize files in HDFS

I need to move files written by a Hive job that look like this:
/foo/0000_0
/foo/0000_1
/bar/0000_0
into a file structure that looks like this:
/foo/prefix1/prefix2-0000_0
/foo/prefix1/prefix2-0000_1
/bar/prefix1/prefix2-0000_0
before migrating this out of the cluster (using s3distcp). I've been looking through hadoop fs but can't find anything that would let me do this. I don't want to rename the files one by one.
First, create the subdirectory inside /foo with the following command:
hdfs dfs -mkdir /foo/prefix1
This creates a subdirectory in /foo. To create further subdirectories inside prefix1, repeat the same command with the updated path (or pass -p to create all levels at once). If you are using an older version of Hadoop (1.x), replace hdfs with hadoop.
Now you can move files from /foo to /foo/prefix1 with the following command, where newfilename is whatever name you want to give the file:
hdfs dfs -mv /foo/filename /foo/prefix1/newfilename
Hope this answers your query.
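Since the question asks to avoid renaming file by file, here is a sketch that loops over the part files in one pass (assuming they all match the 0000_<digit> pattern and the prefix1/prefix2 names from the question):
# Create the target directory, then move each part file under it
# with the new "prefix2-" name prepended:
hdfs dfs -mkdir -p /foo/prefix1
hdfs dfs -ls /foo | awk '$NF ~ /\/0000_[0-9]+$/ {print $NF}' | while read -r f; do
  hdfs dfs -mv "$f" "/foo/prefix1/prefix2-$(basename "$f")"
done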
