Bash: find | sed | xargs rm not working, but rm does - bash

I'm trying to remove all .js and .js.map files from any sub-directory of src called __tests__.
$ find . -path './src/**' -name __tests__ | # find subdirectories
> sed -E 's/([^ ]+__tests__)/\1\/*.js \1\/*.js.map/g' | # for each subdirectory, concat *.js and *.js.map
> xargs rm # remove files
This fails with the following errors:
rm: cannot remove './src/game/__tests__/*.js': No such file or directory
rm: cannot remove './src/game/__tests__/*.js.map': No such file or directory
rm: cannot remove './src/helpers/__tests__/*.js': No such file or directory
rm: cannot remove './src/helpers/__tests__/*.js.map': No such file or directory
However, if I change my xargs rm to xargs echo rm, copy and paste the output, and run it, it works.
$ find . -path './src/**' -name __tests__ | sed -E 's/([^ ]+__tests__)/\1\/*.js \1\/*.js.map/g' |
> xargs echo rm # echo command to remove files
rm ./src/game/__tests__/*.js ./src/game/__tests__/*.js.map ./src/helpers/__tests__/*.js ./src/helpers/__tests__/*.js.map
$ rm ./src/game/__tests__/*.js ./src/game/__tests__/*.js.map ./src/helpers/__tests__/*.js ./src/helpers/__tests__/*.js.map
Wrapping the output of my echo in $(...) and prepending rm results in the same error as before.
$ rm $(find . -path './src/**' -name __tests__ | sed -E 's/([^ ]+__tests__)/\1\/*.js \1\/*.js.map/g' | xargs echo rm)
rm: cannot remove './src/game/__tests__/*.js': No such file or directory
rm: cannot remove './src/game/__tests__/*.js.map': No such file or directory
rm: cannot remove './src/helpers/__tests__/*.js': No such file or directory
rm: cannot remove './src/helpers/__tests__/*.js.map': No such file or directory
What am I doing wrong?
I doubt it matters, but I'm using GitBash on Windows.

First, to explain the issue: In find | sed | xargs rm, the shell only sets up communication between those programs, but it doesn't actually process the results in any way. That's a problem here because *.js needs to be expanded by a shell to replace it with a list of filenames; rm treats every argument it's given as a literal name. (This is unlike Windows, where programs do their own command-line parsing and glob expansion).
Arguably, you don't need find here at all. Consider:
shopt -s globstar # enable ** as a recursion operator
rm ./src/**/__tests__/*.js{,.map} # delete *.js and *.js.map in any __tests__ directory under src
...or, if you do want to use find, let it do the work of coming up with a list of individual files matching *.js, instead of leaving that work to happen later:
find src -regextype posix-egrep -regex '.*/__tests__/[^/]*[.]js([.]map)?' -delete
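A minimal sketch of the difference, assuming a throwaway directory created with mktemp and made-up file names (none of this comes from the question): the same pattern is first handed to rm through xargs, which passes it along literally, and then expanded by the shell itself.

```shell
# Scratch directory so nothing real is touched (hypothetical layout).
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir -p src/game/__tests__
touch src/game/__tests__/a.js src/game/__tests__/a.js.map

pattern='src/game/__tests__/*.js'

# xargs passes the pattern to rm verbatim; no shell ever expands the *.
literal_status=0
printf '%s\n' "$pattern" | xargs rm 2>/dev/null || literal_status=$?

# Left unquoted, the shell expands the glob before rm is started.
rm -- $pattern
```

After this runs, literal_status is non-zero (rm looked for a file literally named *.js), a.js is gone, and a.js.map is still there because the pattern only covered *.js.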

You need to have your globs (*) expanded. File name expansion is performed by the shell on UNIX, not by rm or other programs. Try:
.... | xargs -d $'\n' sh -c 'IFS=; for f; do rm -- $f; done' sh
...to explain this:
The -d $'\n' ensures that xargs splits only on newlines (not spaces!), and also stops it from treating backslashes and quotes as special.
sh -c '...' sh runs ... as a script, with sh as $0, and subsequent arguments in $1, etc; for f; will thus iterate over those arguments.
Clearing IFS with IFS= prevents string-splitting from happening when $f is used unquoted, so only glob expansion happens.
Using the -- argument to rm ensures that it treats subsequent arguments as filenames, not options, even if they start with dashes.
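To see the IFS= part in isolation, here is a small sketch that skips xargs and passes one pattern directly to the inner shell. The scratch directory and file names are assumptions made up for the demonstration; the directory name deliberately contains a space.

```shell
# Scratch directory with a space in one path component (hypothetical names).
demo=$(mktemp -d)
mkdir "$demo/my tests"
touch "$demo/my tests/a.js" "$demo/my tests/b.js" "$demo/my tests/keep.txt"

# The pattern arrives as ONE argument containing both a space and a glob.
# With IFS empty, the unquoted $f is not word-split, only glob-expanded.
sh -c 'IFS=; for f; do rm -- $f; done' sh "$demo/my tests/*.js"

left=$(ls "$demo/my tests")
```

Only keep.txt should remain. Without IFS=, the inner shell would split the argument at the space and rm would be given two names that do not exist.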
That said, if a very large number of files match a single pattern, you might still run into an "argument list too long" error even though you are using xargs, because each glob is expanded inside the inner shell rather than batched by xargs.
Another caveat is that filenames containing newlines can potentially be split into multiple names (depending on the details of the version of find you're using). A way to solve this that will work with all POSIX-compliant versions of find might be:
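find ./src -type d -name __tests__ -exec sh -c '
for d; do
  # brace expansion is not POSIX sh, so both patterns are spelled out;
  # -f keeps rm quiet when a pattern matches nothing
  rm -f -- "$d"/*.js "$d"/*.js.map
done
' sh {} +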

Related

Remove all except one file in bash on mac

bash on mac, installed by brew
λ brew list | grep bash
bash
λ which bash
/usr/local/bin/bash
λ rm !("shorturl.api")
-bash: !: event not found
λ ls -1 | grep -v shorturl.api | xargs rm
rm: cannot remove ''$'\033''[0m'$'\033''[01;32mapi'$'\033''[0m': No such file or directory
rm: cannot remove ''$'\033''[01;34metc'$'\033''[0m': No such file or directory
rm: cannot remove ''$'\033''[01;34minternal'$'\033''[0m': No such file or directory
rm: cannot remove ''$'\033''[00mshorturl.go'$'\033''[0m': No such file or directory
The !(pattern-list) globbing pattern only works when extended globbing is enabled. See the extglob section in glob - Greg's Wiki. In this case you need:
shopt -s extglob
rm -- !(shorturl.api)
The -- with rm is to prevent files whose names begin with - being treated as options.
One way to do it without extended globbing is:
find . -maxdepth 1 -type f ! -name shorturl.api -delete
The ls -1 | grep -v shorturl.api | xargs rm attempt in the question is broken in several ways, including:
The output of ls is intended for reading by humans. It is not suitable for automatic processing. See Why you shouldn't parse the output of ls(1).
The grep -v shorturl.api will exclude files other than the intended one. For instance, old-shorturl.api would be excluded.
xargs by default uses spaces and newlines to split its input into arguments. xargs rm won't delete files that have such characters in their names.
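A small sketch of the substring problem and of the find alternative, run in a scratch directory. The extra file names (old-shorturl.api, my notes.txt) are made up for illustration and are not from the question.

```shell
# Scratch directory (hypothetical contents).
demo=$(mktemp -d)
cd "$demo" || exit 1
touch shorturl.api old-shorturl.api 'my notes.txt' shorturl.go
mkdir etc

# grep -v matches substrings, so old-shorturl.api is filtered out as well
# and would silently survive an "ls | grep -v | xargs rm" pipeline.
if printf '%s\n' * | grep -v shorturl.api | grep -q old-shorturl.api; then
  overmatch=no
else
  overmatch=yes
fi

# find compares the whole name, and never splits names on whitespace.
find . -maxdepth 1 -type f ! -name shorturl.api -delete
```

Afterwards only shorturl.api and the etc directory should be left; the file with a space in its name is removed without any special handling.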
Thanks, @GordonDavisson.
Use ls --color=never before the pipeline to xargs, so that color escape codes do not end up in the file names:
ls -1 --color=never | grep -v shorturl.api | xargs rm -rf

How to remove files using grep and rm?

grep -n magenta *| rm *
grep: a.txt: No such file or directory
grep: b: No such file or directory
The above command removes all files in the directory (everything except . and ..).
It should remove only those files that contain the word "magenta".
Also, tried grep magenta * -exec rm '{}' \; but no luck.
Any idea?
Use xargs:
grep -l --null magenta ./* | xargs -0 rm
The purpose of xargs is to take input on stdin and place it on the command line of its argument.
What the options do:
The -l option tells grep not to print the matching text and instead just print the names of the files that contain matching text.
The --null option tells grep to separate the filenames with NUL characters. This allows all manner of filenames to be handled safely.
The -0 option tells xargs to treat its input as NUL-separated.
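As a quick check of how these options work together, here is a sketch in a scratch directory with two made-up files, one of which has a space in its name:

```shell
# Scratch directory (hypothetical file names and contents).
demo=$(mktemp -d)
cd "$demo" || exit 1
printf 'magenta ink\n' > 'color notes.txt'
printf 'plain text\n'  > other.txt

# grep prints each matching file name followed by NUL; xargs -0 splits on NUL only.
grep -l --null magenta ./* | xargs -0 rm -f --
```

The file containing "magenta" is removed despite the space in its name, and other.txt is left alone.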
Here is a safe way:
grep -lr --null magenta . | xargs -0 rm -f --
-l prints file names of files matching the search pattern.
-r performs a recursive search for the pattern magenta in the given directory (here, .).
If this doesn't work, try -R.
--null outputs a NUL byte after each file name instead of the usual newline, so that a name containing a newline is not misread (i.e., as multiple names instead of one).
xargs -0 reads the NUL-separated file names from grep and passes them to rm -f.
-- is often forgotten but it is very important to mark the end of options and allow for removal of files whose names begin with -.
If you would like to see which files are about to be deleted, remove the | xargs -0 rm -f -- part and drop --null from the grep options.

Copying list of files to a directory

I want to make a search for all .fits files that contain a certain text in their name and then copy them to a directory.
I can use a command called fetchKeys to list the files that contain say 'foo'
The command looks like this : fetchKeys -t 'foo' -F | grep .fits
This returns a list of .fits files that contain 'foo'. Great! Now I want to copy all of these to a directory /path/to/dir. There are too many files to do individually , I need to copy them all using one command.
I'm thinking something like:
fetchKeys -t 'foo' -F | grep .fits > /path/to/dir
or
cp fetchKeys -t 'foo' -F | grep .fits /path/to/dir
but of course neither of these works. Any other ideas?
If this is on Linux/Unix, can you use the find command? That seems very much like fetchkeys.
$ find . -name "*foo*.fits" -type f -print0 | while IFS= read -r -d $'\0' file
do
basename=$(basename "$file")
cp "$file" "$fits_dir/$basename"
done
The find command will find all files that match *foo*.fits in their name. The -type f says they have to be files and not directories. The -print0 means print out the files found, but separate them with the NUL character. Normally, the find command will simply return a file on each line, but what if the file name contains spaces, tabs, new lines, or even other strange characters?
The -print0 will separate out files with nulls (\0), and the read -d $'\0' file means to read in each file separating by these null characters. If your files don't contain whitespace or strange characters, you could do this:
$ find . -name "*foo*.fits" -type f | while IFS= read -r file
do
basename=$(basename "$file")
cp "$file" "$fits_dir/$basename"
done
Basically, you read each file found with your find command into the shell variable file. Then, you can use that to copy that file into your $fits_dir or where ever you want.
Again, maybe there's a reason to use fetchKeys, and it is possible to replace that find with fetchKeys, but I don't know that fetchKeys command.
Copy all files with the name containing foo to a certain directory:
find . -name "*foo*.fits" -type f -exec cp {} "/path/to/dir/" \;
Copy all files themselves containing foo to a certain directory (solution without xargs):
find . -type f -exec grep -q foo {} \; -exec cp {} /path/to/dir/ \;
The find command has very useful arguments -exec, -print, -delete. They are very robust and eliminate the need to manually process the file names. The syntax for -exec is: -exec (what to do) \;. The name of the file currently processed will be substituted instead of the placeholder {}.
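A small sketch of -exec with the {} placeholder, using a scratch directory and made-up .fits names (assumptions for the demo, not taken from the question). The -maxdepth 1 is only there so that find does not descend into the destination directory.

```shell
# Scratch directory (hypothetical file names).
demo=$(mktemp -d)
mkdir "$demo/out"
touch "$demo/a foo.fits" "$demo/b foo.fits" "$demo/c.txt"

# For every match, find runs: cp <that file> "$demo/out/"
find "$demo" -maxdepth 1 -type f -name '*foo*.fits' -exec cp {} "$demo/out/" \;
```

Both .fits files, spaces included, end up in out, and c.txt is not copied. Ending the command with + instead of \; would pass many file names to a single invocation, but then {} has to be the last argument.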
Other commands that are very useful for such tasks are sed and awk.
The xargs tool can run a command on the lines it reads from stdin. Here, it runs cp:
fetchKeys -t 'foo' -F | grep .fits | xargs -d '\n' -n 500 cp -vfa -t /path/to/dir
xargs is a very useful tool, although its options are not trivial. This command (which needs GNU xargs for -d and GNU cp for -t) reads up to 500 .fits file names at a time and calls a single cp for each group. I haven't tested it thoroughly; if it doesn't work, let me know in a comment.

Removing files with a double quote in their name

I am trying to remove files within a directory. Some of the files have double-quotes around their name while others do not. An example of these files would be:
"DDD344".csv
D2DW.csv
Both these files are located in sub-directories within the directory YM.
To find such files and remove them, I invoke find like so:
find YM -name "*.csv" -print | xargs rm
The above command results in a lot of No such file or directory errors.
I tried using sed in the following way:
find yum/yum_hyd -name "\"*\".csv" | sed 's/"/\"/g' | xargs rm
but to no avail. How do I remove the files?
The problem is that you're using xargs. xargs is a horribly broken program that should never be used for anything except in conjunction with the nonstandard -0 option. Even so, I can't think of any advantages to doing that in this case. You should just execute rm directly from find.
find . -type f -name '"*".csv' -exec rm -f -- {} +
Will work. If you have GNU find, you may also use -delete.
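To illustrate why the plain xargs pipeline fails on these names while -exec does not, here is a sketch in a scratch directory using the two file names from the question. The use of ls as a harmless stand-in for rm is an assumption made for the demo.

```shell
# Scratch directory with the two example names from the question.
demo=$(mktemp -d)
touch "$demo/\"DDD344\".csv" "$demo/D2DW.csv"

# xargs does its own quote processing, so the quoted name reaches ls
# without its double quotes and is reported as missing.
xargs_status=0
find "$demo" -name '*.csv' -print | xargs ls >/dev/null 2>&1 || xargs_status=$?

# find hands each name to rm exactly as it is stored on disk.
find "$demo" -type f -name '"*".csv' -exec rm -f -- {} +
```

The xargs pipeline exits with a non-zero status, while the -exec form removes only the file whose name is wrapped in double quotes.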
try this:
find yum/yum_hyd -name "\"*\".csv" |sed 's/"/\\"/g'|xargs rm
Explanation:
You want to replace " with \". But if you write \" directly, sed treats it as a plain ", so you have to escape the backslash; \\" works.
I wasn't aware of this option until recently but you can list the inode of the file in the following way:
$ ls -il
In the output you will see that the first column contains the inode value. You can then use that value to find -inum the offending files and remove them.
Output
2616366 -rw-r--r-- 1 etc etc
$ find . -inum 2616366 -exec rm -f {} \;
This will remove the file with that specific inum.
As a test you can run the following to locate your files.
ls -il \"* | awk '{print $1}' | xargs -I {} find . -inum {}
Once you are satisfied with the list, append -delete to the find command (supported by GNU and BSD find) to remove the files.
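The inode approach can be tried end to end in a scratch directory. This is only a sketch; the file names are invented for the demo.

```shell
# Scratch directory (hypothetical file names).
demo=$(mktemp -d)
touch "$demo/\"odd\".csv" "$demo/plain.csv"

# ls -i prints the inode number in the first column.
inode=$(ls -i "$demo/\"odd\".csv" | awk '{print $1}')

# Address the file by inode, so its awkward name never has to be typed again.
find "$demo" -inum "$inode" -exec rm -f -- {} \;
```

Only the file with that inode is removed; plain.csv stays. If the file had hard links elsewhere under the search root, they would match as well.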
This is also similar to the question on SuperUser

Modifying replace string in xargs

When I am using xargs sometimes I do not need to explicitly use the replacing string:
find . -name "*.txt" | xargs rm -rf
In other cases, I want to specify the replacing string in order to do things like:
find . -name "*.txt" | xargs -I '{}' mv '{}' /foo/'{}'.bar
The previous command would move all the text files under the current directory into /foo and it will append the extension bar to all the files.
If instead of appending some text to the replace string, I wanted to modify that string such that I could insert some text between the name and extension of the files, how could I do that? For instance, let's say I want to do the same as in the previous example, but the files should be renamed/moved from <name>.txt to /foo/<name>.bar.txt (instead of /foo/<name>.txt.bar).
UPDATE: I managed to find a solution:
find . -name "*.txt" | xargs -I{} \
sh -c 'base=$(basename $1) ; name=${base%.*} ; ext=${base##*.} ; \
mv "$1" "foo/${name}.bar.${ext}"' -- {}
But I wonder if there is a shorter/better solution.
The following command constructs the move command with xargs, replaces the second occurrence of '.' with '.bar.', then executes the commands with bash, working on mac OSX.
ls *.txt | xargs -I {} echo mv {} foo/{} | sed 's/\./.bar./2' | bash
It is possible to do this in one pass (tested in GNU) avoiding the use of the temporary variable assignments
find . -name "*.txt" | xargs -I{} sh -c 'mv "$1" "foo/$(basename ${1%.*}).new.${1##*.}"' -- {}
In cases like this, a while loop would be more readable:
find . -name "*.txt" | while IFS= read -r pathname; do
base=$(basename "$pathname"); name=${base%.*}; ext=${base##*.}
mv "$pathname" "foo/${name}.bar.${ext}"
done
Note that you may find files with the same name in different subdirectories. Are you OK with duplicates being over-written by mv?
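The parameter expansions used in the loop can be checked on their own, without moving anything. A minimal sketch, with a made-up path as input:

```shell
pathname='./docs/my report.txt'   # hypothetical input path
base=${pathname##*/}              # strip leading directories: "my report.txt"
name=${base%.*}                   # drop the last extension:   "my report"
ext=${base##*.}                   # keep only the extension:   "txt"
target="foo/${name}.bar.${ext}"
echo "$target"
```

This prints foo/my report.bar.txt. Here ${pathname##*/} stands in for the basename call, which saves one process per file.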
If you have GNU Parallel http://www.gnu.org/software/parallel/ installed you can do this:
find . -name "*.txt" | parallel 'ext={/} ; mv -- {} foo/{/.}.bar."${ext##*.}"'
Watch the intro videos for GNU Parallel to learn more:
https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
If you're allowed to use something other than bash/sh, AND this is just for a fancy "mv"... you might try the venerable "rename.pl" script. I use it on Linux and cygwin on windows all the time.
http://people.sc.fsu.edu/~jburkardt/pl_src/rename/rename.html
rename.pl 's/^(.*?)\.(.*)$/\1-new_stuff_here.\2/' list_of_files_or_glob
You can also use a "-p" parameter to rename.pl to have it tell you what it WOULD HAVE DONE, without actually doing it.
I just tried the following in my c:/bin (cygwin/windows environment). I used the "-p" so it spit out what it would have done. This example just splits the base and extension, and adds a string in between them.
perl c:/bin/rename.pl -p 's/^(.*?)\.(.*)$/\1-new_stuff_here.\2/' *.bat
rename "here.bat" => "here-new_stuff_here.bat"
rename "htmldecode.bat" => "htmldecode-new_stuff_here.bat"
rename "htmlencode.bat" => "htmlencode-new_stuff_here.bat"
rename "sdiff.bat" => "sdiff-new_stuff_here.bat"
rename "widvars.bat" => "widvars-new_stuff_here.bat"
the files should be renamed/moved from <name>.txt to /foo/<name>.bar.txt
You can use rename utility, e.g.:
rename s/\.txt$/\.txt\.bar/g *.txt
Hint: The substitution syntax is similar to sed or vim.
Then move the files to some target directory by using mv:
mkdir /some/path
mv *.bar /some/path
To rename files into subdirectories based on some part of their name, check for:
-p/--mkpath/--make-dirs Create any non-existent directories in the target path.
Testing:
$ touch {1..5}.txt
$ rename --dry-run "s/.txt$/.txt.bar/g" *.txt
'1.txt' would be renamed to '1.txt.bar'
'2.txt' would be renamed to '2.txt.bar'
'3.txt' would be renamed to '3.txt.bar'
'4.txt' would be renamed to '4.txt.bar'
'5.txt' would be renamed to '5.txt.bar'
Adding to that, the Wikipedia article on xargs is surprisingly informative, for example:
Shell trick
Another way to achieve a similar effect is to use a shell as the launched command, and deal with the complexity in that shell, for example:
$ mkdir ~/backups
$ find /path -type f -name '*~' -print0 | xargs -0 bash -c 'for filename; do cp -a "$filename" ~/backups; done' bash
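A variation on the same trick that can be run safely in a scratch directory. This is a sketch: the directory layout and file names are made up, sh is used instead of bash, and the destination is passed as the first argument rather than hard-coded.

```shell
# Scratch directory (hypothetical layout).
demo=$(mktemp -d)
mkdir "$demo/src" "$demo/backups"
touch "$demo/src/notes.txt~" "$demo/src/my draft.txt~" "$demo/src/keep.txt"

# $0 is "sh", $1 is the destination; xargs appends the file names after it.
find "$demo/src" -type f -name '*~' -print0 |
  xargs -0 sh -c 'dest=$1; shift; for filename; do cp "$filename" "$dest"; done' sh "$demo/backups"
```

Both backup files (including the one with a space) are copied, and keep.txt is not, since it does not match '*~'.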
Inspired by an answer by @justaname above, this command, which incorporates a Perl one-liner, will do it:
find ./ -name \*.txt | perl -p -e 's/^(.*\/(.*)\.txt)$/mv $1 .\/foo\/$2.bar.txt/' | bash

Resources