xargs respect wildcards in searches - bash

I have a file called file1.txt:
dir1
dir2
dir3
...
I wanted to use xargs to check if some files exist farther into the file system like this:
cat file1.txt | xargs -i ls /projects/analysis7/{}/meta_bwa/hg19a/*varFilter 2>/dev/null
But xargs never seems to expand the *; i.e., it never finds the files even when they would match the pattern (if the * were expanded).
Any ideas?

xargs runs the command directly rather than through a shell, so nothing expands the *. You just need to add sh -c:
cat file1.txt | xargs -i sh -c 'ls /projects/analysis7/{}/meta_bwa/hg19a/*varFilter' 2>/dev/null
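A minimal sketch of the difference, run in a scratch directory rather than under /projects/analysis7. The layout, the file name sample.varFilter, and the BASE variable are all made up for illustration. It also passes the directory name to the inner shell as $1 instead of splicing {} into the script text, which is one way to avoid surprises if a line in file1.txt contains quotes or other shell syntax.

```shell
# Hypothetical scratch layout standing in for the real project tree.
base=$(mktemp -d)
mkdir -p "$base/dir1/hg19a" "$base/dir2/hg19a"
touch "$base/dir1/hg19a/sample.varFilter"
printf 'dir1\ndir2\n' > "$base/file1.txt"

# No shell involved: ls receives the literal string ".../*varFilter" and fails.
plain=$(xargs -I{} ls "$base/{}/hg19a/*varFilter" < "$base/file1.txt" 2>/dev/null) || true

# sh -c expands the glob; the directory name arrives as $1 ("_" fills $0).
globbed=$(BASE="$base" xargs -I{} sh -c 'ls "$BASE"/"$1"/hg19a/*varFilter' _ {} \
  < "$base/file1.txt" 2>/dev/null) || true

echo "plain:   ${plain:-<nothing>}"
echo "globbed: ${globbed:-<nothing>}"
```

Only dir1 contains a matching file, so the second command lists one path and the first lists none.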

Related

How to copy files found with grep on OSX

I'm wanting to copy files I've found with grep on an OSX system, where the cp command doesn't have a -t option.
A previous post's solution for doing something like this relied on the -t flag in cp. However, like that poster, I want to take the file list I receive from grep and then execute a command over it, something like:
grep -lr "foo" --include=*.txt * 2>/dev/null | xargs cp -t /path/to/targetdir
Less efficient than cp -t, but this works:
grep -lr "foo" --include=*.txt * 2>/dev/null |
xargs -I{} cp "{}" /path/to/targetdir
Explanation:
For filenames | xargs cp -t destination, xargs changes the incoming filenames into this format:
cp -t destination filename1 ... filenameN
i.e., it only runs cp once (actually, once for every few thousand filenames -- xargs breaks the command line up if it would exceed the system's limit on argument length).
For filenames | xargs -I{} cp "{}" destination, on the other hand, xargs changes the incoming filenames into this format:
cp "filename1" destination
...
cp "filenameN" destination
i.e., it runs cp once for each incoming filename, which is much slower. For a large number (e.g., >10k) of very small (e.g., <10k) files, I'd guess it could even be thousands of times slower. But it does work :)
PS: Another popular technique is to use find's -exec action instead of xargs, e.g., https://stackoverflow.com/a/5241677/1563960
Yet another option is, if you have admin privileges or can persuade your sysadmin, to install the coreutils package as suggested here, and follow the steps but for cp rather than ls.
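To make the two xargs modes concrete, here is a small sketch that substitutes echo for cp so the generated command lines can be inspected. The second half shows one possible middle ground when cp -t is unavailable: a tiny sh wrapper that still batches the names but puts the target last. The file names, the scratch directories, and the TARGET variable are assumptions for illustration only.

```shell
# Use echo in place of cp to see the command lines xargs would run.
batched=$(printf 'a.txt\nb.txt\nc.txt\n' | xargs echo cp -t dest)
per_file=$(printf 'a.txt\nb.txt\nc.txt\n' | xargs -I{} echo cp {} dest)
printf 'batched:\n%s\nper file:\n%s\n' "$batched" "$per_file"

# Batching without cp -t: the wrapper appends the target after the names.
src=$(mktemp -d)
target=$(mktemp -d)
touch "$src/one.txt" "$src/two words.txt"
find "$src" -type f -print0 |
  TARGET="$target" xargs -0 sh -c 'cp "$@" "$TARGET"/' _
ls "$target"
```

The wrapper costs one extra sh process per batch, not one per file, so it should stay close to the speed of cp -t.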

Using cat and grep commands in Bash

I'm having trouble with trying to achieve this bash command:
Concatenate all the text files in the current directory that have at least one occurrence of the word BOB (in any case) within the text of the file.
Is it correct to use the cat command and then use grep to find the occurrences of the word BOB?
cat grep -i [BOB] *.txt > catFile.txt
To handle filenames with whitespace characters correctly:
grep --null -l -i "BOB" *.txt | xargs -0 cat > catFile.txt
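Your issue was the need to pass grep's file names to cat using command substitution (this simpler form breaks on file names containing whitespace):
cat $(grep -l -i "BOB" *.txt ) > catFile.txt
$(.....) runs the enclosed command and substitutes its output into the command line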
-l returns only filenames of the things that matched
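The difference between the two forms can be checked in a scratch directory. This is only a sketch: the file names and contents are invented, and the output goes to a separate temporary file so it cannot be picked up by the *.txt glob.

```shell
work=$(mktemp -d)
out=$(mktemp)
printf 'bob one\n' > "$work/plain.txt"
printf 'Bob two\n' > "$work/with space.txt"

# Command substitution: word splitting turns "with space.txt" into two
# nonexistent names, so only plain.txt is concatenated.
cat $(grep -l -i "BOB" "$work"/*.txt) > "$out" 2>/dev/null || true
split_lines=$(wc -l < "$out" | tr -d ' ')

# NUL-separated names reach cat intact.
grep --null -l -i "BOB" "$work"/*.txt | xargs -0 cat > "$out"
safe_lines=$(wc -l < "$out" | tr -d ' ')

echo "command substitution: $split_lines line(s); xargs -0: $safe_lines line(s)"
```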
You could use find with -exec:
find . -maxdepth 1 -name '*.txt' -exec grep -qi 'bob' {} \; \
-exec cat {} + > catFile.txt
-maxdepth 1 makes sure you don't search any deeper than the current directory
-name '*.txt' says to look at all files ending with .txt – for the case that there is also a directory ending in .txt, you could add -type f to only look at files
-exec grep -qi 'bob' {} \; runs grep for each .txt file found. If bob is in the file, the exit status is zero and the next directive is executed. -q makes sure the grep is silent.
-exec cat {} + runs cat on all the files that contain bob
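A quick way to try the find approach without touching real files is sketched below. The directory, file names, and contents are made up, and the result is written outside the searched directory so that the output file is never one of the inputs.

```shell
work=$(mktemp -d)
out=$(mktemp)
printf 'Bob was here\n' > "$work/one.txt"
printf 'nothing to see\n' > "$work/two.txt"
printf 'call BOB\n' > "$work/three four.txt"

# grep -q acts as a filter: only files where it exits 0 reach the cat step.
find "$work" -maxdepth 1 -type f -name '*.txt' \
  -exec grep -qi 'bob' {} \; \
  -exec cat {} + > "$out"

cat "$out"
```

The order of the concatenated files follows find's traversal order, which is not guaranteed to be alphabetical.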
You need to remove the square brackets...
grep -il "BOB" *
You can also use the following command that you must run from the directory containing your BOB files.
grep -il BOB *.in | xargs cat > BOB_concat.out
-i is an option used to set grep in case insensitive mode
-l will be used to output only the filename containing the pattern provided as argument to grep
*.in is used to find all the input files in the dir (should be adapted to your folder content)
then you pipe the result of the command to xargs in order to build the arguments that cat will use to produce your file concatenation.
ASSUMPTION:
Your folder contains only files without unusual characters (e.g., spaces) in their names.

How to remove files using grep and rm?

grep -n magenta *| rm *
grep: a.txt: No such file or directory
grep: b: No such file or directory
The above command removes all files present in the directory (everything except . and ..).
It should remove only those files which contain the word "magenta"
Also, tried grep magenta * -exec rm '{}' \; but no luck.
Any idea?
Use xargs:
grep -l --null magenta ./* | xargs -0 rm
The purpose of xargs is to take input on stdin and place it on the command line of its argument.
What the options do:
The -l option tells grep not to print the matching text and instead just print the names of the files that contain matching text.
The --null option tells grep to separate the filenames with NUL characters. This allows all manner of filenames to be handled safely.
The -0 option tells xargs to treat its input as NUL-separated.
Here is a safe way:
grep -lrZ magenta . | xargs -0 rm -f --
-l prints file names of files matching the search pattern.
-r performs a recursive search for the pattern magenta in the given directory (here ., the current directory). If this doesn't work, try -R.
-Z (--null in GNU grep) terminates each file name with a NUL byte so that a name containing whitespace is not interpreted in the wrong way (i.e., as multiple names instead of one).
xargs -0 reads those NUL-separated file names from grep and passes them to rm -f.
-- is often forgotten but it is very important to mark the end of options and allow for removal of files whose names begin with -.
If you would like to see which files are about to be deleted, simply remove the | xargs -0 rm -f -- part and drop the Z option from grep.
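Since the command deletes files, it may be worth rehearsing it in a throwaway directory first. The sketch below does a dry run and then the real deletion; the file names and contents are invented, and it uses the long option --null, which GNU grep treats as equivalent to -Z.

```shell
work=$(mktemp -d)
printf 'magenta\n' > "$work/has color.txt"
printf 'cyan\n' > "$work/other.txt"

# Dry run: list what would be deleted (newline-separated, for reading).
to_delete=$(grep -lr magenta "$work")
echo "would delete: $to_delete"

# Real run: NUL-separated names, so the space in the name is harmless.
grep -lr --null magenta "$work" | xargs -0 rm -f --

remaining=$(ls "$work")
echo "remaining: $remaining"
```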

Why ls command combined with xargs and cp move only 10 files?

I have a command that copies file from one dir to another
FILE_COLLECTOR_PATH="/var/www/";
FILE_BACKUP_PATH='/home/'
ls $FILE_COLLECTOR_PATH | head -${1} | xargs -i basename {} | xargs -t -i cp $FILE_COLLECTOR_PATH{} "${FILE_BACKUP_PATH}{}-`date +%F%H%M%S%N`"
I loop it in a shell script like,
#!/bin/sh
SLEEP=120
FILE_COLLECTOR_PATH="/var/www/";
FILE_BACKUP_PATH='/home/'
while true
do
ls $FILE_COLLECTOR_PATH | head -${1} | xargs -i basename {} | xargs -t -i cp $FILE_COLLECTOR_PATH{} "${FILE_BACKUP_PATH}{}-`date +%F%H%M%S%N`"
sleep ${SLEEP}
done
But it seems to copy only 10 files and not all the files in the directory. Why? It is supposed to copy all of them.
In general, don't try to parse the output of ls in a script. You can end up with many different types of subtle problems. There is almost always a better tool for the job. Many times, this tool is find. For example, to generate a list of all of the files in a directory and do something to each of them, you would do something like this:
find <search directory> -maxdepth 1 -type f -print0 | xargs -0i basename {} ...
The -print0 and -0 arguments allow find and xargs to communicate filenames in a way that handles special characters (like spaces) correctly.
The find command has other options that you may find useful in a backup script (which is what it appears you are building). Options like -mmin and -newer will enable you to only back up files that have changed since the last iteration.
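As a rough sketch of how the copy step might look with find instead of ls, the snippet below copies every regular file in a source directory to a backup directory with a timestamp suffix. The scratch directories stand in for /var/www/ and /home/, the file names are invented, and the timestamp format is shortened to whole seconds; none of this is taken from the original script.

```shell
src=$(mktemp -d)     # stands in for FILE_COLLECTOR_PATH
backup=$(mktemp -d)  # stands in for FILE_BACKUP_PATH
touch "$src/a.log" "$src/b c.log"
mkdir "$src/subdir"  # skipped by -type f

# One cp per file; "$1" is the source path, "$2" the backup directory.
find "$src" -maxdepth 1 -type f -print0 |
  xargs -0 -I{} sh -c 'cp "$1" "$2/$(basename "$1")-$(date +%F%H%M%S)"' _ {} "$backup"

copied=$(ls "$backup" | wc -l | tr -d ' ')
echo "copied $copied file(s)"
```

Because find lists every matching file, there is no head step that could silently cap the count.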
Try doing
ls -1
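instead of just ls, because by default ls does not print each file on its own line when writing to a terminal (head expects newlines), whereas ls -1 does. Note that ls already prints one name per line when its output goes to a pipe, so it is also worth checking that the script is called with an argument: if $1 is empty, head -${1} becomes head - and GNU head falls back to its default of 10 lines.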

how to pipe commands in ubuntu

How do I pipe commands and their results in Ubuntu when writing them in the terminal. I would write the following commands in sequence -
$ ls | grep ab
abc.pdf
cde.pdf
$ cp abc.pdf cde.pdf files/
I would like to pipe the results of the first command into the second command, and write them all in the same line. How do I do that ?
something like
$ cp "ls | grep ab" files/
(the above is a contrived example and can be written as cp *.pdf files/)
Use the following:
cp `ls | grep ab` files/
Well, since the xargs person gave up, I'll offer my xargs solution:
ls | grep ab | xargs echo | while read f; do cp $f files/; done
Of course, this solution suffers from an obvious flaw: files with spaces in them will cause chaos.
An xargs solution without this flaw? Hmm...
ls | grep ab | xargs '-d\n' bash -c 'docp() { cp "$@" files/; }; docp "$@"' _
Seems a bit clunky, but it works (the trailing _ fills $0 so that the first file name is not swallowed). Unless you have files with newlines in their names, I mean. However, anyone who does that deserves what they get. Even that is solvable:
find . -mindepth 1 -maxdepth 1 -name '*ab*' -print0 | xargs -0 bash -c 'docp() { cp "$@" files/; }; docp "$@"' _
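One detail of the bash -c pattern is easy to miss: the first argument after the script text becomes $0 and is not part of "$@". The sketch below shows the effect with echo, then applies the pattern to a scratch directory. The file names and the TARGET variable are made up for illustration.

```shell
# Without a placeholder, the first name lands in $0 and is lost from "$@".
without=$(printf 'a\nb\nc\n' | xargs bash -c 'echo "$@"')
with=$(printf 'a\nb\nc\n' | xargs bash -c 'echo "$@"' _)
echo "without placeholder: $without"
echo "with placeholder:    $with"

# The same pattern copying NUL-separated names into a target directory.
work=$(mktemp -d)
mkdir "$work/files"
touch "$work/abc.pdf" "$work/cab two.pdf" "$work/xyz.pdf"
find "$work" -mindepth 1 -maxdepth 1 -type f -name '*ab*' -print0 |
  TARGET="$work/files" xargs -0 bash -c 'cp "$@" "$TARGET"/' _
ls "$work/files"
```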
To use xargs, you need to ensure that the filename arguments are the last arguments passed to the cp command. You can accomplish this with the -t option to cp to specify the target directory:
ls | grep ab | xargs cp -t files/
Of course, even though this is a contrived example, you should not parse the output of ls.
