How can I execute a list of binaries from a pipe with desired parameters? - shell

I'm doing a find and locating several executables that I want to run with -v. I tried something like this:
find somefilters | xargs -I % % -v
Unfortunately, xargs seems to require that the "utility" be a fixed binary rather than a binary provided by stdin. Does anyone have a recipe for doing this command line magic?

Use the -exec primary:
find ... -exec '{}' -v \;
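As a minimal sketch of the -exec approach, the snippet below builds a throwaway directory with two stand-in scripts (alpha and beta are hypothetical names, not from the question) and runs each of them with -v. The -type f and -perm -u+x tests are only an assumption about what "somefilters" might look like.

```shell
# Hypothetical sandbox: two tiny scripts standing in for the real executables.
demo=$(mktemp -d)
for name in alpha beta; do
    printf '#!/bin/sh\necho "%s got: $*"\n' "$name" > "$demo/$name"
    chmod +x "$demo/$name"
done

# Run every user-executable regular file with -v, one invocation per file.
# find does not guarantee an order, so the output is sorted for display.
exec_out=$(find "$demo" -type f -perm -u+x -exec {} -v \; | sort)
printf '%s\n' "$exec_out"
rm -rf "$demo"
```

Each program is started directly by find, so no shell ever re-parses the file names.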

Yet another way around this - use xargs to write a shell script for you:
find somefilters | xargs -n 1 -I % echo % -v | ${SHELL}
That won't work out so well if any of the programs require interactivity, but if the -v option is just to spit out the version numbers or something (one common meaning, the other being a verbose flag), it should work fine.

Related

How to search and replace with egrep and sed on macOS?

I want to match a pattern in a file and replace it.
This command works with egrep, xargs and sed:
egrep -lRZ "hello" . | xargs -0 -l sed -i -e 's/hello/world/g'
The problem: it does not work on macOS because the macOS xargs does not support the -l option.
xargs: illegal option -- l
usage: xargs [-0opt] [-E eofstr] [-I replstr [-R replacements]] [-J replstr]
[-L number] [-n number [-x]] [-P maxprocs] [-s size]
[utility [argument ...]]
How is this solvable on MacOS?
There are actually three incompatibilities you're going to run into here between the GNU (Linux) and BSD (macOS) utilities.
The one you're getting an error message from is that BSD's xargs doesn't accept the -l option. But -l is equivalent to -L except that -L requires an argument specifying the maximum number of lines to pass per invocation of the command, while -l defaults to one if it isn't specified. Thus, you can just replace -l with -L1. -L is understood the same way by both the GNU and BSD versions of xargs, so using this is portable between Linux and macOS.
But in this particular case, there's another even easier option: sed is perfectly capable of operating on multiple files per invocation, so there's no reason to limit it to one per invocation. This'll even be slightly faster, since it doesn't have to spend as much time launching new processes. So just leave -l off.
The GNU and BSD versions of egrep (and others in the grep family) both take the option -Z, but they use it to mean completely different things. With GNU, egrep -Z prints a zero byte (ASCII NUL character) after each filename (matching what xargs -0 expects). But with BSD, egrep -Z is equivalent to zgrep -- it treats its input files as compressed (gzip) files, and decompresses them before searching their contents.
Fortunately, both versions understand --null to invoke zero-byte delimiters, so you can use that portably on both platforms.
Both the GNU and BSD versions of sed understand -i<suffix> to mean "edit in place, and keep a backup copy of the original under the specified filename suffix". And for both of them, if the suffix is zero-length, no backup is kept. Unfortunately, the way you specify a zero-length suffix is different and (as far as I've been able to find) irreconcilably incompatible. Specifically, GNU requires the suffix to be directly attached to the -i (e.g. -i.bkp), so just specifying -i by itself is enough to specify in-place-without-backup mode. But BSD allows the suffix to be passed as a separate argument (e.g. -i .bkp), so if you just specify -i by itself, it'll use whatever the next argument is as a suffix (e.g. sed -i -e 's/hello/world/g' will use "-e" as a suffix). To specify in-place-without-backup mode, you need to follow -i with an explicit empty argument (e.g. sed -i '' -e 's/hello/world/g'). But if you do that with GNU's sed, it'll treat the empty argument as the name of an input file (the script is already given by -e) and complain that it can't read it.
With all that, here's the macOS version of your command:
egrep -lR --null "hello" . | xargs -0 sed -i '' -e 's/hello/world/g'
...which will almost work on Linux -- the only difference is that you need to remove the '' argument to sed. If you want something that's fully portable between Linux and macOS, you need to specify a backup suffix (and attach it directly to the -i option, as in -i.bkp).
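For reference, the fully portable form described above (backup suffix attached directly to -i) might look like the following sketch. The temporary directory and the file names in it are made up for the demonstration.

```shell
demo=$(mktemp -d)
printf 'hello there\n' > "$demo/a file.txt"
printf 'nothing to see\n' > "$demo/b.txt"

# --null makes grep emit NUL-terminated names for xargs -0;
# -i.bkp (suffix attached) is parsed the same way by GNU and BSD sed.
grep -lR --null "hello" "$demo" | xargs -0 sed -i.bkp -e 's/hello/world/g'

edited=$(cat "$demo/a file.txt")
backup=$(cat "$demo/a file.txt.bkp")
backups=$(find "$demo" -name '*.bkp' | wc -l | tr -d ' ')
printf 'edited: %s\nbackup: %s\nbackup files: %s\n' "$edited" "$backup" "$backups"
rm -rf "$demo"
```

Only the file that actually matched gets a backup. If the backups are not wanted afterwards, something like find . -name '*.bkp' -exec rm {} + can remove them.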
The grep options to recursively search for files are best avoided - they just clutter up your grep args and make your scripts non-portable. There's already a perfectly good tool designed to find files with a very obvious name.
Are you just trying to replace hello with world in all your files? If so that's just
find . -type f |
while IFS= read -r file; do
    sed 's/hello/world/g' "$file" > "tmp$$" &&
    mv "tmp$$" "$file"
done
That'll work in any shell on any UNIX box unless your file names contain newlines. If you didn't want to change timestamps etc. on files that don't contain hello one way is:
find . -type f -exec grep -q 'hello' {} \; -print |
while IFS= read -r file; do
    sed 's/hello/world/g' "$file" > "tmp$$" &&
    mv "tmp$$" "$file"
done
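If newlines in file names are a concern, one possible variant (not part of the answer above) hands the names to a small inline sh script via -exec ... {} +, so they never pass through a line-oriented pipe. This is only a sketch: the sandbox directory and the odd file name are hypothetical, and the temporary file is written next to each target instead of in the current directory.

```shell
demo=$(mktemp -d)
odd="$demo/$(printf 'odd\nname').txt"   # hypothetical file name containing a newline
printf 'hello\n' > "$odd"
printf 'plain\n' > "$demo/other.txt"

# grep -q keeps only files containing hello; the surviving names are passed
# to sh as arguments, so no line-based parsing of names is involved.
find "$demo" -type f -exec grep -q 'hello' {} \; -exec sh -c '
    for file do
        sed "s/hello/world/g" "$file" > "$file.tmp.$$" &&
        mv "$file.tmp.$$" "$file"
    done
' sh {} +

odd_result=$(cat "$odd")
other_result=$(cat "$demo/other.txt")
printf 'odd file: %s\nother file: %s\n' "$odd_result" "$other_result"
rm -rf "$demo"
```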

When to use xargs when piping?

I am new to bash and I am trying to understand the use of xargs, which is still not clear for me. For example:
history | grep ls
Here I am searching for the command ls in my history. In this command, I did not use xargs and it worked fine.
find /etc -name "*.txt" | xargs ls -l
In this one, I had to use xargs, but I still cannot understand the difference, and I am not able to decide correctly when to use xargs and when not to.
xargs can be used when you need to take the output from one command and use it as an argument to another. In your first example, grep takes the data from standard input, rather than as an argument. So, xargs is not needed.
xargs takes data from standard input and executes a command. By default, the data is appended to the end of the command as an argument. It can be inserted anywhere however, using a placeholder for the input. The traditional placeholder is {}; using that, your example command might then be written as:
find /etc -name "*.txt" | xargs -I {} ls -l {}
If you have 3 text files in /etc you'll get a full directory listing of each. Of course, you could just as easily have written ls -l /etc/*.txt and saved the trouble.
Another example lets you rename those files, and requires the placeholder {} to be used twice.
find /etc -name "*.txt" | xargs -I {} mv {} {}.bak
These are both bad examples, and will break as soon as you have a filename containing whitespace. You can work around that by telling find to separate filenames with a null character.
find /etc -name "*.txt" -print0 | xargs -0 -I {} mv {} {}.bak
My personal opinion is that there are almost always alternatives to using xargs (such as the -exec argument to find) and you will be better served by learning those.
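As an illustration of the -exec alternative, the rename could be sketched as below. The inline sh -c wrapper is one common idiom, chosen here because POSIX leaves it implementation-defined whether {} is substituted when it is embedded in a longer argument such as {}.bak. The sandbox and file names are hypothetical.

```shell
demo=$(mktemp -d)
touch "$demo/notes.txt" "$demo/two words.txt" "$demo/skip.log"

# Each matching path arrives in the inline script as "$1", already one intact argument.
find "$demo" -name '*.txt' -exec sh -c 'mv "$1" "$1.bak"' sh {} \;

renamed=$(ls "$demo" | sort)
printf '%s\n' "$renamed"
rm -rf "$demo"
```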
When you use piping without xargs, the actual data is fed into the next command. On the other hand, when using piping with xargs, the actual data is viewed as a parameter to the next command. To give a concrete example, say you have a folder with a.txt and b.txt. a.txt contains just a single line 'hello world!', and b.txt is just empty.
If you do
ls | grep txt
you would end up getting the output:
a.txt
b.txt
Yet, if you do
ls | xargs grep txt
you would get nothing since neither file a.txt nor b.txt contains the word txt.
If the command is
ls | xargs grep hello
you would get:
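a.txt:hello world!
That's because with xargs, the two filenames given by ls are passed to grep as arguments, so grep searches the contents of those files rather than the text that ls printed. (grep prefixes the match with the filename because it was given more than one file.)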
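The difference can be reproduced with a small sketch that recreates the two files in a temporary directory (the directory is an assumption of the demo; the file names and contents follow the description above).

```shell
demo=$(mktemp -d)
printf 'hello world!\n' > "$demo/a.txt"
: > "$demo/b.txt"

# Without xargs: grep filters the text that ls prints (the file names).
names_match=$(cd "$demo" && ls | grep txt)

# With xargs: the file names become arguments, so grep searches the file contents.
content_match=$(cd "$demo" && ls | xargs grep hello)

printf 'ls | grep txt:\n%s\n' "$names_match"
printf 'ls | xargs grep hello:\n%s\n' "$content_match"
rm -rf "$demo"
```

Because grep receives two file arguments in the second pipeline, it labels the matching line with the name of the file it came from.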
Short answer: Avoid xargs for now. Return to xargs when you have written dozens or hundreds of scripts.
Commands can get their input from parameters (like rm bad_example) or from stdin (not just the y answering the question after rm -i is_this_bad_too, but also read answer). Other commands, like grep and sed, look for file names in their parameters and, when none are given, fall back to reading stdin.
Your grep example works fine reading from stdin, nothing special needed.
Your ls needs the output of find as a parameter. xargs is just one way to turn things around. Use man xargs for more about xargs. Alternatives:
find /etc -name "*.txt" -exec ls -l {} \;
find /etc -name "*.txt" -ls
ls -l $(find /etc -name "*.txt" )
ls /etc/*.txt
First, try to see which of these commands works best when you have a nasty filename with spaces.txt in /etc.
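One way to run that experiment without touching /etc is sketched below: it counts how many arguments each approach produces for a single awkward file name. The temporary directory is an assumption of the demo, and its own path is assumed to contain no whitespace.

```shell
demo=$(mktemp -d)   # assumed to be a path without blanks, e.g. under /tmp
touch "$demo/nasty filename with spaces.txt"

# Plain xargs splits its input on blanks, so one file name becomes four arguments.
split_count=$(find "$demo" -name '*.txt' | xargs -n 1 echo | wc -l | tr -d ' ')

# find -exec and NUL-delimited xargs both keep the name in one piece.
exec_count=$(find "$demo" -name '*.txt' -exec echo {} \; | wc -l | tr -d ' ')
nul_count=$(find "$demo" -name '*.txt' -print0 | xargs -0 -n 1 echo | wc -l | tr -d ' ')

printf 'xargs: %s  -exec: %s  xargs -0: %s\n' "$split_count" "$exec_count" "$nul_count"
rm -rf "$demo"
```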
xargs(1) is dangerous (broken, exploitable, etc.) when reading non-NUL-delimited input.
If you're working with filenames, use find's -exec [command] {} + instead.
If you can get NUL-delimited output, use xargs -0.
GNU Parallel can do the same as xargs, but does not have the broken and exploitable "features".
You can learn GNU Parallel by looking at examples http://www.gnu.org/software/parallel/man.html#EXAMPLE:-Working-as-xargs--n1.-Argument-appending and walking through the tutorial http://www.gnu.org/software/parallel/parallel_tutorial.html

What is causing these whitespace problems on Mac OS X bash?

I am trying to untar several files at once (I know I can do it differently but I want to make this work because it should work).
So I do:
ls -1 *.gz | xargs tar xf
This produces one command from xargs with all files, and fails. -1 is optional - fails the same way without it.
In fact,
ls -1 *.gz | xargs -t echo
echo host1_logs.tar.gz host2_logs.tar.gz host3_logs.tar.gz host5_logs.tar.gz
host1_logs.tar.gz host2_logs.tar.gz host3_logs.tar.gz host5_logs.tar.gz
I tried unsetting IFS, setting it to newline.
How do I make xargs on Mac OS X actually work?
Bonus question, in light of linebreak/loop problems I had with other commands before I wonder if Mac terminal is just broken and I should replace all utilities with GNU.
Use the -n argument to force xargs to run the given command with only a single argument:
ls -1 *.gz | xargs -n 1 echo
Otherwise, it tries to use each line from the input as a separate argument to the same invocation of the command.
Note that this will fail if any of the matched file names contain newlines, since ls has no way of producing output that distinguishes such names from a sequence of newline-free file names. (That is, there is no option to ls similar to the -print0 argument to find, which is commonly used in pipelines like find ... -print0 | xargs -0 to guard against such file names.)
Your question implies that you realize that you could do something like:
for f in *.gz; do
tar xf "$f"
done
which is unlikely to be noticeably slower than any attempt at using xargs. In each case the process of spawning multiple tar processes is likely to outweigh any differences in looping in bash vs the internal loop in xargs.
Basically you are passing all filenames to tar at once, which is not what you want, as you have noticed. The above xargs -n 1 answer is neater, but you could also use the -I flag to run the tar command for each of the arguments (also useful for multi-parameter commands like mv or cp):
ls *.gz | xargs -I {} tar xzvf {}
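For completeness, here is a sketch of the one-archive-per-invocation approach that also survives unusual file names. Feeding xargs -0 from printf '%s\0' instead of ls is a substitution made for this example, not something taken from the answers above, and the archives are dummies created in a temporary directory.

```shell
demo=$(mktemp -d)
(
    cd "$demo" || exit 1
    for host in host1 host2; do
        echo "log from $host" > "$host.log"
        tar czf "${host}_logs.tar.gz" "$host.log"
        rm "$host.log"
    done

    # tar xf A B C would read archive A and treat B and C as members to extract,
    # so each archive needs its own tar invocation (-n 1).
    printf '%s\0' *.gz | xargs -0 -n 1 tar xf
)
extracted=$(cd "$demo" && ls *.log)
printf '%s\n' "$extracted"
rm -rf "$demo"
```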

Doing parallel processing in bash?

I have thousands of PNG files which I'd like to make smaller with pngcrush. I have a simple find .. -exec job, but it's sequential. My machine has plenty of resources, so I'd like to run this in parallel.
The operation to be performed on every png is:
pngcrush input output && mv output input
Ideally I can specify the maximum number of parallel operations.
Is there a way to do this with bash and/or other shell helpers? I'm on Ubuntu or Debian.
You can use xargs to run multiple processes in parallel:
find /path -print0 | xargs -0 -n 1 -P <nr_procs> sh -c 'pngcrush "$1" temp.$$ && mv temp.$$ "$1"' sh
xargs will read the list of files produced by find (separated by NUL characters (-0)) and run the provided command (sh -c '...' sh) with one parameter at a time (-n 1). xargs will run up to <nr_procs> processes (-P <nr_procs>) in parallel.
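To try the pattern without pngcrush installed, the following sketch swaps in tr as a stand-in "processor" (an assumption made purely for the demo) and runs up to four jobs at once on dummy files. The temporary file is written next to each target, so the final mv stays on the same filesystem.

```shell
demo=$(mktemp -d)
for n in 1 2 3 4 5; do
    printf 'file %s\n' "$n" > "$demo/img $n.dat"
done

# tr stands in for pngcrush here; $$ is the PID of each inline sh,
# so concurrently running jobs never share a temporary file name.
find "$demo" -type f -name '*.dat' -print0 |
    xargs -0 -n 1 -P 4 sh -c 'tr a-z A-Z < "$1" > "$1.tmp.$$" && mv "$1.tmp.$$" "$1"' sh

parallel_result=$(cat "$demo"/*.dat)
printf '%s\n' "$parallel_result"
rm -rf "$demo"
```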
You can use custom find/xargs solutions (see Bart Sas' answer), but when things become more complex you have -at least- two powerful options:
parallel (from package moreutils)
GNU parallel
With GNU Parallel http://www.gnu.org/software/parallel/ it can be done like:
find /path -print0 | parallel -0 pngcrush {} {.}.temp '&&' mv {.}.temp {}
Learn more:
Watch the intro video for a quick introduction:
https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
Walk through the tutorial (man parallel_tutorial). Your command line will love you for it.

Best way to do a find/replace in several files?

What's the best way to do this? I'm no command line warrior, but I was thinking there's possibly a way of using grep and cat.
I just want to replace a string that occurs in a folder and its sub-folders. I'm running Ubuntu, if that matters.
I'll throw in another example for folks using ag, The Silver Searcher to do find/replace operations on multiple files.
Complete example:
ag -l "search string" | xargs sed -i '' -e 's/from/to/g'
If we break this down, what we get is:
# returns a list of files containing matching string
ag -l "search string"
Next, we have:
# read the piped list of file names (split on whitespace, including newlines)
# and pass them as arguments to the following command
xargs
Finally, the string replacement command:
# -i '' means edit files in place and the '' means do not create a backup
# -e 's/from/to/g' specifies the command to run, in this case,
# global, search and replace
sed -i '' -e 's/from/to/g'
find . -type f -print0 | xargs -0 -n 1 sed -i -e 's/from/to/g'
The first part of that is a find command to find the files you want to change. You may need to modify that appropriately. The xargs command takes every file the find found and applies the sed command to it. The sed command takes every instance of from and replaces it with to. That's a standard regular expression, so modify it as you need.
If you are using svn, beware: your .svn directories will be searched and replaced as well. You have to exclude those, e.g., like this:
find . ! -regex ".*[/]\.svn[/]?.*" -type f -print0 | xargs -0 -n 1 sed -i -e 's/from/to/g'
or
find . -name .svn -prune -o -type f -print0 | xargs -0 -n 1 sed -i -e 's/from/to/g'
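Before letting sed loose, it can help to check what the -prune expression actually selects. The sketch below only lists files; the directory layout is a hypothetical miniature of an svn working copy.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/.svn/text-base"
touch "$demo/src/main.c" "$demo/.svn/entries" "$demo/.svn/text-base/main.c.svn-base"

# -prune stops find from descending into .svn; -print belongs to the right-hand
# side of -o, so only regular files outside .svn are listed.
kept=$(cd "$demo" && find . -name .svn -prune -o -type f -print)
printf '%s\n' "$kept"
rm -rf "$demo"
```

Once the listing looks right, -print can be swapped for -print0 and piped into xargs -0 as in the commands above.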
As Paul said, you want to first find the files you want to edit and then edit them. An alternative to using find is to use GNU grep (the default on Ubuntu), e.g.:
grep -r -l -Z from . | xargs -0 -n 1 sed -i -e 's/from/to/g'
You can also use ack-grep (sudo apt-get install ack-grep or visit http://petdance.com/ack/) as well, if you know you only want a certain type of file, and want to ignore things in version control directories. e.g., if you only want text files,
ack -l --print0 --text from | xargs -0 -n 1 sed -i -e 's/from/to/g'
# `from` here is an arbitrary commonly occurring keyword
An alternative to using sed is to use perl which can process multiple files per command, e.g.,
grep -r -l from . | xargs perl -pi.bak -e 's/from/to/g'
Here, perl is told to edit in place, making a .bak file first.
You can combine any of the left-hand sides of the pipe with the right-hand sides, depending on your preference.
An alternative to sed is using rpl (e.g. available from http://rpl.sourceforge.net/ or your GNU/Linux distribution), like rpl --recursive --verbose --whole-words 'F' 'A' grades/
For convenience, I took Ulysse's answer (after correcting the undesirable error printing) and turned it into a .zshrc / .bashrc function:
function find-and-replace() {
    ag -l "$1" | xargs sed -i -e s/"$1"/"$2"/g
}
Usage: find-and-replace Foo Bar
The typical (find|grep|ack|ag|rg)-xargs-sed combination has a few problems:
Difficult to remember and get correct. E.g., forgetting the xargs -r option will run the command even when no files are found, potentially causing problems.
Retrieving the file list, and the actual replacement uses different CLI tools and can have a different search behaviour.
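The -r pitfall from the first point can be seen with a short sketch that feeds xargs an empty list. echo stands in for sed here, which is an assumption of the demo.

```shell
# With empty input, xargs -r does not run the command at all.
with_r=$(printf '' | xargs -r echo "sed would have run")

# Without -r, GNU xargs runs the command once with no file arguments;
# BSD xargs reportedly skips it, so this result depends on the platform.
without_r=$(printf '' | xargs echo "sed would have run")

printf 'with -r: [%s]\nwithout -r: [%s]\n' "$with_r" "$without_r"
```

With a real sed -i, a run without file arguments would leave sed with no input files to edit, which is the kind of surprise -r avoids.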
For an operation as invasive and dangerous as recursive search-and-replace, these problems were big enough to start the development of a dedicated tool: mo.
Early tests seem to indicate that its performance is between ag and rg, and it solves the following problems I encounter with them:
A single invocation can filter on filename and content. Following command searches for the word bug in all source files that have a v1 indication:
mo -f 'src/.*v1.*' -p bug -w
Once the search results are OK, actual replacement for bug with fix can be added:
mo -f 'src/.*v1.*' -p bug -w -r fix
comment() {
    :   # no-op: the arguments only serve as inline documentation
}
doc() {
    :   # no-op: the arguments only serve as inline documentation
}
function agr {
    doc 'usage: from=sth to=another agr [ag-args]'
    comment -l --files-with-matches
    ag -0 -l "$from" "$@" | pre-files "$from" "$to"
}
pre-files() {
    doc 'stdin should be null-separated list of files that need replacement; $1 the string to replace, $2 the replacement.'
    comment '-i backs up original input files with the supplied extension (leave empty for no backup; needed for in-place replacement.)(do not put whitespace between -i and its arg.)'
    comment '-r, --no-run-if-empty
    If the standard input does not contain any nonblanks,
    do not run the command. Normally, the command is run
    once even if there is no input. This option is a GNU
    extension.'
    AGR_FROM="$1" AGR_TO="$2" xargs -r0 perl -pi.pbak -e 's/$ENV{AGR_FROM}/$ENV{AGR_TO}/g'
}
You can use it like this:
from=str1 to=sth agr path1 path2 ...
Supply no paths to make it use the current directory.
Note that ag, xargs, and perl need to be installed and on PATH.
