combine find and grep into a single command

How to combine below two into one line without changing the first one?
# find / -name sshd_config -print
# grep -I <sshd_config path> permitrootlogin
I came up with the following, but I don't know whether it gives the correct result in all cases:
cat `find / -name sshd_config -print` |grep permitrootlogin

Don't do cat $(...) [$() is the modern replacement for backticks] -- that doesn't work reliably if your filenames contain special characters (spaces, wildcards, etc).
Instead, tell find to invoke cat for you, with as many filenames passed to each cat invocation as possible:
find / -name sshd_config -exec cat -- '{}' + | grep permitrootlogin
...or, even better, skip cat altogether and pass the filenames to grep directly:
find / -name sshd_config -exec grep -h -e permitrootlogin -- /dev/null '{}' +
Replace the -h with -H if you want filenames to be shown.
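As a quick check of the difference, here is a minimal sketch that builds a throwaway tree (the directory names and file contents are made up for illustration) in which one sshd_config sits under a directory whose name contains a space, then runs both forms. The -i flag is an addition here, because the sample files use the usual PermitRootLogin capitalisation.

```shell
# Throwaway tree; directory names and contents are hypothetical.
tmp=$(mktemp -d)
mkdir -p "$tmp/etc ssh" "$tmp/backup"
printf 'PermitRootLogin no\n' > "$tmp/etc ssh/sshd_config"
printf 'PermitRootLogin yes\n' > "$tmp/backup/sshd_config"

# Unquoted command substitution: the path containing a space is split into
# two words, so cat cannot open that file and only one match is counted.
broken=$(cat $(find "$tmp" -name sshd_config -print) 2>/dev/null | grep -ci permitrootlogin)

# find hands each path to grep as a single argument, whatever it contains.
safe=$(find "$tmp" -name sshd_config -exec grep -hi -e permitrootlogin -- /dev/null '{}' + | wc -l)

echo "matches via cat \$(...): $broken"
echo "matches via find -exec:  $safe"
rm -rf "$tmp"
```

The first form silently loses the file under "etc ssh", which is the kind of failure that is easy to miss on a real system.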

You could do something like that:
find / -name "somefilename" -print0 | xargs -0 grep "something"
The xargs command reads items from its stdin and passes them as arguments to grep.

I guess what you want is to take the output of find / -name sshd_config -print (which should be the path of the sshd_config file) and use it as the second argument to grep (so that the sshd_config file gets searched for your string).
There are several ways to achieve this.
Commands in back-quotes (`) are replaced by their output. So
grep permitrootlogin `find / -name sshd_config -print`
will be replaced by
grep permitrootlogin /path/to/the/sshd_config
which will search /path/to/the/sshd_config for permitrootlogin.
The same happens with
grep permitrootlogin $(find / -name sshd_config -print)
As another answer already mentions, this syntax has some advantages over the back-ticks. Namely, it can be nested.
However, this still runs into a problem when the path where the file is found contains spaces. As both backticks and $(...) just perform text substitution, and the shell then splits the result on whitespace, such a path would be passed as several arguments to grep, each probably being an invalid path. (/path/to the/sshd_config would become /path/to and the/sshd_config.)
Rather than working around this with fancy quoting and escaping strategies, remember that UNIX commands were already designed for being used in combination, usually by pipes. Indeed find has a -print0 action which will separate paths of found files by \0, so that they can be distinguished from paths containing whitespace. Alas, grep can't process a zero-delimited list of files and still wants the files to search as invocation arguments, not on stdin.
This is where xargs comes into play. It applies stuff it gets on stdin as arguments to other commands. And with its -0 option, it interprets stdin as a zero-delimited list instead of treating whitespace as delimiters.
So
find / -name sshd_config -print0 | xargs -0 grep permitrootlogin
should have you covered.
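To see the splitting directly, the sketch below counts how many arguments the downstream command receives for a single file whose path contains a space. The directory name is hypothetical, and sh -c 'echo "$#"' merely stands in for grep.

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/my dir"
: > "$tmp/my dir/sshd_config"

# sh -c 'echo "$#"' prints the number of arguments xargs appended.
plain=$(find "$tmp" -name sshd_config -print | xargs sh -c 'echo "$#"' sh)
nul=$(find "$tmp" -name sshd_config -print0 | xargs -0 sh -c 'echo "$#"' sh)

echo "whitespace-delimited: $plain argument(s)"
echo "NUL-delimited:        $nul argument(s)"
rm -rf "$tmp"
```

One file arrives as two arguments in the first pipeline and as one in the second.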

| is a pipe, which means that the standard output of cat `find / -name sshd_config -print` goes to the standard input of grep permitrootlogin, so you just have to be sure what the output of the first command is.

Related

Files with quotes, spaces causing bad behavior from xargs

I want to find some files and calculate the shasum by using a pipe command.
find . -type f | xargs shasum
But there are files with quotes in my directory, for example the file named
file with "special" characters.txt
The pipe output looks like this:
user@home ~ $ find . -type f | xargs shasum
da39a3ee5e6b4b0d3255bfef95601890afd80709 ./empty1.txt
da39a3ee5e6b4b0d3255bfef95601890afd80709 ./empty2.txt
da39a3ee5e6b4b0d3255bfef95601890afd80709 ./empty3.txt
shasum: ./file:
shasum: with: No such file or directory
shasum: special: No such file or directory
shasum: characters.txt: No such file or directory
25ea78ccd362e1903c4a10201092edeb83912d78 ./file1.txt
25ea78ccd362e1903c4a10201092edeb83912d78 ./file2.txt
The quotes within the filename cause problems.
How can I tell shasum to process the files correctly?

The short explanation is that xargs is widely considered broken-by-design, unless using extensions to the standard that disable its behavior of trying to parse and honor quote and escaping content in its input. See the xargs section of UsingFind for more details.
Using NUL Delimited Streams
On a system with GNU or modern BSD extensions (including MacOS X), you can (and should) NUL-delimit the output from find:
find . -type f -print0 | xargs -0 shasum --
Using find -exec
That said, you can do even better by getting xargs out of the loop entirely in a way that's fully compliant with modern (~2006) POSIX:
find . -type f -exec shasum -- '{}' +
Note that the -- argument specifies to shasum that all future arguments are filenames. If you'd used find * -type f ..., then you could have a result starting with a dash; using -- ensures that this result isn't interpreted as a set of options.
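The following sketch reproduces the question's situation in a scratch directory and compares plain xargs with find -exec. It uses cksum rather than shasum only because cksum is specified by POSIX and so more likely to be installed; that swap is an assumption of the example, not part of the answer above.

```shell
tmp=$(mktemp -d)
: > "$tmp/empty1.txt"
: > "$tmp/file with \"special\" characters.txt"

# Default xargs parsing interprets the quotes and spaces, so cksum is handed
# fragments that do not exist. Count the lines of successful output.
xargs_ok=$(find "$tmp" -type f | xargs cksum -- 2>/dev/null | wc -l)

# find -exec ... {} + hands every name over intact.
exec_ok=$(find "$tmp" -type f -exec cksum -- '{}' + | wc -l)

echo "checksummed via xargs:      $xargs_ok of 2"
echo "checksummed via find -exec: $exec_ok of 2"
rm -rf "$tmp"
```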
Using Newline Delimiters (And Security Risks Thereof)
If you have GNU xargs, but don't have the option of a NUL-delimited input stream, then xargs -d $'\n' (in shells such as bash with ksh extensions) will avoid the quoting and escaping behavior:
xargs -d $'\n' shasum -- <files.txt
However, this is suboptimal, because newline literals are actually possible inside filenames, thus making it impossible to distinguish between a newline that separates two names and a newline that is part of an actual name. Consider the following scenario:
mkdir -p ./file.txt$'\n'/etc/passwd$'\n'/
touch ./file.txt$'\n'/etc/passwd$'\n'file.txt file.txt
find . -type f | xargs -d $'\n' shasum --
This will have output akin to the following:
da39a3ee5e6b4b0d3255bfef95601890afd80709 ./file.txt
da39a3ee5e6b4b0d3255bfef95601890afd80709 ./file.txt
c0c71bac843a3ec7233e99e123888beb6da8fbcf /etc/passwd
da39a3ee5e6b4b0d3255bfef95601890afd80709 file.txt
...thus allowing an attacker who can control filenames to cause a shasum for an arbitrary file outside the intended directory structure to be added to your output.

execute command on files returned by grep

Say I want to edit every .html file in a directory one after the other using vim, I can do this with:
find . -name "*.html" -exec vim {} \;
But what if I only want to edit every html file containing a certain string, one after the other? I use grep to find files containing those strings, but how can I pipe each one to vim, similar to the find command? Perhaps I should use something other than grep, or somehow pipe the find command to grep and then exec vim. Does anyone know how to edit files containing a certain string one after the other, in the same fashion as the find command example I give above?

grep -l 'certain string' *.html | xargs vim
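This assumes you don't have eccentric file names with spaces etc in them. If you have to deal with eccentric file names, check whether your grep has an option to terminate the file names it prints with null bytes (--null in GNU and BSD grep, also spelled -Z in GNU grep) and whether your xargs has a -0 option to read such input. Note that GNU grep's -z is a different option: it changes the line terminator of the data being searched, not the terminator after file names. If both are available, then:
grep -l --null 'certain string' *.html | xargs -0 vim
If you need to search subdirectories, maybe your version of Bash has support for ** (Bash 4 or later, after shopt -s globstar):
grep -l --null 'certain string' **/*.html | xargs -0 vim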
Note: these commands run vim on batches of files. If you must run it once per file, then you need to use -n 1 as extra options to xargs before you mention vim. If you have GNU xargs, you can use -r to prevent it running vim when there are no file names in its input (none of the files scanned by grep contain the 'certain string').
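A small dry run of the NUL-delimited, once-per-file variant, as a sketch: printf stands in for vim so nothing interactive starts, the file names are invented, and --null is used as the long option that makes grep -l terminate each file name with a NUL byte (available in GNU and BSD grep, as far as I know).

```shell
tmp=$(mktemp -d)
printf 'certain string\n' > "$tmp/a b.html"
printf 'certain string\n' > "$tmp/c.html"
printf 'nothing here\n'   > "$tmp/d.html"

# -n 1 makes xargs run the command once per file name; printf replaces vim.
opened=$(grep -l --null 'certain string' "$tmp"/*.html | xargs -0 -n 1 printf 'edit: %s\n')
count=$(printf '%s\n' "$opened" | wc -l)

printf '%s\n' "$opened"
rm -rf "$tmp"
```

The name containing a space comes through as one argument, and d.html is never listed because it does not contain the string.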
The variations can be continued as you invent new ways to confuse things.

With find :
find . -type f -name '*.html' -exec bash -c 'grep -q "yourtext" "${1}" && vim "${1}"' _ {} \;
For each file, this runs a bash command that greps the file for yourtext and opens it in vim if the text matches.

Solution with a for cycle:
for i in $(find . -type f -name '*.html'); do vim $i; done
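This opens each file in its own vim session, starting the next one once you close the previous one. Note that the unquoted $(find ...) breaks on file names containing whitespace, and that this loop does not filter by content.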
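If a loop is preferred, one whitespace-safe way to write it is to let find start the shell and pass the names as positional parameters. This is a sketch under assumptions: echo stands in for vim, and the file names and search string are made up.

```shell
tmp=$(mktemp -d)
printf 'certain string\n' > "$tmp/one two.html"
printf 'other\n'          > "$tmp/three.html"

# "for f do" iterates over the positional parameters that find supplies,
# so no file name is ever re-split by the shell. echo replaces vim here.
visited=$(find "$tmp" -type f -name '*.html' -exec sh -c '
  for f do
    if grep -q "certain string" "$f"; then
      echo "would edit: $f"
    fi
  done' sh {} +)

echo "$visited"
rm -rf "$tmp"
```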

When to use xargs when piping?

I am new to bash and I am trying to understand the use of xargs, which is still not clear for me. For example:
history | grep ls
Here I am searching for the command ls in my history. In this command, I did not use xargs and it worked fine.
find /etc -name "*.txt" | xargs ls -l
In this one, I had to use xargs, but I still cannot understand the difference, and I am not able to decide correctly when to use xargs and when not to.

xargs can be used when you need to take the output from one command and use it as an argument to another. In your first example, grep takes the data from standard input, rather than as an argument. So, xargs is not needed.
xargs takes data from standard input and executes a command. By default, the data is appended to the end of the command as an argument. It can be inserted anywhere however, using a placeholder for the input. The traditional placeholder is {}; using that, your example command might then be written as:
find /etc -name "*.txt" | xargs -I {} ls -l {}
If you have 3 text files in /etc you'll get a full directory listing of each. Of course, you could just as easily have written ls -l /etc/*.txt and saved the trouble.
Another example lets you rename those files, and requires the placeholder {} to be used twice.
find /etc -name "*.txt" | xargs -I {} mv {} {}.bak
These are both bad examples, and will break as soon as you have a filename containing whitespace. You can work around that by telling find to separate filenames with a null character.
find /etc -name "*.txt" -print0 | xargs -0 -I {} mv {} {}.bak
My personal opinion is that there are almost always alternatives to using xargs (such as the -exec argument to find) and you will be better served by learning those.
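For comparison, here is a sketch of the same rename done with find -exec alone, run in a scratch directory so nothing under /etc is touched. The file names are hypothetical.

```shell
tmp=$(mktemp -d)
: > "$tmp/a.txt"
: > "$tmp/b c.txt"

# $1 is the file name handed over by find; no xargs parsing is involved,
# so the name with a space needs no special treatment.
find "$tmp" -name '*.txt' -exec sh -c 'mv -- "$1" "$1.bak"' sh {} \;

renamed=$(ls "$tmp" | sort)
echo "$renamed"
rm -rf "$tmp"
```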

When you use piping without xargs, the actual data is fed into the next command. On the other hand, when using piping with xargs, the actual data is viewed as a parameter to the next command. To give a concrete example, say you have a folder with a.txt and b.txt. a.txt contains just a single line 'hello world!', and b.txt is just empty.
If you do
ls | grep txt
you would end up getting the output:
a.txt
b.txt
Yet, if you do
ls | xargs grep txt
you would get nothing since neither file a.txt nor b.txt contains the word txt.
If the command is
ls | xargs grep hello
you would get:
hello world!
That's because with xargs, the two filenames given by ls are passed to grep as arguments, rather than the actual content.

Short answer: Avoid xargs for now. Return to xargs when you have written dozens or hundreds of scripts.
Commands can get their input from parameters (like rm bad_example) or from stdin (not just the y you answer after rm -i is_this_bad_too, but also read answer). Other commands, like grep and sed, look at their parameters first and, when no input file is named there, fall back to reading stdin.
Your grep example works fine reading from stdin, nothing special needed.
Your ls needs the output of find as a parameter. xargs is just one way to turn things around. Use man xargs for more about xargs. Alternatives:
find /etc -name "*.txt" -exec ls -l {} \;
find /etc -name "*.txt" -ls
ls -l $(find /etc -name "*.txt" )
ls /etc/*.txt
First try to see which of these commands is best when you have a nasty filename with spaces.txt in /etc.

xargs(1) is dangerous (broken, exploitable, etc.) when reading non-NUL-delimited input.
If you're working with filenames, use find's -exec [command] {} + instead.
If you can get NUL-delimited output, use xargs -0.

GNU Parallel can do the same as xargs, but does not have the broken and exploitable "features".
You can learn GNU Parallel by looking at examples http://www.gnu.org/software/parallel/man.html#EXAMPLE:-Working-as-xargs--n1.-Argument-appending and walking through the tutorial http://www.gnu.org/software/parallel/parallel_tutorial.html

Find, grep, and execute - all in one?

This is the command I've been using for finding matches (queryString) in php files, in the current directory, with grep, case insensitive, and showing matching results in line:
find . -iname "*php" -exec grep -iH queryString {} \;
Is there a way to also pipe just the file name of the matches to another script?
I could probably run the -exec command twice, but that seems inefficient.
What I'd love to do on Mac OS X is then actually to "reveal" that file in the finder. I think I can handle that part. If I had to give up the inline matches and just let grep show the files names, and then pipe that to a third script, that would be fine, too - I would settle.
But I'm actually not even sure how to pipe the output (the matched file names) to somewhere else...
Help! :)
Clarification
I'd like to reveal each of the files in a Finder window - so I'm probably not going to be using the -q flag and stopping at the first one.
I'm going to run this in the console; ideally I'd like to see the inline matches printed out there, as well as being able to pipe them to another script, like osascript (AppleScript, to reveal them in the Finder). That's why I have been using -H - because I like to see both the file name and the match.
If I had to settle for just using -l so that the file name could more easily be piped to another script, that would be OK, too. But I think after looking at the reply below from @Charlie Martin, that xargs could be helpful here in doing both at the same time with a single find, and single grep command.
I did say bash but I don't really mind if this needs to be run as /bin/sh instead - I don't know too much about the differences yet, but I do know there are some important ones.
Thank you all for the responses, I'm going to try some of them at the command line and see if I can get any of them to work and then I think I can choose the best answer. Leave a comment if you want me to clarify anything more.
Thanks again!

You bet. The usual thing is something like
$ find /path -name pattern -print | xargs command
So you might for example do
$ find . -name '*.[ch]' -print | xargs grep -H 'main'
(Quiz: why -H?)
You can carry this further; for example, you might use
$ find . -name '*.[ch]' -print | xargs grep -H 'main' | cut -d ':' -f 1
to get the vector of file names for files that contain 'main', or
$ find . -name '*.[ch]' -print | xargs grep -H 'main' | cut -d ':' -f 1 |
xargs growlnotify -
to have each name become a Growl notification.
You could also do
$ grep pattern `find /path -name pattern`
or
$ grep pattern $(find /path -name pattern)
(in bash(1) at least these are equivalent) but you can run into limits on the length of a command line that way.
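One detail of the cut approach is worth seeing in action: grep -H prints one line per match, so a file with several matches is listed several times, and cut -d ':' also misfires on paths that contain a colon. The sketch below (file names and contents invented) compares it with grep -l, which lists each matching file once.

```shell
tmp=$(mktemp -d)
printf 'int main(void)\n/* main entry */\n' > "$tmp/a.c"
printf 'int helper(void)\n'                 > "$tmp/b.c"

# a.c matches twice, so cut yields it twice; sort -u collapses the duplicates.
via_cut=$(find "$tmp" -name '*.[ch]' -exec grep -H 'main' {} + | cut -d ':' -f 1 | sort -u)

# grep -l prints each matching file once and needs no post-processing.
via_l=$(find "$tmp" -name '*.[ch]' -exec grep -l 'main' {} +)

echo "cut + sort -u: $via_cut"
echo "grep -l:       $via_l"
rm -rf "$tmp"
```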
Update
To answer your questions:
(1) You can do anything in bash you can do in sh. The one thing I've mentioned that would be any different is the use of $(command) in place of using backticks around command, and that works in the version of sh on Macs. The csh, zsh, ash, and fish are different.
(2) I think merely doing $ open $(dirname arg) will open a Finder window on the containing directory.

It sounds like you want to open all *.php files that contain querystring from within a Terminal.app session.
You could do it this way:
find . -name '*.php' -exec grep -li 'querystring' {} \; | xargs open
With my setup, this opens MacVim with each file on a separate tab. YMMV.

Replace -H with -l and you will get a list of those filenames that matched the pattern.

If you have bash 4, enable globstar (shopt -s globstar) and simply do
grep pattern /path/**/*.php
The ** pattern recurses into subdirectories, much like
grep pattern `find -name \*.php -print`
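A minimal sketch of the ** form, assuming bash 4 or later is installed. globstar is off by default, so it has to be enabled first. The command is run through bash -c here only so that the example behaves the same when pasted into another shell, and the directory layout is invented.

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/src/lib"
printf 'pattern\n' > "$tmp/top.php"
printf 'pattern\n' > "$tmp/src/lib/deep.php"

# With globstar on, **/ matches zero or more directory levels, so both the
# top-level file and the deeply nested one are passed to grep.
hits=$(bash -c 'shopt -s globstar; grep -l pattern "$1"/**/*.php' bash "$tmp" | wc -l)

echo "files matched: $hits"
rm -rf "$tmp"
```

As with the backtick version, a very large tree can still exceed the command-line length limit, since the shell expands the glob before grep starts.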

find /home/aaronmcdaid/Code/ -name '*.cpp' -exec grep -q -iH boost {} \; -exec echo {} \;
The first change I made is to add -q to your grep command. This is "Exit immediately with zero status if any match is found".
The good news is that this speeds up grep when a file has many matching lines, since you don't care how many matches there are. But that means we need another -exec on the end to actually print the filenames when grep has been successful.
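The same gating works with find's own -print in place of the second -exec. A sketch in a scratch directory, with made-up file names:

```shell
tmp=$(mktemp -d)
printf 'uses boost\nboost again\n' > "$tmp/a.cpp"
printf 'plain\n'                   > "$tmp/b.cpp"

# -exec ... \; is true only when grep exits 0, so -print runs only for
# files that contain a match.
matched=$(find "$tmp" -type f -name '*.cpp' -exec grep -qi boost {} \; -print)

echo "$matched"
rm -rf "$tmp"
```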

The grep result will be sent to stdout, so another -exec predicate is probably the best solution here.

Pipe to another script:
find . -iname "*.php" | myScript
File names will come into the stdin of myScript 1 line at a time.
You can also use xargs to form/execute commands to act on each file:
find . -iname "*.php" | xargs ls -l
act on files you find that match:
find . -iname "*.php" | xargs grep -l pattern | myScript
act on files that don't match the pattern:
find . -iname "*.php" | xargs grep -L pattern | myScript

In general, using multiple -exec primaries with grep -q will be far faster than piping, since find puts an implicit, short-circuiting -a between each pair of adjacent expressions that isn't separated by an explicit operator. The main problem here is that you want something to happen if grep matches AND you want the matches printed. If the files are reasonably sized, then this should be faster (because grep -q exits after finding a single match):
find . -iname "*php" -exec grep -iq queryString {} \; -exec grep -iH queryString {} \; -exec otherprogram {} \;
If the files are particularly big, encapsulating it in a shell script may be faster than running multiple grep commands:
find . -iname "*php" -exec bash -c \
'out=$(grep -iH queryString "$1"); [[ -n $out ]] && echo "$out" && exit 0 || exit 1' \
bash {} \; -print
Also note, if the matches are not particularly needed, then
find . -iname "*php" -exec grep -iq queryString {} \; -exec otherprogram {} \;
will virtually always be faster than a piped solution like
find . -iname "*php" -print0 | xargs -0 grep -iH queryString | ...
Additionally, you should really have -type f in all cases, unless you want to catch *php directories
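Since grep -H both prints the matches and exits 0 only when there is one, a single grep per file can serve as the test that gates the follow-up command. This is a sketch rather than a benchmark: echo stands in for the hypothetical otherprogram, and the files are invented.

```shell
tmp=$(mktemp -d)
printf 'a queryString here\n' > "$tmp/x.php"
printf 'nothing\n'            > "$tmp/y.php"

# grep -iH prints the match with its file name; its exit status decides
# whether the second -exec runs for that file.
out=$(find "$tmp" -type f -iname '*php' -exec grep -iH queryString {} \; -exec echo reveal: {} \;)

echo "$out"
rm -rf "$tmp"
```

The output interleaves the matching lines with one "reveal:" line per matching file, so a downstream script would still need to tell the two apart.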

As for which is faster: if you actually care about the small time difference (for example, to save some processor time), prefix each command with time and compare the results.

Write a shell script that find-greps and outputs filename and content in 1 line

To see all the php files that contain "abc" I can use this simple script:
find . -name "*php" -exec grep -l abc {} \;
I can omit the -l, and then I get the matching lines instead of the filenames as results:
find . -name "*php" -exec grep abc {} \;
What I would like now is a version that does both at the same time, but on the same line.
Expected output:
path1/filename1: lorem abc ipsum
path2/filename2: ipsum abc lorem
path3/filename3: non abc quod
More or less like grep abc * does.
Edit: I want to use this as a simple shell script. It would be great if the output is on one line, so further grepping would be possible. But it is not necessary that the script is only one line, i am putting it in a bash script file anyways.
Edit 2: Later I found "ack", which is a great tool and I use this now in most cases instead of grep. It does all this and more. http://betterthangrep.com/ You would write ack --php --nogroup abc to get the desired result

Use the -H switch (man grep):
find . -name "*php" -exec grep -H abc {} \;
Alternative using xargs (now the -H switch is not needed, at least for the version of grep I have here):
find . -name "*php" -print | xargs grep abc
Edit: As a consequence of grep's behavior as noted by orsogufo, the second command above should use -H if find could conceivably return only a single filename (i.e. if there is only a single PHP file). If orsogufo's comment w.r.t. -print0 is also incorporated, the command becomes:
find . -name "*php" -print0 | xargs -0 grep -H abc
Edit 2: A (more1) POSIX compliant version as proposed by Jonathan Leffler, which through the use of /dev/null avoids the -H switch:
find . -name "*php" -print0 | xargs -0 grep abc /dev/null
1: A quote from the opengroup.org manual on find hints that -print0 is non-standard:
A feature of SVR4's find utility was the -exec primary's + terminator. This allowed filenames containing special characters (especially <newline>s) to be grouped together without the problems that occur if such filenames are piped to xargs. Other implementations have added other ways to get around this problem, notably a -print0 primary that wrote filenames with a null byte terminator. This was considered here, but not adopted. Using a null terminator meant that any utility that was going to process find's -print0 output had to add a new option to parse the null terminators it would now be reading.
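The /dev/null trick can be checked in isolation. In this sketch (the file name and content are made up), grep is run once with a single file operand and once with /dev/null added as a second operand:

```shell
tmp=$(mktemp -d)
printf 'lorem abc ipsum\n' > "$tmp/only.php"

# One file operand: grep omits the file name.
bare=$(grep abc "$tmp/only.php")

# Two operands (/dev/null is empty and never matches): grep prefixes the name.
named=$(grep abc /dev/null "$tmp/only.php")

echo "$bare"
echo "$named"
rm -rf "$tmp"
```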

If you don't need to recursively search, you can just do..
grep -H abc *.php
..which gives you the desired output. -H is the default behaviour (at least on the OS X version of grep), so you can omit this:
grep abc *.php
You can grep recursively using the -R flag, but with -R alone you're unable to limit it to .php files:
grep -R abc *
Again, this has the same desired output.
I know this doesn't exactly answer your questions, it's just.. an alternative... The above are just grep with a single flag, so are easier to remember than find/-exec/grep/xargs combinations! (irrelevant for a script, but useful for day-to-day shell'ing)
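If your grep supports the --include option (GNU grep does, and BSD grep appears to as well; treat that as an assumption to check against your man page), the recursive form can be restricted to .php files. A sketch with invented files:

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/sub"
printf 'abc\n' > "$tmp/sub/page.php"
printf 'abc\n' > "$tmp/sub/notes.txt"

# --include filters the files visited during recursion by glob pattern,
# so notes.txt is skipped even though it contains the string.
found=$(grep -R --include='*.php' abc "$tmp")

echo "$found"
rm -rf "$tmp"
```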

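find /path -type f -name "*.php" | awk '
{
    file = $0
    # Read the named file line by line; print each matching line with its file name.
    while ((getline line < file) > 0) {
        if (line ~ /time/) {
            print file ":" line
            # do some other things here
        }
    }
    close(file)  # release the file before moving on to the next one
}'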
