How to use >> inside find -exec statement? - bash

From time to time I have to append some text at the end of a bunch of files. I would normally find these files with find.
I've tried
find . -type f -name "test" -exec tail -n 2 /source.txt >> {} \;
This, however, results in the last two lines of /source.txt being appended to a file named {}, once for every file matching the search criteria.
I guess I have to escape >> somehow but so far I wasn't successful.
Any help would be greatly appreciated.

-exec only takes one command (with optional arguments) and you can't use any bash operators in it.
So you need to wrap it in a bash -c '...' block, which executes everything between '...' in a new bash shell.
find . -type f -name "test" -exec bash -c 'tail -n 2 /source.txt >> "$1"' bash {} \;
Note: everything after the '...' is passed to that shell as regular arguments, except that they start at $0 instead of $1. So the word bash after the closing quote is a filler for $0: it makes arguments and error reporting work the way you'd expect in a regular shell, i.e. $1 is the first argument and error messages start with bash (or whatever meaningful name you put there).
If execution time is an issue, consider doing something like export variable="$(tail -n 2 /source.txt)" and using "$variable" in the -exec. This will also always write the same thing, unlike using tail in -exec, which could change if the file changes. Alternatively, you can use something like -exec ... + and pair it with tee to write to many files at once.
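For instance, a sketch of both alternatives (the variable name lines is my own; tee -a appends its stdin to every file it is given):
# read the source lines once, then reuse them:
export lines="$(tail -n 2 /source.txt)"
find . -type f -name "test" -exec bash -c 'printf "%s\n" "$lines" >> "$1"' bash {} \;
# or batch with +, so tail and tee run once per batch of files rather than once per file:
find . -type f -name "test" -exec bash -c 'tail -n 2 /source.txt | tee -a "$@" > /dev/null' bash {} +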

A more efficient alternative (assuming bash 4):
shopt -s globstar
to_augment=( **/test )
tail -n 2 /source.txt | tee -a "${to_augment[@]}" > /dev/null
First, you create an array with all the file names, using a simple pattern that should be equivalent to your call to find. Then, use tee to append the desired lines to all those files at once.
If you have more criteria for the find command, you can still use it; this version is not foolproof, as it assumes no filename contains a newline, but fixing that is best left to another question.
while read -r fname; do
    to_augment+=( "$fname" )
done < <(find ...)
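For reference, a NUL-safe sketch of the same loop (this pairs find's -print0 with read -d ''; the find arguments stay elided as above):
while IFS= read -r -d '' fname; do
    to_augment+=( "$fname" )
done < <(find ... -print0)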

Related

Bash - iterate through output lines

What I want: find all the nginx access log files and iterate over them (getting some data from each).
I'm stuck at for loop:
#!/bin/bash
logfiles="$(find /var/log/nginx -name 'access.log*')"
for lf in "$logfiles"
do
echo "file"
done
Output is only one "file" word, even though there is more than one log file. What's wrong?
when you say
for lf in "$logfiles"
your quotes preserve the whitespace within find's output. The quotes, in this case, are incorrect. Removing them will properly iterate over the files:
$ for i in "`find . -iname '*.log'`"; do echo $i; done
./2.log ./3.log ./1.log
$ for i in `find . -iname '*.log'`; do echo $i; done
./2.log
./3.log
./1.log
But there's a much better way: you should stream your data instead of iterating. Consider this pattern:
$ find . -iname '*.log' | xargs -n 1 echo
./2.log
./3.log
./1.log
It's very much worth wrapping your head around xargs, which turns its standard input into arguments appended to the command you give it, and then executes that command. In this simple case, I'm telling xargs to run the command echo separately for each one (-n 1) of the files.
There are a few reasons xargs is my go-to iteration operator whenever possible: firstly, it's very smart. Iterating over command output with for i in $(command) requires $(command) to provide your list in the form item1 item2 item3, which causes problems if any of the items contain special characters, since bash then interprets those as part of the for arguments.
Here is an example with a space, which bash typically treats as special because it is a valid input field separator.
$ for i in `find . -iname '*.log'`; do echo $i; done
./4
tricky.log
./2.log
./3.log
./1.log
The file 4 tricky.log, containing a space, has now caused a problem.
xargs can be told to keep such items separate. In some cases you can work around the problem by changing $IFS, the input field separator, but that gets messy fast. With xargs you have a better option: the -0 option tells xargs that items in its input stream are terminated by the null character. Other programs, notably find, can produce null-terminated output to match what xargs expects. In this sense, xargs and find are a great combination:
$ find . -iname '*.log' -print0 | xargs -0 -n 1 echo
./4 tricky.log
./2.log
./3.log
./1.log
But wait, there's more! The next step in your command will surely be to grep the files looking for whatever matching lines you wish to find. If your files are large, you'll want to parallelize too; xargs can do that as well, via its -P option. You can add more steps to the pipeline for filtering, etc.
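A sketch of that next step ('pattern' is a stand-in, and -P requires GNU or BSD xargs):
find . -iname '*.log' -print0 | xargs -0 -n 8 -P 4 grep -H 'pattern'
# -n 8 batches eight files per grep, -P 4 runs up to four greps at once,
# and -H forces grep to prefix each match with its file name.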
Finally, using command substitution $() to build program arguments can lead to unintended commands if you don't very carefully guard against failure cases. I once wrote a script that used $() to find mysql's source directory to do some first-time setup. It said something like rm -r /$(find / -iname mysqldir). Well, when there was no mysqldir in the expected location, that turned into rm -r /. Not what I intended, obviously: d'oh!
That's why I use and encourage others to use xargs whenever possible.
lose the quotes in this line: for lf in $logfiles
But it looks like you may have only one file named access.log

Weird behaviour with bash script for file arguments

So I built a super simple script to allow me to search across all directories relative to the one the script is run from that will find the first argument and replace it with the second one:
#!/usr/local/bin/bash -f
word_to_look_for=$1
substitue=$2
find ./ -type f -exec sed -i "" "s/$word_to_look_for/$substitute/g" {} \;
echo "Replaced ($word_to_look_for) with ($substitue)"
For some reason though, this bit -> "s/$word_to_look_for/$substitute/g"
would only output as s/wordImlookingfor//g and as result sed would replace it with empty text, to get this to work as intended I had change the script to the following:
sed_arg="s/$word_to_look_for"
sed_arg="$sed_arg/$substitue/g"
find ./ -type f -exec sed -i "" "$sed_arg" {} \;
echo "Replaced ($word_to_look_for) with ($substitue)"
I'm just wondering, why did bash not seem to like the way I had it in the first version?
You misspelled substitute as substitue everywhere except for one place:
"s/$word_to_look_for/$substitute/g"
So bash expanded the variable $substitute, which was never set, while the other variable ($substitue) was set but never used.
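As an aside, one way to catch this class of bug early is bash's nounset option; a minimal sketch:
set -u    # or: set -o nounset
echo "s/$word_to_look_for/$substitute/g"    # now aborts with: substitute: unbound variable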

Iterate over specific files in a directory using Bash find

Shellcheck doesn't like my for over find loop in Bash.
for f in $(find $src -maxdepth 1 -name '*.md'); do wc -w < "$f" >> $path/tmp.txt; done
It suggests instead:
1 while IFS= read -r -d '' file
2 do
3 let count++
4 echo "Playing file no. $count"
5 play "$file"
6 done < <(find mydir -mtime -7 -name '*.mp3' -print0)
7 echo "Played $count files"
I understand most of it, but some things are still unclear.
In line one: What is '' file?
In line six: What does the empty space do in < < (find). Are the < redirects, as usual? If they are, what does it mean to redirect into do block?
Can someone help parse this out? Is this the right way to iterate over files of a certain kind in a directory?
In line one: What is '' file?
According to help read, that '' is an argument to the -d parameter:
-d delim continue until the first character of
DELIM is read, rather than newline
In line six: What does the empty space do in < < (find).
There are two separate operators there. There is <, the standard I/O redirection operator, followed by <(...), a bash-specific construct that performs process substitution:
Process Substitution
Process substitution is supported on systems that
support named pipes (FIFOs) or the /dev/fd method of naming
open files. It takes the form of <(list) or >(list). The
process list is run with its input or output connected
to a FIFO or some file in /dev/fd...
So this is sending the output of the find command into the do loop.
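If it helps to see process substitution on its own, a classic standalone example (file names here are hypothetical):
diff <(sort file1) <(sort file2)    # each <(sort ...) appears to diff as a readable file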
Are the < redirects, as usual? If they are, what does it mean to redirect into do block?
Redirecting into a loop means that any command inside that loop that
reads from stdin will read from the redirected input source. Note one
subtlety about variable scope: piping into a loop (find ... | while
read ...) runs the loop body in a subshell, so variables set inside it
(like count above) would not be visible after the loop. The < <(...)
redirection shown here runs the loop in the current shell, which is
precisely why shellcheck suggests this form.
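A quick demonstration of the difference, reusing count from the snippet above:
count=0
find mydir -name '*.mp3' -print0 | while IFS= read -r -d '' f; do count=$((count+1)); done
echo "$count"    # prints 0: the piped-to loop body ran in a subshell
count=0
while IFS= read -r -d '' f; do count=$((count+1)); done < <(find mydir -name '*.mp3' -print0)
echo "$count"    # prints the real count: the loop ran in the current shell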
Can someone help parse this out? Is this the right way to iterate over files of a certain kind in a directory?
For the record, I would typically do this by piping find to xargs,
although which solution is best depends to a certain extent on what
you're trying to do. The two examples in your question do completely
different things, and it's not clear what you're actually trying to
accomplish.
But for example:
find $src -maxdepth 1 -name '*.md' -print0 |
xargs -0 -iDOC wc -w DOC
This would run wc on all the *.md files. The -print0 to find
(and the -0 to xargs) permit this command to correctly handle
filenames with embedded whitespace (e.g., This is my file.md). If
you know you don't have any of those, you just do:
find $src -maxdepth 1 -name '*.md' |
xargs -iDOC wc -w DOC
Generally, you need to use find if you want to do a recursive search through a directory tree (although with modern bash, you can set the shell option globstar, as shellcheck suggests). But in this case you've specified -maxdepth 1, so your find command is just listing files which match the pattern "$src"/*.md. That being the case, it is much simpler and more reliable to use the glob (pattern):
for f in "$src"/*.md; do
wc -w < "$f"
done >> "$path"/tmp.txt
(I also quoted all the variable expansions, for safety, and moved the output redirection so it applies to the entire for loop, which is slightly more efficient.)
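For the recursive case mentioned above, a globstar sketch (bash 4+; with globstar set, ** matches any number of directory levels):
shopt -s globstar
for f in "$src"/**/*.md; do
    wc -w < "$f"
done >> "$path"/tmp.txt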
If you need to use find (because a glob won't work), then you should use the -exec option to find, which doesn't require fiddling around with other options to avoid mishandling special characters in filenames. For example, you could do this:
find "$src" -maxdepth 1 -name '*.md' -exec wc -w {} + >> "$path"/tmp.txt
To answer your specific questions:
In IFS= read -r -d '' file, the '' is the argument to the -d option. That option is used to specify the character which delimits lines to be read; by default, a newline character is used so that read reads one line at a time. The empty string is the same as specifying the NUL character, which is what find outputs at the end of each filename if you specify the -print0 option. (Unlike -exec, -print0 is not Posix standard so it is not guaranteed to work with every find implementation, but in practice it's pretty generally available.)
The space between < and <(...) is to avoid creating the token <<, which would indicate a here-document. Instead, it specifies a redirection (<) from a process substitution (<(...)).

Piping output of bash function

I'm trying to connect the inputs/outputs of two bash functions with a
pipe. Here is a complete program which illustrates my issue:
function print_info {
files=$(ls);
echo $files;
}
touch "file.pattern"
print_info | grep "pattern"
rm -f file.pattern
But this simply outputs a list of all files, not those that match
"pattern". Can anyone help me understand why?
The reason this isn't working is that in
echo $files;
the unquoted variable $files undergoes word splitting (i.e., it is expanded into individual arguments to echo), and echo prints the resulting tokens separated by single spaces. That means its output is a single line, and grep matches against that one line.
The least invasive fix is to use
echo "$files";
Don't parse the output of the ls command. You could do the same using the find command, like:
find . -maxdepth 1 -type f -exec grep "pattern" {} \;
If you are getting file names from a function, then do it like:
grep "pattern" $(print_info)

calling grep from a bash script

I'm new to bash scripts (and the *nix shell altogether) but I'm trying to write this script to make grepping a codebase easier.
I have written this
#!/bin/bash
args=("$#");
for arg in args
grep arg * */* */*/* */*/*/* */*/*/*/*;
done
when I try to run it, this is what happens:
~/Work/richmond $ ./f.sh "\$_REQUEST\['a'\]"
./f.sh: line 4: syntax error near unexpected token `grep'
./f.sh: line 4: ` grep arg * */* */*/* */*/*/* */*/*/*/*;'
~/Work/richmond $
How do I do this properly?
And, I think a more important question is, how can I make grep recurse through subdirectories properly like this?
Any other tips and/or pitfalls with shell scripting and using bash in general would also be appreciated.
The syntax error is because you're missing do. As for searching recursively: if your grep has the -R option, you would do:
#!/bin/bash
for arg in "$#"; do
grep -R "$arg" *
done
Otherwise you could use find:
#!/bin/bash
for arg in "$#"; do
find . -exec grep "$arg" {} +
done
In the latter example, find will execute grep and replace the {} braces with the file names it finds, starting in the current directory ..
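For comparison, the two -exec terminators differ in how often grep is launched:
find . -exec grep "$arg" {} \;    # one grep process per file
find . -exec grep "$arg" {} +     # many files per grep process, so far fewer launches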
(Notice that I also changed arg to "$arg". You need the dollar sign to get the variable's value, and the quotes tell the shell to treat its value as one big word, even if $arg contains spaces or newlines.)
On recursive grepping:
Depending on your grep version, you can pass -R to your grep command to have it search Recursively (in subdirectories).
The best solution is stated above, but try putting your statement in back ticks:
`grep ...`
You should use 'find' plus 'xargs' to do the file searching.
for arg in "$#"
do
find . -type f -print0 | xargs -0 grep "$arg" /dev/null
done
The '-print0' and '-0' options assume you're using GNU find and GNU xargs, and they ensure that the script works even if there are spaces or other unexpected characters in your path names. Using xargs like this is more efficient than having find execute grep once per file; the /dev/null appears in the argument list so grep always has at least two files to search and therefore always reports the name of the file containing the match.
You might decide to simplify life - perhaps - by combining all the searches into one using either egrep or grep -E. An optimization would be to capture the output from find once and then feed that to xargs on each iteration.
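A sketch of that combined search, assuming the script's arguments are fixed strings with no regex metacharacters (they get joined with | into one alternation):
pattern=$(IFS='|'; echo "$*")
find . -type f -print0 | xargs -0 grep -E "$pattern" /dev/null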
Have a look at the findrepo script, which may give you some pointers.
If you just want a better grep and don't want to do anything yourself, use ack, which you can get at http://betterthangrep.com/.
