calling grep from a bash script

I'm new to bash scripts (and the *nix shell altogether) but I'm trying to write this script to make grepping a codebase easier.
I have written this
#!/bin/bash
args=("$#");
for arg in args
grep arg * */* */*/* */*/*/* */*/*/*/*;
done
when I try to run it, this is what happens:
~/Work/richmond $ ./f.sh "\$_REQUEST\['a'\]"
./f.sh: line 4: syntax error near unexpected token `grep'
./f.sh: line 4: ` grep arg * */* */*/* */*/*/* */*/*/*/*;'
~/Work/richmond $
How do I do this properly?
And, I think a more important question is, how can I make grep recurse through subdirectories properly like this?
Any other tips and/or pitfalls with shell scripting and using bash in general would also be appreciated.

The syntax error is because you're missing do. As for searching recursively: if your grep has the -R option, you would do:
#!/bin/bash
for arg in "$#"; do
grep -R "$arg" *
done
Otherwise you could use find:
#!/bin/bash
for arg in "$#"; do
find . -exec grep "$arg" {} +
done
In the latter example, find executes grep, replacing the {} with the file names it finds, starting in the current directory . (the + terminator passes a batch of file names to each grep invocation rather than running it once per file).
(Notice that I also changed arg to "$arg". You need the dollar sign to get the variable's value, and the quotes tell the shell to treat its value as one big word, even if $arg contains spaces or newlines.)
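For instance, with a hypothetical two-word argument:
arg='foo bar'
grep -R $arg .     # unquoted: greps for "foo" in a path named "bar" and in .
grep -R "$arg" .   # quoted: greps for the single pattern "foo bar" in .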

On recursive grepping:
Depending on your grep version, you can pass -R to your grep command to have it search recursively (through subdirectories).
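For example, with the pattern from the question (GNU and BSD grep both support -R):
grep -R "\$_REQUEST\['a'\]" .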

The best solution is stated above, but try putting your statement in backticks:
`grep ...`

You should use 'find' plus 'xargs' to do the file searching.
for arg in "$#"
do
find . -type f -print0 | xargs -0 grep "$arg" /dev/null
done
The '-print0' and '-0' options assume you're using GNU find and GNU xargs, and they ensure that the script works even if there are spaces or other unexpected characters in your path names. Using xargs like this is more efficient than having find execute grep once per file; the /dev/null appears in the argument list so grep always reports the name of the file containing the match.
You might decide to simplify life - perhaps - by combining all the searches into one using either egrep or grep -E. An optimization would be to capture the output from find once and then feed that to xargs on each iteration.
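A minimal sketch of both ideas combined, assuming the patterns arrive as the script's arguments:
#!/bin/bash
# Join all arguments into one alternation such as foo|bar|baz
# (naive: patterns containing regex metacharacters would need escaping),
# so find and grep each run only once.
pattern=$(IFS='|'; printf '%s' "$*")
find . -type f -print0 | xargs -0 grep -E "$pattern" /dev/null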

Have a look at the findrepo script, which may give you some pointers.

If you just want a better grep and don't want to do anything yourself, use ack, which you can get at http://betterthangrep.com/.

Related

Cannot iterate associative array keys in PKGBUILD

I am working with a PKGBUILD file for the AUR. I have a lot of colors that need to be replaced in different files in the $pkgsrc directory and I wanted to use an associative array.
declare -A _BLACKISH_REPLACEMENTS
_BLACKISH_REPLACEMENTS['#242424']='#1C1C1C'
_BLACKISH_REPLACEMENTS['#333333']='#292929'
_BLACKISH_REPLACEMENTS['#999999']='#787878'
_BLACKISH_REPLACEMENTS['#555555']='#4C4C4C'
_BLACKISH_REPLACEMENTS['#373737']='#2E2E2E'
_BLACKISH_REPLACEMENTS['#434343']='#383838'
_BLACKISH_REPLACEMENTS['#3E3E3E']='#333333'
_BLACKISH_REPLACEMENTS['#383838']='#2E2E2E'
_BLACKISH_REPLACEMENTS['#313131']='#262626'
_BLACKISH_REPLACEMENTS['#101010']='#101010'
_BLACKISH_REPLACEMENTS['#3B3B3B']='#303030'
_BLACKISH_REPLACEMENTS['#2A2A2A']='#1F1F1F'
_BLACKISH_REPLACEMENTS['#656565']='#575757'
_BLACKISH_REPLACEMENTS['#767676']='#5E5E5E'
_BLACKISH_REPLACEMENTS['#868686']='#787878'
_BLACKISH_REPLACEMENTS['#636363']='#595959'
_BLACKISH_REPLACEMENTS['#696969']='#5E5E5E'
_BLACKISH_REPLACEMENTS['#707070']='#666666'
_BLACKISH_REPLACEMENTS['#767676']='#6B6B6B'
_BLACKISH_REPLACEMENTS['#C1C1C1']='#B8B8B8'
_BLACKISH_REPLACEMENTS['#C6C6C6']='#BDBDBD'
That seems like a fairly clean solution; otherwise I would have many variables, which is less than ideal. Now, I iterate over these with the syntax found in other SO posts:
_blackish_replace() (
    shopt -s globstar
    echo "${!_BLACKISH_REPLACEMENTS[@]}"
    echo "${_BLACKISH_REPLACEMENTS[@]}"
    for file in "$1"/**/*.scss; do
        echo "Replacing colors in file: $file"
        for color in "${!_BLACKISH_REPLACEMENTS[@]}"; do
            echo "$color"
            sed -i "s;$color;${_BLACKISH_REPLACEMENTS["$color"]};gI" "$file"
        done
    done
)
It looks good to me, and when this is run in a standalone script, it does indeed replace the correct matches in the correct files.
However, when using it from makepkg, it fails silently, hence the four echo calls exhibited.
The first two output newlines. This leads me to believe they are undefined?
The iteration over the glob expansion has proved to be working; however, echo "$color" is never reached, and the inner loop iterates nothing.
I thought maybe makepkg was using the system shell; in that case, running the code directly from my user shell (zsh) fails with event not found: _BLACKISH_REPLACEMENTS or something similar (off the top of my head).
I asked in the Arch Linux Discord server if makepkg uses the locally available bash, and was assured it does. I am very confused.
It is probably a good idea to turn your array into a sed script before iterating the files:
#!/usr/bin/env bash
declare -A _BLACKISH_REPLACEMENTS=(
['#242424']='#1C1C1C'
['#333333']='#292929'
['#999999']='#787878'
['#555555']='#4C4C4C'
['#373737']='#2E2E2E'
['#434343']='#383838'
['#3E3E3E']='#333333'
['#383838']='#2E2E2E'
['#313131']='#262626'
['#101010']='#101010'
['#3B3B3B']='#303030'
['#2A2A2A']='#1F1F1F'
['#656565']='#575757'
['#767676']='#5E5E5E'
['#868686']='#787878'
['#636363']='#595959'
['#696969']='#5E5E5E'
['#707070']='#666666'
['#767676']='#6B6B6B'
['#C1C1C1']='#B8B8B8'
['#C6C6C6']='#BDBDBD'
)
sed_script=
for k in "${!_BLACKISH_REPLACEMENTS[@]}"; do
    v="${_BLACKISH_REPLACEMENTS[$k]}"
    sed_script+="s/$k/$v/g;"
done
shopt -s globstar nullglob
for file in "$1"/**/*.scss; do
sed -i.bak -e "$sed_script" "$file"
done
And here it is as a more practical, POSIX-shell-friendly one-liner:
find ./ -type f -name '*.scss' -exec sed -i.bak -e 's/#242424/#1C1C1C/g;s/#696969/#5E5E5E/g;s/#555555/#4C4C4C/g;s/#767676/#6B6B6B/g;s/#868686/#787878/g;s/#383838/#2E2E2E/g;s/#636363/#595959/g;s/#101010/#101010/g;s/#373737/#2E2E2E/g;s/#C6C6C6/#BDBDBD/g;s/#313131/#262626/g;s/#333333/#292929/g;s/#C1C1C1/#B8B8B8/g;s/#707070/#666666/g;s/#434343/#383838/g;s/#3E3E3E/#333333/g;s/#3B3B3B/#303030/g;s/#999999/#787878/g;s/#656565/#575757/g;s/#2A2A2A/#1F1F1F/g;' {} \;
To clarify the point of all the above: as you are unsure which shell brand is running your makepkg, the safe route is to choose the most portable shell code, sticking to POSIX-shell grammar and common tools and options, rather than a somewhat over-engineered associative array. The replacement instructions for sed can be laid out just as clearly as your associative array:
#!/usr/bin/env sh
# A plain string of sed replacement instructions
# is as compact and more portable than an associative array.
# It also saves from looping over each entry.
_BLACKISH_REPLACEMENTS='
s/#242424/#1C1C1C/g;
s/#696969/#5E5E5E/g;
s/#555555/#4C4C4C/g;
s/#767676/#6B6B6B/g;
s/#868686/#787878/g;
s/#383838/#2E2E2E/g;
s/#636363/#595959/g;
s/#101010/#101010/g;
s/#373737/#2E2E2E/g;
s/#C6C6C6/#BDBDBD/g;
s/#313131/#262626/g;
s/#333333/#292929/g;
s/#C1C1C1/#B8B8B8/g;
s/#707070/#666666/g;
s/#434343/#383838/g;
s/#3E3E3E/#333333/g;
s/#3B3B3B/#303030/g;
s/#999999/#787878/g;
s/#656565/#575757/g;
s/#2A2A2A/#1F1F1F/g;
'
_blackish_replace() {
    # Instead of iterating a bash-specific globstar expansion,
    # find -exec can replace it while sticking to the most
    # genuine POSIX-shell grammar.
    # The find and sed tools are used with their most common options,
    # avoiding GNU-specific extensions.
    find "$1" -type f -name '*.scss' -exec \
        sed -i.bak -e "$_BLACKISH_REPLACEMENTS" {} \;
}
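A hypothetical call from a PKGBUILD function (e.g. inside prepare()), assuming the sources live under $pkgsrc as in the question:
_blackish_replace "$pkgsrc"   # $pkgsrc is the question's source directory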

for loop is not giving expected output in sh whereas it is working fine if sh """#!/bin/bash +x is added to the script block

for newfile in `find . -type f ! -path "./data/*" ! -name new_changes.txt`; do
    if ! grep -q "\$newfile" new_changes.txt; then
        rm \$newfile;
    fi
done
The above code works fine if sh """#!/bin/bash +x is given at the start of the code block, but when that is commented out, it throws the error below:
rm: cannot remove '$newfile': No such file or directory
Any suggestions on how we can modify this for loop to work without sh """#!/bin/bash +x?
@alaniwi already explained in his comment the error in your rm command: you are trying to remove a file with the literal name $newfile, and you probably don't have any file whose name starts with a dollar sign.
The other problem is similar, but not identical: your grep command searches for the literal string $newfile, while you probably want to search for the string stored in the variable newfile. Hence you have to drop the \.
But this still means that the content of the variable newfile is subject to interpretation as a regular expression. For example, if newfile has the value abc.txt, grep would also succeed if new_changes.txt contained just abcdtxt, because the unescaped . matches any character. To avoid this, use the -F option to grep, which disables interpretation as a regexp.
And still there is one more error: say newfile has the value abc and new_changes.txt contains just xxxabc; they would still match, since abc is a substring of xxxabc. To avoid this, use the -x option to grep, which forces the match to cover the whole line.
Hence, your command should be grep -qFx "$newfile" new_changes.txt
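Putting it together, a sketch of the corrected loop outside any Jenkins quoting (so the \$ escapes go away). Using while/read on find's output instead of a for loop also survives file names containing spaces (though not embedded newlines), and this assumes new_changes.txt stores paths in the same ./-prefixed form that find prints:
find . -type f ! -path "./data/*" ! -name new_changes.txt |
while IFS= read -r newfile; do
    if ! grep -qFx "$newfile" new_changes.txt; then
        rm "$newfile"
    fi
done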

/bin/ls: Argument list too long

Attempting to convert a twitter account of over 10K tweets into another format with a bash script on a maxed out MBP 16" running the latest macOS.
After running for several minutes and outputting many periods, it says: line 43: /bin/ls: Argument list too long. I assume this issue relates to the number of tweets, and while I could attempt to break the input into smaller pieces as a last resort, not knowing the maximum number that avoids the error, I decided to first search for a solution.
I searched Google and SO and found "bash: /bin/ls: Argument list too long". If my issue is the same, it sounds like replacing "ls" with "find -name" may help. I tried that and got the same error, but perhaps my syntax is not correct.
The two lines that use "ls" currently are the following (the first is the one the error currently complains about):
for fileName in `ls ${thisDir}/dotwPosts/p*` ; do
and
printf "`ls ${thisDir}/dotwPosts/p* | wc -l` posts left to import.\n"
Tried changing the first line to the following, but got the same type of error (/usr/bin/find: Argument list too long):
for fileName in `find -name ${thisDir}/dotwPosts/p*` ; do
May need to provide additional code, but didn't want to make the question too specific to my needs and more general hopefully for others seeing this common error where the other stackoverflow answer didn't seem to apply.
To iterate over files in a directory in bash, print the filenames as a zero-separated stream and read from it. That way you don't need to store all the filenames at once anywhere:
find "${thisDir}/dotwPosts/" -maxdepth 1 -type f -name 'p*' -print0 |
while IFS= read -d '' -r file; do
printf "%s\n" "$file"
done
To get the count, output a single character for each file and count the characters (-printf is a GNU find extension):
find "${thisDir}/dotwPosts/" -maxdepth 1 -type f -name 'p*' -printf . | wc -c
Don't use backticks; their use is discouraged (see the Bash Hackers wiki entry on discouraged and deprecated syntax). Use $(...) instead.
for fileName in $(...) is a common antipattern in bash. If you want to iterate over the output of another command, you should most probably use a while IFS= read -r line loop instead; see BashFAQ 001, "How can I read a file (data stream, variable) line-by-line (and/or field-by-field)?"
Try this:
for file in "${thisDir}/dotwPosts/p"*
do
# exclude non plain files
[[ -f $file ]] || continue
# do something with "$file"
...
done
I quoted "${thisDir}/dotwPosts/p", so the variable thisDir can't contain wildcards that would be expanded, but blanks in it are handled correctly. If you do want wildcards in thisDir to be expanded, remove the quotes.
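To replace the second ls line (the count), a sketch using the same glob; the array length gives the count with no external commands (add the -f test first if directories might match the pattern):
files=( "${thisDir}/dotwPosts/p"* )
printf '%s posts left to import.\n' "${#files[@]}"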

Bash - iterate through output lines

What I want: find all the nginx access log files and iterate over them (get some data from them).
I'm stuck at for loop:
#!/bin/bash
logfiles="$(find /var/log/nginx -name 'access.log*')"
for lf in "$logfiles"
do
echo "file"
done
The output is only one "file" word, despite there being more than one log file. What's wrong?
When you say
for lf in "$logfiles"
the quotes preserve the whitespace within find's output, so the loop sees a single word. The quotes, in this case, are incorrect; removing them will properly iterate over the files:
$ for i in "`find . -iname '*.log'`"; do echo $i; done
./2.log ./3.log ./1.log
$ for i in `find . -iname '*.log'`; do echo $i; done
./2.log
./3.log
./1.log
But there's a much better way: you should stream your data instead of iterating. Consider this pattern:
$ find . -iname '*.log' | xargs -n 1 echo
./2.log
./3.log
./1.log
It's very much worth wrapping your head around xargs, which turns its standard input into additional arguments to append to its own, and then executes the result. In this simple case, I'm telling xargs to run the command echo individually for each one (-n 1) of the files.
There are a few reasons xargs is my go-to iteration operator whenever possible. Firstly, it's very smart: iterating over command output with for i in $(command) requires $(command) to provide your list in the form item1 item2 item3, causing problems if any of the items contain special characters, which are then interpreted by bash as part of the for arguments.
Here is an example where a space, which bash treats as a valid input field separator, causes exactly that problem.
$ for i in `find . -iname '*.log'`; do echo $i; done
./4
tricky.log
./2.log
./3.log
./1.log
The file 4 tricky.log, containing a space, has now caused a problem.
xargs can be smart enough to keep them separate. In some cases you can get around this by changing your $IFS, the input field separator, but that gets messy fast. With xargs you have better options: specifically, with the -0 option xargs treats the null character as the terminator of each item in its input stream. Other programs, namely find, can use the null character in their output to match what xargs expects. In this sense, xargs and find are a great combination:
$ find . -iname '*.log' -print0 | xargs -0 -n 1 echo
./4 tricky.log
./2.log
./3.log
./1.log
But wait, there's more! The next step in your command will surely be to grep the files looking for whatever matching lines you wish to find. If your files are large, you'll want to parallelize too; xargs can do this as well, and you can add more steps to the pipeline for filtering etc.
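For instance, a sketch of a parallel version (-P is supported by both GNU and BSD xargs):
find . -iname '*.log' -print0 | xargs -0 -n 1 -P 4 grep -H 'pattern'   # 'pattern' is a placeholder
Here -P 4 runs up to four grep processes at once, and -H makes grep print the file name even though each invocation receives a single file.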
Finally, using subshell substitution $() to build program arguments can lead to unintended commands when it isn't used very carefully to avoid unintentional arguments in failure cases. I once wrote a script that used $() to find MySQL's source directory to do some first-time setup. It said something like rm -r /$(find / -iname mysqldir). Well, if there's no mysqldir in the expected location, that turned into rm -r /. Not what I intended, obviously: d'oh!
That's why I use and encourage others to use xargs whenever possible.
Lose the quotes in this line: for lf in $logfiles
But it looks like you may have only one file named access.log.

How to use >> inside find -exec statement?

From time to time I have to append some text at the end of a bunch of files. I would normally find these files with find.
I've tried
find . -type f -name "test" -exec tail -n 2 /source.txt >> {} \;
This, however, results in writing the last two lines from /source.txt to a file named {}, however many times a file was found matching the search criteria.
I guess I have to escape >> somehow, but so far I haven't been successful.
Any help would be greatly appreciated.
-exec only takes one command (with optional arguments) and you can't use any bash operators in it.
So you need to wrap it in a bash -c '...' block, which executes everything between '...' in a new bash shell.
find . -type f -name "test" -exec bash -c 'tail -n 2 /source.txt >> "$1"' bash {} \;
Note: everything after the '...' block is passed to it as regular arguments, except that they start at $0 instead of $1. The extra bash after the quoted command is used as a placeholder for $0, to match how you would expect arguments and error processing to work in a regular shell, i.e. $1 is the first argument and error messages generally start with bash or something meaningful.
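A quick demonstration of that placeholder behaviour:
bash -c 'echo "\$0=$0  \$1=$1"' bash hello
# prints: $0=bash  $1=hello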
If execution time is an issue, consider doing something like export variable="$(tail -n 2 /source.txt)" and using "$variable" in the -exec. This will also always write the same thing, unlike using tail in -exec, which could change if the file changes. Alternatively, you can use something like -exec ... + and pair it with tee to write to many files at once.
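A sketch of that last idea: with the + terminator, each bash -c invocation receives a whole batch of file names, and tee -a appends the same two lines to all of them in one go:
find . -type f -name "test" -exec bash -c 'tail -n 2 /source.txt | tee -a "$@" > /dev/null' bash {} +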
A more efficient alternative (assuming bash 4):
shopt -s globstar
to_augment=( **/test )
tail -n 2 /source.txt | tee -a "${to_augment[@]}" > /dev/null
First, you create an array with all the file names, using a simple pattern that should be equivalent to your call to find. Then, use tee to append the desired lines to all those files at once.
If you have more criteria for the find command, you can still use it; this version is not foolproof, as it assumes no filename contains a newline, but fixing that is best left to another question.
while read -r fname; do
    to_augment+=( "$fname" )
done < <(find ...)
