I have written a small bash script called "isinFile.sh" for checking if the first term given to the script can be found in the file "file.txt":
#!/bin/bash
FILE="file.txt"
if [ `grep -w "$1" $FILE` ]; then
echo "true"
else
echo "false"
fi
However, running the script like
> ./isinFile.sh -x
breaks the script, since -x is interpreted by grep as an option.
So I improved my script
#!/bin/bash
FILE="file.txt"
if [ `grep -w -- "$1" $FILE` ]; then
echo "true"
else
echo "false"
fi
using -- as an argument to grep. Now running
> ./isinFile.sh -x
false
works. But is using -- the correct and only way to prevent code/option injection in bash scripts? I have not seen it in the wild, only found it mentioned in ABASH: Finding Bugs in Bash Scripts.
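grep -w -- ...
The -- tells grep that everything after it is an operand, so a pattern that begins with - is no longer parsed as an option.
EDIT
(Sorry, I did not read the last part.) Yes, -- is the usual way. The alternative is to make sure the pattern does not begin with a hyphen; e.g. ".{0}-x" works too, but it is odd. Note that the {0} interval needs extended regular expressions (-E); in grep's default basic syntax it has to be written \{0\}. So e.g.
grep -wE ".{0}$1" ...
should work too.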
There's actually another code injection (or whatever you want to call it) bug in this script: it simply hands the output of grep to the [ (aka test) command, and assumes that'll return true if it's not empty. But if the output is more than one "word" long, [ will treat it as an expression and try to evaluate it. For example, suppose the file contains the line 0 -eq 2 and you search for "0" -- [ will decide that 0 is not equal to 2, and the script will print false despite the fact that it found a match.
The best way to fix this is to use Ignacio Vazquez-Abrams' suggestion (as clarified by Dennis Williamson) -- this completely avoids the parsing problem, and is also faster (since -q makes grep stop searching at the first match). If that option weren't available, another method would be to protect the output with double-quotes: if [ "$(grep -w -- "$1" "$FILE")" ]; then (note that I also used $() instead of backquotes 'cause I find them much easier to read, and quotes around $FILE just in case it contains anything funny, like whitespace).
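To make the two points above concrete, here is a minimal sketch. It assumes that the suggestion referred to is to test grep's exit status directly with -q; the helper name is_in_file and the temporary demo file are made up for illustration.

```bash
# is_in_file WORD FILE: succeed if WORD occurs as a whole word in FILE.
# (is_in_file is a hypothetical helper name, not part of the original script.)
is_in_file() {
    # -q: print nothing and stop at the first match.
    # --: end of options, so a pattern such as "-x" is not parsed as one.
    grep -qw -- "$1" "$2"
}

demo_file=$(mktemp)
printf '%s\n' '0 -eq 2' 'plain -x text' > "$demo_file"

# The unquoted form hands grep's output to [ as an expression:
# [ 0 -eq 2 ] is evaluated, so the match is reported as "false".
if [ $(grep -w -- "0" "$demo_file") ]; then echo "true"; else echo "false"; fi

# Testing the exit status avoids the problem; both of these print "true".
if is_in_file "0" "$demo_file"; then echo "true"; else echo "false"; fi
if is_in_file "-x" "$demo_file"; then echo "true"; else echo "false"; fi
```

Because nothing from the file ever reaches [, the content of the matching line cannot change the result.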
Though not applicable in this particular case, another technique can be used to prevent filenames that start with hyphens from being interpreted as options:
rm ./-x
or
rm /path/to/-x
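As a small illustration of the path-prefix technique, the sketch below creates a file literally named -x in a scratch directory and removes it again. The scratch directory from mktemp -d is an assumption about the environment, not something from the question.

```bash
scratch=$(mktemp -d)
cd "$scratch" || exit 1

touch ./-x                      # a bare "touch -x" would be parsed as an option
[ -e ./-x ] && echo "created"

rm ./-x                         # the leading ./ keeps the name from looking like an option
[ -e ./-x ] || echo "removed"

cd / && rmdir "$scratch"
```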
I've answered my own question in writing this, but it might be helpful for others as I couldn't find a straightforward answer anywhere else. Please delete if inappropriate.
I'm trying to construct an if statement depending on whether some <STRING> is found inside the environment variable $PATH.
When I pipe $PATH through grep I get a successful hit:
echo $PATH | grep -i "<STRING>"
But I was really struggling to find the syntax required to construct an if statement around this. It appears that the line below works. I know that the $(...) essentially passes the internal commands to the if statement, but I'm not sure why the [[...]] double brackets are needed:
if [[ $(echo $PATH | grep -i "<STRING>") ]]; then echo "HEY"; fi
Maybe someone could explain that for me to have a better understanding.
Thanks.
You could make better use of shell syntax. Something like this:
$ STRING="bin"
$ grep -i "$STRING" <<< "$PATH" && echo "HEY"
That is: first, save the search string in a variable (which I called STRING so it's easy to remember), and use that as the search pattern. Then, use the <<< redirection to input a "here string" - namely, the PATH variable.
Or, if you don't want a variable for the string:
$ grep -i "bin" <<< "$PATH" && echo "HEY"
Then, the construct && <some other command> means: IF the exit status of grep is 0 (meaning at least one successful match), THEN execute the "other command" (otherwise do nothing - just exit as soon as grep completes). This is the more common, more natural form of an "if... then..." statement, exactly of the kind you were trying to write.
Question for you though: why the -i flag? That means case-insensitive matching. But in Unix and Linux, file names, command names, etc. are case-sensitive. Why do you care if the PATH matches the string BIN? It will, because bin is somewhere on the path, but if you then search for the BIN directory you won't find one. (The more interesting question is how to match complete words only, so that, for example, to match bin, a directory named bin must be present, and sbin shouldn't match bin. But that's about writing regular expressions, not quite what you were asking about.)
The following version - which doesn't even use grep - is based on the same idea, but it won't do case insensitive matching:
$ [[ $PATH == *$STRING* ]] && echo "HEY"
[[ ... ]] evaluates a Boolean expression (here, an equality using the * wildcard on the right-hand side); if true, && causes the execution of the echo command.
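For the whole-component question raised earlier (bin should not match sbin), one possible approach that avoids regular expressions is to wrap both the list and the candidate in colons before comparing. This is only a sketch: path_contains is a hypothetical helper name, and the list is passed as an argument so the example does not depend on the real PATH.

```bash
# path_contains DIR LIST: succeed if DIR is one complete entry of the
# colon-separated LIST.
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;   # the quoted part is matched literally
        *)        return 1 ;;
    esac
}

sample="/usr/local/sbin:/usr/sbin:/usr/bin"
path_contains /usr/bin "$sample" && echo "HEY"
path_contains /usr "$sample" || echo "no match"
```

In real use the second argument would be "$PATH".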
You don't need to use [[ ]]; just test grep's exit status:
if echo "$PATH" | grep -qi "<STRING>"; then echo "HEY"; fi
I'm trying to use enscript to print PDFs from Mutt, and hitting character encoding issues. One way around them seems to be to just use sed to replace the problem characters: sed -ir 's/[“”]/"/g' {input}
My test input file is this:
“very dirty”
we’re
I'm hoping to get "very dirty" and we're but instead I'm still getting
â\200\234very dirtyâ\200\235
weâ\200\231re
I found a nice little post on printing to PDFs from Mutt that I used as a starting point. I have a bash script that I point to from my .muttrc with set print_command="$HOME/.mutt/print.sh" -- the script currently reads about like this:
#!/bin/bash
input="$1" pdir="$HOME/Desktop" open_pdf=evince
# Straighten out curly quotes
sed -ir 's/[“”]/"/g' $input
sed -ir "s/[’]/'/g" $input
tmpfile="`mktemp $pdir/mutt_XXXXXXXX.pdf`"
enscript --font=Courier8 $input -2r --word-wrap --fancy-header=mutt -p - 2>/dev/null | ps2pdf - $tmpfile
$open_pdf $tmpfile >/dev/null 2>&1 &
sleep 1
rm $tmpfile
It does a fine job of creating a PDF (and works fine if you give it a file as an argument) but I can't figure out how to fix the curly quotes.
I've tried a bunch of variations on the sed line:
input=sed -r 's/[“”]/"/g' $input
$input=sed -ir "s/[’]/'/g" $input
Per the suggestion at Can I use sed to manipulate a variable in bash? I also tried input=$(sed -r 's/[“”]/"/g' <<< $input) and I get an error: "Syntax error: redirection unexpected"
But none manages to actually change $input -- what is the correct syntax to change $input with sed?
Note: I accepted an answer that resolved the question I asked, but as you can see from the comments there are a couple of other issues here. enscript is taking in a whole file as a variable, not just the text of the file. So trying to tweak the text inside the file is going to take a few extra steps. I'm still learning.
On Editing Variables In General
BashFAQ #21 is a comprehensive reference on performing search-and-replace operations in bash, including within variables, and is thus recommended reading. On this particular case:
Use the shell's native string manipulation instead; this is far higher performance than forking off a subshell, launching an external process inside it, and reading that external process's output. BashFAQ #100 covers this topic in detail, and is well worth reading.
Depending on your version of bash and configured locale, it might be possible to use a bracket expression (i.e. [“”], as your original code did). However, the most portable thing is to treat “ and ” separately, which will work even without multi-byte character support available.
input='“hello ’cruel’ world”'
input=${input//'“'/'"'}
input=${input//'”'/'"'}
input=${input//'’'/"'"}
printf '%s\n' "$input"
...correctly outputs:
"hello 'cruel' world"
On Using sed
To provide a literal answer -- you almost had a working sed-based approach in your question.
input=$(sed -r 's/[“”]/"/g' <<<"$input")
...adds the missing syntactic double quotes around the parameter expansion of $input, ensuring that it's treated as a single token regardless of how it might be string-split or glob-expanded.
But All That May Not Help...
The below is mentioned because your test script is manipulating content passed on the command line; if that's not the case in production, you can probably disregard the below.
If your script is invoked as ./yourscript “hello * ’cruel’ * world”, then information about exactly what the user entered is lost before the script is started, and nothing you can do here will fix that.
This is because $1, in that scenario, will only contain “hello; ’cruel’ and world” are in their own argv locations, and the *s will have been replaced with lists of files in the current directory (each such file substituted as a separate argument) before the script was even started. Because the shell responsible for parsing the user's command line (which is not the same shell running your script!) did not recognize the quotes as valid at the time when it ran this parsing, by the time the script is running, there's nothing you can do to recover the original data.
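The splitting described above is easy to observe. In this sketch, count_args is a hypothetical helper that just reports how many arguments it received; the * characters are left out so the result does not depend on the files in the current directory.

```bash
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { printf '%s\n' "$#"; }

# Curly quotes are ordinary characters to the shell, so the words are split.
count_args “hello ’cruel’ world”      # prints 3

# Real shell quoting keeps everything in one argument.
count_args '“hello ’cruel’ world”'    # prints 1
```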
Abstract: The way to use sed to change a variable is explored, but what you really need is a way to use and edit a file. That is covered below.
Sed
The (two) sed line(s) could be solved with this (note that -i is not used, it is not a file but a value):
input='“very dirty”
we’re'
sed 's/[“”]/\"/g;s/’/'\''/g' <<<"$input"
But it should be faster (for small strings) to use the internals of the shell:
input='“very dirty”
we’re'
input=${input//[“”]/\"}
input=${input//[’]/\'}
printf '%s\n' "$input"
$1
But there is an underlying problem with your script: you are trying to clean input received from the command line. You are using $1 as the source of the string. Once somebody writes:
./script “very dirty”
we’re
That input is lost. It is split into shell tokens, and "$1" will contain only “very.
But I do not believe that is what you really have.
file
However, you are also saying that the input comes from a file. If that is the case, then read it in with:
input="$(<infile)" # not $1
sed 's/[“”]/\"/g;s/’/'\''/g' <<<"$input"
Or, if you don't mind editing (changing) the file, do this instead:
sed -i 's/[“”]/\"/g;s/’/'\''/g' infile
input="$(<infile)"
Or, if you are clear and certain that what is being given to the script is a filename, like:
./script infile
You can use:
infile="$1"
sed -i 's/[“”]/\"/g;s/’/'\''/g' "$infile"
input="$(<"$infile")"
Other comments:
Then:
Quote your variables.
Do not use the very old `…` syntax, use $(…) instead.
Do not use variables in UPPER case, those are reserved for environment variables.
And (unless you actually meant sh) use a shebang (first line) that targets bash.
The command enscript most definitely requires a file, not a variable.
Maybe you should use evince to open the PS file directly; there is no need for the step that makes a PDF, unless you know you really need it.
I believe it is better to use a file to store the output of enscript and ps2pdf.
Do not hide the errors printed by the commands until everything is working as desired, then, just call the script as:
./script infile 2>/dev/null
Or as required to make it less verbose.
Final script.
If you call the script with the name of the file that enscript is going to use, something like:
./script infile
Then the whole script will look like this (it runs in both bash and sh):
#!/usr/bin/env bash
Usage(){ echo "$0: This script requires a source file"; exit 1; }
[ $# -lt 1 ] && Usage
[ ! -e "$1" ] && Usage
infile="$1"
pdir="$HOME/Desktop"
open_pdf=evince
# Straighten out curly quotes
sed -i 's/[“”]/\"/g;s/’/'\''/g' "$infile"
tmpfile="$(mktemp "$pdir"/mutt_XXXXXXXX.pdf)"
outfile="${tmpfile%.*}.ps"
enscript --font=Courier10 "$infile" -2r \
--word-wrap --fancy-header=mutt -p "$outfile"
ps2pdf "$outfile" "$tmpfile"
"$open_pdf" "$tmpfile" >/dev/null 2>&1 &
sleep 5
rm "$tmpfile" "$outfile"
In bash, I want to say "if a file doesn't contain XYZ, then" do a bunch of things. The most natural way to transpose this into code is something like:
if [ ! grep --quiet XYZ "$MyFile" ] ; then
... do things ...
fi
But of course, that's not valid Bash syntax. I could use backticks, but then I'll be testing the output of the file. The two alternatives I can think of are:
grep --quiet XYZ "$MyFile"
if [ $? -ne 0 ]; then
... do things ...
fi
And
grep --quiet XYZ "$MyFile" ||
( ... do things ...
)
I kind of prefer the second one; it's more Lispy, and using || for control flow isn't that uncommon in scripting languages. I can see arguments for the first one too, although when a person reads the first line, they don't know why you're executing grep: it looks like you're executing it for its main effect, rather than just to control a branch in the script.
Is there a third, more direct way which uses an if statement and has the grep in the condition?
Yes there is:
if grep --quiet .....
then
# If grep finds something
fi
or if the grep fails
if ! grep --quiet .....
then
# If grep doesn't find something
fi
You don't need the [ ] (test) to check the return value of a command. Just try:
if ! grep --quiet XYZ "$MyFile" ; then
This is a matter of taste since there obviously are multiple working solutions. When I deal with a problem like this, I usually apply wc -l after grep in order to count the lines that match. Then you have a single integer number that you can evaluate within a test condition. If the question only is whether there is a match at all (the number of matching lines does not matter), then applying wc probably is OTT and evaluation of grep's return code seems to be the best solution:
Normally, the exit status is 0 if selected lines are found and 1
otherwise. But the exit status is 2 if an error occurred, unless the
-q or --quiet or --silent option is used and a selected line is found. Note, however, that POSIX only mandates, for programs such as grep,
cmp, and diff, that the exit status in case of error be greater than
1; it is therefore advisable, for the sake of portability, to use
logic that tests for this general condition instead of strict equality
with 2.
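Following the portability advice in that excerpt, a script that needs to tell "no match" apart from "error" can treat any status greater than 1 as an error. A minimal sketch, where classify_grep is a hypothetical name used only here:

```bash
# classify_grep PATTERN FILE: print which of the three outcomes occurred.
# (classify_grep is a hypothetical name used only for this sketch.)
classify_grep() {
    grep -q -- "$1" "$2" 2>/dev/null
    case $? in
        0) echo "match" ;;
        1) echo "no match" ;;
        *) echo "error" ;;      # anything greater than 1, not just 2
    esac
}

sample=$(mktemp)
echo "XYZ is here" > "$sample"
classify_grep XYZ "$sample"             # match
classify_grep ABC "$sample"             # no match
classify_grep XYZ "$sample.missing"     # error (the file does not exist)
```

With a plain if ! grep ..., the "no match" and "error" cases are both treated as "not found", which may or may not be what you want.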
I have a script which uses the following logic:
if [ ! -z "$1" ]; then # if any parameter is supplied
ACTION= # clear $ACTION
else
ACTION=echo # otherwise, set it to 'echo'
fi
This works fine, as-is. However, in reading the Shell Parameter Expansion section of the bash manual, it seems this should be able to be done in a single step. However, I can't quite wrap my head around how to do it.
I've tried:
ACTION=${1:-echo} # ends up with $1 in $ACTION
ACTION=${1:+}
ACTION=${ACTION:-echo} # ends up always 'echo'
and a few ways of nesting them, but nesting seems to be disallowed as far as I can tell.
I realize I've already got a working solution, but now I'm genuinely curious if this is possible. It's something that would be straightforward with a ternary operator, but I don't think bash has one.
If this is possible, I'd like to see the logic to do this seemingly two-step process with no if/else constructs, using only some combination of the Shell Parameter Expansion features.
Thank you.
EDIT for elderarthis:
The remainder of the script is just:
find . -name "*\?[NMSD]=[AD]" -exec ${ACTION} rm -f "{}" +
I just want ACTION=echo as a sanity check against myself; hence, passing any argument will actually do the deletion (by nullifying ${ACTION}), whereas passing no args leaves echo in there.
And I know TIMTOWTDI; I'm looking to see if it can be done with just the stuff in the Shell Parameter Expansion section :-)
EDIT for Mikel:
$ cat honk.sh
#!/bin/bash
ACTION=${1-echo}
echo $ACTION
$ ./honk.sh
echo
$ ./honk.sh foo
foo
The last needs to have ACTION='', and thus return a blank line/null value.
If I insisted on doing it in fewer than 4 lines and no sub-shell, then I think I'd use:
ACTION=${1:+' '}
: ${ACTION:=echo}
This cheats slightly - it creates a blank action rather than an empty action if there is an argument to the script. If there is no argument, then ACTION is empty before the second line. On the second line, if action is empty, set it to 'echo'. In the expansion, since you (correctly) do not quote $ACTION, no argument will be passed for the blank.
Tester (xx.sh):
ACTION=${1:+' '}
: ${ACTION:=echo}
echo $ACTION rm -f a b c
Tests:
$ sh xx.sh 1
rm -f a b c
$ sh xx.sh
echo rm -f a b c
$ sh xx.sh ''
echo rm -f a b c
$
If the last line is incorrect, then remove the colon from before the plus.
If a sub-shell is acceptable, then one of these two single lines works:
ACTION=$([ -z "$1" ] && echo echo)
ACTION=$([ -z "${1+X}" ] && echo echo)
The first corresponds to the first version shown above (empty first arguments are treated as absent); the second deals with empty arguments as present. You could write:
ACTION=$([ -z "${1:+X}" ] && echo echo)
to make the relation with the second clearer - except you're only going to use one or the other, not both.
Since the markdown notation in my comment confused the system (or I got it wrong but didn't get to fix it quickly enough), my last comment (slightly amended) should read:
The notation ${var:+' '} means 'if $var is set and is not empty, then use what follows the +' (which, in this case, is a single blank). The notation ${var+' '} means 'if $var is set - regardless of whether it is empty or not - then use what follows the +'. These other expansions are similar:
${var:=X} - set $var to X unless it already has a non-empty value.
${var:-X} - expands to $var if it has a non-empty value and expands to X if $var is unset or is empty
Dropping the colon removes the 'empty' part of the test.
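The effect of the colon is easiest to see side by side. This sketch prints the four expansions for an unset, an empty, and a non-empty variable; expand_all and the placeholder words X and D are made up for the illustration.

```bash
# expand_all prints ${var:+X}, ${var+X}, ${var:-D} and ${var-D} for the
# current state of the global variable var (a hypothetical demo helper).
expand_all() { echo "[${var:+X}] [${var+X}] [${var:-D}] [${var-D}]"; }

unset var; expand_all     # [] [] [D] [D]
var="";    expand_all     # [] [X] [D] []
var="v";   expand_all     # [X] [X] [v] [v]
```

Only the middle line differs between the colon and no-colon forms, which is exactly the "set but empty" case.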
ACTION=${1:-echo}
is correct.
Make sure it's near the top of your script before anything modifies $1 (e.g. before any set command). Also, it wouldn't work inside a function, because $1 would be the first parameter to the function.
Also check if $1 is set but null, in which case fix how you're calling it, or use ACTION=${1-echo} (note there is no :).
Update
Ah, I assumed you must have meant the opposite, because it didn't really make sense otherwise.
It still seems odd, but I guess as a mental exercise, maybe you want something like this:
#!/bin/bash
shopt -s extglob
ACTION=$1
ACTION=${ACTION:-echo}
ACTION=${ACTION/!(echo)/} # or maybe ACTION=${ACTION#!(echo)}
echo ACTION=$ACTION
It's not quite right: it gives ACTION=o, but I think something along those lines should work.
Further, if you pass echo as $1, it will stay as echo, but I don't think that's a bad thing.
It's also terribly ugly, but you knew that when asking the question. :-)
I've mastered the basics of Bash compound conditionals and have read a few different ways to check for file existence of a wildcard file, but this one is eluding me, so I figured I'd ask for help...
I need to:
1.) Check if some file matching a pattern exists
AND
2.) Check that text in a different file exists.
I know there's lots of ways to do this, but I don't really have the knowledge to prioritize them (if you have that knowledge I'd be interested in reading about that as well).
First things that came to mind is to use find for #1 and grep for #2
So something like
if [ `grep -q "OUTPUT FILE AT STEP 1000" ../log/minimize.log` ] \
&& [ `find -name "jobscript_minim\*cmd\*o\*"` ]; then
echo "Both passed! (1)"
fi
That fails, though curiously:
if `grep -q "OUTPUT FILE AT STEP 1000" ../log/minimize.log` ;then
echo "Text passed!"
fi
if `find -name "jobscript_minim\*cmd\*o\*"` ;then
echo "File passed!"
fi
both pass...
I've done a bit of reading and have seen people talking about the problem of multiple filenames matching wildcards within an if statement. What's the best solution to this? (In answering my question, I'd assume you'd take a crack at that question as well.)
Any ideas/solutions/suggestions?
Let's tackle why your attempt failed first:
if [ `grep -q …` ];
This runs the grep command between backticks, and interpolates the output inside the conditional command. Since grep -q doesn't produce any output, it's as if you wrote if [ ];
The conditional is supposed to test the return code of grep, not anything about its output. Therefore it should be simply written as
if grep -q …;
The find command returns 0 (i.e. true) even if it finds nothing, so this technique won't work. What will work is testing whether its output is empty, by collecting its output and comparing it to the empty string:
if [ "$(find …)" != "" ];
(An equivalent test is if [ -n "$(find …)" ].)
Notice two things here:
I used $(…) rather than backticks. They're equivalent, except that backticks require strange quoting inside them (especially if you try to nest them), whereas $(…) is simple and reliable. Just use $(…) and forget about backticks (except that you need to write \` inside double quotes).
There are double quotes around $(…). This is really important. Without the quotes, the shell would break the output of the find command into words. If find prints, say, two lines dir/file and dir/otherfile, we want if [ "dir/file dir/otherfile" = "" ]; to be executed, not if [ dir/file dir/otherfile = "" ]; which is a syntax error. This is a general rule of shell programming: always put double quotes around a variable or command substitution. (A variable substitution is $foo or ${foo}; a command substitution is $(command).)
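Both points can be checked with a short experiment. This is only a sketch: the scratch directory comes from mktemp -d, and two_lines is a hypothetical stand-in for a find command that printed two paths.

```bash
scratch=$(mktemp -d)

# find exits with status 0 even when nothing matches...
find "$scratch" -name "no_such_file*" && echo "find returned 0"
# ...so the decision has to be based on its output.
[ -n "$(find "$scratch" -name "no_such_file*")" ] || echo "nothing found"

# two_lines stands in for a find command that printed two paths.
two_lines() { printf '%s\n' dir/file dir/otherfile; }

# Quoted: [ sees a single argument and the test behaves as intended.
[ "$(two_lines)" != "" ] && echo "non-empty"

# Unquoted: the output is split into two words, [ receives
# dir/file dir/otherfile != "" and fails with an error.
[ $(two_lines) != "" ] 2>/dev/null || echo "unquoted test failed"
```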
Now let's see your requirements.
Check if some file matching a pattern exists
If you're looking for files in the current directory or in any directory below it recursively, then find -name "PATTERN" is right. However, if the directory tree can get large, it's inefficient, because it can spend a lot of time printing all the matches when we only care about one. An easy optimization is to only retain the first line by piping into head -n 1; find will stop searching once it realizes that head is no longer interested in what it has to say.
if [ "$(find -name "jobscript_minim*cmd*o*" | head -n 1)" != "" ];
(Note that the double quotes already protect the wildcards from expansion.)
If you're only looking for files in the current directory, assuming you have GNU find (which is the case on Linux, Cygwin and Gnuwin32), a simple solution is to tell it not to recurse deeper than the current directory.
if [ "$(find -maxdepth 1 -name "jobscript_minim*cmd*o*")" != "" ];
There are other solutions that are more portable, but they're more complicated to write.
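One possible shape for such a portable check, assuming only a single directory needs to be examined, is to let the shell expand the pattern and test whether the first result actually exists. The helper name any_match and the sample file name are hypothetical.

```bash
# any_match DIR: succeed if DIR contains at least one entry matching
# jobscript_minim*cmd*o* (any_match is a hypothetical helper name).
any_match() {
    for f in "$1"/jobscript_minim*cmd*o*; do
        # With no match the pattern is left unexpanded, and that literal
        # name does not exist, so the -e test fails.
        [ -e "$f" ] && return 0
    done
    return 1
}

scratch=$(mktemp -d)
any_match "$scratch" || echo "no match yet"
touch "$scratch/jobscript_minim_1.cmd.o42"
any_match "$scratch" && echo "match"
```

This does not recurse into subdirectories, so it corresponds to the -maxdepth 1 case rather than the full find.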
Check that text in a different file exists.
You've already got a correct grep command. Note that if you want to search for a literal string, you should use grep -F; if you're looking for a regexp, grep -E has a saner syntax than plain grep.
Putting it all together:
if grep -q -F "OUTPUT FILE AT STEP 1000" ../log/minimize.log &&
[ "$(find -name "jobscript_minim*cmd*o*")" != "" ]; then
echo "Both passed! (1)"
fi
bash 4
shopt -s globstar nullglob   # nullglob: an unmatched pattern expands to nothing
files=$(echo **/jobscript_minim*cmd*o*)
if grep -q "pattern" file && [[ -n $files ]]; then echo "passed"; fi
for i in filename*; do FOUND=$i; break; done
# If nothing matched, the loop ran once with the unexpanded pattern itself.
if [ "$FOUND" = 'filename*' ]; then
echo "No files found matching wildcard."
else
echo "Files found matching wildcard."
fi