This question already has answers here:
Checking the success of a command in a bash `if [ .. ]` statement
(1 answer)
When to wrap quotes around a shell variable?
(5 answers)
Closed last year.
I am trying to pass a regular expression as a parameter. What should I fix in my code?
My goal is to send find and the regular expression string, then use grep on the parameter so I can do whatever I want with what grep finds (which is print the count of occurrences).
This is what I send:
$ ./lab12.sh find [Gg]reen
Here's my bash code:
if [[ "$1" == "find" ]]
then
declare -i cnt=0
for file in /tmp/beatles/*.txt ; do
if [[ grep -e $2 ]] //problem is here...
then
((cnt=cnt+1))
fi
done
echo "$cnt songs contain the pattern "$2""
fi
The if statement takes a command. [[ is one command and grep is another, so writing [[ grep ... ]] is as wrong as writing vim grep or cat grep. Just use:
if grep -q -e "$pattern" "$file"
then
...
instead.
The -q switch makes grep suppress its output while still setting the exit status: 0 (success) when the pattern matches and 1 (failure) otherwise. The if statement executes the then block only if the command succeeded.
Using -q also lets grep exit as soon as the first matching line is found.
And as always, remember to wrap your parameter expansions in double quotes to avoid pathname expansion and word splitting.
Note that square brackets [...] will be interpreted by your calling shell, and you should escape them, or wrap the whole pattern in quotes.
Single quotes are the safest choice, since the only character that is special inside them is another single quote.
$ ./lab12.sh find '[Gg]reen'
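Putting those points together, the counting loop from the question could look roughly like the sketch below. The function name count_matching_files and the temporary demo directory are illustrative assumptions, not part of the original script, which would loop over /tmp/beatles/*.txt instead.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original script): print how many
# *.txt files in a directory contain at least one match for a pattern.
count_matching_files() {
    local pattern=$1 dir=$2 file cnt=0
    for file in "$dir"/*.txt; do
        [ -e "$file" ] || continue        # the glob matched nothing
        # grep itself is the command that `if` runs; -q keeps it silent
        if grep -q -e "$pattern" -- "$file"; then
            cnt=$((cnt + 1))
        fi
    done
    printf '%s\n' "$cnt"
}

# Throwaway files standing in for /tmp/beatles/*.txt (assumed data)
demo_dir=$(mktemp -d)
printf 'Green onions\n'      > "$demo_dir/a.txt"
printf 'the village green\n' > "$demo_dir/b.txt"
printf 'Yellow submarine\n'  > "$demo_dir/c.txt"

# The pattern is single-quoted so the calling shell does not glob it
count=$(count_matching_files '[Gg]reen' "$demo_dir")
echo "$count songs contain the pattern '[Gg]reen'"
rm -rf "$demo_dir"
```

With the three sample files, two of them contain a match, so the reported count should be 2.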
This question already has answers here:
Difference between single and double quotes in Bash
(7 answers)
Closed 3 years ago.
I was looking at the question "Check if a Bash array contains a value", which shows how to check whether an item exists in a list as follows:
if printf '%s\n' ${myarray[@]} | grep -q -P '^mypattern$'; then
# ...
fi
However, I want mypattern value to be passed as a variable as follows:
mynewpattern="xyz"
then I was expecting the following to work
if printf '%s\n' ${myarray[@]} | grep -q -P '^"$mynewpattern"$'; then
# ...
fi
But it is not picking the new pattern of xyz. What is the appropriate syntax to insert the new pattern?
I have just started learning bash.
The single quotes are wrong; you want double quotes instead of single.
However, grep -P is also slightly wrong here; it's not properly portable, and your pattern doesn't use any of the syntax which -P enables; also, you should quote your array properly.
if printf '%s\n' "${myarray[@]}" |
grep -q "^$mynewpattern\$"
then
...
Text between single quotes is passed through verbatim. If you want the shell to perform variable interpolation, use double quotes (and then you need to escape any literal backslash, dollar sign, or backtick).
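A small sketch of the difference, using the variable name from the question; the array contents here are made up for illustration.

```shell
#!/usr/bin/env bash
mynewpattern="xyz"

# Single quotes: the text is kept verbatim, no expansion happens
single='^$mynewpattern$'
# Double quotes: $mynewpattern is expanded; the final $ is escaped so it
# stays a literal end-of-line anchor
double="^$mynewpattern\$"

printf 'single: %s\n' "$single"
printf 'double: %s\n' "$double"

# Assumed sample data
myarray=(abc xyz "x y z")
if printf '%s\n' "${myarray[@]}" | grep -q "^$mynewpattern\$"; then
    found=yes
else
    found=no
fi
echo "found=$found"
```

The first printf shows the unexpanded text, the second shows ^xyz$, and the array lookup succeeds because one element is exactly xyz.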
Try using grep -q -P "^$var$" in your script.
Here is an example script for the same scenario, using an Input_file (no sample array elements were provided, so a sample file is used instead).
##Shell variable
var="bla"
##A sample Input_file
cat << EOF > Input_file
blabla test test
123abcd123test
bla
EOF
##Following is the code to check.
if grep -q -P "^$var$" Input_file
then
echo "match found."
fi
The above matches only lines that consist entirely of the value of the variable var, because the pattern is anchored with both ^ and $.
I have a file and its name looks like:
12U12345._L001_R1_001.fastq.gz
I want to assign to a variable just the 12U12345 part.
So far I have:
variable=`basename $fastq | sed {s'/_S[0-9]*_L001_R1_001.fastq.gz//'}`
Note: $fastq is a variable with the full path to the file in it.
This solution currently returns the full file name, any ideas how to get this right?
Just use the shell's built-in parameter expansion instead of spawning a separate process:
fastq="12U12345._L001_R1_001.fastq.gz"
printf '%s\n' "${fastq%%.*}"
12U12345
or use printf -v to store the result in a new variable in one shot:
printf -v numericPart '%s' "${fastq%%.*}"
printf '%s\n' "${numericPart}"
Bash also has a built-in regular expression matching operator, =~, which you could use like this:
fastq="12U12345._L001_R1_001.fastq.gz"
regex='^([[:alnum:]]+)\.(.*)'
if [[ $fastq =~ $regex ]]; then
numericPart="${BASH_REMATCH[1]}"
printf '%s\n' "${numericPart}"
fi
You could use cut:
$> fastq="/path/to/12U12345._L001_R1_001.fastq.gz"
$> variable=$(basename "$fastq" | cut -d '.' -f 1)
$> echo "$variable"
12U12345
Also, please note that:
It's better to wrap your variable inside quotes. Otherwise your command won't work with filenames that contain space(s).
You should use $() instead of the backticks.
Using Bash Parameter Expansion to extract the basename and then extract the portion of the filename you want:
fastq="/path/to/12U12345._L001_R1_001.fastq.gz"
file="${fastq##*/}" # gives 12U12345._L001_R1_001.fastq.gz
string="${file%%.*}" # gives 12U12345
Note that Bash doesn't allow us to nest the parameter expansion. Otherwise, we could have combined statements 2 and 3 above.
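For reference, a short sketch of how the four suffix and prefix removal operators behave on the same path. The variable names no_ext, sample, and dir are only illustrative.

```shell
#!/usr/bin/env bash
fastq="/path/to/12U12345._L001_R1_001.fastq.gz"

file=${fastq##*/}    # remove longest prefix matching */  -> file name only
dir=${fastq%/*}      # remove shortest suffix matching /* -> directory part
no_ext=${file%.*}    # remove shortest suffix matching .* -> drops only ".gz"
sample=${file%%.*}   # remove longest suffix matching .*  -> drops from first dot

printf '%s\n' "$file" "$dir" "$no_ext" "$sample"
```

The doubled forms (## and %%) are greedy, which is why %%.* cuts at the first dot while %.* cuts at the last one.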
This question already has answers here:
Stop shell wildcard character expansion?
(4 answers)
Closed 7 years ago.
I pass 4 parameters to an executable file:
./project01.sh /* */ /^ ^/
and I want to store each parameter in a variable,
but before that I want to modify it, because otherwise it will be inconvenient for further operations. I would like to turn it into something like "/*": wrap it in double quotes and put a \ before each character, because some of them are special characters.
I tried something like this:
beg1=echo $1 | sed(some change with $1)
but $1, which should be /*, is immediately changed into the directories /bin, /boot, etc. What can I do in that case?
First, when calling your script, there's nothing you can do to avoid quoting. You must call it as:
./project01.sh '/*' '*/' '/^' '^/'
...if you want to prevent any potential for shell manipulation. (^ is safe with some shells but not all).
This is because in the case of ./project01.sh /* (without quotes), expansion happens before the script is even started, so once your script is running, it's too late to make changes.
Second, use more quotes within your script:
echo "$1" | sed ...
...or, better (to fix cases where $1 contains -E, -n, or a similar value):
printf '%s\n' "$1" | sed ...
...or, better yet, if your shell is bash rather than /bin/sh...
sed ... <<<"$1"
However, if your goal in using sed is to add syntax quoting to your arguments, this will never work: The arguments are already expanded before the script is even run, so $1 is already /bin.
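The effect can be reproduced safely in a scratch directory. The sketch below assumes bash; the directory contents and variable names are made up, and printf '%q' is shown only as one way to produce an escaped form of a value that was passed in quoted.

```shell
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
touch "$demo_dir/bin" "$demo_dir/boot"

# Unquoted glob: the shell expands it before anything else sees it
set -- "$demo_dir"/*
unquoted_count=$#            # one argument per matching file

# Quoted glob: passed through as a single literal argument
set -- "$demo_dir/*"
quoted_count=$#

echo "unquoted: $unquoted_count argument(s), quoted: $quoted_count argument(s)"

# Once the literal value has arrived intact, bash can produce a
# backslash-escaped form that is safe to reuse as shell input
pattern='/*'
escaped=$(printf '%q' "$pattern")
eval "restored=$escaped"     # round trip back to the literal value
printf 'escaped form: %s\n' "$escaped"

rm -rf "$demo_dir"
```

The escaping step only helps after the caller has quoted the argument; it cannot recover a pattern that was already expanded.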
I am trying to validate user input against a regular expression.
vari=A
if [ $vari =~ [A-Z] ] ;
then
echo "hurray"
fi
The output I am getting is swf.sh[3]: =~: unknown test operator.
Can you please let me know the test operator I can use?
It's not built into Bourne shell, you need to use grep:
if echo "$vari" | grep -q '[A-Z]'; then
echo hurray
fi
If you want to match the whole string, remember to use the regex anchors ^ and $. Note that the -q flag makes grep quiet: it prints nothing, and only its exit status reports match or no match.
POSIX shell doesn't have a regular expression operator (or rather, the POSIX test command does not). Instead, you use the expr command to do a (limited) form of regular expression matching.
if expr "$vari" : '[A-Z]' > /dev/null; then
(I say "limited" because it always matches at the beginning of the string, as if the regular expression started with ^.) The exit status is 0 if a match is made; it also writes the number of characters matched to standard output, hence the redirect to /dev/null.
If you are actually using bash, you need to use the [[ command:
if [[ $vari =~ [A-Z] ]]; then
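A side-by-side sketch of the three approaches mentioned in these answers, each anchored so that the whole value must be a single uppercase letter. The result variable names are illustrative, and the last test requires bash.

```shell
#!/usr/bin/env bash
vari=A

# 1. grep: portable, needs explicit ^ and $ anchors
if printf '%s\n' "$vari" | grep -q '^[A-Z]$'; then grep_ok=yes; else grep_ok=no; fi

# 2. expr: implicitly anchored at the start, so only $ is added
if expr "$vari" : '[A-Z]$' >/dev/null; then expr_ok=yes; else expr_ok=no; fi

# 3. bash only: [[ =~ ]] with the regex kept in a variable
re='^[A-Z]$'
if [[ $vari =~ $re ]]; then bash_ok=yes; else bash_ok=no; fi

# With the anchors in place, a two-letter value is rejected
if [[ AB =~ $re ]]; then whole_ok=yes; else whole_ok=no; fi

echo "grep=$grep_ok expr=$expr_ok bash=$bash_ok two-letters=$whole_ok"
```

Keeping the regex in a variable for [[ =~ ]] avoids quoting surprises, since quoted parts of the right-hand side are treated literally.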
I know it is possible to invert grep output with the -v flag. Is there a way to only output the non-matching part of the matched line? I ask because I would like to use the return code of grep (which sed won't have). Here's sort of what I've got:
tags=$(grep "^$PAT" >/dev/null 2>&1)
[ "$?" -eq 0 ] && echo $tags
You could use sed:
$ sed -n "/$PAT/s/$PAT//p" "$file"
The only problem is that it'll return an exit code of 0 as long as the pattern is good, even if the pattern can't be found.
Explanation
The -n parameter tells sed not to print out any lines. Sed's default is to print out all lines of the file. Let's look at each part of the sed program in between the slashes. Assume the program is /1/2/3/4/5:
/$PAT/: This says to look for all lines that match pattern $PAT and run your substitution command only on those. Otherwise, sed would operate on all lines, even if there is no substitution.
/s/: This says you will be doing a substitution
/$PAT/: This is the pattern you will be substituting. It's $PAT. So, you're searching for lines that contain $PAT and then you're going to substitute the pattern for something.
//: This is what you're substituting for $PAT. It is null. Therefore, you're deleting $PAT from the line.
/p: This final p says to print out the line.
Thus:
You tell sed not to print out the lines of the file as it processes them.
You're searching for all lines that contain $PAT.
On these lines, you're using the s command (substitution) to remove the pattern.
You're printing out the line once the pattern is removed from the line.
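The steps above can be tried on a few sample lines. The pattern value and the input lines below are assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
PAT='tag: '

# Only lines containing $PAT are printed, with $PAT removed
result=$(printf '%s\n' 'tag: alpha' 'no marker here' 'tag: beta' |
         sed -n "/$PAT/s/$PAT//p")
printf '%s\n' "$result"

# sed still exits with 0 when nothing matched, which is the
# limitation mentioned above
printf '%s\n' 'nothing to see' | sed -n "/$PAT/s/$PAT//p"
no_match_status=$?
echo "exit status without a match: $no_match_status"
```

Because $PAT is interpolated into the sed program, characters such as / or . in it would be interpreted by sed rather than taken literally.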
How about using a combination of grep, sed and $PIPESTATUS to get the correct exit-status?
$ echo "Humans are not proud of their ancestors, and rarely invite them round to dinner" | grep dinner | sed -n "/dinner/s/dinner//p"
Humans are not proud of their ancestors, and rarely invite them round to
$ echo "${PIPESTATUS[1]}"
0
The members of the PIPESTATUS array hold the exit status of each respective command executed in a pipe. ${PIPESTATUS[0]} holds the exit status of the first command in the pipe, ${PIPESTATUS[1]} the exit status of the second command, and so on. The braces are required: $PIPESTATUS[1] expands to the first element followed by the literal text [1].
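A minimal bash sketch of reading the array for a matching and a non-matching input. The array is overwritten by the next command, so it is copied right away; the names hit and miss are illustrative.

```shell
#!/usr/bin/env bash
# Matching input: every stage of the pipeline succeeds
printf '%s\n' 'invite them round to dinner' | grep dinner | sed -n 's/dinner//p' >/dev/null
hit=("${PIPESTATUS[@]}")      # copy immediately, before running anything else

# Non-matching input: grep (index 1) reports failure, sed still exits 0
printf '%s\n' 'invite them round to lunch' | grep dinner | sed -n 's/dinner//p' >/dev/null
miss=("${PIPESTATUS[@]}")

echo "match:    ${hit[*]}"
echo "no match: ${miss[*]}"
```

Index 1 is the one of interest here because grep is the second command in the pipeline.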
Your $tags will never have a value because you send grep's output to /dev/null. Apart from that little problem, grep is given no input.
echo hello |grep "^he" -q ;
ret=$? ;
if [ $ret -eq 0 ];
then
echo there is he in hello;
fi
A successful return code is 0.
...here is one take on your 'problem':
pat="most of ";
data="The apples are ripe. I will use most of them for jam.";
echo $data |grep "$pat" -q;
ret=$?;
[ $ret -eq 0 ] && echo $data |sed "s/$pat//"
The apples are ripe. I will use them for jam.
... exact same thing?:
echo The apples are ripe. I will use most of them for jam. | sed ' s/most\ of\ //'
It seems to me you have confused the basic concepts. What are you trying to do anyway?
I am going to answer the title of the question directly instead of considering the detail of the question itself:
"grep a pattern and output non-matching part of line"
The title to this question is important to me because the pattern I am searching for contains characters that sed will assign special meaning to. I want to use grep because I can use -F or --fixed-strings to cause grep to interpret the pattern literally. Unfortunately, sed has no literal option, but both grep and bash have the ability to interpret patterns without considering any special characters.
Note: In my opinion, trying to backslash or escape special characters in a pattern appears complex in code and is unreliable because it is difficult to test. Using tools which are designed to search for literal text leaves me with a comfortable 'that will work' feeling without considering POSIX.
I used both grep and bash to produce the result because bash is slow and my use of fast grep creates a small output from a large input. This code searches for the literal twice, once during grep to quickly extract matching lines and once during =~ to remove the match itself from each line.
while IFS= read -r || [[ -n "$REPLY" ]]; do
if [[ "$REPLY" =~ (.*)("$LITERAL_PATTERN")(.*) ]]; then
printf '%s\n' "${BASH_REMATCH[1]}${BASH_REMATCH[3]}"
else
printf 'NOT-REFOUND\n' # should never happen
exit 1
fi
done < <(grep -F "$LITERAL_PATTERN" < "$INPUT_FILE")
Explanation:
IFS= Assigning IFS as a prefix to read changes the input field separator for that read statement only. Setting IFS to the empty string causes read to keep each line intact, including leading and trailing spaces and tabs, up to the end of line (by default IFS is space-tab-newline, so that whitespace would otherwise be trimmed).
-r Tells read to accept backslashes in the input stream literally instead of considering them as the start of an escape sequence.
$REPLY Is created by read to store characters from the input stream. The newline at the end of each line will NOT be in $REPLY.
|| [[ -n "$REPLY" ]] The logical or causes the while loop to accept input which is not newline terminated. This does not need to exist because grep always provides a trailing newline for every match. But, I habitually use this in my read loops because without it, characters between the last newline and the end of file will be ignored because that causes read to fail even though content is successfully read.
=~ (.*)("$LITERAL_PATTERN")(.*) ]] Is a standard bash regex test, but anything in quotes is taken as a literal. If I wanted =~ to honor the regex characters contained in $LITERAL_PATTERN, I would need to remove the double quotes.
"${BASH_REMATCH[@]}" Is created by [[ =~ ]], where [0] is the entire match and [N] is the contents of the match in the Nth set of parentheses.
Note: I do not like to redirect stdin into a while loop because it is easy to make mistakes and difficult to see later what is happening. I usually create a function for this type of operation which behaves conventionally and expects file_name parameters or a redirection of stdin at the call site.
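One way such a function might look is sketched here. The name strip_literal, its argument order, and the choice to return 0 only when at least one line matched (mirroring grep's exit status) are assumptions, not the author's actual helper.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: print each line of a file that contains a literal
# string, with the last occurrence of that string removed (the leading .*
# is greedy). Returns 0 if any line matched, 1 otherwise.
strip_literal() {
    local literal=$1 file=$2 line rc=1
    while IFS= read -r line || [[ -n $line ]]; do
        # The quoted part of the regex is matched literally
        if [[ $line =~ (.*)("$literal")(.*) ]]; then
            printf '%s\n' "${BASH_REMATCH[1]}${BASH_REMATCH[3]}"
            rc=0
        fi
    done < <(grep -F -- "$literal" "$file")
    return "$rc"
}

# Assumed sample data containing regex metacharacters
demo_file=$(mktemp)
printf '%s\n' 'price: $5.00 (sale)' 'no match here' 'was $5.00' > "$demo_file"

output=$(strip_literal '$5.00' "$demo_file")
printf '%s\n' "$output"

strip_literal 'absent' "$demo_file"
miss_status=$?
rm -f "$demo_file"
```

Tracking the status in a local variable is needed because the exit status of grep inside the process substitution is not visible to the loop.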