What is ~$ in Linux shell scripts?

I see the below statement in a shell script
if [ "$file" = "conf" ] || echo $file | grep -q '~$'; then
What is ~$? I know other dollar notations like $1 $2 $# $$ $* but never saw anything like ~$.

'~$' pattern in grep matches all lines that end with '~'.
So the if portion will be executed, if the file name ends with ~ .
In full, echo $file | grep -q '~$' means:
Test whether the file name ends with ~, but don't print the match (-q suppresses output; only the exit status is used).
If it matches, execute the if part.
So inside grep, '~$' does have a special meaning, i.e. "ends with ~".
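As a rough illustration of the check described above, the sketch below wraps the grep test in a function and shows an equivalent test written as a case pattern. The function names and the sample file names are made up for the example.

```shell
# Hypothetical helper: succeeds when the name ends with "~".
ends_with_tilde() {
    printf '%s\n' "$1" | grep -q '~$'
}

# The same check with a shell pattern, which avoids starting grep.
ends_with_tilde_case() {
    case $1 in
        *~) return 0 ;;
        *)  return 1 ;;
    esac
}

for name in notes.txt notes.txt~ conf; do
    if ends_with_tilde "$name"; then
        echo "$name: ends with ~"
    else
        echo "$name: does not end with ~"
    fi
done
```

Names ending in ~ are commonly editor backup files, which is probably why the original script singles them out.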

~$ is a sequence of two characters and has no special meaning in bash.
In grep -q '~$' it is simply the pattern that is handed to grep.
Regarding "what is $ then": it has special meanings
when used in the context of a variable, say $var;
when used in a regex, say stuff$, which matches lines ending in stuff.
Please check [ Special Parameters ].
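To make the two roles of $ concrete, here is a small sketch in which both appear in one grep pattern. The variable name suffix and the sample strings are assumptions chosen for the demonstration.

```shell
suffix='~'

# Inside double quotes the shell expands ${suffix}; the escaped \$ is
# passed on to grep, where it acts as the end-of-line anchor.
matches=$(printf '%s\n' 'a.txt' 'b.txt~' 'c~d' | grep "${suffix}\$")

printf '%s\n' "$matches"
```

Only the name that ends in ~ is kept; c~d contains a ~ but not at the end of the line.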

Related

Command line argument is not working for "grep" command in Bash

#! /bin/bash
grep -oh \w*ings "${1}"
while the above command is working in terminal, I can't figure out the problem in script
I'm trying to find all the words in the file that are ending with 'ing'
To grep, the string w*ings is a regex. To the shell, the string w*ings is a glob, and \w*ings is a glob with an escaped w. Because the pattern is unquoted, the shell performs filename expansion on it before grep ever runs: it looks in the current directory for names that start with w and end in ings. If any files match, the pattern is replaced by their names. E.g., if you have the files waings and wbings, the shell passes both as arguments to grep. Assuming $1 expands to foo, grep gets the arguments -oh, waings, wbings, and foo, exactly as if you had invoked grep -oh waings wbings foo. If nothing matches, bash (by default) strips the backslash and passes w*ings instead, which is still not the regex you wrote. Either way, this is almost certainly not what you want. You probably intended to pass the literal string \w*ings to grep, so you ought to have done:
grep -oh '\w*ings' "${1}"
But from your comment, it sounds like you actually want to do something like:
grep -o '[[:<:]][_[:alnum:]]*ing[[:>:]]' "${1}"
or
grep -ow '[_[:alnum:]]*ing' "${1}"
(-w is equivalent to wrapping the pattern in [[:<:]] and [[:>:]])
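The difference between the unquoted and the quoted pattern can be observed directly. The following sketch assumes a scratch directory containing two made-up files, waings and wbings, plus a sample.txt invented for the demonstration; it also assumes a grep that supports -o and -w (GNU and BSD grep both do).

```shell
# Work in a scratch directory so the glob has something to match.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch waings wbings
printf '%s\n' 'sing a song while walking' > sample.txt   # hypothetical input

# Unquoted: the shell expands the glob before any command sees it.
set -- \w*ings
unquoted="$*"

# Quoted: the pattern is passed through untouched.
set -- '\w*ings'
quoted="$1"

printf 'unquoted: %s\n' "$unquoted"
printf 'quoted:   %s\n' "$quoted"

# Whole words ending in "ing", one per line.
words=$(grep -ow '[_[:alnum:]]*ing' sample.txt)
printf '%s\n' "$words"

cd / && rm -rf "$workdir"
```

The unquoted form turns into the two file names, which grep would treat as a pattern and a file to search, while the quoted form reaches the command exactly as typed.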
This would be
grep -oh '.*wings' "$1"

can't pass parameters to grep from shell

I am trying to run a complex grep command from shell (currently zsh on MacOS, but bash would be ok)
I want to pass variables, i.e. $1 and $2, to the command : grep -e 'something $1' -e 'somethingelse $2' file
For instance my script:
#!/bin/zsh
echo ------
echo grep -e "'"something $1"'" -e "'"somethingelse $2"'" file
echo ------
grep -e "'"something $1"'" -e "'"somethingelse $2"'" file
This doesn't work with:
% ~/scripts/test cat mouse
------
grep -e 'something cat' -e 'somethingelse mouse' file
------
grep: cat': No such file or directory
grep: mouse': No such file or directory
Any idea?
Don't try to add single-quotes when you run the command; just put double-quotes around the pattern (including the parameter):
#!/bin/zsh
echo ------
echo grep -e "'something $1'" -e "'somethingelse $2'" file
echo ------
grep -e "something $1" -e "somethingelse $2" file
Note that when echoing it, I used single-quotes inside the double-quotes. They'll be printed, so it'll look ok, but the shell won't treat them as syntactically significant. When actually running grep, you don't want single-quotes at all.
Well, unless the something contains escapes or dollar signs; in that case, you can either escape them:
grep -e "\$ometh\\ng $1" -e "\$ometh\\nge\\se $2" file
Or mix single- and double-quoting, with single-quotes around the fixed pattern part, and double-quotes just around the parameter part:
grep -e '$ometh\ng '"$1" -e '$ometh\nge\se '"$2" file
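A quick way to convince yourself that the escaped and the mixed-quoting forms build the same pattern is to run both against a small file. The sketch below uses an invented fixed part, total $5, a temporary data file, and set -- to stand in for the script's first argument.

```shell
datafile=$(mktemp)
printf '%s\n' 'total $5 cat' 'total 5 cat' 'total $5 dog' > "$datafile"

set -- cat    # stand-in for the script's first argument

# Double quotes only: the backslash and the dollar sign are both escaped,
# so grep receives the pattern  total \$5 cat
escaped=$(grep -e "total \\\$5 $1" "$datafile")

# Mixed quoting: fixed part in single quotes, parameter in double quotes.
mixed=$(grep -e 'total \$5 '"$1" "$datafile")

printf '%s\n' "$escaped" "$mixed"
rm -f "$datafile"
```

Both commands print the same single line; the mixed form is usually easier to read because the fixed part needs no extra escaping for the shell.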
I don't know why you want grep to see your quotes. Assuming your literal string something does not contain spaces or other characters that are significant to the shell (most notably filename expansion wildcards) and you are using zsh,
grep something$1 FILE
would work. Of course if you have spaces in or around your something, you need to quote it:
grep 'something '$1 FILE # Significant space between something and $1
or
grep "something $1" FILE
Since you also mentioned bash: In bash, only the last form (using double quotes) makes sense, because if $1 contained spaces, bash would do word splitting.
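A minimal sketch of the double-quoted form, written so that it behaves the same in bash and zsh, might look like this. The function name search and the contents of the data file are assumptions for the example.

```shell
datafile=$(mktemp)
printf '%s\n' 'something cat' 'something big cat' 'somethingelse mouse' > "$datafile"

# Hypothetical wrapper: each parameter is embedded in a double-quoted
# pattern, so a value containing spaces stays part of one grep argument.
search() {
    grep -e "something $1" -e "somethingelse $2" "$datafile"
}

found=$(search cat mouse)
spaced=$(search 'big cat' mouse)

printf '%s\n' "$found"
printf '%s\n' "$spaced"
rm -f "$datafile"
```

The second call shows why the double quotes matter: without them, a parameter such as big cat would be split into two words in bash.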

Bash: Nested variable expansion

How can I nest operations in bash? e.g I know that
$(basename $var)
will give me just the final part of the path and
${name%.*}
gives me everything before the extension.
How do I combine these two calls, I want to do something like:
${$(basename $var)%.*}
As @sid-m's answer states, you need to change the order of the two expansions because one of them (the % stuff) can only be applied to variables (by giving their name):
echo "$(basename "${var%.*}")"
Other things to mention:
You should use double quotes around every expansion, otherwise you run into trouble if you have spaces in the variable values. I already did that in my answer.
In case you know or expect a specific file extension, basename can strip that off for you as well: basename "$var" .txt (This will print foo for foo.txt in $var.)
You can do it like
echo $(basename ${var%.*})
it is just the order that needs to be changed.
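An alternative that keeps the original order (directory first, extension second) is to go through an intermediate variable. The sketch below is one way to do that; the paths and variable names are invented, and the last two lines illustrate a case where stripping the extension before calling basename may give a surprising result.

```shell
var=/some/folder.d/with/file.tar.gz

name=$(basename "$var")     # file.tar.gz
stem=${name%.*}             # shortest ".*" suffix removed: file.tar
short=${name%%.*}           # longest ".*" suffix removed:  file
nofork=${var##*/}           # same as basename here, without a subprocess

printf '%s\n' "$name" "$stem" "$short" "$nofork"

# A path whose directory contains a dot but whose file has no extension:
var2=/some/folder.d/file
wrong=$(basename "${var2%.*}")   # strips ".d/file" first, leaving "folder"
base2=$(basename "$var2")
right=${base2%.*}                # nothing to strip, stays "file"

printf '%s\n' "$wrong" "$right"
```

Taking the basename first means the % expansion only ever sees the file name, so dots in directory names cannot interfere.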
Assuming you want to split the file name, here is a simple pattern :
$ var=/some/folder/with/file.ext
$ echo $(basename $var) | cut -d "." -f1
file
If you know the file extension in advance, you can tell basename to remove it, either as a second argument or via the -s option. Both these yield the same:
basename "${var}" .extension
basename -s .extension "${var}"
If you don't know the file extension in advance, you can try to grep the proper part of the string.
### grep any non-slash followed by anything ending in dot and non-slash
grep -oP '[^/]*(?=\.[^/]*$)' <<< "${var}"

Using egrep and regular expression together

I want to search the text file below for words that end in _letter, and get the whole portion up to "::". There are no spaces between any letters.
blahblah:/blahblah::abc_letter:/blahblah/blahblah
blahblah:/blahblah::cd_123_letter:/blahblah/blahblah
blahblah:::/blahblah::24_cde_letter:/blahblah/blahblah
blahblah::/blahblah::45a6_letter:/blahblah/blahblah
blahblah:/blahblah::fgh_letter:/blahblah/blahblah
blahblah:/blahblah::789_letter:/blahblah/blahblah
I tried
egrep -o '*_letter'
and
egrep -o "*_letter"
But it only returns the word _letter
Then I want to feed the output into the for loop of a shell script, so the script will look like the following:
for i in [grep command]
mkdir $i
end
It will create the following directories
abc_letter/
cd_123_letter/
24_cde_letter/
45a6_letter/
fgh_letter/
789_letter/
ps: The result between :: and _letter doesn't contain any special character, only alphanumeric character
also my system doesn't have perl
Assuming no spaces or new-lines:
for i in $(sed 's/^.*:\([^/]*_letter\):.*$/\1/g' infile); do
mkdir $i
done
To extract the strings that run from : to _letter out of file.txt and use them in your for loop, you can use the following egrep command and revise your script.sh like this:
#!/bin/bash
for i in $(egrep -o "[^:]+_letter" file.txt); do
mkdir -p $i
done
Then you run ./script.sh, and later you check with ls, you see:
$ ls -1
24_cde_letter
45a6_letter
789_letter
abc_letter
cd_123_letter
fgh_letter
file.txt
script.sh
Explanation
Your original egrep -o '*_letter' probably confused bash filename expansion (globbing) with regular expressions.
In a bash glob, * stands for any string, so *something matches anything followed by something.
In a regular expression, however, * means "the preceding item, zero or more times". Since * is at the very beginning of what you wrote, there is nothing before it to repeat, so it contributes nothing to the match.
The only thing egrep can match is therefore _letter, and since the -o option displays only the matched text, one match per line, you originally saw just a list of _letter lines.
Our new changes:
egrep pattern starts with [^ ... ], a negation, matches the opposite of what characters you put within. We put : within.
The + says to match the preceding one or more times.
So combined, it says look for anything-but-:, and do this one or more times.
Thus of course it matches anything after :, and keeps matching, until the next part of the pattern
The next part of the pattern is just _letter
egrep -o so only matched text will be shown, one per line
So in this way, from lines such as:
blahblah:/blahblah::abc_letter:/blahblah/blahblah
It successfully extracts:
abc_letter
Then, changes to your bash script:
Bash command substitution $() to have the results of the egrep command sent to the for-loop
for i value...; do ... done syntax
mkdir -p just a convenience in case you are re-testing, it will not error if directory was already made.
So altogether it helps to extract the pattern you wanted and generate directories with those names.
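The whole procedure can be tried out safely in a scratch directory. The sketch below is a variation on the script above rather than a copy of it: it uses grep -E (the non-deprecated spelling of egrep) and a while read loop, which is an assumption-free way to keep each extracted name intact even if it contained unusual characters. The three input lines are taken from the question.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > file.txt <<'EOF'
blahblah:/blahblah::abc_letter:/blahblah/blahblah
blahblah:/blahblah::cd_123_letter:/blahblah/blahblah
blahblah:::/blahblah::24_cde_letter:/blahblah/blahblah
EOF

# One or more non-colon characters followed by _letter, one match per line.
grep -Eo '[^:]+_letter' file.txt | while IFS= read -r name; do
    mkdir -p -- "$name"
done

# Record what was created, then clean up the scratch directory.
created=$(find . -type d -name '*_letter' | sort)
printf '%s\n' "$created"

cd / && rm -rf "$workdir"
```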

grep a pattern and output non-matching part of line

I know it is possible to invert grep output with the -v flag. Is there a way to only output the non-matching part of the matched line? I ask because I would like to use the return code of grep (which sed won't have). Here's sort of what I've got:
tags=$(grep "^$PAT" >/dev/null 2>&1)
[ "$?" -eq 0 ] && echo $tags
You could use sed:
$ sed -n "/$PAT/s/$PAT//p" $file
The only problem is that it'll return an exit code of 0 as long as the pattern is good, even if the pattern can't be found.
Explanation
The -n parameter tells sed not to print out any lines. Sed's default is to print out all lines of the file. Let's look at each part of the sed program in between the slashes. Assume the program is /1/2/3/4/5:
/$PAT/: This says to look for all lines that match the pattern $PAT and to run your substitution command only on those. Otherwise, sed would operate on all lines, even if there is no substitution.
/s/: This says you will be doing a substitution
/$PAT/: This is the pattern you will be substituting. It's $PAT. So, you're searching for lines that contain $PAT and then you're going to substitute the pattern for something.
//: This is what you're substituting for $PAT. It is null. Therefore, you're deleting $PAT from the line.
/p: This final p says to print out the line.
Thus:
You tell sed not to print out the lines of the file as it processes them.
You're searching for all lines that contain $PAT.
On these lines, you're using the s command (substitution) to remove the pattern.
You're printing out the line once the pattern is removed from the line.
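The steps above can be tried on a single line of text. In this sketch the pattern and the sample sentence are borrowed from another answer in this thread, and the second sed call illustrates the exit-status caveat mentioned earlier. It assumes the pattern contains no slash, since / is used as the sed delimiter.

```shell
PAT='most of '
line='The apples are ripe. I will use most of them for jam.'

# Print only lines that contain $PAT, with the first occurrence removed.
stripped=$(printf '%s\n' "$line" | sed -n "/$PAT/s/$PAT//p")
printf '%s\n' "$stripped"

# No match: sed prints nothing but still exits with status 0.
printf '%s\n' 'no match here' | sed -n "/$PAT/s/$PAT//p"
sed_status=$?
echo "sed exit status without a match: $sed_status"
```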
How about using a combination of grep, sed and $PIPESTATUS to get the correct exit-status?
$ echo "Humans are not proud of their ancestors, and rarely invite them round to dinner" | grep dinner | sed -n "/dinner/s/dinner//p"
Humans are not proud of their ancestors, and rarely invite them round to 
$ echo "${PIPESTATUS[1]}"
0
The members of the PIPESTATUS array hold the exit status of each respective command executed in a pipe. ${PIPESTATUS[0]} holds the exit status of the first command in the pipe, ${PIPESTATUS[1]} the exit status of the second command, and so on. In bash the braces are required: $PIPESTATUS[1] expands to the first element followed by a literal [1].
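Because PIPESTATUS is overwritten by every subsequent command, it is usually copied right after the pipeline. The sketch below requires bash (PIPESTATUS is a bash array); the variable names statuses and statuses_nomatch are made up for the example.

```shell
# Requires bash: PIPESTATUS is not available in plain POSIX sh.
text='Humans are not proud of their ancestors, and rarely invite them round to dinner'

# Match: grep succeeds, so its slot in PIPESTATUS is 0.
printf '%s\n' "$text" | grep 'dinner' | sed 's/dinner//'
statuses=("${PIPESTATUS[@]}")      # copy immediately, before it is overwritten

# No match: grep exits with 1 even though sed, the last command, exits with 0.
printf '%s\n' "$text" | grep 'lunch' | sed 's/lunch//'
statuses_nomatch=("${PIPESTATUS[@]}")

echo "grep status with a match:    ${statuses[1]}"
echo "grep status without a match: ${statuses_nomatch[1]}"
```

Index 1 is used because grep is the second command in each pipeline; index 0 belongs to printf and index 2 to sed.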
Your $tags will never have a value because grep's output is sent to /dev/null. Apart from that little problem, grep is given no input.
echo hello |grep "^he" -q ;
ret=$? ;
if [ $ret -eq 0 ];
then
echo there is he in hello;
fi
a successful return code is 0.
...here is 1 take at your 'problem':
pat="most of ";
data="The apples are ripe. I will use most of them for jam.";
echo $data |grep "$pat" -q;
ret=$?;
[ $ret -eq 0 ] && echo $data |sed "s/$pat//"
The apples are ripe. I will use them for jam.
... exact same thing?:
echo The apples are ripe. I will use most of them for jam. | sed ' s/most\ of\ //'
It seems to me you have confused the basic concepts. What are you trying to do anyway?
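If the goal is to get both the stripped text and grep's return code, one possible arrangement is to let grep decide and only then hand the matching lines to sed. This is a sketch under assumptions: the helper name strip_match is invented, the pattern is treated as a basic regular expression by both tools, and it must not contain a slash because / is the sed delimiter.

```shell
# Hypothetical helper: reads stdin, prints each line matching pattern $1
# with the first match removed, and returns grep's exit status.
strip_match() {
    matched=$(grep -e "$1") || return    # no match: pass grep's status on
    printf '%s\n' "$matched" | sed "s/$1//"
}

data='The apples are ripe. I will use most of them for jam.'

out=$(printf '%s\n' "$data" | strip_match 'most of ')
status_match=$?

printf '%s\n' "$data" | strip_match 'pears' > /dev/null
status_nomatch=$?

printf '%s\n' "$out"
echo "status with match: $status_match, without: $status_nomatch"
```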
I am going to answer the title of the question directly instead of considering the detail of the question itself:
"grep a pattern and output non-matching part of line"
The title to this question is important to me because the pattern I am searching for contains characters that sed will assign special meaning to. I want to use grep because I can use -F or --fixed-strings to cause grep to interpret the pattern literally. Unfortunately, sed has no literal option, but both grep and bash have the ability to interpret patterns without considering any special characters.
Note: In my opinion, trying to backslash or escape special characters in a pattern appears complex in code and is unreliable because it is difficult to test. Using tools which are designed to search for literal text leaves me with a comfortable 'that will work' feeling without considering POSIX.
I used both grep and bash to produce the result because bash is slow and my use of fast grep creates a small output from a large input. This code searches for the literal twice, once during grep to quickly extract matching lines and once during =~ to remove the match itself from each line.
while IFS= read -r || [[ -n "$REPLY" ]]; do
    if [[ "$REPLY" =~ (.*)("$LITERAL_PATTERN")(.*) ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}${BASH_REMATCH[3]}"
    else
        printf "NOT-REFOUND" # should never happen
        exit 1
    fi
done < <(grep -F "$LITERAL_PATTERN" < "$INPUT_FILE")
Explanation:
IFS= Reassigning the input field separator is a special prefix for a read statement. Assigning IFS to the empty string causes read to accept each line with all spaces and tabs literally until end of line (assuming IFS is default space-tab-newline).
-r Tells read to accept backslashes in the input stream literally instead of considering them as the start of an escape sequence.
$REPLY Is created by read to store characters from the input stream. The newline at the end of each line will NOT be in $REPLY.
|| [[ -n "$REPLY" ]] The logical or causes the while loop to accept input which is not newline terminated. This does not need to exist because grep always provides a trailing newline for every match. But, I habitually use this in my read loops because without it, characters between the last newline and the end of file will be ignored because that causes read to fail even though content is successfully read.
=~ (.*)("$LITERAL_PATTERN")(.*) ]] Is a standard bash regex test, but anything in quotes is taken as a literal. If I wanted =~ to honor the regex characters contained in $LITERAL_PATTERN, then I would need to remove the double quotes.
"${BASH_REMATCH[@]}" Is created by [[ =~ ]], where [0] is the entire match and [N] is the contents of the match in the Nth set of parentheses.
Note: I do not like to reassign stdin to a while loop because it is easy to error and difficult to see what is happening later. I usually create a function for this type of operation which acts typically and expects file_name parameters or reassignment of stdin during the call.
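Following the note above, the loop can be wrapped in a function that reads from standard input. The sketch below requires bash ([[ =~ ]] and BASH_REMATCH are bash features); the function name strip_literal and the sample pattern are assumptions chosen so that the regex metacharacters in the pattern would matter if they were not taken literally.

```shell
# Requires bash. The pattern contains ".", and "*", which are regex
# metacharacters; here they are meant to be matched literally.
LITERAL_PATTERN='a.b*c'

# Hypothetical helper: reads stdin, keeps lines containing the literal
# pattern, and prints them with the match removed.
strip_literal() {
    grep -F -- "$LITERAL_PATTERN" | while IFS= read -r || [[ -n "$REPLY" ]]; do
        if [[ "$REPLY" =~ (.*)("$LITERAL_PATTERN")(.*) ]]; then
            printf '%s\n' "${BASH_REMATCH[1]}${BASH_REMATCH[3]}"
        fi
    done
}

result=$(printf '%s\n' 'keep a.b*c this' 'aXbbc is not literal' | strip_literal)
printf '%s\n' "$result"
```

The second input line would match a.b*c as a regular expression, but it does not contain the literal text, so grep -F drops it. Because the first (.*) is greedy, a line containing the pattern more than once has its last occurrence removed.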

Resources