grep up to and including equal sign for CLI parameter - bash

My goal is to match a command line argument prefix that looks like:
--abc=
Both of the patterns below (and many others) also allow:
--abc==
I can't find a way to make grep ensure there is exactly one equals sign.
grep -i '^--[a-z]\{2,\}=\{1,1\}'
grep -i '^--[a-z]\{2,\}='
grep 2.20
CentOS Linux 7.3.1611

ERE:
^--[[:alpha:]]{2,}=[^=]+$
^--[[:alpha:]]{2,}= matches --, then two or more alphabetic characters in your locale, then a literal =
[^=]+$ matches one or more characters that are not = at the end
BRE:
^--[[:alpha:]]\{2,\}=[^=]\+$
Example:
$ grep -E '^--[[:alpha:]]{2,}=[^=]*$' <<<'--foobar=spam'
--foobar=spam
$ grep -E '^--[[:alpha:]]{2,}=[^=]*$' <<<'--foobar=23'
--foobar=23
$ grep -E '^--[[:alpha:]]{2,}=[^=]*$' <<<'--123ad='
$ grep -E '^--[[:alpha:]]{2,}=[^=]+$' <<<'--spamegg='
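The patterns above can be wrapped in small shell functions for use in a script. This is a minimal sketch: the function names are hypothetical, and the second pattern is an assumed variant (not from the answer) that only checks the --name= prefix and rejects a doubled equals sign, which is closer to what the question describes.

```shell
# Hypothetical helpers; the names are illustrative, not from the original post.

# Succeeds if $1 is a complete --name=value argument whose value has no "=".
is_long_opt() {
    printf '%s\n' "$1" | grep -Eq '^--[[:alpha:]]{2,}=[^=]+$'
}

# Assumed variant: succeeds if $1 starts with --name= and that "=" is
# followed by a character other than "=" or by the end of the line.
has_single_eq_prefix() {
    printf '%s\n' "$1" | grep -Eq '^--[[:alpha:]]{2,}=([^=]|$)'
}

for arg in '--abc=1' '--abc=' '--abc==1'; do
    if has_single_eq_prefix "$arg"; then
        printf 'accepted: %s\n' "$arg"
    else
        printf 'rejected: %s\n' "$arg"
    fi
done
```

The group ([^=]|$) is what rules out a second equals sign: after the first =, the next thing must be either some other character or the end of the line.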

Related

Grep with a regex character range that includes the NULL character

When I include the NULL character (\x00) in a regex character range in BSD grep, the result is unexpected: no characters match. Why is this happening?
Here is an example:
$ echo 'ABCabc<>/ă' | grep -o [$'\x00'-$'\x7f']
Here I expect every character except the last one to match; however, the result is no output (no matches).
Alternatively, when I start the character range from \x01, it works as expected:
$ echo 'ABCabc<>/ă' | grep -o [$'\x01'-$'\x7f']
A
B
C
a
b
c
<
>
/
Also, here are my grep and BASH versions:
$ grep --version
grep (BSD grep) 2.5.1-FreeBSD
$ echo $BASH_VERSION
3.2.57(1)-release
On BSD grep, you may be able to use this:
LC_ALL=C grep -o '[[:print:][:cntrl:]]' <<< 'ABCabc<>/ă'
A
B
C
a
b
c
<
>
/
Or you can install GNU grep with Homebrew (which installs it as ggrep by default) and run:
grep -oP '[[:ascii:]]' <<< 'ABCabc<>/ă'
Noting that $'...' is a shell quoting construct, this,
$ echo 'ABCabc<>/ă' | grep -o [$'\x00'-$'\x7f']
would try to pass a literal NUL character as part of the command line argument to grep. That's impossible to do in any Unix-like system, as the command line arguments are passed to the process as NUL-terminated strings. So in effect, grep sees just the arguments -o and [.
You would need to create some pattern that matches the NUL byte without including it literally. But I don't think grep supports the \000 or \x00 escapes itself. Perl does, though, so this prints the input line with the NUL:
$ printf 'foo\nbar\0\n' |perl -ne 'print if /\000/'
bar
As an aside, at least GNU grep doesn't seem to like that kind of range expression, so if you were to use it, you'd have to do something different. In the C locale, '[[:cntrl:][:print:]]' might work to match the characters from \x01 to \x7f, but I didn't check comprehensively.
The manual for grep has some descriptions of the classes.
Note also that [$'\x00'-$'\x7f'] has an unquoted pair of [ and ] and so is a shell glob. This isn't related to the NUL byte, but if you had files that match the glob (any one-letter names, if the glob works on your system -- it doesn't on my Linux), or had failglob or nullglob set, it would probably give results you didn't want. Instead, quote the brackets too, as in $'[\x01-\x7f]' (starting at \x01, because a NUL inside $'...' would still cut the argument short).
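The character-class approach from the answers can be sketched as a small filter. The function name is hypothetical, and the non-ASCII character is written with octal escapes so the example does not depend on the encoding of the script file. This assumes a grep that treats each byte as one character in the C locale, as current GNU grep does.

```shell
# Hypothetical helper: print each ASCII character of stdin on its own line.
# LC_ALL=C makes grep work byte by byte, so the two bytes of the non-ASCII
# character fall outside both classes. NUL bytes are not covered here.
ascii_chars() {
    LC_ALL=C grep -o '[[:print:][:cntrl:]]'
}

# \304\203 is the UTF-8 encoding of the last character in the question's input.
printf 'ABCabc<>/\304\203\n' | ascii_chars
```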

combine grep and grep -v search together

I am trying to combine grep and grep -v search together.
The output should contain all lines ending with .xml, but exclude lines starting with $.
Here are the commands I have tried; none worked:
grep *.xml file1.txt | grep -v '$' file1.txt > output
grep *.xml | grep -v '$' file1.txt > output
grep *.xml grep -v '$' file1.txt > output
grep *.xml '$' file1.txt > output
To match a $ at the start of a line, anchor it to the start of the line with ^. Also, $ by itself matches the end of the line (it's a special character, just like ^), and * will not do what you think it does (it works differently in regular expressions compared to in shell globbing patterns). So,
grep -v '^\$'
will filter out all lines starting with a $.
You can do either
grep '\.xml$' file1.txt | grep -v '^\$'
or
grep '^[^$].*\.xml$' file1.txt
to find all lines in the file file1.txt that do not start with $ but that end with .xml.
Notice that I also escape the dot in .xml, as it would otherwise match any character, and that the second command combines both criteria by using a negated bracket expression ([^$]) that matches any single character except $ (the .* matches any number of any characters).
The single quotes are necessary so that the shell won't interpret the regular expression as a shell globbing pattern.
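Both variants can be tried on a few sample lines. This is a minimal sketch; the function names and the sample data are made up for illustration.

```shell
# Hypothetical helpers that read lines on stdin.

# Two-stage filter: keep lines ending in .xml, then drop lines starting with $.
xml_lines_pipeline() {
    grep '\.xml$' | grep -v '^\$'
}

# Single regex: the first character is not $, and the line ends in .xml.
xml_lines_single() {
    grep '^[^$].*\.xml$'
}

printf '%s\n' 'a.xml' '$b.xml' 'c.txt' 'dxml' | xml_lines_pipeline
printf '%s\n' 'a.xml' '$b.xml' 'c.txt' 'dxml' | xml_lines_single
```

The two are not exactly equivalent: the single regex needs at least one character before .xml, so a line consisting only of .xml passes the pipeline but not the single expression.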
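Alternatively, use the cat command to send the file contents to grep.
Then filter with a regular expression; lines that start with a $ symbol are matched by '^[$]'.
So you can use the command cat *.xml | grep -v '^[$]'. Note that this reads every file whose name ends in .xml, rather than selecting the lines of file1.txt that end in .xml.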

How to exclude hyphen as word separator in bash

I am unable to grep for an exact word match when the data contains hyphens, as in
/home/imper-home,3,0,0,0,jim.imper,NONE,NONE,NONE,http://sanjose
/home/imper,15,10,3,30,jim.imper,NONE,NONE,NONE,http://sanjose-age
I tried
grep -w imper
but it returns both /home/imper-home and /home/imper.
I want only /home/imper-home to be returned when using
grep -wv /home/imper
This will work in general:
word=imper
grep -w "$word" file | grep -v "$word-"
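Applied to the two sample lines from the question, the pipeline can be sketched as follows. The function name is hypothetical, and it reads stdin instead of a file so the example is self-contained.

```shell
# Hypothetical helper: print stdin lines that contain $1 as a whole word,
# skipping any line in which $1 is directly followed by a hyphen.
# $1 is used as a regular expression, as in the answer above.
match_word_no_hyphen() {
    grep -w -- "$1" | grep -v -- "$1-"
}

printf '%s\n' \
    '/home/imper-home,3,0,0,0,jim.imper,NONE,NONE,NONE,http://sanjose' \
    '/home/imper,15,10,3,30,jim.imper,NONE,NONE,NONE,http://sanjose-age' |
    match_word_no_hyphen imper
```

The second grep drops a line whenever the word is followed by a hyphen anywhere in it, even if the word also appears on its own elsewhere in the same line (as jim.imper does in the first sample line).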

sed or grep to read between a set of parentheses

I'm trying to read a version number from between a set of parentheses, from this output of some command:
Test Application version 1.3.5
card 0: A version 0x1010000 (1.0.0), 20 ch
Total known cards: 1
What I'm looking to get is 1.0.0.
I've tried variations of sed and grep:
command.sh | grep -o -P '(?<="(").*(?=")")'
command.sh | sed -e 's/(\(.*\))/\1/'
and plenty of variations. No luck :-(
Help?
You were almost there! With grep -P (Perl-compatible mode), use backslashes, not double quotes, to keep the literal meaning of the parentheses:
grep -o -P '(?<=\().*(?=\))'
With GNU grep you can also use the \K escape sequence, which is available in Perl mode:
grep -oP '\(\K[^)]+'
\K discards what has been matched so far. In this case the opening ( is dropped from the match.
Alternatively you could use awk:
awk -F'[()]' 'NF>1{print $2}'
The command splits input lines using parentheses as delimiters. Once a line has been split into multiple fields (meaning the parentheses were found), the version number is the second field and gets printed.
Btw, the sed command you've shown should be:
sed -ne 's/.*(\(.*\)).*/\1/p'
There are a couple of variations that will work. First with grep and sed:
grep '(' filename | sed 's/^.*[(]\(.*\)[)].*$/\1/'
or with a short shell script:
#!/bin/sh
# Print the text between parentheses for each line of the file given as $1.
while read -r line; do
    value=$(expr "$line" : ".*(\(.*\)).*")
    if [ "x$value" != "x" ]; then
        printf "%s\n" "$value"
    fi
done <"$1"
Both return 1.0.0 for your given input file.
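The sed and awk variants can be checked against the sample output from the question. This is a minimal sketch; the function names are hypothetical, and the command output is passed in on stdin.

```shell
# Hypothetical helpers that read the command output on stdin.

# sed: print the parenthesised text of each line that has a "(...)" group.
paren_with_sed() {
    sed -n 's/.*(\(.*\)).*/\1/p'
}

# awk: split on "(" and ")" and print the second field when a split happened.
paren_with_awk() {
    awk -F'[()]' 'NF>1{print $2}'
}

sample='Test Application version 1.3.5
card 0: A version 0x1010000 (1.0.0), 20 ch
Total known cards: 1'

printf '%s\n' "$sample" | paren_with_sed
printf '%s\n' "$sample" | paren_with_awk
```

The two differ when a line holds several parenthesised groups: the sed pattern is greedy and starts at the last opening parenthesis, while the awk version prints what follows the first one.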

grep not finding ".*" string values

I have a file temp.txt as below.
a.*,super
I want to grep .* to check whether the value is present in the file or not.
Command used:
grep -i ".*" temp.txt
returns nothing
This is because grep treats the pattern as a regular expression, in which . and * are special characters.
To make grep interpret it as a literal, use -F.
grep -F ".*" temp.txt
Also, note that -i is not needed, because there is no case distinction to take into account (it is used, for example, to make grep -i "ab" return AB, aB, Ab and ab).
As man grep says:
-F, --fixed-strings
    Interpret PATTERN as a list of fixed strings, separated by newlines, any of which is to be matched. (-F is specified by POSIX.)
-i, --ignore-case
    Ignore case distinctions in both the PATTERN and the input files. (-i is specified by POSIX.)
Using awk
awk '/\.\*/' file
or fgrep (an older spelling of grep -F, which recent GNU grep releases flag as obsolescent)
fgrep ".*" file
Both . and * have special meaning in regular expressions. Escape them to match them literally.
$ cat temp.txt
a.*,super
$ grep "\.\*" temp.txt
a.*,super
$ echo $?
0
$ grep "there-is-no-such-string" temp.txt
$ echo $?
1
-i is not needed because there are no letters in the regular expression.
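For a presence check in a script, the -F option and the exit status can be combined. This is a minimal sketch: the function name is hypothetical, and it reads stdin rather than temp.txt so that it runs on its own.

```shell
# Hypothetical helper: succeed if any stdin line contains $1 as a literal string.
# -F turns off regular expression matching; -q prints nothing and only sets
# the exit status.
contains_literal() {
    grep -Fq -- "$1"
}

if printf 'a.*,super\n' | contains_literal '.*'; then
    echo 'found'
else
    echo 'not found'
fi
```

Without -F, the pattern .* would match every line, including lines that contain neither a dot nor an asterisk.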

Resources