grep match exact substring ignoring regex syntax [duplicate] - bash

This question already has answers here:
Grep for literal strings
(6 answers)
Closed 6 years ago.
Is there some way to make grep match an exact string, and not parse it as a regex? Or is there some tool to escape a string properly for grep?
$ version=10.4
$ echo "10.4" | grep $version
10.4
$ echo "1034" | grep $version # shouldn't match
1034

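Use grep -F. (fgrep is an older name for the same thing; recent GNU grep releases warn that it is obsolescent.)
$ echo "1034" | grep -F "$version" # no output: the dot is now literal
$ echo "10.4" | grep -F "$version"
10.4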
See man page:
-F, --fixed-strings
Interpret PATTERN as a list of fixed strings, separated
by newlines, any of which is to be matched.
I was looking for the term "literal match" or "fixed string".
(See also Using grep with a complex string and How can grep interpret literally a string that contains an asterisk and is fed to grep through a variable?)
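The difference between the two modes can be checked with a minimal sketch like the one below. The names input, regex_hits and fixed_hits are made up for illustration, and quoting "$version" and adding -- are defensive habits rather than something the answer requires.

```shell
#!/bin/bash
version=10.4

# Two candidate lines: one literal match and one that only matches as a regex.
input=$(printf '%s\n' '10.4' '1034')

# Default mode: the pattern is a regex, so "." matches any character.
regex_hits=$(printf '%s\n' "$input" | grep -- "$version")

# -F: the pattern is a fixed string, so "." matches only a literal dot.
fixed_hits=$(printf '%s\n' "$input" | grep -F -- "$version")

printf 'regex mode:\n%s\n' "$regex_hits"
printf 'fixed mode:\n%s\n' "$fixed_hits"
```

In regex mode both lines come back; with -F only 10.4 does.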

Related

Can be run on the command line, but not in a shell script? [duplicate]

This question already has answers here:
Bash: execute content of variable including pipe [duplicate]
(2 answers)
Closed 1 year ago.
I want to use grep to search for matching strings in the file, but because the file is too big, I only search for the first 500 lines.
I wrote in the shell script:
#!/bin/bash
patterns=(
llc_prefetcher_operat
to_prefetch
llc_prefetcher_cache_fill
)
search_file_path="mix1-bimodal-no-bop-lru-4core.txt"
echo ${#patterns[*]}
cmd="head -500 ${search_file_path} | grep -a "
for(( i=0;i<${#patterns[@]};i++)) do
cmd=$cmd" -e "\"${patterns[i]}\"
done;
echo $cmd
$cmd >junk.log
The result of running the script is:
3
head -500 mix1-bimodal-no-bop-lru-4core.txt | grep -a -e "llc_prefetcher_operat" -e "to_prefetch" -e "llc_prefetcher_cache_fill"
head: invalid option -a
Try 'head --help' for more information.
The second-to-last line of the script prints the command string it is about to execute. When I run that exact string directly on the command line, it succeeds:
head -500 mix1-bimodal-no-bop-lru-4core.txt | grep -a -e "llc_prefetcher_operat" -e "to_prefetch" -e "llc_prefetcher_cache_fill"
Note that if I leave out grep's -a option, grep treats the input as a binary file and I do not get the matching lines.
Why does this problem occur? Thank you!
Instead of trying to build a string holding a complex command, you're better off using grep's -f option and bash process substitution to pass the list of patterns to search for:
head -500 "$search_file_path" | grep -Faf <(printf "%s\n" "${patterns[@]}") > junk.log
It's shorter, simpler and less error prone.
(I added -F to the grep options because none of your example patterns have any regular expression metacharacters; so fixed string searching will likely be faster)
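The biggest problem with what you're doing is that when $cmd is expanded, the result is only word-split; it is not parsed again as shell syntax. The | is therefore treated as just another argument to head, along with grep, -a and everything after it, which is why head complains about -a. It is not treated as a pipeline delimiter the way a literal | is. The escaped double quotes in $cmd are likewise passed along as literal characters.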
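If you still want to build the option list in a loop, a bash array is one way to do it. The sketch below first reproduces the pitfall with a harmless command, then assembles only grep's arguments in an array and writes the pipe literally. The names broken, grep_args, sample and matches are made up for illustration, and sample is a hypothetical stand-in for the first lines of the real file.

```shell
#!/bin/bash
# Pitfall: a pipe stored in a string is not shell syntax after expansion.
cmd='echo hello | tr a-z A-Z'
broken=$($cmd)   # echo receives "hello", "|", "tr", ... as plain arguments

# Alternative: keep only grep's arguments in an array; write the pipe literally.
patterns=(llc_prefetcher_operat to_prefetch llc_prefetcher_cache_fill)
grep_args=(-a)
for p in "${patterns[@]}"; do
    grep_args+=(-e "$p")
done

# Hypothetical stand-in for the first lines of the real log file.
sample=$(printf '%s\n' 'llc_prefetcher_operate called' 'unrelated line' 'to_prefetch 0x10')
matches=$(printf '%s\n' "$sample" | head -n 500 | grep "${grep_args[@]}")

printf '%s\n' "$broken" "$matches"
```

Because "${grep_args[@]}" expands to one word per array element, no quoting characters need to be embedded in the data.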

grep -v excludes a file it should not exclude [duplicate]

This question already has answers here:
Using grep to search for a string that has a dot in it
(9 answers)
Closed 5 years ago.
My understanding is that grep -v prints the lines that do not contain the given string.
So why doesn't the file named highscores.txt appear when I use grep -v ".c"?
$ ls -1
a.out
easy.txt
hard.txt
highscores.txt
main.c
medium.txt
util.c
$ ls -1 | grep -v ".c"
a.out
easy.txt
hard.txt
medium.txt
The ".c" in your grep command is a regular expression, and . means "any character".
To fix this, you can
Escape the period:
grep -v '\.c$'
I've added the "end of string" anchor $ to exclude false positives for files like something.cpp.
Use the -F option for "fixed strings":
grep -vF '.c'
Notice that this would also exclude something.cpp, which probably isn't what you want.
Use extended glob patterns to exclude anything ending in .c:
shopt -s extglob
ls -1 !(*.c)
Here, *.c is not a regular expression, but a glob pattern, where . is a literal period and has no special meaning.
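The three grep variants can be compared side by side with a sketch like the following. The file list is modeled on the listing in the question, with a hypothetical notes.cpp added to show where the variants differ; the variable names are also made up.

```shell
#!/bin/bash
# Sample names; notes.cpp is a hypothetical addition for illustration.
files=$(printf '%s\n' a.out easy.txt hard.txt highscores.txt main.c medium.txt notes.cpp util.c)

# Unescaped dot: drops every name containing "any character followed by c",
# which includes highscores.txt (the "sc" in "highscores").
loose=$(printf '%s\n' "$files" | grep -v '.c')

# Escaped dot plus end anchor: drops only names that end in ".c".
strict=$(printf '%s\n' "$files" | grep -v '\.c$')

# Fixed string: drops names containing a literal ".c" anywhere, so notes.cpp goes too.
fixed=$(printf '%s\n' "$files" | grep -vF '.c')

printf 'loose:\n%s\n\nstrict:\n%s\n\nfixed:\n%s\n' "$loose" "$strict" "$fixed"
```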

Why did I get different answers when I changed grep to egrep in the latter half of each [duplicate]

This question already has answers here:
Difference between egrep and grep
(6 answers)
Closed 6 years ago.
$ egrep "^COMP[29]041" enrolments | grep "|F$" | wc -l
24
$ egrep "^COMP[29]041" enrolments | egrep "|F$" | wc -l
166
$
The content of file enrolments:
COMP2041|4836917|Ruld, Ruld |3978/2|M
COMP2041|4850109|Rvyiparzal, Ilbvuy |3979/3|M
COMP2041|2858836|Rzild, Fia Held |3730/4|M
COMP2041|4823158|Sheld, Yild |3978/2|M
COMP2041|4818044|Sheo, Sheo |3978/2|M
COMP2041|4818497|Sheo, Xa |3978/2|M
COMP9041|4899688|Shild, Ge |8680/2|M
COMP2041|4869506|Shild, Yild |3645/2|M
COMP9041|4897426|Shild, Yild |8680/2|M
COMP9041|4368551|Sho, Wuld |8684 |M
COMP2041|4339940|Shuld, Puaxail Baili |3978/3|F
COMP2041|4330093|Veh, Yeold-He |3711/3|M
COMP2041|2230267|Vikil, Ivrha |3978/3|F
COMP2041|4312663|Viy Chiobhova, Jiozrigh |3978/1|M
.......
The question is why I got different answers when I changed grep to egrep in the latter half of each.
What are the differences between grep and egrep?
In egrep (or, preferably, grep -E), the | is a metacharacter, whereas in plain grep it is a plain (non-meta) character.
The |F$ term in egrep looks for an empty string or F at the end of line; it finds an empty string on every line.
The same term in grep looks for a |F at the end of line. To look for that with egrep, you'd need to escape the metacharacter with a backslash: grep -E '\|F$' enrolments.
In short, the plain grep command understands Basic Regular Expressions (BRE). The egrep or 'extended grep' command understands Extended Regular Expressions (ERE). Some versions of grep (such as GNU grep) can be compiled to recognize Perl-Compatible Regular Expressions (PCRE).
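A small sketch of the same effect, using three rows copied from the enrolments sample above. The variable names are made up, and the unescaped ERE case reflects the GNU grep behaviour the answer describes; other grep implementations may reject an empty alternative instead.

```shell
#!/bin/bash
# Three rows copied from the sample of the enrolments file.
rows=$(printf '%s\n' \
    'COMP2041|4339940|Shuld, Puaxail Baili |3978/3|F' \
    'COMP2041|4836917|Ruld, Ruld |3978/2|M' \
    'COMP9041|4899688|Shild, Ge |8680/2|M')

# BRE: "|" is an ordinary character, so this counts lines ending in "|F".
bre_count=$(printf '%s\n' "$rows" | grep -c '|F$')

# ERE with "|" escaped: same meaning as the BRE pattern above.
ere_escaped_count=$(printf '%s\n' "$rows" | grep -Ec '\|F$')

# ERE with "|" unescaped: "empty string OR F at end of line". With GNU grep
# the empty alternative matches every line; "|| true" guards against
# implementations that refuse the pattern.
ere_unescaped_count=$(printf '%s\n' "$rows" | grep -Ec '|F$' || true)

printf 'BRE: %s  ERE escaped: %s  ERE unescaped: %s\n' \
    "$bre_count" "$ere_escaped_count" "$ere_unescaped_count"
```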

How to exclude hyphen as word separator in bash

I am unable to grep for an exact word match when the text contains a hyphen, as in
/home/imper-home,3,0,0,0,jim.imper,NONE,NONE,NONE,http://sanjose
/home/imper,15,10,3,30,jim.imper,NONE,NONE,NONE,http://sanjose-age
I tried
grep -w imper
but it returns both the /home/imper-home line and the /home/imper line.
I want only /home/imper-home to be returned when I use
grep -wv /home/imper
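This will work in general: match the word with -w, then drop every line in which the word is immediately followed by a hyphen.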
word=imper
grep -w "$word" file | grep -v "$word-"
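The sketch below runs that pipeline on the two sample lines from the question. It also shows an alternative that is not part of the answer: since the data looks comma-separated, awk can compare the first field exactly. That the path is always the first field is an assumption, and the variable names are made up.

```shell
#!/bin/bash
word=imper

# The two sample lines from the question.
lines=$(printf '%s\n' \
    '/home/imper-home,3,0,0,0,jim.imper,NONE,NONE,NONE,http://sanjose' \
    '/home/imper,15,10,3,30,jim.imper,NONE,NONE,NONE,http://sanjose-age')

# -w alone: "-" is not a word character, so both lines match.
both=$(printf '%s\n' "$lines" | grep -cw -- "$word")

# The answer's pipeline: keep whole-word matches, then drop lines with "imper-".
piped=$(printf '%s\n' "$lines" | grep -w -- "$word" | grep -v -- "$word-")

# Alternative (assumption: the path is the first comma-separated field).
exact=$(printf '%s\n' "$lines" | awk -F, -v p="/home/$word" '$1 == p')

printf 'grep -w matches %s lines\npipeline: %s\nawk: %s\n' "$both" "$piped" "$exact"
```

Changing == to != in the awk condition would keep the /home/imper-home line instead.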

Why do you have to escape | and + in grep between apostrophes?

I was under the impression that within single quotes, e.g. 'pattern', bash special characters are not interpolated, so one need only escape single quotes themselves.
Why then does echo "123" | grep '[0-9]+' output nothing, whereas echo "123" | grep '[0-9]\+' (plus sign escaped) output 123? (Likewise, echo "123" | grep '3|4' outputs nothing unless you escape the |.)
This is under bash 4.1.2 and grep 2.6.3 on CentOS 6.5.
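grep uses Basic Regular Expressions (BRE) by default, like sed and vi. The single quotes only stop the shell from touching the pattern; grep still interprets it. In BRE, + and | are ordinary characters, and GNU grep gives them their special meaning only when they are escaped with a backslash, which is tedious.
You probably want Extended Regular Expressions, so use egrep or grep -E (depending on the version in use). Check man grep.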
See also the GNU documentation for a full list of the characters involved.
Most programming languages these days use a syntax closer to Extended Regular Expressions (EREs), which is much easier to use. Basic Regular Expressions (BREs) are really a throwback.
It comes down to the regular expression engine that grep uses by default. If you select a different one, for example PCRE with -P, it works:
$ echo "123" | grep '[0-9]+'
$ echo "123" | grep -P '[0-9]+'
123
$ echo "123" | grep '3|4'
$ echo "123" | grep -P '3|4'
123
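The same comparison works with -E, which does not depend on PCRE support being compiled in. A minimal sketch, with made-up variable names, that counts matching lines for each pattern:

```shell
#!/bin/bash
# grep -c prints the number of matching lines; "|| true" keeps a zero count
# (exit status 1) from aborting scripts that run under "set -e".

# BRE: "+" is literal, so this looks for a digit followed by a plus sign.
bre_literal_plus=$(echo "123" | grep -c '[0-9]+' || true)

# BRE, portable spelling of "one or more digits".
bre_portable=$(echo "123" | grep -c '[0-9][0-9]*' || true)

# ERE: "+" and "|" are operators without any backslash.
ere_plus=$(echo "123" | grep -Ec '[0-9]+' || true)
ere_alt=$(echo "123" | grep -Ec '3|4' || true)

printf '%s %s %s %s\n' "$bre_literal_plus" "$bre_portable" "$ere_plus" "$ere_alt"
```

Only the first pattern fails to match, because it asks for a literal plus sign that the input does not contain.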
