Exclude a string using grep -v - bash

I have a requirement to exclude a selected string. I am trying to use grep -v option.
Input:
AM2RGHK
AM2RGHK-JO
AM2RGHK-FN
Output should be:
AM2RGHK-JO
AM2RGHK-FN
From the input list, if I want to exclude only the first line, I use grep -v AM2RGHK
But I do not get any output. grep -v excludes every line that contains the string. Any clue?

grep matches all the input lines because its default behavior is to match any line that contains the given pattern. The line doesn't have to be exactly equal to it.
You can tell grep that it has to find an exact match by using the option -x (--line-regexp).
grep -v -x AM2RGHK does what you want.
Side notes:
Since you don't seem to use an actual regex but you just need simple text match, you may consider the option -F (--fixed-strings). It tells grep to not give special meaning to any character in the pattern.
Moreover, it's always good practice to wrap shell strings in single quotes (''). This ensures that the shell doesn't try to interpret any characters, such as whitespace. It can spare you a lot of headaches.
The resulting command would be:
grep -vxF 'AM2RGHK'
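As a minimal sketch of the difference, the following recreates the three input lines in a temporary file and compares the substring match with the whole-line match. The variable names are only illustrative.

```shell
# Recreate the question's input in a temporary file.
input=$(mktemp)
printf '%s\n' 'AM2RGHK' 'AM2RGHK-JO' 'AM2RGHK-FN' > "$input"

# Substring match: every line contains AM2RGHK, so -v removes them all.
# grep exits with status 1 when it selects nothing, hence the "|| true".
substring_result=$(grep -v 'AM2RGHK' "$input" || true)

# Whole-line, fixed-string match: only the line equal to AM2RGHK is removed.
exact_result=$(grep -vxF 'AM2RGHK' "$input")

printf 'substring match leaves: [%s]\n' "$substring_result"
printf 'exact match leaves:\n%s\n' "$exact_result"
rm -f "$input"
```

The first result is empty, which is the behavior described in the question; the second keeps the two suffixed lines.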

grep -v '^AM2RGHK$' input.txt
input.txt:
AM2RGHK
AM2RGHK-JO
AM2RGHK-FN
standard output:
AM2RGHK-JO
AM2RGHK-FN
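One caveat with the anchored form: the string is still treated as a regular expression, which matters once it contains metacharacters. A small sketch, using a made-up value A.2 that is not from the question:

```shell
input=$(mktemp)
printf '%s\n' 'A.2' 'AX2' 'A.2-JO' > "$input"

# As a regex, the dot matches any character, so AX2 is excluded too.
anchored_result=$(grep -v '^A.2$' "$input")

# With -F the dot is literal, and -x still requires a whole-line match.
fixed_result=$(grep -vxF 'A.2' "$input")

printf 'anchored regex leaves:\n%s\n' "$anchored_result"
printf 'fixed string leaves:\n%s\n' "$fixed_result"
rm -f "$input"
```

For a plain alphanumeric string such as AM2RGHK both forms behave the same.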

Think about what you have control over. In this case you can compare each line against the string in question, but instead of comparing just the string, add a pad character on each end and compare the padded values.
#!/bin/bash
read_file="/tmp/list.txt"
exclude="AM2RGHK"
while IFS= read -r line ;do
if ! [[ "Z${line}Z" = "Z${exclude}Z" ]] ;then
echo "$line"
fi
done < "${read_file}"
In this case we are saying: if ZAM2RGHKZ differs from the padded current line, print the line. The comparisons are as follows:
ZAM2RGHKZ = ZAM2RGHKZ do nothing
ZAM2RGHKZ != ZAM2RGHK-JOZ print because they don't match
ZAM2RGHKZ != ZAM2RGHK-FNZ print because they don't match
Hence the output becomes:
AM2RGHK-JO
AM2RGHK-FN
Note: there are more succinct ways to do this but this is a good way as well.
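One of the more succinct ways might be an awk one-liner that does the same whole-line comparison. This is only a sketch: the temporary file stands in for /tmp/list.txt, and the awk variable name ex is arbitrary.

```shell
input=$(mktemp)
printf '%s\n' 'AM2RGHK' 'AM2RGHK-JO' 'AM2RGHK-FN' > "$input"
exclude='AM2RGHK'

# Print every line that is not exactly equal to the excluded string.
awk_result=$(awk -v ex="$exclude" '$0 != ex' "$input")

printf '%s\n' "$awk_result"
rm -f "$input"
```

Note that awk -v interprets backslash escapes in the assigned value, so this fits strings like the one above better than arbitrary input.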

grep has the option -x
-x, --line-regexp
Select only those matches that exactly match the whole line. (-x is specified by POSIX.)
for example:
[dachnik#test]$ cat > greptest
AM2RGHK
AM2RGHK-JO
AM2RGHK-FN[dachnik#test]$ cat greptest
AM2RGHK
AM2RGHK-JO
AM2RGHK-FN[dachnik#test]$ grep -v -x AM2RGHK greptest
AM2RGHK-JO
AM2RGHK-FN

Related

grep for exact word in a file containing "."

I have a file named "TestGrep" that contains content as shown below
#!/bin/bash
/ParentFolder/a #email1.com
/ParentFolder/b #email2.com
/ParentFolder/.a #email1.com
/ParentFolder/.b #email2.com
/ParentFolder/ #email3.com
I am using the below grep command
grep -Fw "/ParentFolder/" TestGrep
The output is
/ParentFolder/.a #email1.com
/ParentFolder/.b #email2.com
/ParentFolder/ #email3.com
It is somehow ignoring the dots in the TestGrep file.
I want the output to be shown as below
/ParentFolder/ #email3.com
How can I query using grep command that would just check if the exact string match is done and return output as expected.
Could you please try the following? It uses grep's -E option.
grep -E '/ParentFolder/\s+' Input_file
From man grep about -E option of grep:
-E, --extended-regexp
Interpret PATTERN as an extended regular expression
\s+ matches one or more whitespace characters (\s is a GNU extension).
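To see why -w did not help, it may be useful to run both commands against a copy of the file. The sketch below uses the POSIX class [[:space:]] in place of \s, which is a substitution on my part, and a temporary file in place of TestGrep.

```shell
input=$(mktemp)
cat > "$input" <<'EOF'
#!/bin/bash
/ParentFolder/a #email1.com
/ParentFolder/b #email2.com
/ParentFolder/.a #email1.com
/ParentFolder/.b #email2.com
/ParentFolder/ #email3.com
EOF

# -w only requires a non-word character (or a line edge) around the match.
# "." is a non-word character, so the ".a" and ".b" lines match as well.
word_result=$(grep -Fw '/ParentFolder/' "$input")

# Requiring whitespace right after the path keeps only the bare directory.
space_result=$(grep -E '/ParentFolder/[[:space:]]' "$input")

printf 'with -Fw:\n%s\n' "$word_result"
printf 'with trailing whitespace required:\n%s\n' "$space_result"
rm -f "$input"
```

So the dots are not being ignored; they simply count as a word boundary.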

Grep with a variable

I would like to grep a number of documents by using a set of search terms and to specify the number of characters after match. Here is what I tried
grep -F -o -P "$(<search.txt).{0,4}" foo.txt
but I get the message 'grep: conflicting matchers specified' because -F and '-oP' cannot be combined. It does not work with '-E' either.
-F and -P are conflicting options, simple as that. The first means that the patterns are fixed strings, the second means that the patterns are Perl-compatible regular expressions. Perhaps you meant to use -f instead, which reads patterns from a file or a process substitution.
If you want to match any of the patterns in your file, followed by 4 characters, you could use something like this
grep -oP -f <(awk '{print $0 ".{4}"}' search.txt) file
This dynamically adds the pattern to each line in the file.
Alternatively, a more portable and concise version would be this:
sed 's/$/.{0,4}/' search.txt | grep -f - -oP file
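Here is a small self-contained sketch of the same idea. It makes two assumptions that differ from the commands above: it uses -E instead of -P (an extended regex is enough for the {0,4} interval), and it writes the generated patterns to a temporary file rather than piping them. The sample terms and text are made up.

```shell
terms=$(mktemp)
patterns=$(mktemp)
data=$(mktemp)
printf '%s\n' 'foo' 'bar' > "$terms"
printf '%s\n' 'xxfoo12345 yy' 'bar9' > "$data"

# Append ".{0,4}" to each search term: up to four further characters.
sed 's/$/.{0,4}/' "$terms" > "$patterns"

# -o prints only the matched text, -f reads one pattern per line.
context_result=$(grep -oE -f "$patterns" "$data")

printf '%s\n' "$context_result"
rm -f "$terms" "$patterns" "$data"
```

The search terms are treated as regular expressions here, so any regex metacharacters in search.txt would need escaping first.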

How to Read a file word by word and use those words to grep in bash shell?

I want to read a file word by word and I want to use each word in that text file as an input to grep.
To read the file word by word I have used the following code
for word in $(<filename)
do
echo "$word"
done
now when I replaced
echo "$word"
with
grep -i "$word"
I'm not getting any output.
The following will read the file word by word and apply grep using the read word as input:
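#!/bin/bash
# Read the file line by line; -r keeps backslashes literal.
while IFS= read -r line; do
# Unquoted on purpose: split the line into words.
for word in $line; do
# Each word is used as the name of a file to search.
grep -i "<REGULAR_EXPRESSION_HERE>" "$word"
done
done < filename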
The reason you are not getting any output is that grep expects two arguments. If you leave out the filename argument, it will wait for you to type in the text to grep from; it is reading standard input. (This is what allows you to use it in a pipeline, like command | grep error.)
Anyway, what you are attempting is already built into grep. Just pass it the file of search expressions as an argument to -f.
grep -irf filename .
where -r says to search recursively through all the files in a directory and . is the current directory.
Note, however, that this will search for matches anywhere on a line. If your input file contains dog then grep will find a match on lines which contain dogmatic or endogenous; and if it contains an empty line, it will match all lines in all files. Maybe look at the -w and/or -x options (as well as perhaps -F to disarm any regex specials in the input) to address these issues.
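A quick sketch of that difference, using the dog example from above with made-up temporary files:

```shell
words=$(mktemp)
data=$(mktemp)
printf '%s\n' 'dog' > "$words"
printf '%s\n' 'dog' 'dogmatic' 'endogenous' > "$data"

# Plain -f: the term matches anywhere on a line.
loose_result=$(grep -if "$words" "$data")

# -w requires whole-word matches, -F treats each term as a literal string.
strict_result=$(grep -iwFf "$words" "$data")

printf 'without -w:\n%s\n' "$loose_result"
printf 'with -wF:\n%s\n' "$strict_result"
rm -f "$words" "$data"
```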
See if this serves your purpose:
$ grep -o "\S*" filename | grep -i "<your regex here>"
The first grep in the pipeline flattens the file to one word per line. The second grep then searches those words for your regex.
Note: This answer assumes that the individual words in file are the data you want to grep in. If those are supposed to be interpreted as filenames, refer to higuaro's answer.
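Since \S is a GNU extension, a variant that flattens the file with tr may be worth sketching. The sample text and the search word apple are invented for illustration.

```shell
input=$(mktemp)
printf '%s\n' 'Apple banana' 'cherry apple pie' > "$input"

# Turn every run of whitespace into a single newline: one word per line.
words=$(tr -s '[:space:]' '\n' < "$input")

# Search the individual words; -x keeps only words that match entirely.
matches=$(printf '%s\n' "$words" | grep -ix 'apple')

printf '%s\n' "$matches"
rm -f "$input"
```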
This is what worked for me
while IFS= read -r line
do
# Test grep's exit status directly and capture its output in one step.
if output=$(grep -i "$line" /filepath/*); then
echo "$line present in file : $output"
fi
done <filename

How to grep return result as the matching term

I would like to return only the first instance (case-insensitive) of the term I used to search (if there's a match), how would I do this?
example:
$ grep "exactly-this"
Binary file /Path/To/Some/Files/file.txt matches
I would like to return the result like:
$ grep "exactly-this"
exactly-this
grep has a built-in count argument.
You can use the -m option to make grep stop after a given number of matching lines:
grep -m 1 "exactly-this"
If you want to avoid the message in the case of binary files, use
grep -a -m 1 "exactly-this"
Note that this prints the whole line in which the match occurred. Since it is a binary file, what grep treats as a line may be long and may contain non-text bytes.
What you need is the -o option of grep.
From the man page
-o, --only-matching
Prints only the matching part of the lines.
Test:
[jaypal:~/Temp] cat file
This is a file with some exactly this in the middle
with exactly this in the begining
and some at the very end in brackets (exactly this)
[jaypal:~/Temp] grep -o 'exactly this' file
exactly this
exactly this
exactly this
[jaypal:~/Temp] grep -om1 'exactly this' file
exactly this
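One detail worth checking: -m1 stops after the first matching line, but -o still prints every match on that line. If only a single instance is wanted, and case should be ignored as the question asks, something like the following sketch may be closer. The sample file content is made up.

```shell
input=$(mktemp)
printf '%s\n' 'Exactly-This and exactly-this again' 'exactly-this' > "$input"

# -m1 limits grep to one matching line; -o prints each match found on it.
line_matches=$(grep -io -m1 'exactly-this' "$input")

# head -n 1 keeps only the very first match.
first_match=$(grep -io 'exactly-this' "$input" | head -n 1)

printf 'with -m1:\n%s\n' "$line_matches"
printf 'first match only: %s\n' "$first_match"
rm -f "$input"
```

With -i the output preserves the case found in the file, not the case of the search term.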

bash grep newline

Hi, I need to extract from the file:
first
second
third
using the grep command, the following line:
second
third
What should the grep command look like?
Instead of grep, you can use pcregrep which supports multiline patterns
pcregrep -M 'second\nthird' file
-M allows the pattern to match more than one line.
Your question title, "bash grep newline", implies that you want to match on the second\nthird sequence of characters, i.e. something containing a newline within it.
Since grep works on "lines" and these two are different lines, you would not be able to match it this way.
So, I'd split it into several tasks:
you match the line that contains "second" and output the line that has matched and the subsequent line:
grep -A 1 "second" testfile
you translate every other newline into the sequence that is guaranteed not to occur in the input. I think the simplest way to do that would be using perl:
perl -npe '$x=1-$x; s/\n/##UnUsedSequence##/ if $x;'
you do a grep on these lines, this time searching for string ##UnUsedSequence##third:
grep "##UnUsedSequence##third"
you unwrap the unused sequences back into the newlines, sed might be the simplest:
sed -e 's/##UnUsedSequence##/\n/'
So the resulting pipe command to do what you want would look like:
grep -A 1 "second" testfile | perl -npe '$x=1-$x; s/\n/##UnUsedSequence##/ if $x;' | grep "##UnUsedSequence##third" | sed -e 's/##UnUsedSequence##/\n/'
Not the most elegant by far, but should work. I'm curious to know of better approaches, though - there should be some.
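One possible alternative, offered only as a sketch and using awk instead of the grep/perl/sed pipeline: remember the previous line and print the pair when "second" is immediately followed by "third". The variable name prev is arbitrary and the file is a temporary stand-in for testfile.

```shell
input=$(mktemp)
printf '%s\n' 'first' 'second' 'third' > "$input"

# Keep the previous line in prev; print both lines when the pair is adjacent.
pair_result=$(awk 'prev == "second" && $0 == "third" { print prev; print } { prev = $0 }' "$input")

printf '%s\n' "$pair_result"
rm -f "$input"
```

This compares whole lines exactly; lines with extra whitespace would need a looser test.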
I don't think grep is the way to go on this.
If you just want to strip the first line from any file (to generalize your question), I would use sed instead.
sed '1d' INPUT_FILE_NAME
This will send the contents of the file to standard output with the first line deleted.
Then you can redirect the standard output to another file to capture the results.
sed '1d' INPUT_FILE_NAME > OUTPUT_FILE_NAME
That should do it.
If you have to use grep and just don't want to display the line with first on it, then try this:
grep -v first INPUT_FILE_NAME
By passing the -v switch, you are telling grep to show you every line except those that match the expression you pass. In effect: show me everything but the line(s) with first in them.
However, the downside is that every line containing first is dropped, not only the first line of the file, which may not be the behavior that you are expecting.
To shunt the results into a new file, try this:
grep -v first INPUT_FILE_NAME > OUTPUT_FILE_NAME
Hope this helps.
I don't really understand what you want to match. I would not use grep, but one of the following:
tail -n 2 file # to get the last two lines
tail -n +2 file # to get all but the first line
sed -e '2,3p;d' file # to get lines two through three
(not sure how standard it is, it works in GNU tools for sure)
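A quick way to check line-selection commands like these is to run them against the three-line sample. The sketch below uses tail -n +2 for "everything from line two onward" and sed -n '2,3p' for an explicit range; the variable names are illustrative.

```shell
input=$(mktemp)
printf '%s\n' 'first' 'second' 'third' > "$input"

last_two=$(tail -n 2 "$input")              # last two lines
all_but_first=$(tail -n +2 "$input")        # start output at line 2
second_to_third=$(sed -n '2,3p' "$input")   # print only lines 2 through 3

printf '%s\n' "$last_two"
rm -f "$input"
```

For this particular file all three commands produce the same two lines.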
So you just don't want the line containing "first"? -v inverts the grep results.
$ echo -e "first\nsecond\nthird\n" | grep -v first
second
third
Line? Or lines?
Try
grep -E -e '(second|third)' filename
Edit: grep is line oriented. You're going to have to use Perl, sed or awk to perform the pattern match across lines.
BTW, -E tells grep that the regexp is an extended RE.
grep -A1 "second" | grep -B1 "third" works nicely, and if you have multiple matches it will even get rid of the original -- match delimiter
grep -E '(second|third)' /path/to/file
egrep -w 'second|third' /path/to/file
You could use
$ grep -1 third filename
This prints the matching line plus one line before and one line after it. Since "third" is on the last line, you get the last two lines.
I like notnoop's answer, but building on AndrewY's answer (which is better for those without pcregrep, but way too complicated), you can just do:
RESULT=`grep -A1 -s -m1 '^\s*second\s*$' file | grep -s -B1 -m1 '^\s*third\s*$'`
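A stripped-down sketch of the two-stage idea, using -x for whole-line matches instead of the \s anchors (my simplification) and a made-up four-line file:

```shell
input=$(mktemp)
printf '%s\n' 'first' 'second' 'third' 'fourth' > "$input"

# Stage 1: the line "second" plus the line after it.
# Stage 2: from that output, the line "third" plus the line before it.
pair=$(grep -A1 -x 'second' "$input" | grep -B1 -x 'third')

printf '%s\n' "$pair"
rm -f "$input"
```

If "third" does not directly follow "second", the second grep finds nothing and the result is empty. The -A and -B options are extensions found in GNU and BSD grep rather than POSIX.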
grep -v '^first' filename
Where the -v flag inverts the match.

Resources