How to grep return result as the matching term - terminal

I would like to return only the first instance (case-insensitive) of the term I searched for, if there's a match. How would I do this?
Example:
$ grep "exactly-this"
Binary file /Path/To/Some/Files/file.txt matches
I would like to return the result like:
$ grep "exactly-this"
exactly-this

grep has a built-in option for limiting the number of matches.
You can use the -m option (--max-count) to stop after a given number of matching lines:
grep -m 1 "exactly-this"
To avoid the "Binary file ... matches" message for binary files, use:
grep -a -m 1 "exactly-this"
Note that this prints the whole line in which the match occurred, not just the term. Since it is a binary file, that line may be very long and contain non-printable bytes.

What you need is the -o option of grep.
From the man page
-o, --only-matching
Prints only the matching part of the lines.
Test:
[jaypal:~/Temp] cat file
This is a file with some exactly this in the middle
with exactly this in the begining
and some at the very end in brackets (exactly this)
[jaypal:~/Temp] grep -o 'exactly this' file
exactly this
exactly this
exactly this
[jaypal:~/Temp] grep -om1 'exactly this' file
exactly this
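Since the question asks for a case-insensitive search, -i can be combined with -o and -m. The following is a minimal sketch, assuming GNU or BSD grep (neither -o nor -m is in POSIX) and a hypothetical sample file created only for the demonstration. Because -m counts matching lines rather than matches, head -n 1 is added to keep a single result when the first matching line contains the term more than once.

```shell
#!/bin/sh
# Hypothetical sample file, created only for this demonstration.
sample=$(mktemp)
printf '%s\n' 'no match on this line' \
              'Exactly-This and exactly-this share a line' \
              'exactly-this again' > "$sample"

# -i: ignore case, -o: print only the matched text, -m 1: stop after the
# first matching line. head -n 1 keeps only the first match on that line.
first_match=$(grep -i -o -m 1 'exactly-this' "$sample" | head -n 1)
printf '%s\n' "$first_match"

rm -f "$sample"
```

Note that with -i the output is the text as it appears in the file (here with capital letters), not the search term as typed.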

Related

Exclude a string using grep -v

I have a requirement to exclude a selected string. I am trying to use grep -v option.
Input:
AM2RGHK
AM2RGHK-JO
AM2RGHK-FN
Output should be:
AM2RGHK-JO
AM2RGHK-FN
To exclude only the first line of the input list, I am using grep -v AM2RGHK
But I am not getting any output. grep -v excludes all the strings in the same sequence. Any clue?
grep is matching all the input lines because its default behavior is to match any line that contains the given pattern. The line doesn't have to be exactly equal to it.
You can tell grep that it has to find an exact match by using the option -x (--line-regexp).
grep -v -x AM2RGHK does what you want.
Side notes:
Since you don't seem to use an actual regex but you just need simple text match, you may consider the option -F (--fixed-strings). It tells grep to not give special meaning to any character in the pattern.
Moreover, it is good practice to enclose shell strings in single quotes (''). This ensures that the shell doesn't try to interpret any characters in them, such as whitespace. It can spare you a lot of headaches.
The resulting command would be:
grep -vxF 'AM2RGHK'
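To illustrate why -F can matter, here is a small sketch with a hypothetical list in which the entry to exclude contains a dot. The file contents and variable names are assumptions made up for the demonstration; they are not from the question.

```shell
#!/bin/sh
# Hypothetical input: the entry to exclude contains a dot, which is a
# regex metacharacter.
list=$(mktemp)
printf '%s\n' 'AM2.RGHK' 'AM2XRGHK' 'AM2.RGHK-JO' > "$list"

# Without -F the dot matches any character, so AM2XRGHK is dropped too.
regex_result=$(grep -vx 'AM2.RGHK' "$list")

# With -F the pattern is a literal string; only the exact line is dropped.
fixed_result=$(grep -vxF 'AM2.RGHK' "$list")

printf 'regex:\n%s\n\nfixed:\n%s\n' "$regex_result" "$fixed_result"
rm -f "$list"
```

In this sketch the regex form keeps only AM2.RGHK-JO, while the fixed-string form keeps both AM2XRGHK and AM2.RGHK-JO.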
grep -v '^AM2RGHK$' input.txt
input.txt:
AM2RGHK
AM2RGHK-JO
AM2RGHK-FN
standard output:
AM2RGHK-JO
AM2RGHK-FN
Think about what you have control over. In this case you can compare each line against the string in question, but instead of comparing the bare strings, add a pad character on each end and compare the padded versions.
#!/bin/bash
read_file="/tmp/list.txt"
exclude="AM2RGHK"
while IFS= read -r line ;do
if ! [[ "Z${line}Z" = "Z${exclude}Z" ]] ;then
echo "$line"
fi
done < "${read_file}"
In this case we are saying: if ZAM2RGHKZ is not equal to the current padded line, print the line. The comparisons are as follows:
ZAM2RGHKZ = ZAM2RGHKZ do nothing
ZAM2RGHKZ != ZAM2RGHK-JOZ print because they don't match
ZAM2RGHKZ != ZAM2RGHK-FNZ print because they don't match
Hence the output becomes:
AM2RGHK-JO
AM2RGHK-FN
Note: there are more succinct ways to do this but this is a good way as well.
grep has the option -x
-x, --line-regexp
Select only those matches that exactly match the whole line. (-x is specified by POSIX.)
for example:
[dachnik#test]$ cat > greptest
AM2RGHK
AM2RGHK-JO
AM2RGHK-FN[dachnik#test]$ cat greptest
AM2RGHK
AM2RGHK-JO
AM2RGHK-FN[dachnik#test]$ grep -v -x AM2RGHK greptest
AM2RGHK-JO
AM2RGHK-FN

How to grep and match the first occurrence of a line?

Given the following content:
title="Bar=1; Fizz=2; Foo_Bar=3;"
I'd like to match the first occurrence of the Bar value, which is 1. I also don't want to rely on the surroundings of the word (like the double quote in front), because the pattern could be in the middle of the line.
Here is my attempt:
$ grep -o -m1 'Bar=[ ./0-9a-zA-Z_-]\+' input.txt
Bar=1
Bar=3
I've used -m/--max-count, which is supposed to stop reading the file after num matches, but it didn't work. Why doesn't this option work as expected?
I could combine it with head -n1, but I'm wondering whether it is possible to achieve this with grep alone.
grep is line-oriented, so it apparently counts matches in terms of lines when using -m[1]
- even if multiple matches are found on the line (and are output individually with -o).
While I wouldn't know how to solve the problem with grep alone (except with GNU grep's -P option - see anubhava's helpful answer), awk can do it (in a portable manner):
$ awk -F'Bar=|;' '{ print $2 }' <<<"Bar=1; Fizz=2; Foo_Bar=3;"
1
Use print "Bar=" $2, if the field name should be included.
Also note that the <<< method of providing input via stdin (a so-called here-string) is specific to Bash, Ksh, Zsh; if POSIX compliance is a must, use echo "..." | grep ... instead.
[1] Options -m and -o are not part of the grep POSIX spec., but both GNU and BSD/OSX grep support them and have chosen to implement the line-based logic.
This is consistent with the standard -c option, which counts "selected lines", i.e., the number of matching lines:
grep -o -c 'Bar=[ ./0-9a-zA-Z_-]\+' <<<"Bar=1; Fizz=2; Foo_Bar=3;" yields 1.
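For the case where Bar= sits in the middle of the line, as in the question's title="..." input, one portable alternative is awk's match() function, which reports the leftmost match on a line. This is only a sketch: the pattern is simplified to digits (an assumption, the question's character class is wider), and the variable names are made up for the example.

```shell
#!/bin/sh
# The line from the question; the quote before Bar shows that the match
# does not have to start the line.
line='title="Bar=1; Fizz=2; Foo_Bar=3;"'

# match() sets RSTART and RLENGTH for the leftmost match on the line;
# exit stops after the first line that contains one.
first=$(printf '%s\n' "$line" |
  awk 'match($0, /Bar=[0-9]+/) { print substr($0, RSTART, RLENGTH); exit }')
printf '%s\n' "$first"
```

Because the regex engine reports the leftmost match, the Bar=3 inside Foo_Bar=3 is never reached.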
Using perl based regex flavor in gnu grep you can use:
grep -oP '^(.(?!Bar=\d+))*Bar=\d+' <<< "Bar=1; Fizz=2; Foo_Bar=3;"
Bar=1
(.(?!Bar=\d+))* matches zero or more characters, none of which is immediately followed by Bar=\d+. Anchored with ^, the pattern picks up the Bar=\d+ at the start of the line, which is the first one in this input.
If intent is to just print the value after = then use:
grep -oP '^(.(?!Bar=\d+))*Bar=\K\d+' <<< "Bar=1; Fizz=2; Foo_Bar=3;"
1
You can use grep -P (assuming you are on gnu grep) and positive look ahead ((?=.*Bar)) to achieve that in grep:
echo "Bar=1; Fizz=2; Foo_Bar=3;" | grep -oP -m 1 'Bar=[ ./0-9a-zA-Z_-]+(?=.*Bar)'
First use a grep to make the line start with Bar, and then get the Bar at the start of the line:
grep -o "Bar=.*" input.txt | grep -o -m1 "^Bar=[ ./0-9a-zA-Z_-]\+"
When you have a large file, you can optimize with
grep -o -m1 "Bar=.*" input.txt | grep -o -m1 "^Bar=[ ./0-9a-zA-Z_-]\+"

one command line grep and word count recursively

I can do the following using a for loop
for f in *.txt; do grep 'RINEX' $f |wc -l; done
Is there a way to get a per-file report with a one-liner?
That is, I want to grep and count one file at a time, in a similar fashion to
grep 'RINEX' *.txt
UPDATE:
grep -c 'RINEX' *.txt
returns the name of each file and its corresponding number of occurrences. Thx #Evert
grep is not the right tool for this task.
grep matches line by line. For example, grep 'o' <<< "fooo" returns 1 line, even though that line contains 3 o's.
This one-liner should do what you want:
awk -F'RINEX' 'FILENAME!=f{if(f)print f,s;f=FILENAME;s=0}
NF{s+=(NF-1)}  # NF is 0 on empty lines, so skip them
END{print f,s}' /path/*.txt
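If counting occurrences rather than lines is the goal, another option is to let grep -o print each match on its own line and count those lines per file. The sketch below assumes a grep that supports -o (GNU and BSD do; it is not in POSIX) and uses hypothetical sample files in a scratch directory.

```shell
#!/bin/sh
# Hypothetical sample files in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' 'RINEX one RINEX two' 'nothing here' > "$dir/a.txt"
printf '%s\n' 'RINEX' '' 'RINEX RINEX RINEX' > "$dir/b.txt"

report=$(
  for f in "$dir"/*.txt; do
    # grep -o prints each match on its own line, so wc -l counts matches.
    # tr strips the padding that some wc implementations add.
    n=$(grep -o 'RINEX' "$f" | wc -l | tr -d ' ')
    printf '%s %s\n' "${f##*/}" "$n"
  done
)
printf '%s\n' "$report"
rm -rf "$dir"
```

Here a.txt is reported with 2 occurrences and b.txt with 4, whereas grep -c would report 1 and 2 matching lines.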

grep like command to find matching lines plus neighbourhood lines

The grep command is really powerful and I use it a lot.
Sometimes I need to search through many files for a string I barely remember, helping myself with the -i (ignore case), -r (recursive) and -v (invert match) options.
But what I really need is a special output from grep which highlights the matching line(s) plus the neighbouring lines (given the matching line, I'd like to see, say, the 2 preceding and the 2 following lines).
Is there a way to get this result using bash?
Grep itself will do this
grep -A 2 -B 2 foo myfile.txt
Most grep implementations accept a --context option, which makes this a bit more readable:
grep --context=3 foo myfile.txt
You can also omit -C and pass the number directly:
grep -2 foo myfile.txt
is equivalent to
grep -C 2 foo myfile.txt
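The equivalence of the context options can be checked with a short sketch. It assumes a grep that supports -A, -B and -C (GNU and BSD grep do; they are not in POSIX) and a hypothetical seven-line file.

```shell
#!/bin/sh
# Hypothetical seven-line file with the search term on line 4.
f=$(mktemp)
printf '%s\n' one two three foo five six seven > "$f"

# -C 2 is shorthand for -B 2 -A 2: two lines before and two after.
with_c=$(grep -C 2 foo "$f")
with_ab=$(grep -A 2 -B 2 foo "$f")

printf '%s\n' "$with_c"
rm -f "$f"
```

Both commands print the lines two, three, foo, five and six.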

bash grep newline

Hi, I need to extract from the file:
first
second
third
using the grep command, the following lines:
second
third
What should the grep command look like?
Instead of grep, you can use pcregrep which supports multiline patterns
pcregrep -M 'second\nthird' file
-M allows the pattern to match more than one line.
Your question title, "bash grep newline", implies that you want to match the second\nthird sequence of characters, i.e. something containing a newline.
Since grep works on lines and these are two different lines, you cannot match it this way.
So, I'd split it into several tasks:
you match the line that contains "second" and output the line that has matched and the subsequent line:
grep -A 1 "second" testfile
you translate every other newline into the sequence that is guaranteed not to occur in the input. I think the simplest way to do that would be using perl:
perl -npe '$x=1-$x; s/\n/##UnUsedSequence##/ if $x;'
you do a grep on these lines, this time searching for string ##UnUsedSequence##third:
grep "##UnUsedSequence##third"
you unwrap the unused sequences back into the newlines, sed might be the simplest:
sed -e 's/##UnUsedSequence##/\n/'
So the resulting pipe command to do what you want would look like:
grep -A 1 "second" testfile | perl -npe '$x=1-$x; s/\n/##UnUsedSequence##/ if $x;' | grep "##UnUsedSequence##third" | sed -e 's/##UnUsedSequence##/\n/'
Not the most elegant by far, but should work. I'm curious to know of better approaches, though - there should be some.
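One possibly simpler approach to the same task is to let awk remember the previous line and print the pair when "second" is directly followed by "third". This is a sketch under the assumption that whole-line matches are wanted; the temporary file stands in for the question's input.

```shell
#!/bin/sh
# Hypothetical input file matching the question.
f=$(mktemp)
printf '%s\n' first second third > "$f"

# Remember the previous line; when the current line is "third" and the
# previous one was "second", print both.
pair=$(awk 'prev == "second" && $0 == "third" { print prev; print }
            { prev = $0 }' "$f")
printf '%s\n' "$pair"
rm -f "$f"
```

This uses only POSIX awk features, so it does not depend on pcregrep or on GNU extensions.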
I don't think grep is the way to go on this.
If you just want to strip the first line from any file (to generalize your question), I would use sed instead.
sed '1d' INPUT_FILE_NAME
This will send the contents of the file to standard output with the first line deleted.
Then you can redirect the standard output to another file to capture the results.
sed '1d' INPUT_FILE_NAME > OUTPUT_FILE_NAME
That should do it.
If you have to use grep and just don't want to display the line with first on it, then try this:
grep -v first INPUT_FILE_NAME
By passing the -v switch, you are telling grep to show you everything but the expression that you are passing. In effect show me everything but the line(s) with first in them.
However, the downside is that a file with multiple first's in it will not show those other lines either and may not be the behavior that you are expecting.
To shunt the results into a new file, try this:
grep -v first INPUT_FILE_NAME > OUTPUT_FILE_NAME
Hope this helps.
I don't really understand what you want to match. I would not use grep, but one of the following:
tail -n 2 file # to get the last two lines
tail -n +2 file # to get all but the first line
sed -e '2,3p;d' file # to get the second through third lines
(I'm not sure how standard these are, but they work with the GNU tools for sure.)
So you just don't want the line containing "first"? -v inverts the grep results.
$ echo -e "first\nsecond\nthird\n" | grep -v first
second
third
Line? Or lines?
Try
grep -E -e '(second|third)' filename
Edit: grep is line-oriented. You're going to have to use Perl, sed or awk to perform the pattern match across lines.
BTW, -E tells grep that the pattern is an extended regular expression.
grep -A1 "second" | grep -B1 "third" works nicely, and if you have multiple matches it will even get rid of the original -- match delimiter
grep -E '(second|third)' /path/to/file
egrep -w 'second|third' /path/to/file
you could use
$ grep -1 third filename
This prints the matching line plus one line before and one line after it. Since "third" is on the last line, you get the last two lines.
I like notnoop's answer, but building on AndrewY's answer (which is better for those without pcregrep, but way too complicated), you can just do:
RESULT=`grep -A1 -s -m1 '^\s*second\s*$' file | grep -s -B1 -m1 '^\s*third\s*$'`
grep -v '^first' filename
Where the -v flag inverts the match.

Resources