GREP: is there a way to use grep inserting a text between filename and the pattern? - shell

I have grep --color -EH "^([^,]*\,){3}5" try.csv
and the output it does is this:
try.csv:410,30151010,K,5001,,,,,,,,,,,,,,,,,,,,,,,,,,,,,512
try.csv:652,20151010,K,5001,,,,,,,,,,,,,,,,,,,,,,,,,,,,,41
try.csv:109,30151010,R,5005,,,,,,,,,,,,,,,,,,,,,,,,,,,,,455
I tried grep --color -EH "^([^,]*,){3}5" try.csv | perl -ne 'print ",$_"'
but the output looks like this :
,try.csv:410,30151010,K,5001,,,,,,,,,,,,,,,,,,,,,,,,,,,,,512
,try.csv:652,20151010,K,5001,,,,,,,,,,,,,,,,,,,,,,,,,,,,,41
,try.csv:109,30151010,R,5005,,,,,,,,,,,,,,,,,,,,,,,,,,,,,455
Expected output:
try.csv:,410,30151010,K,5001,,,,,,,,,,,,,,,,,,,,,,,,,,,,,512
try.csv:,652,20151010,K,5001,,,,,,,,,,,,,,,,,,,,,,,,,,,,,41
try.csv:,109,30151010,R,5005,,,,,,,,,,,,,,,,,,,,,,,,,,,,,455
I am very new to Perl and shell. I'm searching in the CSV files.

You may insert a comma by using sed,
$ grep --color -EH "^([^,]*\,){3}5" try.csv | sed 's/:/&,/'
s/:/&,/: the special character & in the replacement refers to the portion of the string that matched (here, the first colon). Adding a comma after & therefore inserts a comma right after that colon, which is what you asked for.
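As a minimal sketch of the sed approach above, the following builds a throwaway try.csv (the two rows are made-up sample data, not the asker's file) and runs the same pipeline. It assumes a grep that supports -H, as GNU and BSD grep do.

```shell
# Hypothetical sample data modelled on the question's try.csv.
tmpdir=$(mktemp -d)
printf '%s\n' '410,30151010,K,5001,,512' '411,30151010,K,6001,,7' > "$tmpdir/try.csv"

# Run from inside the directory so grep -H prints the bare file name,
# then let sed re-emit the first ":" (via &) followed by a comma.
out=$(cd "$tmpdir" && grep -EH '^([^,]*,){3}5' try.csv | sed 's/:/&,/')
printf '%s\n' "$out"
# prints: try.csv:,410,30151010,K,5001,,512

rm -rf "$tmpdir"
```

Because sed replaces only the first colon on each line, this relies on the file name itself containing no colon.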

Related

How do I grep only the word I gave as a variable?

I want to grep a word in my file, but match only the part I gave to grep.
Example: my file contains "hell\nhell:o", and I want to grep hell but not hello.
How can I do that?
Give this a try:
grep -E "\bhell(\s|$)" file
Add -o if you just want the matched word:
kent$ echo "hell\nhell:o"|grep -oE "\bhell(\s|$)"
hell
or:
grep -oP '\bhell(?=\s|$)'
Using the -o option of grep :
grep -o hell your_file
From grep manual page :
-o, --only-matching
Print only the matched (non-empty) parts of a matching line, with each such part on a separate output line.
Your question (and comments) aren't overly clear but you could try either:
"negative lookahead"
using the "end of line anchor"
Negative Lookahead
The following regex will match any hell that isn't followed by a : (adapt as suitable):
hell(?!:)
Plain grep (basic or extended regular expressions) doesn't support lookaheads, so you'll need perl (or GNU grep's -P mode) instead:
echo -e "hell\nhell:o" \
| perl -ne 'print if /hell(?!:)/'
End of Line Anchor
The following will work with grep, matching only where the hell touches the end of the line ($):
hell$
echo -e "hell\nhell:o" \
| grep 'hell$'
Try to use -oP
echo "hell hello" | grep -oP '\bhell\b'
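To make the difference between these answers concrete, here is a small sketch comparing a word-boundary match with a stricter, whitespace-delimited match. The three input lines are hypothetical test data. The -w option is assumed to be available (it is in GNU and BSD grep), and the bracket expressions are standard POSIX classes.

```shell
# Hypothetical input: a bare word, the "hell:o" case from the question, and "hello".
input=$(printf 'hell\nhell:o\nhello\n')

# -w treats ":" as a non-word character, so "hell:o" still matches.
with_w=$(printf '%s\n' "$input" | grep -w 'hell')

# Requiring whitespace or a line edge on both sides rejects "hell:o" and "hello".
strict=$(printf '%s\n' "$input" | grep -E '(^|[[:space:]])hell([[:space:]]|$)')

printf 'with -w:\n%s\n' "$with_w"
printf 'strict:\n%s\n' "$strict"
```

The first command prints both hell and hell:o, while the second prints only hell, so the stricter pattern is the one to use when punctuation should not count as a word boundary.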

Grep multiple strings from text file

Okay so I have a textfile containing multiple strings, example of this -
Hello123
Halo123
Gracias
Thank you
...
I want grep to use these strings to find lines with matching strings/keywords from other files within a directory
example of text files being grepped -
123-example-Halo123
321-example-Gracias-com-no
321-example-match
so in this instance the output should be
123-example-Halo123
321-example-Gracias-com-no
With GNU grep:
grep -f file1 file2
-f FILE: Obtain patterns from FILE, one per line.
Output:
123-example-Halo123
321-example-Gracias-com-no
You should probably look at the manpage for grep to get a better understanding of which options the grep utility supports. However, there are a number of ways to achieve what you're trying to accomplish. Here's one approach:
grep -e "Hello123" -e "Halo123" -e "Gracias" -e "Thank you" list_of_files_to_search
However, since your search strings are already in a separate file, you would probably want to use this approach:
grep -f patternFile list_of_files_to_search
I can think of two possible solutions for your question:
Use multiple regular expressions - a regular expression for each word you want to find, for example:
grep -e Hello123 -e Halo123 file_to_search.txt
Use a single regular expression with an "or" operator. Using Perl regular expressions, it will look like the following:
grep -P "Hello123|Halo123" file_to_search.txt
EDIT:
As you mentioned in your comment, you want to use a list of words to find from a file and search in a full directory.
You can manipulate the words-to-find file to look like -e flags concatenation:
cat words_to_find.txt | sed 's/^/-e "/;s/$/"/' | tr '\n' ' '
This will return something like -e "Hello123" -e "Halo123" -e "Gracias" -e "Thank you", which you can then pass to grep using xargs:
cat words_to_find.txt | sed 's/^/-e "/;s/$/"/' | tr '\n' ' ' | xargs sh -c 'grep "$@" dir_to_search/*' sh
As you can see, the last command also searches in all of the files in the directory.
SECOND EDIT: as PesaThe mentioned, the following command does this in a much simpler and more elegant way:
grep -f words_to_find.txt dir_to_search/*
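A self-contained sketch of the -f approach, using the sample strings from the question. The temporary directory and the file name sample.txt are assumptions made for illustration. -F is added so the words are treated as literal strings rather than regular expressions, and -h (supported by GNU and BSD grep) suppresses file names in the output.

```shell
# Recreate the question's data in a scratch directory (paths are hypothetical).
tmpdir=$(mktemp -d)
mkdir "$tmpdir/dir_to_search"
printf '%s\n' 'Hello123' 'Halo123' 'Gracias' 'Thank you' > "$tmpdir/words_to_find.txt"
printf '%s\n' '123-example-Halo123' '321-example-Gracias-com-no' '321-example-match' \
  > "$tmpdir/dir_to_search/sample.txt"

# -F: literal strings; -f: read one pattern per line; -h: omit file names.
matches=$(grep -h -F -f "$tmpdir/words_to_find.txt" "$tmpdir"/dir_to_search/*)
printf '%s\n' "$matches"
# prints:
# 123-example-Halo123
# 321-example-Gracias-com-no

rm -rf "$tmpdir"
```

One thing to watch for: an empty line in the pattern file acts as an empty pattern, which matches every line, so it is worth making sure the word list has no blank lines.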

How to exclude hyphen as word separator in bash

I am unable to grep for exact word match containing hyphen as in
/home/imper-home,3,0,0,0,jim.imper,NONE,NONE,NONE,http://sanjose
/home/imper,15,10,3,30,jim.imper,NONE,NONE,NONE,http://sanjose-age
I tried
grep -w imper
but it returns both /home/imper-home and /home/imper.
I want only /home/imper-home to be returned when using
grep -wv /home/imper
This will work in general:
word=imper
grep -w "$word" file | grep -v "$word-"
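An alternative sketch, assuming the path is always the first comma-separated field as in the two sample lines: anchor the pattern on the start of the line and on the field delimiter instead of relying on -w. This sidesteps both the hyphen and the jim.imper field that appears in every line.

```shell
# The two sample records from the question.
data='/home/imper-home,3,0,0,0,jim.imper,NONE,NONE,NONE,http://sanjose
/home/imper,15,10,3,30,jim.imper,NONE,NONE,NONE,http://sanjose-age'

# "^" and the trailing "," pin the match to the whole first field.
only_imper=$(printf '%s\n' "$data" | grep '^/home/imper,')

# Inverting the same pattern keeps every other record, e.g. /home/imper-home.
without_imper=$(printf '%s\n' "$data" | grep -v '^/home/imper,')

printf '%s\n' "$only_imper"
printf '%s\n' "$without_imper"
```

The first command prints only the /home/imper record and the second prints only the /home/imper-home record.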

How to grep with a pattern that includes a " quotation mark?

I want to grep lines that include a " (quotation mark).
More specifically, I want to grep lines like:
#include "something.h"
then pipe into sed to just return something.h
A single grep will do this job.
grep -oP '(?<=")[^"]*(?=")' file
Example:
$ echo '#include "something.h"' | grep -oP '(?<=")[^"]*(?=")'
something.h
sed '#n
/"/ s/.*"\([^"]*\)" *$/\1/p' YourFile
There is no need for grep when you are already using sed (unless you want the extra speed on a huge file): sed can filter the lines and rewrite their content in one pass.
In your case, the /"/ address can be replaced by /#include *"/ to select only include lines.
If a line contains several quoted strings:
sed '#n
/"/ {s/"[^"]*$/"/;s/[^"]*"\([^"]*\)" */\1/gp;}' YourFile
You can use awk to get included filename:
awk -F'"' '{print $2}' file.c
something.h
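For comparison, a portable sketch that does the filtering and the extraction in a single sed substitution. The three-line source snippet is made-up sample input. It assumes each include line holds exactly one quoted header name.

```shell
# Hypothetical C source with one quoted include and one angle-bracket include.
src='#include <stdio.h>
#include "something.h"
int main(void) { return 0; }'

# -n suppresses default output; the p flag prints only lines where the
# substitution succeeded, i.e. lines of the form #include "name".
header=$(printf '%s\n' "$src" | sed -n 's/^#include *"\([^"]*\)".*/\1/p')
printf '%s\n' "$header"
# prints: something.h
```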

Text Manipulation using sed or AWK

I get the following result in my script when I run it against my services. The result differs depending on the service but the text pattern showing below is similar. The result of my script is assigned to var1. I need to extract data from this variable
$var1=HOST1*prod*gem.dot*serviceList : svc1 HOST1*prod*kem.dot*serviceList : svc3, svc4 HOST1*prod*fen.dot*serviceList : svc5, svc6
I need to strip the name of the service list from $var1. So the end result should be printed on separate line as follow:
svc1
svc2
svc3
svc4
svc5
svc6
Can you please help with this?
Regards
Using sed and grep:
sed 's/[^ ]* :\|,\|//g' <<< "$var1" | grep -o '[^ ]*'
sed deletes each run of non-space characters that is followed by " :" (the serviceList prefix) as well as every comma. grep -o then prints each remaining non-space token, i.e. each service, on its own line.
Using gnu grep and gnu sed:
grep -oP ': *\K\w+(, \w+)?' <<< "$var1" | sed 's/, /\n/'
svc1
svc3
svc4
svc5
svc6
grep is the perfect tool for the job.
From man grep:
-o, --only-matching
Print only the matched (non-empty) parts of a matching line, with each such part on a separate output line.
Sounds perfect!
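As far as I'm aware this will work on any grep that supports -o (note that \+ is a GNU extension to basic regular expressions; [0-9][0-9]* is the portable spelling):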
echo "$var1" | grep -o 'svc[0-9]\+'
Matches "svc" followed by one or more digits. You can also enable the "highly experimental" Perl regexp mode with -P, which means you can use the \d digit character class and don't have to escape the + any more:
grep -Po 'svc\d+' <<<"$var1"
In bash you can use <<< (a Here String) which supplies "$var1" to grep on the standard input.
By the way, if your data was originally on separate lines, like:
HOST1*prod*gem.dot*serviceList : svc1
HOST1*prod*kem.dot*serviceList : svc3, svc4
HOST1*prod*fen.dot*serviceList : svc5, svc6
This would be a good job for awk:
awk -F': ' '{n = split($2, a, ", "); for (i = 1; i <= n; i++) print a[i]}'
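Both variants can be sketched end to end with the data from the question. The variable names flat and ordered are introduced here for illustration. The grep line assumes a grep with -o (GNU or BSD), and the awk line uses an index loop because the order of a for (i in a) loop is unspecified.

```shell
# Single-line form, as assigned to var1 in the question.
var1='HOST1*prod*gem.dot*serviceList : svc1 HOST1*prod*kem.dot*serviceList : svc3, svc4 HOST1*prod*fen.dot*serviceList : svc5, svc6'
flat=$(printf '%s\n' "$var1" | grep -o 'svc[0-9][0-9]*')

# Multi-line form: one record per line, services after ": ".
data='HOST1*prod*gem.dot*serviceList : svc1
HOST1*prod*kem.dot*serviceList : svc3, svc4
HOST1*prod*fen.dot*serviceList : svc5, svc6'
ordered=$(printf '%s\n' "$data" |
  awk -F': ' '{ n = split($2, a, ", "); for (i = 1; i <= n; i++) print a[i] }')

printf '%s\n' "$flat"
printf '%s\n' "$ordered"
```

Each command prints svc1, svc3, svc4, svc5 and svc6 on separate lines, in input order.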
