shell script to read contents from one file and grep on another file - shell

I am working in a shell and want to write a one-liner that reads the contents of file A and runs a grep command against file B.
For example, suppose there are two files.
dataFile.log has the following values:
abc
xyz
... and so on
Now I want to read abc and grep for it in searchFile.log, like grep abc searchFile.log
I have a shell script that does this, but I want a one-liner for it:
for i in `cat dataFile.log`; do grep "$i" searchFile.log; done

Try this:
grep -f dataFile.log searchFile.log
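Note that by default grep treats each line of dataFile.log as a basic regular expression. To match the lines as fixed strings, add -F; to use extended regular expressions (for example value1|value2), add -E. GNU grep also offers -P for Perl-compatible expressions, but it may reject a pattern file that contains more than one line.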
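As a minimal sketch of the fixed-string variant (the function name search_lines is made up for illustration, it is not part of grep), this wraps the one-liner so that every line of the pattern file is matched literally. One thing to watch for: an empty line in the pattern file matches every line of the target file.

```shell
# search_lines PATTERN_FILE TARGET_FILE  (hypothetical helper name)
# Print every line of TARGET_FILE that contains at least one line of
# PATTERN_FILE as a literal substring.
search_lines() {
    # -F: treat each pattern as a fixed string, not a regular expression
    # -f: read the patterns, one per line, from the given file
    grep -F -f "$1" -- "$2"
}
```

It would be called as search_lines dataFile.log searchFile.log.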

How about the following: it even ignores blank lines and # comments:
while read -r FILE; do if [[ "$FILE" != [/a-zA-Z0-9]* ]]; then continue; fi; grep -h pattern "$FILE"; done
Beware: I have not tested this.

You can use grep -f option:
cat dataFile.log | grep -f searchFile.log
Edit
OK, now I understand the problem. You want to use every line from dataFile.log as a pattern to grep for in searchFile.log. I also see you have value1|value2|..., so instead of plain grep you need egrep (equivalent to grep -E).
Try with this:
for i in `cat dataFile.log`
do
egrep "$i" searchFile.log
done
Edit 2
Following chepner's suggestion:
egrep -f dataFile.log searchFile.log

Related

User input into variables and grep a file for pattern

Hi!
I am trying to run a script which looks for a string pattern.
For example, I want to find two words that appear separately in a file such as:
"I like toast, toast is amazing. Bread is just toast before it was toasted."
I want to invoke it from the command line using something like this:
./myscript.sh myfile.txt "toast bread"
My code so far:
text_file=$1
keyword_first=$2
keyword_second=$3
find_keyword=$(cat $text_file | grep -w "$keyword_first""$keyword_second" )
echo $find_keyword
I have tried a few different ways. Directly from the command line I can make it run using:
cat myfile.txt | grep -E 'toast|bread'
I'm trying to put the user input into variables and use the variables to grep the file
You seem to be looking simply for
grep -E "$2|$3" "$1"
What works on the command line will also work in a script, though you will need to switch to double quotes for the shell to replace variables inside the quotes.
In this case, the -E option can be replaced with multiple -e options, too.
grep -e "$2" -e "$3" "$1"
You can pipe to grep twice, which keeps only the lines that contain both words:
find_keyword=$(grep -w "$keyword_first" "$text_file" | grep -w "$keyword_second")
Note that your search word "bread" is not found because the string contains the uppercase "Bread". If you want to find the words regardless of case, use grep's case-insensitive option -i:
find_keyword=$(grep -w -i "$keyword_first" "$text_file" | grep -w -i "$keyword_second")
In a full script:
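#!/bin/bash
#
# usage: ./myscript.sh myfile.txt "toast" "bread"
text_file=$1
keyword_first=$2
keyword_second=$3
# Keep only the lines that contain both keywords as whole words, ignoring case.
find_keyword=$(grep -w -i "$keyword_first" "$text_file" | grep -w -i "$keyword_second")
# Quote the variable so that line breaks in the result are preserved.
echo "$find_keyword"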
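The two answers above behave differently: several -e options (or -E with |) select lines containing either word, while two chained greps select lines containing both. A small sketch that puts them side by side, assuming a grep that supports -w (GNU and BSD grep do); the function names match_any and match_all are invented for this example:

```shell
# match_any FILE WORD1 WORD2  (hypothetical helper)
# Lines that contain WORD1 or WORD2 as a whole word, ignoring case.
match_any() {
    grep -i -w -e "$2" -e "$3" -- "$1"
}

# match_all FILE WORD1 WORD2  (hypothetical helper)
# Lines that contain both WORD1 and WORD2 as whole words, ignoring case.
match_all() {
    grep -i -w -e "$2" -- "$1" | grep -i -w -e "$3"
}
```

For example, match_any myfile.txt toast bread corresponds to the grep -E 'toast|bread' command from the question, with -i and -w added.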

Regular expression for extract line between two characters

I have several sequences to check for in my file, and I want to extract the ones that are present into another file. Each sequence starts with a unique id, which must be kept, and ends at the next ">", which I don't want to keep. I wrote a test, but I have a problem with the regular expression:
#!/bin/bash
cat data.fsa | grep "Qrob" | wc -l
for gene_id in 'gene1' 'gene2'
do
if cat "data.fsa" |grep $gene_id >/dev/null 2>&1
then
echo "data.fsa" | sed -n "s/.*${gene_id}\(.*\)>.*/\"\1\"/p"
else
continue
fi
done
How do I do this? Thanks for your help
I understand my error now, thanks to you! This is what I ended up with:
sed -n "/^>$gene_id/,/^>/p" data.fsa >> test.fsa && sed -i '$d' test.fsa
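This appends each matching block directly to test.fsa: the first sed prints from the header line starting with >$gene_id through the next line starting with >, and sed -i '$d' test.fsa then deletes that trailing > line, which belongs to the following sequence.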
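An alternative sketch with awk that never prints the following header, so nothing has to be deleted afterwards. It assumes, as in the sed command above, that each record begins with a header line of the form >ID; the function name extract_record is made up for this example. Like the sed version, it matches on a prefix, so gene1 would also select gene10.

```shell
# extract_record ID FILE  (hypothetical helper)
# Print the header and sequence lines of every record whose header
# line starts with ">ID".
extract_record() {
    awk -v id="$1" '
        # On each header line, decide whether this record is wanted.
        /^>/ { keep = (index($0, ">" id) == 1) }
        # Print header and sequence lines while inside a wanted record.
        keep
    ' "$2"
}
```

In the loop from the question it could be used as extract_record "$gene_id" data.fsa >> test.fsa.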

grep from two variables

I am trying to eliminate the duplicate lines of a list like this.
LINES='opa
opa
eita
eita
argh'
DUPLICATE='opa
eita'
The output I am looking for is argh.
Till now, this is what I tried:
echo -e "$DUPLICATE" | grep --invert-match -Ff- <(echo -e "$LINES")
And:
grep --invert-match -Ff- <(echo -e "$DUPLICATE") <(echo -e "$LINES")
But without success.
I know that I can achieve this if I put the content of $LINES into a file:
echo -e "$DUPLICATE" | grep --invert-match -Ff- FILE
But I'd like to know if this is possible only with variables.
Passing a dash as the file name to -f means "read from stdin". Get rid of it so the file name given to -f is the process substitution.
There's no need for echo -e, and -v is shorter and more common than --invert-match.
echo "$LINES" | grep -vFf <(echo "$DUPLICATE")
Equivalently, using a herestring:
grep -vFf <(echo "$DUPLICATE") <<< "$LINES"
Another approach, which doesn't require creating the duplicate list separately:
$ awk '{a[$0]++} END{for(k in a) if(a[k]==1) print k}' <<< "$LINES"
It counts the occurrences of each line and prints only the lines that are not duplicated (count==1).
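One caveat with the grep -vFf answer: without -x the patterns match substrings, so a pattern of opa would also remove a line such as opaque. A minimal sketch that works only with variables and matches whole lines; it relies on grep accepting a newline-separated pattern list in -e, and the function name remove_listed is invented for illustration:

```shell
# remove_listed LINES UNWANTED  (hypothetical helper)
# Both arguments are newline-separated lists held in variables.
# Prints the lines of LINES that are not exactly equal to any line of UNWANTED.
remove_listed() {
    # -v: invert the match, -x: match whole lines only,
    # -F: fixed strings, -e: newline-separated list of patterns
    printf '%s\n' "$1" | grep -v -x -F -e "$2"
}
```

With the variables from the question, remove_listed "$LINES" "$DUPLICATE" prints argh.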

Print the contents of files from the output of a program

Let's say I have a program foo that finds files with a certain specification and that the output of running foo is:
file1.txt
file2.txt
file3.txt
I want to print the contents of each of those files (preferably with the file name prepended). How would I do this? I would've thought piping it to cat like so:
foo | cat
would work but it doesn't.
EDIT:
My solution to this problem, which prints out each file and prepends the filename to each line of output, is:
foo | xargs grep .
This gets output similar to:
file1.txt: Hello world
file2.txt: My name is foobar.
<your command> | xargs cat
You need xargs here:
foo | xargs cat
In order to allow for file names that have spaces in them, you'll need something like this:
#!/bin/bash
# Read one file name per line from standard input.
while IFS= read -r file
do
    # Check for existence of the file before using cat on it.
    if [[ -f $file ]]; then
        cat "$file"
    # Don't bother with empty lines
    elif [[ -n $file ]]; then
        echo "There is no file named '$file'"
    fi
done
Put this in a script. Let's call it myscript.sh. Make it executable, then run:
foo | ./myscript.sh
foo | xargs grep '^' /dev/null
Why grep on ^? So that empty lines are displayed too (replace it with "." if you want only non-empty lines).
Why /dev/null? So that, in addition to any filenames in foo's output, grep is given at least one additional file (one that never matches anything, such as /dev/null). That way grep always receives at least two filenames, and so it always shows the name of the matching file.
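The two ideas above (a read loop that tolerates spaces in file names, and the /dev/null trick that forces grep to print the file name) can be combined. A rough sketch, with print_named_files as a made-up function name:

```shell
# print_named_files  (hypothetical helper)
# Reads one file name per line from standard input and prints every line
# of each existing file, prefixed with "name:".
print_named_files() {
    while IFS= read -r file; do
        # Skip empty input lines.
        [ -n "$file" ] || continue
        if [ -f "$file" ]; then
            # '^' matches every line, including empty ones; the extra
            # /dev/null operand makes grep print the file name.
            grep -e '^' -- /dev/null "$file"
        else
            printf 'There is no file named %s\n' "$file" >&2
        fi
    done
}
```

It would be used as foo | print_named_files.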

Bash grep variable from multiple variables on a single line

I am using GNU bash, version 4.2.20(1)-release (x86_64-pc-linux-gnu). I have a music file list I dumped into a variable: $pltemp.
Example:
/Music/New/2010s/2011;Ziggy Marley;Reggae In My Head
I wish to grep for the 3rd field above in the Master-Music-List.txt, then run another grep for the 2nd field. If both match, print the line; otherwise echo "Not Matched".
So the above will search for the song title (Reggae In My Head), then make sure the artist (Ziggy Marley) is on the same line, for a success.
So far, success for a non-variable grep;
$ grep -i -w -E 'shaggy.*angel' Master-Music-MM-Playlist.m3u
$ if ! grep Shaggy Master-Music-MM-Playlist.m3u ; then echo "Not Found"; fi
$ grep -i -w Angel Master-Music-MM-Playlist.m3u | grep -i -w shaggy
I'm not sure how to best construct the 'entire' list to process.
I want to do this on a single line.
I used this to dump the list into the variable $pltemp...
Original: \Music\New\2010s\2011\Ziggy Marley - Reggae In My Head.mp3
$ pltemp="$(sed -e 's/\(.*\)\\/\1;/' -e 's/\(.*\) - /\1;/' -e 's/\\/\//g' -e 's/\.mp3//' Reggae.m3u)"
If you really want to "grep this, then grep that", you need something more complex than grep by itself. How about awk?
awk -F';' '$3~/title/ && $2~/artist/ {print;n=1;exit;} END {if(n==0)print "Not matched";}'
If you want to make this search accessible as a script, the same thing simply changes form. For example:
#!/bin/sh
awk -F';' -v artist="$1" -v title="$2" '$3~title && $2~artist {print;n=1;exit;} END {if(n==0)print "Not matched";}'
Write this to a file, make it executable, and pipe stuff to it, with the artist substring/regex you're looking for as the first command line option, and the title substring/regex as the second.
On the other hand, what you're looking for might just be a slightly more complex regular expression. Let's wrap it in bash for you:
if ! echo "$pltemp" | egrep '^[^;]+;[^;]*artist[^;]*;.*title'; then
echo "Not matched"
fi
You can compress this to a single line if you like. Or make it a stand-alone shell script, or make it a function in your .bashrc file.
awk -F ';' -v title="$title" -v artist="$artist" '$3 ~ title && $2 ~ artist'
Well, none of the above worked, so I came up with this...
for i in *.m3u; do
    sed 's/.*\\//' "$i" | while IFS= read -r z; do
        grep --color=never -i -w -m 1 "$z" Master-Music-Playlist.m3u \
            || echo "#NotFound;$z "
    done > "$i"-MM-Final.txt
done
Each line is read (\Music\Lady Gaga - Paparazzi.mp3), the path is stripped, and the song is searched for in the master music list; if it is not found, the loop echoes "#NotFound" followed by the song, and the results are saved into a new playlist.
Works {Solved}
Thanks anyway.
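For the check described in the question (title first, then artist on the same line), here is one possible sketch that walks the path;artist;title list held in $pltemp. It is only an illustration under the assumption that each entry has exactly those three fields; the function name check_playlist is made up, and -F is used so that titles are matched literally rather than as regular expressions.

```shell
# check_playlist MASTER_FILE  (hypothetical helper)
# Reads "path;artist;title" entries from standard input and reports, for
# each one, whether some line of MASTER_FILE contains both title and artist.
check_playlist() {
    while IFS=';' read -r _path artist title; do
        # Skip entries that have no title field.
        [ -n "$title" ] || continue
        # First select the lines with the title, then look for the artist in them.
        if grep -i -F -e "$title" -- "$1" | grep -q -i -F -e "$artist"; then
            printf 'Matched: %s - %s\n' "$artist" "$title"
        else
            printf 'Not matched: %s - %s\n' "$artist" "$title"
        fi
    done
}
```

It could be called as printf '%s\n' "$pltemp" | check_playlist Master-Music-MM-Playlist.m3u.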

Resources