Passing empty strings to grep command - bash

I have this script where I ask for 4 patterns and then use those in a grep command. That is, I want to see if a line matches any of the patterns.
echo -n "Enter pattern1"
read pat1
echo -n "Enter pattern2"
read pat2
echo -n "Enter pattern3"
read pat3
echo -n "Enter pattern4"
read pat4
cat somefile.txt | grep $pat1 | grep $pat2 | grep $pat3 | grep $pat4
The problem I'm running into is that if the user doesn't supply one of the patterns (which I want to allow) the grep command doesn't work.
So, is there a way to have grep ignore one of the patterns if it's returned empty?

Your code has lots of problems:
Code duplication
Interactive asking for potentially unused information
Using echo -n, which is not portable
Useless use of cat
Here is what I wrote that is closer to what you should use instead:
i=1
printf %s "Enter pattern $i: "
read -r input
while [[ $input ]]; do
pattern+=(-e "$input")
let i++
printf %s "Enter pattern $i (Enter or Ctrl+D to stop entering patterns): "
read -r input
done
echo
grep "${pattern[@]}" somefile.txt
EDIT: This does not answer OP's question, this searches for multiple patterns with OR instead of AND...
Here is a working AND solution (it will stop prompting for patterns on the first empty one or after the 4th one):
pattern=
for i in {1..4}; do
printf %s "Enter pattern $i: "
read -r input
[[ $input ]] || break
pattern="${pattern:+"$pattern && "}/${input//\//\\/}/"
done
echo # skip a line
awk "$pattern" somefile.txt
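For comparison, the original chain of greps can also be kept and made to skip empty patterns. This is a minimal sketch, assuming a hypothetical helper named filter_all (it is not part of the answer above) that takes the patterns as arguments and filters standard input:

```bash
# filter_all PATTERN...: print the lines of stdin that match every
# non-empty PATTERN. Empty patterns are skipped instead of breaking grep.
# (filter_all is a hypothetical name used only for this sketch.)
filter_all() {
    if [ "$#" -eq 0 ]; then
        cat                      # no patterns left: pass everything through
        return
    fi
    pat=$1
    shift
    if [ -z "$pat" ]; then
        filter_all "$@"          # empty pattern: ignore it
    else
        grep -e "$pat" | filter_all "$@"
    fi
}
```

It could be called as filter_all "$pat1" "$pat2" "$pat3" "$pat4" < somefile.txt. The quotes matter: an unquoted empty variable disappears from the command line, which is what breaks the original script.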
Here are some links from which you can learn how to program in bash:
Bash Guide
Bash FAQ

Related

How can I grep a list of names from case?

So as an example, I have a bunch of apps that are constantly writing to /var/log/app/*/nonsence.file. There's nothing else in those folders, just logs from this one set of apps, so I can easily do:
cat /var/log/app/*/nonsence.file
and I'll get a nice stream of the app logs.
Mixed into this stream are periodic references to people. I'd like to build a script to trigger when certain names appear in the stream.
I can do this easily enough:
cat /var/log/app/*/nonsence.file | grep 'greg\|john\|suzy\|stacy'
and I can put THAT into a simple script thusly:
#!/bin/sh
NAME=`cat /var/log/app/*/nonsence.file | grep 'greg\|john\|suzy\|stacy'`
case "$NAME" in
"greg" ) echo "I found greg!" >> ~/names.meh ;;
"john" ) echo "I found john!" >> ~/names.meh ;;
"suzy" ) echo "I found suzy!" >> ~/names.meh ;;
"stacy" ) echo "I found stacy!" >> ~/names.meh ;;
* ) echo "forever alone..." >> ~/names.meh ;;
esac
easy peasy!
the trouble is, the list of names change from time to time and I would really like a neater list.
After some thinking I believe what I REALLY want to do is add each name into the case section only. so what do I need to do in the NAME variable section to tell the command to grep the name referenced in the case section?
cat file | grep is a useless use of cat. Just grep file.
Commands in a pipeline are block-buffered by default, so matches can show up late unless grep is told to flush after each line.
The >> ~/names.meh is just repetition. Just specify it once for the whole block.
The backticks ` are discouraged. It's preferred to use $(..) instead.
Each time NAME=... is assigned, the files are read from the start, while you seem to want:
... I'd like to build a script to trigger when certain names appear in the stream.
which suggests you want to react when the name appears in the stream, not some time later.
You may try:
patterns=(greg john suzy stacy)
printf "%s\n" /var/log/app/*/nonsence.file |
# tail each file at the same time by spawning for each a background process
xargs -P0 -n1 tail -F -n+1 |
# grep for the patterns
# pass the patterns from a file
# the <(...) is a process substitution, a bash extension
grep --line-buffered -f <(printf "%s\n" "${patterns[@]}") -o |
# for each grepped content execute different action
while IFS= read -r line; do
case "$line" in
"greg") someaction; ;;
# etc
*) echo "Internal error - unhandled pattern"; ;;
esac
done >> ~/names.meh
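To try the dispatch logic without tailing live logs, the grep -o and case parts can be run on static text. This is a rough sketch, assuming a hypothetical function dispatch_names and the four names from the question hardcoded as -e patterns. The -o and -w options are supported by GNU and BSD grep but are not required by POSIX:

```bash
# dispatch_names: read log text on stdin and print one message per
# occurrence of a known name. (Hypothetical helper for illustration.)
dispatch_names() {
    # -o prints each match on its own line, -w matches whole words only
    grep -o -w -e greg -e john -e suzy -e stacy |
    while IFS= read -r name; do
        case "$name" in
            greg)            echo "I found greg!" ;;
            john|suzy|stacy) echo "I found $name!" ;;
            *)               echo "Internal error - unhandled pattern: $name" ;;
        esac
    done
}
```

For example, printf 'greg met suzy\n' | dispatch_names prints one line for greg and one for suzy, because both names occur on the same input line.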
Because specifying the patterns twice is lame, you could use an associative array to map the patterns to function names, or just use unique function names and generate the pattern list from them:
pattern_greg() { echo "greg"; }
pattern_kamil() { echo "well, not greg"; }
patterns=($(declare -F | sed 's/declare -f //; /^pattern_/!d; s/pattern_//'))
... |
while IFS= read -r line; do
if declare -f pattern_"$line" >/dev/null 2>&1; then
pattern_"$line"
else
echo "Internal error occurred"
fi
done
alternatively, but I like the functions better:
greg_function() { echo do something; }
kamil_callback() { echo do something else; }
declare -A patterns
patterns=([greg]=greg_function [kamil]=kamil_callback)
... | grep -f <(printf "%s\n" "${!patterns[@]}") ... |
while IFS= read -r line; do
# I think this is how to check if array element is set
if [[ -n "${patterns[$line]}" ]]; then
"${patterns[$line]}"
else
echo error
fi
done

Issues with grep and get a count of a string in a loop

I have a set of search strings in a file (File1) and a content file (File2). I am trying to loop through all the search strings in File1, count the occurrences of each one in File2, and output the result. I want to automate this and make it generic so I can search through multiple content files. However, I don't seem to get the right count when I execute this loop: I get a count of "0" for each string even though those strings are in the file. I can't figure out what I am doing wrong and could use some help!
Below is the script I came up with:
#!/bin/bash
while IFS='' read -r line || [[ -n "$line" ]]; do
count=$(echo cat "$2" | grep -c "$line")
echo "$count - $line"
done < "$1"
Command I am using to run this script:
./scanscript.sh File1.log File2.log
I say this since I searched this command separately and get the right value. This command works by itself but I want to put this in a loop
cat File2.log | grep -c "Search String"
Sample Data for File 1 (Search Strings):
/SERVER_NAME/Root/DEV/Database/NJ-CONTENT/Procs/
/SERVER_NAME3/Root/DEV/Database/NJ-CONTENT/Procs/
Sample Data for File 2 (Content File):
./SERVER_NAME/Root/DEV/Database/NJ-CONTENT/Procs/test.test_proc.sql:29:
./SERVER_NAME2/Root/DEV/Database/NJ-CONTENT/Procs/test.test_proc.sql:100:
./SERVER_NAME3/Root/DEV/Database/NJ-CONTENT/Procs/test.test_proc.sql:143:
./SERVER_NAME4/Root/DEV/Database/NJ-CONTENT/Procs/test.test_proc.sql:223:
./SERVER_NAME5/Root/DEV/Database/NJ-CONTENT/Procs/test.test_proc.sql:5589:
Problem is this line:
count=$(echo cat "$2" | grep -c "$line")
That should be changed to:
count=$(grep -Fc "$line" "$2")
Also note -F is to be used for fixed string search instead of regex search.
Full code:
while IFS='' read -r line || [[ -n "$line" ]]; do
count=$(grep -Fc "$line" "$2");
echo "$count - $line";
done < "$1"
Run it as:
./scanscript.sh File1.log File2.log
Output:
1 - /SERVER_NAME/Root/DEV/Database/NJ-CONTENT/Procs/
1 - /SERVER_NAME3/Root/DEV/Database/NJ-CONTENT/Procs/
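The difference that -F makes shows up as soon as a search string contains a regex metacharacter such as a dot. A small sketch, assuming two hypothetical helpers, count_regex and count_fixed, that wrap the two grep invocations:

```bash
# count_regex PATTERN FILE: lines of FILE matching PATTERN as a regex.
count_regex() {
    grep -c -e "$1" -- "$2"
}

# count_fixed STRING FILE: lines of FILE containing STRING literally.
count_fixed() {
    grep -Fc -e "$1" -- "$2"
}
```

With a file containing the two lines a.c and abc, count_regex 'a.c' counts both lines, because the dot matches any character, while count_fixed 'a.c' counts only the first. Note that grep -c still prints 0 when nothing matches, but it exits with status 1 in that case.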

Bash (split) file name comparison fails

In my directory I have files (*fastq.gz.fasta) and directories, whose names contain the filenames (*fastq.gz.fasta-blastdb):
IVC6_Meino.clust.gz.fasta-blastdb
IVC5_Mehiv.clust.gz.fasta-blastdb
....
IVC6_Meino.clust.gz.fasta
IVC5_Mehiv.clust.gz.fasta
....
In a bash script I want to compare the filenames with the directories, using cut on the latter to extract only the filename part. If the two names match I want to do further stuff (for now, echo match or no match respectively).
I have written the following piece of code:
#!/bin/bash
for file in *.fasta
do
for db in *-blastdb
do
echo $file, $db | cut -d '-' -f 1
if [[ $file = "$db | cut -d '-' -f 1" ]]; then
echo "match"
else
echo "no match"
fi
done
done
But it does not detect matches. The output looks like this:
...
IVC6_Meino.clust.gz.fasta, IIIA11_Meova.clust.gz.fasta
no match
IVC6_Meino.clust.gz.fasta, IVC5_Mehiv.clust.gz.fasta
no match
IVC6_Meino.clust.gz.fasta, IVC6_Meino.clust.gz.fasta
no match
The last line should read match as you can see, the strings look the same.
What am I missing?
You can use parameter expansion to do this more easily:
for file in *.fasta
do
for db in *-blastdb
do
echo "$file", "$db"
if [[ "${file%%.fasta}" = "${db%%.fasta-blastdb}" ]]; then
echo "match"
else
echo "no match"
fi
done
done
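The ${var%suffix} expansion removes the shortest matching suffix from the value without starting any external process. Here is a minimal sketch of the comparison on its own, assuming a hypothetical helper same_base:

```bash
# same_base FILE DB: succeed when DB is exactly FILE plus "-blastdb".
# (Hypothetical helper; ${2%-blastdb} strips that suffix from DB.)
same_base() {
    [ "$1" = "${2%-blastdb}" ]
}
```

It can be used directly in a condition, for example if same_base "$file" "$db"; then echo "match"; fi.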
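If you want to fix yours, the problem is the use of $db | cut -d '-' -f 1. In the echo line the pipe is real: the output of echo goes through cut, and it is cut that prints the text before the first '-'. Inside [[ $file = "$db | cut -d '-' -f 1" ]], however, the double quotes turn everything into a literal string, so no pipeline runs and $file is compared with the value of $db followed by the text | cut -d '-' -f 1, which never matches.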
You need to use the $(..) shell construct to capture the output of the pipe and you need to echo to get the contents of $db to start the pipe. You should quote "$db" so you do not have word splitting or globbing from the contents of the variable.
Like so:
for file in *.fasta
do
for db in *-blastdb
do
ts=$(echo "$db" | cut -d '-' -f 1)
echo "$file", "$ts"
if [[ "$file" = "$ts" ]]; then
echo "match"
else
echo "no match"
fi
done
done # this works I think -- not tested...
Please be careful with your quoting with Bash and liberally use ShellCheck.
The structure you have is also not the most efficient. You will loop over the *-blastdb glob once for every file in *.fasta. If you have a lot of files, that could get really slow.
To solve that, you could rewrite this loop with Bash arrays (best if you have Bash 4+) or use awk:
ext1=.fasta
ext2=.fasta-blastdb
awk 'FNR==NR{
s=$0
sub("\\"ext1"$","",s)
seen[s]=$0
next}
{
s=$0
sub("\\"ext2"$","",s)
if (s in seen)
print seen[s], $0
}
' ext1="$ext1" ext2="$ext2" <(for fn in *$ext1; do echo "$fn"; done) <(for fn in *$ext2; do echo "$fn"; done)
Each glob is only executing once and awk is using an array to test if the basenames are the same.
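If the only goal is to find files that have a matching directory, the inner loop can also be dropped by testing for the directory directly. This is a sketch under the assumption that each database directory is named exactly like its file plus -blastdb, as in the listing in the question; list_matches is a hypothetical name:

```bash
# list_matches DIR: print each *.fasta file in DIR that has a sibling
# directory named <file>-blastdb. (Hypothetical helper for illustration.)
list_matches() {
    for file in "$1"/*.fasta; do
        [ -e "$file" ] || continue        # the glob matched nothing
        if [ -d "$file-blastdb" ]; then
            echo "match: ${file##*/}"     # print the name without the path
        fi
    done
}
```

Calling list_matches . in the directory from the question would do one directory test per file instead of one comparison per file and directory pair.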

Shell script hangs forever grepping from file with name from "read $file"

I have my below shell script which searches for a string inside a file and returns the count. Not sure why it's getting stuck in the middle. Please can anyone explain.
#!/bin/bash
read -p "Enter file to be searched: " $file
read -p "Enter the word you want to search for: " $word
count=$(grep -o "^${word}:" $file | wc -l)
echo "The count for `$word`: " $count
OUTPUT:
luckee@zarvis:~/scripts$ ./wordsearch.sh
Enter file to be searched: apple.txt
Enter the word you want to search for: apple
^C
read needs to be passed a variable name. file, not $file.
#!/bin/bash
read -p "Enter file to be searched: " file
read -p "Enter the word you want to search for: " word
count=$(grep -o -e "$word" "$file" | wc -l)
echo "The count for $word: $count"
What was happening previously is that your file variable was empty, so your code was running:
count=$(grep -o "^${word}:" | wc -l)
...with no input specified, so it would wait forever for stdin.
By the way -- you don't need wc for this; grep can emit a counter itself, using the -c argument (also called --count in the GNU implementation). If you want that counter to go by words rather than lines, one can use tr to put each word on its own line:
count=$(tr -s '[:space:]' '\n' <"$file" | grep -c -e "$word")
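The difference between counting lines and counting words can be seen side by side. A small sketch, assuming two hypothetical helpers; unlike the command above, count_words uses -x and -F so that only whole words equal to the search word are counted:

```bash
# count_lines WORD FILE: number of lines in FILE that contain WORD.
count_lines() {
    grep -c -e "$1" -- "$2"
}

# count_words WORD FILE: number of whitespace-separated words in FILE
# that are exactly WORD (-x: whole line must match, -F: literal string).
count_words() {
    tr -s '[:space:]' '\n' < "$2" | grep -c -x -F -e "$1"
}
```

For a file whose lines are "apple apple pie" and "apple", count_lines apple reports 2 while count_words apple reports 3.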

Count multiple occurrences of a word on the same line using grep

Here is a small script that takes a pattern from the user, searches a file for it, and displays the requested number of lines from the file starting at each line where the pattern is found. However, it matches line by line, as grep normally does. If the pattern occurs twice on the same line, I want the output printed twice. I hope that makes sense.
#!/bin/sh
cat /dev/null>copy.txt
echo "Please enter the sentence you want to search:"
read "inputVar"
echo "Please enter the name of the file in which you want to search:"
read "inputFileName"
echo "Please enter the number of lines you want to copy:"
read "inputLineNumber"
[ -n "$inputLineNumber" ] || inputLineNumber=20
cat /dev/null > copy.txt
for N in `grep -n $inputVar $inputFileName | cut -d ":" -f1`
do
LIMIT=`expr $N + $inputLineNumber`
sed -n $N,${LIMIT}p $inputFileName >> copy.txt
echo "-----------------------" >> copy.txt
done
cat copy.txt
As I understand it, the task is to count the number of pattern occurrences in a line. It can be done like so:
count=$((`echo "$line" | sed -e "s|$pattern|\n|g" | wc -l` - 1))
Suppose you have one file to read. Then the code will be as follows:
#!/bin/bash
file=$1
pattern="an."
#reading file line by line
cat -n $file | while read input
do
#storing line to $tmp
tmp=`echo $input | grep "$pattern"`
#counting occurrences count
count=$((`echo "$tmp" | sed -e "s|$pattern|\n|g" | wc -l` - 1))
#printing $tmp line $count times
for i in `seq 1 $count`
do
echo $tmp
done
done
I checked this for pattern "an." and input:
I pass here an example of many 'an' letters
an
ananas
an-an-as
Output is:
$ ./test.sh input
1 I pass here an example of many 'an' letters
1 I pass here an example of many 'an' letters
1 I pass here an example of many 'an' letters
3 ananas
4 an-an-as
4 an-an-as
Adapt this to your needs.
How about using awk?
Assume the pattern you are searching for is in variable $pattern and the file you are checking is $file
Then, for a whole file:
count=`awk 'BEGIN{n=0}{n+=split($0,a,"'"$pattern"'")-1}END {print n}' "$file"`
or for a single line:
count=`echo "$line" | awk '{n=split($0,a,"'"$pattern"'")-1;print n}'`
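Two other ways to get at the same numbers are sketched below, with hypothetical helper names. The first relies on grep -o, which GNU and BSD grep support and which prints every match on its own line. The second uses the return value of awk's gsub, which is the number of substitutions made. Note that awk -v interprets backslash escapes in the assigned value, so patterns containing backslashes would need extra care.

```bash
# count_occurrences PATTERN FILE: total number of matches in FILE,
# counting several matches on the same line separately.
count_occurrences() {
    grep -o -e "$1" -- "$2" | wc -l | tr -d ' '
}

# repeat_per_match PATTERN FILE: print each line of FILE once per match.
repeat_per_match() {
    awk -v pat="$1" '{
        s = $0
        n = gsub(pat, "&", s)           # n = number of matches on the line
        for (i = 0; i < n; i++) print   # print the original line n times
    }' "$2"
}
```

The repeat_per_match form is closest to what the question asks for, since a line with two matches is printed twice.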

Resources