Why does a "while read" loop stop when grep is run with an empty argument? - bash

The following code does not work as I would expect:
(The original purpose of the script is to relate items of two files whose identifiers are not sorted in the same order, but my question is really a curiosity about basic shell functionality.)
#!/bin/sh
process_line() {
    id="$1"
    entry=$(grep $id index.txt)    # the "grep" line
    if [ "$entry" = "" ]; then
        echo 00000 $id
    else
        echo $entry | awk '{print $2, $1;}'
    fi
}
cat << EOF > index.txt
xyz 33333
abc 11111
def 22222
EOF
cat << EOF | while read line ; do process_line "$line"; done
abc
def

xyz
EOF
The output is:
11111 abc
22222 def
00000
But I would expect:
11111 abc
22222 def
00000
33333 xyz
(the last line is missing in the actual output)
My investigations show that the "grep" line is the one that causes the early termination of the while loop; however, I cannot see the causal relationship.

That's because in the third iteration, with the empty line, you call process_line with an empty id. Since $id is unquoted, the empty expansion is removed entirely during word splitting and the command becomes grep index.txt, i.e. index.txt is taken as the pattern and there is no file name. grep therefore reads from stdin, and that consumes all the remaining input you pipe into the while loop, so read finds nothing left and the loop ends.
To see this in action, add set -x at the top of your script.
You can get the desired behaviour if you replace the empty id with a string guaranteed to be not found, such as
entry=$(grep "${id:-NoSuchString}" index.txt)
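Another way to get the desired behaviour is to redirect the loop body's stdin, so a stray grep can never swallow the loop's input:
cat << EOF | while read line ; do process_line "$line" < /dev/null; done
abc
def

xyz
EOF
With an empty id, grep still runs with index.txt as its pattern, but it now reads /dev/null instead of the heredoc, returns immediately, and the loop goes on to process xyz.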

Changing the "process_line" function to the following might help...
process_line() {
    id=$1
    if [ "$id" = "" ]
    then
        echo "00000"
    else
        entry=$(grep "${id}" index.txt)
        echo "$entry" | awk '{ print $2, $1 }'
    fi
}
Explanation:
- if the "id" passed in is empty, just output the default
- the grep is moved to the else clause so it only executes when "id" has a value
- quoting "${id}" fixes the problem caused by the missing quotes around id in the original grep statement
- one more thing to consider is the case where "id" is non-empty but not found in index.txt; this results in a blank output line. Adding an if statement after the grep call to handle this case may be a good idea, depending on the overall intention (see the sketch below).
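For example, a minimal sketch of that extra check, reusing the 00000 fallback from the original script:
entry=$(grep "${id}" index.txt)
if [ -z "$entry" ]; then
    echo "00000 $id"    # id given, but not found in index.txt
else
    echo "$entry" | awk '{ print $2, $1 }'
fi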
Hope that helps

Related

how to grep certain string which matches certain value

I have a file abc.sh which contains the data below:
a_a_1 was unsuccessful
a_a_5 was completed
a_a_2 was unsuccessful
a_a_4 was unsuccessful
a_a_9 was unsuccessful
Now I have a variable abc which contains the value 2,1,9, i.e. abc=2,1,9.
I want to print only those lines of the file above that match the values 2,1,9.
The output should be:
a_a_2 was unsuccessful
a_a_1 was unsuccessful
a_a_9 was unsuccessful
How can I achieve the above output?
Since this is tagged tcl...
#!/usr/bin/env tclsh
proc main {abc} {
    set abc [string map {, |} $abc]
    set re [string cat {^a_a_(?:} $abc {)\M}]
    while {[gets stdin line] >= 0} {
        if {[regexp $re $line]} {
            puts $line
        }
    }
}
main [lindex $argv 0]
Example usage:
$ ./findit.tcl 2,1,9 < abc.sh
Basically, it converts the CSV 2,1,9 into pipe delimited 2|1|9 and uses that as part of a bigger regular expression, and prints lines read from standard input that match it.
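For abc=2,1,9, the pattern built by the proc is ^a_a_(?:2|1|9)\M; the \M end-of-word constraint means that a hypothetical a_a_11 line, for instance, would not match.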
Since you also seem to be interested in a bash solution (you already got one for Tcl):
grep -E "_(${abc//,/|}) " abc.sh
The idea here is to translate the 2,1,9 into the regexp pattern 2|1|9. An alternative, similar in spirit, would be
grep "_[${abc//,/}] " abc.sh
which produces the pattern [219].
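To check what these parameter expansions produce, you can echo them first (assuming abc=2,1,9):
$ abc=2,1,9
$ echo "${abc//,/|}"
2|1|9
$ echo "${abc//,/}"
219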
Your variable needs to be:
abc="2\|1\|9" for grep "$abc" abc.sh
a=2 b=1 c=9 for grep "[$a]\|[$b]\|[$c]" abc.sh
As for your question: if you want to keep abc=2,1,9, you can just change each , to \| using sed when executing grep, like this:
grep $(echo $abc | sed "s/,/\\\|/g") abc.sh
PS: English is not my primary language, so please excuse any grammar mistakes :)

find if there is any string in a line after matching regex bash

In a file looking like the one below, I would like to find out whether all lines with "PRIO" have a value after it, and if some values are missing I would like to report that in the output.
I've tried to do this with grep, but it only tells me whether there is at least one occurrence of the word I'm looking for.
cat PATH_TO_FILE | grep 'PRIO' &> /dev/null
if [ $? == 0 ]; then
    echo "matched"
else
    echo "not found"
fi
The file structure looks similar to the one below:
name1
sdgk
PRIO 3
name2
PRIO
dsl dfhhhdf
name3
fnslkf hsdhfd
jlkg;jslk sgdgdsg
kfasdjmgkdlsgl sdggsehg
PRIO 1
name4
sdgds
dsdsgdg
PRIO 2
sdgg
With awk this is very simple: just check the number of fields.
awk '/PRIO/{ str=(NF>1)?"matched":"not found"; print str }' <file>
This does the following:
/PRIO/ : if a line contains the word PRIO, perform the action {...}
{...} : if the number of fields is bigger than 1 (NF>1), it matched; otherwise it did not.
If you want to ensure that PRIO is the first word, then use $1=="PRIO", and if you want to print the line number then use
awk '($1=="PRIO"){ str=(NF>1)?"matched":"not found"; print NR,str }' <file>
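For the sample file above (assuming it is saved exactly as shown, with name1 on line 1), the line-number variant prints:
3 matched
5 not found
11 matched
15 matched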
Not sure exactly what you want to test, but
$ grep -q 'PRIO\s*$' file
checks whether there is any PRIO without a value after it. For your sample input this will succeed, and you can use that as an error condition.
if grep -q 'PRIO\s*$' file
then echo "found missing value instance"
else echo "all instances have values"
fi
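If you also want to know how many instances are missing a value, grep -c counts the matching lines; for the sample file above it prints 1:
grep -c 'PRIO\s*$' file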

Count lines following a pattern from file

For example, I have a file test.json that contains a series of lines containing:
header{...}
body{...}
other text (as others)
empty lines
I wanted to run a script that returns the following
Counted started on : test.json
- headers : 4
- body : 5
- <others>
Counted finished : <time elapsed>
What I've got so far is this:
count_file() {
    echo "Counted started on : $1"
    #TODO loop
    cat $1 | grep header | wc -l
    cat $1 | grep body | wc -l
    #others
    echo "Counted finished : " #TODO timeElapsed
}
Perl on Command Line
perl -E '$match=$ARGV[1];open(Input, "<", $ARGV[0]);while(<Input>){ ++$n if /$match/g } say $match," ",$n;' your-file your-pattern
For me
perl -E '$match=$ARGV[1];open(Input, "<", $ARGV[0]);while(<Input>){ ++$n if /$match/g } say $match," ",$n;' parsing_command_line.pl my
It counts the number of lines matching the pattern my in my script parsing_command_line.pl.
output
my 3
For you
perl -E '$match=$ARGV[1];open(Input, "<", $ARGV[0]);while(<Input>){ ++$n if /$match/g } say $match," ",$n;' test.json headers
NOTE
Write the whole command on one line at your prompt.
The first argument is your file; the second is your pattern.
This is not a complete solution, since you have to enter each pattern one by one (see the loop below).
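If you need several counts, one simple way (a sketch, assuming the patterns header and body from the question) is to wrap the one-liner in a shell loop:
for pat in header body; do
    perl -E '$match=$ARGV[1];open(Input, "<", $ARGV[0]);while(<Input>){ ++$n if /$match/g } say $match," ",$n;' test.json "$pat"
done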
You can capture the result of a command in a variable, like:
result=`cat $1 | grep header | wc -l`
and then print the result:
echo "- headers : $result"
The backtick is the command substitution operator: the whole expression is replaced by the output of the command between the backticks.
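Putting that together, a minimal sketch of the whole function (assuming bash, using $(...) command substitution, grep -c to count matching lines, and the SECONDS variable for a rough elapsed time):
count_file() {
    echo "Counted started on : $1"
    local start=$SECONDS
    echo "- headers : $(grep -c header "$1")"
    echo "- body : $(grep -c body "$1")"
    echo "Counted finished : $((SECONDS - start))s"
}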

Output a record from an existing file based on a matching condition in bash scripting

I need to be able to output a record if a condition is true.
Suppose this is the existing file,
Record_ID,Name,Last Name,Phone Number
I am trying to output a record if the last name matches. I collect user input to get the last name and then perform the following operation:
read last_name
cat contact_records.txt | awk -F, '{if($3=='$last_name')print "match"; else print "no match";}'
This script outputs no match for every record within contact_records.txt
Your script has two problems:
First, the expanded $last_name is not quoted inside the awk program. For example, if "John" is to be queried, you are comparing $3 with the (uninitialized, hence empty) awk variable John rather than the string "John". This can be fixed by adding two double quotes, as below:
read last_name
cat contact_records.txt | awk -F, '{if($3=="'$last_name'")print "match"; else print "no match";}'
Second, it actually scans the whole contact_records.txt and prints match/no match for each line it compares. For example, if contact_records.txt has 100 lines and "John" appears in one of them, querying for John with this script yields one "match" and 99 "no match" lines. This might not be what you want. Here's a fix:
read last_name
if [ `cat contact_records.txt | cut -d, -f 3 | grep -c "$last_name"` -eq 0 ]; then
    echo "no match"
else
    echo "match"
fi
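A cleaner variant (a sketch along the same lines, not from the original answers): pass the shell variable to awk with -v, which avoids the shell-quoting problem entirely:
read last_name
awk -F, -v name="$last_name" '
    $3 == name { found = 1 }
    END { print (found ? "match" : "no match") }
' contact_records.txt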

Bash add to end of file (>>) if not duplicate line

Normally I use something like this for processes I run on my servers
./runEvilProcess.sh >> ./evilProcess.log
However I'm currently using Doxygen and it produces lots of duplicate output
Example output:
QGDict::hashAsciiKey: Invalid null key
QGDict::hashAsciiKey: Invalid null key
QGDict::hashAsciiKey: Invalid null key
So you end up with a very messy log
Is there a way I can add a line to the log file only if it isn't the same as the last line added?
A poor example (I'm not sure how to do this in bash):
$previousLine = ""
$outputLine = getNextLine()
if ($previousLine != $outputLine) {
    $outputLine >> logfile.log
    $previousLine = $outputLine
}
If the process returns duplicate lines in a row, pipe the output of your process through uniq:
$ ./t.sh
one
one
two
two
two
one
one
$ ./t.sh | uniq
one
two
one
If the logs are sent to the standard error stream, you'll need to redirect that too:
$ ./yourprog 2>&1 | uniq >> logfile
(This won't help if the duplicates come from multiple runs of the program - but then you can pipe your log file through uniq when reviewing it.)
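For example, to review an existing log with consecutive duplicates collapsed:
$ uniq ./evilProcess.log | less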
Create a filter script (filter.sh):
while read -r line; do
    if [ "$last" != "$line" ]; then
        echo "$line"
        last=$line
    fi
done
and use it:
./runEvilProcess.sh | sh filter.sh >> evillog
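The same filter can also be written as an awk one-liner (a sketch; prev starts out empty, just like $last in the script above):
./runEvilProcess.sh | awk '$0 != prev; { prev = $0 }' >> evillog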
