awk command returns inconsistent result - shell

I have a script which performs several validations in a file.
One of the validation is to check if a line exceeds 95 characters (including spaces). If it does, it will return the line number/s where the line exceeds 95 characters. Below is the command used for the validation:
$(cat $FileName > $tempfile)
totalCharacters=$(awk '{print length($0);}' $tempFile | grep -vn [9][4])
if [[$totalCharacters != "" ]]; then
totalCharacters=$(awk '{print length($0);}' $tempFile | grep -vn [9][4] | cut -d : -f1 | tr "\n" ",")
echo "line too long"
exit 1
fi
In the lower environment the code is working as expected. But in production, there are time that the validation returns the error "line too long" but does not return any line number. We just reran the script and it does not return any error.
I would like to know what could be wrong in the command used.
The previous developer who worked on this said that it could be an issue with the use of the awk command, but I am not sure since this is the first time I have encountered using the awk command.

This is not well written. All you need is:
awk 'length>94{print NR; f=1} END{exit f}' file
If there are lines longer than 94 chars, it will print the line numbers and the exit with status 1; otherwise the exit status will be 0 and no output will be generated.

Your script should just be:
awk '
length()>95 { printf "%s%d", (c++?",":""), NR }
END { if (c) {print "line too long"; exit 1} }
' "$FileName"

Related

Unexpected behaviour with awk exit

I have next code:
process_mem() {
used=`sed -n -e '/^Cpu(s):/p' $temp_data_file | awk '{print $2}' | sed 's/\%us,//'`
idle=`sed -n -e '/^Cpu(s):/p' $temp_data_file | awk '{print $5}' | sed 's/\%id,//'`
awk -v used=$used \
-v custom_cpu_thres=$custom_cpu_thres \
'{
if(used>custom_cpu_thres){
exit 1
}else{
exit 0
}
}'
return=$?
echo $return
if [[ $return -eq 1 ]]; then
echo $server_name"- High CPU Usage (Used:"$used".Idle:"$idle"). "
out=1
else
echo $server_name"- Normal CPU Usage (Used:"$used".Idle:"$idle"). "
fi
}
while IFS='' read -r line || [[ -n "$line" ]]; do
server_name=`echo $line | awk '{print $1}'`
custom_cpu_thres=`echo $line | awk '{print $3}'`
if [ "$custom_cpu_thres" = "-" ]; then
custom_cpu_thres=$def_cpu_thres
fi
expect -f "$EXPECT_SCRIPT" "$command" >/dev/null 2>&1
result=$?
if [[ $result -eq 0 ]]; then
process_mem
else
echo $server_name"- Error in Expect Script. "
out=1
fi
echo $server_name
done < $conf_file
exit $out
The problem is that read bash loop should be executed 4 times (one per line readed). However, if I write the awk code with an exit inside, read bash loop exits after first loop.
Why is this happening? In my opinion exit code in awk code shouldn't affect bash script..
Regards.
I believe the statement you make is false.
You stated:
The problem is that read bash loop should be executed 4 times (one per line read). However, if I write the awk code with an exit inside, read bash loop exits after the first loop.
I do not believe that the script exits after the first loop, but is stuck in the first loop. The reason I make this statement is that your awk script is flawed. The way you wrote it is :
awk -v used=$used -v custom_cpu_thres=$custom_cpu_thres \
'{ if(used>custom_cpu_thres){ exit 1 }
else{ exit 0 } }'
The problem here is that Awk did not get an input file. If no input file is proved to awk, it is reading stdin (similar to processing a pipe or keyboard input). Since no information is sent to stdin (unless you pressed a couple of keys and accidentally hit Enter) the script will not move forward and Awk is awaiting input.
The standard input shall be used only if no file operands are specified, or if a file operand is '-', or if a progfile option-argument is '-'; see the INPUT FILES section. If the awk program contains no actions and no patterns, but is otherwise a valid awk program, standard input and any file operands shall not be read and awk shall exit with a return status of zero.
source : Awk POSIX Standard
The following bash-line demonstrates the above statement:
$ while true; do awk '{print "woot!"; exit }'; done
Only when you press some keys followed by Enter, the word "woot!" is printed on the screen!
How to solve your problem:
The easiest way to solve your problem using Awk is by making use of the BEGIN block. This block is executed before it reads any input line (or stdin). If you tell Awk to exit in a begin block, it will terminate Awk without reading any input. Thus:
awk -v used=$used -v custom_cpu_thres=$custom_cpu_thres \
'BEGIN{ if(used>custom_cpu_thres){ exit 1 }
else{ exit 0 } }'
or shorter
awk -v used=$used -v custom_cpu_thres=$custom_cpu_thres \
'BEGIN{ exit (used>custom_cpu_thres) }
However, Awk is a bit of an overkill here. A simple bash test would suffice:
[[ "$used" -le "$custom_cpu_thres" ]]
result=$?
or
(( used <= custom_cpu_thres ))
result=$?

For loop in a awk command

I have a file which has rows , now i want to read it'w value from awk command in Unix. I am able to read that file , but i have added a for loop to traverse all the data into the file. But my for loop is not ending it is going in infinite loop.
Below code i am using to read the file and get the data of $1 ,$2 and $3 position
file=$1;
nbrClients=`wc -l $file | cut -d' ' -f1`;
echo $nbrClients;
awk '{
for(i=0; i<=$nbrClients; ++i)
{print $1 $2 $3}
}' $file
File which i am reading has below format :
abc 12 test.txt
abc 12 test.txt
abc 12 test.txt
abc 12 test.txt
abc 12 test.txt
abc 12 test.txt
So for this nbrClients value will be 6 and it should loop for 6 times but it is not doing so .Please suggest what wrong i am doing in this.
Here is the full code which i am trying to :
file=$1;
nbrClients=`wc -l $file | cut -d' ' -f1`;
echo $nbrClients;
file=$1;
cat | awk '{
fileName=$1
tnxCount=$2
for i in `seq 1 $tnxCount`
do
echo "Starting thread number $i"
nohup perl /home/user/abc.pl -i $fileName >>/home/user/test_load_${today}.out 2>&1 &
done
}' $file;
I think the problem here is that you're under the impression that the for loop is what will cause awk to step through your input file, whereas it's awk's nature to do that already.
Awk works by taking a set of condition { statement } pairs, and then FOR EACH LINE OF INPUT, evaluating the condition, and if it rings true, executing the statement. Note that conditions can be statements (since functions and other commands have a return value) and statements can include if constructs, so there's a lot of flexibility here.
Note that awk can also reduce or simplify stuff you'd do in a shell script. Consider the following:
#!/bin/sh
file="$1"
awk '
NR==FNR {
ClientCount++
next
}
FNR==1 {
printf "%s: %d\n", FILENAME, ClientCount
}
{
print $1, $2, $3
}
' "$file" "$file"
This script reads your input file twice -- once to count the lines (so that the line count can be placed at the top of theoutput), and once to process the lines, printing the first three fields. The script is composed of three condition { statement } groupings:
The first one is the counter. It only operates on the first instance of the file, and the next command insures that no other commands will be run on that file.
The second one operates on the first line of the file. But since the first condition captured all of the first file, this statement will only be executed once, when the first line of the second file is in play.
The third one is what prints the bulk of your output. With awk, when no condition is included, the condition is assumed to be "true", so this statement runs for each line of the second file.
The awk script could of course be compressed onto a single line, I've spaced it out for easier reading.
Note also that this method of keeping or showing a line count might be a little heavy handed. If you know that you're just showing a line count, you can use the internal awk variable NR. At the point in your script where the second condition is evaluated, NR-1 is the line count of the previous file, so you could use:
#!/bin/sh
file="$1"
awk '
NR==FNR {
next
}
FNR==1 {
printf "%s: %d\n", FILENAME, NR-1
}
{
print $1, $2, $3
}
' "$file" "$file"
updating the answer based on comment and latest version of the question
file=$1;
nbrClients=`wc -l $file | cut -d' ' -f1`;
echo $nbrClients;
file=$1;
cat $file | awk -v fileName="$1" -v tnxCount="$2" '{
system("echo "Starting thread number $i"")
system("nohup perl /home/user/abc.pl -i $fileName >>/home/user/test_load_${today}.out 2>&1 &")
}';

Bash: Complex command in if statement

I'm writing this script, which should detect an error after the smart test is done. But I can't get it to detect any error, or not of course.
if [[ smartctl --log=selftest /dev/sda | awk 'NR>7 {print $4,$5,$6,$7}' | sed 's/offline//g; s/00%//g' != *"Completed without error"* ]; then
echo "No error detected"
else echo "Error detected"
fi
Output:
./test.sh: line 19: conditional binary operator expected
./test.sh: line 19: syntax error near `--log=selftest'
./test.sh: line 19: `if [[ smartctl --log=selftest /dev/sda | awk 'NR>7 {print $4,$5,$6,$7}' | sed 's/offline//g; s/00%//g' != *"Completed without error"* ]]; then'
So obviously I'm doing something wrong. But all the tutorials say two [[]] thingies, but I think the command is quite complex, it doesn't work... How can I make it work?
If you want to do a substring comparison, you need to pass a string on the left-hand side of the = or != operator to [[ ]].
A command substitution, $(), will replace the command it contains with its output, giving you a single string which can be compared in this way.
That is:
smartctl_output=$(smartctl --log=selftest /dev/sda | awk 'NR>7 {print $4,$5,$6,$7}' | sed 's/offline//g; s/00%//g')
if [[ "$smartctl_output" != *"Completed without error"* ]; then
: ...put your error handling here...
fi
or, a bit less readably:
if [[ "$(smartctl --log=selftest /dev/sda | awk 'NR>7 {print $4,$5,$6,$7}' | sed 's/offline//g; s/00%//g')" != *"Completed without error"* ]; then
: ...put your error handling here...
fi
You are confusing things. If the command you want to test is smartctl, don't replace it with [[. You want either or, not both. (See also e.g. Bash if statement syntax error)
Anyway, piping awk through sed and then using the shell to compare the result to another string seems like an extremely roundabout way of doing things. The way to communicate with if is to return a non-zero exit code for error.
if smartctl --log=selftest /dev/sda |
awk 'NR>7 { if ($4 OFS $5 OFS $6 OFS $7 ~ /Completed without error/) e=1; exit }
END { exit 1-e }'
then
echo "No error detected"
else
echo "Error detected"
fi

Bash - grep & echo only linenumbers of occurence

I am struggling with only displaying an errormessage with the linenumbers.
e.g.
ERROR: Rule19: Tunerparams and/or CalcInternal in Script at 13, 15, 22
Could you please check and help me to get it right (I am very new to this)
checkCodingRule19()
{
grep -En "TunerParams|CalcInternal" $INPUT_FILE &&
echo "error: ´Rule 19: Tunerparams and/or Calicinternal in Script at $line"
}
Instead of grep you can use this simple awk script:
awk '(NR==13 || NR==15 || NR==22) && /TunerParams|CalcInternal/' file.log
NR==13 || NR==15 || NR==22 will execute this command only for line numbers 13, 15 & 22
/TunerParams|CalcInternal/ will search for these patterns in a line
Better to check for line numbers first to avoid regex search in each line.
line=`awk '$0 ~ /Tunerparams|CalcInternal/ {printf NR ", " }' < $INPUT_FILE | sed "s/, $//"`
echo "error: Rule 19: Tunerparams and/or Calicinternal in Script at $line"
Technical Explanation
Use awk to search for Tunerparams or CalcInternal in $INPUT_FILE. Print NR, the line number, each time a match is made. Append a ", ". Pipe the output to sed to trim the last comma. $line now has the comma-delimited list of numbers. So simply echo it out.
I noticed there is a "´" in your echo statement which probably doesn't belong.

How to verify information using standard linux/unix filters?

I have the following data in a Tab delimited file:
_ DATA _
Col1 Col2 Col3 Col4 Col5
blah1 blah2 blah3 4 someotherText
blahA blahZ blahJ 2 someotherText1
blahB blahT blahT 7 someotherText2
blahC blahQ blahL 10 someotherText3
I want to make sure that the data in 4th column of this file is always an integer. I know how to do this in perl
Read each line, Store value of 4th column in a variable
check if that variable is an integer
if above is true, continue the loop
else break out of the loop with message saying file data not correct
But how would I do this in a shell script using standard linux/unix filter? My guess would be to use grep, but I am not sure how?
cut -f4 data | LANG=C grep -q '[^0-9]' && echo invalid
LANG=C for speed
-q to quit at first error in possible long file
If you need to strip the first line then use tail -n+2 or you could get hacky and use:
cut -f4 data | LANG=C sed -n '1b;/[^0-9]/{s/.*/invalid/p;q}'
awk is the tool most naturally suited for parsing by columns:
awk '{if ($4 !~ /^[0-9]+$/) { print "Error! Column 4 is not an integer:"; print $0; exit 1}}' data.txt
As you get more complex with your error detection, you'll probably want to put the awk script in a file and invoke it with awk -f verify.awk data.txt.
Edit: in the form you'd put into verify.awk:
{
if ($4 !~/^[0-9]+$/) {
print "Error! Column 4 is not an integer:"
print $0
exit 1
}
}
Note that I've made awk exit with a non-zero code, so that you can easily check it in your calling script with something like this in bash:
if awk -f verify.awk data.txt; then
# action for success
else
# action for failure
fi
You could use grep, but it doesn't inherently recognize columns. You'd be stuck writing patterns to match the columns.
awk is what you need.
I can't upvote yet, but I would upvote Jefromi's answer if I could.
Sometimes you need it BASH only, because tr, cut & awk behave differently on Linux/Solaris/Aix/BSD/etc:
while read a b c d e ; do [[ "$d" =~ ^[0-9] ]] || echo "$a: $d not a numer" ; done < data
Edited....
#!/bin/bash
isdigit ()
{
[ $# -eq 1 ] || return 0
case $1 in
*[!0-9]*|"") return 0;;
*) return 1;;
esac
}
while read line
do
col=($line)
digit=${col[3]}
if isdigit "$digit"
then
echo "err, no digit $digit"
else
echo "hey, we got a digit $digit"
fi
done
Use this in a script foo.sh and run it like ./foo.sh < data.txt
See tldp.org for more info
Pure Bash:
linenum=1; while read line; do field=($line); if ((linenum>1)); then [[ ! ${field[3]} =~ ^[[:digit:]]+$ ]] && echo "FAIL: line number: ${linenum}, value: '${field[3]}' is not an integer"; fi; ((linenum++)); done < data.txt
To stop at the first error, add a break:
linenum=1; while read line; do field=($line); if ((linenum>1)); then [[ ! ${field[3]} =~ ^[[:digit:]]+$ ]] && echo "FAIL: line number: ${linenum}, value: '${field[3]}' is not an integer" && break; fi; ((linenum++)); done < data.txt
cut -f 4 filename
will return the fourth field of each line to stdout.
Hopefully that's a good start, because it's been a long time since I had to do any major shell scripting.
Mind, this may well not be the most efficient compared to iterating through the file with something like perl.
tail +2 x.x | sort -n -k 4 | head -1 | cut -f 4 | egrep "^[0-9]+$"
if [ "$?" == "0" ]
then
echo "file is ok";
fi
tail +2 gives you all but the first line (since your sample has a header)
sort -n -k 4 sorts the file numerically on the 4th column, letters will rise to the top.
head -1 gives you the first line of the file
cut -f 4 gives you the 4th column, of the first line
egrep "^[0-9]+$" checks if the value is a number (integers in this case).
If egrep finds nothing, $? is 1, otherwise it's 0.
There's also:
if [ `tail +2 x.x | wc -l` == `tail +2 x.x | cut -f 4 | egrep "^[0-9]+$" | wc -l` ] then
echo "file is ok";
fi
This will be faster, requiring two simple scans through the file, but it's not a single pipeline.
#OP, use awk
awk '$4+0<=0{print "not ok";exit}' file

Resources