Compare floating point numbers using regex in Bash

I need to check if the current NTP offset is bigger than 2.xxxxx, where xxx is any digits,
for example 2.005596757 or 2.006086349:
offset=$(ntpdate -q 1.2.3.4 | head -1 | cut -d " " -f 6 | sed "s/.$//")
echo $offset
The current offset variable is 0.841816, so I need to check whether X.XXXXXX is greater than or equal to 2.XXXXXXXXX, where X is any digit in the range [0-9].
offset=$(ntpdate -q 10.160.82.10 | head -1 | cut -d " " -f 6 | sed "s/.$//")
if [ $offset -ge ([2]+\.?[0-9]*) ]
then
echo "offset too high"
fi
But I'm getting this error:
./1.sh: line 9: syntax error near unexpected token `('
./1.sh: line 9: `if [ $offset -ge ([2]+\.?[0-9]*)|([0-9]*\.[0-9]+) ]'

Why don't you compare only the integer part? Like:
if [ ${offset%.*} -ge 2 ]; then
echo offset too high
fi
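For context, ${offset%.*} is Bash parameter expansion: it strips the shortest suffix matching .*, leaving just the integer part. A quick check with a made-up value:
offset=2.005596757
echo "${offset%.*}"   # prints: 2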

For comparing floating point numbers it is better to use awk or perl, as Bash can only handle integer arithmetic.
You may consider this awk solution that eliminates head, cut and sed as a bonus:
if ntpdate -q 1.2.3.4 | awk 'NR == 1 && $6 < 2 {exit 1} {exit 0}'; then
echo 'offset too high'
fi
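If you would rather keep the existing pipeline and only change the test, bc can compare floats; a minimal sketch, assuming offset already holds the parsed value:
# bc prints 1 when the relation holds, 0 otherwise
if [ "$(echo "$offset >= 2" | bc -l)" -eq 1 ]; then
    echo 'offset too high'
fi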

Related

Arithmetic operation fails in Shell script

Basically, I'm trying to check whether there are any 200 HTTP responses in the last 3 lines of the log, but I'm getting the error below because the head command is failing. Please help.
LINES=`cat http_access.log |wc -l`
for i in $LINES $LINES-1 $LINES-2
do
echo "VALUE $i"
head -$i http_access.log | tail -1 > holy.txt
temp=`cat holy.txt| awk '{print $9}'`
if [[ $temp == 200 ]]
then
echo "line $i has 200 code at "
cat holy.txt | awk '{print $4}'
fi
done
Output:
VALUE 18
line 18 has 200 code at [21/Jan/2018:15:34:23
VALUE 18-1
head: invalid trailing option -- -
Try `head --help' for more information.
Use $((...)) to perform arithmetic.
for i in $((LINES)) $((LINES-1)) $((LINES-2))
Without it, it's attempting to run the commands:
head -18 http_access.log
head -18-1 http_access.log
head -18-2 http_access.log
The latter two are errors.
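For example, with LINES=18 the corrected list expands as intended; a quick sketch:
LINES=18
for i in $((LINES)) $((LINES-1)) $((LINES-2)); do
    echo "head -$i http_access.log"   # 18, 17, 16 - all valid head invocations
done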
A more flexible way to write the for loop would be using C-style syntax:
for ((i = LINES - 2; i <= LINES; ++i)); do
...
done
You got the why from JohnKugelman's answer; I'll just propose simplified code that might work for you:
while read -ra fields; do
[[ ${fields[9]} = 200 ]] && echo "Line ${fields[0]} has 200 code: ${fields[4]}"
done < <(cat -n http_access.log | tail -n 3 | tac)
cat -n: Numbers lines of the file
tail -n 3: Prints 3 last lines. You can just change this number for more lines
tac: Prints the lines outputted by tail in reversed order
read -ra fields: Reads the whitespace-separated fields of each line into an array named fields (see the quick check after this list)
${fields[0]}: The line number
${fields[num_of_field]}: Individual fields
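To make the zero-based indexing concrete, here is a quick check against a hypothetical numbered log line:
read -ra fields <<< '18 1.2.3.4 - - [21/Jan/2018:15:34:23 +0000] "GET / HTTP/1.1" 200 512'
echo "${fields[0]}"   # 18 - the line number prepended by cat -n
echo "${fields[9]}"   # 200 - the status code, shifted one field by the line number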
You can also use wc instead of numbering using cat -n. For larger inputs, this will be slightly faster:
lines=$(wc -l < http_access.log)
while read -ra fields; do
[[ ${fields[8]} = 200 ]] && echo "Line $lines has 200 code: ${fields[3]}"
((lines--))
done < <(tail -n 3 http_access.log | tac)

Convert Floating Point Number To Integer Via File Read

I'm trying to get this to work when the "line" is in the format ###.###
Example line of data:
Query_time: 188.882
Current script:
#!/bin/bash
while read line; do
if [ $(echo "$line" | cut -d: -f2) -gt 180 ];
then
echo "Over 180"
else
echo "Under 180"
fi
done < test_file
Errors I get:
./calculate: line 4: [: 180.39934: integer expression expected
If you have:
line='Query_time: 188.882'
This expression:
$(echo "$line" | cut -d: -f2) -gt 180
will give an invalid arithmetic operator error, since Bash cannot handle floating point numbers.
You can use this awk command:
awk -F ':[ \t]*' '{print ($2 > 180 ? "above" : "under")}' <<< "$line"
above
You can use this awk:
$ echo Query_time: 188.882 | awk '{ print ($2>180?"Over ":"Under ") 180 }'
Over 180
It takes the second space-delimited field ($2) and, using the ternary conditional operator, prints whether it was over or under (i.e. less than or equal to) 180.
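bc is another portable option if you prefer staying in the shell; a minimal sketch, assuming line holds the sample record:
line='Query_time: 188.882'
value=${line#*: }    # parameter expansion: strip everything through ': '
if [ "$(echo "$value > 180" | bc -l)" -eq 1 ]; then
    echo "Over 180"
else
    echo "Under 180"
fi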

Receiving conditional operator expected error in bash

Please excuse this extremely inefficient script, I am new to shell scripting. I am receiving an error near the if clause in the function matchFS(). I have posted the error down below. Can anyone offer me some guidance?
#!/bin/bash
function matchFS() {
usage=$(df -h | tail -n +2 | awk '{print $5}' | sed 's/%//g')
usagearr=( $usage )
for i in "${usagearr[@]}"
do
if [[ $1 eq "${usagearr[$i]}" ]]; then
# print matching row from df -h
fi
done
}
usage=$(df -h | tail -n +2 | awk '{print $5}' | sed 's/%//g')
usagearr=( $usage )
len=${#usagearr[@]}
for (( i=0; i<$len; i++ )) # we have to use (( )) here to represent the c style for loop
do
if [ "${usagearr[$i]}" -gt "10" ]; then
matchFS ${usagearr[$i]}
fi
done
Error: line 13: conditional binary operator expected
line 13: syntax error near `eq'
line 13: `if [[ $1 eq "49 ]]; then'
If you look at help test you'll quickly realize that eq is not one of the choices. At least, not without adding something else to it.
#!/bin/bash
function matchFS() {
### duplicate definition, these are already known to the function.
usage=$(df -h | tail -n +2 | awk '{print $5}' | sed 's/%//g')
usagearr=( $usage )
### you probably did want to use another variable here,
### because the "i" is also shared with the caller
for i in "${usagearr[@]}"
do
### -eq instead of eq
if [[ $1 -eq "${usagearr[$i]}" ]]; then
### the if statement cannot be empty, so use a no-op placeholder
: # print matching row from df -h
fi
done
}
usage=$(df -h | tail -n +2 | awk '{print $5}' | sed 's/%//g')
usagearr=( $usage )
len=${#usagearr[@]}
for (( i=0; i<$len; i++ )) # we have to use (( )) here to represent the c style for loop
do
if [ "${usagearr[$i]}" -gt "10" ]; then
matchFS ${usagearr[$i]}
fi
done
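A quick sanity check of the corrected operator, with a made-up value:
val=49
[[ $val -eq 49 ]] && echo matched   # -eq compares integers; prints: matched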

Problem with floating point comparison

I am trying to check if a value I read from a text file is zero:
[[ $(echo $line | cut -d" " -f5) -gt 0 ]] && [[ $(echo $line | cut -d" " -f7 | bc -l) -eq 0 ]]
With the first condition there is no problem because f5 are integers. The problem comes form the second condition. I receive this error message:
[[: 1.235: syntax error: invalid arithmetic operator (error token is ".235")
I have tried several suggestions I found in different forums, such as using echo $line | cut -d" " -f7 | bc -l with and without double quotes, etc. However, the error persists. f7 is a positive number given with 3 decimal places. Removing decimals or approximating is not an option because I need the result to be exactly zero (0.000).
Generally, you can't compare floating-point numbers for equality. This is because the binary representation of decimal numbers is not precise and you get rounding errors. This is the standard answer that most others will give you.
In this specific case, you don't actually need to compare floating-point numbers, because you're just testing whether some text represents a specific number. Since you're in shell, you can either use a regular string compare against "0.000" - assuming your data is rounded in that way - or use regular expressions with grep/egrep. Something like
egrep -q '^0(\.0+)?$'
Will match 0, 0.0, 0.00, etc, and will exit indicating success or failure, which you can use in the surrounding if statement:
if cut and pipe soup | egrep ... ; then
...
fi
Use a string comparison instead. Replace:
-eq 0
with:
= '0.000'
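Applied to the question's pipeline, the second condition becomes a plain string compare; a sketch (bc is no longer needed):
[[ "$(echo $line | cut -d" " -f7)" = '0.000' ]] && echo "field 7 is exactly zero"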
Script section from comment:
for clus in $(ls *.cluster) ; do
while read line ; do
if [[ $(echo $line | cut -d" " -f11) -gt 0 ]] && [[ "$(echo $line | cut -d" " -f15 | bc -l)" = '0.000' ]] ; then
cat $(echo $line | cut -d" " -f6).pdb >> test/$(echo $line | cut -d" " -f2)_pisa.pdb
fi
done < $clus
done
My pseudo-Python interpretation:
for clus in *.cluster:
for line in clus:
fields = line.split(' ')
# field numbers are counting from 1 as in cut
if int(field 11) > 0 and str(field 15) == '0.000':
fin_name = (field 6) + '.pdb'
fout_name = (field 2) + '_pisa.pdb'
cat fin_name >> fout_name
Is that what you intended?
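If that reading is right, the repeated cut calls can be replaced by reading each line into an array; a sketch, untested against the real data (cut counts fields from 1, Bash arrays from 0):
for clus in *.cluster; do
    while read -ra f; do
        if [[ ${f[10]} -gt 0 && ${f[14]} = '0.000' ]]; then
            cat "${f[5]}.pdb" >> "test/${f[1]}_pisa.pdb"
        fi
    done < "$clus"
done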

How to verify information using standard linux/unix filters?

I have the following data in a Tab delimited file:
_ DATA _
Col1 Col2 Col3 Col4 Col5
blah1 blah2 blah3 4 someotherText
blahA blahZ blahJ 2 someotherText1
blahB blahT blahT 7 someotherText2
blahC blahQ blahL 10 someotherText3
I want to make sure that the data in the 4th column of this file is always an integer. I know how to do this in Perl:
Read each line, Store value of 4th column in a variable
check if that variable is an integer
if above is true, continue the loop
else break out of the loop with message saying file data not correct
But how would I do this in a shell script using standard Linux/Unix filters? My guess would be to use grep, but I am not sure how.
cut -f4 data | LANG=C grep -q '[^0-9]' && echo invalid
LANG=C for speed
-q to quit at the first offending value in a possibly long file
If you need to strip the first line then use tail -n+2 or you could get hacky and use:
cut -f4 data | LANG=C sed -n '1b;/[^0-9]/{s/.*/invalid/p;q}'
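Putting the pieces together, a dry run against the sample file (assuming it is saved as data):
tail -n +2 data | cut -f4 | LANG=C grep -q '[^0-9]' && echo invalid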
awk is the tool most naturally suited for parsing by columns:
awk '{if ($4 !~ /^[0-9]+$/) { print "Error! Column 4 is not an integer:"; print $0; exit 1}}' data.txt
As you get more complex with your error detection, you'll probably want to put the awk script in a file and invoke it with awk -f verify.awk data.txt.
Edit: in the form you'd put into verify.awk:
{
if ($4 !~/^[0-9]+$/) {
print "Error! Column 4 is not an integer:"
print $0
exit 1
}
}
Note that I've made awk exit with a non-zero code, so that you can easily check it in your calling script with something like this in bash:
if awk -f verify.awk data.txt; then
# action for success
else
# action for failure
fi
You could use grep, but it doesn't inherently recognize columns. You'd be stuck writing patterns to match the columns.
awk is what you need.
I can't upvote yet, but I would upvote Jefromi's answer if I could.
Sometimes you need it in pure Bash, because tr, cut & awk behave differently on Linux/Solaris/AIX/BSD/etc.:
while read a b c d e ; do [[ "$d" =~ ^[0-9]+$ ]] || echo "$a: $d not a number" ; done < data
#!/bin/bash
# Note the inverted convention: isdigit returns 0 (shell "true")
# when the argument is NOT a non-empty string of digits.
isdigit ()
{
    [ $# -eq 1 ] || return 0          # wrong number of args: treat as non-digit
    case $1 in
        *[!0-9]*|"") return 0;;       # contains a non-digit, or is empty
        *) return 1;;                 # all digits
    esac
}
while read line
do
col=($line)
digit=${col[3]}
if isdigit "$digit"
then
echo "err, no digit $digit"
else
echo "hey, we got a digit $digit"
fi
done
Use this in a script foo.sh and run it like ./foo.sh < data.txt
See tldp.org for more info
Pure Bash:
linenum=1
while read line; do
    field=($line)
    if ((linenum > 1)); then
        [[ ! ${field[3]} =~ ^[[:digit:]]+$ ]] &&
            echo "FAIL: line number: ${linenum}, value: '${field[3]}' is not an integer"
    fi
    ((linenum++))
done < data.txt
To stop at the first error, append && break to the echo line:
        [[ ! ${field[3]} =~ ^[[:digit:]]+$ ]] &&
            echo "FAIL: line number: ${linenum}, value: '${field[3]}' is not an integer" && break
cut -f 4 filename
will return the fourth field of each line to stdout.
Hopefully that's a good start, because it's been a long time since I had to do any major shell scripting.
Mind, this may well not be the most efficient compared to iterating through the file with something like perl.
tail +2 x.x | sort -n -k 4 | head -1 | cut -f 4 | egrep "^[0-9]+$"
if [ "$?" == "0" ]
then
echo "file is ok";
fi
tail +2 gives you all but the first line (since your sample has a header)
sort -n -k 4 sorts the file numerically on the 4th column, letters will rise to the top.
head -1 gives you the first line of the file
cut -f 4 gives you the 4th column, of the first line
egrep "^[0-9]+$" checks if the value is a number (integers in this case).
If egrep finds nothing, $? is 1, otherwise it's 0.
There's also:
if [ `tail +2 x.x | wc -l` == `tail +2 x.x | cut -f 4 | egrep "^[0-9]+$" | wc -l` ]; then
echo "file is ok";
fi
This will be faster, requiring two simple scans through the file, but it's not a single pipeline.
@OP, use awk:
awk '$4+0<=0{print "not ok";exit}' file
