Bash and awk to do division - bash

I am trying to do a simple division between two integers that should result in a float. I do not want to use bc. This approach works for me for a different purpose with slightly different syntax, but I am not quite sure where I am messing up here. I am positive that the variables are getting assigned correctly, yet I get an error once I try to do the division, and nothing actually gets assigned to the variable. Can anyone help?
Thanks in advance!
rate=`awk '{ shared = "'"${tempRatioArray[0]}"'"; total = "'"${tempRatioArray[1]}"'";\
printf "%3.0f\t", shared/total }' | awk '{print}'`

You can use bc:
bc -l <<<"scale=3; 5/2"
2.500
Adapting to your code:
bc -l <<< "scale=3; ${tempRatioArray[0]} / ${tempRatioArray[1]}"

That is not the correct way of using shell variables in awk, and you don't need two awk commands.
Use it like this:
rate=$(awk -v shared="${tempRatioArray[0]}" -v total="${tempRatioArray[1]}" 'BEGIN {
printf "%.3f", (shared/total) }')

Related

Awk multiplication gives me a different value than the normal multiplication of 2 numbers [duplicate]

I have a pipe delimited feed file which has several fields. Since I only need a few, I thought of using awk to capture them for my testing purposes. However, I noticed that printf changes the value if I use "%d". It works fine if I use "%s".
Feed File Sample:
[jaypal:~/Temp] cat temp
302610004125074|19769904399993903|30|15|2012-01-13 17:20:02.346000|2012-01-13 17:20:03.307000|E072AE4B|587244|316|13|GSM|1|SUCC|0|1|255|2|2|0|213|2|0|6|0|0|0|0|0|10|16473840051|30|302610|235|250|0|7|0|0|0|0|0|10|54320058002|906|722310|2|0||0|BELL MOBILITY CELLULAR, INC|BELL MOBILITY CELLULAR, INC|Bell Mobility|AMX ARGENTINA SA.|Claro aka CTI Movil|CAN|ARG|
I am interested in capturing the second column which is 19769904399993903.
Here are my tests:
[jaypal:~/Temp] awk -F"|" '{printf ("%d\n",$2)}' temp
19769904399993904 # Value is changed
However, the following two tests works fine -
[jaypal:~/Temp] awk -F"|" '{printf ("%s\n",$2)}' temp
19769904399993903 # Value remains same
[jaypal:~/Temp] awk -F"|" '{print $2}' temp
19769904399993903 # Value remains same
So is this a limitation of "%d", not being able to handle long integers? If that's the case, why would it add one to the number instead of, say, truncating it?
I have tried this with BSD and GNU versions of awk.
Version Info:
[jaypal:~/Temp] gawk --version
GNU Awk 4.0.0
Copyright (C) 1989, 1991-2011 Free Software Foundation.
[jaypal:~/Temp] awk --version
awk version 20070501
Starting with GNU awk 4.1, you can use --bignum or -M:
$ awk 'BEGIN {print 19769904399993903}'
19769904399993904
$ awk --bignum 'BEGIN {print 19769904399993903}'
19769904399993903
§ Command-Line Options
I believe the underlying numeric format in this case is an IEEE double, so the changed value is a result of floating-point precision errors. If it is actually necessary to treat the large values as numbers and to maintain accurate precision, it might be better to use something like Perl, Ruby, or Python, which have the capability (possibly via extensions) to handle arbitrary-precision arithmetic.
UPDATE: Recent versions of GNU awk support arbitrary precision arithmetic. See the GNU awk manual for more info.
ORIGINAL POST CONTENT:
XMLgawk supports arbitrary precision arithmetic on floating-point numbers.
So, if installing xgawk is an option:
zsh-4.3.11[drado]% awk --version |head -1; xgawk --version | head -1
GNU Awk 4.0.0
Extensible GNU Awk 3.1.6 (build 20080101) with dynamic loading, and with statically-linked extensions
zsh-4.3.11[drado]% awk 'BEGIN {
x=665857
y=470832
print x^4 - 4 * y^4 - 4 * y^2
}'
11885568
zsh-4.3.11[drado]% xgawk -lmpfr 'BEGIN {
MPFR_PRECISION = 80
x=665857
y=470832
print mpfr_sub(mpfr_sub(mpfr_pow(x, 4), mpfr_mul(4, mpfr_pow(y, 4))), 4 * y^2)
}'
1.0000000000000000000000000
This was already partially answered by #Mark Wilkins and #Dennis Williamson, but I found out that the largest integer that can be handled without losing precision is 2^53, since awk stores numbers as 64-bit doubles.
See the gawk reference page:
http://www.gnu.org/software/gawk/manual/gawk.html#Integer-Programming
(Sorry if my answer is too old. Figured I'd still share for the next person before they spend too much time on this like I did.)
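For illustration, you can see that boundary directly with a stock double-precision awk: every integer up to 2^53 is exact, and the very next one collapses back down:

$ awk 'BEGIN { print 2^53 - 1, 2^53, 2^53 + 1 }'
9007199254740991 9007199254740992 9007199254740992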
You're running into awk's floating-point representation issues. I don't think you can find a workaround within the awk framework to perform arithmetic on huge numbers accurately.
The only possible (and crude) way I can think of is to break the huge number into smaller chunks, perform your math, and join them again; or better yet, use Perl/PHP/Tcl/bsh etc. scripting languages that are more powerful than awk for this.
Using nawk on Solaris 11, I convert the number to a string by concatenating a null (empty) string to the end, and then use %15s as the format string:
printf("%15s\n", bignum "")
Another caveat about precision: the errors pile up with extra operations:
echo 19769904399993903 | mawk2 '{ CONVFMT = "%.2000g";
OFMT = "%.20g";
} {
print;
print +$0;
print $0/1.0
print $0^1.0;
print exp(-log($0))^-1;
print exp(1*log($0))
print sqrt(exp(exp(log(20)-log(10))*log($0)))
print (exp(exp(log(6)-log(3))*log($0)))^2^-1
}'
19769904399993903
19769904399993904
19769904399993904
19769904399993904
19769904399993912
19769904399993908
19769904399993628 <<<---- -275
19769904399993768 <<<---- -135
The first few are only off by less than 10; the last two equations have triple-digit deltas.
For any of the versions that require calling helper math functions, simply getting the -M bignum flag is insufficient. One must also set the PREC variable.
For this example, setting PREC=64 and OFMT="%.17g" should suffice.
Beware of setting OFMT too high relative to PREC, otherwise you'll see oddities like this:
gawk -M -v PREC=256 -e '{ CONVFMT="%.2000g"; OFMT="%.80g";... } '
19769904399993903
19769904399993903.000000000000000000000000000000000000000000000000000000000003734
19769904399993903.000000000000000000000000000000000000000000000000000000000003734
19769904399993903.000000000000000000000000000000000000000000000000000000000003734
19769904399993903.000000000000000000000000000000000000000000000000000000000003734
since 80 significant digits require a precision of at least 265.75 bits, so basically 266 bits. But gawk is fast enough that you can probably safely pre-set it at PREC=4096 or 8192 instead of having to worry about it every time.
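A minimal complete version of that idea, assuming a gawk built with MPFR/GMP support (which is what provides -M):

$ echo 19769904399993903 | gawk -M -v PREC=256 '{ OFMT = "%.17g"; print $0 + 0 }'
19769904399993903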

awk-if with variables doesn't work properly sometimes

In a text file, I want to classify my data according to its range.
For example, 8.1, 9.1, and 9.9 are all in [8,10). I used variables called left and right in place of 8 and 10 in the awk if condition, but it doesn't work properly.
My data like this:
9.1 aa
9.2 bb
10.1 cc
11.9 dd
Then my scripts like this:
left=8;right=10 #left=10;right=12
echo "["$left","$right"]:"
cat data | awk '{if(($1>="'$left'")&&($1<"'$right'")) print $2}' | xargs
The result is empty.
[8,10]:
But if I use 8 and 10 directly (without variables), it's OK. And when I use left=10, right=12, it also works properly.
I also found that with left=98, right=100, it didn't work either. So why does it sometimes not work? Thanks a lot!
With awk's option -v:
left=8;right=10
awk -v l="$left" -v r="$right" '{if($1>=l&&$1<r) print $2}' data
or with environment variables:
export left=8 right=10
awk '{if($1>=ENVIRON["left"]&&$1<ENVIRON["right"]) print $2}' data
Output:
aa
bb
You are performing string comparison but want numeric comparison: with the quotes, awk compares character by character, so "9.1" < "10" is false because "9" sorts after "1" (which is why [8,10) gives empty output while [10,12) happens to work). Just swap the quotes so the values end up unquoted numbers in the awk program:
left=8;right=10 #left=10;right=12
echo "["$left","$right"]:"
cat data | awk '{if(($1>='"$left"')&&($1<'"$right"')) print $2}' | xargs

shell: write integer division result to a variable and print floating number

I'm trying to write a shell script and plan to calculate a simple division using two variables inside the script. I couldn't get it to work; I get some kind of syntax error.
Here is part of my code, named test.sh
awk '{a+=$5} END {print a}' $variable1 > casenum
awk '{a+=$5} END {print a}' $variable2 > controlnum
score=$(echo "scale=4; $casenum/$controlnum" | bc)
printf "%s\t%s\t%.4f\n", $variable3 $variable4 $score
It's just the $score that doesn't work.
I tried to use either
sh test.sh
or
bash test.sh
but neither worked. The error message is:
(standard_in) 1: syntax error
Does anyone know how to make it work? Thanks so much!
You are outputting to files, not to variables. For this, you need var=$(command). Hence, this should do it:
casenum=$(awk '{a+=$5} END {print a}' $variable1)
controlnum=$(awk '{a+=$5} END {print a}' $variable2)
score=$(echo "scale=4; $casenum/$controlnum" | bc)
printf "%s\t%s\t%.4f\n", $variable3 $variable4 $score
Note $variable1 and $variable2 should be file names. Otherwise, indicate it.
First, your $variable1 and $variable2 must expand to the names of existing files; that's not a syntax error, though, just a fact that makes your code wrong, unless you really do mean to read files of numbers and write the sum of the fifth field to a file. Since casenum and controlnum are never assigned (you write the awk result to a file, not into a variable), your score computation expands to
score=$(echo "scale=4; /" | bc)
which is wrong (the syntax error comes from this).
Then, the same problem with $variable3 and $variable4. Are they holding a value? Have you assigned them with something like
variable=...
? Otherwise they will expand to "". Fixing these (including assigning casenum and controlnum) will fix everything, since basically the only syntax error is bc trying to interpret the command / without operands. (And the comma after the printf format string is not needed.)
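You can reproduce and fix the error in isolation (12 and 48 are just placeholder numbers):

$ echo "scale=4; /" | bc
(standard_in) 1: syntax error
$ echo "scale=4; 12/48" | bc
.2500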
The way you assign the output of execution of a command to a variable is
var=$(command)
or
var=`command`
If I understand your commands properly, you could combine calculation of score with a single awk statement as follows
score=$(awk 'NR==FNR {a+=$5; next} {b+=$5} END {printf "%.4f", a/b}' $variable1 $variable2)
This assumes that $variable1 and $variable2 are valid file names.
Refer to #fedorqui's solution if you want to stick to your approach of 2 awk and 1 bc.

how can I supply bash variables as fields for print in awk

I am currently trying to use awk to rearrange a .csv file that is similar to the following:
stack,over,flow,dot,com
and the output would be:
over,com,stack,flow,dot
(or any other order, just using this as an example)
and when it comes time to rearrange the csv file, I have been trying to use the following:
first='$2'
second='$5'
third='$1'
fourth='$3'
fifth='$4'
awk -v a=$first -v b=$second -v c=$third -v d=$fourth -v e=$fifth -F '^|,|$' '{print $a,$b,$c,$d,$e}' somefile.csv
with the intent of awk/print interpreting the $a,$b,$c,etc as field numbers, so it would come out to the following:
{print $2,$5,$1,$3,$4}
and print out the fields of the csv file in that order, but unfortunately I have not been able to get this to work correctly yet. I've tried several different methods, with this one seeming the most promising, but no solution has worked so far. Having said that, could anyone give any suggestions or point out my flaw? I am stumped at this point; any help would be much appreciated, thanks!
Use simple numbers:
first='2'
second='5'
third='1'
fourth='3'
fifth='4'
awk -v a=$first -v b=$second -v c=$third -v d=$fourth -v e=$fifth -F '^|,|$' \
'{print $a, $b, $c, $d, $e}' somefile.csv
Another way with a shorter example:
aa='$2'
bb='$1'
cc='$3'
awk -F '^|,|$' "{print $aa,$bb,$cc}" somefile.csv
You already got the answer to your specific question but have you considered just specifying the order as a string instead of each individual field? For example:
order="2 5 1 3 4"
awk -v order="$order" '
BEGIN{ FS=OFS=","; n=split(order,a," ") }
{ for (i=1;i<n;i++) printf "%s%s",$(a[i]),OFS; print $(a[i]) }
' somefile.csv
That way if you want to add/delete fields or change the order you just trivially rearrange the numbers in the first line instead of having to mess with a bunch of hard-coded variables, etc.
Note that I changed your FS, as there was no need for it to be that complicated. Also, you don't need the shell variable "order"; you could just populate the awk variable of the same name explicitly. I only started with the shell variable since you had started with shell variables, so maybe you have a reason for it.
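For example, running it on the sample line from the question with order="2 5 1 3 4" gives the requested rearrangement:

echo 'stack,over,flow,dot,com' | awk -v order="2 5 1 3 4" '
BEGIN{ FS=OFS=","; n=split(order,a," ") }
{ for (i=1;i<n;i++) printf "%s%s",$(a[i]),OFS; print $(a[i]) }'
over,com,stack,flow,dot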

BashScripting: Reading out a specific variable

My question is actually rather easy, but I suck at bash scripting and Google was no help either. So here is the problem:
I have an executable that writes a few variables to stdout, something like this:
MrFoo:~$ ./someExec
Variable1=5
Another_Weird_Variable=12
VARIABLENAME=42
What I want to do now is read in a specific one of these variables (I already know its name), store the value, and pass it as an argument to another executable.
So, a simple call like
./function2 5 // which comes from ./function2 Variable1 from above
I hope you understand the problem and can help me with it.
With awk you can do something like this (this passes the value of the 1st variable):
./someExec | awk -F= 'NR==1{system("./function2 " $2)}'
or
awk -F= 'NR==1{system("./function2 " $2)}' <(./someExec)
Easiest way to go is probably to use a combination of shell and perl or ruby. I'll go with perl since it's what I cut my teeth on. :)
someExec.sh
#!/bin/bash
echo Variable1=5
echo Another_Weird_Variable=12
echo VARIABLENAME=42
my_shell_script.sh
#!/bin/bash
myVariable=`./someExec | perl -wlne 'print $1 if /Variable1=(.*)/'`
echo "Now call ./function2 $myVariable"
[EDIT]
Or awk, as Jaypal pointed out 58 seconds before I posted my answer. :) Basically, there are a lot of good solutions. Most importantly, though, make sure you handle both security and error cases properly. In both of the solutions so far, we're assuming that someExec will provide guaranteed well-formed and innocuous output. But, consider if someExec were compromised and instead provided output like:
./someExec
5 ; rm -rf / # Uh oh...
You can use awk like this:
./function2 $(./someExec | awk -F "=" '/Variable1/{print $2}')
which is equivalent to:
./function2 5
If you can make sure someExec's output is safe, you can use eval.
eval $(./someExec)
./function2 $Variable1
You can use this very simple and straightforward way:
./someExec | grep "Variable1" | awk -F "=" '{print $2}'
If you want to use only one variable from the output, use the following:
eval $(./someExec | grep 'Variable1')
./function2 $Variable1
And if you want to use all the variables from the output, use
eval $(./someExec)
./function2 $<VARIABLE_NAME>
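If you would rather avoid eval entirely (see the security caveat above), here is a small sketch that extracts just the one variable by exact name and passes it along; the value=... variable name is only illustrative:

value=$(./someExec | awk -F= '$1 == "Variable1" { print $2; exit }')
./function2 "$value"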
