Bash Math Oddity (Floating Point Division) - bash

So I'm having some trouble with bash / bc math here..
I'm trying to print the file size of a backup after I move it to my Google Drive via rclone. I get the file size via an rclone ls call with awk '{print $1}', which works great.
In my specific example, I get the value of 1993211 (bytes).
In my printing code I then try to divide this by 1048576 to get it into MB, which should give me 1.9 MB.
However,
$ expr 1993211 / 1048576 | bc -l
prints 1
I've tried various other math options listed here (including via Python / Node) and I always get 1 or 1.0. How is this possible?
The calculation should be 1993211 / 1048576 = 1.90087413788
Any idea what's going on here?

That's because expr does integer division.
To get floating-point division you could run:
bc -l <<< '1993211 / 1048576'
which returns: 1.90087413787841796875
or you can set the number of decimals using scale:
bc -l <<< 'scale=5; 1993211 / 1048576'
which returns: 1.90087
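Note that scale truncates rather than rounds; if you just want a rounded 1.9, one small option on top of the bc call above is to let printf do the rounding:
printf '%.1f\n' "$(bc -l <<< '1993211 / 1048576')"    # -> 1.9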

In the command expr 1993211 / 1048576 | bc -l, expr divides 1993211 by 1048576 using integer division ('cause that's what expr knows how to do), gets "1" as the result, and prints it. bc -l receives that "1" as input, and since there's no operation specified (expr already did that), it just prints it.
What you want is to pass the expression "1993211 / 1048576" directly as input to bc -l:
$ echo "1993211 / 1048576" | bc -l
1.90087413787841796875
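For the original rclone case, the whole thing could look roughly like this (the remote and file names below are placeholders, not from the question):
# Placeholder remote/path: adjust "remote:backups" and the file name to your setup.
bytes=$(rclone ls remote:backups | awk '/backup\.tar\.gz$/ {print $1}')
mb=$(echo "scale=2; $bytes / 1048576" | bc -l)
echo "Backup size: ${mb} MB"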

Related

how to calculate percentage difference with cmp command

I am aware of the cmp command in Linux, which does a byte-by-byte comparison. Could we build upon this to get a percentage difference?
Example: I have two files, a1.jpg and a2.jpg.
When I compare these two files using cmp, could I get the percentage of difference between them?
Example: a1.jpg has 1000 bytes and a2.jpg has 1021 (taking the bigger file as reference).
So I could get the percentage difference between the two files, i.e. number of bytes differing / total bytes in the larger file.
Looking for a shell script snippet. Thanks in advance.
You could create a script file with the following content - let us call this file percmp.sh:
#!/bin/sh
DIFF=$(cmp -l "$1" "$2" | wc -l)
SIZE_A=$(wc -c "$1" | awk '{print $1}')
SIZE_B=$(wc -c "$2" | awk '{print $1}')
if [ "$SIZE_A" -gt "$SIZE_B" ]
then
    MAX=$SIZE_A
else
    MAX=$SIZE_B
fi
echo "$DIFF / $MAX * 100" | bc -l
Be sure that it is saved with Unix line endings.
Then you run it with the two file names as arguments. For example, assuming percmp.sh and the two files are in the same folder you run the command:
sh percmp.sh FILE1.jpg FILE2.jpg
Otherwise you specify the full path of both the script and the files.
The code does exactly what you need. For reference:
#!/bin/sh tells how the file should be interpreted
cmp -l lists all the differing bytes
wc -l counts lines (in the code: the length of the list of differing bytes -> the number of differing bytes)
wc -c gives the size of a file in bytes
awk text parsing (to get ONLY the size of the file)
-gt Greater Than
bc -l performs the division, with decimals
Hope I helped!
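If it helps to see what the script is counting, here is a tiny demonstration with two throwaway equal-length files:
printf 'abcd' > x.bin
printf 'abzd' > y.bin
cmp -l x.bin y.bin           # prints "3 143 172": byte 3 differs (byte values shown in octal)
cmp -l x.bin y.bin | wc -l   # prints 1, i.e. one differing byte
echo "1/4*100" | bc -l       # 25.00000000000000000000 (percent)
One caveat: for files of different sizes, cmp -l only compares up to the length of the shorter file, so trailing extra bytes in the larger file are not counted as differences.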

Convert between byte count and "human-readable" string

Is there a shell command that simply converts back and forth between a number string in bytes and the "human-readable" number string offered by some commands via the -h option?
To clarify the question: ls -l without the -h option (some output suppressed)
> ls -l
163564736 file1.bin
13209 file2.bin
gives the size in bytes, while with the -h option (some output suppressed)
> ls -lh
156M file1.bin
13K file2.bin
the size is human readable in kilobytes and megabytes.
Is there a shell command that simply turns 163564736 into 156M and 13209 into 13K and also does the reverse?
numfmt
To:
echo "163564736" | numfmt --to=iec
From:
echo "156M" | numfmt --from=iec
There is no standard (cross-platform) tool to do it, but a solution using awk is described here.
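If numfmt is not available, an awk fallback for the bytes-to-human direction could look roughly like this (my own sketch, IEC units only; the rounding differs slightly from ls -h):
tohuman() {
  awk -v b="$1" 'BEGIN {
    split("B K M G T P", u)                       # IEC-style unit suffixes
    i = 1
    while (b >= 1024 && i < 6) { b /= 1024; i++ }
    printf "%.1f%s\n", b, u[i]
  }'
}
tohuman 163564736   # -> 156.0M
tohuman 13209       # -> 12.9K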

Explanation to an assignment

I am NOT looking for an answer to this problem. I am having trouble understanding what I should be trying to accomplish in this assignment. I welcome pseudocode or hints if you would like, but what I really need is an explanation of what I need to be making and what the output should be/look like. Please do not write out a lot of code, though; I would like to try that on my own.
(()) = notes from me
The assignment is:
a program (prog.exe) ((we are given this program)) that reads 2 integers (m, n) and 1 double (a) from an input data file named input.in. For example, the given sample input.in file contains the values
5 7 1.23456789012345
when you run ./prog.exe the output is a long column of floating-point numbers
in addition to the program, there is a file called ain.in that contains a long column of double-precision values
copy prog.exe and ain.in to the working directory
Write a bash script that does the following:
-Runs ./prog.exe for all combinations of
--m=0,1,...,10
--n=0,1,...,5
--a=every value in the file ain.in
-this is essentially a triple nested loop over m,n and the ain.in values
-for each combination of m,n and ain.in value above:
-- generate the appropriate input file input.in
-- run the program and redirect the output to some temporary output file.
--extract the 37th and 51st values from this temporary output file and store these in a file called average.in
-when the 3 nested loops terminate, the average.in file should contain a long list of floating-point values
-your script should return the average of the values contained in average.in
HINTS: seq, awk, output redirection will be useful here
Thank you to whoever took the time to even read through this.
This is my second bash coding assignment and I'm still trying to get a grasp on it; a better explanation would be very helpful. Thanks again!
This is one way of generating all the input combinations without explicit loops:
join -j9 <(join -j9 <(seq 0 10) <(seq 0 5)) ain.in | cut -d' ' -f2-
The idea is to write a bash script that will test prog.exe with a variety of input conditions. This means recreating input.in and running prog.exe many times. Each time you run prog.exe, input.in should contain a different set of three numbers, e.g.,
First run:
0 0 <first line of ain.in>
Second run:
0 0 <second line of ain.in>
. . . last run:
10 5 <last line of ain.in>
You can use seq and for loops to accomplish this.
Then, you need to systematically save the output of each run, e.g.,
./prog.exe > tmp.out
# extract lines 37 and 51 and append to average.in
sed -n '37p; 51p; 51q' tmp.out >> average.in
Finally, after testing all the combinations, use awk to compute the average of all the lines in average.in.
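A sketch of the loop structure being described (assuming, as in the assignment, that prog.exe reads the three values from input.in; tmp.out is just a scratch name):
#!/bin/bash
> average.in                                    # start with an empty results file
for m in $(seq 0 10); do
  for n in $(seq 0 5); do
    while read -r a; do
      echo "$m $n $a" > input.in                # regenerate the input file
      ./prog.exe > tmp.out                      # run the program, capture its output
      sed -n '37p; 51p' tmp.out >> average.in   # keep the 37th and 51st values
    done < ain.in
  done
done
awk '{sum += $1; n++} END {print sum/n}' average.in   # print the overall average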
One-liner inspired by @karakfa:
join -j9 <(join -j9 <(seq 0 10) <(seq 0 5)) ain.in | cut -d' ' -f2- |
sed "s/.*/echo & >input.in;./prog.exe>tmp.out; sed -n '37p;51p;51q' tmp.out/" |
sh | awk '{sum+=$1; n++} END {print sum/n}'

Bash: output line count from wc in human readable format

Is that possible? Doing wc the straightforward way, I have to spend some mental energy to see that the file contains more than 40 million lines:
$ wc -l 20150210.txt
45614736 20150210.txt
I searched around and numfmt showed up, but that is evidently not available on OSX (nor on brew). So is there a simple way to do this on OSX? Thanks.
If you have a POSIX printf you can use %'d:
printf "%'d\n" $(wc -l < file )
From man printf:
'
For decimal conversion (i, d, u, f, F, g, G) the output is to be
grouped with thousands' grouping characters if the locale information
indicates any. Note that many versions of gcc(1) cannot parse this
option and will issue a warning. SUSv2 does not include %'F
Test
$ seq 100000 > a
$ printf "%'d\n" $(wc -l <a )
100,000
Note also the trick wc -l < file to get the number without the file name.
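If the output comes back without any grouping, the locale in use probably does not define a thousands separator; forcing one that does should help (assuming the en_US.UTF-8 locale is installed):
LC_NUMERIC=en_US.UTF-8 printf "%'d\n" "$(wc -l < 20150210.txt)"   # -> 45,614,736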

Argument list too long for input

I am executing a shell script; at a high level it reads the records from a CSV and then does some DB operations.
I have analyzed this by manually running the script. It runs fine for fewer than 900 records in the file, but for more than 900 records it gives the error ("Argument list too long") after some time.
There is a part of the script which picks up the records one by one.
Could you please suggest why this is happening? I have read similar topics where users got this error, but I am unable to relate them to my scenario.
Cheers
I've hit this problem before and it's quite easy to replicate:
unset a; export a=$(perl -e 'print "a"x(1024*64)'); whoami
tiago
unset a; export a=$(perl -e 'print "a"x(1024*128)'); whoami
bash: /usr/bin/whoami: Argument list too long
perl -e 'print "a"x(1024*64)' | wc -c
65536
perl -e 'print "a"x(1024*128)' | wc -c
131072
So something between 65536 and 131072 bytes breaks. When I had this problem, instead of exporting the value I printed it and used pipes to work with the data. Another way around is to use files.
You can find some nice experiments in: What is the maximum size of an environment variable value?
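A quick way to check the limit on your own system, and the shape of the workaround (process.sh and records.csv below are made-up placeholders):
getconf ARG_MAX                      # combined byte limit for arguments + environment
# Rather than exporting the CSV contents into the environment:
#   export DATA=$(cat records.csv)   # can blow past the limit above
# stream them instead:
cat records.csv | ./process.sh       # via a pipe
./process.sh < records.csv           # or via redirection / a plain file argument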
