Trying to get an average using the contents of two files - bash

So I have two files in my directory that contain a number in each of them. I want to make a script that calculates the average of these two numbers. How would I write it? Would this be correct?
avg=$((${<file1.txt}-${<file2.txt})/2)

Your example does not work. Furthermore, your formula is probably incorrect. Here are two options without unnecessary cat:
avg=$(( (`<file1.txt` + `<file2.txt`) / 2 ))
or
avg=$(( ($(<file1.txt) + $(<file2.txt)) / 2 ))
I find the first one more readable though. Also be warned: this trivial approach will cause problems when your files contain more than just the plain numbers.
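To guard against the problem mentioned above, the file contents can be checked before they reach the arithmetic expansion. A minimal sketch, assuming each file should hold exactly one integer; the helper name read_int and the temporary demo files are hypothetical and not part of the question:

```bash
#!/bin/bash
# Hypothetical helper: print the integer stored in file $1, or fail.
read_int() {
    local value
    value=$(<"$1") || return 1
    # Accept an optional minus sign and digits without leading zeros
    # (a leading zero would make bash treat the number as octal).
    if [[ $value =~ ^-?(0|[1-9][0-9]*)$ ]]; then
        printf '%s\n' "$value"
    else
        printf 'not a plain integer: %s\n' "$1" >&2
        return 1
    fi
}

# Demo files standing in for file1.txt and file2.txt.
tmpdir=$(mktemp -d)
printf '10\n' > "$tmpdir/file1.txt"
printf '21\n' > "$tmpdir/file2.txt"

if a=$(read_int "$tmpdir/file1.txt") && b=$(read_int "$tmpdir/file2.txt"); then
    avg=$(( (a + b) / 2 ))   # integer division: 31 / 2 gives 15
    echo "$avg"
fi
rm -rf "$tmpdir"
```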
EDIT:
I should have noted that the first, legacy syntax, which uses backticks (` `), is no longer recommended and should be avoided. You can read more about why here. Thanks to mklement0 for the link!
EDIT2:
According to Eric, the values are floating point numbers. You can't do this directly in bash because only integer numbers are supported. You have to use a little helper. Note that bc truncates division results to scale decimal places and scale defaults to 0, so set it explicitly:
avg=$(bc <<< "scale=2; ( $(<file1.txt) + $(<file2.txt) ) / 2")
or maybe easier to understand
avg=$(echo "scale=2; ( $(<file1.txt) + $(<file2.txt) ) / 2" | bc)
For those who might wonder what bc is (see man bc):
bc is a language that supports arbitrary precision numbers with
interactive execution of statements.
Here is another alternative since perl is usually installed by default:
avg=$(perl -e 'print( ($ARGV[0] + $ARGV[1]) / 2 )' -- $(<file1.txt) $(<file2.txt))
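awk is another helper that is available on practically every system and does floating point arithmetic natively. The following is only a sketch of that alternative (awk is not mentioned in the answer above); the two temporary files stand in for file1.txt and file2.txt:

```bash
#!/bin/bash
tmpdir=$(mktemp -d)
printf '1.5\n' > "$tmpdir/file1.txt"
printf '2.0\n' > "$tmpdir/file2.txt"

# awk reads both files in sequence, sums the first field of every line,
# and divides by the number of lines read (NR) at the end.
avg=$(awk '{ sum += $1 } END { printf "%.2f\n", sum / NR }' \
    "$tmpdir/file1.txt" "$tmpdir/file2.txt")
echo "$avg"   # prints 1.75
rm -rf "$tmpdir"
```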

You'll want to use a command substitution:
avg=$(( ($(cat file1.txt) + $(cat file2.txt)) / 2 ))
However, Bash is a pretty bad language for doing maths (at least unless it's completely integer maths). You might want to look into bc or a "real" language like Python.


Returning values from functions when efficiency matters

It seems to me, there are several ways to return a value from a Bash function.
Approach 1: Use a "local-global" variable, which is defined as local in the caller:
func1() {
a=10
}
parent1() {
local a
func1
a=$(($a + 1))
}
Approach 2: Use command substitution:
func2() {
echo 10
}
parent2() {
a=$(func2)
a=$(($a + 1))
}
How much speedup could one expect from using approach 1 over approach 2?
And, I know that it is not good programming practice to use global variables like in approach 1, but could it at some point be justified due to efficiency considerations?
The single most expensive operation in shell scripting is forking. Any operation involving a fork, such as command substitution, will be 1-3 orders of magnitude slower than one that doesn't.
For example, here's a straightforward approach for a loop that reads a bunch of generated file names of the form file-1234 and strips the file- prefix using sed, requiring a total of three forks per iteration (command substitution + two-stage pipeline):
$ time printf "file-%s\n" {1..10000} |
while read line; do n=$(echo "$line" | sed -e "s/.*-//"); done
real 0m46.847s
Here's a loop that does the same thing with parameter expansion, requiring no forks:
$ time printf "file-%s\n" {1..10000} |
while read line; do n=${line#*-}; done
real 0m0.150s
The forky version takes 300x longer.
Therefore, the answer to your question is yes: if efficiency matters, you have solid justification for factoring out or replacing forky code.
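If a global-ish variable feels too fragile but the fork of a command substitution is too slow, bash's printf -v offers a middle ground: the caller passes the name of the variable that should receive the result. A sketch under the assumption that bash (not plain sh) is used; func3 and parent3 are hypothetical names that follow the question's numbering:

```bash
#!/bin/bash
# The function writes its result into the variable whose name is in $1.
# No subshell is forked, so this costs about as much as approach 1.
func3() {
    printf -v "$1" '%s' 10
}

parent3() {
    local result
    func3 result
    result=$((result + 1))
    echo "$result"
}

parent3   # prints 11
```

One caveat: the name passed in must not clash with a local variable declared inside func3, otherwise the local is assigned instead of the caller's variable.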
When the fork count is constant with respect to the input (or it's too messy to make it constant), and the code is still too slow, that's when you should rewrite it in a faster language.
Approach 1 is surely much faster than approach 2, because it triggers no interrupt (which may in turn need several OS kernel crossings to service) and needs only a single memory access.

Beginner in shell script

I am a beginner to shell scripting, so to get used to it I am starting off with easy scripts. Trying to calculate the interest on a "principal" amount, I wrote the shell script below.
But I am getting the output (150000*0.8)/100. I thought I would get the mathematically evaluated result, which is 1200. (pr=($principal*$rof)/100)
Can anyone help me with this? What mistake have I made?
principal=150000
rof=0.8
pr=($principal*$rof)/100
echo $pr
There are a couple of issues with this piece of code. Assuming you are using bash, the correct way is shown below,
Arithmetic operations are performed with the syntax,
x=$(( a + b ))
So, for your case, it becomes,
pr=$(( (principal * rof) / 100 ))
It is not possible to perform floating point operations in bash. You can use the unix utility bc for such purposes. In your case,
pr=$(bc <<< "( $principal * $rof ) / 100")
So, your complete code now becomes,
#!/bin/bash
principal=150000
rof=0.8
pr=$(bc <<< "( $principal * $rof ) / 100")
echo "$pr"
Bash does not support floating point arithmetic (e.g. see this post), but you can scale the rate to an integer and adjust the divisor accordingly:
$> principal=150000;rof=8;pr=`expr $principal \* $rof / 1000`;echo $pr
1200
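The same scaling trick works with bash's built-in arithmetic, which avoids starting expr. A small sketch, assuming the rate is supplied in tenths of a percent (so 0.8% becomes 8); the variable name rate_tenths is made up for this example:

```bash
#!/bin/bash
principal=150000
rate_tenths=8    # 0.8 percent, expressed in tenths of a percent

# Multiply first and divide last so integer division does not
# discard precision too early: 150000 * 8 / 1000.
pr=$(( principal * rate_tenths / 1000 ))
echo "$pr"   # prints 1200
```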

Evaluating a mathematical expression stored as a string, into a single number (bash)

I am working on Mac OS X and using bash as my shell. I currently have a string which I want evaluated as a number. When I echo the string I get 1.e8*1.07**100. Is there any way to pass this string on to be evaluated as a number?
The reason it is a string to start with is that the expression was built step by step. First 1.e8*1.07**%%d is within the code, then the user inputs an integer to be used as the power that 1.07 is raised to. So in the example above, the user would have input 100, and thus the script is stuck with 1.e8*1.07**100, which is the correct expression I was hoping for, but I would have liked it to be evaluated when I echo the variable where it is stored.
Actual important bits of code:
BASE=$(printf '1.e8*1.07**%%d')
#Get user input assigned to pow
NUM=$(printf ${BASE} ${pow})
echo $NUM #1.e8*1.07**100
Thanks for any help you can offer.
[Edit: I would also like to not just echo the answer, but store it as a variable.]
How about:
python -c "print($NUM)"
By the way, you could just write
BASE="1.e8*1.07**%d"
(In fact, you don't even need the quotes.)
On most Unix-like systems you'll find a tool called bc that can perform calculations. You might need to rewrite your input though: I think it accepts ^ instead of **, and I'm not sure about the 1.e8 notation.
It happens that perl can evaluate that exact expression
$ x="1.e8*1.07**100"
$ y=$(perl -E "say $x")
$ echo $y
86771632556.6417
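Because the expression is handed to an interpreter as code, it is worth making sure that the user's input really is just an integer before substituting it. A minimal sketch, assuming the exponent arrives in the variable pow as in the question; the value 100 is a stand-in for real user input:

```bash
#!/bin/bash
pow=100   # stand-in for the value read from the user

# Reject anything that is empty or contains a non-digit character.
case $pow in
    '' | *[!0-9]*)
        echo "exponent must be a non-negative integer" >&2
        exit 1
        ;;
esac

# Build the expression only after the check has passed.
NUM=$(printf '1.e8*1.07**%s' "$pow")
echo "$NUM"   # prints 1.e8*1.07**100
```

The resulting string can then be passed to perl or python as in the answers above, and the output captured with a command substitution to store it in a variable.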

Contents of ls appearing in middle of output

I am encountering a very weird situation when wrapping a bash script call in echo $(). This is strange enough that I don't know what code to present, so I will describe the general situation. I have a script, which we will call "run.sh", and it has some output. This is generally formatted quite nicely, with whitespace and line breaks.
I am trying to compare this output with a value that I got when I ran it once previously. To do this, the code compares the "new" value with the old by checking if these two are the same, i.e.:
expression=$(./runProcess.sh "$process");
expected=$(cat UnitTests/expect-process-$process);
if [ "$expression" == "$expected" ]; then
Clearly to get a value of "old" to compare with future testings I need to compute $(./runProcess.sh) by hand. When I do this, I get a version of the output with significantly less whitespace. However it is clearly wrong, because the contents of ls turn up in the middle of it. By that I mean that I get the following type of output running these two commands:
./runProcess.sh g,g:
R2With2Gluons =
+ ncol*i_*pi_^2*A*g^2 * (
- 17/24*d_(mu1,mu2)*d_(m1,m2)*p1.p1
- 31/8*d_(mu1,mu2)*d_(m1,m2)*p1.p2
- 17/24*d_(mu1,mu2)*d_(m1,m2)*p2.p2
+ 7/12*d_(m1,m2)*p1(mu1)*p1(mu2)
+ 1/24*d_(m1,m2)*p1(mu1)*p2(mu2)
+ 89/24*d_(m1,m2)*p1(mu2)*p2(mu1)
+ 7/12*d_(m1,m2)*p2(mu1)*p2(mu2)
);
0.01 sec out of 0.01 sec
echo $(./runProcess.sh g,g):
R2With3Gluons = + coeff(m1,m2,m3)*ncol*pi_^2*A*g^3 Auto Diagrams UnitTests colourCalc.frm form.set functions.frm output.frm process.frm process.mid qgraf2form.frm qgrafProcessor.py runProcess.sh runProcesses.sh test vertices.frm ( + 35/24*d_(mu1,mu2)*p1(mu3) - 35/24*d_(mu1,mu2)*p2(mu3) - 35/24*d_(mu1,mu3)*p1(mu2) + 35/24*d_(mu1,mu3)*p3(mu2) + 35/24*d_(mu2,mu3)*p2(mu1) - 35/24*d_(mu2,mu3)*p3(mu1) ); 0.40 sec out of 0.40 sec
And here is ls:
ls:
Auto form.set process.mid runProcesses.sh
Diagrams functions.frm qgraf2form.frm test
UnitTests output.frm qgrafProcessor.py vertices.frm
colourCalc.frm process.frm runProcess.sh
I can provide exact examples if necessary, but I hope this is illuminating enough. Why could this possibly be happening? I'm using bash on OS X Mountain Lion.
Use more quotes!!!
Try:
echo "$(./run.sh)"
instead. (Yes, with quotes).
Try:
old=$(./run.sh)
echo "$old"
you'll have the correct output (with $old in quotes). Now, regarding your test, use, as advised by sampson-chen:
[[ "$old" == "$(./run.sh)" ]]
(You don't need to quote the variables or the command substitution when assigning the variable old, but, as a general rule, you can use quotes every time. See Gordon Davisson's excellent comments on this post, which I've actually upvoted, for a few caveats about globs and quoting variables inside [[ ... ]].)
Edit. As you've edited your post, I see you're using an inefficient cat. Instead of:
expected=$(cat UnitTests/expect-process-$process)
please use
expected=$(< "UnitTests/expect-process-$process")
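Why the directory listing shows up: an unquoted expansion undergoes word splitting and pathname expansion, so a free-standing * in the output is replaced by the file names in the current directory, and runs of whitespace (including newlines) collapse into single spaces. A small reproduction, using a hypothetical temporary directory and made-up output text:

```bash
#!/bin/bash
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch alpha.txt beta.txt

# Made-up stand-in for the script output; note the free-standing *.
output='a * (
  b
)'

unquoted=$(echo $output)     # split into words, * expanded by the shell
quoted=$(echo "$output")     # passed through unchanged

echo "$unquoted"   # prints: a alpha.txt beta.txt ( b )
echo "$quoted"     # prints the original three lines

cd - > /dev/null && rm -rf "$tmpdir"
```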
It's hard to say without your exact script, but for starters, your comparison:
old == $(./run.sh);
should be:
if [[ "$old" == "$(./run.sh)" ]]; then

Bash/batch multiple file, single folder, incremental rename script; user provided filename prefix parameter

I have a folder of files which need to be renamed.
Instead of a simple incremental numeric rename, I need to first provide a naming convention, which will then increment in order to ensure file name integrity within the folder.
Say I have files:
wei12346.txt
wifr5678.txt
dkgj5678.txt
which need to be renamed to:
Eac-345-018.txt
Eac-345-019.txt
Eac-345-020.txt
Each time I run the script the naming could be different, and the numeric increment that goes with it may also be different:
Ebc-345-010.pdf
Ebc-345-011.pdf
Ebc-345-012.pdf
So I need to ask the user for a parameter. I was thinking it might be useful to take the previous file name in the list of files to be indexed, e.g. Eac-345-017.txt.
The other thing I am unsure about with the increment is how the script would deal with incrementing 099 to 100 or 999 to 1000, as I am not aware of how this process is carried out.
I have been told that this is an easy script in Perl; however, I am running Cygwin on a Windows machine at work and have access to only bash and Windows shells in order to execute the script.
Any pointers to get me going would be greatly appreciated. I have some experience programming, but scripting is almost entirely new to me.
Thanks,
Craig
(I understand there are a lot of posts on this type of thing already, but none seem to offer a concise answer, hence my question.)
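#!/bin/bash
# Usage: fancy_rename.sh PREFIX START STEP FILE...
prefix="$1"
shift
base_n="$1"
shift
step="$1"
shift
n=$base_n
# "$@" expands to the remaining arguments, i.e. the files to rename.
for file in "$@" ; do
formatted_n=$(printf "%03d" "$n")
# Re-use the original file extension while we're at it.
mv "$file" "${prefix}-${formatted_n}.${file##*.}"
n=$((n + step))
done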
Save the file, invoke it like this:
bash fancy_rename.sh Ebc-345 10 1 /path/to/files/*
Note: In your example you "renamed" a .txt to a .pdf, but above I presumed the extension would stay the same. If you really wanted to just change the extension then it would be a trivial change. If you wanted to actually convert the file format then it would be a little more complex.
Note also that I have formatted the incrementing number with %03d. This means that your number sequence will be e.g.
010
011
012
...
099
100
101
...
999
1000
Meaning that it will be zero padded to three places but will automatically overflow if the number is larger. If you prefer consistency (always 4 digits) you should change the padding to %04d.
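The question proposed passing the previous file name (e.g. Eac-345-017.txt) instead of separate prefix and number arguments. A sketch of how that name could be taken apart with parameter expansion, assuming the name always has the form PREFIX-NUMBER.EXT; the variable names are illustrative only:

```bash
#!/bin/bash
last="Eac-345-017.txt"

ext=${last##*.}      # txt
stem=${last%.*}      # Eac-345-017
prefix=${stem%-*}    # Eac-345
num=${stem##*-}      # 017
width=${#num}        # keep the original padding width (3)

# 10# forces base 10; without it a leading zero means octal in bash.
next=$(( 10#$num + 1 ))

# %0*d takes the field width from the argument list.
printf -v name '%s-%0*d.%s' "$prefix" "$width" "$next" "$ext"
echo "$name"   # prints Eac-345-018.txt
```

The same octal pitfall applies if a zero-padded start value such as 017 is passed to printf %03d directly, so it is safer to pass the start number without leading zeros.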
OK, you can do the following. Ask the user first for the prefix and then for the starting sequence number. Then use bash's built-in printf to format the numbers. You should choose a field width large enough to hold the whole sequence, because this results in more homogeneous names. You can use read to read user input:
echo -n "Insert the prefix: "
read prefix
echo -n "Insert the sequence number: "
read sn
for i in * ; do
fp=`printf %04d $sn`
mv "$i" "$prefix-$fp.txt"
sn=`expr $sn + 1`
done
Note: You could also extract the extension; that wouldn't be a problem. Also, here I chose 4 digits for the sequence number, formatted into the variable $fp.
