Bash: getting numbers from a file and averaging them

Write a script that expects a file as its first argument. Some lines of the
file will consist of integers 0 - 1000.
The script should select the lines matching the previous criteria and print out their average to stdout (average of n integers is their sum divided by n).
And the file given looks like this:
22
78907
77 88 99 0000
need 11 gallons of water
0
roses are red
11
Example output:
11
Explanation: (22 + 11 + 0) / 3 = 11
I have tried already with this code:
#!/bin/bash
sum=0
ind=0
while IFS='' read -r line || [[ -n "$line" ]]; do
if [[ $line =~ ^[a-zA-Z\ ]+$ ]]
then
${sum}=${sum}+${#line}
${ind}=${ind}+1
echo ${sum}
fi
done < "$1"
value=${sum}/${ind}
echo ${value}
The output of this code is always 0/0, along with errors like:
./test1: line 9: 0=0+13: command not found
./test1: line 10: 0=0+1: command not found
Any ideas?

Part of the issue with your script is that your variable assignments are incorrect. You only use the $ to refer to a variable that has already been assigned; the assignment itself drops the dollar sign.
The other issue is that your arithmetic is not written inside an arithmetic expression.
Note that you can use arithmetic expansion to handle your variables:
if [[ $line =~ ^[a-zA-Z\ ]+$ ]]; then
(( sum += ${#line} ))
(( ind++ ))
printf '%s\n' "$sum"
fi
and later ...
value="$(( sum / ind ))"
printf '%s\n' "$value"
Beware that bash can only do integer math; fractional results are truncated. For more advanced math, consider using bc or dc (which are not built in to bash; they are separate tools that may need to be installed on your system) or another language like awk or perl, which can do the same thing with better performance and more precise math.
That said, you can "fake" a couple of decimal places with a few extra lines of code and string manipulation, if you really need to:
$ sum=100; ind=7
$ printf -v x '%d' "$((${sum}00/${ind}))"
$ printf '%d.%d\n' "${x%??}" "${x:$((${#x}-2))}"
14.28
The first printf multiplies the dividend by 100 (by appending two zeroes to it) before dividing. The second printf then splits the resulting quotient to insert the decimal point. This is a hack. Use tools that support real math.
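For the original task (averaging only the lines that consist of a single integer from 0 to 1000), a minimal sketch in pure bash might look like the following. The function name average_valid_lines and the decision to reject leading zeros (so 0000 or 007 would not count) are assumptions, not part of the question. The result is truncated because bash only does integer division.

```bash
#!/bin/bash
# Sketch: average the lines of a file that consist solely of an integer 0..1000.
# average_valid_lines is a hypothetical name; rejecting leading zeros is an assumption.
average_valid_lines() {
    local file=$1 line sum=0 count=0
    # Either "0" or one to four digits without a leading zero.
    local re='^(0|[1-9][0-9]{0,3})$'
    while IFS= read -r line || [[ -n $line ]]; do
        if [[ $line =~ $re ]] && [[ $line -le 1000 ]]; then
            sum=$(( sum + line ))
            count=$(( count + 1 ))
        fi
    done < "$file"
    if [[ $count -eq 0 ]]; then
        echo "no matching lines" >&2
        return 1
    fi
    echo $(( sum / count ))   # integer division, truncated
}
```

In a script, the last line could be average_valid_lines "$1". With the sample file from the question, only 22, 0 and 11 qualify, so it prints 11.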

Related

Formatting Output From Shell Script [duplicate]

I am working on a shell script that takes stdin or file as input and prints the averages and medians for rows or columns depending on the arguments.
When calculating the averages for the columns, the output needs to print out the following (tabbed):
My output currently looks like this (no spaces or tabs):
Averages:
92480654263
Medians:
6368974
Is there a way to echo out the averages and medians with tabs so each average and median set align left correctly? Here is a sample of how I am printing out the averages:
echo "Averages:"
while read i
do
sum=0
count=0
mean=0
#Cycle through the numbers in the rows
for num in $i
do
#Perform calculations necessary to determine the average and median
sum=$(($sum + $num))
count=`expr $count + 1`
mean=`expr $sum / $count`
done
echo -n "$mean"
done < $1
man echo:
-e enable interpretation of backslash escapes
If -e is in effect, the following sequences are recognized:
\t horizontal tab
I'd try echo -n -e "$mean\t", didn't test it though.
You should use printf. For instance, this will print a value followed by a tab
printf "%s\t" "$mean"
You can actually print several values separated by tabs if you want by adding arguments:
printf "%s\t" "$mean" "$count"
You can use an array expansion to print several values separated by tabs:
printf "%s\t" "${my_array[@]}"
Among advantages of printf over echo is the availability of flexible formatting strings, and the fact that implementations of printf vary less than those of echo among shells and operating systems.
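As a sketch of how printf could replace the echo -n calls in the question, the hypothetical helper print_tab_row below joins its arguments with tabs and ends the line with a newline. This avoids the trailing tab that printf "%s\t" leaves after the last value.

```bash
#!/bin/bash
# print_tab_row is a hypothetical helper name, not from the question.
# It prints its arguments on one line, separated by tabs, with no trailing tab.
print_tab_row() {
    if [[ $# -eq 0 ]]; then
        printf '\n'
        return 0
    fi
    printf '%s' "$1"          # first value, no leading tab
    shift
    if [[ $# -gt 0 ]]; then
        printf '\t%s' "$@"    # format is reused for every remaining value
    fi
    printf '\n'
}
```

For instance, collecting the means in an array inside the loop and calling print_tab_row "${means[@]}" after it would print them all on one tab-separated line.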
You could try using column command but it does take additional steps:
echo "Averages:"
while read line
do
sum=0
count=0
mean=0
#Cycle through the numbers in the rows
for num in $line
do
#Perform calculations necessary to determine the average and median
(( sum += num ))
(( count++ ))
(( mean = sum / count ))
done
[[ -z $out ]] && out=$mean || out="$out|$mean"
done < $1
echo "$out" | column -s'|' -t
The above is untested as I do not have the original file, but you should get the idea. Note that the integer division truncates, so the averages are not exact.

Adding a list of space separated numbers

Currently stuck in a situation where I ask the user to input a line of numbers with a space in between, then have the program display those numbers with a delay, then add them. I have everything down, but can't seem to figure out a line of code to coherently calculate the sum of their input, as most of my attempts end up with an error, or have the final number multiplied by the 2nd one (not even sure how?). Any help is appreciated.
echo Enter a line of numbers to be added.
read NUMBERS
COUNTER=0
for NUM in $NUMBERS
do
sleep 1
COUNTER=`expr $COUNTER + 1`
if [ "$NUM" ]; then
echo "$NUM"
fi
done
I've tried echo expr $NUM + $NUM to little success, but this is really all I can come up with.
Start with
NUMBERS="4 3 2 6 5 1"
echo $NUMBERS
Your script can be changed into
sum=0
for NUM in ${NUMBERS}
do
sleep 1
((counter++))
(( sum += NUM ))
echo "Digit ${counter}: Sum=$sum"
done
echo Sum=$sum
Another way is to use bc, which is useful for input like 1.6 2.3
sed 's/ /+/g' <<< "${NUMBERS}" | bc
Set two variables n and m, store their sum in $x, print it:
n=5 m=7 x=$((n + m)) ; echo $x
Output:
12
The above syntax is POSIX compatible, (i.e. works in dash, ksh, bash, etc.); from man dash:
Arithmetic Expansion
Arithmetic expansion provides a mechanism for evaluating an arithmetic
expression and substituting its value. The format for arithmetic
expansion is as follows:
$((expression))
The expression is treated as if it were in double-quotes, except that a
double-quote inside the expression is not treated specially. The shell
expands all tokens in the expression for parameter expansion, command
substitution, and quote removal.
Next, the shell treats this as an arithmetic expression and substitutes
the value of the expression.
Two one-liners that do most of the job in the OP:
POSIX:
while read x ; do echo $(( $(echo $x | tr ' ' '+') )) ; done
bash:
while read x ; do echo $(( ${x// /+} )) ; done
bash with calc, (allows summing real, rational & complex numbers, as well as sub-operations):
while read x ; do calc -- ${x// /+} ; done
Example input line, followed by output:
-8!^(1/3) 2^63 -1
9223372036854775772.7095244707464171953
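Because arithmetic expansion evaluates whatever text it is given, it can be worth validating user input before summing it. The sketch below (sum_integers is a hypothetical name) accepts only non-negative decimal integers, which is an assumption narrower than the question, and uses the 10# prefix so that input such as 08 is not rejected as invalid octal.

```bash
#!/bin/bash
# sum_integers is a hypothetical helper; it assumes non-negative integers only.
sum_integers() {
    local token sum=0
    for token in "$@"; do
        if [[ ! $token =~ ^[0-9]+$ ]]; then
            echo "not a non-negative integer: $token" >&2
            return 1
        fi
        sum=$(( sum + 10#$token ))   # 10# forces base 10, so 08 and 09 are accepted
    done
    echo "$sum"
}
```

With the question's script, this could be called as read -r -a NUMBERS followed by sum_integers "${NUMBERS[@]}".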

Summing row in bash [duplicate]

I am trying to read a file line by line and find the average of the numbers in each line. I am getting the error: expr: non-numeric argument
I have narrowed the problem down to sum=`expr $sum + $i`, but I'm not sure why the code doesn't work.
while read -a rows
do
for i in "${rows[@]}"
do
sum=`expr $sum + $i`
total=`expr $total + 1`
done
average=`expr $sum / $total`
done < $fileName
The file looks like this (the numbers are separated by tabs):
1 1 1 1 1
9 3 4 5 5
6 7 8 9 7
3 6 8 9 1
3 4 2 1 4
6 4 4 7 7
With some minor corrections, your code runs well:
while read -a rows
do
total=0
sum=0
for i in "${rows[@]}"
do
sum=`expr $sum + $i`
total=`expr $total + 1`
done
average=`expr $sum / $total`
echo $average
done <filename
With the sample input file, the output produced is:
1
5
7
5
2
5
Note that the answers are what they are because expr only does integer arithmetic.
Using sed to preprocess for expr
The above code could be rewritten as:
$ while read row; do expr '(' $(sed 's/[[:space:]][[:space:]]*/ + /g' <<<"$row") ')' / $(wc -w<<<$row); done < filename
1
5
7
5
2
5
Using bash's builtin arithmetic capability
expr is archaic. In modern bash:
while read -a rows
do
total=0
sum=0
for i in "${rows[@]}"
do
((sum += $i))
((total++))
done
echo $((sum/total))
done <filename
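One caveat with the loop above: a blank line leaves total at 0, and the division then fails. A sketch that skips blank lines, wrapped in a hypothetical function named row_averages and still assuming that every field is an integer:

```bash
#!/bin/bash
# row_averages is a hypothetical name; fields are assumed to be integers.
row_averages() {
    local file=$1 value sum
    local -a fields
    # read -a splits on spaces and tabs; the || test keeps a final line that lacks a newline.
    while read -r -a fields || [[ ${#fields[@]} -gt 0 ]]; do
        if [[ ${#fields[@]} -eq 0 ]]; then
            continue   # blank line: nothing to average
        fi
        sum=0
        for value in "${fields[@]}"; do
            sum=$(( sum + value ))
        done
        echo $(( sum / ${#fields[@]} ))   # integer division, truncated
    done < "$file"
}
```

Called as row_averages filename, it prints one truncated average per non-empty row.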
Using awk for floating point math
Because awk does floating point math, it can provide more accurate results:
$ awk '{s=0; for (i=1;i<=NF;i++)s+=$i; print s/NF;}' filename
1
5.2
7.4
5.4
2.8
5.6
Some variations on the same trick of using the IFS variable.
#!/bin/bash
while read line; do
set -- $line
echo $(( ( $(IFS=+; echo "$*") ) / $# ))
done < rows
echo
while read -a line; do
echo $(( ( $(IFS=+; echo "${line[*]}") ) / ${#line[*]} ))
done < rows
echo
saved_ifs="$IFS"
while read -a line; do
IFS=+
echo $(( ( ${line[*]} ) / ${#line[*]} ))
IFS="$saved_ifs"
done < rows
Others have already pointed out that expr is integer-only, and recommended writing your script in awk instead of shell.
Your system may have a number of tools on it that support arbitrary-precision math, or floats. Two common calculators in shell are bc which follows standard "order of operations", and dc which uses "reverse polish notation".
Either one of these can easily be fed your data such that per-line averages can be produced. For example, using bc:
#!/bin/sh
while read line; do
set - ${line}
c=$#
string=""
for n in $*; do
string+="${string:++}$1"
shift
done
average=$(printf 'scale=4\n(%s) / %d\n' $string $c | bc)
printf "%s // avg=%s\n" "$line" "$average"
done
Of course, the only bc-specific part of this is the format for the notation and the bc itself in the third-to-last line. The same basic thing using dc might look like this:
#!/bin/sh
while read line; do
set - ${line}
c=$#
string="0"
for n in $*; do
string+=" $1 + "
shift
done
average=$(dc -e "4k $string $c / p")
printf "%s // %s\n" "$line" "$average"
done
Note that my shell supports appending to strings with +=. If yours does not, you can adjust this as you see fit.
In both of these examples, we're printing our output to four decimal places -- with scale=4 in bc, or 4k in dc. We are processing standard input, so if you named these scripts "calc", you might run them with command lines like:
$ ./calc < inputfile.txt
The set command at the beginning of the loop turns the $line variable into positional parameters, like $1, $2, etc. We then process each positional parameter in the for loop, appending everything to a string which will later get fed to the calculator.
Also, you can fake it.
That is, while bash doesn't support floating point numbers, it DOES support multiplication and string manipulation. The following uses NO external tools, yet appears to present decimal averages of your input.
#!/bin/bash
declare -i total
while read line; do
set - ${line}
c=$#
total=0
for n in $*; do
total+="$1"
shift
done
# Move the decimal point over prior to our division...
average=$(($total * 1000 / $c))
# Re-insert the decimal point via string manipulation
average="${average:0:$((${#average} - 3))}.${average:$((${#average} - 3))}"
printf "%s // %0.3f\n" "$line" "$average"
done
The important bits here are:
* declare -i, which tells bash to add to $total with += rather than appending to it as if it were a string,
* the two average= assignments, the first of which multiplies $total by 1000, and the second of which splits the result at the thousands column, and
* printf whose format enforces three decimal places of precision in its output.
Of course, input still needs to be integers.
YMMV. I'm not saying this is how you should solve this, just that it's an option. :)
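A variation on the same idea that avoids string slicing: scale the sum, then use integer division and modulo to produce the whole and fractional parts. This is only a sketch. fixed_avg is a hypothetical name, it assumes non-negative integer arguments, and it truncates rather than rounds.

```bash
#!/bin/bash
# fixed_avg is a hypothetical helper; it assumes non-negative integer arguments.
fixed_avg() {
    local n sum=0 scaled
    if [[ $# -eq 0 ]]; then
        echo "fixed_avg: no numbers given" >&2
        return 1
    fi
    for n in "$@"; do
        sum=$(( sum + n ))
    done
    # Scale by 1000 before dividing to keep three decimal places.
    scaled=$(( sum * 1000 / $# ))
    # Whole part, then the remainder zero-padded to three digits.
    printf '%d.%03d\n' "$(( scaled / 1000 ))" "$(( scaled % 1000 ))"
}
```

Inside the read loop this could be called as fixed_avg $line, relying on word splitting just as set - ${line} does.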
This is a pretty old post, but it came up at the top of my Google search, so I thought I'd share what I came up with:
while read line; do
# Convert each line to an array
ARR=( $line )
# Append each value in the array with a '+' and calculate the sum
# (this causes the last value to have a trailing '+', so it is added to '0')
ARR_SUM=$( echo "${ARR[@]/%/+} 0" | bc -l)
# Divide the sum by the total number of elements in the array
echo "$(( ${ARR_SUM} / ${#ARR[@]} ))"
done < "$filename"

Reverse Triangle using shell

OK, so I've been at this for a couple of days. I'm new to this whole bash/UNIX thing, but I am trying to write a script where the user inputs an integer and the script prints out a triangle, using that integer as the base and decreasing until it reaches zero. An example would be:
reverse_triangle.bash 4
****
***
**
*
This is what I have so far, but when I run it nothing happens and I have no idea what is wrong:
#!/bin/bash
input=$1
count=1
for (( i=$input; i>=$count;i-- ))
do
for (( j=1; j>=i; j++ ))
do
echo -n "*"
done
echo
done
exit 0
When I try to run it, nothing happens; it just goes to the next line. Help would be greatly appreciated :)
As I said in a comment, your test is wrong: you need
for (( j=1; j<=i; j++ ))
instead of
for (( j=1; j>=i; j++ ))
Otherwise, this loop is only executed when i=1, and it becomes an infinite loop.
Now if you want another way to solve that, in a much better way:
#!/bin/bash
[[ $1 = +([[:digit:]]) ]] || { printf >&2 'Argument must be a number\n'; exit 1; }
number=$((10#$1))
for ((;number>=1;--number)); do
printf -v spn '%*s' "$number"
printf '%s\n' "${spn// /*}"
done
Why is it better? First off, we check that the argument is really a number. Without this, your code is subject to arbitrary code injection. Also, we make sure that the number is understood in radix 10 with 10#$1. Otherwise, an argument like 09 would raise an error.
We don't really need an extra variable for the loop, the provided argument is good enough. Now the trick: to print n times a pattern, a cool method is to store n spaces in a variable with printf: %*s will expand to n spaces, where n is the corresponding argument found by printf.
For example:
printf '%s%*s%s\n' hello 42 world
would print:
hello                                     world
(with 42 spaces).
Editor's note: %*s will NOT generally expand to n spaces, as evidenced by above output, which contains 37 spaces.
Instead, the argument that * is mapped to, 42, is the field width for the s field, which maps to the following argument, world, causing the string world to be left-space-padded to a length of 42; since world has a character count of 5, 37 spaces are used for padding.
To make the example work as intended, use printf '%s%*s%s\n' hello 42 '' world - note the empty string argument following 42, which ensures that the entire field is made up of padding, i.e., spaces (you'd get the same effect if no arguments followed 42).
With printf's -v option, we can store any string formatted by printf into a variable; here we're storing $number spaces in spn. Finally, we replace all spaces by the character *, using the expansion ${spn// /*}.
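Putting the corrected padding trick together, a sketch of the whole triangle might look like this. The function names repeat_char and reverse_triangle are made up for illustration. Note the empty string passed after the width, so that the field consists only of padding.

```bash
#!/bin/bash
# repeat_char and reverse_triangle are hypothetical names used for illustration.

# Print CHAR repeated COUNT times, followed by a newline.
repeat_char() {
    local count=$1 char=$2 pad
    printf -v pad '%*s' "$count" ''     # COUNT spaces: empty string padded to width COUNT
    printf '%s\n' "${pad// /$char}"     # replace every space with CHAR
}

# Print rows of '*' from N characters wide down to 1.
reverse_triangle() {
    local n
    if [[ ! $1 =~ ^[0-9]+$ ]]; then
        echo "Argument must be a number" >&2
        return 1
    fi
    for (( n = 10#$1; n >= 1; n-- )); do   # 10# forces base 10 (e.g. for 09)
        repeat_char "$n" '*'
    done
}
```

Calling reverse_triangle 4 prints four rows, from four stars down to one.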
Yet another possibility:
#!/bin/bash
[[ $1 = +([[:digit:]]) ]] || { printf >&2 'Argument must be a number\n'; exit 1; }
printf -v s '%*s' $((10#$1))
s=${s// /*}
while [[ $s ]]; do
printf '%s\n' "$s"
s=${s%?}
done
This time we construct the variable s that contains a bunch of * (number given by user), using the previous technique. Then we have a while loop that loops while s is non empty. At each iteration we print the content of s and we remove a character with the expansion ${s%?} that removes the last character of s.
Building on gniourf_gniourf's helpful answer:
The following is simpler and performs significantly better:
#!/bin/bash
count=$1 # (... number-validation code omitted for brevity)
# Create the 1st line, composed of $count '*' chars, and store in var. $line.
printf -v line '%.s*' $(seq $count)
# Count from $count down to 1.
for (( ; count > 0; count-- )); do
# Print a *substring* of the 1st line based on the current value of $count.
printf "%.${count}s\n" "$line"
done
printf -v line '%.s*' $(seq $count) is a trick that prints * $count times: %.s* results in * for each argument supplied, irrespective of the arguments' values (%.s effectively ignores its argument). $(seq $count) expands to $count arguments, resulting in a string composed of $count * chars overall, which, thanks to -v line, is stored in variable $line.
printf "%.${count}s\n" "$line" prints a substring from the beginning of $line that is $count chars long.

Bash - Stripping and adding leading zeros to numbers before concatenating into string ordered strings

I need to automate a backup solution which stores files in folders such as YYYYMMDD.nn.
Every day a few files would be backed up like this, so the resulting folder names could be 20141002.01, 20141002.02 ... 20141002.10. My current script works for YYYYMMDD.n, but when n is more than 9, sorting and picking up the last folder doesn't work because 20141002.10 sorts above 20141002.9. Hence I am switching to the YYYYMMDD.nn format, with the approach of separating the nn, stripping leading zeros, incrementing, and adding a leading zero back if needed.
I have a function which checks the last folder for today's date and creates the next one.
createNextProcessedFolder() {
local LastFolderName=`ls -1 ${ProcessedListsDir} | grep ${CurrentDate} | tail -n 1`
n=`echo ${LastFolderName} | sed -r 's/^.{9}//'`
n="$((10#$n))"
nextFolderName=${CurrentDate}.$((if[[ $(( ${n}+1 )) < 10 ]];then n="0$((${n}+1))";else n="$(( ${n}+1 ))"; fi))
mkdir ${ProcessedListsDir}/${nextFolderName}
if [[ -d ${ProcessedListsDir}/${nextFolderName} ]]
then
echo "New folder ${nextFolderName} was created"
else
echo "Error: ${nextFolderName} was not created"
fi
Location="${ProcessedListsDir}/${nextFolderName}"
}
So when I try to run this I get an error like:
line 21: if[[ 1 < 10 ]];then n="01";else n="1"; fi: syntax error: invalid arithmetic operator (error token is ";then n="01";else n="1"; fi")
Line 21 is:
nextFolderName=${CurrentDate}.$((if[[ $(( ${n}+1 )) < 10 ]];then n="0$((${n}+1))";else n="$(( ${n}+1 ))"; fi))
I'm sure there will be more errors after this one but I would really appreciate if somebody helped me with this.
You cannot use $((...)) for command substitution as it needs to be $(...)
You need spaces before and after [[ and ]]. You can also use ((...)) in BASH:
Try this:
(( (n+1) < 10 )) && n="0$((n+1))" || ((n++))
nextFolderName="${CurrentDate}.${n}"
For completeness, another solution is:
n=$( printf "%02d" $n )
The 02 before the d means prepend with 0s up to 2 digits. Or:
nextFolderName="${CurrentDate}."$( printf "%02d" "$n" )
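Combining the two ideas (forcing base 10 with 10# and zero-padding with printf), the increment can be sketched as one small function. next_suffix is a hypothetical name, and the sketch assumes the suffix is made of digits only.

```bash
#!/bin/bash
# next_suffix is a hypothetical helper: it takes the current two-digit suffix
# (for example 08) and prints the next one, zero-padded to at least two digits.
next_suffix() {
    local current=$1
    if [[ ! $current =~ ^[0-9]+$ ]]; then
        echo "not a number: $current" >&2
        return 1
    fi
    # 10# makes bash read 08 and 09 as decimal instead of invalid octal.
    printf '%02d\n' "$(( 10#$current + 1 ))"
}
```

In the function from the question this could be used as nextFolderName="${CurrentDate}.$(next_suffix "$n")".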
So my problem was with incrementing a number which was extracted from a string with a leading zero, and then returning the incremented number with a leading zero if it is smaller than 10. The solution I ended up using can be represented with the script below.
I guess it can't be shorter than that
n=$(( 10#$1 + 1 ))
(( n < 10 )) && n="0$n"
echo $n
Something to watch out for: a plain n++ treats a value with a leading zero as octal, so 08 and 09 raise an error. The 10# prefix forces base 10, which strips the leading zero while incrementing :-)
Thanks again anubhava for pointing me in the right direction.
