Sum/Average numbers in a single line - UNIX - bash

I'm working on a small script to take 3 numbers in a single line, sum and average them, and print the result at the end of the line. I know how to use the paste command, but everything I'm finding is telling me how to average a column. I need to average a line, not a column. Any advice? Thanks!

awk to the rescue!
$ echo 1 2 3 | awk -v RS=' ' '{sum+=$1; count++} END{print sum, sum/count}'
6 2
works for any number of input fields
$ echo 1 2 3 4 | awk -v RS=' ' '{sum+=$1; count++} END{print sum, sum/count}'
10 2.5
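The question also asks for the result at the end of each line. A minimal sketch of that, assuming whitespace-separated numeric fields and using a hypothetical function name `sum_avg_line` (not from the original answer), loops over the fields with NF instead of changing RS:

```shell
# sum_avg_line: hypothetical helper; reads lines of whitespace-separated
# numbers on stdin and prints each line followed by its sum and average.
sum_avg_line() {
    awk 'NF {                       # skip empty lines to avoid dividing by zero
        sum = 0
        for (i = 1; i <= NF; i++)   # walk every field on the current line
            sum += $i
        print $0, sum, sum / NF
    }'
}

echo "1 2 3" | sum_avg_line         # prints: 1 2 3 6 2
printf '1 2 3 4\n10 20 30\n' | sum_avg_line
```

Because the sum is reset for every record, this handles a file with many lines, each with its own number of fields.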

You can manipulate your line before giving it to bc. With bc you have additional possibilities such as setting the scale.
A simple mean from 1 2 3 would be
echo "1 2 3" | sed -e 's/\([0-9.]\) /\1+/g' -e 's/.*/(&)\/3/' | bc
You can wrap it in a function and see more possibilities:
function testit {
echo "Input $@"
echo "Integer mean"
echo "$@" | sed -e 's/\([0-9.]\) /\1+/g' -e 's/.*/(&)\/'$#'/' | bc
echo "floating decimal mean"
echo "$@" | sed -e 's/\([0-9.]\) /\1+/g' -e 's/.*/(&)\/'$#'/' | bc -l
echo "2 decimal output mean"
echo "$@" | sed -e 's/\([0-9.]\) /\1+/g' -e 's/.*/scale=2; (&)\/'$#'/' | bc
echo
}
testit 4 5 6
testit 4 5 8
testit 4.2 5.3 6.4
testit 1 2 3 4 5 6 7 8 9
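The bc expression can also be built without sed. The following is only a sketch, assuming bash and using the hypothetical helper names `mean_expr` and `mean`: inside double quotes, `$*` joins the arguments with the first character of IFS, so setting IFS to `+` produces the sum expression directly.

```shell
# mean_expr: hypothetical helper; prints a bc expression for the mean
# of its arguments, with two decimals of scale.
mean_expr() {
    local IFS=+                       # "$*" joins the arguments with "+"
    echo "scale=2; ($*) / $#"
}

# mean: hypothetical helper; evaluates that expression with bc.
mean() {
    [ "$#" -gt 0 ] || return 1        # nothing to average
    mean_expr "$@" | bc
}

mean_expr 4 5 8    # prints: scale=2; (4+5+8) / 3
```

With bc installed, `mean 4 5 8` prints 5.66, because bc truncates to the requested scale rather than rounding.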

Related

How can I use 'echo' output as an operand for the 'seq' command within a terminal?

I have an exercise where I need to sum together every digit up until a given number like this:
Suppose I have the number 12, I need to do 1+2+3+4+5+6+7+8+9+1+0+1+1+1+2.
(numbers past 9 are split up into their separate digits eg. 11 = 1+1, 234 = 2+3+4, etc.)
I know I can just use:
seq -s '' 12
which outputs 123456789101112 and then add them all together with '+' in between and then pipe to 'bc' BUT I have to specifically do :
echo 12 | ...
as the first step (because the online IDE fills it in as the unchangeable first step for every testcase) and when I do this I start to have problems with seq
I tried
echo 12 | seq -s '' $1
### or just ###
echo 12 | seq -s ''
but I can't get it to work: seq just reports a missing operand error (I assume because I'm in the terminal, not a script, so the '12' isn't assigned to $1). Any recommendations on how to get seq to interpret the 12 from echo as its operand, or alternative ways to go?
seq -s '' $(cat)
full solution:
echo "12" | seq -s '' $(cat) | sed 's/./&+/g; s/$/0/' | bc
Or
echo 12 | { echo $(( $({ seq -s '' $(< /dev/stdin); echo; } | sed -E 's/([[:digit:]])/\1+/g; s/$/0/') )); }
without sed:
d=$(echo 12 | { seq -s '' $(< /dev/stdin); echo; }); echo $(( "${d//?/&+}0" ))
echo 12 | awk '{
cnt=0
for(i=1;i<=$1;i++) {
cnt+=i
printf("%s%s",i,i<$1?"+":"=")
}
print cnt
}'
Prints:
1+2+3+4+5+6+7+8+9+10+11+12=78
If it is supposed to be just the digits added up:
echo 12 | awk '{s=""
for(i=1;i<=$1;i++) s=s i
split(s,ch,"")
for(i=1;i<=length(ch); i++) cnt+=ch[i]
print cnt
}'
51
Or as a plain pipeline (seq itself is not specified by POSIX, though it is widely available):
$ echo 12 | seq -s '' "$(cat)" | sed -E 's/([0-9])/\1+/g; s/$/0/' | bc
51
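For comparison, the digit sum can be computed without seq, sed, or bc. This is only a sketch, assuming bash and a hypothetical function name `digit_sum`; it reads the number from stdin so it still fits after the fixed `echo 12 |` step.

```shell
# digit_sum: hypothetical helper; reads N from stdin and prints the sum of
# all digits of the numbers 1..N.
digit_sum() {
    local n i j total=0
    read -r n
    for (( i = 1; i <= n; i++ )); do
        for (( j = 0; j < ${#i}; j++ )); do
            total=$(( total + ${i:j:1} ))   # add one digit of i
        done
    done
    echo "$total"
}

echo 12 | digit_sum    # prints 51
```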

Explanation of awk function

I am converting some bash-style (actually using busybox) scripts to c for usage in a custom kernel driver. Everything is going fine but I'm dreadfully unfamiliar with awk, and would really appreciate an explanation of what this one liner is doing. The function is here:
checksum=`echo $sum | busybox awk '{$NF *= -1; print}'`
checksum and sum are standard integers that have been accounted for, and can be either positive or negative. I just have no clue what happens when sum is piped into the awk function.
This piece of code awk '{$NF *= -1; print}' multiplies the value of the last field $NF by -1 in all the lines and then it prints the whole line with the new value assigned to last field $NF.
This syntax is often called a shorthand (compound) assignment and is equivalent to $NF=$NF*-1. Similarly, there are shorthand forms for other operations such as addition, subtraction, and division:
$ echo "1 2 3" |awk '{$NF *=10;print}' #Equivalent to $NF=$NF*10
1 2 30
$ echo "1 2 3" |awk '{$NF +=10;print}' #Equivalent to $NF=$NF+10
1 2 13
$ echo "1 2 3" |awk '{$NF -=10;print}' #Equivalent to $NF=$NF-10
1 2 -7
$ echo "1 2 3" |awk '{$NF /=10;print}' #Equivalent to $NF=$NF/10
1 2 0.3
In your case:
$ echo "1 2 3" |awk '{$NF *=-1;print}'
1 2 -3
Note that in awk, each input line is a record by default, and each record is split on runs of whitespace into fields.
The fields are numbered from $1 (the first field) up to $NF (the last field).
$ echo "1 2 3" |awk '{print $1}'
1
$ echo "1 2 3" |awk '{print $2}'
2
$ echo "1 2 3" |awk '{print $3}'
3
$ echo "1 2 3" |awk '{print $NF}'
3
The whole record in awk is called $0:
$ echo "1 2 3" |awk '{print $0}'
1 2 3
A bare print, with no arguments, prints the whole line $0 by default:
$ echo "1 2 3" |awk '{print}'
1 2 3
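Applied to the situation in the question, where the input is a single integer, the last field is the only field, so the one-liner simply flips the sign. A small sketch, with the hypothetical wrapper name `negate_last` and an assumed example value for sum:

```shell
# negate_last: hypothetical wrapper around the one-liner from the question.
negate_last() {
    awk '{ $NF *= -1; print }'
}

sum=42                                   # assumed example value
checksum=$(echo "$sum" | negate_last)
echo "$checksum"                         # prints -42

# For a single integer, shell arithmetic gives the same result without awk.
echo "$(( -sum ))"                       # prints -42
```

When porting to C, this suggests the equivalent for a single integer is plain negation of the value.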

Count the number of digits in a bash variable

I have a number num=010. I would like to count the number of digits contained in this number. If the number of digits is above a certain number, I would like to do some processing.
In the above example, the number of digits is 3.
Thanks!
Assuming the variable only contains digits then the shell already does what you want here with the length Shell Parameter Expansion.
$ var=012
$ echo "${#var}"
3
In BASH you can do this:
num='a0b1c0d23'
n="${num//[^[:digit:]]/}"
echo ${#n}
5
Using awk you can do:
num='012'
awk -F '[0-9]' '{print NF-1}' <<< "$num"
3
num='00012'
awk -F '[0-9]' '{print NF-1}' <<< "$num"
5
num='a0b1c0d'
awk -F '[0-9]' '{print NF-1}' <<< "$num"
3
Assuming that the variable x is the "certain number" in the question
chars=`echo -n $num | wc -c`
if [ $chars -gt $x ]; then
....
fi
This works for arbitrary strings that mix digits and non-digits (the -P option requires a grep with PCRE support, such as GNU grep):
ndigits=`echo $str | grep -P -o '\d' | wc -l`
demo:
$ echo sf293gs192 | grep -P -o '\d' | wc -l
6
Using sed:
s="string934 56 96containing digits98w6"
num=$(echo "$s" |sed 's/[^0-9]//g')
echo ${#num}
10
Using grep:
s="string934 56 96containing digits98w6"
echo "$s" |grep -o "[0-9]" |grep -c ""
10
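To tie the counting back to the "do some processing" part of the question, here is a minimal sketch using only parameter expansion. The function name `count_digits` and the threshold `limit` are hypothetical, chosen for illustration.

```shell
# count_digits: hypothetical helper; prints how many digits its argument contains.
count_digits() {
    local only_digits="${1//[^0-9]/}"    # drop every non-digit character
    echo "${#only_digits}"
}

num=010
limit=2                                  # assumed threshold, for illustration only
if [ "$(count_digits "$num")" -gt "$limit" ]; then
    echo "$num has more than $limit digits"
fi
```

Keeping num as a string matters here: treating 010 as a number in shell arithmetic would interpret it as octal and lose the leading zero.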

Parsing Strings in Bash w/out a Delimiter

I've got a piece of a script I'm trying to figure out, so maybe its a simple question for someone more experienced out there.
Here is the code:
#!/bin/bash
echo "obase=2;$1" | bc
Used like:
$./script 12
Outputs:
1100
My question is, how can I parse this 4 digit number into separate digits? (to then delimit with cut -d ' ' and input those into an array...)
I'd like to be able to get the following output:
1 1 0 0
Is this even possible in BASH? I know its easier with other languages.
You can use sed:
echo "obase=2;$1" | bc | sed 's/./& /g'
Or, if you prefer the longer form:
echo "obase=2;$1" | bc | sed 's/\(.\)/\1 /g'
Or, if your sed supports -r (extended regular expressions):
echo "obase=2;$1" | bc | sed -r 's/(.)/\1 /g'
To print individual digits from a string you can use fold:
s=1100
fold -w1 <<< "$s"
1
1
0
0
To create an array:
arr=( $(fold -w1 <<< "$s") )
set|grep arr
arr=([0]="1" [1]="1" [2]="0" [3]="0")
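The array can also be filled without any external command. A sketch, assuming bash and using the hypothetical names `split_chars` and `digits`, takes one-character substrings with `${s:i:1}`:

```shell
# split_chars: hypothetical helper; fills the global array "digits" with the
# characters of its argument, one per element.
split_chars() {
    local s=$1 i
    digits=()
    for (( i = 0; i < ${#s}; i++ )); do
        digits+=( "${s:i:1}" )           # substring of length 1 at offset i
    done
}

split_chars 1100
echo "${digits[@]}"     # prints: 1 1 0 0
echo "${#digits[@]}"    # prints: 4
```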

Count line lengths in file using command line tools

Problem
If I have a long file with lots of lines of varying lengths, how can I count the occurrences of each line length?
Example:
file.txt
this
is
a
sample
file
with
several
lines
of
varying
length
Running count_line_lengths file.txt would give:
Length Occurrences
1 1
2 2
4 3
5 1
6 2
7 2
Ideas?
This counts the line lengths using awk, then sorts the (numeric) line lengths using sort -n, and finally counts the unique line length values using uniq -c.
$ awk '{print length}' input.txt | sort -n | uniq -c
1 1
2 2
3 4
1 5
2 6
2 7
In the output, the first column is the number of lines with the given length, and the second column is the line length.
Pure awk
awk '{++a[length()]} END{for (i in a) print i, a[i]}' file.txt
4 3
5 1
6 2
7 2
1 1
2 2
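The order of `for (i in a)` in awk is unspecified, which is why the lengths above come out unsorted. A sketch of a wrapper that adds the header and sorts the pairs numerically, using the command name from the question as a hypothetical function name:

```shell
# count_line_lengths: hypothetical helper named after the command in the
# question; prints a header, then "length count" pairs in ascending order.
# Reads the files given as arguments, or stdin when there are none.
count_line_lengths() {
    echo "Length Occurrences"
    awk '{ ++count[length($0)] } END { for (len in count) print len, count[len] }' "$@" |
        sort -n
}

printf '%s\n' this is a sample file with several lines of varying length |
    count_line_lengths
```

For the sample input from the question this prints the header followed by 1 1, 2 2, 4 3, 5 1, 6 2, and 7 2, one pair per line.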
Using bash arrays:
#!/bin/bash
while IFS= read -r line; do
((histogram[${#line}]++))
done < file.txt
echo "Length Occurrence"
for length in "${!histogram[@]}"; do
printf "%-6s %s\n" "${length}" "${histogram[$length]}"
done
Example run:
$ ./t.sh
Length Occurrence
1 1
2 2
4 3
5 1
6 2
7 2
$ perl -lne '$c{length($_)}++ }{ print qq($_ $c{$_}) for (keys %c);' file.txt
Output
6 2
1 1
4 3
7 2
2 2
5 1
Try this:
awk '{print length}' FILENAME
Or, if you want the length of the longest line:
awk '{ln=length} ln>max{max=ln} END {print FILENAME " " max}' file.txt
You can combine the above command with find using its -exec option.
You can accomplish this by using basic unix utilities only:
$ printf "%s %s\n" $(for line in $(cat file.txt); do printf $line | wc -c; done | sort -n | uniq -c | sed -E "s/([0-9]+)[^0-9]+([0-9]+)/\2 \1/")
1 1
2 2
4 3
5 1
6 2
7 2
How does it work?
Here's the source file:
$ cat file.txt
this
is
a
sample
file
with
several
lines
of
varying
length
Replace each line of the source file with its length:
$ for line in $(cat file.txt); do printf $line | wc -c; done
4
2
1
6
4
4
7
5
2
7
6
Sort and count the number of length occurrences:
$ for line in $(cat file.txt); do printf $line | wc -c; done | sort -n | uniq -c
1 1
2 2
3 4
1 5
2 6
2 7
Swap and format the numbers:
$ printf "%s %s\n" $(for line in $(cat file.txt); do printf $line | wc -c; done | sort -n | uniq -c | sed -E "s/([0-9]+)[^0-9]+([0-9]+)/\2 \1/")
1 1
2 2
4 3
5 1
6 2
7 2
If you allow for the columns to be swapped and don't need the headers, something as easy as
while read line; do echo -n "$line" | wc -m; done < file | sort | uniq -c
(without any advanced tricks with sed or awk) will work. The output is:
1 1
2 2
3 4
1 5
2 6
2 7
One important thing to keep in mind: wc -c counts bytes, not characters, so it does not give the correct length for strings containing multibyte characters. That is why wc -m is used here.
References:
man uniq(1)
man sort(1)
man wc(1)
