Setting Bash variable to last number in output - bash

I have bash running a command from another program (AFNI). The command outputs two numbers, like this:
70.0 13.670712
I need to make a bash variable that will be whatever the last # is (in this case 13.670712). I've figured out how to make it print only the last number, but I'm having trouble setting it to be a variable. What is the best way to do this?
Here is the code that prints only 13.670712:
test="$(3dBrickStat -mask ../../template/ROIs.nii -mrange 41 41 -percentile 70 1 70 'stats.s1_ANTS+tlrc[25]')"; echo "${test}" | awk '{print $2}'

Just pipe (|) the command output to awk. In your example, awk reads the stdout of the previous command and prints the second field; by default, awk splits fields on runs of spaces and tabs.
test="$(3dBrickStat -mask ../../template/ROIs.nii -mrange 41 41 -percentile 70 1 70 'stats.s1_ANTS+tlrc[25]' | awk '{print $2}')"
printf "%s\n" "$test"
13.670712
(or) using echo
echo "$test"
13.670712
This is the simplest way to do it. If you are looking for a Bash-specific alternative, use the read built-in with process substitution:
read _ val2 < <(3dBrickStat -mask ../../template/ROIs.nii -mrange 41 41 -percentile 70 1 70 'stats.s1_ANTS+tlrc[25]')
printf "%s\n" "$val2"
13.670712
Another, more portable version uses set, which works in any POSIX shell. Note that it overwrites the current positional parameters.
set -- $(3dBrickStat -mask ../../template/ROIs.nii -mrange 41 41 -percentile 70 1 70 'stats.s1_ANTS+tlrc[25]');
printf "%s\n" "$2"
13.670712
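To try the idea without AFNI installed, here is a minimal sketch that uses a hypothetical stand-in function, fake_stat, which prints the same two numbers as the question. It takes awk's $NF instead of $2, on the assumption that the value you want is always the last field on the line.

```shell
#!/usr/bin/env bash
# fake_stat is a hypothetical stand-in for the real 3dBrickStat call.
fake_stat() { echo "70.0 13.670712"; }

# NF is awk's field count, so $NF is the last field on the line.
last=$(fake_stat | awk '{print $NF}')
printf '%s\n' "$last"
```

Using $NF keeps the command working if the program ever prints more than two numbers, as long as the one you need stays last.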

You can use cut to print the second column:
$ echo "70.0 13.670712" | cut -d ' ' -f2
13.670712
And assign that to a variable with command substitution:
$ sc="$(echo '70.0 13.670712' | cut -d ' ' -f2)"
$ echo "$sc"
13.670712
Just replace echo '70.0 13.670712' with the command that is actually producing the two numbers.
If you want to grab the last value of some delimited field (or delimited output from a command), you can use parameter expansion. This is completely internal to Bash:
$ echo "$s"
$ echo ${s##*' '}
10
$ echo "$s2"
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
$ echo ${s2##*' '}
20
And then just assign directly:
$ echo "$s2"
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
$ lf=${s2##*' '}
$ echo "$lf"
20
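Applied to the two-number output from the question, the same parameter expansion might look like the sketch below. The variable names out, first and last are made up for illustration, and the sample string is copied from the question rather than produced by the real command.

```shell
#!/usr/bin/env bash
out="70.0 13.670712"   # sample output copied from the question

last=${out##* }    # remove the longest prefix that ends in a space
first=${out%% *}   # remove the longest suffix that starts with a space
printf 'first=%s last=%s\n' "$first" "$last"
```

One caveat: if the output ends with a trailing space, ${out##* } expands to an empty string, so trim the value first if that can happen.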

Related

Is there a way to print lines from a file from n to m and then reverse their positions?

I'm trying to print text from line 10 to 20 and then reverse their positions.
I've tried this:
sed '10!G;h;$!d' file.txt
But it only prints from 10 to end of the file. Is there any way to stop it at line 20 by using only one sed command?
Almost there. You just need to add -n and replace $!d with a print at the last line number of the range (20p).
To print lines 10 through 20:
sed -n '10,20p' tst.txt
To print lines 10 through 20 in reverse order:
sed -n '10!G;h;20p' tst.txt
output:
20
19
18
17
16
15
14
13
12
11
10
tst.txt:
1
2
3
4
...
19
20
You can use this to print a range of lines:
sed -n -e 10,20p file.txt | tac
tac will reverse the order of the lines
And for those of you without tac (like those mac users out there):
sed -n -e 10,20p file.txt | tail -r
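If neither tac nor tail -r is available, the same result can be sketched in awk alone. This is only an illustration: the function name reverse_range and its argument order are invented here, and it buffers the selected lines in memory before printing them.

```shell
#!/usr/bin/env bash
# reverse_range FIRST LAST FILE: print lines FIRST..LAST of FILE, last line first.
# The function name and argument order are made up for this sketch.
reverse_range() {
  awk -v first="$1" -v last="$2" '
    NR >= first && NR <= last { buf[NR] = $0 }   # remember lines in the range
    NR == last { exit }                           # stop reading; END still runs
    END {
      for (i = last; i >= first; i--)
        if (i in buf) print buf[i]
    }
  ' "$3"
}

tmp=$(mktemp)
seq 1 25 > "$tmp"
reverse_range 10 20 "$tmp"
rm -f "$tmp"
```

Because awk exits at the last line of the range, the rest of the file is never read, which matters for large inputs.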

Sorting tab delimited numbers by column with pure bash script.

I'm stuck on some homework. The requirements of the assignment are to accept an input file and perform some statistics on the values. The user may specify whether to calculate the statistics by row or by value. The shell script must be pure bash script so I can't use awk, sed, perl, python etc.
sample input:
1 1 1 1 1 1 1
39 43 4 3225 5 2 2
6 57 8 9 7 3 4
3 36 8 9 14 4 3
3 4 2 1 4 5 5
6 4 4814 7 7 6 6
I can't figure out how to sort and process the data by column. My code for processing the rows works fine.
# CODE FOR ROWS
while read -r line; do
echo $(printf "%d\n" $line | sort -n) | tr ' ' \\t > sorted.txt
....
#I perform the stats calculations
# for row line by working with the temp file sorted.txt
done
How could I process this data by column? I've never worked with shell script so I've been staring at this for hours.
If you wanted to analyze by columns you'll need the cols value first (number of columns). head -n 1 gives you the first row, and NF counts the number of fields, giving us the number of columns.
cols=$(head -n 1 input.txt | awk '{print NF}')
Then you can use cut with the '\t' delimiter to grab every column from input.txt (columns are numbered from 1 to cols), and run it through sort -n, as you did in your original post.
$ for i in $(seq 1 "$cols"); do cut -f"$i" -d$'\t' input.txt; done | sort -n > output.txt
For rows, you can use the shell built-in printf with the format specifier %d for integers. The sort command works on lines of input, so we replace spaces ' ' with newlines \n using the tr command:
$ cat input.txt | while read line; do echo $(printf "%d\n" $line); done | tr ' ' '\n' | sort -n > output.txt
Now take the output file to gather our statistics:
Min: cat output.txt | head -n 1
Max: cat output.txt | tail -n 1
Sum: (courtesy of Dimitre Radoulov): cat output.txt | paste -sd+ - | bc
Mean: (courtesy of porges): cat output.txt | awk '{ total += $1 } END { print total/NR }'
Median: (courtesy of maxschlepzig): cat output.txt | awk ' { a[i++]=$1; } END { print a[int(i/2)]; }'
Histogram: cat output.txt | uniq -c
8 1
3 2
4 3
6 4
3 5
4 6
3 7
2 8
2 9
1 14
1 36
1 39
1 43
1 57
1 3225
1 4814
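Since the assignment rules out awk and friends, here is a minimal sketch of per-column statistics that uses only bash builtins. The function name column_stats is made up for illustration, the mean is truncated because bash arithmetic is integer-only, and the input is assumed to hold plain decimal integers without leading zeros (bash would read those as octal).

```shell
#!/usr/bin/env bash
# column_stats: read whitespace-separated integers on stdin and print the
# sum and integer mean of every column, using bash builtins only.
column_stats() {
  local -a sums=() fields=()
  local rows=0 i
  while read -r -a fields; do
    [ "${#fields[@]}" -gt 0 ] || continue        # skip blank lines
    for i in "${!fields[@]}"; do
      sums[i]=$(( ${sums[i]:-0} + fields[i] ))   # running total per column
    done
    rows=$(( rows + 1 ))
  done
  for i in "${!sums[@]}"; do
    printf 'col %d: sum=%d mean=%d\n' "$(( i + 1 ))" "${sums[i]}" "$(( sums[i] / rows ))"
  done
}

column_stats <<'EOF'
1 1 1 1 1 1 1
39 43 4 3225 5 2 2
6 57 8 9 7 3 4
3 36 8 9 14 4 3
3 4 2 1 4 5 5
6 4 4814 7 7 6 6
EOF
```

read -a splits on both spaces and tabs, so the same loop handles the tab-delimited file from the assignment. A median would also need each column sorted, which pure bash can only do with a hand-written sort.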

Dividing one file into separate based on line numbers

I have the following test file:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
I want to separate it in a way that each file contains the last line of the previous file as the first line. The example would be:
file 1:
1
2
3
4
5
file2:
5
6
7
8
9
file3:
9
10
11
12
13
file4:
13
14
15
16
17
file5:
17
18
19
20
That would make 4 files with 5 lines and 1 file with 4 lines.
As a first step, I tried the following commands to produce only the first file, which should contain the first 5 lines. I can't figure out why the awk command in the if statement prints all 20 lines instead of only the first 5.
d=$(wc test)
a=$(echo $d | cut -f1 -d " ")
lines=$(echo $a/5 | bc -l)
integer=$(echo $lines | cut -f1 -d ".")
for i in $(seq 1 $integer); do
start=$(echo $i*5 | bc -l)
var=$((var+=1))
echo start $start
echo $var
if [[ $var = 1 ]]; then
awk 'NR<=$start' test
fi
done
Thanks!
Why not just use the split utility from your POSIX toolkit? It has an option to split on a number of lines, which you can set to 5:
split -l 5 input-file
From the man split page,
-l, --lines=NUMBER
put NUMBER lines/records per output file
Note that -l is also POSIX compliant.
$ ls
$
$ seq 20 | awk 'NR%4==1{ if (out) { print > out; close(out) } out="file"++c } {print > out}'
$
$ ls
file1 file2 file3 file4 file5
$ cat file1
1
2
3
4
5
$ cat file2
5
6
7
8
9
$ cat file3
9
10
11
12
13
$ cat file4
13
14
15
16
17
$ cat file5
17
18
19
20
If you're ever tempted to use a shell loop to manipulate text again, make sure to read https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice first to understand at least some of the reasons to use awk instead. To learn awk, get the book Effective Awk Programming, 4th Edition, by Arnold Robbins.
oh. and wrt why your awk command awk 'NR<=$start' test didn't work - awk is not shell, it has no more access to shell variables (or vice-versa) than a C program does. To init an awk variable named awkstart with the value of a shell variable named start and then use that awk variable in your script you'd do awk -v awkstart="$start" 'NR<=awkstart' test. The awk variable can also be named start or anything else sensible - it is completely unrelated to the name of the shell variable.
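A minimal sketch of that -v hand-off, using seq 20 in place of the test file so it runs anywhere (the variable name first_lines is made up for illustration):

```shell
#!/usr/bin/env bash
start=5
# -v copies the shell value into the awk variable awkstart before the script runs.
first_lines=$(seq 20 | awk -v awkstart="$start" 'NR <= awkstart')
printf '%s\n' "$first_lines"
```

With the original single-quoted 'NR<=$start', awk sees an uninitialized awk variable named start, so $start means $0 (the whole line), and the comparison happens to be true for every line of this file.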
You could improve your code by removing the unnecessary echo, cut and bc calls and doing it like this:
#!/bin/bash
for i in $(seq $(wc -l < test) ); do
(( i % 4 != 1 )) && continue
tail -n +"$i" test | head -n 5 > "file$(( 1+i/4 ))"
done
But still the awk solution is much better. Reading the file only once and taking actions based on readily available information (like the linenumber) is the way to go. In shell you have to count the lines, there is no way around it. awk will give you that (and a lot of other things) for free.
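To show how the awk approach generalizes beyond a chunk size of 5, here is a sketch wrapped in a shell function. The name split_overlap, its arguments and the output prefix are all invented for this example, and it assumes a chunk size of at least 2.

```shell
#!/usr/bin/env bash
# split_overlap FILE N PREFIX: write chunks of N lines (N >= 2) to PREFIX1,
# PREFIX2, ...; each chunk starts with the last line of the previous one.
split_overlap() {
  awk -v n="$2" -v prefix="$3" '
    (NR - 1) % (n - 1) == 0 {
      if (out != "") { print > out; close(out) }   # finish the previous chunk
      out = prefix (++c)                            # start the next chunk
    }
    { print > out }
  ' "$1"
}

dir=$(mktemp -d)
seq 20 > "$dir/test"
split_overlap "$dir/test" 5 "$dir/file"
cat "$dir/file2"
```

Each chunk adds N-1 new lines, which is why the boundary test uses n - 1. The example writes into a temporary directory; remove "$dir" when you are done with it.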
Use split:
$ seq 20 | split -l 5
$ for fn in x*; do echo "$fn"; cat "$fn"; done
xaa
1
2
3
4
5
xab
6
7
8
9
10
xac
11
12
13
14
15
xad
16
17
18
19
20
Or, if you have a file:
$ split -l 5 test_file

Read the number of columns using awk/sed

I have the following test file
Kmax Event File - Text Format
1 4 1000
65 4121 9426 12312
56 4118 8882 12307
1273 4188 8217 12309
1291 4204 8233 12308
1329 4170 8225 12303
1341 4135 8207 12306
63 4108 8904 12300
60 4106 8897 12307
731 4108 8192 12306
...
ÿÿÿÿÿÿÿÿ
In this file I want to delete the first two lines and apply some mathematical calculations. For instance each column i will be $i-(i-1)*number. A script that does this is the following
#!/bin/bash
if test $1 ; then
if [ -f $1.evnt ] ; then
rm -f $1.dat
sed -n '2p' $1.evnt | (read v1 v2 v3
for filename in $1*.evnt ; do
echo -e "Processing file $filename"
sed '$d' < $filename > $1_tmp
sed -i '/Kmax/d' $1_tmp
sed -i '/^'"$v1"' '"$v2"' /d' $1_tmp
cat $1_tmp >> $1.dat
done
v3=`wc -l $1.dat | awk '{print $1}' `
echo -e "$v1 $v2 $v3" > .$1.dat
rm -f $1_tmp)
else
echo -e "\a!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!"
echo -e " Event file $1.evnt doesn't exist !!!!!!"
echo -e "!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!"
fi
else
echo -e "\a!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!"
echo -e "!!!!! Give name for event files !!!!!"
echo -e "!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!"
fi
awk '{print $1, $2-4096, $3-(2*4096), $4-(3*4096)}' $1.dat >$1_Processed.dat
rm -f $1.dat
exit 0
The file won't always have 4 columns. Is there a way to read the number of columns, print this number and apply those calculations?
EDIT The idea is to have an input file (*.evnt), convert it to *.dat or any other ascii file(it doesn't matter really) which will only include the number in columns and then apply the calculation $i=$i-(i-1)*number. In addition it will keep the number of columns in a variable, that will be called in another program. For instance in the above file, number=4096 and a sample output file is the following
65 25 1234 24
56 22 690 19
1273 92 25 21
1291 108 41 20
1329 74 33 15
1341 39 15 18
63 12 712 12
60 10 705 19
731 12 0 18
while in the console I will get the message There are 4 detectors.
Finally a new file_processed.dat will be produced, where file is the initial name of awk's input file.
The way it should be executed is the following
./myscript <filename>
where <filename> is the name without the format. For instance, the files will have the format filename.evnt so it should be executed using
./myscript filename
Let's start with this to see if it's close to what you're trying to do:
$ numdet=$( awk -v num=4096 '
NR>2 && NF>1 {
out = FILENAME "_processed.dat"
for (i=1;i<=NF;i++) {
$i = $i-(i-1)*num
}
nf = NF
print > out
}
END {
printf "There are %d detectors\n", nf | "cat>&2"
print nf
}
' file )
There are 4 detectors
$ cat file_processed.dat
65 25 1234 24
56 22 690 19
1273 92 25 21
1291 108 41 20
1329 74 33 15
1341 39 15 18
63 12 712 12
60 10 705 19
731 12 0 18
$ echo "$numdet"
4
Is that it?
Using awk
awk 'NR<=2{next}{for (i=1;i<=NF;i++) $i=$i-(i-1)*4096}1' file
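If all you need is the number of columns in a shell variable, a smaller sketch is to print NF for the first data line and stop. The function name count_columns and the temporary sample file are assumptions made for illustration; the "data line" test (past line 2, more than one field) is borrowed from the answer above.

```shell
#!/usr/bin/env bash
# count_columns FILE: print the field count of the first data line, where
# "data" means past line 2 and with more than one field.
count_columns() {
  awk 'NR > 2 && NF > 1 { print NF; exit }' "$1"
}

sample=$(mktemp)
printf '%s\n' 'Kmax Event File - Text Format' '1 4 1000' \
  '65 4121 9426 12312' '56 4118 8882 12307' > "$sample"
numdet=$(count_columns "$sample")
echo "There are $numdet detectors"
rm -f "$sample"
```

This reads only as far as the third line, so it is cheap even for large event files, but it assumes every data line has the same number of fields.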

Floating point results in Bash integer division

I have a backup script on my server which does cron jobs of backups, and sends me a summary of files backed up, including the size of the new backup file. As part of the script, I'd like to divide the final size of the file by (1024^3) to get the file size in GB, from the file size in bytes.
Since bash does not have floating point calculation, I am trying to use pipes to bc to get the result, however I'm getting stumped on basic examples.
I tried to get the value of pi to a given scale. The following works interactively:
~ #bc -l
bc 1.06.95
Copyright 1991-1994, 1997, 1998, 2000, 2004, 2006 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
4/3
1.33333333333333333333
22/7
3.14285714285714285714
q
0
quit
A non interactive version does not work:
#echo $(( 22/7 )) | bc
3
This works:
#echo '22/7' | bc -l
3.14285714285714285714
But I need to use variables, so it doesn't help that the following does not work:
#a=22 ; b=7
#echo $(( a/b )) | bc -l
3
I'm obviously missing something in the syntax for using variables in Bash, and could use some pointers on what I've misunderstood.
As DigitalRoss said, I can use the following:
#echo $a / $b | bc -l
3.14285714285714285714
However, I can't use complex expressions like:
#echo $a / (( $b-34 )) | bc -l
-bash: syntax error near unexpected token `('
#echo $a / (( b-34 )) | bc -l
-bash: syntax error near unexpected token `('
Can someone give me the correct syntax for getting floating point results with complicated arithmetic expressions?
Just double-quote (") the expression:
echo "$a / ( $b - 34 )" | bc -l
Then bash will expand the $ variables and ignore everything else and bc will see an expression with parentheses:
$ a=22
$ b=7
$ echo "$a / ( $b - 34 )"
22 / ( 7 - 34 )
$ echo "$a / ( $b - 34 )" | bc -l
-.81481481481481481481
Please note that your echo $(( 22/7 )) | bc -l actually makes bash calculate 22/7 and then send the result to bc. The integer output is therefore not the result of bc, but simply the input given to bc.
Try echo $(( 22/7 )) without piping it to bc, and you'll see.
The scale variable determines the number of digits after the decimal point. Inside an interactive bc session:
$ bc
scale=2
3/4
.75
I would prefer awk over bc: it does the same thing in one command and also gives you more flexibility to pass in variables and format your output with printf:
# Define vars in the command
awk -v a=3 -v b=2 'BEGIN{print a/b}'
1.5
# Define shell vars earlier and use them to initialize the awk vars
c=3
d=2
awk -v a="$c" -v b="$d" 'BEGIN{print a/b}'
1.5
# Use vars that are defined in script
a=3
b=2
awk 'BEGIN{print '$a'/'$b'}'
# Format your output using C printf syntax
awk -v a=3 -v b=2 'BEGIN{printf("%.3f\n", a/b)}'
1.500
Also, bc does not return a nonzero exit status when it divides by zero, so you can't detect the error that way:
echo 3/0 | bc -l
Runtime error (func=(main), adr=5): Divide by zero
# The exit status is zero, which would normally mean there were no errors
echo $?
0
awk, on the other hand, does return a nonzero exit status (2 here):
awk -v a=3 -v b=0 'BEGIN{print a/b}'
awk: cmd. line:1: fatal: division by zero attempted
# awk returned exit status 2, which indicates that something went wrong
echo $?
2
The exit status can be used to check for division by zero like this:
# Set your own vars
if output=$(awk -v a=3 -v b=0 'BEGIN{print a/b}' 2> /dev/null); then
echo "$output"
else
echo "error, division by zero"
fi
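Coming back to the original goal of reporting a backup size in GB, the awk approach could be wrapped as in the sketch below. The function name bytes_to_gb is made up for illustration, and the byte count is hard-coded here; in a real script it would come from whatever your backup job reports.

```shell
#!/usr/bin/env bash
# bytes_to_gb BYTES: print BYTES / 1024^3 with two decimal places.
bytes_to_gb() {
  awk -v bytes="$1" 'BEGIN { printf "%.2f\n", bytes / (1024 * 1024 * 1024) }'
}

size_gb=$(bytes_to_gb 1610612736)   # 1.5 * 1024^3 bytes
echo "$size_gb"
```

awk does its arithmetic in double-precision floating point, which is more than enough for file sizes rounded to two decimals.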
You can also handle the divide-by-zero check directly in awk:
for a in 19 29 31; do
for b in 11 3 0; do
gawk -v PREC=512 -Mbe '$++NF= +(_=$NF) ? $(!!_)/_ : "div_by_zero"' \
\
CONVFMT='%.59g' OFS=' \t| ' <<< "${a} ${b}"; done; done
19 | 11 | 1.7272727272727272727272727272727272727272727272727272727273
19 | 3 | 6.3333333333333333333333333333333333333333333333333333333333
19 | 0 | div_by_zero
29 | 11 | 2.6363636363636363636363636363636363636363636363636363636364
29 | 3 | 9.6666666666666666666666666666666666666666666666666666666667
29 | 0 | div_by_zero
31 | 11 | 2.8181818181818181818181818181818181818181818181818181818182
31 | 3 | 10.333333333333333333333333333333333333333333333333333333333
31 | 0 | div_by_zero
If you don't need all that GMP precision, mawk will directly return an infinity instead of a fatal error message:
for a in 19 29 31; do for b in 11 3 0; do
mawk '$++NF=$++_/$(_+_--)' CONVFMT='%.19g' OFS='\t' <<<"$a $b";done;done
19 11 1.727272727272727293
19 3 6.333333333333333037
19 0 inf
29 11 2.636363636363636243
29 3 9.666666666666666075
29 0 inf
31 11 2.818181818181818343
31 3 10.33333333333333393
31 0 inf
Or better yet, do it with a single call to awk instead of calling it once per pair:
for a in 19 29 31; do for b in 11 3 0; do
echo "${a} ${b}"
done; done | mawk '$++NF = $(++_) / $(_+_--)' CONVFMT='%.19g' OFS='\t'
19 11 1.727272727272727293
19 3 6.333333333333333037
19 0 inf
29 11 2.636363636363636243
29 3 9.666666666666666075
29 0 inf
31 11 2.818181818181818343
31 3 10.33333333333333393
31 0 inf
Or, if you prefer, have mawk call gawk (with GMP support) indirectly:
printf '22 7\n22 4\n22 0\n' |
mawk '$++NF = substr(_=__="", (__="gawk -v PREC=65536 -Mbe"\
" \47BEGIN { printf(\"%.127f\","(+(_=$(NF-!_))\
? "("($!__)")/("(_)")" : (+(_=$!__)<-_?__:"-") \
"log(_<_)")") } \47" ) | getline _, close(__))_'
22 7 3.1428571428571428571428571428571428571428571428571428………
22 4 5.5000000000000000000000000000000000000000000000000000………
22 0 +inf
