Extracting numbers from a variable in a bash script

I have a bash script, test.sh, that prints some statistics for each sample number when it runs (example shown below).
I want to store the output in a variable and extract the Mean value for testNum 11 and testNum 52.
A=`./test.sh` # all of the output is now in the variable A
Now I need to extract the mean value for 11, which is -128, and for 52, which is -96.
I have been trying to work out how to do this.
Can anyone help me with this, please?
This is the example test script, test.sh:
#!/bin/sh
echo Valid Numbers per States:3G phones
echo testNum N_States Mean Value
echo 1 10 -128
echo 2 10 -95
echo 3 10 -94
echo 4 10 -94
echo 5 10 -94
echo 6 10 -128
echo 7 10 -91
echo 8 10 -94
echo 9 10 -94
echo 10 10 -94
echo 11 10 -128
echo --------------------------------------------
echo Valid Numbers per States :4G phones
echo testNum N_States Mean Value
echo 36 10 -95
echo 40 10 -95
echo 44 10 -95
echo 48 10 -95
echo 52 10 -96
echo 56 10 -95
echo 60 10 -96
echo 64 10 -96
echo 100 10 -99
echo 104 10 -97
echo 108 10 -98
echo 112 9 -98
echo 116 9 -98
echo 120 9 -99
echo 124 9 -98
echo 128 9 -98
echo 132 9 -98
echo 136 9 -99
echo 140 9 -99
echo 144 9 -99
echo 149 9 -98
echo 153 9 -99
echo 157 9 -99
echo 161 9 -99
echo 165 9 -98
echo --------------------------------------------
I have tried the command echo "$x" | grep -w '^52' | cut -d' ' -f3
But the Linux on my proprietary hardware doesn't seem to accept ^ (I'm not sure whether it's the version or something else). The command works fine in any other bash shell, but on that hardware it outputs nothing.
So I started trying awk instead.
Here temp is 128
and NF_Est holds the output of the script:
echo "$NF_Est" | grep -w 128 | awk '{if ($1==128)print $0}' | tr -s " " | cut -d" " -f3
But this does not work if the "temp" value appears in multiple columns.
Any suggestions on where I am going wrong, or whether this can be done in a much simpler way?

You really want awk for this:
$ ./test.sh | awk '$1==11{print $3}'
-128
$ ./test.sh | awk '$1==52{print $3}'
-96
If you want to extract the value from $A instead of running the script, just do: echo "$A" | awk ...
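A minimal sketch of the echo "$A" | awk ... variant, assuming the output is already stored in A. The sample data is just a few rows copied from test.sh above, and mean_for is a hypothetical helper name, not something from the original answer.

```shell
# A few rows of the test.sh output shown in the question, stored in A.
A='Valid Numbers per States:3G phones
testNum N_States Mean Value
10 10 -94
11 10 -128
--------------------------------------------
Valid Numbers per States :4G phones
testNum N_States Mean Value
52 10 -96
112 9 -98'

# mean_for (hypothetical helper): print column 3 of the row whose
# first column equals the requested test number.
mean_for() {
    printf '%s\n' "$A" | awk -v n="$1" '$1 == n { print $3 }'
}

mean_for 11   # prints -128
mean_for 52   # prints -96
```

Passing the number with -v keeps the awk program in single quotes, so no shell quoting tricks are needed when the test number comes from a variable.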

You can use grep and cut to extract the information:
echo "$x" | grep -w '^11' | cut -d' ' -f3
echo "$x" | grep -w '^52' | cut -d' ' -f3
grep filters its input, outputting only lines that match the given pattern. ^ matches at the beginning of a line. -w means "match whole words"; without it, grep would also output the lines for 112 and 116.
cut extracts columns from its input. -d specifies the delimiter, a space in this case, and -f says which columns to extract.
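To see the effect of -w described above, here is a small sketch on three rows taken from the question's output. It assumes a grep that supports -w (GNU and BSD grep do); the variable names are only for illustration.

```shell
# Three rows from the question's output that all start with "11".
x='11 10 -128
112 9 -98
116 9 -98'

# -c counts matching lines instead of printing them.
without_w=$(printf '%s\n' "$x" | grep -c '^11')    # 11, 112 and 116 all match
with_w=$(printf '%s\n' "$x" | grep -c -w '^11')    # only the whole word 11 matches

# Third space-separated column of the single matching row.
mean=$(printf '%s\n' "$x" | grep -w '^11' | cut -d' ' -f3)

echo "without -w: $without_w, with -w: $with_w, mean: $mean"
```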

Related

How can I use 'echo' output as an operand for the 'seq' command within a terminal?

I have an exercise where I need to sum together every digit up to a given number, like this:
Suppose I have the number 12, I need to do 1+2+3+4+5+6+7+8+9+1+0+1+1+1+2.
(numbers past 9 are split up into their separate digits eg. 11 = 1+1, 234 = 2+3+4, etc.)
I know I can just use:
seq -s '' 12
which outputs 123456789101112, then put a '+' between the digits and pipe the result to 'bc'. BUT I specifically have to start with:
echo 12 | ...
as the first step (the online IDE fills it in as the unchangeable first step for every test case), and when I do this I start to have problems with seq.
I tried
echo 12 | seq -s '' $1
### or just ###
echo 12 | seq -s ''
but I can't get it to work; seq just gives back a missing operand error (I assume because I'm in the terminal, not in a script, so the 12 isn't assigned to $1). Any recommendations on how to avoid this, how to get seq to interpret the 12 from echo as its operand, or alternative ways to go?
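seq does not read its operand from standard input, so use command substitution with cat to read the piped number and pass it as an argument:
seq -s '' $(cat)
Full solution: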
echo "12" | seq -s '' $(cat) | sed 's/./&+/g; s/$/0/' | bc
Or
echo 12 | { echo $(( $({ seq -s '' $(< /dev/stdin); echo; } | sed -E 's/([[:digit:]])/\1+/g; s/$/0/') )); }
Without sed (the & in the replacement string needs bash 5.2 or later):
d=$(echo 12 | { seq -s '' $(< /dev/stdin); echo; }); echo $(( "${d//?/&+}0" ))
echo 12 | awk '{
cnt=0
for(i=1;i<=$1;i++) {
cnt+=i
printf("%s%s",i,i<$1?"+":"=")
}
print cnt
}'
Prints:
1+2+3+4+5+6+7+8+9+10+11+12=78
If it is supposed to be just the digits added up:
echo 12 | awk '{s=""
for(i=1;i<=$1;i++) s=s i
split(s,ch,"")
for(i=1;i<=length(ch); i++) cnt+=ch[i]
print cnt
}'
51
Or as a pipeline (note that seq itself is not specified by POSIX):
$ echo 12 | seq -s '' "$(cat)" | sed -E 's/([0-9])/\1+/g; s/$/0/' | bc
51
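The answers above all turn the piped number into an argument for seq. As another sketch of the same idea, the function below reads the number with read instead of $(cat), splits the sequence into single digits with fold, joins them with + and lets shell arithmetic do the sum, so neither sed nor bc is needed. digit_sum is a hypothetical name, and seq is assumed to be available.

```shell
# digit_sum (hypothetical helper): read N from standard input and add up
# every digit of the numbers 1..N.
digit_sum() {
    read -r n
    # seq prints one number per line, fold -w 1 puts every digit on its
    # own line, and paste -s -d + joins those lines into "1+2+...".
    echo $(( $(seq "$n" | fold -w 1 | paste -s -d + -) ))
}

echo 12 | digit_sum   # prints 51
```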

How to sample 50 random files from my dataset, with each file having the same probability of being taken, in a shell script?

Does find /mnt/Dataset/ -type f | shuf -n 50 do the trick?
Does shuf wait to count all the lines and then do a random selection? Does shuf give the same probability to each line? Or should I use another tool?
When you are wondering how shuf works with the pipeline (whether it waits for the pipeline to finish or processes data as it becomes available), you can write a test. The test will look like:
for ((i=0; i<20; i++)); do
(printf "%s\n" {1..9}; sleep 0.1; echo 10) | shuf | tr '\n' ' '
echo
done
This test is without the -n option, and you want a larger sample to look at the averages. The next loop is better for testing:
for ((i=0; i<10000; i++)); do
(printf "%s\n" {1..9}; sleep 0.01; echo 10) | shuf | tr '\n' ' '
echo
done > sample.txt
# Look for how often 10 is the last number on a line
grep -c "10 $" sample.txt
I also counted how often each number comes first on a line:
cut -d " " -f1 sample.txt | sort | uniq -c
1040 1
985 10
976 2
1012 3
981 4
999 5
1043 6
974 7
979 8
1011 9
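I did not run a statistical check on this sample, but it looks like a uniform random distribution. The late-arriving 10 comes first about as often as any other number, which suggests that shuf reads all of its input before shuffling.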
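For the original use case, a small self-contained sketch of find piped into shuf -n is shown below. It assumes GNU shuf is installed; the temporary directory and the file names are made up for the example and stand in for /mnt/Dataset/. With -n, shuf outputs at most that many distinct input lines, so no file can be picked twice.

```shell
# Build a throwaway directory with 20 empty files (hypothetical names).
tmpdir=$(mktemp -d)
for i in $(seq 1 20); do
    : > "$tmpdir/file$i.dat"
done

# Pick 5 of the 20 files at random.
sample=$(find "$tmpdir" -type f | shuf -n 5)
printf '%s\n' "$sample"

# Count the picked files and check that none of them repeats.
sample_count=$(printf '%s\n' "$sample" | wc -l)
unique_count=$(printf '%s\n' "$sample" | sort -u | wc -l)

rm -rf "$tmpdir"
```

File names containing newlines would break this line-based approach; for such data, find -print0 together with shuf -z is one option to look into.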

Sum/Average numbers in a single line - UNIX

I'm working on a small script to take 3 numbers in a single line, sum and average them, and print the result at the end of the line. I know how to use the paste command, but everything I'm finding is telling me how to average a column. I need to average a line, not a column. Any advice? Thanks!
awk to the rescue!
$ echo 1 2 3 | awk -v RS=' ' '{sum+=$1; count++} END{print sum, sum/count}'
6 2
This works for any number of input fields:
$ echo 1 2 3 4 | awk -v RS=' ' '{sum+=$1; count++} END{print sum, sum/count}'
10 2.5
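Since the question asks for the result at the end of the line, a variant worth sketching is to keep the default record separator and loop over the fields of each line instead. line_stats is a hypothetical name; the output format (line, then sum, then mean) is an assumption about what is wanted.

```shell
# line_stats (hypothetical helper): for every non-empty input line, print
# the line followed by the sum and the mean of its fields.
line_stats() {
    awk 'NF { sum = 0
              for (i = 1; i <= NF; i++) sum += $i
              print $0, sum, sum / NF }'
}

echo "1 2 3" | line_stats     # prints: 1 2 3 6 2
echo "1 2 3 4" | line_stats   # prints: 1 2 3 4 10 2.5
```

Because each line is handled on its own, this also works on a file with many lines, each with a different number of values.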
You can manipulate your line before giving it to bc. With bc you have additional possibilities such as setting the scale.
A simple mean from 1 2 3 would be
echo "1 2 3" | sed -e 's/\([0-9.]\) /\1+/g' -e 's/.*/(&)\/3/' | bc
You can wrap it in a function and see more possibilities:
function testit {
    # "$*" is the list of numbers; $# is how many there are (the divisor)
    echo "Input $*"
    echo "Integer mean"
    echo "$*" | sed -e 's/\([0-9.]\) /\1+/g' -e 's/.*/(&)\/'$#'/' | bc
    echo "floating decimal mean"
    echo "$*" | sed -e 's/\([0-9.]\) /\1+/g' -e 's/.*/(&)\/'$#'/' | bc -l
    echo "2 decimal output mean"
    echo "$*" | sed -e 's/\([0-9.]\) /\1+/g' -e 's/.*/scale=2; (&)\/'$#'/' | bc
    echo
}
testit 4 5 6
testit 4 5 8
testit 4.2 5.3 6.4
testit 1 2 3 4 5 6 7 8 9

Replacing every element in a matrix

I have a table of numbers like this --
13030 11537 40387 38
31500 174 40387 38
8928 7132 40387 40387 40387 40387 38
1299 174 40387 38
Not all of the rows have the same number of columns.
I want to make another table of the same size by replacing each number with the output of cmd $number, where cmd is a generic bash command (possibly a pipeline). And I want to do the whole thing in bash.
Can this be done?
The next program will do it:
while read line
do
for num in $line
do
result=$( cmd $num )
echo -n "$result "
done
echo
done
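To make the loop above concrete, here is a sketch with a stand-in for cmd. The cmd function below simply doubles its argument and is purely hypothetical; transform_table is likewise just an illustrative name. The sketch also trims the trailing space that echo -n "$result " would leave at the end of each row.

```shell
# Stand-in for the generic command: doubles the number (assumption).
cmd() {
    echo $(( $1 * 2 ))
}

# transform_table (hypothetical helper): read a table from standard input
# and print a table of the same shape with cmd applied to every number.
transform_table() {
    while read -r line; do
        out=
        for num in $line; do      # unquoted on purpose: split into numbers
            out="$out$(cmd "$num") "
        done
        printf '%s\n' "${out% }"  # drop the trailing space
    done
}

printf '13030 11537 40387 38\n1299 174\n' | transform_table
# prints:
# 26060 23074 80774 76
# 2598 348
```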
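I like using xargs. Replace echo %s + 1 | bc with your command. My example adds one to each number. Note that the command below prints the generated echo commands rather than running them, so pipe its output to bash to get the resulting table.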
xargs -L 1 -i bash -c "printf 'echo -n \"\$(echo %s + 1 | bc) \";' {} ; echo 'echo;'"

How can I echo a line once, then keep the rest of the lines the way they are, in unix bash?

I have the following command:
(for i in 'cut -d "," -f1 file.csv | uniq`; do var =`grep -c $i file.csv';if (($var > 1 )); then echo " you have the following repeated numbers" $i ; fi ; done)
The output that I get is:
You have the following repeated numbers 455
You have the following repeated numbers 879
You have the following repeated numbers 741
What I want is the following output:
you have the following repeated numbers:
455
879
741
Try moving the echo of the header line before the for loop:
(echo " you have the following repeated numbers"; for i in $(cut -d "," -f1 file.csv | uniq); do var=$(grep -c $i file.csv); if (($var > 1 )); then echo $i ; fi ; done)
Or only print the header once:
(header=" you have the following repeated numbers\n"; for i in $(cut -d "," -f1 file.csv | uniq); do var=$(grep -c $i file.csv); if (($var > 1 )); then echo -e "$header$i" ; header=""; fi ; done)
Well, here's what I came up with:
1) generated input for testing
for x in {1..35},aa,bb ; do echo $x ; done > file.csv
for x in {21..48},aa,bb ; do echo $x ; done >> file.csv
for x in {32..63},aa,bb ; do echo $x ; done >> file.csv
unsort file.csv > new.txt ; mv new.txt file.csv
2) your line (with syntax errors corrected)
dtpwmbp:~ pwadas$ for i in $(cut -d "," -f1 file.csv | uniq);
do var=`grep -c $i file.csv`; if [ "$var" -ge 1 ] ;
then echo " you have the following repeated numbers" $i ; fi ; done | head -n 10
you have the following repeated numbers 8
you have the following repeated numbers 41
you have the following repeated numbers 18
you have the following repeated numbers 34
you have the following repeated numbers 3
you have the following repeated numbers 53
you have the following repeated numbers 32
you have the following repeated numbers 33
you have the following repeated numbers 19
you have the following repeated numbers 7
dtpwmbp:~ pwadas$
3) my line:
dtpwmbp:~ pwadas$ echo "you have the following repeated numbers:";
for i in $(cut -d "," -f1 file.csv | uniq); do var=`grep -c $i file.csv`;
if [ "$var" -ge 1 ] ; then echo $i ; fi ; done | head -n 10
you have the following repeated numbers:
8
41
18
34
3
53
32
33
19
7
dtpwmbp:~ pwadas$
I added quotes, changed the (( )) test to a [ .. ] expression, and finally moved the description sentence out of the loop. The number of occurrences tested for is the number after "-ge" in the condition. If it is "1", then numbers which appear once or more are printed. Note that, with this expression, if the file contains e.g. the numbers
8
12
48
then "8" is counted as appearing twice, because grep -c 8 also matches the line containing 48. With "-ge 2", if no numbers appear more than once, nothing is printed except the heading.
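One way to avoid the substring problem described above is to skip grep altogether and let sort and uniq -d find the values that occur more than once. The sketch below is an alternative, not what the answers above do; report_repeats is a hypothetical name, and the CSV content is invented test data.

```shell
# report_repeats (hypothetical helper): print the header once, followed by
# every first-column value that occurs more than once in the CSV file $1.
report_repeats() {
    # uniq -d prints one copy of each line that is repeated; it needs
    # sorted input because it only compares adjacent lines.
    dups=$(cut -d, -f1 "$1" | sort | uniq -d)
    if [ -n "$dups" ]; then
        echo "you have the following repeated numbers:"
        printf '%s\n' "$dups"
    fi
}

# Invented test data: 455, 879 and 741 repeat; 8 and 48 occur once each.
csv=$(mktemp)
printf '%s\n' 455,a 879,b 8,c 741,d 455,e 48,f 879,g 741,h > "$csv"

result=$(report_repeats "$csv")
printf '%s\n' "$result"
rm -f "$csv"
```

Here 8 is not reported even though 48 contains an 8, because whole values are compared rather than substrings. The header is also skipped when nothing repeats.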

Resources