variable error in bash when doing calculation - bash

I assigned output of piping into a variable, but when I try to use the variable to do math, it won't allow me:
%%bash
cd /data/ref/
grep -v ">" EN | wc -c > ref
cat ref
cd /example/
grep -v ">" SR | wc -l > sample
cat sample
echo $((x= cat sample, y= cat ref, u=x/y, z=u*100))
I get this error:
41858
38986
bash: line 7: x= cat sample, y= cat ref, u=x/y, z=u*100: syntax error in expression (error token is "sample, y= cat ref, u=x/y, z=u*100")

You received that error because you passed an invalid arithmetic expression to a bash arithmetic expansion. Only an arithmetic expression is allowed inside $(( )); a command such as cat sample is not executed there. What you are trying to do looks like this:
ref="$(grep -v ">" /data/ref/EN | wc -c)"
sample="$(grep -v ">" /example/SR | wc -l)"
# this is only integer division
#u=$(( sample / ref ))
#z=$(( 100 * u ))
# to do math calculations, you can use bc
u=$(bc <<< "scale=2; $sample/$ref")
z=$(bc <<< "scale=2; 100*$u")
printf "%d, %d, %.2f, %.2f\n" "$ref" "$sample" "$u" "$z"
so hopefully you get an output like this:
41858, 38986, 0.93, 93.00
Notes:
There is no need to cd before running grep: it accepts the full path to the target file as an argument, so you can grep files in various locations without changing directory.
To save the output of your command (which is only a number), you don't need to write it to a file and cat the file. Just use the syntax var=$( ) and var will be assigned the output of the command substitution.
Keep in mind that / gives 0 for the division 38986/41858, because it is integer division. If you want to do calculations with decimals, you can see this post for how to do them using bc.
To print the values, use the shell builtin printf. Here the last two numbers are formatted with two decimal places.
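If bc is not available, a percentage with two decimal places can also be obtained with integer arithmetic alone, by scaling before dividing. The following is a minimal sketch under that assumption; it reuses the two counts from the question, and the variable names int_ratio, scaled and percent are illustrative only.

```shell
ref=41858      # character count reported in the question
sample=38986   # line count reported in the question

# Plain integer division truncates, so the ratio collapses to 0.
int_ratio=$(( sample / ref ))

# Multiply first and divide last: this is the percentage times 100.
scaled=$(( 10000 * sample / ref ))

# Split the scaled value into its whole and fractional parts.
percent=$(printf '%d.%02d' "$(( scaled / 100 ))" "$(( scaled % 100 ))")

echo "ratio=$int_ratio percent=$percent"   # ratio=0 percent=93.13
```

This prints 93.13 rather than 93.00 because the ratio is not truncated to 0.93 before being multiplied by 100.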

Related

How to parse multiple line output as separate variables

I'm relatively new to bash scripting and I would like someone to explain this properly, thank you. Here is my code:
#! /bin/bash
echo "first arg: $1"
echo "first arg: $2"
var="$( grep -rnw $1 -e $2 | cut -d ":" -f1 )"
var2=$( grep -rnw $1 -e $2 | cut -d ":" -f1 | awk '{print substr($0,length,1)}')
echo "$var"
echo "$var2"
The problem I have is with the output, the script I'm trying to write is a c++ function searcher, so upon launching my script I have 2 arguments, one for the directory and the second one as the function name. This is how my output looks like:
first arg: Projekt
first arg: iseven
Projekt/AX/include/ax.h
Projekt/AX/src/ax.cpp
h
p
Now my question is: how can I save the line-by-line output as variables, so that later on I can use var as a path, or use var2 as a character to compare? My plan was to use if statements to determine the type, for example: IF(last_char == p){echo:"something"}. What I've tried is the approach from this question: Capturing multiple line output into a Bash variable, and then treating the result as an array, so my code looked like "${var[0]}". Please explain how I can use my line output later on, as variables.
I'd use readarray to populate an array variable, in case your command's output contains spaces that shouldn't act as field separators and would otherwise mess up foo=( ... ). You can also use the shell's substring parameter expansion to get the last character of a variable, so there's no need for the awk step in your var2:
#!/usr/bin/env bash
readarray -t lines < <(printf "%s\n" "Projekt/AX/include/ax.h" "Projekt/AX/src/ax.cpp")
for line in "${lines[@]}"; do
printf "%s\n%s\n" "$line" "${line: -1}" # Note the space before the -1
done
will display
Projekt/AX/include/ax.h
h
Projekt/AX/src/ax.cpp
p
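To connect this with the plan of branching on the last character, here is a small sketch that classifies each path with a case statement. It assumes bash 4 or later (for readarray) and reuses the two paths from the question; the labels header, source and other, and the array name kinds, are made up for illustration.

```shell
#!/usr/bin/env bash
# Read each output line into its own array element.
readarray -t paths < <(printf '%s\n' "Projekt/AX/include/ax.h" "Projekt/AX/src/ax.cpp")

kinds=()
for path in "${paths[@]}"; do
  # ${path: -1} is the last character; the space before -1 is required.
  case "${path: -1}" in
    h) kinds+=("header") ;;
    p) kinds+=("source") ;;
    *) kinds+=("other") ;;
  esac
done

# Each element can now be used as an ordinary variable.
for i in "${!paths[@]}"; do
  printf '%s is a %s file\n' "${paths[i]}" "${kinds[i]}"
done
```

In a real script the printf inside the process substitution would be replaced by the grep pipeline from the question.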

Why does my bash script flag this awk substring command as a syntactic error when it works in the terminal?

I'm trying to extract a list of dates from a series of links using lynx's dump function and piping the output through grep and awk. This operation works successfully in the terminal and outputs dates accurately. However, when it is placed into a shell script, bash claims a syntax error:
Scripts/ETC/PreD.sh: line 18: syntax error near unexpected token `('
Scripts/ETC/PreD.sh: line 18: ` lynx --dump "$link" | grep -m 1 Date | awk '{print substr($0,10)}' >> dates.txt'
For context, this is part of a while-read loop in which $link is being read from a file. Operations undertaken inside this while-loop when the awk command is removed are all successful, as are similar while-loops that include other awk commands.
I know that either I'm misunderstanding how bash handles variable substitution, or how bash handles awk commands, or some combination of the two. Any help would be immensely appreciated.
EDIT: Shellcheck is divided on this, the website version finds no error, but my downloaded version provides error SC1083, which says:
This { is literal. Check expression (missing ;/\n?) or quote it.
A check on the Shellcheck GitHub page provides this:
This error is harmless when the curly brackets are supposed to be literal, in e.g. awk {'print $1'}.
However, it's cleaner and less error prone to simply include them inside the quotes: awk '{print $1}'.
Script follows:
#!/bin/bash
while read -u 4 link
do
IFS=/ read a b c d e <<< "$link"
echo "$e" >> 1.txt
lynx --dump "$link" | grep -A 1 -e With: | tr -d [:cntrl:][:digit:][] | sed 's/\With//g' | awk '{print substr($0,10)}' | sed 's/\(.*\),/\1'\ and'/' | tr -s ' ' >> 2.txt
lynx --dump "$link" | grep -m 1 Date | awk '{print substr($0,10)}' >> dates.txt
done 4< links.txt
In the sed command you have an unmatched ', due to an unquoted '.
In the awk script you have a constant zero-length variable.
From gawk manual:
substr(string, start [, length ])
Return a length-character-long substring of string, starting at character number start. The first character of a string is character
number one. For example, substr("washington", 5, 3) returns "ing".
If length is not present, substr() returns the whole suffix of string that begins at character number start. For example,
substr("washington", 5) returns "ington". The whole suffix is also
returned if length is greater than the number of characters remaining
in the string, counting from character start.
If start is less than one, substr() treats it as if it was one. (POSIX doesn’t specify what to do in this case: BWK awk acts this way,
and therefore gawk does too.) If start is greater than the number of
characters in the string, substr() returns the null string. Similarly,
if length is present but less than or equal to zero, the null string
is returned.
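The behaviour quoted above can be checked directly from the shell. This is a quick sketch; the sample input line with Date: is a hypothetical stand-in for one line of lynx output, not taken from the real pages.

```shell
# Three characters starting at position 5.
first=$(awk 'BEGIN { print substr("washington", 5, 3) }')

# No length given: everything from position 5 to the end.
rest=$(awk 'BEGIN { print substr("washington", 5) }')

# Hypothetical dump line: the date text starts at column 10.
date_part=$(echo "   Date: 2021-03-04" | awk '{ print substr($0, 10) }')

echo "$first / $rest / $date_part"   # ing / ington / 2021-03-04
```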
Also I suggest you combine grep|awk|sed|tr into single awk script. And debug the awk script with printouts.
From:
lynx --dump "$link" | grep -A 1 -e With: | tr -d [:cntrl:][:digit:][] | sed 's/\With//g' | awk '{print substr($0,10,length)}' | sed 's/\(.*\),/\1'\ and'/' | tr -s ' ' >> 2.txt
To:
lynx --dump "$link" | awk '/With/{found=1;next} found{found=0; s=substr($0,10); gsub(/ +/," ",s); if (match(s,/.*,/)) s=substr(s,1,RLENGTH-1) " and" substr(s,RLENGTH+1); print s}' >> 2.txt
From:
lynx --dump "$link" | grep -m 1 Date | awk '{print substr($0,10,length)}' >> dates.txt
To:
lynx --dump "$link" | awk '/Date/{print substr($0,10); exit}' >> dates.txt

bash scripting, how to parse string separated with :

I have lines that look like these
value: "15"
value: "20"
value: "3"
I am getting this as input pipe after grepping
... | grep value:
What I need is a simple bash script that takes this pipe and produce me the sum
15 + 20 + 3
So my command will be:
... | grep value: | calculate_sum_value > /tmp/sum.txt
sum.txt should contain a single number which is the sum.
How can I do with bash? I have no experience with bash at all.
You could try awk. Something like this should work
... | grep value: | awk '{sum+=$2}END{print sum}'
And you could possibly avoid grep altogether like this
.... | awk '/^value:/{sum+=$2}END{print sum}'
Update:
You can add the " character as a field separator with the -F option.
... | awk -F\" '/^value:/{sum+=$2}END{print sum}'
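To see why the field separator matters, the sketch below feeds the three sample lines from the question to both variants. With the default separator, $2 is the quoted string "15", which awk converts to 0 in a numeric context; with " as the separator, $2 is the bare number. The function name sample_input is an assumption standing in for the real pipeline.

```shell
# Stand-in for the real pipeline: the three lines from the question.
sample_input() {
  printf '%s\n' 'value: "15"' 'value: "20"' 'value: "3"'
}

# Default field splitting: $2 still carries the quotes and counts as 0.
plain_sum=$(sample_input | awk '/^value:/ { sum += $2 } END { print sum }')

# Splitting on the double quote: $2 is the number itself.
quoted_sum=$(sample_input | awk -F'"' '/^value:/ { sum += $2 } END { print sum }')

echo "default FS: $plain_sum, quote FS: $quoted_sum"   # default FS: 0, quote FS: 38
```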
My first try was to grab the stuff on the right of the colon and let bash sum it:
$ sum=0
$ cat sample.txt | while IFS=: read key value; do ((sum += value)); done
bash: ((: "15": syntax error: operand expected (error token is ""15"")
bash: ((: "20": syntax error: operand expected (error token is ""20"")
bash: ((: "3": syntax error: operand expected (error token is ""3"")
0
So, have to remove the quotes. Fine, use a fancy Perl regex to extract the first set of digits to the right of the colon:
$ cat sample.txt | grep -oP ':\D+\K\d+'
15
20
3
OK, onwards:
$ cat sample.txt | grep -oP ':\D+\K\d+' | while read n; do ((sum+=n)); done; echo $sum
0
Huh? Oh yeah, running while in a pipeline puts the modifications to sum in a subshell, not in the current shell. Well, do the echo in the subshell too:
$ cat sample.txt | grep -oP ':\D+\K\d+' | { while read n; do ((sum+=n)); done; echo $sum; }
38
That's better, but still the value is not in the current shell. Let's try something trickier
$ set -- $(cat sample.txt | grep -oP ':\D+\K\d+')
$ sum=$(IFS=+; bc <<< "$*")
$ echo $sum
38
And yes, UUOC, but it's a placeholder for whatever the OP's pipeline was.
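Another way to keep the sum in the current shell is to feed the loop through process substitution instead of a pipe, so the while loop is not run in a subshell. This is a bash-specific sketch; the printf stands in for the real pipeline, and grep -Eo is used here in place of the Perl regex as a simplification that relies on the lines containing no other digits.

```shell
#!/usr/bin/env bash
sum=0

# The loop runs in the current shell; only the producer runs in a subprocess.
while read -r n; do
  sum=$(( sum + n ))
done < <(printf '%s\n' 'value: "15"' 'value: "20"' 'value: "3"' | grep -Eo '[0-9]+')

echo "$sum"   # 38
```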

piping files in unix for "wc" while retaining filename

I have a bunch of files of the form myfile[somenumber] that are in nested directories.
I want to generate a line count on each of the files, and output that count to a file.
These files are binary and so they have to be piped through an additional script open_file before they can be counted by "wc". I do:
ls ~/mydir/*/*/other_dir/myfile* | while read x; do open_file $x | wc -l; done > stats
this works, but the problem is that it outputs the line counts to the file stats without saying the original filename. for example, it outputs:
100
150
instead of:
/mydir/...pathhere.../myfile1: 100
/mydir/...pathhere.../myfile2: 150
Second question:
What if I wanted to divide the number of wc -l by a constant, e.g. dividing it by 4, before outputting it to the file?
I know that the number of lines is a multiple of 4 so the result should be in an integer. Not sure how to do that from the above script.
how can I make it put the original filename and the wc -l result in the output file?
thank you.
You can output the file name before counting the lines:
echo -n "$x: " ; open_file $x | wc -l
The -n parameter to echo omits the trailing newline in the output.
To divide integers, you can use expr, e.g., expr $(open_file $x | wc -l) / 4.
So, the complete while loop will look as follows:
while read x; do echo -n "$x: " ; expr $(open_file $x | wc -l) / 4 ; done
Try this:
while read x; do echo -n "$x: " ; s=$(open_file $x | wc -l); echo $(($s / 4)); done
You've thrown away the filename by the time you get to wc(1) -- all it ever sees is a pipe(7) -- but you can echo the filename yourself before opening the file. If open_file fails, this will leave you with an ugly output file, but it might be a suitable tradeoff.
The $((...)) syntax is arithmetic expansion. It is specified by POSIX, so it works in bash and other POSIX shells, but it might not work in a legacy Bourne shell.
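Putting the pieces together, the loop can also be written with a glob instead of parsing ls, which avoids trouble with spaces in file names. The sketch below is self-contained only for demonstration: open_file is defined as a hypothetical stand-in that just calls cat, and the two files are created in a temporary directory.

```shell
# Hypothetical stand-in for the real open_file script.
open_file() { cat "$1"; }

# Demo data: one file with 4 lines, one with 8.
tmpdir=$(mktemp -d)
printf 'a\nb\nc\nd\n' > "$tmpdir/myfile1"
printf '1\n2\n3\n4\n5\n6\n7\n8\n' > "$tmpdir/myfile2"

stats=$(
  for f in "$tmpdir"/myfile*; do
    lines=$(open_file "$f" | wc -l)
    # ${f##*/} keeps only the file name; use "$f" to print the full path.
    printf '%s: %d\n' "${f##*/}" "$(( lines / 4 ))"
  done
)

printf '%s\n' "$stats"
rm -rf "$tmpdir"
```

For the real directory layout, the glob would be ~/mydir/*/*/other_dir/myfile* and the output would be redirected to stats.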

How to do exponentiation in Bash

I tried
echo 10**2
which prints 10**2. How to calculate the right result, 100?
You can use the let builtin:
let var=10**2 # sets var to 100.
echo $var # prints 100
or arithmetic expansion:
var=$((10**2)) # sets var to 100.
Arithmetic expansion has the advantage of allowing you to do shell arithmetic and then just use the expression without storing it in a variable:
echo $((10**2)) # prints 100.
For large numbers you might want to use the exponentiation operator of the external command bc as:
bash:$ echo 2^100 | bc
1267650600228229401496703205376
If you want to store the above result in a variable you can use command substitution either via the $() syntax:
var=$(echo 2^100 | bc)
or the older backtick syntax:
var=`echo 2^100 | bc`
Note that command substitution is not the same as arithmetic expansion:
$(( )) # arithmetic expansion
$( ) # command substitution
Various ways:
Bash
echo $((10**2))
Awk
awk 'BEGIN{print 10^2}' # POSIX standard
awk 'BEGIN{print 10**2}' # GNU awk extension
bc
echo '10 ^ 2' | bc
dc
dc -e '10 2 ^ p'
Note that var=$((echo 2^100 | bc)) doesn't work: bash tries to evaluate the contents of (( )) as arithmetic, finds a command pipeline there instead, and reports an error.
var=$(echo 2^100 | bc) works because the value is the output of the command executed inside ( ).
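As a compact comparison of the approaches above: bash arithmetic handles integer powers that fit in its integer type (assumed here to be 64-bit signed, which is the usual case), while awk can also take fractional exponents. The variable names are illustrative.

```shell
#!/usr/bin/env bash
# Integer exponentiation with arithmetic expansion.
small=$(( 10 ** 2 ))
big=$(( 2 ** 62 ))      # still fits in a signed 64-bit integer

# awk works in floating point, so a fractional exponent is allowed.
root=$(awk 'BEGIN { print 2 ^ 0.5 }')

echo "$small $big $root"
```

Results beyond the integer range, such as 2^100, need bc or a similar arbitrary-precision tool.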

Resources