Set variable in current shell from awk - bash

Is there a way to set a variable in my current shell from within awk?
I'd like to do some processing on a file and print out some data; since I'll read the whole file through, I'd like to save the number of lines -- in this case, FNR.
However, I can't seem to find a way to set a shell variable to the FNR value; failing that, I'd have to read FNR back from my output file to set, say, num_lines.
I've tried some combinations using awk 'END{system(...)}', but could not get it to work. Is there any way around this?

Here's another way.
This is especially useful when you've got the values of your variables in a single variable and you want to split them up. For example, you have a list of values from a single row in a database that you want to create variables out of.
val="hello|beautiful|world" # assume this string comes from a database query
read a b c <<< $( echo ${val} | awk -F"|" '{print $1" "$2" "$3}' )
echo $a #hello
echo $b #beautiful
echo $c #world
We need the 'here string', i.e. <<<, in this case because piping into read would run read in a subshell, so the variables it sets would not survive in the current shell; the here string feeds the data to read on stdin in the current shell instead.
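If the fields are separated by a single character, the same split can be done without awk at all; a minimal sketch of that variant:
val="hello|beautiful|world"
IFS='|' read -r a b c <<< "$val"   # IFS is changed only for this one read
echo $a #hello
echo $b #beautiful
echo $c #world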

$ echo "$var"
$ declare $( awk 'BEGIN{print "var=17"}' )
$ echo "$var"
17
Here's why you should use declare instead of eval:
$ eval $( awk 'BEGIN{print "echo \"removing all of your files, ha ha ha....\""}' )
removing all of your files, ha ha ha....
$ declare $( awk 'BEGIN{print "echo \"removing all of your files\""}' )
bash: declare: `"removing': not a valid identifier
bash: declare: `files"': not a valid identifier
Note in the first case that eval executes whatever string awk prints, which could accidentally be a very bad thing!

You can't export variables from a subshell to its parent shell. You have some other choices, though, including:
Make another pass of the file using AWK to count records, and use command substitution to capture the result. For example:
FNR=$(awk 'END {print FNR}' filename)
Print FNR in the subshell, and parse the output in your other process.
If FNR is the same as the number of lines, you can call wc -l < filename to get your count.
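If the per-line processing can write its results to a file, one pass is enough: only the count goes to stdout, where command substitution can catch it. A minimal sketch, with hypothetical file names:
num_lines=$(awk '{
    # ... per-line processing here, results written to a file ...
    print $0 > "processed.out"
} END { print FNR }' filename)
echo "$num_lines"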

A warning for anyone trying to use declare as suggested by several answers: if the awk (or other) expression handed to declare produces an empty string, declare is called with no arguments and dumps the current environment instead. (eval does not have this problem.)
This is almost certainly not what you would want.
For example, if your awk pattern doesn't match anything in the input, nothing is printed, and you end up with this unexpected behaviour.
An example of this....
unset var
var=99
declare $( echo "foobar" | awk '/fail/ {print "var=17"}' )
echo "var=$var"
var=99
The current environment as seen by declare is printed
and $var is not changed
A minor change, storing the value in an awk variable and printing it in the END block, solves this....
unset var
var=99
declare $( echo "foobar" | awk '/fail/ {tmp="17"} END {print "var="tmp}' )
echo "var=$var"
var=
This time $var is set to the null string, i.e. var=''
and there is no unwanted output.
To show this working with a matching pattern
unset var
var=99
declare $( echo "foobar" | awk '/foo/ {tmp="17"} END {print "var="tmp}' )
echo "var=$var"
var=17
This time $var is set to 17 by the matched pattern
and there is still no unwanted output.

Make awk print out the assignment statement:
MYVAR=NewValue
Then in your shell script, eval the output of your awk script:
eval $(awk ....)
# then use $MYVAR
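Put together, the pattern looks roughly like this (the variable name is purely illustrative):
eval "$(awk 'BEGIN { print "MYVAR=NewValue" }')"
echo "$MYVAR"    # NewValue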
EDIT: people recommend using declare instead of eval, to be slightly less error-prone if something other than the assignment is printed by the inner script. It's bash-only, but it's okay when the shell is bash and the script has #!/bin/bash, correctly stating this dependency.
The eval $(...) variant is widely used, with existing programs generating output suitable for eval but not for declare (lesspipe is an example); that's why it's important to understand it, and the bash-only variant is "too localized".

To synthesize everything here so far, I'll share what I find useful for setting a shell environment variable from a script that reads a one-line file using awk. Obviously a /pattern/ could be used instead of NR==1 to find the needed variable.
# export a variable from a script (such as in a .dotfile)
declare $( awk 'NR==1 {tmp=$1} END {print "SHELL_VAR=" tmp}' /path/to/file )
export SHELL_VAR
This will avoid a massive output of variables if a declare command is issued with no argument, as well as the security risks of a blind eval.
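For example, with a hypothetical one-line test file, the effect would be (a sketch):
printf '42 some other fields\n' > /tmp/onelinefile   # hypothetical test file
declare $( awk 'NR==1 {tmp=$1} END {print "SHELL_VAR=" tmp}' /tmp/onelinefile )
export SHELL_VAR
echo "$SHELL_VAR"    # 42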

echo "First arg: $1"
for ((i=0 ; i < $1 ; i++)); do
echo "inside"
echo "Welcome $i times."
cat man.xml | awk '{ x[NR] = $0 } END { for ( i=2 ; i<=NR ; i++ ) { if (x[i] ~ // ) {x[i+1]=" '$i'"}print x[i] }} ' > $i.xml
done
echo "compleated"

Related

Assign bash value from value in specific line

I have a file that looks like:
>ref_frame=1
TPGIRYQYNVLPQGWKGSPAIFQSSMTKILEPFRKQNPDIVIYQYMDDLYVGSD
>ref_frame=2
HQGLDISTMCFHRDGKDHQQYSKVA*QKS*SLLENKIQT*LSINTWMICM*DLT
>ref_frame=3
TRD*ISVQCASTGMERITSNIPK*HDKNLRAF*KTKSRHSYLSIHG*FVCRI*
>test_3_2960_3_frame=1
TPGIRYQYNVLPQGWKGSPAIFQSSMTKILEPSRKQNPDIVIYQYMDDLYVGSD
I want to assign a bash variable so that echo $variable gives test_3_2960
The line/row that I want to assign the variable to will always be line 7. How can I accomplish this using bash?
so far I have:
variable=`cat file.txt | awk 'NR==7'`
echo $variable    # gives >test_3_2960_3_frame=1
Using sed
$ variable=$(sed -En '7s/>(([^_]*_){2}[0-9]+).*/\1/p' input_file)
$ echo "$variable"
test_3_2960
No pipes needed here...
$: variable=$(awk -F'[>_]' 'NR==7{ OFS="_"; print $2, $3, $4; exit; }' file)
$: echo $variable
test_3_2960
-F is using either > or _ as field separators, so your data starts in field 2.
OFS="_" sets the Output Field Separator, but you could also just use "_" instead of commas.
exit keeps it from wasting time bothering to read beyond line 7.
If you wish to continue with awk
$ variable=$(awk 'NR==7' file.txt | awk -F "[>_]" '{print $2"_"$3"_"$4}')
$ echo $variable
test_3_2960
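Yet another option is to grab line 7 and strip the unwanted parts with shell parameter expansion; a sketch, assuming the seventh line always has the >name_frame=N shape shown above:
line=$(sed -n '7p' file.txt)      # grab line 7
line=${line#>}                    # strip the leading ">"
variable=${line%_*_frame=*}       # drop the trailing _3_frame=1 part
echo "$variable"                  # test_3_2960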

How to parse multiple line output as separate variables

I'm relatively new to bash scripting and I would like someone to explain this properly, thank you. Here is my code:
#! /bin/bash
echo "first arg: $1"
echo "first arg: $2"
var="$( grep -rnw $1 -e $2 | cut -d ":" -f1 )"
var2=$( grep -rnw $1 -e $2 | cut -d ":" -f1 | awk '{print substr($0,length,1)}')
echo "$var"
echo "$var2"
The problem I have is with the output. The script I'm trying to write is a C++ function searcher, so upon launching my script I pass 2 arguments, one for the directory and the second one for the function name. This is what my output looks like:
first arg: Projekt
first arg: iseven
Projekt/AX/include/ax.h
Projekt/AX/src/ax.cpp
h
p
Now my question is: how can I save the line-by-line output as variables, so that later on I can use var as a path, or use var2 as a character to compare? My plan was to use IF() statements to determine the type, idea: IF(last_char == p){echo:"something"}. What I've tried was this question: Capturing multiple line output into a Bash variable, and then giving it an array, so my code looked like: "${var[0]}". Please explain how I can use my line output later on, as variables.
I'd use readarray to populate an array variable, just in case there are spaces in your command's output that shouldn't be treated as field separators and that would end up messing up foo=( ... ). And you can use shell parameter expansion substring syntax to get the last character of a variable; no need for that awk bit in your var2:
#!/usr/bin/env bash
readarray -t lines < <(printf "%s\n" "Projekt/AX/include/ax.h" "Projekt/AX/src/ax.cpp")
for line in "${lines[#]}"; do
printf "%s\n%s\n" "$line" "${line: -1}" # Note the space before the -1
done
will display
Projekt/AX/include/ax.h
h
Projekt/AX/src/ax.cpp
p
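Applied to the original script, the same idea would look roughly like this (a sketch reusing the grep pipeline from the question):
#!/usr/bin/env bash
readarray -t paths < <(grep -rnw "$1" -e "$2" | cut -d ":" -f1)
for p in "${paths[@]}"; do
    if [[ ${p: -1} == h ]]; then      # last character of the path
        echo "header file: $p"
    elif [[ ${p: -1} == p ]]; then
        echo "source file: $p"
    fi
done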

Check $1 with sed to conditionally replace last occurrence of a word in a path string

I have a variable that contains a complete path name. I am trying to conditionally replace the last occurrence of a word in the path. Example script to show what I am trying to do:
#!/bin/sh
testvar="/home/downloads/user/downloads"
if [ "$1" = "alternate" ]; then
newtestvar=$(echo $testvar | sed -e 's/\(.*\)downloads$/\1alternate_downloads/g')
else
newtestvar=$(echo $testvar | sed -e 's/\(.*\)downloads$/\1new_downloads/g')
fi
echo "testvar:" $testvar
echo "newtestvar:" $newtestvar
Run #1
$ ./foofile
testvar: /home/downloads/user/downloads
newtestvar: /home/downloads/user/new_downloads
Run #2
$ ./foofile alternate
testvar: /home/downloads/user/downloads
newtestvar: /home/downloads/user/alternate_downloads
I do get the intended result, but I am looking for a way to avoid the if/else and rather achieve the result by checking the $1 in context of sed.
Edit-1
I replaced the if/else block with the following shorthand, but it looks really clumsy and difficult to read.
newtestvar=$([[ $1 = "alternate" ]] && echo $testvar | sed -e 's/\(.*\)downloads$/\1alternate_downloads/g' || echo $testvar | sed -e 's/\(.*\)downloads$/\1new_downloads/g')
You can avoid sed and handle this in bash itself:
#!/bin/bash
testvar="/home/downloads/user/downloads"
# default s to "new"
s="${1:-new}"
# replace only last value of downloads
newtestvar="${testvar/%downloads/${s}_downloads}"
# examine both variables
declare -p testvar newtestvar
Now call it as:
./foofile
declare -- testvar="/home/downloads/user/downloads"
declare -- newtestvar="/home/downloads/user/new_downloads"
./foofile alternate
declare -- testvar="/home/downloads/user/downloads"
declare -- newtestvar="/home/downloads/user/alternate_downloads"
This can probably not be done with sed, because sed has no way to test the value of a variable and then conditionally branch the execution.
However, it can be done with AWK:
#!/bin/bash
testvar="/home/downloads/user/downloads"
newtestvar=$(awk -v arg="$1" '{
replacement = arg == "alternate" ? "alternate_downloads" : "new_downloads";
sub("downloads$", replacement);
print $0;
}
' <<<"$testvar")
echo "testvar:" $testvar
echo "newtestvar:" $newtestvar

shell variable inside awk without -v option it is working

I noticed that a shell script variable can be used inside an awk script like this:
var="help"
awk 'BEGIN{print "'$var'" }'
Can anyone tell me how to change the value of var inside awk so that the new value is retained outside of awk?
Similarly to accessing a variable of shell script inside awk, can we access shell array inside awk? If so, how?
It is impossible; the only variants you have are:
use command substitution and write the output of awk to the variable;
write the data to a file and then read it back in the outer shell;
produce shell output and then execute it with eval.
Examples.
Command substitution, one variable:
$ export A=10
$ A=$(awk 'END {print 2*ENVIRON["A"]}' < /dev/null)
$ echo $A
20
Here you multiply A by two and write the result of the multiplication back.
eval; two variables:
$ A=10
$ B=10
$ eval $(awk 'END {print "A="2*ENVIRON["A"]"; B="2*ENVIRON["B"]}' < /dev/null)
$ echo $A
20
$ echo $B
20
$ awk 'END {print "A="2*ENVIRON["A"]"; B="2*ENVIRON["B"]}' < /dev/null
A=40; B=40
This approach uses a file as an intermediary, but it does work:
var="hello world"
cat > /tmp/my_script.awk.$$ <<EOF
BEGIN { print \"$var\" }
EOF
awk -f /tmp/my_script.awk.$$
rm -f /tmp/my_script.awk.$$
This uses the here document feature of the shell; check your shell manual for the rules about interpolation within a here document.

How to assign the output of a command to a variable?

This is probably a very stupid question; in a bash script, given the output of, for instance:
awk '{print $7}' temp
it gives 0.54546
I would like to assign this to a variable, so I tried:
read ENE <<< $(awk '{print $7}' temp)
but I get
Syntax error: redirection unexpected
Could you tell me why, and what is the easiest way to do this assignment?
Thanks
You can do command substitution as:
ENE=$(awk '{print $7}' temp)
or
ENE=`awk '{print $7}' temp`
This will assign the value 0.54546 to the variable ENE
That error usually means the script is being run by a shell that doesn't support here strings (e.g. sh/dash rather than bash); under bash your syntax should work:
read ENE <<< $(awk '{print $7}' temp)
you can directly assign the value as well
ENE=$(awk '{print $7}' temp)
you can also use the shell
$ var=$(< temp)
$ set -- $var
$ echo $7
or you can read it into array
$ declare -a array
$ read -a array <<<$(<file)
$ echo ${array[6]}
In general, Bash is kinda sensitive to spaces (requiring them in some places, and breaking if they are added in other places), which in my opinion is too bad. Just remember that there should be no space on either side of the equals sign in an assignment, no space after a dollar sign, and parentheses should be padded with spaces ( like this ) (not like this).
`command` and $( command ) are the same thing, but $( this version can be $( nested ) ) whereas "this version can be `embedded in strings.` "
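A small illustration of the nesting point (the file is just an example):
# $( ) nests without any extra escaping; doing the same with backticks needs backslashes
outer=$(echo "lines in /etc/hosts: $(wc -l < /etc/hosts)")
echo "$outer"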
