Embedding awk in a shell script - bash

I've been using a bash script (script.sh) which calls various awk scripts (script1.awk, script2.awk) that are tailored at "runtime", for instance by replacing values.
I've been looking for ways to embed them completely within the first bash script.
Ideally, I would like to have a file looking like this:
################################
# AWK scripts #
################################
read -d '' scriptVariable <<'EOF'
'
{my block commands;}
'
EOF
################################
# End of AWK Scripts #
################################
awk $scriptVariable ${inputfile} # This line obviously doesn't work
instead of the traditional:
awk '{
my script commands
}' ${inputfile}
Of course, I could write them to a file, but the whole point is not to. Any suggestions?
EDIT: Although dogbane's answer works fine, the next problem is that with the quoted <<'EOF' delimiter, the newline characters get lost. I can't unquote the delimiter, because then the shell tries to interpret the awk script and trips over the $ signs inside it (and there are some). And with no newlines, I can't comment anything within the awk script without commenting out half the script once the newline characters are removed. Anyone?
<< 'EOF'
BEGIN{#Hello
print $1
}
EOF # Reaches awk as the single line BEGIN{#Hello print $1 }; the # comments out the rest, so awk effectively sees only BEGIN{
<< EOF
BEGIN{#Hello
print $1
}
EOF # Is read correctly by awk, but bash tries to expand $1 and fails
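(For what it's worth, the newline loss happens when the variable is expanded unquoted, not in the here-document itself; a minimal demonstration:)
read -d '' s <<'EOF'
BEGIN{ # a comment
print "ok" }
EOF
echo $s     # unquoted: word splitting collapses the newlines into spaces
echo "$s"   # quoted: the newlines survive, so awk comments stay intact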

Remove the single quotes around the awk script and enclose the script variable in double-quotes. This works for me:
################################
# AWK scripts #
################################
read -d '' scriptVariable << 'EOF'
BEGIN {
print "start"
}
{
print $0
}
END{
print "hello"
}
EOF
################################
# End of AWK Scripts #
################################
awk "$scriptVariable" ${inputfile}

The single quotes around awk scripts can be cantankerous in a shell.
echo dummy code | awk '/dummy/{print "'"$1"'", $2}' - another_file | while read LINE; do
case $LINE in
"$1 code")echo success;;
*)echo $LINE;;
esac
done
This reads from stdin and then from another_file, and prints the lines containing dummy with the first field replaced by the first argument to your bash script. Notice that the bash variable $1 has to sit outside the awk program's single quotes (wrapped in double quotes here so awk receives it as a string literal).
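If you would rather avoid the quote-juggling entirely, awk's -v option passes the shell value in safely (a sketch of the same pipeline, not part of the original answer; note that -v interprets backslash escapes in the value):
echo dummy code | awk -v first="$1" '/dummy/{print first, $2}' - another_file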

You can include your awk inside your bash script using a Here-Document.
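A minimal sketch of that approach (the quoted EOF keeps the shell from expanding $0 and friends inside the program; $inputfile is an assumed variable):
awk "$(cat <<'EOF'
{ print $0 }
EOF
)" "$inputfile"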

How to parse multiple line output as separate variables

I'm relatively new to bash scripting and I would like someone to explain this properly, thank you. Here is my code:
#! /bin/bash
echo "first arg: $1"
echo "first arg: $2"
var="$( grep -rnw $1 -e $2 | cut -d ":" -f1 )"
var2=$( grep -rnw $1 -e $2 | cut -d ":" -f1 | awk '{print substr($0,length,1)}')
echo "$var"
echo "$var2"
The problem I have is with the output. The script I'm trying to write is a C++ function searcher, so upon launching it I pass two arguments: the directory and the function name. This is what my output looks like:
first arg: Projekt
first arg: iseven
Projekt/AX/include/ax.h
Projekt/AX/src/ax.cpp
h
p
Now my question is: how can I save the line-by-line output as variables, so that later on I can use var as a path, or var2 as a character to compare? My plan was to use if statements to determine the type, e.g. if (last_char == p) { echo "something" }. What I've tried was this question: Capturing multiple line output into a Bash variable, and then treating it as an array, so my code looked like "${var[0]}". Please explain how I can use the line-by-line output later on as variables.
I'd use readarray to populate an array variable, in case there are spaces in your command's output that would otherwise be treated as field separators and mess up foo=( ... ). And you can use shell parameter expansion substring syntax to get the last character of a variable; no need for that awk bit in your var2:
#!/usr/bin/env bash
readarray -t lines < <(printf "%s\n" "Projekt/AX/include/ax.h" "Projekt/AX/src/ax.cpp")
for line in "${lines[@]}"; do
printf "%s\n%s\n" "$line" "${line: -1}" # Note the space before the -1
done
will display
Projekt/AX/include/ax.h
h
Projekt/AX/src/ax.cpp
p
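To tie this back to the plan of branching on the last character, a hypothetical check on those same lines:
for line in "${lines[@]}"; do
    case "${line: -1}" in
        p) echo "$line is a .cpp file";;
        h) echo "$line is a header";;
    esac
done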

How to prevent a for loop from using space as delimiter, bash script

I am trying to write a bash script to do multiple checks and searches for a CMS my company uses. I'm trying to implement a function for a user to be able to search for a certain macro call, and have it return all the files that contain the call, the line the macro is called on, and the actual code in the macro call. What I have seems to be getting screwed up by the fact that I am using a for loop to format the output. Here's the snippet of the script I am working on:
elif [ "$choice" = "2" ]
then
echo -e "\n What macro call are we looking for $name?"
read macrocall
for i in $(grep -inR "$macrocall" $sitepath/templates/macros/); do
file=$(echo $i | cut -d\: -f1 | awk -F\/ '{ print $NF }')
line=$(echo $i | cut -d\: -f2)
calltext=$(echo $i | cut -d\: -f3-)
echo -e "\nFile: $file"
echo -e "\nLine: $line"
echo -e "\nMacro Call from file: $calltext"
done
fi
The current script runs through the first few fields until it hits a space, and then everything gets all screwy. Anybody have any idea how I can make the for loop's delimiter be each result of the grep? Any suggestions would be helpful. Let me know if any of you need more info. Thanks!
The right way to do this would be more like:
printf "\n What macro call are we looking for %s?" "$name"
read macrocall
# ensure globbing is off and set IFS to a newline after saving original values
oSET="$-"; set -f; oIFS="$IFS"; IFS=$'\n'
awk -v macrocall="$macrocall" '
BEGIN { lc_macrocall = "\\<" tolower(macrocall) "\\>" }
tolower($0) ~ lc_macrocall {
file=FILENAME
sub(/.*\//,"",file)
printf "\n%s\n", file
printf "\n%d\n", FNR
printf "\nMacro Call from file: %s\n", $0
}
' $(find "$sitepath/templates/macros" -type f -print)
# restore original IFS and globbing values
IFS="$oIFS"; set +f -"$oSET"
This solves the problem of having spaces in your file names as originally requested, but also handles globbing characters in your file names, and the various typical echo issues.
You can set the internal field separator $IFS (which is normally set to space, tab and newline) to just a newline to get around this problem:
IFS=$'\n'
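If you go this route, save the old value and restore it afterwards so the rest of the script is unaffected; a sketch around the original loop:
oIFS="$IFS"; IFS=$'\n'
for i in $(grep -inR "$macrocall" "$sitepath/templates/macros/"); do
    echo "$i"   # field extraction as in the original loop
done
IFS="$oIFS"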

Set variable in current shell from awk

Is there a way to set a variable in my current shell from within awk?
I'd like to do some processing on a file and print out some data; since I'll read the whole file through, I'd like to save the number of lines -- in this case, FNR.
It happens, though, that I can't seem to find a way to set a shell variable to the FNR value; failing that, I'd have to read FNR back from my output file to set, say, num_lines.
I've tried some combinations using awk 'END{system(...)}', but could not manage it to work. Any way around this?
Here's another way.
This is especially useful when you've got the values of your variables in a single string and you want to split them up. For example, you have a list of values from a single row in a database that you want to create variables out of.
val="hello|beautiful|world" # assume this string comes from a database query
read a b c <<< $( echo ${val} | awk -F"|" '{print $1" "$2" "$3}' )
echo $a #hello
echo $b #beautiful
echo $c #world
We need the 'here string' (<<<) here because when read sits on the receiving end of a pipe, it runs in a subshell, so any variables it sets are lost when that subshell exits.
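As an aside, for a fixed single-character delimiter like this, bash can do the split without awk at all (a minimal sketch using the same here string):
IFS='|' read -r a b c <<< "$val"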
$ echo "$var"
$ declare $( awk 'BEGIN{print "var=17"}' )
$ echo "$var"
17
Here's why you should use declare instead of eval:
$ eval $( awk 'BEGIN{print "echo \"removing all of your files, ha ha ha....\""}' )
removing all of your files, ha ha ha....
$ declare $( awk 'BEGIN{print "echo \"removing all of your files\""}' )
bash: declare: `"removing': not a valid identifier
bash: declare: `files"': not a valid identifier
Note in the first case that eval executes whatever string awk prints, which could accidentally be a very bad thing!
You can't export variables from a subshell to its parent shell. You have some other choices, though, including:
Make another pass of the file using AWK to count records, and use command substitution to capture the result. For example:
FNR=$(awk 'END {print FNR}' filename)
Print FNR in the subshell, and parse the output in your other process.
If FNR is the same as number of lines, you can call wc -l < filename to get your count.
A warning for anyone trying to use declare as suggested by several answers: if the awk (or other) expression handed to declare produces an empty string, declare will dump the current environment instead. This is almost certainly not what you want. For example, if your awk pattern never matches the input, you will never print an assignment, and you will end up with unexpected behaviour. (eval does not have this problem.)
An example of this:
unset var
var=99
declare $( echo "foobar" | awk '/fail/ {print "var=17"}' )
echo "var=$var"
var=99
The current environment as seen by declare is printed, and $var is not changed.
A minor change, storing the value in an awk variable and printing it in an END block, solves this:
unset var
var=99
declare $( echo "foobar" | awk '/fail/ {tmp="17"} END {print "var="tmp}' )
echo "var=$var"
var=
This time $var is unset, i.e. set to the null string var='', and there is no unwanted output.
To show this working with a matching pattern
unset var
var=99
declare $( echo "foobar" | awk '/foo/ {tmp="17"} END {print "var="tmp}' )
echo "var=$var"
var=17
This time $var is set to 17 because the pattern matched, and again there is no unwanted output.
Make awk print out the assignment statement:
MYVAR=NewValue
Then in your shell script, eval the output of your awk script:
eval $(awk ....)
# then use $MYVAR
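Tied back to the original question, the pattern would look something like this (a sketch; it assumes the file is in $inputfile, and num_lines is the name the question suggested):
eval "$(awk 'END { print "num_lines=" FNR }' "$inputfile")"
echo "$num_lines"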
EDIT: people recommend using declare instead of eval, to be slightly less error-prone if something other than the assignment is printed by the inner script. It's bash-only, but it's okay when the shell is bash and the script has #!/bin/bash, correctly stating this dependency.
The eval $(...) variant is widely used, with existing programs generating output suitable for eval but not for declare (lesspipe is an example); that's why it's important to understand it, and the bash-only variant is "too localized".
To synthesize everything here so far, I'll share what I find useful for setting a shell environment variable from a script that reads a one-line file using awk. Obviously a /pattern/ could be used instead of NR==1 to find the needed variable.
# export a variable from a script (such as in a .dotfile)
declare $( awk 'NR==1 {tmp=$1} END {print "SHELL_VAR=" tmp}' /path/to/file )
export SHELL_VAR
This will avoid a massive output of variables if a declare command is issued with no argument, as well as the security risks of a blind eval.
echo "First arg: $1"
for ((i=0 ; i < $1 ; i++)); do
echo "inside"
echo "Welcome $i times."
cat man.xml | awk '{ x[NR] = $0 } END { for ( i=2 ; i<=NR ; i++ ) { if (x[i] ~ // ) {x[i+1]=" '$i'"}print x[i] }} ' > $i.xml
done
echo "compleated"

Shell variable inside awk without the -v option

I noticed that a shell script variable can be used inside an awk script like this:
var="help"
awk 'BEGIN{print "'$var'" }'
Can anyone tell me how to change the value of var inside awk so that the new value is retained outside of awk?
Similarly to accessing a shell variable inside awk, can we access a shell array inside awk? If so, how?
It is impossible; the only options you have are:
use command substitution and capture the output of awk into the variable;
write the data to a file and then read it back in the outer shell;
produce shell output from awk and then execute it with eval.
Examples.
Command substitution, one variable:
$ export A=10
$ A=$(awk 'END {print 2*ENVIRON["A"]}' < /dev/null)
$ echo $A
20
Here you multiply A by two and write the result back.
eval; two variables:
$ A=10
$ B=10
$ eval $(awk 'END {print "A="2*ENVIRON["A"]"; B="2*ENVIRON["B"]}' < /dev/null)
$ echo $A
20
$ echo $B
20
$ awk 'END {print "A="2*ENVIRON["A"]"; B="2*ENVIRON["B"]}' < /dev/null
A=40; B=40
This one uses an intermediary file, but it does work:
var="hello world"
cat > /tmp/my_script.awk.$$ <<EOF
BEGIN { print \"$var\" }
EOF
awk -f /tmp/my_script.awk.$$
rm -f /tmp/my_script.awk.$$
This uses the here-document feature of the shell. Check your shell manual for the rules about interpolation within a here document.
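The rule that matters here: with an unquoted delimiter the shell expands $var inside the document (as above), while quoting the delimiter passes the text through literally. A quick illustration:
cat <<EOF
expanded: $var
EOF
cat <<'EOF'
literal: $var
EOF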

Why awk '{ print }' doesn't start a new line but loops on space char

I have this shell script
#!/bin/bash
LINES=$(awk '{ print }' filename.txt)
for LINE in $LINES; do
echo "$LINE"
done
And filename.txt has this content
Loreum ipsum dolores
Loreum perche non se imortale
The shell script iterates over every whitespace-separated word of the lines in filename.txt, while it is supposed to loop over just those two lines.
But when I run awk '{ print }' filename.txt in a terminal, it prints the lines correctly.
Any explanations?
Thanks in advance!
The $(...) construct absorbs all the output from awk as one large string, and then for LINE in $LINES splits on whitespace. You want this construct instead:
#! /bin/sh
while read LINE; do
printf '%s\n' "$LINE"
done < filename.txt
The other answers are good; another thing you can do is temporarily change your IFS (Internal Field Separator) variable. If you update your shell script to look like this:
#!/bin/bash
IFS="
"
LINES=$(awk '{ print }' filename.txt)
for LINE in $LINES; do
echo "$LINE"
done
This sets IFS to a newline instead of the default whitespace, which should also do what you want.
Just another suggestion: store the lines in an array and loop over its elements instead. Here's an example of how to loop over an array:
http://tldp.org/LDP/abs/html/arrays.html#SCRIPTARRAY
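In bash 4+, mapfile (also spelled readarray) is the direct way to build that array; a minimal sketch:
mapfile -t LINES < filename.txt
for LINE in "${LINES[@]}"; do
    echo "$LINE"
done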
