How to escape a shell variable with spaces within an AWK script - bash

I have the path of "file1 Nov 2018.txt" stored in the variable "var". I then use this shell variable inside an awk script
to generate another script (this is a small example). The issue is that the path and the filename contain spaces. Putting the variable between double quotes ""
and, within awk, between single quotes '' does not work either: I get the error "No such file or directory".
How can I handle a path that has spaces?
The script looks like this:
var="/mydrive/d/data 2018/Documents Nov/file1 Nov 2018.txt"
z=$(awk -v a="$var" 'BEGIN{str = "cat " 'a' ; print str}')
eval "$z"
I get these errors:
$ eval "$z"
cat: /mydrive/d/data: No such file or directory
cat: 2018/Documents: No such file or directory
cat: Nov/file1: No such file or directory
cat: Nov: No such file or directory
cat: 2018.txt: No such file or directory
Thanks for any help.

The single-quote escape sequence comes in handy here. Note that 047 is the value in octal for the ASCII ' character, and awk allows you to use \nnn within a string to include any character using its octal value.
$ cat 'foo bar.txt'
a b c
1 2 3
$ var="foo bar.txt"
$ echo "$var"
foo bar.txt
$ z=$(awk -v a="$var" 'BEGIN{print "cat \047" a "\047"}')
$ eval "$z"
a b c
1 2 3
Maybe it's a bit nicer with printf:
$ awk -v a="$var" 'BEGIN{ printf "cat \047%s\047\n", a }'
cat 'foo bar.txt'
The problem is coming from the fact that the single quote has special meaning to the shell, so it's not surprising that there's a clash when single quotes are also being used in your awk program, when that program is on the command line.
This can be avoided by putting the awk program in its own file:
$ cat a.awk
BEGIN { printf "cat '%s'\n", a }
$ awk -v a="$var" -f a.awk
cat 'foo bar.txt'
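One caveat with wrapping the value in \047: it breaks as soon as the value itself contains a single quote. Here is a minimal sketch of one way around that, assuming a POSIX-like awk. The helper name shquote is made up for this example and is not an awk built-in. It rewrites each ' as '\'' before wrapping the whole string in single quotes.

```shell
var="it's a file.txt"

# shquote is a hypothetical helper, not part of awk.
quoted=$(awk -v a="$var" '
function shquote(s,    q) {
    q = "\047"                 # one single-quote character, written in octal
    gsub(q, q "\\" q q, s)     # close the quote, add an escaped quote, reopen
    return q s q
}
BEGIN { print shquote(a) }')

printf '%s\n' "$quoted"        # the shell-quoted form of $var
eval "roundtrip=$quoted"       # the shell parses it back to the original text
```

Note that -v still processes backslash escapes in the value, so this only covers quotes and whitespace, not backslashes.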

Remove the single quotes around a and add escaped double quotes instead.
$ echo success > "a b"
$ var="a b"; z=$(awk -v a="$var" 'BEGIN{print "cat \"" a "\""}');
$ eval "${z}"
success
However, you are most likely making the task more complex than it needs to be.

$ cat > path\ to/test
foo
$ z=$(awk -v a="$var" 'BEGIN{gsub(/ /,"\\ ",a); str = "cat " a ; print str}')
$ echo "$z"
cat path\ to/test
$ eval "$z"
foo
The key (in this solution) is gsub(/ /,"\\ ",a), i.e. escaping the spaces with a \ (written \\ because awk itself treats the backslash as an escape character).

With bash's printf %q "$var" you can correctly escape any string for later use in eval - even linebreaks will be handled correctly. However, the resulting string may contain special symbols like \ that could be interpreted by awk when assigning variables with awk -v var="$var". Therefore, better pass the variable via stdin:
path='/path/with spaces/and/special/symbols/like/*/?/\/...'
cmd=$(printf %q "$path" | awk '{print "cat "$0}')
eval "$cmd"
In this example the generated command $cmd is
cat /path/with\ spaces/and/special/symbols/like/\*/\?/\\/...
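To see why -v is risky for such strings, here is a small sketch comparing the two ways of handing a value to awk. The sample value and the variable names via_v and via_stdin are only illustrative.

```shell
path='C:\new\table'

# -v processes escape sequences, so \n and \t become a newline and a tab.
via_v=$(awk -v a="$path" 'BEGIN { print a }')

# Data read from stdin is taken literally.
via_stdin=$(printf '%s\n' "$path" | awk '{ print }')

printf 'via -v:    %s\n' "$via_v"
printf 'via stdin: %s\n' "$via_stdin"
```

Only the stdin version should come back identical to the original string.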

Related

Replace one character by the other (and vice-versa) in shell

Say I have strings that look like this:
$ a='/o\\'
$ echo $a
/o\
$ b='\//\\\\/'
$ echo $b
\//\\/
I'd like a shell script (ideally a one-liner) to replace / occurrences by \ and vice-versa.
Suppose the command is called invert, it would yield (in a shell prompt):
$ invert $a
\o/
$ invert $b
/\\//\
For example using sed, it seems unavoidable to use a temporary character, which is not great, like so:
$ echo $a | sed 's#/#%#g' | sed 's#\\#/#g' | sed 's#%#\\#g'
\o/
$ echo $b | sed 's#/#%#g' | sed 's#\\#/#g' | sed 's#%#\\#g'
/\\//\
For some context, this is useful for proper printing of git log --graph --all | tac (I like to see newer commits at the bottom).
tr is your friend:
% echo 'abc' | tr ab ba
bac
% echo '/o\' | tr '\\/' '/\\'
\o/
(escaping the backslashes in the output might require a separate step)
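If you want the invert command from the question, the tr call can be wrapped in a function. This is just a sketch; it uses printf instead of echo because some echo implementations interpret backslashes themselves.

```shell
# invert swaps every / with \ and every \ with / in its first argument.
invert() {
    printf '%s\n' "$1" | tr '/\\' '\\/'
}

invert '/o\'       # prints \o/
invert '\//\\/'    # prints /\\//\
```

Quote the argument when calling it (invert "$a"), otherwise the shell may mangle it before tr ever sees it.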
I think this can be done with (g)awk:
$ echo a/\\b\\/c | gawk -F "/" 'BEGIN{ OFS="\\" } { for(i=1;i<=NF;i++) gsub(/\\/,"/",$i); print $0; }'
a\/b/\c
$ echo a\\/b/\\c | gawk -F "/" 'BEGIN{ OFS="\\" } { for(i=1;i<=NF;i++) gsub(/\\/,"/",$i); print $0; }'
a/\b\/c
$
-F "/" defines the field separator: the input is split on "/", so the fields no longer contain a "/" character.
for(i=1;i<=NF;i++) gsub(/\\/,"/",$i); replaces every backslash (\) with a slash (/) in each field. Because OFS is set to a backslash, the modified record is rebuilt with \ where the / separators were.
If you want to replace every instance of / with \ and vice versa, you can use the y command of sed, which is quite similar to what tr does:
$ a='/o\'
$ echo "$a"
/o\
$ echo "$a" | sed 'y|/\\|\\/|'
\o/
$ b='\//\\/'
$ echo "$b"
\//\\/
$ echo "$b" | sed 'y|/\\|\\/|'
/\\//\
If you are strictly limited to GNU AWK, you might get the desired result the following way. Let the content of file.txt be
\//\\\\/
then
awk 'BEGIN{FPAT=".";OFS="";arr["/"]="\\";arr["\\"]="/"}{for(i=1;i<=NF;i+=1){if($i in arr){$i=arr[$i]}};print}' file.txt
gives output
/\\////\
Explanation: I inform GNU AWK, using the FPAT built-in variable, that a field is any single character and that the output field separator (OFS) is the empty string. I then create an array whose key-value pairs map each character to be replaced to its replacement; \ needs to be escaped, hence \\ denotes a literal \. Then, for each line, I iterate over all fields using a for loop, and if a given field holds a character present among the keys of arr, I exchange it for the corresponding value. After the loop I print the line.
(tested in gawk 4.2.1)
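FPAT is a gawk extension. If another awk has to be supported, the same character-by-character swap can be sketched with substr, which any POSIX awk should have. The function name swap_slashes is made up for this example.

```shell
# swap_slashes reads lines on stdin and swaps / and \ one character at a time.
swap_slashes() {
    awk '{
        out = ""
        for (i = 1; i <= length($0); i++) {
            c = substr($0, i, 1)
            if (c == "/")
                c = "\\"          # "\\" is one literal backslash in awk
            else if (c == "\\")
                c = "/"
            out = out c
        }
        print out
    }'
}

printf '%s\n' '\//\\\\/' | swap_slashes    # prints /\\////\
```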

Expand matched strings in sed

Is it possible to expand the matched string in a sed command? I want to substitute variable names in a file with their values, this is my script at the moment:
#!/bin/bash
echo "Running the build script..."
VAR1="2005648"
VAR2="7445aa"
SERVER_NAME=$(hostname)
TIMESTAMP=$(date +%m-%d-%Y)
sed -i "s/{[A-Z_][A-Z_]*}/$&/g" my_file.txt #variable names in the file are written between { }
and this is a snapshot of my_file.txt:
Building finished at {TIMESTAMP}
{VAR1}:{VAR2}
On: {SERVER_NAME}
current working directory: {PWD}
But it doesn't work. Instead of substituting the variable name with its value, it inserts a dollar sign right before the curly bracket.
How do I resolve this?
You could use envsubst to substitute environment variables, otherwise you would need a bunch of sed commands to replace everything.
Change your template file to:
Building finished at ${TIMESTAMP}
${VAR1}:${VAR2}
On: ${SERVER_NAME}
current working directory: ${PWD}
And the script to:
#!/bin/bash
echo "Running the build script..."
export VAR1="2005648"
export VAR2="7445aa"
export SERVER_NAME=$(hostname)
export TIMESTAMP=$(date +%m-%d-%Y)
# only replace the defined variables
envsubst '$VAR1 $VAR2 $SERVER_NAME $TIMESTAMP' < my_file.txt > newfile
# replace all environment variables ($USER, $HOME, $HOSTNAME, etc.)
#envsubst < my_file.txt > newfile
The script replaces environment variables $VAR1, $VAR2, $SERVER_NAME and $TIMESTAMP in my_file.txt and saves the output to newfile.
You can see that ${PWD} doesn't get replaced, because I forgot to add it to the list.
In the second commented example all environment variables are replaced and non-existing variables are replaced by an empty string.
You can use the $VARNAME or ${VARNAME} syntax in the template.
I'd actually do it in a single pass this way using an awk that supports ENVIRON[], e.g. any POSIX awk:
$ cat tst.sh
#!/usr/bin/env bash
echo "Running the build script..."
VAR1=2005648 \
VAR2=7445aa \
SERVER_NAME=$(hostname) \
TIMESTAMP=$(date +%m-%d-%Y) \
awk '
{
while ( match($0,/{[[:alnum:]_]+}/) ) {
printf "%s", substr($0,1,RSTART-1) ENVIRON[substr($0,RSTART+1,RLENGTH-2)]
$0 = substr($0,RSTART+RLENGTH)
}
print
}
' file
$ ./tst.sh
Running the build script...
Building finished at 04-14-2020
2005648:7445aa
On: MyLogin
current working directory: /home/MyLogin
but if you really want to do multiple passes calling sed inside a shell loop then ${!variable} is your friend, here's a start:
$ cat tst.sh
#!/usr/bin/env bash
VAR1='2005648'
VAR2='7445aa'
SERVER_NAME='foo'
for var in VAR1 VAR2 SERVER_NAME; do
echo "var, $var, ${!var}"
done
$ ./tst.sh
var, VAR1, 2005648
var, VAR2, 7445aa
var, SERVER_NAME, foo
.
$ VAR1='stuff'
$ var='VAR1'; echo 'foo {VAR1} bar' | sed "s/{$var}/${!var}/"
foo stuff bar
The awk script is robust, but YMMV using sed depending on the contents of the variables, e.g. it'd fail if they contain & or / or \1 or .... ENVIRON[] only has access to shell variables that are exported or set on the awk command line, hence the backslash at the end of each line that sets a shell variable, which makes the assignment part of the awk command line.
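If you do stay with sed, the usual workaround for values containing &, / or \ is to escape them before they are spliced into the replacement. A rough sketch, where escape_repl is a hypothetical helper name and the sample value is invented; it assumes / is the s command delimiter and that the value has no newlines.

```shell
# escape_repl backslash-escapes the characters that are special in the
# replacement part of s/.../.../ : & (the match), / (the delimiter) and \.
escape_repl() {
    printf '%s' "$1" | sed 's/[&/\]/\\&/g'
}

value='a/b & c\d'
out=$(printf '%s\n' 'foo {VAR1} bar' | sed "s/{VAR1}/$(escape_repl "$value")/")
printf '%s\n' "$out"    # prints foo a/b & c\d bar
```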
You can try this.
#!/usr/bin/env bash
echo "Running the build script..."
VAR1="2005648"
VAR2="7445aa"
SERVER_NAME=$(hostname)
TIMESTAMP=$(date +%m-%d-%Y)
sed "s|{TIMESTAMP}|$TIMESTAMP|;s|{VAR1}|$VAR1|;s|{VAR2}|$VAR2|;s|{SERVER_NAME}|$SERVER_NAME|;s|{PWD}|$PWD|" file.txt
Just add {} in the variables e.g. {$TIMESTAMP} and so on, if you really need it.
That should work unless there is something more that is not included in the question above.

gawk and grep using same pattern but need different treatement for escape sequence

I have a bash shell script (named testawk) that replaces a line containing a pattern (1st arg) with one or more lines (2nd arg) and operates on the file named in the 3rd arg. The shell script is given below:
#!/bin/bash
if grep -s "$1" "$3" > /dev/null; then
gawk -v nm2="$2" -v nm1="$1" '{ if ($0 ~ nm1) print nm2;else print $0}' "$3" > "$3".bak
mv "$3".bak "$3"
fi
If I have a file named "aa" containing the following:
a;
b<*c;
And, if I run testawk as:
./testawk "a;" "x<*y;" "aa"
aa contains:
x<*y;
b<*c;
But if I run testawk on the original aa file again as:
./testawk "b<*c;" "x<*y;" "aa"
aa is left unchanged:
a;
b<*c;
This is because grep "b<*c;" "aa" cannot find the pattern.
To make grep happy, I can escape the asterisk:
grep "b<\*c;" "aa"
Then it matches and shows:
b<*c;
But if I run testawk with the escape sequence, as below:
./testawk "b<\*c;" "x<*y;" "aa"
gawk does not like that and complains:
gawk: warning: escape sequence `\*' treated as plain `*'
And aa is again left unchanged:
a;
b<*c;
Is there any remedy that makes both grep and gawk happy, so that b<*c; is found and replaced?
Please suggest how to replace b<*c;.
This should do what I think you are asking for:
if awk -v nm2="$2" -v nm1="$1" 'index($0,nm1){f=1; $0=nm2} 1; END{exit !f}' "$3" > "${3}.bak"
then
mv "${3}.bak" "$3"
# do stuff with modified file "$3"
else
rm -f "${3}.bak"
# do stuff with unmodified file "$3"
fi
No need to escape anything except backslashes and we can deal with that differently if you have those.
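If backslashes can show up in either argument, one way to deal with them differently is to pass the strings through the environment rather than -v, since ENVIRON values are not escape-processed. A sketch under that assumption; the names old and new and the sample file contents are only for illustration.

```shell
tmp=$(mktemp)
printf '%s\n' 'a;' 'b<*c;' > "$tmp"

# old and new are placed in the environment of this one awk call only.
old='b<*c;' new='x\y;' awk '
    index($0, ENVIRON["old"]) { $0 = ENVIRON["new"] }   # literal substring test
    { print }
' "$tmp" > "$tmp.new"
mv "$tmp.new" "$tmp"

result=$(cat "$tmp")
printf '%s\n' "$result"
rm -f "$tmp"
```

The backslash in the replacement line should survive untouched, and the * in the search string is never treated as a regex operator.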

Using variables with grep is not working

I have a file VALIDATION_CONFIG_FILE.cfg which contains the records below:
ES_VDF_1|1
DE_VDF_1|2
ES_VDF_1|7
When I use the grep command below with variables, the command returns ES_VDF_1. As I understand it, the command should not return any results. When I use the same command without variables (with the values written directly), it returns no results, as expected. So what is the problem with the variables I am using?
FEED_ID_1_7="HU_VDF_1"
FEED_ID_2_7="ES_VDF_1"
FEED_ID_3_7="PT_VDF_2"
awk -F'|' '{ if($2=="7") print $1; }' VALIDATION_CONFIG_FILE.cfg |
grep -E -v '${FEED_ID_1_7}|${FEED_ID_2_7}|${FEED_ID_3_7}'
Output: ES_VDF_1
awk -F'|' '{ if($2=="7") print $1; }' VALIDATION_CONFIG_FILE.cfg |
grep -E -v 'ES_VDF_1|HU_VDF_1|PT_VDF_2'
Output: nothing
The problem you are seeing is that single quotes in Bash do not interpolate variables, whereas double quotes do.
For example with a variable imaginatively called "VARIABLE":
alex@yuzu:~$ export VARIABLE="foo"
If you echo it with double quotes, it is interpolated and the value of the variable is used:
alex@yuzu:~$ echo "$VARIABLE"
foo
But if you use single quotes the literal string '$VARIABLE' is used instead:
alex@yuzu:~$ echo '$VARIABLE'
$VARIABLE
The same goes for your grep.
grep -E -v '${FEED_ID_1_7}|${FEED_ID_2_7}|${FEED_ID_3_7}'
Should be:
grep -E -v "${FEED_ID_1_7}|${FEED_ID_2_7}|${FEED_ID_3_7}"
For example:
alex@yuzu:~$ echo "foo" | grep -E "$VARIABLE|$HOME|$USER"
foo
alex@yuzu:~$ echo "foo" | grep -E '$VARIABLE|$HOME|$USER'
[ no output ]
This is happening because of the quotes.
Single quotes don't interpolate anything, but double quotes do. Replace the single quotes around the variables with double quotes, like below:
awk -F'|' '{ if($2=="7") print $1; }' VALIDATION_CONFIG_FILE.cfg |
grep -E -v "${FEED_ID_1_7}|${FEED_ID_2_7}|${FEED_ID_3_7}"
Refer to the Bash manual for more details.
Adding to Kaoru/Nishu Tayal's answer, you can make it safer still by using a plain-text search with fgrep and multiple -e options:
fgrep -v -e "${FEED_ID_1_7}" -e "${FEED_ID_2_7}" -e "${FEED_ID_3_7}"
This would help prevent misinterpretations just in case special characters would be added to the values of variables.
If you don't have fgrep try grep -F.
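Here is the whole pipeline sketched with grep -F. The -x option is my addition: it makes each pattern match a whole line only, so ES_VDF_1 would not also exclude something like ES_VDF_10. The FR_VDF_1|7 record is invented so that the example has something left to print.

```shell
FEED_ID_1_7="HU_VDF_1"
FEED_ID_2_7="ES_VDF_1"
FEED_ID_3_7="PT_VDF_2"

# Sample records in the same format as VALIDATION_CONFIG_FILE.cfg,
# plus one made-up feed (FR_VDF_1) that is not in the exclusion list.
remaining=$(printf '%s\n' 'ES_VDF_1|1' 'DE_VDF_1|2' 'ES_VDF_1|7' 'FR_VDF_1|7' |
    awk -F'|' '$2 == "7" { print $1 }' |
    grep -F -x -v -e "$FEED_ID_1_7" -e "$FEED_ID_2_7" -e "$FEED_ID_3_7")

printf '%s\n' "$remaining"    # prints FR_VDF_1
```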

shell variable inside awk without -v option it is working

I noticed that a shell script variable can be used inside an awk script like this:
var="help"
awk 'BEGIN{print "'$var'" }'
Can anyone tell me how to change the value of var inside awk while retaining the value outside of awk?
Similarly to accessing a variable of shell script inside awk, can we access shell array inside awk? If so, how?
It is impossible; the only options you have are:
use command substitution and write output of awk to the variable;
write data to file and then read from the outer shell;
produce shell output and then execute it with eval.
Examples.
Command substitution, one variable:
$ export A=10
$ A=$(awk 'END {print 2*ENVIRON["A"]}' < /dev/null)
$ echo $A
20
Here you multiply A by two and write the result back to A.
eval; two variables:
$ export A=10
$ export B=10
$ eval $(awk 'END {print "A="2*ENVIRON["A"]"; B="2*ENVIRON["B"]}' < /dev/null)
$ echo $A
20
$ echo $B
20
$ awk 'END {print "A="2*ENVIRON["A"]"; B="2*ENVIRON["B"]}' < /dev/null
A=40; B=40
It uses a file intermediary, but it does work:
var="hello world"
cat > /tmp/my_script.awk.$$ <<EOF
BEGIN { print "$var" }
EOF
awk -f /tmp/my_script.awk.$$
rm -f /tmp/my_script.awk.$$
This uses the here-document feature of the shell. Check your shell manual for the rules about interpolation within a here document.
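As for the array part of the question: awk cannot see a shell array directly, but you can feed the elements in one per line and rebuild an awk array from them. A sketch using the positional parameters as the list, so it runs in plain sh; with a bash array you would pass "${array[@]}" instead of "$@". It assumes no element contains a newline, and the names item and summary are only illustrative.

```shell
set -- alpha "beta gamma" delta

# Each element becomes one input line; awk stores line N in item[N].
summary=$(printf '%s\n' "$@" | awk '
    { item[NR] = $0 }
    END {
        for (i = 1; i <= NR; i++)
            printf "%d=%s;", i, item[i]
        print ""
    }
')

printf '%s\n' "$summary"    # prints 1=alpha;2=beta gamma;3=delta;
```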
