Expand matched strings in sed - bash

Is it possible to expand the matched string in a sed command? I want to substitute variable names in a file with their values, this is my script at the moment:
#!/bin/bash
echo "Running the build script..."
VAR1="2005648"
VAR2="7445aa"
SERVER_NAME=$(hostname)
TIMESTAMP=$(date +%m-%d-%Y)
sed -i "s/{[A-Z_][A-Z_]*}/$&/g" my_file.txt #variable names in the file are written between { }
and this is a snapshot of my_file.txt:
Building finished at {TIMESTAMP}
{VAR1}:{VAR2}
On: {SERVER_NAME}
current working directory: {PWD}
But it doesn't work: instead of substituting the variable name with its value, it inserts a dollar sign right before the opening curly bracket.
How do I resolve this?

You could use envsubst to substitute environment variables, otherwise you would need a bunch of sed commands to replace everything.
Change your template file to:
Building finished at ${TIMESTAMP}
${VAR1}:${VAR2}
On: ${SERVER_NAME}
current working directory: ${PWD}
And the script to:
#!/bin/bash
echo "Running the build script..."
export VAR1="2005648"
export VAR2="7445aa"
export SERVER_NAME=$(hostname)
export TIMESTAMP=$(date +%m-%d-%Y)
# only replace the defined variables
envsubst '$VAR1 $VAR2 $SERVER_NAME $TIMESTAMP' < my_file.txt > newfile
# replace all environment variables ($USER, $HOME, $HOSTNAME, etc.)
#envsubst < my_file.txt > newfile
The script replaces environment variables $VAR1, $VAR2, $SERVER_NAME and $TIMESTAMP in my_file.txt and saves the output to newfile.
You can see that ${PWD} doesn't get replaced, because I forgot to add it to the list.
In the second commented example all environment variables are replaced and non-existing variables are replaced by an empty string.
You can use the $VARNAME or ${VARNAME} syntax in the template.

I'd actually do it in a single pass this way using an awk that supports ENVIRON[], e.g. any POSIX awk:
$ cat tst.sh
#!/usr/bin/env bash
echo "Running the build script..."
VAR1=2005648 \
VAR2=7445aa \
SERVER_NAME=$(hostname) \
TIMESTAMP=$(date +%m-%d-%Y) \
awk '
{
    while ( match($0,/{[[:alnum:]_]+}/) ) {
        printf "%s", substr($0,1,RSTART-1) ENVIRON[substr($0,RSTART+1,RLENGTH-2)]
        $0 = substr($0,RSTART+RLENGTH)
    }
    print
}
' file
$ ./tst.sh
Running the build script...
Building finished at 04-14-2020
2005648:7445aa
On: MyLogin
current working directory: /home/MyLogin
but if you really want to do multiple passes calling sed inside a shell loop then ${!variable} is your friend, here's a start:
$ cat tst.sh
#!/usr/bin/env bash
VAR1='2005648'
VAR2='7445aa'
SERVER_NAME='foo'
for var in VAR1 VAR2 SERVER_NAME; do
    echo "var, $var, ${!var}"
done
$ ./tst.sh
var, VAR1, 2005648
var, VAR2, 7445aa
var, SERVER_NAME, foo
$ VAR1='stuff'
$ var='VAR1'; echo 'foo {VAR1} bar' | sed "s/{$var}/${!var}/"
foo stuff bar
The awk script is robust, but YMMV using sed, depending on the contents of the variables: it'd fail if they contain & or / or \1 or .... Note that ENVIRON[] only has access to shell variables that are set on the awk command line or exported, hence the backslash at the end of each line that sets a shell variable, which makes those assignments part of the awk command line.
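One common workaround for that sed fragility is to escape the replacement string before handing it to sed; a minimal sketch (the function name is illustrative; it handles &, / and \, single-line values only):

```shell
#!/usr/bin/env bash
# Escape the characters that are special on the RHS of s///:
# backslash, ampersand, and the chosen delimiter (/ here).
escape_replacement() {
  printf '%s' "$1" | sed -e 's/[\/&]/\\&/g'
}

var='a&b/c'                       # would break a naive s/{VAR}/$var/
safe=$(escape_replacement "$var")
echo 'value: {VAR}' | sed "s/{VAR}/$safe/"
# prints: value: a&b/c
```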

You can try this.
#!/usr/bin/env bash
echo "Running the build script..."
VAR1="2005648"
VAR2="7445aa"
SERVER_NAME=$(hostname)
TIMESTAMP=$(date +%m-%d-%Y)
sed "s|{TIMESTAMP}|$TIMESTAMP|;s|{VAR1}|$VAR1|;s|{VAR2}|$VAR2|;s|{SERVER_NAME}|$SERVER_NAME|;s|{PWD}|$PWD|" file.txt
Just add {} around the variables, e.g. {$TIMESTAMP} and so on, if you really need the braces to remain in the output.
That should work unless there is something more that is not included in the question above.
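The same s|…|…| chain can also be generated from a list of names instead of written by hand, using bash indirect expansion ${!name}; a sketch (assumes the values contain no characters special to sed such as |, & or \):

```shell
#!/usr/bin/env bash
# Build the sed program from a list of variable names.
VAR1="2005648"
VAR2="7445aa"
script=
for name in VAR1 VAR2; do
  # ${!name} expands to the value of the variable whose name is in $name
  script+="s|{$name}|${!name}|g;"
done
echo '{VAR1}:{VAR2}' | sed "$script"
# prints: 2005648:7445aa
```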

Related

How to assign a value to a variable which name is in a variable in bash?

I have a config file which looks like this:
$ cat .config
PARAM1 = avalue # a comment
PARAM2 = "many values" # another comment
# PARAM3=blabla
I wrote a function to read from it:
get_from_config_file()
{
    A=$(grep "$1" ${config_file} | grep -v "^#" | sed s'/^[[:space:]]*//g' | sed s'/#.*$//' | sed s'/^.*=[[:space:]]*//' | sed s'/[[:space:]]*$//' | sed s'/"//g')
    echo "$A"
}
Then I can read the parameters from the config file which works fine:
PARAM1=$(get_from_config_file "PARAM1")
PARAM2=$(get_from_config_file "PARAM2")
But I wanted to make it better (I have many parameters in this config file) so I wanted to be able to grab the value of all my parameters and then assign to variables in a simple for loop -- and here I got in trouble:
for name in PARAM1 PARAM2
do
    value=$(get_from_config_file "$name")
    echo $name, $value
    # How to assign here $value to a variable named PARAM1, PARAM2 which is contained in name ?
    # Note that I do not want to use an array for this
    # param[$name]="$value"
done
Thanks,
Define variables directly using declare
for name in PARAM1 PARAM2
do
    declare -gx "$name"="$(get_from_config_file "$name")"
done
echo PARAM1="$PARAM1"
echo PARAM2="$PARAM2"
When you run the command declare -gx "$name"="$value", Bash first expands the variables name and value, and then executes the resulting command: declare -gx PARAM1=foobar
declare options:
-g create global variables when used in a shell function; otherwise
ignored
-x to make NAMEs export
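The effect of -g is easiest to see when the assignment happens inside a function; a small sketch (requires bash 4.2+ for declare -g; names are illustrative):

```shell
#!/usr/bin/env bash
# Without -g, declare inside a function creates a variable local to the
# function; with -g the assignment lands in the global scope instead.
set_param() {
  declare -gx "$1=$2"
}
set_param PARAM1 foobar
echo "PARAM1=$PARAM1"
# prints: PARAM1=foobar
```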
for name in PARAM1 PARAM2
do
    value=$(get_from_config_file "$name")
    eval "$name=\$value"
done
This should assign the way you want. Note the escaped \$value: eval first expands $name, producing e.g. PARAM2=$value, and $value is then expanded during the assignment itself, where no word splitting happens, so values containing spaces survive intact.
Also for get_from_config_file how about,
awk -v input="$name" -F" = " '$0 ~ input{split($2, arr, "#"); print arr[1]}' .config
read is the command you are missing. while and read can be used together to read variables from a file. How you process the file to remove the comments is up to you; there are many ways. In the following example, I used sed to remove the # comments and convert the delimiter = to a single space.
sed '/^[[:blank:]]*#/d;s/#.*//;s/[[:blank:]]*=[[:blank:]]*/ /' .config \
| while read -r name value; do
echo $name $value
done
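A variable whose name is held in another variable can also be assigned without eval, via printf -v; a sketch with inline sample data (the here document stands in for the cleaned-up config stream):

```shell
#!/usr/bin/env bash
# printf -v writes its output into the named variable instead of stdout.
# The redirected while loop runs in the current shell, so the assignments
# survive the loop (unlike a pipeline into while, which runs in a subshell).
while read -r name value; do
  printf -v "$name" '%s' "$value"
done <<'EOF'
PARAM1 avalue
PARAM2 many values
EOF
echo "$PARAM1 / $PARAM2"
# prints: avalue / many values
```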
If you remove the whitespaces around = signs the file can be interpreted as shell script:
. .config
If you don't want to edit the file:
eval "$(cat .config | sed 's/ = /=/')"

How to scape shell variable with spaces within AWK script

I have the path of "file1 Nov 2018.txt" stored in the variable var. Then I use this shell variable inside the awk script
to generate another script (this is a small example). The issue is that the path and the filename have spaces, and even though I put the variable between double quotes ""
and, within awk, between single quotes '', it is not working; I get the error "No such file or directory".
How do I handle this path that has spaces?
The script is like this:
var="/mydrive/d/data 2018/Documents Nov/file1 Nov 2018.txt"
z=$(awk -v a="$var" 'BEGIN{str = "cat " 'a' ; print str}')
eval "$z"
I get these errors:
$ eval "$z"
cat: /mydrive/d/data: No such file or directory
cat: 2018/Documents: No such file or directory
cat: Nov/file1: No such file or directory
cat: Nov: No such file or directory
cat: 2018.txt: No such file or directory
Thanks for any help.
The single-quote escape sequence comes in handy here. Note that 047 is the value in octal for the ASCII ' character, and awk allows you to use \nnn within a string to include any character using its octal value.
$ cat 'foo bar.txt'
a b c
1 2 3
$ var="foo bar.txt"
$ echo "$var"
foo bar.txt
$ z=$(awk -v a="$var" 'BEGIN{print "cat \047" a "\047"}')
$ eval "$z"
a b c
1 2 3
Maybe it's a bit nicer with printf:
$ awk -v a="$var" 'BEGIN{ printf "cat \047%s\047\n", a }'
cat 'foo bar.txt'
The problem is coming from the fact that the single quote has special meaning to the shell, so it's not surprising that there's a clash when single quotes are also being used in your awk program, when that program is on the command line.
This can be avoided by putting the awk program in its own file:
$ cat a.awk
BEGIN { printf "cat '%s'\n", a }
$ awk -v a="$var" -f a.awk
cat 'foo bar.txt'
remove the single quotes around a and add escaped double quotes instead.
$ echo success > "a b"
$ var="a b"; z=$(awk -v a="$var" 'BEGIN{print "cat \"" a "\""}');
$ eval "${z}"
success
however, most likely you're doing some task unnecessarily complex.
$ cat > path\ to/test
foo
$ z=$(awk -v a="$var" 'BEGIN{gsub(/ /,"\\ ",a); str = "cat " a ; print str}')
$ echo "$z"
cat path\ to/test
$ eval "$z"
foo
The key (in this solution) being: gsub(/ /,"\\ ",a) ie. escaping the spaces with a \ (\\ due to awk).
With bash's printf %q "$var" you can correctly escape any string for later use in eval - even linebreaks will be handled correctly. However, the resulting string may contain special symbols like \ that could be interpreted by awk when assigning variables with awk -v var="$var". Therefore, better pass the variable via stdin:
path='/path/with spaces/and/special/symbols/like/*/?/\/...'
cmd=$(printf %q "$path" | awk '{print "cat "$0}')
eval "$cmd"
In this example the generated command $cmd is
cat /path/with\ spaces/and/special/symbols/like/\*/\?/\\/...

Turning a list of abs pathed files to a comma delimited string of files in bash

I have been working in bash, and need to create a string argument. bash is newish for me, to the point that I don't know how to build a string in bash from a list.
// foo.txt is a list of abs file names.
/foo/bar/a.txt
/foo/bar/b.txt
/delta/test/b.txt
should turn into: a.txt,b.txt,b.txt
OR: /foo/bar/a.txt,/foo/bar/b.txt,/delta/test/b.txt
code
s = ""
for file in $(cat foo.txt);
do
#what goes here? s += $file ?
done
myShellScript --script $s
I figure there was an easy way to do this.
with for loop:
for file in $(cat foo.txt);do echo -n "$file",;done|sed 's/,$/\n/g'
with tr:
cat foo.txt|tr '\n' ','|sed 's/,$/\n/g'
only sed:
sed ':a;N;$!ba;s/\n/,/g' foo.txt
This seems to work:
#!/bin/bash
input="foo.txt"
while IFS= read -r var
do
    basename "$var" >> tmp
done < "$input"
paste -d, -s tmp > result.txt
output: a.txt,b.txt,b.txt
basename gets you the file names you need and paste will put them in the order you seem to need.
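The temporary file for the basenames can also be skipped by piping the loop straight into paste; a sketch of the same idea (a mktemp file stands in for foo.txt here):

```shell
#!/usr/bin/env bash
# Join the basenames of the listed paths without an intermediate tmp file:
# -s joins all input lines serially, -d, sets the comma delimiter.
tmp=$(mktemp)
printf '%s\n' /foo/bar/a.txt /foo/bar/b.txt /delta/test/b.txt > "$tmp"
while IFS= read -r var; do
  basename "$var"
done < "$tmp" | paste -sd, -
rm -f "$tmp"
# prints: a.txt,b.txt,b.txt
```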
The input field separator can be used with set to create split/join functionality:
# split the lines of foo.txt into positional parameters
IFS=$'\n'
set $(< foo.txt)
# join with commas
IFS=,
echo "$*"
For just the file names, add some sed:
IFS=$'\n'; set $(sed 's|.*/||' foo.txt); IFS=,; echo "$*"
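If clobbering IFS and the positional parameters is a concern, the same split/join can be wrapped in a function whose body is a subshell; a sketch (function name is illustrative):

```shell
#!/usr/bin/env bash
# The ( … ) function body is a subshell, so the caller's IFS, glob
# setting, and positional parameters are left untouched.
join_lines() (
  IFS=$'\n'
  set -f                # no glob expansion while splitting
  set -- $(< "$1")      # one line per positional parameter
  IFS=,
  printf '%s\n' "$*"    # "$*" joins with the first char of IFS
)
tmp=$(mktemp)
printf '%s\n' /foo/bar/a.txt /foo/bar/b.txt > "$tmp"
join_lines "$tmp"
rm -f "$tmp"
# prints: /foo/bar/a.txt,/foo/bar/b.txt
```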

Sed variable too long

I need to substitute a unique string in a json file: {FILES} by a bash variable that contains thousands of paths: ${FILES}
sed -i "s|{FILES}|$FILES|" ./myFile.json
What would be the most elegant way to achieve that? The content of ${FILES} is the result of an "aws s3" command. The content would look like:
FILES="/file1.ipk, /file2.ipk, /subfolder1/file3.ipk, /subfolder2/file4.ipk, ..."
I can't think of a solution where xargs would help me.
The safest way is probably to let Bash itself expand the variable. You can create a Bash script containing a here document with the full contents of myFile.json, with the placeholder {FILES} replaced by a reference to the variable $FILES (not the contents itself). Execution of this script would generate the output you seek.
For example, if myFile.json would contain:
{foo: 1, bar: "{FILES}"}
then the script should be:
#!/bin/bash
cat << EOF
{foo: 1, bar: "$FILES"}
EOF
You can generate the script with a single sed command:
sed -e '1i#!/bin/bash\ncat << EOF' -e 's/\$/\\$/g;s/{FILES}/$FILES/' -e '$aEOF' myFile.json
Notice sed is doing two replacements; the first one (s/\$/\\$/g) to escape any dollar signs that might occur within the JSON data (replace every $ by \$). The second replaces {FILES} by $FILES; the literal text $FILES, not the contents of the variable.
Now we can combine everything into a single Bash one-liner that generates the script and immediately executes it by piping it to Bash:
sed -e '1i#!/bin/bash\ncat << EOF' -e 's/\$/\\$/g;s/{FILES}/$FILES/' -e '$aEOF' myFile.json | /bin/bash
Or even better, execute the script without spawning a subshell (useful if $FILES is set without export):
sed -e '1i#!/bin/bash\ncat << EOF' -e 's/\$/\\$/g;s/{FILES}/$FILES/' -e '$aEOF' myFile.json | source /dev/stdin
Output:
{foo: 1, bar: "/file1.ipk, /file2.ipk, /subfolder1/file3.ipk, /subfolder2/file4.ipk, ..."}
Maybe perl would have fewer limitations?
perl -pi -e "s#{FILES}#${FILES}#" ./myFile.json
It's a little gross, but you can do it all within shell...
while read -r l
do
    if ! echo "$l" | grep -q '{FILES}'
    then
        echo "$l"
    else
        echo "$l" | sed 's/{FILES}.*$//'
        echo "$FILES"
        echo "$l" | sed 's/^.*{FILES}//'
    fi
done <./myfile.json >newfile.json
#mv newfile.json myfile.json
#mv newfile.json myfile.json
Obviously I'd leave the final line commented until you were confident it worked...
Maybe just don't do it? Can you just :
echo "var f = " > myFile2.json
echo $FILES >> myFile2.json
And reference myFile2.json from within your other json file? (You should put the global f variable into a namespace if this works for you.)
Instead of putting all those variables in an environment variable, put them in a file. Then read that file in perl:
foo.pl:
open X, "$ARGV[0]" or die "couldn't open";
shift;
$foo = <X>;
chomp $foo;
while (<>) {
    s/\{FILES\}/$foo/;
    print;
}
Command to run:
aws s3 ... >/tmp/myfile.$$
perl foo.pl /tmp/myfile.$$ <myFile.json >newFile.json
Hopefully that will bypass the limitations of the environment variable space and the argument length by pulling all the processing within perl itself.

shell variable inside awk without -v option it is working

I noticed that a shell script variable can be used inside an awk script like this:
var="help"
awk 'BEGIN{print "'$var'" }'
Can anyone tell me how to change the value of var inside awk while retaining the value outside of awk?
Similarly to accessing a variable of shell script inside awk, can we access shell array inside awk? If so, how?
It is impossible: awk runs as a separate process and cannot change the variables of the shell that started it. The only workarounds you have:
use command substitution and write output of awk to the variable;
write data to file and then read from the outer shell;
produce shell output and then execute it with eval.
Examples.
Command substitution, one variable:
$ export A=10
$ A=$(awk 'END {print 2*ENVIRON["A"]}' < /dev/null)
$ echo $A
20
Here you multiply A by two and write the result of the multiplication back.
eval; two variables:
$ export A=10
$ export B=10
$ eval $(awk 'END {print "A="2*ENVIRON["A"]"; B="2*ENVIRON["B"]}' < /dev/null)
$ echo $A
20
$ echo $B
20
$ awk 'END {print "A="2*ENVIRON["A"]"; B="2*ENVIRON["B"]}' < /dev/null
A=40; B=40
It uses a file intermediary, but it does work:
var="hello world"
cat > /tmp/my_script.awk.$$ <<EOF
BEGIN { print "$var" }
EOF
awk -f /tmp/my_script.awk.$$
rm -f /tmp/my_script.awk.$$
This uses the here-document feature of the shell. Check your shell manual for the rules about interpolation within a here document.
