Search and replace with Bash - bash

I have a mustache like template file that needs to be parsed. e.g.
abc def ghijk{{ var1 }} lmno {{ var2 }} pq
rst={{ var3 }} uvwzyx
Variables like {{ var1 }} need to be replaced by the value of a variable with the same name, which is already defined earlier in the bash script, e.g. var1="foobar".
I am thinking of using while read and awk to accomplish this, but don't know what is the correct way to do string manipulation in this case.
Thanks in advance!

export var1="duper"
export var2="tester"
export var3=1231
sed -e 's/{{ *\([^} ]*\) *}}/$\1/g' -e 's/^/echo "/' -e 's/$/"/' input | sh
Gives:
abc def ghijkduper lmno tester pq
rst=1231 uvwzyx
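Because the sed output is piped into sh, anything in the template that looks like shell syntax (for example $(...) or backticks) would be executed. As an alternative, here is a minimal sketch in pure bash that only looks up variable values and never evaluates the text. The function name render_mustache is made up for this illustration, and the sketch assumes placeholders contain a plain variable name with optional spaces around it.

```shell
#!/usr/bin/env bash
# render_mustache (hypothetical helper): reads a template on stdin and replaces
# each {{ name }} with the value of the shell variable called name.
# Unset variables expand to an empty string. Nothing is passed to eval or sh.
# Caveat: the function's own local names (line, rest, out, match, name)
# shadow any template variables with the same names.
render_mustache() {
  local re='\{\{ *([A-Za-z_][A-Za-z0-9_]*) *\}\}'
  local line rest out match name
  while IFS= read -r line || [[ -n $line ]]; do
    rest=$line out=
    while [[ $rest =~ $re ]]; do
      match=${BASH_REMATCH[0]} name=${BASH_REMATCH[1]}
      out+=${rest%%"$match"*}${!name-}   # text before the placeholder, then the value
      rest=${rest#*"$match"}             # continue after the placeholder
    done
    printf '%s\n' "$out$rest"
  done
}

var1="duper" var2="tester" var3=1231
render_mustache <<'EOF'
abc def ghijk{{ var1 }} lmno {{ var2 }} pq
rst={{ var3 }} uvwzyx
EOF
```

Substituted values are not scanned again, so a value that itself contains {{ ... }} is printed literally.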

Here's an all awk version that requires a file of key=value pairs for the replacements. If I make a vars file like:
 var1 =foobar
 var2 =elf
 var3 =monkey data
where I "cheated" and included the whitespace that surrounds the variable names in your data file (each key has one space before and one space after the name). Then I made an executable awk file like:
#!/usr/bin/awk -f
BEGIN {FS="="}
NR==FNR {vars[$1]=$2; next}
set_delims==0 {FS="[{][{]|[}][}]"; $0=$0; set_delims=1 }
{
for(i=1;i<=NF;i++) {
printf( "%s%s", ($i in vars) ? vars[$i] : $i, i==NF?"\n":"")
}
}
If the executable awk file is called awko it can be run like awko vars data:
abc def ghijkfoobar lmno elf pq
rst=monkey data uvwzyx

I had a similar issue recently and found this nice mustache implementation in bash: https://github.com/tests-always-included/mo
If you have all your variables already defined in the current bash context it's as simple as calling:
./mo template.file
With a slightly different usage it also supports other mustache features like partials or arrays.

To do this in a scalable manner (lots of macros to expand), you'll probably want to look into using a macro processor. GNU m4 springs to mind, though it only allows alphanumeric and _ characters in macro names, so may be tricky for this particular task.
To get you started, you can use sed to do the replacements with this sed script (I called it mustache.sed):
s/{{ var1 }}/foobar/g
s/{{ var2 }}/bazbar/g
s/{{ var3 }}/quxbar/g
In use with your example:
$ sed -f mustache.sed mustache.txt
abc def ghijkfoobar lmno bazbar pq
rst=quxbar uvwzyx
$
You could put this sed script all on one sed command line, but I think using the script makes it more readable.

Related

Modify a shell variable inside awk block of code

Is there any way to modify a shell variable inside awk block of code?
--------- [shell_awk.sh]---------------------
#!/bin/bash
shell_variable_1=<value A>
shell_variable_2=<value B>
shell_variable_3=<value C>
awk 'function A(X)
{ return X+1 }
{ a=A('$shell_variable_1')
b=A('$shell_variable_2')
c=A('$shell_variable_3')
shell_variable_1=a
shell_variable_2=b
shell_variable_3=c
}' FILE.TXT
--------- [shell_awk.sh]---------------------
This is a very simple example; the real script loads a file and makes some changes using functions. I need to keep each value from before the change in a specific variable, so that I can then record the before and after values in MySQL.
The after value is received from parameters ($1, $2 and so on).
I already know how to get the before value from the file.
Everything works, except setting the shell variables from the awk variables. Outside the awk block they are easy to set, but is it possible inside?
No program -- in awk, shell, or any other language -- can directly modify a parent process's memory. That includes variables. However, of course, your awk can write contents to stdout, and the parent shell can read that content and modify its own variables itself.
Here's an example of awk that writes key/value pairs out to be read by bash. It's not perfect -- read the caveats below.
#!/bin/bash
shell_variable_1=10
shell_variable_2=20
shell_variable_3=30
run_awk() {
awk -v shell_variable_1="$shell_variable_1" \
-v shell_variable_2="$shell_variable_2" \
-v shell_variable_3="$shell_variable_3" '
function A(X) { return X+1 }
{ a=A(shell_variable_1)
b=A(shell_variable_2)
c=A(shell_variable_3) }
END {
print "shell_variable_1=" a
print "shell_variable_2=" b
print "shell_variable_3=" c
}' <<<""
}
while IFS="=" read -r key value; do
printf -v "$key" '%s' "$value"
done < <(run_awk)
for var in shell_variable_{1,2,3}; do
printf 'New value for %s is %s\n' "$var" "${!var}"
done
Advantages
Doesn't use eval. Content such as $(rm -rf ~) in the output from awk won't be executed by your shell.
Disadvantages
Can't handle variable contents with newlines. (You could fix this by NUL-delimiting output from your awk script, and adding -d '' to the read command).
A hostile awk script could modify PATH, LD_LIBRARY_PATH, or other security-sensitive variables. (You could fix this by reading variables into an associative array, rather than the global namespace, or by enforcing a prefix on their names).
The code above uses several ksh extensions also available in bash; however, it will not run with POSIX sh. Thus, be sure not to run this via sh scriptname (which only guarantees POSIX functionality).
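The associative-array fix mentioned in the disadvantages could look roughly like the sketch below. It assumes bash 4 or later (for declare -A); the array name from_awk and the shell_variable_ prefix check are choices made for this illustration, and run_awk here is a stand-in that just prints fixed pairs, including a PATH line to show that it is ignored.

```shell
#!/usr/bin/env bash
# Stand-in for the real awk program: prints key=value pairs on stdout.
run_awk() {
  awk 'BEGIN {
    print "shell_variable_1=11"
    print "shell_variable_2=21"
    print "PATH=/tmp/evil"
  }'
}

# Collect the pairs in an associative array instead of the global namespace,
# and accept only keys with the expected prefix.
declare -A from_awk=()
while IFS='=' read -r key value; do
  [[ $key == shell_variable_* ]] || continue
  from_awk[$key]=$value
done < <(run_awk)

printf 'shell_variable_1 is now %s\n' "${from_awk[shell_variable_1]}"
printf 'shell_variable_2 is now %s\n' "${from_awk[shell_variable_2]}"
```

With this layout the shell's own PATH is never touched, whatever the awk script prints.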

Assign a variable the value of a string in a file

I have a file called info.log which contains the line:
/home/jax/Main_X_1_A
X, 1 and A are meaningful and they can change. However "Main" and the underscores remain the same.
Is it possible to use a utility to assign a shell variable a value based on the information in info.log?
E.g.
MY_VERSION="?_?_?";
Where the question marks represent the single characters that are found in those locations.
For example if info.log contained this line:
/home/jax/Main_1_2_3
And we used that data to initialise a shell variable:
MY_VERSION=...
echo $MY_VERSION
The output would be:
1_2_3
Updating question with better example:
Info.log
MODULE=TEST
QUICK_BUILD_DIR=/usr/apps/Main_1_2_3
ANT_FILE=build.xml
FANCE=/usr/apps/test/Main_1_2_3
I want to be able to take these three numbers(1, 2 and 3):
QUICK_BUILD_DIR=/usr/apps/Main_1_2_3
And assign them to variables.
Note: 1, 2 and 3 are just example numbers and they can change.
Can you try this?
var="/home/jax/Main_1_3_2"
version=$(echo "$var" | sed 's/.*Main_\(.*\)/\1/') # version will be 1_3_2
This uses bash and sed.
A GNU Awk Solution
$ MY_VERSION=$(awk -F/ '/Main_/ { sub(/Main_/, "", $NF); print $NF }' info.log)
$ echo "$MY_VERSION"
X_1_A
You can use this awk command:
cat file
/home/jill/Main_1_2_4
/home/jax/Main_1_2_3
/home/john/Main_X_1_A
awk -v u=jax -F '/' '$3==u{sub(/^Main_/, "", $4); print $4}' file
1_2_3
Here you can pass any username in u variable to awk (as jax is being passed here) and version will be picked from that particular line.
No need for external utilities. Bash can do the string manipulation for you:
$ cat info.log
/home/jax/Main_X_1_A
$ read -r a < info.log
$ b="${a#*_}"
$ echo "$b"
X_1_A
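For the updated example, where the line of interest is the QUICK_BUILD_DIR entry and the three parts should end up in separate variables, the same parameter expansions can be combined with read. This is only a sketch: the function name version_from_log and the variable names first, second and third are invented for the illustration, and the log file is recreated in a temporary file so the example runs on its own.

```shell
#!/usr/bin/env bash
# Print the text after "Main_" from the QUICK_BUILD_DIR line of a key=value file.
version_from_log() {
  local key value
  while IFS='=' read -r key value; do
    if [[ $key == QUICK_BUILD_DIR ]]; then
      printf '%s\n' "${value##*/Main_}"   # drop everything through the last "/Main_"
      return 0
    fi
  done < "$1"
  return 1
}

# Recreate part of the example Info.log in a temporary file.
log=$(mktemp)
printf '%s\n' 'MODULE=TEST' 'QUICK_BUILD_DIR=/usr/apps/Main_1_2_3' 'ANT_FILE=build.xml' > "$log"

MY_VERSION=$(version_from_log "$log")
IFS=_ read -r first second third <<< "$MY_VERSION"   # split on underscores
echo "$MY_VERSION -> $first $second $third"
rm -f "$log"
```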

"Piping" values into Bash variables

I have a Python script that outputs two numbers like so: 1.0 2.0 (that's a space in between the numbers, but it can be a \t, or whatever). I want a bash variable to save the 1.0, and another variable to save the 2.0. Is this possible?
In the past, I've only "piped" one value into a variable like so:
var=`python file.py` ;
but now, I'm interested in saving two values from the python file. Conceptually, similar to:
var1,var2=`python file.py` ;
Any advice / help?
Thanks!
You can use something like this:
read var1 var2 < <(python file.py)
The funky <( ) syntax is called process substitution.
The one-liner I use for splitting fields is
... | awk '{print $1}' | ... # or $2, $3, etc.
so you could do
var=`foo`
var1=`echo "$var" | awk '{print $1}'`
var2=`echo "$var" | awk '{print $2}'`
edit: added quotes around $var
I guess the most efficient and elegant thing here would be to use readarray in order to read the value into an array. That's if you're okay with using arrays, of course. You should be, but you never know. This would require the delimiter to be a newline, though. Anyhow :
readarray -t values < <(python file.py)
Will get you an array of one element for each line output by the python file.py with the trailing newline removed. Check out man bash for other options for this very cool builtin.
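If the numbers arrive on a single line, as in the question, read -a can fill an array without needing newline delimiters. The sketch below uses a small function, emit_numbers, as a stand-in for python file.py; that name is invented for the example.

```shell
#!/usr/bin/env bash
# Stand-in for `python file.py`: prints two numbers separated by a tab.
emit_numbers() { printf '1.0\t2.0\n'; }

# Two named variables; read splits on spaces and tabs by default.
read -r var1 var2 < <(emit_numbers)

# The same line split into an array.
read -r -a values < <(emit_numbers)

echo "var1=$var1 var2=$var2"
echo "count=${#values[@]} first=${values[0]} second=${values[1]}"
```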

set multiple variables from one awk command?

This is a very common script:
#!/bin/bash
teststr="col1 col2"
var1=`echo ${teststr} | awk '{print $1}'`
var2=`echo ${teststr} | awk '{print $2}'`
echo var1=${var1}
echo var2=${var2}
However, I don't like this, especially when there are more fields to parse.
I guess there should be a better way, like:
(var1,var2)=`echo ${teststr} | awk '{print $1 $2}'`
(in my imagination)
Is that so?
Thanks for any help to improve efficiency and save some CPU power.
This might work for you:
var=(col0 col1 col2)
echo "${var[1]}"
col1
Yes, you can, but the best practice is to use the awk way to pass variables to awk.
Example using shell script variables
awk -v awkVar1="$scriptVar1" -v awkVar2="$scriptVar2" '<your awk code>'
Example using environment variables (an exported variable can be passed with -v like any other shell variable, or read inside the awk program as ENVIRON["ENV_VAR1"])
awk -v awkVar1="$ENV_VAR1" -v awkVar2="$ENV_VAR2" '<your awk code>'
It's possible to use script and environment variables at the same time
awk -v awkVar2="$scriptVar2" 'BEGIN { awkVar1 = ENVIRON["ENV_VAR1"] } <your awk code>'
You may find bash tricks to circumvent the awk way to do it, but it's not safe.
Explanation and more examples
Awk works this way because it's a programming language in its own right and has its own way of using variables 'inside' awk statements.
By 'inside' I mean the part between the single quotes.
Let's see an example, where we turn off DHCP in a config file, all done using variables in a shell script. I'm going to explain the last line of code.
The script isn't optimal; its main purpose is to show the use of script variables. Explaining how the script does its job is out of scope for this answer; the focus is on explaining the use of variables.
#!/bin/bash
# set some variables
# set path to the config file to edit
CONFIG_FILE=/etc/netplan/01-netcfg.yaml
# find the line number of the line to change using awk and assign it to a variable
DHCP_LINE=$(awk '/dhcp4: yes/{print FNR}' $CONFIG_FILE)
# get the number of spaces used for indentation using awk and assign it to a variable
SPACES=$(awk -v awkDHCP_LINE="$DHCP_LINE" 'FNR==awkDHCP_LINE {print match($0,/[^ ]|$/)-1}' $CONFIG_FILE)
# find DHCP setting and turn it off if needed; the trailing 1 prints every line, so the edited config goes to stdout
awk -v awkDHCP_LINE="$DHCP_LINE" -v awkSPACES="$SPACES" 'FNR==awkDHCP_LINE {sub("dhcp4: yes", "dhcp4: no")} 1' $CONFIG_FILE
Let's break this last line into pieces for explanation.
awk -v awkDHCP_LINE="$DHCP_LINE" -v awkSPACES="$SPACES"
This part above assigns the value of the DHCP_LINE script variable to the awkDHCP_LINE awk variable and the value of the SPACES script variable to the awkSPACES awk variable.
Please note that the SPACES variable is passed to awk only for the sake of showing how to pass multiple variables; the awk command doesn't process it.
'FNR==awkDHCP_LINE {sub("dhcp4: yes", "dhcp4: no")} 1'
This one above is the 'inside' part of awk where the variable(s) passed to awk can be used.
$CONFIG_FILE
This part is outside awk, a generic script variable is used to specify the file that should be processed.
I hope this clears things a bit :)
Note: if you have lots of variables to pass, the solution provided by @potong may prove a better approach depending on your use case.
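To make the two ways of getting values into awk concrete, here is a small self-contained sketch. The variable names mirror the placeholders used above (scriptVar1, ENV_VAR1, awkVar1) and the values are made up. One difference worth knowing: values passed with -v have backslash escape sequences interpreted by awk, while values read through ENVIRON are taken as they are.

```shell
#!/usr/bin/env bash
scriptVar1='plain value'           # ordinary (unexported) shell variable
export ENV_VAR1='exported value'   # environment variable

result=$(awk -v awkVar1="$scriptVar1" 'BEGIN {
  print "from -v: " awkVar1                    # passed on the command line
  print "from ENVIRON: " ENVIRON["ENV_VAR1"]   # read from the environment
}')
printf '%s\n' "$result"
```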
Bash has array support; we just need to supply the values dynamically :)
function test_set_array_from_awk(){
# Note: -a declares an indexed array (local to this function)
declare -a myArr;
# Hard-coded values
# myArr=( "Foo" "Bar" "other" );
# echo "${myArr[1]}" # Print Bar
# Dynamic values
myArr=( $(echo "" | awk '{print "Foo"; print "Bar"; print "Fooo-And-Bar"; }') );
# Value @index 0
echo "${myArr[0]}" # Print Foo
# Value @index 1
echo "${myArr[1]}" # Print Bar
# Array Length
echo ${#myArr[@]} # Print 3 as array length
# Safe Reading with Default value
echo "${myArr[10]-"Some-Default-Value"}" # Print Some-Default-Value
echo "${myArr[10]-0}" # Print 0
echo "${myArr[10]-''}" # Print ''
echo "${myArr[10]-}" # Print nothing
# With Dynamic Index
local n=2
echo "${myArr["${n}"]-}" # Print Fooo-And-Bar
}
# calling test function
test_set_array_from_awk
Bash Array Documentation : http://tldp.org/LDP/abs/html/arrays.html
You can also use the shell's set builtin to place whitespace-separated (or, more accurately, IFS-separated) fields into the variables $1, $2 and so on:
#!/bin/bash
teststr="col1 col2"
set -- $teststr
echo "$1" # col1
echo "$2" # col2
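If awk really is needed to compute the fields, a single awk call can still feed several variables by letting read split its output. This is a sketch using the question's teststr; it relies on process substitution and here-strings, so it needs bash rather than plain sh.

```shell
#!/usr/bin/env bash
teststr="col1 col2"

# One awk process prints both fields on one line; read splits them into two variables.
read -r var1 var2 < <(awk '{print $1, $2}' <<< "$teststr")

echo "var1=${var1}"
echo "var2=${var2}"
```

Unlike set -- $teststr, this leaves the positional parameters alone and does not expand glob characters in the data.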

How to replace ${} placeholders in a text file?

I want to pipe the output of a "template" file into MySQL, the file having variables like ${dbName} interspersed. What is the command line utility to replace these instances and dump the output to standard output?
The input file is considered to be safe, but faulty substitution definitions could exist. Performing the replacement should avoid performing unintended code execution.
Update
Here is a solution from yottatsa on a similar question that only does replacement for variables like $VAR or ${VAR}, and is a brief one-liner
i=32 word=foo envsubst < template.txt
Of course if i and word are in your environment, then it is just
envsubst < template.txt
On my Mac it looks like it was installed as part of gettext and from MacGPG2
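Since the question asks to avoid unintended substitutions, it may help that GNU envsubst accepts an optional argument listing the variables it is allowed to replace; everything else is left untouched. A small sketch, assuming envsubst from gettext is installed; the template text here is made up.

```shell
#!/usr/bin/env bash
export dbName=testdb

# Only ${dbName} is listed, so ${HOME} and $(date) pass through unchanged.
rendered=$(printf '%s\n' 'CREATE DATABASE ${dbName}; -- ${HOME} and $(date) stay as they are' |
  envsubst '${dbName}')
printf '%s\n' "$rendered"
```

The single quotes around '${dbName}' matter: the shell must pass the text through literally so that envsubst sees the variable name rather than its value.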
Old Answer
Here is an improvement on the solution from mogsie on a similar question. My solution does not require you to escape double quotes; mogsie's does, but his is a one-liner!
eval "cat <<EOF
$(<template.txt)
EOF
" 2> /dev/null
The strength of these two solutions is that only a few kinds of shell expansion are performed that would not normally occur: $((...)), `...`, and $(...) (backslash is also an escape character here). You don't have to worry about bugs in a hand-written parser, and multi-line input works just fine. Be aware that `...` and $(...) run commands, so use this only with templates you trust.
Sed!
Given template.txt:
The number is ${i}
The word is ${word}
we just have to say:
sed -e "s/\${i}/1/" -e "s/\${word}/dog/" template.txt
Thanks to Jonathan Leffler for the tip to pass multiple -e arguments to the same sed invocation.
Use /bin/sh. Create a small shell script that sets the variables, and then parse the template using the shell itself. Like so (edit to handle newlines correctly):
File template.txt:
the number is ${i}
the word is ${word}
File script.sh:
#!/bin/sh
#Set variables
i=1
word="dog"
#Read in template one line at the time, and replace variables (more
#natural (and efficient) way, thanks to Jonathan Leffler).
while read line
do
eval echo "$line"
done < "./template.txt"
Output:
#sh script.sh
the number is 1
the word is dog
I was thinking about this again, given the recent interest, and I think that the tool that I was originally thinking of was m4, the macro processor for autotools. So instead of the variable I originally specified, you'd use:
$ echo 'I am a DBNAME' | m4 -DDBNAME="database name"
Create rendertemplate.sh:
#!/usr/bin/env bash
eval "echo \"$(cat $1)\""
And template.tmpl:
Hello, ${WORLD}
Goodbye, ${CHEESE}
Render the template:
$ export WORLD=Foo
$ CHEESE=Bar ./rendertemplate.sh template.tmpl
Hello, Foo
Goodbye, Bar
template.txt
Variable 1 value: ${var1}
Variable 2 value: ${var2}
data.sh
#!/usr/bin/env bash
declare var1="value 1"
declare var2="value 2"
parser.sh
#!/usr/bin/env bash
# args
declare file_data=$1
declare file_input=$2
declare file_output=$3
source $file_data
eval "echo \"$(< $file_input)\"" > $file_output
./parser.sh data.sh template.txt parsed_file.txt
parsed_file.txt
Variable 1 value: value 1
Variable 2 value: value 2
Here's a robust Bash function that - despite using eval - should be safe to use.
All ${varName} variable references in the input text are expanded based on the calling shell's variables.
Nothing else is expanded: neither variable references whose names are not enclosed in {...} (such as $varName), nor command substitutions ($(...) and legacy syntax `...`), nor arithmetic substitutions ($((...)) and legacy syntax $[...]).
To treat a $ as a literal, \-escape it; e.g.:\${HOME}
Note that input is only accepted via stdin.
Example:
$ expandVarsStrict <<<'$HOME is "${HOME}"; `date` and \$(ls)' # only ${HOME} is expanded
$HOME is "/Users/jdoe"; `date` and $(ls)
Function source code:
expandVarsStrict(){
local line lineEscaped
while IFS= read -r line || [[ -n $line ]]; do # the `||` clause ensures that the last line is read even if it doesn't end with \n
# Escape ALL chars. that could trigger an expansion..
IFS= read -r -d '' lineEscaped < <(printf %s "$line" | tr '`([$' '\1\2\3\4')
# ... then selectively reenable ${ references
lineEscaped=${lineEscaped//$'\4'{/\${}
# Finally, escape embedded double quotes to preserve them.
lineEscaped=${lineEscaped//\"/\\\"}
eval "printf '%s\n' \"$lineEscaped\"" | tr '\1\2\3\4' '`([$'
done
}
The function assumes that no 0x1, 0x2, 0x3, and 0x4 control characters are present in the input, because those chars. are used internally - since the function processes text, that should be a safe assumption.
Here's my solution with Perl, based on a former answer; it replaces environment variables:
perl -p -e 's/\$\{(\w+)\}/(exists $ENV{$1}?$ENV{$1}:"missing variable $1")/eg' < infile > outfile
I would suggest using something like Sigil:
https://github.com/gliderlabs/sigil
It is compiled to a single binary, so it's extremely easy to install on systems.
Then you can do a simple one-liner like the following:
cat my-file.conf.template | sigil -p $(env) > my-file.conf
This is much safer than eval and easier than using regex or sed.
Here is a way to get the shell to do the substitution for you, as if the contents of the file were instead typed between double quotes.
Using the example of template.txt with contents:
The number is ${i}
The word is ${word}
The following line will cause the shell to interpolate the contents of template.txt and write the result to standard out.
i='1' word='dog' sh -c 'echo "'"$(cat template.txt)"'"'
Explanation:
i and word are passed as environment variables scoped to the execution of sh.
sh executes the contents of the string it is passed.
Strings written next to one another become one string, that string is:
'echo "' + "$(cat template.txt)" + '"'
Since the substitution is between ", "$(cat template.txt)" becomes the output of cat template.txt.
So the command executed by sh -c becomes:
echo "The number is ${i}\nThe word is ${word}",
where i and word are the specified environment variables.
If you are open to using Perl, that would be my suggestion. Although there are probably some sed and/or AWK experts that probably know how to do this much easier. If you have a more complex mapping with more than just dbName for your replacements you could extend this pretty easily, but you might just as well put it into a standard Perl script at that point.
perl -p -e 's/\$\{dbName\}/testdb/s' yourfile | mysql
A short Perl script to do something slightly more complicated (handle multiple keys):
#!/usr/bin/env perl
my %replace = ( 'dbName' => 'testdb', 'somethingElse' => 'fooBar' );
undef $/;
my $buf = <STDIN>;
$buf =~ s/\$\{$_\}/$replace{$_}/g for keys %replace;
print $buf;
If you name the above script as replace-script, it could then be used as follows:
replace-script < yourfile | mysql
file.tpl:
The following bash function should only replace ${var1} syntax and ignore
other shell special chars such as `backticks` or $var2 or "double quotes".
If I have missed anything - let me know.
script.sh:
template(){
# usage: template file.tpl
while read -r line ; do
line=${line//\"/\\\"}
line=${line//\`/\\\`}
line=${line//\$/\\\$}
line=${line//\\\${/\${}
eval "echo \"$line\"";
done < ${1}
}
var1="*replaced*"
var2="*not replaced*"
template file.tpl > result.txt
I found this thread while wondering the same thing. It inspired me to this (careful with the backticks)
$ echo $MYTEST
pass!
$ cat FILE
hello $MYTEST world
$ eval echo `cat FILE`
hello pass! world
Lots of choices here, but figured I'd toss mine on the heap. It is perl based, only targets variables of the form ${...}, takes the file to process as an argument and outputs the converted file on stdout:
use Env;
Env::import();
while(<>) { $_ =~ s/(\${\w+})/$1/eeg; $text .= $_; }
print "$text";
Of course I'm not really a perl person, so there could easily be a fatal flaw (works for me though).
It can be done in bash itself if you have control of the configuration file format. You just need to source (".") the configuration file rather than run it in a subshell. That ensures the variables are created in the context of the current shell (and continue to exist) rather than in the subshell (where the variables disappear when the subshell exits).
$ cat config.data
export parm_jdbc=jdbc:db2://box7.co.uk:5000/INSTA
export parm_user=pax
export parm_pwd=never_you_mind
$ cat go.bash
. config.data
echo "JDBC string is " $parm_jdbc
echo "Username is " $parm_user
echo "Password is " $parm_pwd
$ bash go.bash
JDBC string is jdbc:db2://box7.co.uk:5000/INSTA
Username is pax
Password is never_you_mind
If your config file cannot be a shell script, you can just 'compile' it before executing thus (the compilation depends on your input format).
$ cat config.data
parm_jdbc=jdbc:db2://box7.co.uk:5000/INSTA # JDBC URL
parm_user=pax # user name
parm_pwd=never_you_mind # password
$ cat go.bash
cat config.data |
sed 's/#.*$//' |
sed 's/[ \t]*$//' |
sed 's/^[ \t]*//' |
grep -v '^$' |
sed 's/^/export /' >config.data-compiled
. config.data-compiled
echo "JDBC string is " $parm_jdbc
echo "Username is " $parm_user
echo "Password is " $parm_pwd
$ bash go.bash
JDBC string is jdbc:db2://box7.co.uk:5000/INSTA
Username is pax
Password is never_you_mind
In your specific case, you could use something like:
$ cat config.data
export p_p1=val1
export p_p2=val2
$ cat go.bash
. ./config.data
echo "select * from dbtable where p1 = '$p_p1' and p2 like '$p_p2%' order by p1"
$ bash go.bash
select * from dbtable where p1 = 'val1' and p2 like 'val2%' order by p1
Then pipe the output of go.bash into MySQL and voila, hopefully you won't destroy your database :-).
In place perl editing of potentially multiple files, with backups.
perl -e 's/\$\{([^}]+)\}/defined $ENV{$1} ? $ENV{$1} : ""/eg' \
-i.orig \
-p config/test/*
I created a shell templating script named shtpl. My shtpl uses a jinja-like syntax which, now that I use ansible a lot, I'm pretty familiar with:
$ cat /tmp/test
{{ aux=4 }}
{{ myarray=( a b c d ) }}
{{ A_RANDOM=$RANDOM }}
$A_RANDOM
{% if $(( $A_RANDOM%2 )) == 0 %}
$A_RANDOM is even
{% else %}
$A_RANDOM is odd
{% endif %}
{% if $(( $A_RANDOM%2 )) == 0 %}
{% for n in 1 2 3 $aux %}
\$myarray[$((n-1))]: ${myarray[$((n-1))]}
/etc/passwd field #$n: $(grep $USER /etc/passwd | cut -d: -f$n)
{% endfor %}
{% else %}
{% for n in {1..4} %}
\$myarray[$((n-1))]: ${myarray[$((n-1))]}
/etc/group field #$n: $(grep ^$USER /etc/group | cut -d: -f$n)
{% endfor %}
{% endif %}
$ ./shtpl < /tmp/test
6535
6535 is odd
$myarray[0]: a
/etc/group field #1: myusername
$myarray[1]: b
/etc/group field #2: x
$myarray[2]: c
/etc/group field #3: 1001
$myarray[3]: d
/etc/group field #4:
More info on my github
To me this is the easiest and most powerful solution; you can even include other templates using the same command, eval echo "$(<template.txt)":
Example with nested template
create the template files; the variables are in regular bash syntax, ${VARIABLE_NAME} or $VARIABLE_NAME
you have to escape special characters with \ in your templates, otherwise they will be interpreted by eval.
template.txt
Hello ${name}!
eval echo $(<nested-template.txt)
nested-template.txt
Nice to have you here ${name} :\)
create source file
template.source
declare name=royman
parse the template
source template.source && eval echo "$(<template.txt)"
the output
Hello royman!
Nice to have you here royman :)
envsubst
Please don't use anything else (i.e. don't eval).

Resources