Perl replace content of a file with variable - shell

I have a file a.md:
Start file
```shell
replace content here by a variable or result of a command such as pwd,
the whole content in this code block should be something like /Users/danny/MyProjects/CurrentDir
```
End file
And a script, update.sh, to update the file above:
#!/bin/sh
t=$(pwd)
echo $t
perl -i -p0e 's/(?<=```shell\n)[\s\S]*(?=\n```)/$t/s' a.md
It works fine with a string but I cannot replace with a variable such as $(pwd).

A Perl program on the command line inside a shell script cannot simply use the shell's variables; we need to pass them to it somehow.
There are a few ways to do that,† and perhaps using the -s switch is simplest here:
#!/bin/bash
t=$(pwd)
echo $t
perl -i -0777 -s -pe's/(?<=```shell\n)[\s\S]*(?=\n```)/$d/s' -- -d="$t" a.md
# or
# perl -i -0777 -s -pe'...' -- -d="$(pwd)" a.md
The -s switch for perl enables basic support for switches for the program itself.
So -d=... becomes available as $d in the program, either with the assigned value (here the value of the bash variable $t), or as 1 if the switch is given without a value (just -d). We can pass multiple variables this way, each in its own switch.
The -- after the program marks the start of its arguments.
We do not even need a shell variable; we can use command output directly, as in -d="$(pwd)".
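For example, a minimal sketch of the mechanism on its own, with arbitrary switch names -x and -y:
perl -s -e'print "$x and $y\n"' -- -x=hello -y="$(pwd)"
Run from the question's directory, this prints hello and /Users/danny/MyProjects/CurrentDir.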
† Aside from using storage (files, databases, etc.) or pipes, there are two more ways to pass values directly to this command-line ("one-liner") program:
Pass the variable as an argument and read it from @ARGV in the program
t=$(pwd)
perl -i -0777 -pe'BEGIN { $d = shift } s/.../$d/s' "$t" a.md
We also need to remove it from @ARGV so that the files can then be processed, which is what shift does, and we need to do this in a BEGIN block since -p wraps the code in a loop.
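A complete version with the question's pattern filled in might look like this sketch:
t=$(pwd)
# shift pulls $t out of @ARGV before -p starts treating the remaining arguments as files
perl -i -0777 -pe'BEGIN { $d = shift } s/(?<=```shell\n).*(?=\n```)/$d/s' "$t" a.md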
Export a variable in bash, making it an environment variable; a Perl script can then read it via the %ENV hash
export t=$(pwd)
perl -i -0777 -pe's/.../$ENV{t}/s' a.md
Note, it's t in the %ENV hash ($ENV{t}), the name of the variable, not $t (its value).
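Again as a sketch with the question's pattern:
export t=$(pwd)
perl -i -0777 -pe's/(?<=```shell\n).*(?=\n```)/$ENV{t}/s' a.md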

Related

Serialize a subset of environment variables

I'm trying to export some environment variables for use by a Tomcat process.
There are a few ways to do this (I know how to solve the overall problem), but it bugged me that I didn't know how to do this particular shell task.
Tomcat recommends that all your environment customizations should be exported by "$CATALINA_HOME/bin/setenv.sh".
This whole thing is gonna be stuffed into a Docker container, so the only parameterizability will be via Docker env variables (let's assume for this task that I don't want to use volume mounts or create setenv.sh during the build process).
First, observe that docker run -e can be used to pass environment into the container:
🍔 docker run -eMY_VAR=SUP alpine env
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=a528b6fc264b
MY_VAR=SUP
no_proxy=*.local, 169.254/16
HOME=/root
If we wanted to copy all of that env into setenv.sh, it's as simple as:
SETENV="/usr/local/tomcat/bin/setenv.sh"
echo '#!/bin/sh' > "$SETENV"
echo 'export -p' >> "$SETENV"
env >> "$SETENV"
But copying everything somewhat defeats the point of setenv.sh -- which is, to give your tomcat process a clean environment, with only intentional customizations.
So, we can agree on a convention for "which env vars are ones that we want to pass through to setenv.sh". Everything prefixed with MY_.
And now we get to an interesting shell problem.
env | grep '^MY_' | sed 's/^MY_/EXPORT /'
This gets us pretty close. Output looks like:
🍔 docker run -e MY_VAR=hey alpine sh -c "env | grep '^MY_' | sed 's/^MY_/EXPORT /'"
EXPORT VAR=hey
So, we've selected from the env command: only env vars prefixed with MY_. And we can redirect that output to setenv.sh.
Why do I say "pretty close"? Looks like we're done, right?
Try this for size:
🍔 docker run -e MY_VAR='multi
quote> line
quote> string' alpine sh -c "env | grep '^MY_' | sed 's/^MY_/EXPORT /'"
EXPORT VAR=multi
The script only worked for a simple subset of possibilities; i.e., we only managed to export the first line of our multi-line string.
For your convenience: env output for multi-line strings looks like this:
🍔 docker run -e MY_VAR='multi
line
string' alpine env
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=0d0afaac6bec
MY_VAR=multi
line
string
no_proxy=*.local, 169.254/16
HOME=/root
I hesitate to try and tackle this using awk; there may be further string escaping complications that I have not considered.
I wonder whether there's a better way altogether to select & serialize a subset of exported environment?
EDIT: I negligently tagged this as a bash question, when really my intention was to pose an sh question. Specifically my intention is to get something that will work with no dependencies other than those that come with the alpine docker image. i.e. BusyBox sh, sed, grep, awk, env.
I've retained the bash tag so as not to punish the initial answer that was submitted when this was a bash-only question.
But I will give preference to an sh-compatible answer, and in particular to one that works with just the BusyBox UNIX utils.
So you need several things:
Enumerate the environment variables and select a subset.
For each selected environment variable, emit sh code that sets the variable to the desired value.
You can use export -p if you want to export all variables in a form that can be read back in, but parsing it to select only certain variables is harder. One way to make use of export -p is to unset the other variables. This only works if none of the environment variables is read-only, but you can work around that by running a separate shell instance (as opposed to a subshell).
To gather the list of variables to unset, you only need to get a superset of the list of all environment variables, and remove the ones you want to keep. You can easily do that by filtering the env output. I do that with a simple grep, you may want to use more complex code if your criteria for inclusion are more complex than “begins with a specific prefix”.
The occasional false positive due to a variable containing a newline followed by a valid variable name and an equal sign will only lead to calling unset on a non-existent variable, which does nothing. The desired variables are removed from the exclusion list, so the final output will never omit a desired variable.
excluded=$(env | LC_ALL=C sed -n 's/^\([A-Z_a-z][0-9A-Z_a-z]*\)=.*/\1/p' |
grep -v '^MY_')
sh -c 'unset $1; export -p' sh "$excluded" >setenv.sh
Dash prints an extra export PATH (with no value) if PATH was in the environment when it was invoked. If that bothers you, change sh -c … to (unset PATH; sh -c …).
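With MY_VAR set as in the question, the generated setenv.sh would then contain only something like this (a sketch; the exact quoting of export -p output varies by shell):
export MY_VAR='multi
line
string'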
Assuming GNU grep:
grep --null-data '^MY_' </proc/self/environ
...will emit your environment variables in NUL-delimited form (newlines intact).
Similarly, if you have bash:
while IFS= read -r -d '' vardef; do
[[ $vardef = MY_* ]] && printf '%s\0' "$vardef"
done </proc/self/environ
Note that if these variables were set in the same shell session, you may need to create a subprocess for /proc/self/environ to be updated:
(while IFS= read -r -d '' vardef; do
[[ $vardef = MY_* ]] && printf '%s\0' "$vardef"
done </proc/self/environ)
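To turn those NUL-delimited entries into lines for setenv.sh, one option is bash's printf %q, as in this sketch (note that %q may emit bash-specific $'...' quoting, so the result may not satisfy the question's BusyBox-sh constraint on the reading side):
while IFS= read -r -d '' vardef; do
  # %q quotes newlines and other metacharacters so the line survives re-parsing
  [[ $vardef = MY_* ]] && printf 'export %q\n' "$vardef"
done </proc/self/environ >setenv.sh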
The alpine image doesn't ship with bash.
You can use this script to extract all MY_* variables, including ones whose values contain newlines:
docker run -e MY_FOO=bar -e MY_VAR="multi' export MY_INJECTED='val" -e MY_VAR2=$'multi
0MY_line=val
string' alpine sh -c "awk -v RS='\06' -F= '/^MY_/{k=\$1; sub(/^[^=]+=/, \"\");
gsub(/\047/, \"\047\\\\\\047\047\"); printf \"export %s=\047%s\047\n\", k, \$0
}' /proc/self/environ"
This will output:
export MY_FOO='bar'
export MY_VAR='multi'\'' export MY_INJECTED='\''val'
export MY_VAR2='multi
0MY_line=val
string'
Here is how the awk program works:
-v RS='\06': sets the record separator; \06 works for the NUL byte as well (assuming you don't have \06 in a value)
-F=: sets the field separator to =
/^MY_/: only processes records starting with MY_
k=$1 stores the variable name
sub(/^[^=]+=/, "") strips everything up to the first = from $0, leaving just the value
gsub(/\047/, ...) turns each embedded single quote into '\'' so the value stays safely quoted (\047 is the octal escape for a single quote)
printf formats the output so that it can be used in the $CATALINA_HOME/bin/setenv.sh file
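Outside the docker run wrapper (so without the sh -c escaping layer), the same awk program could feed setenv.sh directly, along these lines:
echo '#!/bin/sh' > "$CATALINA_HOME/bin/setenv.sh"
awk -v RS='\06' -F= '/^MY_/{k=$1; sub(/^[^=]+=/, "");
  gsub(/\047/, "\047\\\047\047"); printf "export %s=\047%s\047\n", k, $0
}' /proc/self/environ >> "$CATALINA_HOME/bin/setenv.sh"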
What about
declare -p ${!MY_*}
and
declare -p ${!MY_*} | sed -r 's/^declare (-[^ ]*)* MY_/export /'
or
declare -p ${!MY_*} | sed 's/^declare \(-[^ ]*\)* MY_/export /'
EDIT: POSIX-compliant version:
Some env or printenv implementations accept a -0 option to end each entry with \0 rather than a newline. Thus:
env -0 | perl -ne 'BEGIN{$/="\0";$\="\n";$q="\047"}next unless /^MY_/;chomp;s/$q/$q\\$q$q/g;s/=/=$q/;s/$/$q/;print'
How it works:
$/: the input record separator
$\: the output record separator
$q: a variable holding a single quote (\047), needed because the program itself is wrapped in single quotes on the command line
next: skips anything that is not a MY_ variable
chomp: removes the input separator
the s/// substitutions escape embedded single quotes, then wrap the value in single quotes
EDIT: a variation of the perl version in POSIX shell (the prefix test uses case, which plain sh supports):
env -0 | xargs -0 sh -c 'for entry; do case $entry in MY_*) ;; *) continue ;; esac; printf "%s=\047%s\047\n" "${entry%%=*}" "$(echo "${entry#*=}" | sed '\''s/\x27/\x27\\\x27\x27/g'\'' )"; done' -

fgrep with string containing spaces inside ksh script

I am trying to write an fgrep statement that removes records from a file when the entire record matches. I can do this on the command line, but not inside a ksh script. The code I am using boils down to these four lines:
Header='abc def|ghi jkl' #I use the head command to populate this variable
workfile=abc.txt
command="fgrep -Fxv \'$Header\' $workfile" >$outfile
$command
When I echo $command, the output is exactly what I would type on the command line (with the single quotes), and that works on the command line. When I execute it within the ksh script (file), the single quotes seem not to be recognized, because the errors show it is parsing on spaces.
I have tried backticks, exec, eval, double quotes instead of single quotes, and not using the $command variable. The problem remains.
I can do this on the command line, but not inside a ksh script
Here's a simple, portable, reliable solution using a heredoc.
#!/usr/bin/env ksh
workfile=abc.txt
outfile=out.txt
IFS= read -r Header <<'EOF'
abc def|ghi jkl
EOF
IFS= read -r command <<'EOF'
grep -Fxv "$Header" "$workfile" > "$outfile"
EOF
eval "$command"
Explanation :
(Comments can't be added to the script above because they would affect the lines in the heredoc)
IFS= read -r Header <<'EOF' # Line separated literal strings
abc def|ghi jkl # Set into the $Header variable
EOF # As if it were a text file
IFS= read -r command <<'EOF' # Command to execute
grep -Fxv "$Header" "$workfile" > "$outfile" # As if it were typed into
EOF # the shell command line
eval "$command" # Execute the command
The above example is the same as having a text file called header.txt, which contains abc def|ghi jkl, and typing the following command:
grep -Fxvf header.txt abc.txt
The heredoc addresses the problem of the script operating differently than the command line as a result of quoting/expansions/escaping issues.
A Word of caution regarding eval:
The use of eval in this example is specific. Please see Eval command and security issues for information on how eval can be misused and cause potentially very damaging results.
More Detail / Alternate Example:
For the sake of completeness, clarity, and ability to apply this concept to other situations, some notes about the heredoc and an alternative demonstration:
This implementation of the heredoc in this example is specifically designed with the following criteria:
Literal string assignment of the contents to the variables (using the quoted 'EOF' delimiter)
Use of the eval command to execute the stored command, expanding the variables referenced within the heredoc at that point.
File or heredoc ?
One strength of using a heredoc combined with grep -F (fgrep), is the ability to treat a section of the script as if it were a file.
Case for file:
You want to frequently paste "pattern" lines into the file, and remove them as necessary, without having to modify the script file.
Case for heredoc:
You apply the script in an environment where specific files already exist, and you want to match specific exact literal patterns against it.
Example:
Scenario: I have 5 VPS servers, and I want a script that produces a new fstab file while ensuring it doesn't contain the exact line:
/dev/xvda1 / ext3 errors=remount-ro,noatime,barrier=0 0 1
This scenario fits the type of situation addressed in this question. I could take the boilerplate from the code in this answer and modify it as follows:
#!/usr/bin/env ksh
workfile=/etc/fstab
IFS= read -r Header <<'EOF'
/dev/xvda1 / ext3 errors=remount-ro,noatime,barrier=0 0 1
EOF
IFS= read -r command <<'EOF'
grep -Fxv "$Header" "$workfile"
EOF
eval "$command"
This would give me a new fstab file, without the line contained in the heredoc.
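Note that the command prints the filtered fstab to standard output; to actually replace the file, one might redirect to a temporary file and move it into place (a sketch; never redirect onto $workfile itself):
eval "$command" > /etc/fstab.new && mv /etc/fstab.new /etc/fstab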
Bash FAQ #50: I'm trying to put a command in a variable, but the complex cases always fail! provides comprehensive guidance; while it is written for Bash, most of it applies to ksh as well.[1]
If you want to stick with storing your command in a variable (defining a function is the better choice), use an array, which bypasses the quoting issues:
#!/usr/bin/env ksh
Header='abc def|ghi jkl'
workfile=abc.txt
outfile=out.txt
# Store command and arguments as elements of an array
command=( 'fgrep' '-Fxv' "$Header" "$workfile" )
# Invoke the array as a command.
"${command[#]}" > "$outfile"
Note: only a simple command can be stored in an array, and redirections can't be part of it.
[1] The function examples use local to create local variables, which ksh doesn't support. Omit local to make do with shell-global variables instead, or use function <name> {...} syntax with typeset instead of local to declare local variables in ksh.
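For reference, the function alternative mentioned above might look like this sketch in ksh (filter_header is a hypothetical name):
#!/usr/bin/env ksh
Header='abc def|ghi jkl'
workfile=abc.txt
outfile=out.txt
# the "function name" syntax lets typeset create true locals in ksh
function filter_header {
    typeset hdr=$1 file=$2
    grep -Fxv "$hdr" "$file"
}
filter_header "$Header" "$workfile" > "$outfile"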

How do I get command-line Perl to accept shell variables?

I can do math like
perl -e 'print 5253413/39151' -l
But I don't quite get how to take advantage of Perl's ability to do math with my own predefined bash variables. I've tried
var1=$(some wc command that yields a number); var2=$(some wc command that yields another number)
perl -e 'print var1/var2' -l
But it doesn't work
There are two main ways to do this.
Within the Perl code you can use the %ENV built-in hash to access environment variables that are exported from the shell
$ export var1=5253413
$ export var2=39151
$ perl -E 'say $ENV{var1}/$ENV{var2}'
134.183366963807
You can use the shell interpolation facility to insert the value of a shell variable into a command
This is best done as parameters to the perl one-liner rather than introducing the values directly into the code
$ var1=5253413
$ var2=39151
$ perl -E '($v1, $v2) = @ARGV; say $v1/$v2' $var1 $var2
134.183366963807
Two less common ways to do this make use of long-standing perl features.
The first is the core module Env, which ties process environment variables to perl variables:
sh$ export VAR1=1000
sh$ export VAR2=3
sh$ perl -MEnv -E 'say $VAR1/$VAR2' # imports all environ vars
333.333333333333
sh$ perl -MEnv=VAR1,VAR2 -E 'say $VAR1/$VAR2' # imports only VAR1, VAR2
333.333333333333
Note that the variables need to be present in the environment inherited by the perl process, for example via export VAR as above, or set for a single command (as in FOO=hello perl -MEnv -E 'say $FOO').
The second and rather more obscure way is to use perl's -s switch to set arbitrary variables from the command line:
sh$ VAR1=1000
sh$ VAR2=3
sh$ perl -s -E 'say $dividend/$divisor' -- -dividend=$VAR1 -divisor=$VAR2
333.333333333333
awk does something similar with its -v switch.
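For comparison, a sketch of the awk counterpart for the same division (awk's default output format rounds to six significant digits):
sh$ awk -v dividend="$VAR1" -v divisor="$VAR2" 'BEGIN { print dividend / divisor }'
333.333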
I believe the spirit of the question is to pass variables without exporting environment variables.
Besides using perl -s -e expression -perlvar=val, the code below uses two other mechanisms to pass the variables to perl.
a=x; b=N; c=z;
b=y perl -e '$pa='$a';' -e "\$pc=$c;" -e 'print "$pa$ENV{b}$pc\n";'
echo $a$b$c
Passing a and c works the same way; only the quoting differs. When chaining expressions like this, it is important to end each expression with a semicolon, because they flow into one program at the end.
Passing b is done via the environment, but instead of using an exported value, the assignment is placed before the command on the same command line, which puts it directly into perl's %ENV.
Lastly, the echo command is there to emphasize that the shell's definition of $b is unchanged.
The mechanism used for b gives a more secure way of passing the value, because it is never left in the shell's exported environment and it will not be seen in the command-line argument list.

How to run "source" command (Linux) from a perl script?

I am trying to source a script from a Perl script (script.pl).
system ("source /some/generic/script");
Please note that this generic script could be a shell, python or any other script. Also, I cannot replicate the logic present inside this generic script into my Perl script. I tried replacing system with ``, exec, and qx//. Each time I got the following error:
Can't exec "source": No such file or directory at script.pl line 18.
I came across many forums on the internet, which discussed various reasons for this problem. But none of them provided a solution. Is there any way to run/execute source command from a Perl script?
In bash etc., source is a builtin that means "read this file and interpret it locally" (a little like a #include).
In this context that makes no sense - you either need to remove source from the command and have a shebang (#!) line at the start of the shell script that tells the system which shell to use to execute that script, or you need to explicitly tell system which shell to use, e.g.
system "/bin/sh", "/some/generic/script";
[with no comment about whether it's actually appropriate to use system in this case].
There are a few things going on here. First, a child process can't change the environment of its parent. That source would only last as long as its process is around.
Here's a short program, set_stuff.sh, that sets and exports an environment variable:
#!/bin/sh
echo "PID" $$
export HERE_I_AM="JH";
Running the file does not export the variable. The file runs in its own process. The process IDs ($$) are different in set_stuff.sh and the shell:
$ chmod 755 set_stuff.sh
$ ./set_stuff.sh
PID 92799
$ echo $$
92077
$ echo $HERE_I_AM # empty
source is different. It reads the file and evaluates it in the shell. The process IDs are the same in set_stuff.sh and the shell, so the file is actually affecting its own process:
$ unset HERE_I_AM # start over
$ source set_stuff.sh
PID 92077
$ echo $$
92077
$ echo $HERE_I_AM
JH
Now on to Perl. Calling system creates a child process (there's an exec in there somewhere) so that's not going to affect the Perl process.
$ perl -lwe 'system( "source set_stuff.sh; echo \$HERE_I_AM" );
print "From Perl ($$): $ENV{HERE_I_AM}"'
PID 92989
JH
Use of uninitialized value in concatenation (.) or string at -e line 1.
From Perl (92988):
Curiously, this works even though your version doesn't. I think the difference is that the next string has no special shell metacharacters, so perl tries to exec the program directly, skipping the shell it just used for my more complicated string:
$ perl -lwe 'system( "source set_stuff.sh" ); print $ENV{HERE_I_AM}'
Can't exec "source": No such file or directory at -e line 1.
Use of uninitialized value in print at -e line 1.
But, you don't want a single string in that case. The list form is more secure, but source isn't a file that anything can execute:
$ which source # nothing
$ perl -lwe 'system( "source", "set_stuff.sh" ); print "From Perl ($$): $ENV{HERE_I_AM}"'
Can't exec "source": No such file or directory at -e line 1.
Use of uninitialized value in concatenation (.) or string at -e line 1.
From Perl (93766):
That is, you can call source, but as something that invokes the shell.
Back to your problem. There are various ways to tackle this, but we need to get the output of the program. Instead of system, use backticks. That's a double-quoted context, so I need to protect some literal $s that I want to pass as part of the shell command:
$ perl -lwe 'my $o = `echo \$\$ && source set_stuff.sh && echo \$HERE_I_AM`; print "$o\nFrom Perl ($$): $ENV{HERE_I_AM}"'
Use of uninitialized value in concatenation (.) or string at -e line 1.
93919
From Shell PID 93919
JH
From Perl (93918):
Inside the backticks, you get what you'd like: the shell program can see the variable. Once back in Perl, it can't. But I have the output now. Let's get fancier and get rid of the PID stuff, because I don't need to see it now:
#!/bin/sh
export HERE_I_AM="JH";
And the shell command creates some output that has the name and value:
$ perl -lwe 'my $o = `source set_stuff.sh && echo HERE_I_AM=\$HERE_I_AM`; print $o'
HERE_I_AM=JH
I can parse that output and set variables in Perl. Now Perl has imported part of the environment of the shell program:
$ perl -lwe 'my $o = `source set_stuff.sh && echo HERE_I_AM=\$HERE_I_AM`; for(split/\R/,$o){ my($k,$v)=split/=/; $ENV{$k}=$v }; print "From Perl: $ENV{HERE_I_AM}"'
From Perl: JH
Let's get the entire environment, though. env outputs every value in the way I just processed it:
$ perl -lwe 'my $o = `source set_stuff.sh && env | sort`; print $o'
...
DISPLAY=:0
EC2_PATH=/usr/local/ec2/ec2-api-tools
EDITOR=/usr/bin/vi
...
I have a few hundred variables set in the shell, and I don't want to expose most of them. Those are all passed along by the Perl process, so I can temporarily clear out %ENV:
$ perl -lwe 'local %ENV=(); my $o = `source set_stuff.sh && env | sort`; print $o'
HERE_I_AM=JH
PWD=/Users/brian/Desktop/test
SHLVL=1
_=/usr/bin/env
Put that together with the post processing code and you have a way to pass that information back up to the parent.
This is, by the way, similar to how you'd pass variables back up to a parent shell process. Since that output is already something the shell understands, you use the shell's eval instead of parsing it.
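As a sketch of that shell-side idiom, using the same set_stuff.sh (fine for simple values; values with spaces or newlines need the quoting discussed in the serialization question above):
$ eval "$( . ./set_stuff.sh && env | grep '^HERE_I_AM=' )"
$ echo $HERE_I_AM
JH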
You can't. source is a shell builtin that 'imports' the contents of that script into your current environment. It's not an executable.
You can replicate some of its functionality by rolling your own: run or parse whatever you're 'sourcing' and capture the result:
print `. file_to_source; echo \$somevar`;
or similar.

calling and changing a file using sed command within a function

Hi, I have wrapped a sed command (which works on its own) within a shell function:
#!/bin/bash
snp2fasta() {
sed -i "s/^\(.\{'$2'\}\)./\1'$3'/" $1;
}
and call it with
$ ./snp2fasta input.txt 45 A
no changes are made to input.txt
However if I simply do
$ sed -i 's/^\(.\{45\}\)./\1A/' input.txt
then this works and the file is changed by changing the 45th character to an A.
However, when wrapping it into a shell script (to handle command-line variables), the shell script snp2fasta.sh runs fine but no changes are made to the file.
Why is this?
If you put it into a script, there is no more need for the function; use the sed command directly in the script.
As in the other related post (Use argument to...), assign the arguments to named variables first, to keep the use of $1, $2 and $3 unambiguous:
#!/bin/bash
# arguments passed to the script (or from any other source, e.g. set internally in the script)
File=$1
Place=$2
NewChar=$3
# sed with unambiguous variable content
sed -i "s/^\(.\{${Place}\}\)./\1${NewChar}/" "${File}"
