snakemake: Problem with --rerun-triggers flag and bash variable - bash

I have a problem when I try to provide the --rerun-triggers flag as a bash variable.
My command is
snakemake $snakemake_extra -pr --snakefile Snakefile --configfile config.yaml -c 20 -n
and snakemake_extra is a bash variable defined as
snakemake_extra="--rerun-triggers {mtime,input,params}"
I get the following error:
snakemake: error: argument --rerun-triggers: invalid choice: '{mtime,input,params}' (choose from 'mtime', 'params', 'input', 'software-env', 'code')
The problem seems to be that snakemake(?) adds single-quotes before and after the {}.
When I insert the --rerun-triggers flag directly (without bash variable) it works fine. I need the bash variable however and can also not use a snakemake profile yaml.
Is there any possible workaround?
I am using snakemake version 7.12.1.
Thanx,
Carlus

Is there any possible workaround?
https://mywiki.wooledge.org/BashFAQ/050
Use an array.
snakemake_extra=( --rerun-triggers {mtime,input,params} )
... "${snakemake_extra[@]}" ...
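A minimal sketch of why the array form helps, using a hypothetical helper show_args in place of snakemake so it runs anywhere. Brace expansion is applied when the array is assigned, whereas the result of a variable expansion is never brace-expanded, so the string form hands the program the literal word {mtime,input,params}. The single quotes in the error message are only how the argument is displayed.

```shell
#!/usr/bin/env bash
# show_args is a hypothetical stand-in for snakemake: it prints one argument per line.
show_args() { printf '<%s>\n' "$@"; }

# String form: braces inside quotes are stored literally, and the result of a
# variable expansion is never brace-expanded afterwards.
extra_str="--rerun-triggers {mtime,input,params}"
str_out=$(show_args $extra_str)     # unquoted on purpose, as in the question

# Array form: the unquoted braces are expanded once, at assignment time.
extra_arr=( --rerun-triggers {mtime,input,params} )
arr_out=$(show_args "${extra_arr[@]}")

printf 'string form:\n%s\n\narray form:\n%s\n' "$str_out" "$arr_out"
```

The string form yields two arguments and the array form yields four, which is what --rerun-triggers expects.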

Related

strip quotes from variable from .env file included in makefile

I have a situation where I have an environment variable with space character inside. Some tools do not like quoting the value of the variable, as they will treat the quote as part of the variable.
This is set in a .env file.
PIP_EXTRA_INDEX_URL="https://token@repo https://token@repo"
When I include and export this .env file in a Makefile, I get this warning.
WARNING: Location '"https://token@repo' is ignored:
it is either a non-existing path or lacks a specific scheme.
But I have seen this behavior as initially mentioned also with other tools. Is there a way to handle this?
In the Makefile, I include it like below.
include .env
export
build:
docker build --build-arg PIP_EXTRA_INDEX_URL -t myimage .
Makefiles are not shell scripts and it is not possible to use the same syntax to define variables in both the shell and in make, except in very limited situations.
In the shell, you can have multiple assignments on the same line or even run programs on the same line. So, if your assignment has whitespace in it you have to quote it as you've done here.
In make, the syntax of an assignment is that all text after the assignment (and leading whitespace) becomes the value of the variable and there is no quoting needed; any quotes that are seen are kept as part of the variable value.
So, in the shell this assignment:
PIP_EXTRA_INDEX_URL='https://token@repo https://token@repo'
sets the shell variable PIP_EXTRA_INDEX_URL to the value https://token@repo https://token@repo ... note the quotes are stripped from the value by the shell.
In make this assignment:
PIP_EXTRA_INDEX_URL='https://token@repo https://token@repo'
sets the make variable PIP_EXTRA_INDEX_URL to the value 'https://token@repo https://token@repo' ... note the quotes are not stripped from the value by make.
So if you use this value in a recipe like this:
do something "$(PIP_EXTRA_INDEX_URL)"
then make will expand that variable and you'll get:
do something "'https://token@repo https://token@repo'"
(including quotes) and that's your problem.
It works like this.
build:
docker build --build-arg PIP_EXTRA_INDEX_URL=$(PIP_EXTRA_INDEX_URL) -t myimage .
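If the .env file has to keep its shell-style quoting, another option is to let a shell read it rather than make. The sketch below is an assumption layered on the answer, not part of it: it writes a throwaway .env file with hypothetical contents to a temporary directory, sources it under set -a so each assignment is exported, and shows that the value arrives without the quotes.

```shell
#!/usr/bin/env bash
# Throwaway .env with a quoted value containing a space (hypothetical contents).
tmpdir=$(mktemp -d)
cat > "$tmpdir/.env" <<'EOF'
PIP_EXTRA_INDEX_URL="https://example.invalid/a https://example.invalid/b"
EOF

# set -a marks every variable assigned while it is active for export;
# the shell strips the quotes while parsing the file.
set -a
. "$tmpdir/.env"
set +a

printf 'value: [%s]\n' "$PIP_EXTRA_INDEX_URL"
rm -r "$tmpdir"
```

Inside a recipe, the sourcing and the docker command would have to run in the same shell invocation, since make normally starts a new shell for each recipe line.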

How to avoid "command not found" in a bash parameter expansion?

I wrote the following bash script:
${MY_FLAG:=true}
${LOG_FILE:="something.log"}
I am trying to assign true to MY_FLAG and the string "something.log" to LOG_FILE. I use parameter expansions because I want to set these variables only if they were not set already.
The problem is that MY_FLAG becomes true but LOG_FILE throws an error:
script.sh: line 2: something.log: command not found
I could not find a way to assign the string as is, I tried with different options, simple quotes, and echoing it but nothing did the trick for me.
The parameters will always expand to a value, so you'll have to use them in a context where such an argument is ignored. Conveniently, : aka true does this:
: "${LOG_FILE:="something.log"}"
It only happens to work for your ${MY_FLAG:=true} because true (as discussed) is a valid command. If you run the script with MY_FLAG=date ./yourscript then you'll see that it actually runs date instead of just assigning a default.
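A small sketch of the difference, assuming the caller has already set MY_FLAG to date as in the example above. With the leading colon the expansions are evaluated only for their side effect, so nothing is executed and only the unset variable receives its default.

```shell
#!/usr/bin/env bash
unset LOG_FILE
MY_FLAG=date                     # pretend the caller ran: MY_FLAG=date ./yourscript

# ':' ignores its arguments, so the expansions only have their side effect.
: "${MY_FLAG:=true}"             # already set: left alone, and not executed
: "${LOG_FILE:=something.log}"   # unset: receives the default

printf 'MY_FLAG=%s LOG_FILE=%s\n' "$MY_FLAG" "$LOG_FILE"
```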

are there security issues with using eval on an environment variable in a bash script?

I have a Bash script in which I call rsync in order to perform a backup to a remote server. To specify that my Downloads folder be backed up, I'm passing "'${HOME}/Downloads'" as an argument to rsync which produces the output:
rsync -avu '/Volumes/Norman Data/Downloads' me@example.com:backup/
Running the command with the variable expanded as above (through the terminal or in the script) works fine, but because of the space in the expanded variable and the fact that the quotes (single ticks) are ignored when included in the variable being passed as part of an argument (see here), the only way I can get it not to choke on the space is to do:
stmt="rsync -avu '${HOME}/Downloads' me@example.com:backup/"
eval ${stmt}
It seems like there would be some vulnerabilities presented by running eval on anything not 100% private to that script. Am I correct in thinking I should be doing it a different way? If so, any hints for a bash-script-beginner would be greatly appreciated.
** EDIT ** - I actually have a somewhat more involved use case than the example above. I have an array of paths to pass, each containing spaces, which I then combine into one string, like this:
include_paths=(
"'${HOME}/dir_a'"
"'${HOME}/dir_b' --exclude=video"
)
for item in "${include_paths[@]}"
do
inc_args="${inc_args} ${item}"
done
inc_args evaluates to '/Volumes/Norman Data/me/dir_a' '/Volumes/Norman Data/me/dir_b' --exclude=video
which I then try to pass as an argument to rsync but the single ticks are read as literals and it breaks after the 1st /Volumes/Norman because of the space.
rsync -avu "${inc_args}" me@example.com:backup/
Using eval seems to read the single ticks as quotes and executes:
rsync -avu '/Volumes/Norman Data/me/dir_a' '/Volumes/Norman Data/me/dir_b' --exclude=video me@example.com:backup/
like I need it to. I can't seem to get any other way to work.
** EDIT 2 - SOLUTION **
So the 1st thing I needed to do was modify the include_paths array to:
remove single ticks from within double quoted items
move any path-specific flags (ex. --exclude) to their own items directly after the path it should apply to
I then built up an array containing the rsync command and its options, added the expanded include_paths and exclude_paths arrays and the connection string to the remote host.
And finally expanded that array, which ran my entire, properly quoted rsync command. In the end the modified array include_paths is:
include_paths=(
"${HOME}/dir_a"
"${HOME}/dir_b"
"--exclude=video"
"${HOME}/dir_c"
)
and I put everything together with:
cmd=(rsync -auvzP)
for item in "${exclude_paths[@]}"
do
cmd+=("--exclude=${item}")
done
for item in "${include_paths[@]}"
do
cmd+=("${item}")
done
cmd+=("me@example.com:backup/")
set -x
"${cmd[@]}"
Use an array for the commands/option instead of a plain variable.
stmt=(rsync -avu "${HOME}/Downloads" me@example.com:backup/)
Execute it using the builtin command
command "${stmt[@]}"
...Or I personally just put the options/arguments in an array.
options=(-avu "${HOME}/Downloads" me@example.com:backup/)
Then execute it using rsync
rsync "${options[@]}"
If you have a newer version of bash that supports the additional parameter expansion (P.E.) operators, you can also print the array in quoted form.
options=(-avu "${HOME}/Downloads" me@example.com:backup/)
Check the output by applying the P.E.
echo "${options[@]@Q}"
Should print
'-avu' '/Volumes/Norman Data/Downloads' 'me@example.com:backup/'
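The quoted form is meant for display or for input that the shell will parse again. To run the command, expand the array without the operator, since @Q would pass the quote characters to rsync literally:
rsync "${options[@]}"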
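A minimal sketch of the array approach, with the hypothetical helper count_args standing in for rsync and a hypothetical path and host. It shows that the plain array expansion already delivers the path with a space as a single argument, and what the @Q rendering looks like (this operator needs bash 4.4 or newer).

```shell
#!/usr/bin/env bash
# Hypothetical path containing a space, standing in for ${HOME}/Downloads.
src="/Volumes/Norman Data/Downloads"
options=(-avu "$src" "user@example.com:backup/")

# count_args is a hypothetical stand-in for rsync that reports how many
# arguments it received.
count_args() { echo "$#"; }

plain=$(count_args "${options[@]}")   # one argument per array element
printf 'arguments: %s\n' "$plain"

# @Q re-quotes each element so the result can be shown or parsed again by a shell.
quoted=$(echo "${options[@]@Q}")
printf 'quoted:    %s\n' "$quoted"
```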

How to achieve the variable value declared in two level

I have a Unix variable like below:
emp_tbl=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28
Now I have created another variable like below:
tablename=emp_tbl
Now I want to see the value 1,2,3,... using $($tablename) but I am getting error in it:
~$>emp_tbl=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28
~$>tablename=emp_tbl
~$>echo $($tablename)
-bash: emp_tbl: command not found
You need indirection:
echo ${!tablename}
Read the documentation on shell parameter expansion yet again (reminder to self: do thou likewise).
Your attempt using $($tablename) fails because the $(...) notation is command substitution, and the value in $tablename is interpreted as the command name and the command with the name emp_tbl could not be found.
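A short sketch of the indirection, with a shortened value for emp_tbl. The nameref variant is an addition that is not in the answer above and requires bash 4.3 or newer; the variable names direct, ref and via_ref are made up for the illustration.

```shell
#!/usr/bin/env bash
emp_tbl=1,2,3,4,5
tablename=emp_tbl

# ${!name} expands to the value of the variable whose name is stored in name.
direct=${!tablename}

# A nameref is an alternative: ref behaves as an alias of emp_tbl.
declare -n ref=$tablename
via_ref=$ref

printf '%s\n%s\n' "$direct" "$via_ref"
```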

whitespace character in case of parameter substitution

I want to pass a filter statement with in my pig script using parameter substitution
For that I have tried
exec -param flt='a1==1 AND a2=2' filterscript.pig
But sadly it is throwing an exception message
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 101: Local file 'AND' does not exist.
Pig version - 0.9.2
I have tried flt='\'a1==1 AND a2=2\'' and flt="a1==1 AND a2==2" suggested by pig users in apache forum as well as seen a similar post in SO.
Any help will be appreciated
I think you are using the parameter exactly as passed as the condition. If so, you will get an error like this. Instead, you can pass the values as separate parameters and form the condition string inside the pig script.
exec -p p1=1 -p p2=2 filterscript.pig
Inside your filterscript.pig script you can use these parameter values in condition clauses. For example
a1==$p1 AND a2==$p2
If you run your script outside the grunt shell you can do the following:
pig -param flt="a1\=\=1 AND a2\=\=2" -f filterscript.pig
where filterscript.pig is something like this:
A = load ...
...
B = filter A by $flt;
...
Note that the '=' is also escaped, otherwise the filter condition won't be evaluated to a boolean.
If you want to use the filter substitution within the grunt shell as you tried with exec, then you'll encounter the whitespace problem. Since escaping the whitespace character doesn't work, as a workaround you can create a parameter file:
cat params.txt
flt="a1\=\=1 AND a2\=\=2"
Then issue:
exec -param_file params.txt filterscript.pig
Note: I use Pig 0.12
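If the parameter file is created from a shell script, a quoted heredoc keeps the backslashes and double quotes exactly as written. This is only a sketch of writing the file; the temporary file name is an assumption and the Pig side is unchanged from the answer above.

```shell
#!/usr/bin/env bash
# Write the parameter file from the answer. The quoted 'EOF' delimiter stops the
# shell from touching the backslashes and double quotes inside the heredoc.
params_file=$(mktemp)
cat > "$params_file" <<'EOF'
flt="a1\=\=1 AND a2\=\=2"
EOF

contents=$(cat "$params_file")
printf '%s\n' "$contents"
# In grunt, this file would then be passed with -param_file as shown above.
rm "$params_file"
```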
