Control wildcard expansion in sh command - bash

I have a program which executes shell functions using a call like sh -c "<command_string>". I cannot change the way these functions are called.
In this program I call various self-written helper shell functions, which are sourced into my environment. One of them looks like this; it unzips files matching a given file pattern into a given directory.
function dwhUnzipFiles() {
declare OPTIND=1
while getopts "P:F:T:" opt; do
case "$opt" in
P) declare FILEPATTERN="$OPTARG" ;;
F) declare FROMDIR="$OPTARG" ;;
T) declare TODIR="$OPTARG" ;;
*) echo "Unknown option | Usage: dwhUnzipFiles -P <filepattern> -F <fromdir> -T <todir>"
esac
done
shift $((OPTIND-1))
for currentfile in "${FROMDIR}"/"${FILEPATTERN}" ; do
unzip -o "$currentfile" -d "${TODIR}";
done
# error handling
# some more stuff
return $?
}
For this function I use arguments with wildcards for the FILEPATTERN variable. The function gets called by my program like this:
sh -c ". ~/dwh_env.sh && dwhUnzipFiles -P ${DWH_FILEPATTERN_MJF_WLTO}.xml.zip -F ${DWH_DIR_SRC_XML_CURR} -T ${DWH_DIR_SRC_XML_CURR}/workDir"
where ${DWH_FILEPATTERN_MJF_WLTO} contains wildcards.
This works as intended. My confusion starts with another helper function, which is constructed in a similar way, but I'm not able to control the wildcard expansion correctly. It just deletes files in a directory depending on a given file pattern.
function dwhDeleteFiles() {
declare retFlag=0
declare OPTIND=1
while getopts "D:P:" opt; do
case "$opt" in
D) declare DIRECTORY="$OPTARG" ;;
P) declare FILEPATTERN="$OPTARG" ;;
*) echo "Unknown option | Usage: dwhDeleteFiles -D <Directory> -P <Filepattern>"
esac
done
shift $((OPTIND-1))
for currentfile in "${DIRECTORY}"/"${FILEPATTERN}" ; do
rm -fv "${currentfile}";
done
# error handling
# some more stuff
return $retFlag
}
This function is called like this:
sh -c ". ~/dwh_env.sh && dwhDeleteFiles -P ${DWH_FILEPATTERN_MJF_WLTO}.xml -D ${DWH_DIR_SRC_XML_CURR}/workDir"
where again ${DWH_FILEPATTERN_MJF_WLTO} contains wildcards. When my program calls this function, nothing happens. I tried adding "" and \"\" around the arguments of my function, but the only effect is that, instead of deleting all files in the given directory, the function deletes only the first one in alphanumerical order.
Can somebody explain what is happening here? My guess is that passing the variable containing the wildcard through several layers does not work. But how do I fix this, and is it even possible in bash? And why does the dwhUnzipFiles function work while dwhDeleteFiles does not?

Suppose that DWH_FILEPATTERN_MJF_WLTO is * and the current directory contains a bunch of .xml files; then the command is
dwhDeleteFiles -P *.xml -D ${DWH_DIR_SRC_XML_CURR}/workDir
which expands to
dwhDeleteFiles -P bar.xml baz.xml foo.xml zap.xml -D ${DWH_DIR_SRC_XML_CURR}/workDir
(Note the alphabetical order of the .xml files.) But the -P option takes only one argument, bar.xml (the first); the remaining names are treated as non-option arguments, and because getopts stops at the first of them, -D is never parsed.
Try adding set -x to your script to see this in action.
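The effect described above can be reproduced in isolation. The following is only a sketch: show_args is a hypothetical stand-in for dwhDeleteFiles that prints what getopts sees instead of deleting anything, and the file names are made up for the demo.

```shell
#!/bin/bash
# show_args is a hypothetical stand-in for dwhDeleteFiles: it only reports
# what getopts parsed and what was left over.
show_args() {
    local OPTIND=1 opt pattern="" directory=""
    while getopts "D:P:" opt; do
        case "$opt" in
            D) directory=$OPTARG ;;
            P) pattern=$OPTARG ;;
        esac
    done
    shift $((OPTIND - 1))
    printf 'pattern=%s directory=%s leftover=%s\n' "$pattern" "$directory" "$*"
}

# Scratch directory with three matching files (names are assumptions).
demo=$(mktemp -d)
touch "$demo/bar.xml" "$demo/baz.xml" "$demo/foo.xml"
cd "$demo" || exit 1

# Unquoted: the calling shell expands *.xml before show_args ever runs.
unquoted=$(show_args -P *.xml -D workDir)
# Quoted: the literal pattern reaches the function as a single argument.
quoted=$(show_args -P '*.xml' -D workDir)

echo "$unquoted"
echo "$quoted"

cd / && rm -rf "$demo"
```

In the unquoted call only bar.xml ends up in the pattern and -D is left unparsed; in the quoted call the pattern arrives intact. For the sh -c string this would presumably mean single-quoting the pattern inside the double-quoted command, and leaving ${FILEPATTERN} unquoted in the for loop so that the function itself performs the expansion.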

Related

Questions about bash

Firstly, I'm wondering how to pass information from the terminal into a variable in the script file. For example, let's say I wanted to run ./name.sh dave in the terminal instead of using read -p to ask for the name in the script. Secondly, I'm wondering how to go about creating a new directory and then copying files into that directory. I know how to use the mkdir command, but not how to copy files to that new directory.
Sorry if my wording is a bit bad; I wasn't sure how else to ask the questions (this is my first day messing with bash).
When you run:
./name.sh dave
the string dave will be the first positional argument in the script. You can access it with $1. To create a directory named dave and copy files into it, you might do:
#!/bin/bash
dir=${1:?}
mkdir "$dir" || exit
cp * "$dir"
A few things are a bit cryptic, and perhaps you might prefer:
#!/bin/sh
if test -z "$1"; then
echo "Parameter missing" >&2;
exit 1
fi
mkdir "$1" && cp * "$1"
Basically, you access the parameters via $1, $2, etc. The ${1:?} expansion is a shortcut: it aborts the script with an error if $1 is unset or empty (e.g., if you call the script without an argument); otherwise its value is assigned to the variable dir.
The rest seems pretty self-explanatory.
Suppose you wanted to specify the files to copy, so that ./name.sh dave would create a directory named dave and copy all files in the current directory to it (as above), but if you pass more arguments it would copy only those files. In that case, you might do something like:
#!/bin/bash
dir=${1:?}
shift # Discard the first argument, shift remaining down
mkdir "$dir" || exit
case $# in
0) cp * "$dir";;
*) cp "$@" "$dir";;
esac
Here, "$@" is the list of each argument, individually quoted. (E.g., if you call the script with an argument that has spaces, it will properly pass that argument to cp. Compare that with cp $@ $dir or cp "$*" $dir.) If you're just starting with shell scripts, I would advise you to always be careful about quotes.
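To make that comparison concrete, here is a small sketch. count_args is a hypothetical helper that just reports how many arguments it received, and set -- simulates calling a script with two arguments, one of which contains a space.

```shell
#!/bin/bash
# count_args is a hypothetical helper: it prints the number of arguments it got.
count_args() { echo "$#"; }

# Simulate script arguments; the first one contains a space.
set -- "my file.txt" "notes.txt"

quoted_at=$(count_args "$@")     # each argument is preserved as-is
unquoted_at=$(count_args $@)     # arguments are re-split on whitespace
quoted_star=$(count_args "$*")   # everything is joined into a single word

printf '%s %s %s\n' "$quoted_at" "$unquoted_at" "$quoted_star"
```

With the default IFS this prints 2 3 1: only "$@" hands cp the same two file names the script was given.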

How can I use getopts in a script that appends lines from files in a separate directory to a new file?

I am trying to write a bash script that takes in a directory, reads each file in the directory, and then appends the first line of each file in that directory to a new file. When I hard-code the variables in my script, it works fine.
This works:
#!/bin/bash
rm /local/SomePath/multigene.firstline.btab
touch /local/SomePath/multigene.firstline.btab
btabdir=/local/SomePath/test/*
outfile=/local/SomePath/multigene.firstline.btab
for f in $btabdir
do
head -1 $f >> $outfile
done
This does not work:
#!/bin/bash
while getopts ":d:o:" opt; do
case ${opt} in
d) btabdir=$OPTARG;;
o) outfile=$OPTARG;;
esac
done
rm $outfile
touch $outfile
for f in $btabdir
do
head -1 $f >> $outfile
done
Here is how I call the script:
bash /local/SomePath/Scripts/btab.besthits.wBp-q_wBm-r.sh -d /local/SomePath/test/* -o /local/SomePath/out.test/multigene.firstline.btab
And here is what I get when I run it:
rm: missing operand
Try 'rm --help' for more information.
touch: missing file operand
Try 'touch --help' for more information.
/local/SomePath/Scripts/btab.besthits.wBp-q_wBm-r.sh: line 23: $outfile: ambiguous redirect
Any suggestions? I'd like to be able to use getopts so I can make the script more generic. Thanks!
You have to pay extra attention to quoting and globbing when writing bash scripts.
When you call the script with a glob (* here), it gets expanded and split into words by your shell. This happens before your script even starts executing.
If you run cat *.txt, for example, cat receives all the .txt files in the directory as its arguments. It is the same as calling cat afile.txt nextfile.txt (and so on). cat never sees the asterisk.
In your script this means that the input -d /local/SomePath/test/* gets expanded to something like /local/SomePath/test/someFile /local/SomePath/test/someOtherFile /local/SomePath/test/someThirdFile.
Subsequently, getopts takes only the first file after -d as $btabdir, and the -o never gets handled in the case statement.
I suggest you start by quoting every variable, preferably in the "${name}" style, and only invoke the script with quoted input.
It might also be better to pass in a directory path, test that it is a directory (test -d), and change your for loop to for f in "${btabdir}"/*
This also works:
head -n1 -q /local/SomePath/test/* >> /local/SomePath/out.test/multigene.firstline.btab
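The suggestions above (pass a directory, check it with test -d, quote every variable, glob inside the loop) could be put together roughly as follows. This is a sketch only: first_lines is a hypothetical function name, and the scratch files exist purely for the demonstration.

```shell
#!/bin/bash
# first_lines is a hypothetical function: it writes the first line of every
# regular file in directory $1 to output file $2.
first_lines() {
    local btabdir=$1 outfile=$2 f
    if [ ! -d "$btabdir" ]; then
        echo "not a directory: $btabdir" >&2
        return 1
    fi
    : > "$outfile"                 # create or truncate the output file
    for f in "${btabdir}"/*; do    # the glob is expanded here, inside the script
        if [ -f "$f" ]; then
            head -n 1 "$f" >> "$outfile"
        fi
    done
}

# Demo with names containing spaces (made up for this example).
demo=$(mktemp -d)
mkdir "$demo/in dir"
printf 'alpha\nx\n' > "$demo/in dir/a file.btab"
printf 'beta\ny\n'  > "$demo/in dir/b.btab"

first_lines "$demo/in dir" "$demo/out.txt"
result=$(cat "$demo/out.txt")
echo "$result"

rm -rf "$demo"
```

Because the caller passes a plain directory, nothing is expanded before the script runs, and the quoted expansions keep names with spaces intact.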
I think the right answer here is "don't do it that way." :-)
The reason your current script isn't working may be that the wildcard is expanded by your interactive shell, not by your script. Try running your command with an echo at the beginning of the line for a hint at what's really happening. Once getopts sees the second of the matched files in the glob, it stops processing options, so -o never gets read, and $outfile remains unset. And since you don't quote your variable in rm $outfile, it's as if you're running rm with no arguments at all. Test the difference in your shell between rm alone and rm "".
Also, what happens to your for loop if there's a space in a filename? Since you have bash, you have arrays. And arrays are much better for processing lists of files.
Perhaps use something like this instead:
#!/bin/bash
# initialize an array
files=()
while getopts :d:o: opt; do
case "$opt" in
d)
if [[ ! -d "$OPTARG" ]]; then
printf 'ERROR: not a directory: %s\n' "$OPTARG" >&2
exit 65
fi
# add to the array
files+=( "$OPTARG"/* )
;;
o) outfile="$OPTARG" ;;
*)
printf 'ERROR: invalid option or missing argument: -%s\n' "$OPTARG" >&2
exit 64
;;
esac
done
if ! { rm -f "$outfile" && touch "$outfile"; }; then
printf 'ERROR: cannot create %s\n' "$outfile" >&2
exit 73
fi
for f in "${files[@]}"; do
read -r < "$f"
printf '%s\n' "$REPLY"
done > "$outfile"
Here are some highlights of the changes....
We're using arrays, of course. The array ${files[@]} will contain one-file-per-record, without relying on whitespace, so with proper quoting you'll avoid problems with special characters in filenames.
We test for more error conditions, and actually show errors and exit if we see them. (The exit values are sysexits.)
Instead of using head, we use read and a single redirect to $outfile. This saves multiple forks to an external program, and multiple fopen() calls to your output file.
Note that the argument to -d should be a directory, not a glob. And you can specify options multiple times. Multiple -d options will be added together, but only the last -o option will be used.

How to write an alias for "two" words [duplicate]

The standard usage of an alias is to write a shortcut for an expanded command, for example: alias ls='ls --color'.
I want to know if it's possible to have "parameters" on the left side, so that it works the other way around. Using the above example, I'm interested in knowing whether alias ls --color='ls' is possible, that is, when someone types ls --color, the plain ls is run.
Forget about whether or not that's useful or make sense, I just want to know if it's possible, or if there is any workaround to achieve the same goal.
The existing answer doesn't correctly handle commands with spaces -- and indeed cannot: Condensing an array into a string is inherently buggy.
This version works with the list of arguments as an array, and thus avoids this loss of information:
ls() {
local -a args=( )
for arg; do
[[ $arg = --color ]] || args+=( "$arg" )
done
command ls "${args[@]}"
}
Alternately, if your real goal is to alias a subcommand (and you might want to process more subcommands in the future), consider a case structure, as the following:
ls() {
local subcommand
if (( "$#" == 0 )); then command ls; return; fi
subcommand=$1; shift
case $subcommand in
--color) command ls "$@" ;;
*) command ls "$subcommand" "$@" ;;
esac
}
Some tests, to distinguish correctness between this answer and the preexisting one:
dir=/tmp/ls-alias-test
mkdir -p "$dir"/'hello world' "$dir"/my--color--test
# with the alternate answer, this fails because it tries to run:
# ls /tmp/ls-alias-test/hello world
# (without the quotes preserved)
ls --color "$dir/hello world"
# with the alternate answer, this fails because it tries to run:
# ls /tmp/ls-alias-test/my--test
ls --color "$dir/my--color--test"
Use a function after unalias ls:
unalias ls
ls () { p=$@; p=${p//--color/}; /bin/ls $p ;}
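The difference between the string-based function and the array-based one can be checked without touching ls at all. In this sketch, count_args, strip_flat and strip_array are hypothetical names; each strips --color and then reports how many arguments would have been handed to ls.

```shell
#!/bin/bash
# count_args is a hypothetical helper: it prints the number of arguments it got.
count_args() { echo "$#"; }

# String-based approach: the arguments are flattened into one string,
# then re-split on whitespace when $p is expanded unquoted.
strip_flat() {
    local p=$*
    p=${p//--color/}
    count_args $p
}

# Array-based approach: every argument stays a separate array element.
strip_array() {
    local -a args=()
    local arg
    for arg; do
        [[ $arg = --color ]] || args+=( "$arg" )
    done
    count_args "${args[@]}"
}

flat=$(strip_flat --color "/tmp/hello world")
array=$(strip_array --color "/tmp/hello world")
printf 'flat=%s array=%s\n' "$flat" "$array"
```

The flattened version reports two arguments for the single path /tmp/hello world, while the array version reports one, which is the loss of information the array-based answer is describing.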

Pass commandline args into another script

I have couple of scripts which call into each other. However when I pass
Snippet from build-and-run-node.sh
OPTIND=1 # Reset getopts in case it was changed in a previous run
while getopts "hn:c:f:s:" opt; do
case "$opt" in
h)
usage
exit 1
;;
n)
container_name=$OPTARG
;;
c)
test_command=$OPTARG
;;
s)
src=$OPTARG
;;
*)
usage
exit 1
;;
esac
done
$DIR/build-and-run.sh -n $container_name -c $test_command -s $src -f $DIR/../dockerfiles/dockerfile_node
Snippet from build-and-run.sh
OPTIND=1 # Reset getopts in case it was changed in a previous run
while getopts "hn:c:f:s:" opt; do
case "$opt" in
h)
usage
exit 1
;;
n)
container_name=$OPTARG
;;
c)
test_command=$OPTARG
;;
f)
dockerfile=$OPTARG
;;
s)
src=$OPTARG
;;
*)
usage
exit 1
;;
esac
done
I am calling it as such
build-and-run-node.sh -n test-page-helper -s ./ -c 'scripts/npm-publish.sh -r test/test-helpers.git -b patch'
with the intention that npm-publish.sh should run with the -r and -b parameters. However when I run the script I get
build-and-run.sh: illegal option -- r
which obviously means it is the build-and-run command that is consuming the -r. How do I avoid this?
You need double quotes around $test_command in build-and-run-node.sh; otherwise that variable is split on white space, and its pieces appear to be arguments for build-and-run.sh. Like this:
$DIR/build-and-run.sh -n $container_name -c "$test_command" -s $src -f $DIR/../dockerfiles/dockerfile_node
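A small sketch of why the quotes matter: inner is a hypothetical stand-in for build-and-run.sh that records the value it received for -c and any option letters it did not recognise, and the value of test_command is shortened for the demo.

```shell
#!/bin/bash
# inner is a hypothetical stand-in for build-and-run.sh. The leading ':' in
# the optstring makes getopts report unknown options silently via OPTARG.
inner() {
    local OPTIND=1 opt test_command="" unknown=""
    while getopts ":n:c:" opt; do
        case "$opt" in
            n) : ;;
            c) test_command=$OPTARG ;;
            \?) unknown+=$OPTARG ;;
        esac
    done
    printf 'command=[%s] unknown=[%s]\n' "$test_command" "$unknown"
}

test_command='scripts/npm-publish.sh -r repo.git -b patch'

unquoted=$(inner -c $test_command)     # split into several words
quoted=$(inner -c "$test_command")     # passed as one argument

echo "$unquoted"
echo "$quoted"
```

Without the quotes, -c receives only scripts/npm-publish.sh and -r is treated as an unknown option of the inner script; with the quotes, the whole command string arrives as the argument to -c.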
Further Info
As the comment below rightly points out, it's good practice to quote all variables in Bash, unless you know you want them off (for example, to enable shell globbing). It's also helpful, at least in cases where the variable name is part of a larger word, to use curly braces to delineate the variable name. This is to prevent later characters from being treated as part of the variable name if they're legal. So a better command call might look like:
"${DIR}/build-and-run.sh" -n "$container_name" -c "$test_command" -s "$src" -f "${DIR}/../dockerfiles/dockerfile_node"

Cleanly passing options between coupled shell scripts

I have a (bash) shell script myBaseScript.sh, with the following signature:
myBaseScript.sh [OPTIONS] FILE1 [FILE2 ...]
This script myBaseScript.sh parses the options via getopts, like so:
while getopts ":hi:m:s:t:v:" opt; do
#...
done
shift $(($OPTIND-1))
FILES=("$@")
Now I wish to create a script mySuperScript.sh, which calls myBaseScript.sh, allowing options to be forwarded to the base script:
mySuperScript.sh [OPTIONS] DIR1 [DIR2 ...]
where [OPTIONS] are the options used in myBaseScript.sh; they are not used in mySuperScript.sh.
The script mySuperScript.sh is to crawl through all directories DIR1, [DIR2], etc., compiling a list of valid files ${FILES[@]}, to be passed on to myBaseScript.sh.
Right now I have this in mySuperScript.sh:
myBaseScript.sh "$@" "${FILES[@]}"
where ${FILES[@]} is a list of files in the current path. In other words, this allows for
mySuperScript.sh [OPTIONS]
but not for any optional DIR1, [DIR2], etc.
I could use
allOpts=("$@")
while getopts ":hi:m:s:t:v:" opt; do echo. > /dev/null; done
shift $(($OPTIND-1))
DIRS=("$@")
and then pass the first $OPTIND entries of $allOpts as [OPTIONS] to myBaseScript.sh. But that seems excessively complicated, and moreover, it necessitates duplicating the list of valid options to myBaseScript.sh, which is a managerial nightmare.
I could also create a new script, dedicated to parsing these options. But that too seems a bit awkward, as it creates a dependency in myBaseScript.sh which was not there before...
So, what is the cleanest way to do this?
I don't know your entire use case, but if your goal is simply to support both files and directories, then you can just use a single script which checks whether an argument is a directory and crawls it.
For example:
#!/bin/bash
crawl_dir() {
dir="$1"
find "$dir" -type f
}
# parse options
while getopts ":hi:m:s:t:v:" opt; do
echo ...
done
shift $(($OPTIND-1))
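args=( "$@" )
# parse arguments
files=()
for arg in "${args[@]}"
do
if [[ -d "$arg" ]]
then
# note: the unquoted $(...) is word-split, so file names containing whitespace will break
files+=( $(crawl_dir "$arg") )
else
# append as a new array element (files+="$arg" would append to the first element)
files+=( "$arg" )
fi
done
# now you have an array of files to process
process "${files[@]}"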
If you call mySuperScript.sh with an additional marker between the [OPTIONS] and the list of directories:
mySuperScript.sh [OPTIONS] -- DIR1 [DIR2 ...]
Then you can iterate over the arguments and collect the options into a variable. Something like:
while [ $# -gt 0 ] && [ "$1" != "--" ]; do
OPTIONS="$OPTIONS $1"
shift
done
shift
Then you can later on call:
myBaseScript.sh $OPTIONS $some_list_of_files
This works fine as long as you're not trying to handle optional arguments containing spaces.
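If option arguments with spaces do matter, the same loop can collect into a bash array instead of a string. The following is only a sketch under that assumption; the option names and values are made up, and set -- simulates the command line of mySuperScript.sh.

```shell
#!/bin/bash
# Simulated command line: options, the -- marker, then directories.
set -- -i "my input" -v -- dir1 "dir two"

# Collect everything before -- as separate array elements.
options=()
while [ $# -gt 0 ] && [ "$1" != "--" ]; do
    options+=( "$1" )
    shift
done

# Drop the -- marker itself, if it was present.
if [ $# -gt 0 ]; then
    shift
fi

# Whatever remains is the list of directories.
dirs=( "$@" )

printf 'options: %d, dirs: %d\n' "${#options[@]}" "${#dirs[@]}"
```

The base script could then be called as myBaseScript.sh "${options[@]}" followed by the collected file list, with each option value reaching it as a single argument.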
On a more practical note, this is generally when I start looking at Python.
