Cleanly passing options between coupled shell scripts - bash

I have a (bash) shell script myBaseScript.sh, with the following signature:
myBaseScript.sh [OPTIONS] FILE1 [FILE2 ...]
This script myBaseScript.sh parses the options via getopts, like so:
while getopts ":hi:m:s:t:v:" opt; do
#...
done
shift $(($OPTIND-1))
FILES=("$#")
Now I wish to create a script mySuperScript.sh, which calls myBaseScript.sh, allowing options to be forwarded to the base script:
mySuperScript.sh [OPTIONS] DIR1 [DIR2 ...]
where [OPTIONS] are the options used in myBaseScript.sh; they are not used in mySuperScript.sh.
The script mySuperScript.sh is to crawl through all directories DIR1, [DIR2], etc., compiling a list of valid files ${FILES[@]}, to be passed on to myBaseScript.sh.
Right now I have this in mySuperScript.sh:
myBaseScript.sh "$#" "${FILES[#]}"
where ${FILES[#]} is a list of files in the current path. In other words, this allows for
mySuperScript.sh [OPTIONS]
but not for any optional DIR1, [DIR2], etc.
I could use
allOpts=("$#")
while getopts ":hi:m:s:t:v:" opt; do echo. > /dev/null; done
shift $(($OPTIND-1))
DIRS=("$#")
and then pass the first $OPTIND entries of ${allOpts[@]} as [OPTIONS] to myBaseScript.sh. But that seems excessively complicated, and moreover, it necessitates duplicating myBaseScript.sh's list of valid options, which is a maintenance nightmare.
I could also create a new script, dedicated to parsing these options. But that too seems a bit awkward, as it creates a dependency in myBaseScript.sh which was not there before...
So, what is the cleanest way to do this?

I don't know your entire use case, but if your goal is simply to support both files and directories, then you can just use a single script that checks whether an argument is a directory and crawls it.
For example:
#!/bin/bash
crawl_dir() {
dir="$1"
find "$dir" -type f
}
# parse options
while getopts ":hi:m:s:t:v:" opt; do
echo ...
done
shift $(($OPTIND-1))
args=( "$#" )
# parse arguments
files=()
for arg in "${args[@]}"
do
if [[ -d "$arg" ]]
then
files+=( $(crawl_dir "$arg") )  # word-splits on whitespace; assumes filenames without spaces
else
files+=( "$arg" )
fi
done
# now you have an array of files to process
process "${files[#]}"

If you call mySuperScript.sh with an additional marker between the [OPTIONS] and the list of directories:
mySuperScript.sh [OPTIONS] -- DIR1 [DIR2 ...]
Then you can iterate over the arguments and collect the options into a variable. Something like:
while [ "$1" != "--" ]; do
OPTIONS="$OPTIONS $1"
shift
done
shift
Then you can later on call:
myBaseScript.sh $OPTIONS $some_list_of_files
This works fine as long as you're not trying to handle optional arguments containing spaces.
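If you do need options with spaces to survive, a bash array preserves them. A minimal sketch of the same idea (bash-specific, and assuming FILES is the array of files your script builds, as in the question):

OPTIONS=()
# everything before "--" is an option; keep each one as a separate array element
while [ "$#" -gt 0 ] && [ "$1" != "--" ]; do
    OPTIONS+=( "$1" )
    shift
done
shift   # drop the "--" itself

# each option and file is forwarded as a single word, spaces intact
myBaseScript.sh "${OPTIONS[@]}" "${FILES[@]}"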
On a more practical note, this is generally when I start looking at Python.

Related

How can I use getopts in a script that appends lines from files in a separate directory to a new file?

I am trying to write a bash script that takes in a directory, reads each file in the directory, and then appends the first line of each file in that directory to a new file. When I hard-code the variables in my script, it works fine.
This works:
#!/bin/bash
rm /local/SomePath/multigene.firstline.btab
touch /local/SomePath/multigene.firstline.btab
btabdir=/local/SomePath/test/*
outfile=/local/SomePath/multigene.firstline.btab
for f in $btabdir
do
head -1 $f >> $outfile
done
This does not work:
#!/bin/bash
while getopts ":d:o:" opt; do
case ${opt} in
d) btabdir=$OPTARG;;
o) outfile=$OPTARG;;
esac
done
rm $outfile
touch $outfile
for f in $btabdir
do
head -1 $f >> $outfile
done
Here is how I call the script:
bash /local/SomePath/Scripts/btab.besthits.wBp-q_wBm-r.sh -d /local/SomePath/test/* -o /local/SomePath/out.test/multigene.firstline.btab
And here is what I get when I run it:
rm: missing operand
Try 'rm --help' for more information.
touch: missing file operand
Try 'touch --help' for more information.
/local/SomePath/Scripts/btab.besthits.wBp-q_wBm-r.sh: line 23: $outfile: ambiguous redirect
Any suggestions? I'd like to be able to use getopts so I can make the script more generic. Thanks!
You have to pay extra attention to quoting and globbing when writing bash scripts.
When you call the script with a glob (* here) it gets expanded and split into words by your shell. This happens before your script even gets executed.
If you for example do cat *.txt, cat will get all .txt files in the directory as its arguments. It will be the same as calling cat afile.txt nextfile.txt (and so on). cat will never see the asterisk.
In your script it means that the input -d /local/SomePath/test/* gets expanded to something like /local/SomePath/test/someFile /local/SomePath/test/someOtherFile /local/SomePath/test/someThirdFile.
Subsequently getopts takes only the first file after -d as $btabdir, and the -o never gets handled in the case switch.
I suggest you start by quoting every variable, preferably in the "${name}" style, and only invoke the script with quoted input.
It might also be better to send in a directory path, test that it is a directory (test -d), and change your for loop to for f in "${btabdir}"/*
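For example, a minimal sketch combining those suggestions (pass the directory itself, not a glob, and let the script expand it):

#!/bin/bash
while getopts ":d:o:" opt; do
    case "${opt}" in
        d) btabdir="${OPTARG}" ;;
        o) outfile="${OPTARG}" ;;
    esac
done

# fail early if -d did not name a directory
if [ ! -d "${btabdir}" ]; then
    echo "Not a directory: ${btabdir}" >&2
    exit 1
fi

# the glob expands here, inside the script
for f in "${btabdir}"/*; do
    head -1 "$f"
done > "${outfile}"

invoked as:

bash btab.besthits.wBp-q_wBm-r.sh -d /local/SomePath/test -o /local/SomePath/out.test/multigene.firstline.btab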
This also works:
head -n1 -q /local/SomePath/test/* >> /local/SomePath/out.test/multigene.firstline.btab
I think the right answer here is "don't do it that way." :-)
The reason your current script isn't working may be that the wildcard is expanded by your interactive shell, not by your script. Try running your command with an echo at the beginning of the line for a hint at what's really happening. Once getopts sees the second of the files matched by the glob, it stops processing options, so -o never gets read and $outfile remains unset. And since you don't quote your variable in rm $outfile, it's as if you're running rm with no arguments at all. Test the difference in your shell between rm alone and rm "".
Also, what happens to your for loop if there's a space in a filename? Since you have bash, you have arrays. And arrays are much better for processing lists of files.
Perhaps use something like this instead:
#!/bin/bash
# initialize an array
files=()
while getopts :d:o: opt; do
case "$opt" in
d)
if [[ ! -d "$OPTARG" ]]; then
printf 'ERROR: not a directory: %s\n' "$OPTARG" >&2
exit 65
fi
# add to the array
files+=( "$OPTARG"/* )
;;
o) outfile="$OPTARG" ;;
*)
printf 'ERROR: unknown option: %s\n' "$opt" >&2
exit 64
;;
esac
done
if ! rm -f "$outfile" || ! touch "$outfile"; then
printf 'ERROR: cannot create %s\n' "$outfile" >&2
exit 73
fi
for f in "${files[#]}"; do
read -r < "$f"
printf '%s\n' "$REPLY"
done > "$outfile"
Here are some highlights of the changes....
We're using arrays, of course. The array ${files[@]} will contain one file per record, without relying on whitespace, so with proper quoting you'll avoid problems with special characters in filenames.
We test for more error conditions, and actually show errors and exit if we see them. (The exit values are sysexits.)
Instead of using head, we use read and a single redirect to $outfile. This saves multiple forks to an external program, and multiple fopen() calls to your output file.
Note that the argument to -d should be a directory, not a glob. And you can specify options multiple times. Multiple -d options will be added together, but only the last -o option will be used.
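For example, a hypothetical invocation collecting first lines from two directories (the second directory is made up for illustration):

bash btab.besthits.wBp-q_wBm-r.sh \
    -d /local/SomePath/test \
    -d /local/SomePath/test2 \
    -o /local/SomePath/out.test/multigene.firstline.btab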

Test -d directory true - subdirectory false (POSIX)

I'm trying to print all directories/subdirectories from a given start directory.
for i in $(ls -A -R -p); do
if [ -d "$i" ]; then
printf "%s/%s \n" "$PWD" "$i"
fi
done;
This script returns all of the directories found in the . directory and all of the files in that directory, but for some reason the test fails for subdirectories. All of the directories end up in $i and the output looks exactly the same.
Let's say I have the following structure:
foo/bar/test
echo $i prints
foo/
bar/
test/
While the contents of the folders are listed like this:
./foo:
file1
file2
./bar:
file1
file2
However the test statement just prints:
PWD/TO/THIS/DIRECTORY/foo
For some reason it returns true for the first level directories, but false for all of the subdirectories.
(ls is probably not a good way of doing this and I would be glad for a find statement that solves all of my issues, but first I want to know why this script doesn't work the way you'd think.)
As pointed out in the comments, the issue is that the directory names include a :, so -d is false.
I guess that this command gives you the output you want (although it requires Bash):
# enable globstar for **
# globstar is disabled by default in non-interactive shells (e.g. in a script)
shopt -s globstar
# print each path ending in a / (all directories)
# ** expands recursively
printf '%s\n' **/*/
The standard way would be either to do the recursion yourself, or to use find:
find . -type d
Consider your output:
dir1:
dir1a
Now, the following will be true:
[ -d dir1/dir1a ]
but that's not what your code does; instead, it runs:
[ -d dir1a ]
To avoid this, don't attempt to parse ls; if you want to implement recursion in baseline POSIX sh, do it yourself:
callForEachEntry() {
# because calling this without any command provided would try to execute all found files
# as commands, checking for safe/correct invocation is essential.
if [ "$#" -lt 2 ]; then
echo "Usage: callForEachEntry starting-directory command-name [arg1 arg2...]" >&2
echo " ...calls command-name once for each file recursively found" >&2
return 1
fi
# try to declare variables local, swallow/hide error messages if this fails; code is
# defensively written to avoid breaking if recursing changes either, but may be faulty if
# the command passed as an argument modifies "dir" or "entry" variables.
local dir entry 2>/dev/null || : "local is not strict POSIX, but available in dash"
dir=$1; shift
for entry in "$dir"/*; do
# skip if the glob matched nothing
[ -e "$entry" ] || [ -L "$entry" ] || continue
# invoke user-provided callback for the entry we found
"$#" "$entry"
# recurse last for if on a baseline platform where the "local" above failed.
if [ -d "$entry" ]; then
callForEachEntry "$entry" "$#"
fi
done
}
# call printf '%s\n' for each file we recursively find; replace this with the code you
# actually want to call, wrapped in a function if appropriate.
callForEachEntry "$PWD" printf '%s\n'
find can also be used safely, but not as a drop-in replacement for the way ls was used in the original code -- for dir in $(find . -type d) is just as buggy. Instead, see the "Complex Actions" and "Actions In Bulk" sections of Using Find.
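For instance, a sketch of the bulk form, which never word-splits find's output (swap printf for the command you actually want to run on each directory):

# -exec ... {} + hands the found directories to the command as real
# arguments, so names with spaces or newlines stay intact
find . -type d -exec printf '%s\n' {} +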

Control wildcard expansion in sh command

I have a program which executes shell functions via calls like sh -c "<command_string>". I cannot change the way these functions are called.
From this program I call various self-written helper shell functions, which are sourced into my environment. One of them looks like this; it unzips files matching a given file pattern into a given directory.
function dwhUnzipFiles() {
declare OPTIND=1
while getopts "P:F:T:" opt; do
case "$opt" in
P) declare FILEPATTERN="$OPTARG" ;;
F) declare FROMDIR="$OPTARG" ;;
T) declare TODIR="$OPTARG" ;;
*) echo "Unbekannte Option | Usage: dwhUnzipFiles -P <filepattern> -F <fromdir> -T <todir>"
esac
done
shift $((OPTIND-1))
for currentfile in "${FROMDIR}"/"${FILEPATTERN}" ; do
unzip -o "$currentfile" -d "${TODIR}";
done
# error handling
# some more stuff
return $?
}
For this function I use arguments with wildcards for the FILEPATTERN variable. The function gets called by my program like this:
sh -c ". ~/dwh_env.sh && dwhUnzipFiles -P ${DWH_FILEPATTERN_MJF_WLTO}.xml.zip -F ${DWH_DIR_SRC_XML_CURR} -T ${DWH_DIR_SRC_XML_CURR}/workDir" where ${DWH_FILEPATTERN_MJF_WLTO} contains wildcards.
This works as intended. My confusion starts with another helper function, which is constructed in a similar way, but I'm not able to control the wildcard expansion correctly. It just deletes files in a directory depending on a given file pattern.
function dwhDeleteFiles() {
declare retFlag=0
declare OPTIND=1
while getopts "D:P:" opt; do
case "$opt" in
D) declare DIRECTORY="$OPTARG" ;;
P) declare FILEPATTERN="$OPTARG" ;;
*) echo "Unbekannte Option | Usage: dwhDeleteFiles -D <Directory> -P <Filepattern>"
esac
done
shift $((OPTIND-1))
for currentfile in "${DIRECTORY}"/"${FILEPATTERN}" ; do
rm -fv "${currentfile}";
done
# error handling
# some more stuff
return $retFlag
}
This function is called like this:
sh -c ". ~/dwh_env.sh && dwhDeleteFiles -P ${DWH_FILEPATTERN_MJF_WLTO}.xml -D ${DWH_DIR_SRC_XML_CURR}/workDir" where again ${DWH_FILEPATTERN_MJF_WLTO} contains wildcards. When I call this function with my program it results in doing nothing. I tried to play around with adding "" and \"\" to the arguments of my functions, but all what is happening is that instead of deleting all files in the given directory the function deletes only the first one in an alphanumerical order.
Can somebody explain to me, what is happening here? My idea is that the multiple passing of the variable, containing the wildcard, is not working. But how do I fix this and is it even possible in bash? And why is the dwhUnzipFilesfunction working and the dwhDeleteFiles is not?
Suppose that DWH_FILEPATTERN_MJF_WLTO is *, and you have a bunch of *.xml files, then the command is
dwhDeleteFiles -P *.xml -D ${DWH_DIR_SRC_XML_CURR}/workDir
which expands to
dwhDeleteFiles -P bar.xml baz.xml foo.xml zap.xml -D ${DWH_DIR_SRC_XML_CURR}/workDir
(Note the alphabetical order of the xml files.) But the -P option only takes one argument, bar.xml (the first); the rest are treated as file arguments.
Try adding set -x to your script to see this in action.
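One possible fix, sketched against the question's call (untested in that environment): quote the pattern inside the sh -c string so it reaches the function unexpanded, and drop the quotes around ${FILEPATTERN} in the loop so the glob finally expands there:

# single quotes around the -c string; the inner double quotes keep the
# pattern literal after dwh_env.sh has defined the variables
sh -c '. ~/dwh_env.sh && dwhDeleteFiles -P "${DWH_FILEPATTERN_MJF_WLTO}.xml" -D "${DWH_DIR_SRC_XML_CURR}/workDir"'

# in dwhDeleteFiles, leave the pattern unquoted so it can glob:
for currentfile in "${DIRECTORY}"/${FILEPATTERN} ; do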

How to write an alias for "two" words [duplicate]

This question already has answers here:
Can I alias a subcommand? (shortening the output of `docker ps`)
The standard usage of an alias is to write a shortcut for an expanded command, for example: alias ls='ls --color'.
I want to know if it's possible to have "parameters" on the left side, so that it works the other way around. Using the above example, I'm interested in knowing whether alias ls --color='ls' is possible, that is, when someone types ls --color, plain ls is run.
Forget about whether or not that's useful or makes sense; I just want to know if it's possible, or if there is any workaround to achieve the same goal.
The existing answer doesn't correctly handle arguments with spaces -- and indeed cannot: condensing an array into a string is inherently buggy.
This version works with the list of arguments as an array, and thus avoids this loss of information:
ls() {
local -a args=( )
for arg; do
[[ $arg = --color ]] || args+=( "$arg" )
done
command ls "${args[@]}"
}
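Note that command ls bypasses function (and alias) lookup and runs the real ls binary, which is what keeps the function from calling itself recursively.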
Alternately, if your real goal is to alias a subcommand (and you might want to process more subcommands in the future), consider a case structure, as the following:
ls() {
local subcommand
if (( "$#" == 0 )); then command ls; return; fi
subcommand=$1; shift
case $subcommand in
--color) command ls "$@" ;;
*) command ls "$subcommand" "$@" ;;
esac
}
Some tests, to distinguish correctness between this answer and the preexisting one:
tempdir=/tmp/ls-alias-test
mkdir -p "$tempdir"/'hello world' "$tempdir"/my--color--test
# with the alternate answer, this fails because it tries to run:
# ls /tmp/ls-alias-test/hello world
# (without the quotes preserved)
ls --color "$tempdir/hello world"
# with the alternate answer, this fails because it tries to run:
# ls /tmp/ls-alias-test/my--test
ls --color "$tempdir/my--color--test"
Use a function after unalias ls:
unalias ls
ls () { p=$*; p=${p//--color/}; /bin/ls $p ;}

Shell script to browse one or more directories passed as parameters

I made this script that should receive one or more parameters, all of which are directories, and it has to browse those directories (one by one) and do some operations.
The operations work fine if only one parameter (a single directory) is passed.
How should I modify my script to make it work if more than one parameter is passed, for example to do the same operations in 2 or 3 directories at the same time?
Thanks
#!/bin/sh
cd $1
for file in ./*
do
if [[ -d $file ]]
then
ext=dir
else
ext="${file##*.}"
fi
mv "${file}" "${file}.$ext"
done
First, if you are using bash, use a bash shebang (#!/bin/bash).
Then use
#! /bin/bash
for d in "$#"
do
echo "Do something with $d"
done
to iterate over the command-line arguments (the directories, in your case).
#!/bin/sh
for dir in "$#"; do
for file in "$dir"/*; do
echo "Doing something with '$file'"
done
done
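Applied to the original script's rename logic, the whole thing might look like this sketch (note there is no cd, so the mv paths stay anchored to each directory argument):

#!/bin/sh
for dir in "$@"; do
    for file in "$dir"/*; do
        if [ -d "$file" ]; then
            ext=dir
        else
            ext="${file##*.}"
        fi
        mv "$file" "$file.$ext"
    done
done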
