What does IFS=',' read -r -a BUILD_ARGS_ARRAY <<< "$@" mean in bash?

I came across this piece of code in build.include:
set -u
prepare_build_args() {
IFS=',' read -r -a BUILD_ARGS_ARRAY <<< "$@"
for i in ${BUILD_ARGS_ARRAY[@]}; do
BUILD_ARGS+="--build-arg $i "
done
}
I have difficulty in understanding this code because I am new to shell.
Is IFS a variable assigned the value ','? Why is it followed by a read command?
What do -r and -a mean? And what does <<< do?
BUILD_ARGS_ARRAY is not defined beforehand, and there is set -u, which means unassigned variables are treated as errors. Is this a problem of scope? And what does [@] mean?
Finally, in my understanding BUILD_ARGS stores everything in BUILD_ARGS_ARRAY, but it is not returned out of the prepare_build_args function?

Looking through the Bash manual might be helpful.
IFS is the Internal Field Separator, setting it before the read command applies it only for that command.
The read builtin command option -r stops backslashes mangling the data, and -a reads into an array (BUILD_ARGS_ARRAY in this case).
<<< is a here string, which feeds the function's arguments (joined into a single string) to the read command on its standard input.
BUILD_ARGS_ARRAY is set by the read command. The [@] subscript expands to all elements of the array.
Variable scope is global unless the local builtin is used.
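A minimal demo of those pieces working together (illustrative values, not from the original script):
IFS=',' read -r -a parts <<< "one,two words,three"   # IFS=',' applies only to this read
printf '%s\n' "${parts[@]}"
This prints one, two words and three on separate lines: the here string feeds the text to read on standard input, and read splits it on commas into the parts array.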

In short, this code:
Concatenates all your function's arguments together into a single string (normally this is what "$*" does, but "$@" does it as well when used in a context where its result is evaluated as a single string).
Splits that string on commas, storing the result in an array named BUILD_ARGS_ARRAY
Takes the array, concatenates its elements into a single string (again!), splits that string on whitespace, expands each component generated by that split as a glob, and iterates over the glob results.
For each glob result, appends the string --build-arg <result> to BUILD_ARGS.
This is extremely buggy, and should never be used by anyone. To go into why in more detail:
"$#" is intended for use where its result can be treated as a list. Expanding it in a string context throws away the original division between arguments, replacing them with the first character in IFS (in the context in which the expansion is done, not the expansion of the read in which the result is consumed).
The unquoted ${foo[@]} expansions make the behavior of this code sensitive to whether your arguments contain globbing characters, and if so, which files exist in the directory it's run in and whether the nullglob, failglob, or similar options are set. See the shellcheck warning SC2068.
The net effect of this operation is to build a string which is presumably going to be expanded in generating a command line. Strings cannot be safely used in this way in the general case; see BashFAQ #50 describing the pitfalls, caveats, and alternative approaches.
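For comparison, here is a sketch of a safer version (an illustration assuming comma-splitting of the joined arguments really is the intended behavior; it is not the original code):
prepare_build_args() {
local -a items
local i
IFS=',' read -r -a items <<< "$*"   # "$*" joins the arguments explicitly
BUILD_ARGS=()                       # an array instead of a flat string
for i in "${items[@]}"; do
BUILD_ARGS+=( --build-arg "$i" )    # option and value stay separate words
done
}
A caller would then expand the result quoted, e.g. docker build "${BUILD_ARGS[@]}" ., so no later word splitting or globbing can mangle the values.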


how to pass args to bash functions [duplicate]

Let's say I have a function abc() that will handle the logic related to analyzing the arguments passed to my script.
How can I pass all arguments my Bash script has received to abc()? The number of arguments is variable, so I can't just hard-code the arguments passed like this:
abc $1 $2 $3 $4
Better yet, is there any way for my function to have access to the script arguments' variables?
The $@ variable expands to all command-line parameters separated by spaces. Here is an example.
abc "$@"
When using $@, you should (almost) always put it in double-quotes to avoid misparsing of arguments containing spaces or wildcards (see below). This works for multiple arguments. It is also portable to all POSIX-compliant shells.
It is also worth noting that $0 (generally the script's name or path) is not in $@.
The Bash Reference Manual Special Parameters Section says that $@ expands to the positional parameters, starting from one. When the expansion occurs within double quotes, each parameter expands to a separate word. That is, "$@" is equivalent to "$1" "$2" "$3"....
Passing some arguments:
If you want to pass all but the first arguments, you can first use shift to "consume" the first argument and then pass "$@" to pass the remaining arguments to another command. In Bash (and zsh and ksh, but not in plain POSIX shells like dash), you can do this without messing with the argument list using a variant of array slicing: "${@:3}" will get you the arguments starting with "$3". "${@:3:4}" will get you up to four arguments starting at "$3" (i.e. "$3" "$4" "$5" "$6"), if that many arguments were passed.
Things you probably don't want to do:
"$*" gives all of the arguments stuck together into a single string (separated by spaces, or whatever the first character of $IFS is). This looses the distinction between spaces within arguments and the spaces between arguments, so is generally a bad idea. Although it might be ok for printing the arguments, e.g. echo "$*", provided you don't care about preserving the space within/between distinction.
Assigning the arguments to a regular variable (as in args="$@") mashes all the arguments together like "$*" does. If you want to store the arguments in a variable, use an array with args=("$@") (the parentheses make it an array), and then reference them as e.g. "${args[0]}" etc. Note that in Bash and ksh, array indexes start at 0, so $1 will be in args[0], etc. zsh, on the other hand, starts array indexes at 1, so $1 will be in args[1]. And more basic shells like dash don't have arrays at all.
Leaving off the double-quotes, with either $@ or $*, will try to split each argument up into separate words (based on whitespace or whatever's in $IFS), and also try to expand anything that looks like a filename wildcard into a list of matching filenames. This can have really weird effects, and should almost always be avoided. (Except in zsh, where this expansion doesn't take place by default.)
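A quick demonstration of the difference (show is a throwaway helper defined here, not a standard command):
show() { printf '<%s> ' "$@"; echo; }
set -- "one two" three    # set the positional parameters
show "$@"    # <one two> <three>   - argument boundaries preserved
show "$*"    # <one two three>     - one joined string
show $@      # <one> <two> <three> - word-split (and glob-expanded)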
I needed a variation on this, which I expect will be useful to others:
function diffs() {
diff "${#:3}" <(sort "$1") <(sort "$2")
}
The "${#:3}" part means all the members of the array starting at 3. So this function implements a sorted diff by passing the first two arguments to diff through sort and then passing all other arguments to diff, so you can call it similarly to diff:
diffs file1 file2 [other diff args, e.g. -y]
Use the $@ variable, which expands to all command-line parameters separated by spaces.
abc "$@"
Here's a simple script:
#!/bin/bash
args=("$#")
echo Number of arguments: $#
echo 1st argument: ${args[0]}
echo 2nd argument: ${args[1]}
$# is the number of arguments received by the script. I find it easier to access them using an array: the args=("$@") line puts all the arguments in the args array. To access them, use ${args[index]}.
It's worth mentioning that you can specify argument ranges with this syntax.
function example() {
echo "line1 ${#:1:1}"; #First argument
echo "line2 ${#:2:1}"; #Second argument
echo "line3 ${#:3}"; #Third argument onwards
}
I hadn't seen it mentioned.
abc "$#" is generally the correct answer.
But I was trying to pass a parameter through to an su command, and no amount of quoting could stop the error su: unrecognized option '--myoption'. What actually worked for me was passing all the arguments as a single string :
abc "$*"
My exact case (I'm sure someone else needs this) was in my .bashrc
# run all aws commands as Jenkins user
aws ()
{
sudo su jenkins -c "aws $*"
}
abc "$#"
$# represents all the parameters given to your bash script.

Correctly allow word splitting of command substitution in bash

I write, maintain and use a healthy amount of bash scripts. I would consider myself a bash hacker and strive to someday be a bash ninja (need to learn more awk first). One of the most important features/frustrations of bash to understand is how quotes, and subsequent parameter expansion, work. This is well documented, and for good reason: many pitfalls, bugs and newbie-traps exist in the mysterious world of quoted parameter expansion and word splitting. For this reason, the advice is to "double-quote everything," but what if I want word splitting to occur?
In multiple style guides I can not find an example of safe and proper use of word splitting after command substitution.
What is the correct way to use unquoted command substitution?
Example:
I don't need help getting this command working; it just seems to be a violation of established patterns. (If you would like to give feedback on this command, please keep it in comments.)
docker stats $(docker ps | awk '{print $NF}' | grep -v NAMES)
The command substitution returns output such as:
container-1 container-3 excitable-newton
This one-liner uses the command substitution to spit out the names of each of my running docker containers and then feeds them, with word splitting, as separate inputs to the docker stats command, which takes an arbitrary-length list of container names and gives back some info about them.
If I used:
docker stats "$(docker ps | awk '{print $NF}' | grep -v NAMES)"
There would be one string of newline separated container names passed to docker stats.
This seems like a perfect example of when I would want word splitting, but shellcheck disagrees. Is this somehow unsafe? Is there an established pattern for using word splitting after expansion or substitution?
The safe way to capture output from one command and pass it to another is to temporarily capture the output in an array. This allows splitting on arbitrary delimiters and prevents unintentional splitting or globbing while capturing output as more than one string to be passed on to another command.
If you want to read a space-separated string into an array, use read -a:
read -r -a names < <(docker ps | awk '{print $NF}' | grep -v NAMES)
printf 'Found name: %s\n' "${names[@]}"
Unlike the unquoted-expansion approach, this doesn't expand globs. Thus, foo[bar] can't be replaced with a filesystem entry named foob, or with an empty string if no such filesystem entry exists and the nullglob shell option is set. (Likewise, * will no longer be replaced with a list of files in the current directory).
To go into detail regarding behavior: read -r -a reads up to a delimiter (a newline by default; the first character of the argument to -d, if that option is given; or a NUL if that argument is empty), and splits the result into fields based on the characters within IFS -- a set which, by default, contains the space, the tab, and the newline; it then assigns those split results to an array.
This behavior does not meaningfully vary based on shell-local configuration, except for IFS, which can be modified scoped to the single command.
mapfile -t and readarray -t are similarly consistent in behavior, and likewise recommended if portability constraints do not prevent their use.
By contrast, array=( $string ) is much more dependent on the shell's configuration and settings, and will behave badly if the shell's configuration is left at defaults:
When using array=( $string ), if set -f is not set, each word created by splitting $string is evaluated as a glob, with further variances based in behavior depending on the shopt settings nullglob (which would cause a pattern which didn't expand to any contents to result in an empty set, rather than the default of expanding to the glob expression itself), failglob (which would cause a pattern which didn't expand to any contents to result in a failure), extglob, dotglob and others.
When using array=( $string ), the value of IFS used for the split operation cannot be easily and reliably altered in a manner scoped to this single operation. By contrast, one can run IFS=: read to force read to split only on :s without modifying the value of IFS outside the scope of that single command; no equivalent for array=( $string ) exists without storing and re-setting IFS (which is an error-prone operation; some common idioms [such as assignment to oIFS or a similar variable name] operate contrary to intent in common scenarios, such as failing to reproduce an unset or empty IFS at the end of the block to which the temporary modification is intended to apply).
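To illustrate the scoped-IFS point with a small sketch (made-up input):
IFS=: read -r -a fields <<< "/usr/bin:/bin:/usr/local/bin"
printf '%s\n' "${fields[@]}"   # three lines; IFS itself is unchanged afterwards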
Thanks to @I'L'I pointing to an example of a valid exception to the "Quote Everything" rule, my code does appear to be an exception to the rule.
In my particular use case, using docker container names, the risk of accidental globbing or expansion is low due to the constraints on container names. However, @Charles Duffy provided a surefire and safe way to go about word-splitting one command's output before feeding it into the next command: reading the first output into an array using the bash builtin read (I found readarray better suited my case).
readarray -t names < <(docker ps | awk '{print $NF}' | grep -v NAMES)
docker stats "${names[#]}"
This pattern allows for the output from the first command to be fed to the second command as properly split, separate arguments while avoiding unwanted globbing or splitting. Unfortunately my slick one-liner will perish in favor of safety.

How do I store a command in a variable and use it in a pipeline? [duplicate]

If I use this command in a pipeline, it works very well:
pipeline ... | grep -P '^[^\s]*\s3\s'
But if I store the grep command in a variable like:
var="grep -P '^[^\s]*\s3\s'"
and then use the variable in the pipeline:
pipeline ... | $var
nothing happens, as if there weren't any matches.
What am I doing wrong?
The robust way to store a simple command in a variable in Bash is to use an array:
# Store the command names and arguments individually
# in the elements of an *array*.
cmd=( grep -P '^[^\s]*\s3\s' )
# Use the entire array as the command to execute - be sure to
# double-quote ${cmd[@]}.
echo 'before 3 after' | "${cmd[@]}"
If, by contrast, your command is more than a simple command and, for instance, involves pipes, multiple commands, loops, ..., defining a function is the right approach:
# Define a function encapsulating the command...
myGrep() { grep -P '^[^\s]*\s3\s'; }
# ... and use it:
echo 'before 3 after' | myGrep
Why what you tried didn't work:
var="grep -P '^[^\s]*\s3\s'"
causes the single quotes around the regex to become a literal, embedded part of $var's value.
When you then use $var - unquoted - as a command, the following happens:
Bash performs word-splitting, which means that it breaks the value of $var into words (separate tokens) by whitespace (the chars. defined in special variable $IFS, which contains a space, a tab, and a newline character by default).
Bash also performs globbing (pathname expansion) on the resulting words, which is not a problem here, but can have unintended consequences in general.
Also, if any of your original arguments had embedded whitespace, word splitting would split them into multiple words, and your original argument partitioning is lost.
(As an aside: "$var" - i.e., double-quoting the variable reference - is not a solution, because then the entire string is treated as the command name.)
Specifically, the resulting words are:
grep
-P
'^[^\s]*\s3\s' - including the surrounding single quotes
The words are then interpreted as the name of the command and its arguments, and invoked as such.
Given that the pattern argument passed to grep starts with a literal single quote, matching won't work as intended.
Short of using eval "$var" - which is NOT recommended for security reasons - you cannot persuade Bash to see the embedded single quotes as syntactical elements that should be removed (a process appropriately called quote removal).
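You can watch the splitting happen with printf (assuming default shell options and no files matching the pattern as a glob):
var="grep -P '^[^\s]*\s3\s'"
printf '<%s>\n' $var   # deliberately unquoted to show the resulting words
The output is <grep>, <-P> and <'^[^\s]*\s3\s'>, each on its own line - the single quotes are still part of the last word.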
Using an array bypasses all these problems by storing arguments in individual elements and letting Bash robustly assemble them into a command with "${cmd[@]}".
What you are doing wrong is trying to store a command in a variable. For simplicity, robustness, etc. commands are stored in aliases (if no arguments) or functions (if arguments), not variables. In this case:
$ alias foo='grep X'
$ echo "aXb" | foo
aXb
I recommend you read the book Shell Scripting Recipes by Chris Johnson ASAP to get the basics of shell programming and then Effective Awk Programming, 4th Edition, by Arnold Robbins when you're ready to start writing scripts to manipulate text.

Getting quoted-dollar-at ( "$@" ) behaviour for other variable expansion?

The shell has a great feature, where it'll preserve argument quoting across variable expansion when you use "$@", such that the script:
for f in "$@"; do echo "$f"; done
when invoked with arguments:
"with spaces" '$and $(metachars)'
will print, literally:
with spaces
$and $(metachars)
This isn't the normal behaviour of expansion of a quoted string, it seems to be a special case for "$@".
Is there any way to get this behaviour for other variables? In the specific case I'm interested in, I want to safely expand $SSH_ORIGINAL_COMMAND in a command= specifier in a restricted public key entry, without having to worry about spaces in arguments, metacharacters, etc.
"$SSH_ORIGINAL_COMMAND" expands like "$*" would, i.e. a naïve expansion that doesn't add any quoting around separate arguments.
Is the information required for "$@"-style expansion simply not available to the shell in this case, by the time it gets the env var SSH_ORIGINAL_COMMAND? So I'd instead need to convince sshd to quote the arguments?
The answer to this question is making me wonder if it's possible at all.
You can get similar "quoted dollar-at" behavior for arbitrary arrays using "${YOUR_ARRAY_HERE[@]}" syntax for bash arrays. Of course, that's no complete answer, because you still have to break the string into multiple array elements according to the quotes.
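For example (a sketch mirroring the question's test case):
args=("with spaces" '$and $(metachars)')
for f in "${args[@]}"; do echo "$f"; done
This prints the two elements literally, one per line, exactly as "$@" does for the positional parameters.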
One thought was to use bash -x, which renders expanded output, but only if you actually run the command; it doesn't work with -n, which prevents you from actually executing the commands in question. Likewise you could use eval or bash -c along with set -- to manage the quote removal, performing expansion on the outer shell and quote removal on the inner shell, but that would be extremely hard to bulletproof against executing arbitrary code.
As an end run, use xargs instead. xargs handles single and double quotes. This is a very imperfect solution, because xargs treats backslash-escaped characters very differently than bash does and fails entirely to handle semicolons and so forth, but if your input is relatively predictable it gets you most of the way there without forcing you to write a full shell parser.
SSH_ORIGINAL_COMMAND='foo "bar baz" $quux'
# Build out the parsed array.
# Bash 4 users may be able to do this with readarray or mapfile instead.
# You may also choose to null-terminate if newlines matter.
COMMAND_ARRAY=()
while IFS= read -r line; do
COMMAND_ARRAY+=("$line")
done < <(xargs -n 1 <<< "$SSH_ORIGINAL_COMMAND")
# Demonstrate working with the array.
N=0
for arg in "${COMMAND_ARRAY[#]}"; do
echo "COMMAND_ARRAY[$N]: $arg"
((N++))
done
Output:
COMMAND_ARRAY[0]: foo
COMMAND_ARRAY[1]: bar baz
COMMAND_ARRAY[2]: $quux

bash quotes in variable treated different when expanded to command [duplicate]

Explaining the question through examples...
Demonstrates that the single quotes after --chapters are passed through literally when the variable is expanded (I didn't expect this):
prompt#ubuntu:/my/scripts$ cat test1.sh
#!/bin/bash
actions="--tags all:"
actions+=" --chapters ''"
mkvpropedit "$1" $actions
prompt#ubuntu:/my/scripts$ ./test1.sh some.mkv
Error: Could not open '''' for reading.
And now for some reason mkvpropedit receives the double quotes as part of the filename (I didn't expect this either):
prompt#ubuntu:/my/scripts$ cat test1x.sh
#!/bin/bash
command="mkvpropedit \"$1\""
command+=" --tags all:"
command+=" --chapters ''"
echo "$command"
$command
prompt#ubuntu:/my/scripts$ ./test1x.sh some.mkv
mkvpropedit "some.mkv" --tags all: --chapters ''
Error: Could not open '''' for reading.
The above echo'd command seems to be correct. Putting the same text in another script gives the expected result:
prompt#ubuntu:/my/scripts$ cat test2.sh
#!/bin/bash
mkvpropedit "$1" --tags all: --chapters ''
prompt#ubuntu:/my/scripts$ ./test2.sh some.mkv
The file is being analyzed.
The changes are written to the file.
Done.
Could anyone please explain why the quotes are not behaving as expected. I found searching on this issue difficult as there are so many other quoting discussions on the web. I wouldn't even know how to explain the question without examples.
I am afraid that some day the file name in the argument will contain some character that breaks everything, hence the perhaps excessive quoting. I do not understand why the same command executes differently when typed directly in the script or when provided via a variable. Please enlighten me.
Thanks for reading.
The important thing to keep in mind is that quotes are only removed once, when the command line is originally parsed. A quote which is inserted into the command line as a result of parameter substitution ($foo) or command substitution ($(cmd args)) is not treated as a special character. [Note 1]
The same is not true of whitespace and glob metacharacters: word splitting and pathname expansion happen after parameter/command substitution (unless the substitution occurs inside quotes). [Note 2]
The consequence is that it is almost impossible to create a bash variable $args such that
cmd $args
does what you want. If $args contains quotes, they are not removed. Words inside $args are delimited by sequences of whitespace, not single whitespace characters.
The only way to do it is to set $IFS to include some non-whitespace character; that character can then be used inside $args as a single-character delimiter. However, there is no way to quote a character inside a value, so once you do that, the character you chose cannot be used other than as a delimiter. This is not usually very satisfactory.
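A sketch of that workaround, using : as the delimiter (made-up values):
IFS=':'
args='one:two words:three'
printf '<%s> ' $args; echo   # <one> <two words> <three>
unset IFS
Note the trade-off described above: a literal : can no longer appear inside any of the values.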
There is a solution, though: bash arrays.
If you make $args into an array variable, then you can expand it with the quoted-[@] syntax:
cmd "${args[@]}"
which produces exactly one word per element of $args, and suppresses word-splitting and pathname expansion on those words, so they end up as literals.
So, for example:
actions=(--tags all:)
actions+=(--chapters '')
mkvpropedit "$1" "${actions[#]}"
will probably do what you want. So would:
args=("$1")
args+=(--tags)
args+=(all:)
args+=(--chapters)
args+=('')
mkvpropedit "${args[#]}"
and so would
command=(mkvpropedit "$1" --tags all: --chapters '')
"${command[#]}"
I hope that's semi-clear.
man bash (or the online version) contains a blow-by-blow account of how bash assembles commands, starting at the section "EXPANSION". It's worth reading for a full explanation.
Notes:
This doesn't apply to eval or commands like bash -c which evaluate their argument again after command-line processing. But that's because command-line processing happens twice, as the short demo after these notes illustrates.
Word splitting is not the same as "dividing the command into words", which happens when the command is parsed. For one thing, word-splitting uses as separator characters the value of $IFS, whereas command-line parsing uses whitespace. But neither of these are done inside quotes, so they are similar in that respect. In any case, words are split in one way or another both before and after parameter substitution.
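A two-line illustration of that double processing (hypothetical variable):
var="'quoted'"
echo $var            # prints 'quoted' - the quotes are literal data here
bash -c "echo $var"  # prints quoted - the inner shell parses the quotes again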
