More than any other language I know, I've "learned" Bash by Googling every time I need some little thing. Consequently, I can patchwork together little scripts that appear to work. However, I don't really know what's going on, and I was hoping for a more formal introduction to Bash as a programming language. For example: What is the evaluation order? what are the scoping rules? What is the typing discipline, e.g. is everything a string? What is the state of the program -- is it a key-value assignment of strings to variable names; is there more than that, e.g. the stack? Is there a heap? And so on.
I thought to consult the GNU Bash manual for this kind of insight, but it doesn't seem to be what I want; it's more of a laundry list of syntactic sugar rather than an explanation of the core semantic model. The million-and-one "bash tutorials" online are only worse. Perhaps I should first study sh, and understand Bash as a syntactic sugar on top of this? I don't know if this is an accurate model, though.
Any suggestions?
EDIT: I've been asked to provide examples of what ideally I'm looking for. A rather extreme example of what I would consider a "formal semantics" is this paper on "the essence of JavaScript". Perhaps a slightly less formal example is the Haskell 2010 report.
A shell is an interface for the operating system. It is usually a more-or-less robust programming language in its own right, but with features designed to make it easy to interact specifically with the operating system and filesystem. The POSIX shell's (hereafter referred to just as "the shell") semantics are a bit of a mutt, combining some features of LISP (s-expressions have a lot in common with shell word splitting) and C (much of the shell's arithmetic syntax semantics comes from C).
The other root of the shell's syntax comes from its upbringing as a mishmash of individual UNIX utilities. Most of what are often builtins in the shell can actually be implemented as external commands. It throws many shell neophytes for a loop when they realize that /bin/[ exists on many systems.
$ if '/bin/[' -f '/bin/['; then echo t; fi # Tested as-is on OS X, without the `]`
t
wat?
This makes a lot more sense if you look at how a shell is implemented. Here's an implementation I did as an exercise. It's in Python, but I hope that's not a hangup for anyone. It's not terribly robust, but it is instructive:
#!/usr/bin/env python
from __future__ import print_function
import os, sys
'''Hacky barebones shell.'''
try:
input=raw_input
except NameError:
pass
def main():
while True:
cmd = input('prompt> ')
args = cmd.split()
if not args:
continue
cpid = os.fork()
if cpid == 0:
# We're in a child process
os.execl(args[0], *args)
else:
os.waitpid(cpid, 0)
if __name__ == '__main__':
main()
I hope the above makes it clear that the execution model of a shell is pretty much:
1. Expand words.
2. Assume the first word is a command.
3. Execute that command with the following words as arguments.
Expansion, command resolution, execution. All of the shell's semantics are bound up in one of these three things, although they're far richer than the implementation I wrote above.
Not all commands fork. In fact, there are a handful of commands that don't make a ton of sense implemented as externals (such that they would have to fork), but even those are often available as externals for strict POSIX compliance.
Bash builds upon this base by adding new features and keywords to enhance the POSIX shell. It is nearly compatible with sh, and bash is so ubiquitous that some script authors go years without realizing that a script may not actually work on a POSIXly strict system. (I also wonder how people can care so much about the semantics and style of one programming language, and so little for the semantics and style of the shell, but I diverge.)
Order of evaluation
This is a bit of a trick question: Bash interprets expressions in its primary syntax from left to right, but in its arithmetic syntax it follows C precedence. Expressions differ from expansions, though. From the EXPANSION section of the bash manual:
The order of expansions is: brace expansion; tilde expansion, parameter
and variable expansion, arithmetic expansion, and command substitution
(done in a left-to-right fashion); word splitting; and pathname expansion.
If you understand wordsplitting, pathname expansion and parameter expansion, you are well on your way to understanding most of what bash does. Note that pathname expansion coming after wordsplitting is critical, because it ensures that a file with whitespace in its name can still be matched by a glob. This is why good use of glob expansions is better than parsing commands, in general.
Scope
Function scope
Much like old ECMAscript, the shell has dynamic scope unless you explicitly declare names within a function.
$ foo() { echo $x; }
$ bar() { local x; echo $x; }
$ foo
$ bar
$ x=123
$ foo
123
$ bar
$ …
Environment and process "scope"
Subshells inherit the variables of their parent shells, but other kinds of processes don't inherit unexported names.
$ x=123
$ ( echo $x )
123
$ bash -c 'echo $x'
$ export x
$ bash -c 'echo $x'
123
$ y=123 bash -c 'echo $y' # another way to transiently export a name
123
You can combine these scoping rules:
$ foo() {
> local -x bar=123 # Export foo, but only in this scope
> bash -c 'echo $bar'
> }
$ foo
123
$ echo $bar
$
Typing discipline
Um, types. Yeah. Bash really doesn't have types, and everything expands to a string (or perhaps a word would be more appropriate.) But let's examine the different types of expansions.
Strings
Pretty much anything can be treated as a string. Barewords in bash are strings whose meaning depends entirely on the expansion applied to it.
No expansion
It may be worthwhile to demonstrate that a bare word really is just a word, and that quotes change nothing about that.
$ echo foo
foo
$ 'echo' foo
foo
$ "echo" foo
foo
Substring expansion
$ fail='echoes'
$ set -x # So we can see what's going on
$ "${fail:0:-2}" Hello World
+ echo Hello World
Hello World
For more on expansions, read the Parameter Expansion section of the manual. It's quite powerful.
Integers and arithmetic expressions
You can imbue names with the integer attribute to tell the shell to treat the right hand side of assignment expressions as arithmetic. Then, when the parameter expands it will be evaluated as integer math before expanding to … a string.
$ foo=10+10
$ echo $foo
10+10
$ declare -i foo
$ foo=$foo # Must re-evaluate the assignment
$ echo $foo
20
$ echo "${foo:0:1}" # Still just a string
2
Arrays
Arguments and Positional Parameters
Before talking about arrays it might be worth discussing positional parameters. The arguments to a shell script can be accessed using numbered parameters, $1, $2, $3, etc. You can access all these parameters at once using "$#", which expansion has many things in common with arrays. You can set and change the positional parameters using the set or shift builtins, or simply by invoking the shell or a shell function with these parameters:
$ bash -c 'for ((i=1;i<=$#;i++)); do
> printf "\$%d => %s\n" "$i" "${#:i:1}"
> done' -- foo bar baz
$1 => foo
$2 => bar
$3 => baz
$ showpp() {
> local i
> for ((i=1;i<=$#;i++)); do
> printf '$%d => %s\n' "$i" "${#:i:1}"
> done
> }
$ showpp foo bar baz
$1 => foo
$2 => bar
$3 => baz
$ showshift() {
> shift 3
> showpp "$#"
> }
$ showshift foo bar baz biz quux xyzzy
$1 => biz
$2 => quux
$3 => xyzzy
The bash manual also sometimes refers to $0 as a positional parameter. I find this confusing, because it doesn't include it in the argument count $#, but it is a numbered parameter, so meh. $0 is the name of the shell or the current shell script.
Arrays
The syntax of arrays is modeled after positional parameters, so it's mostly healthy to think of arrays as a named kind of "external positional parameters", if you like. Arrays can be declared using the following approaches:
$ foo=( element0 element1 element2 )
$ bar[3]=element3
$ baz=( [12]=element12 [0]=element0 )
You can access array elements by index:
$ echo "${foo[1]}"
element1
You can slice arrays:
$ printf '"%s"\n' "${foo[#]:1}"
"element1"
"element2"
If you treat an array as a normal parameter, you'll get the zeroth index.
$ echo "$baz"
element0
$ echo "$bar" # Even if the zeroth index isn't set
$ …
If you use quotes or backslashes to prevent wordsplitting, the array will maintain the specified wordsplitting:
$ foo=( 'elementa b c' 'd e f' )
$ echo "${#foo[#]}"
2
The main difference between arrays and positional parameters are:
Positional parameters are not sparse. If $12 is set, you can be sure $11 is set, too. (It could be set to the empty string, but $# will not be smaller than 12.) If "${arr[12]}" is set, there's no guarantee that "${arr[11]}" is set, and the length of the array could be as small as 1.
The zeroth element of an array is unambiguously the zeroth element of that array. In positional parameters, the zeroth element is not the first argument, but the name of the shell or shell script.
To shift an array, you have to slice and reassign it, like arr=( "${arr[#]:1}" ). You could also do unset arr[0], but that would make the first element at index 1.
Arrays can be shared implicitly between shell functions as globals, but you have to explicitly pass positional parameters to a shell function for it to see those.
It's often convenient to use pathname expansions to create arrays of filenames:
$ dirs=( */ )
Commands
Commands are key, but they're also covered in better depth than I can by the manual. Read the SHELL GRAMMAR section. The different kinds of commands are:
Simple Commands (e.g. $ startx)
Pipelines (e.g. $ yes | make config) (lol)
Lists (e.g. $ grep -qF foo file && sed 's/foo/bar/' file > newfile)
Compound Commands (e.g. $ ( cd -P /var/www/webroot && echo "webroot is $PWD" ))
Coprocesses (Complex, no example)
Functions (A named compound command that can be treated as a simple command)
Execution Model
The execution model of course involves both a heap and a stack. This is endemic to all UNIX programs. Bash also has a call stack for shell functions, visible via nested use of the caller builtin.
References:
The SHELL GRAMMAR section of the bash manual
The XCU Shell Command Language documentation
The Bash Guide on Greycat's wiki.
Advanced Programming in the UNIX Environment
Please make comments if you want me to expand further in a specific direction.
The answer to your question "What is the typing discipline, e.g. is everything a string"
Bash variables are character strings. But, Bash permits arithmetic operations and comparisons on variables when variables are integers. The exception to rule Bash variables are character strings is when said variables are typeset or declared otherwise
$ A=10/2
$ echo "A = $A" # Variable A acting like a String.
A = 10/2
$ B=1
$ let B="$B+1" # Let is internal to bash.
$ echo "B = $B" # One is added to B was Behaving as an integer.
B = 2
$ A=1024 # A Defaults to string
$ B=${A/24/STRING01} # Substitute "24" with "STRING01".
$ echo "B = $B" # $B STRING is a string
B = 10STRING01
$ B=${A/24/STRING01} # Substitute "24" with "STRING01".
$ declare -i B
$ echo "B = $B" # Declaring a variable with non-integers in it doesn't change the contents.
B = 10STRING01
$ B=${B/STRING01/24} # Substitute "STRING01" with "24".
$ echo "B = $B"
B = 1024
$ declare -i B=10/2 # Declare B and assigning it an integer value
$ echo "B = $B" # Variable B behaving as an Integer
B = 5
Declare option meanings:
-a Variable is an array.
-f Use function names only.
-i The variable is to be treated as an integer; arithmetic evaluation is performed when the variable is assigned a value.
-p Display the attributes and values of each variable. When -p is used, additional options are ignored.
-r Make variables read-only. These variables cannot then be assigned values by subsequent assignment statements, nor can they be unset.
-t Give each variable the trace attribute.
-x Mark each variable for export to subsequent commands via the environment.
The bash manpage has quite a bit more info than most manpages, and includes some of what you're asking for. My assumption after more than a decade of scripting bash is that, due to its' history as an extension of sh, it has some funky syntax (to maintain backward compatibility with sh).
FWIW, my experience has been like yours; although the various books (e.g., O'Reilly "Learning the Bash Shell" and similar) do help with the syntax, there are lots of strange ways of solving various problems, and some of them are not in the book and must be googled.
Related
This question already has answers here:
Propagate all arguments in a Bash shell script
(12 answers)
Closed 3 years ago.
Let's say I have a function abc() that will handle the logic related to analyzing the arguments passed to my script.
How can I pass all arguments my Bash script has received to abc()? The number of arguments is variable, so I can't just hard-code the arguments passed like this:
abc $1 $2 $3 $4
Better yet, is there any way for my function to have access to the script arguments' variables?
The $# variable expands to all command-line parameters separated by spaces. Here is an example.
abc "$#"
When using $#, you should (almost) always put it in double-quotes to avoid misparsing of arguments containing spaces or wildcards (see below). This works for multiple arguments. It is also portable to all POSIX-compliant shells.
It is also worth noting that $0 (generally the script's name or path) is not in $#.
The Bash Reference Manual Special Parameters Section says that $# expands to the positional parameters starting from one. When the expansion occurs within double quotes, each parameter expands to a separate word. That is "$#" is equivalent to "$1" "$2" "$3"....
Passing some arguments:
If you want to pass all but the first arguments, you can first use shift to "consume" the first argument and then pass "$#" to pass the remaining arguments to another command. In Bash (and zsh and ksh, but not in plain POSIX shells like dash), you can do this without messing with the argument list using a variant of array slicing: "${#:3}" will get you the arguments starting with "$3". "${#:3:4}" will get you up to four arguments starting at "$3" (i.e. "$3" "$4" "$5" "$6"), if that many arguments were passed.
Things you probably don't want to do:
"$*" gives all of the arguments stuck together into a single string (separated by spaces, or whatever the first character of $IFS is). This looses the distinction between spaces within arguments and the spaces between arguments, so is generally a bad idea. Although it might be ok for printing the arguments, e.g. echo "$*", provided you don't care about preserving the space within/between distinction.
Assigning the arguments to a regular variable (as in args="$#") mashes all the arguments together like "$*" does. If you want to store the arguments in a variable, use an array with args=("$#") (the parentheses make it an array), and then reference them as e.g. "${args[0]}" etc. Note that in Bash and ksh, array indexes start at 0, so $1 will be in args[0], etc. zsh, on the other hand, starts array indexes at 1, so $1 will be in args[1]. And more basic shells like dash don't have arrays at all.
Leaving off the double-quotes, with either $# or $*, will try to split each argument up into separate words (based on whitespace or whatever's in $IFS), and also try to expand anything that looks like a filename wildcard into a list of matching filenames. This can have really weird effects, and should almost always be avoided. (Except in zsh, where this expansion doesn't take place by default.)
I needed a variation on this, which I expect will be useful to others:
function diffs() {
diff "${#:3}" <(sort "$1") <(sort "$2")
}
The "${#:3}" part means all the members of the array starting at 3. So this function implements a sorted diff by passing the first two arguments to diff through sort and then passing all other arguments to diff, so you can call it similarly to diff:
diffs file1 file2 [other diff args, e.g. -y]
Use the $# variable, which expands to all command-line parameters separated by spaces.
abc "$#"
Here's a simple script:
#!/bin/bash
args=("$#")
echo Number of arguments: $#
echo 1st argument: ${args[0]}
echo 2nd argument: ${args[1]}
$# is the number of arguments received by the script. I find easier to access them using an array: the args=("$#") line puts all the arguments in the args array. To access them use ${args[index]}.
It's worth mentioning that you can specify argument ranges with this syntax.
function example() {
echo "line1 ${#:1:1}"; #First argument
echo "line2 ${#:2:1}"; #Second argument
echo "line3 ${#:3}"; #Third argument onwards
}
I hadn't seen it mentioned.
abc "$#" is generally the correct answer.
But I was trying to pass a parameter through to an su command, and no amount of quoting could stop the error su: unrecognized option '--myoption'. What actually worked for me was passing all the arguments as a single string :
abc "$*"
My exact case (I'm sure someone else needs this) was in my .bashrc
# run all aws commands as Jenkins user
aws ()
{
sudo su jenkins -c "aws $*"
}
abc "$#"
$# represents all the parameters given to your bash script.
Bash seems to behave unpredictably in regards to temporary, per-command variable assignment, specifically with IFS.
I often assign IFS to a temporary value in conjunction with the read command. I would like to use the same mechanic to tailor output, but currently resort to a function or subshell to contain the variable assignment.
$ while IFS=, read -a A; do
> echo "${A[#]:1:2}" # control (undesirable)
> done <<< alpha,bravo,charlie
bravo charlie
$ while IFS=, read -a A; do
> IFS=, echo "${A[*]:1:2}" # desired solution (failure)
> done <<< alpha,bravo,charlie
bravo charlie
$ perlJoin(){ local IFS="$1"; shift; echo "$*"; }
$ while IFS=, read -a A; do
> perlJoin , "${A[#]:1:2}" # function with local variable (success)
> done <<< alpha,bravo,charlie
bravo,charlie
$ while IFS=, read -a A; do
> (IFS=,; echo "${A[*]:1:2}") # assignment within subshell (success)
> done <<< alpha,bravo,charlie
bravo,charlie
If the second assignment in the following block does not affect the environment of the command, and it does not generate an error, then what is it for?
$ foo=bar
$ foo=qux echo $foo
bar
$ foo=bar
$ foo=qux echo $foo
bar
This is a common bash gotcha -- and https://www.shellcheck.net/ catches it:
foo=qux echo $foo
^-- SC2097: This assignment is only seen by the forked process.
^-- SC2098: This expansion will not see the mentioned assignment.
The issue is that the first foo=bar is setting a bash variable, not an environment variable. Then, the inline foo=qux syntax is used to set an environment variable for echo -- however echo never actually looks at that variable. Instead $foo gets recognized as a bash variable and replaced with bar.
So back to your main question, you were basically there with your final attempt using the subshell -- except that you don't actually need the subshell:
while IFS=, read -a A; do
IFS=,; echo "${A[*]:1:2}"
done <<< alpha,bravo,charlie
outputs:
bravo,charlie
For completeness, here's a final example that reads in multiple lines and uses a different output separator to demonstrate that the different IFS assignments aren't stomping on each other:
while IFS=, read -a A; do
IFS=:; echo "${A[*]:1:2}"
done < <(echo -e 'alpha,bravo,charlie\nfoo,bar,baz')
outputs:
bravo:charlie
bar:baz
The answer is a bit simpler than the other answers are presenting:
$ foo=bar
$ foo=qux echo $foo
bar
We see "bar" because the shell expands $foo before setting foo=qux
Simple Command Expansion -- there's a lot to get through here, so bear with me...
When a simple command is executed, the shell performs the following expansions, assignments, and redirections, from left to right.
The words that the parser has marked as variable assignments (those preceding the command name) and redirections are saved for later processing.
The words that are not variable assignments or redirections are expanded (see Shell Expansions). If any words remain after expansion, the first word is taken to be the name of the command and the remaining words are the arguments.
Redirections are performed as described above (see Redirections).
The text after the ‘=’ in each variable assignment undergoes tilde expansion, parameter expansion, command substitution, arithmetic expansion, and quote removal before being assigned to the variable.
If no command name results, the variable assignments affect the current shell environment. Otherwise, the variables are added to the environment of the executed command and do not affect the current shell environment. If any of the assignments attempts to assign a value to a readonly variable, an error occurs, and the command exits with a non-zero status.
If no command name results, redirections are performed, but do not affect the current shell environment. A redirection error causes the command to exit with a non-zero status.
If there is a command name left after expansion, execution proceeds as described below. Otherwise, the command exits. If one of the expansions contained a command substitution, the exit status of the command is the exit status of the last command substitution performed. If there were no command substitutions, the command exits with a status of zero.
So:
the shell sees foo=qux and saves that for later
the shell sees $foo and expands it to "bar"
then we now have: foo=qux echo bar
Once you really understand the order that bash does things, a lot of the mystery goes away.
Short answer: the effects of changing IFS are complex and hard to understand, and best avoided except for a few well-defined idioms (IFS=, read ... is one of the idioms I consider ok).
Long answer: There are a couple of things you need to keep in mind in order to understand the results you're seeing from changes to IFS:
Using IFS=something as a prefix to a command changes IFS only for that one command's execution. In particular, it does not affect how the shell parses the arguments to be passed to that command; that's controlled by the shell's value of IFS, not the one used for the command's execution.
Some commands pay attention to the value of IFS they're executed with (e.g. read), but others don't (e.g. echo).
Given the above, IFS=, read -a A does what you'd expect, it splits its input on ",":
$ IFS=, read -a A <<<"alpha,bravo,charlie"
$ declare -p A
declare -a A='([0]="alpha" [1]="bravo" [2]="charlie")'
But echo pays no attention; it always puts spaces between the arguments it's passed, so using IFS=something as a prefix to it has no effect at all:
$ echo alpha bravo
alpha bravo
$ IFS=, echo alpha bravo
alpha bravo
So when you use IFS=, echo "${A[*]:1:2}", it's equivalent to just echo "${A[*]:1:2}", and since the shell's definition of IFS starts with space, it puts the elements of A together with spaces between them. So it's equivalent to running IFS=, echo "alpha bravo".
On the other hand, IFS=,; echo "${A[*]:1:2}" changes the shell's definition of IFS, so it does affect how the shell puts the elements together, so it comes out equivalent to IFS=, echo "alpha,bravo". Unfortunately, it also affects everything else from that point on so you either have to isolate it to a subshell or set it back to normal afterward.
Just for completeness, here are a couple of other versions that don't work:
$ IFS=,; echo "${A[#]:1:2}"
bravo charlie
In this case, the [#] tells the shell to treat each element of the array as a separate argument, so it's left to echo to merge them, and it ignores IFS and always uses spaces.
$ IFS=,; echo "${A[#]:1:2}"
bravo charlie
So how about this:
$ IFS=,; echo ${A[*]:1:2}
bravo charlie
In this case, the [*] tells the shell to mash all elements together with the first character of IFS between them, giving bravo,charlie. But it's not in double-quotes, so the shell immediately re-splits it on ",", splitting it back into separate arguments again (and then echo joins them with spaces as always).
If you want to change the shell's definition of IFS without having to isolate it to a subshell, there are a few options to change it and set it back afterward. In bash, you can set it back to normal like this:
$ IFS=,
$ while read -a A; do # Note: IFS change not needed here; it's already changed
> echo "${A[*]:1:2}"
> done <<<alpha,bravo,charlie
bravo,charlie
$ IFS=$' \t\n'
But the $'...' syntax isn't available in all shells; if you need portability it's best to use literal characters:
IFS='
' # You can't see it, but there's a literal space and tab after the first '
Some people prefer to use unset IFS, which just forces the shell to its default behavior, which is pretty much the same as with IFS defined in the normal way.
...but if IFS had been changed in some larger context, and you don't want to mess that up, you need to save it and then set it back. If it's been changed normally, this'll work:
saveIFS=$IFS
...
IFS=$saveIFS
...but if someone thought it was a good idea to use unset IFS, this will define it as blank, giving weird results. So you can use this approach or the unset approach, but not both. If you want to make this robust against the unset conflict, you can use something like this in bash:
saveIFS=${IFS:-$' \t\n'}
...or for portability, leave off the $' ' and use literal space+tab+newline:
saveIFS=${IFS:-
} # Again, there's an invisible space and tab at the end of the first line
All in all, it's a lot of mess full of traps for the unwary. I recommend avoiding it whenever possible.
The shell has a great feature, where it'll preserve argument quoting across variable expansion when you use "$#", such that the script:
for f in "$#"; do echo "$f"; done
when invoked with arguments:
"with spaces" '$and $(metachars)'
will print, literally:
with spaces
$and $(metachars)
This isn't the normal behaviour of expansion of a quoted string, it seems to be a special case for "$#".
Is there any way to get this behaviour for other variables? In the specific case I'm interested in, I want to safely expand $SSH_ORIGINAL_COMMAND in a command= specifier in a restricted public key entry, without having to worry about spaces in arguments, metacharacters, etc.
"$SSH_ORIGINAL_COMMAND" expands like "$*" would, i.e. a naïve expansion that doesn't add any quoting around separate arguments.
Is the information required for "$#" style expansion simply not available to the shell in this case, by the time it gets the env var SSH_ORIGINAL_COMMAND? So I'd instead need to convince sshd to quote the arguments?
The answer to this question is making me wonder if it's possible at all.
You can get similar "quoted dollar-at" behavior for arbitrary arrays using "${YOUR_ARRAY_HERE[#]}" syntax for bash arrays. Of course, that's no complete answer, because you still have to break the string into multiple array elements according to the quotes.
One thought was to use bash -x, which renders expanded output, but only if you actually run the command; it doesn't work with -n, which prevents you from actually executing the commands in question. Likewise you could use eval or bash -c along with set -- to manage the quote removal, performing expansion on the outer shell and quote removal on the inner shell, but that would be extremely hard to bulletproof against executing arbitrary code.
As an end run, use xargs instead. xargs handles single and double quotes. This is a very imperfect solution, because xargs treats backslash-escaped characters very differently than bash does and fails entirely to handle semicolons and so forth, but if your input is relatively predictable it gets you most of the way there without forcing you to write a full shell parser.
SSH_ORIGINAL_COMMAND='foo "bar baz" $quux'
# Build out the parsed array.
# Bash 4 users may be able to do this with readarray or mapfile instead.
# You may also choose to null-terminate if newlines matter.
COMMAND_ARRAY=()
while read line; do
COMMAND_ARRAY+=("$line")
done < <(xargs -n 1 <<< "$SSH_ORIGINAL_COMMAND")
# Demonstrate working with the array.
N=0
for arg in "${COMMAND_ARRAY[#]}"; do
echo "COMMAND_ARRAY[$N]: $arg"
((N++))
done
Output:
COMMAND_ARRAY[0]: foo
COMMAND_ARRAY[1]: bar baz
COMMAND_ARRAY[2]: $quux
The $# variable seems to maintain quoting around its arguments so that, for example:
$ function foo { for i in "$#"; do echo $i; done }
$ foo herp "hello world" derp
herp
hello world
derp
I am also aware that bash arrays, work the same way:
$ a=(herp "hello world" derp)
$ for i in "${a[#]}"; do echo $i; done
herp
hello world
derp
What is actually going on with variables like this? Particularly when I add something to the quote like "duck ${a[#]} goose". If its not space separated what is it?
Usually, double quotation marks in Bash mean "make everything between the quotation marks one word, even if it has separators in it." But as you've noticed, $# behaves differently when it's within double quotes. This is actually a parsing hack that dates back to Bash's predecessor, the Bourne shell, and this special behavior applies only to this particular variable.
Without this hack (I use the term because it seems inconsistent from a language perspective, although it's very useful), it would be difficult for a shell script to pass along its array of arguments to some other command that wants the same arguments. Some of those arguments might have spaces in them, but how would it pass them to another command without the shell either lumping them together as one big word or reparsing the list and splitting the arguments that have whitespace?
Well, you could pass an array of arguments, and the Bourne shell really only has one array, represented by $* or $#, whose number of elements is $# and whose elements are $1, $2, etc, the so-called positional parameters.
An example. Suppose you have three files in the current directory, named aaa, bbb, and cc c (the third file has a space in the name). You can initialize the array (that is, you can set the positional parameters) to be the names of the files in the current directory like this:
set -- *
Now the array of positional parameters holds the names of the files. $#, the number of elements, is three:
$ echo $#
3
And we can iterate over the position parameters in a few different ways.
1) We can use $*:
$ for file in $*; do
> echo "$file"
> done
but that re-separates the arguments on whitespace and calls echo four times:
aaa
bbb
cc
c
2) Or we could put quotation marks around $*:
$ for file in "$*"; do
> echo "$file"
> done
but that groups the whole array into one argument and calls echo just once:
aaa bbb cc c
3) Or we could use $# which represents the same array but behaves differently in double quotes:
$ for file in "$#"; do
> echo "$file"
> done
will produce
aaa
bbb
cc c
because $1 = "aaa", $2 = "bbb", and $3 = "cc c" and "$#" leaves the elements intact. If you leave off the quotation marks around $#, the shell will flatten and re-parse the array, echo will be called four times, and you'll get the same thing you got with a bare $*.
This is especially useful in a shell script, where the positional parameters are the arguments that were passed to your script. To pass those same arguments to some other command -- without the shell resplitting them on whitespace -- use "$#".
# Truncate the files specified by the args
rm "$#"
touch "$#"
In Bourne, this behavior only applies to the positional parameters because it's really the only array supported by the language. But you can create other arrays in Bash, and you can even apply the old parsing hack to those arrays using the special "${ARRAYNAME[#]}" syntax, whose at-sign feels almost like a wink to Mr. Bourne:
$ declare -a myarray
$ myarray[0]=alpha
$ myarray[1]=bravo
$ myarray[2]="char lie"
$ for file in "${myarray[#]}"; do echo "$file"; done
alpha
bravo
char lie
Oh, and about your last example, what should the shell do with "pre $# post" where you have $# within double quotes but you have other stuff in there, too? Recent versions of Bash preserve the array, prepend the text before the $# to the first array element, and append the text after the $# to the last element:
pre aaa
bb
cc c post
This question already has answers here:
Propagate all arguments in a Bash shell script
(12 answers)
Closed 3 years ago.
Let's say I have a function abc() that will handle the logic related to analyzing the arguments passed to my script.
How can I pass all arguments my Bash script has received to abc()? The number of arguments is variable, so I can't just hard-code the arguments passed like this:
abc $1 $2 $3 $4
Better yet, is there any way for my function to have access to the script arguments' variables?
The $# variable expands to all command-line parameters separated by spaces. Here is an example.
abc "$#"
When using $#, you should (almost) always put it in double-quotes to avoid misparsing of arguments containing spaces or wildcards (see below). This works for multiple arguments. It is also portable to all POSIX-compliant shells.
It is also worth noting that $0 (generally the script's name or path) is not in $#.
The Bash Reference Manual Special Parameters Section says that $# expands to the positional parameters starting from one. When the expansion occurs within double quotes, each parameter expands to a separate word. That is "$#" is equivalent to "$1" "$2" "$3"....
Passing some arguments:
If you want to pass all but the first arguments, you can first use shift to "consume" the first argument and then pass "$#" to pass the remaining arguments to another command. In Bash (and zsh and ksh, but not in plain POSIX shells like dash), you can do this without messing with the argument list using a variant of array slicing: "${#:3}" will get you the arguments starting with "$3". "${#:3:4}" will get you up to four arguments starting at "$3" (i.e. "$3" "$4" "$5" "$6"), if that many arguments were passed.
Things you probably don't want to do:
"$*" gives all of the arguments stuck together into a single string (separated by spaces, or whatever the first character of $IFS is). This looses the distinction between spaces within arguments and the spaces between arguments, so is generally a bad idea. Although it might be ok for printing the arguments, e.g. echo "$*", provided you don't care about preserving the space within/between distinction.
Assigning the arguments to a regular variable (as in args="$#") mashes all the arguments together like "$*" does. If you want to store the arguments in a variable, use an array with args=("$#") (the parentheses make it an array), and then reference them as e.g. "${args[0]}" etc. Note that in Bash and ksh, array indexes start at 0, so $1 will be in args[0], etc. zsh, on the other hand, starts array indexes at 1, so $1 will be in args[1]. And more basic shells like dash don't have arrays at all.
Leaving off the double-quotes, with either $# or $*, will try to split each argument up into separate words (based on whitespace or whatever's in $IFS), and also try to expand anything that looks like a filename wildcard into a list of matching filenames. This can have really weird effects, and should almost always be avoided. (Except in zsh, where this expansion doesn't take place by default.)
I needed a variation on this, which I expect will be useful to others:
function diffs() {
diff "${#:3}" <(sort "$1") <(sort "$2")
}
The "${#:3}" part means all the members of the array starting at 3. So this function implements a sorted diff by passing the first two arguments to diff through sort and then passing all other arguments to diff, so you can call it similarly to diff:
diffs file1 file2 [other diff args, e.g. -y]
Use the $# variable, which expands to all command-line parameters separated by spaces.
abc "$#"
Here's a simple script:
#!/bin/bash
args=("$#")
echo Number of arguments: $#
echo 1st argument: ${args[0]}
echo 2nd argument: ${args[1]}
$# is the number of arguments received by the script. I find easier to access them using an array: the args=("$#") line puts all the arguments in the args array. To access them use ${args[index]}.
It's worth mentioning that you can specify argument ranges with this syntax.
function example() {
echo "line1 ${#:1:1}"; #First argument
echo "line2 ${#:2:1}"; #Second argument
echo "line3 ${#:3}"; #Third argument onwards
}
I hadn't seen it mentioned.
abc "$#" is generally the correct answer.
But I was trying to pass a parameter through to an su command, and no amount of quoting could stop the error su: unrecognized option '--myoption'. What actually worked for me was passing all the arguments as a single string :
abc "$*"
My exact case (I'm sure someone else needs this) was in my .bashrc
# run all aws commands as Jenkins user
aws ()
{
sudo su jenkins -c "aws $*"
}
abc "$#"
$# represents all the parameters given to your bash script.