Bash empty array expansion with `set -u` - bash

I'm writing a bash script which has set -u, and I have a problem with empty array expansion: bash appears to treat an empty array as an unset variable during expansion:
$ set -u
$ arr=()
$ echo "foo: '${arr[#]}'"
bash: arr[#]: unbound variable
(declare -a arr doesn't help either.)
A common solution to this is to use ${arr[#]-} instead, thus substituting an empty string instead of the ("undefined") empty array. However this is not a good solution, since now you can't discern between an array with a single empty string in it and an empty array. (#-expansion is special in bash, it expands "${arr[#]}" into "${arr[0]}" "${arr[1]}" …, which makes it a perfect tool for building command lines.)
$ countArgs() { echo $#; }
$ countArgs a b c
3
$ countArgs
0
$ countArgs ""
1
$ brr=("")
$ countArgs "${brr[#]}"
1
$ countArgs "${arr[#]-}"
1
$ countArgs "${arr[#]}"
bash: arr[#]: unbound variable
$ set +u
$ countArgs "${arr[#]}"
0
So is there a way around that problem, other than checking the length of an array in an if (see code sample below), or turning off -u setting for that short piece?
if [ "${#arr[#]}" = 0 ]; then
veryLongCommandLine
else
veryLongCommandLine "${arr[#]}"
fi
Update: Removed bugs tag due to explanation by ikegami.

According to the documentation,
An array variable is considered set if a subscript has been assigned a value. The null string is a valid value.
No subscript has been assigned a value, so the array isn't set.
But while the documentation suggests an error is appropriate here, this is no longer the case since 4.4.
$ bash --version | head -n 1
GNU bash, version 4.4.19(1)-release (x86_64-pc-linux-gnu)
$ set -u
$ arr=()
$ echo "foo: '${arr[#]}'"
foo: ''
There is a conditional you can use inline to achieve what you want in older versions: Use ${arr[#]+"${arr[#]}"} instead of "${arr[#]}".
$ function args { perl -E'say 0+#ARGV; say "$_: $ARGV[$_]" for 0..$#ARGV' -- "$#" ; }
$ set -u
$ arr=()
$ args "${arr[#]}"
-bash: arr[#]: unbound variable
$ args ${arr[#]+"${arr[#]}"}
0
$ arr=("")
$ args ${arr[#]+"${arr[#]}"}
1
0:
$ arr=(a b c)
$ args ${arr[#]+"${arr[#]}"}
3
0: a
1: b
2: c
Tested with bash 4.2.25 and 4.3.11.

The only safe idiom is ${arr[#]+"${arr[#]}"}
Unless you only care about Bash 4.4+, but you wouldn't be looking at this question if that were the case :)
This is already the recommendation in ikegami's answer, but there's a lot of misinformation and guesswork in this thread. Other patterns, such as ${arr[#]-} or ${arr[#]:0}, are not safe across all major versions of Bash.
As the table below shows, the only expansion that is reliable across all modern-ish Bash versions is ${arr[#]+"${arr[#]}"} (column +"). Of note, several other expansions fail in Bash 4.2, including (unfortunately) the shorter ${arr[#]:0} idiom, which doesn't just produce an incorrect result but actually fails. If you need to support versions prior to 4.4, and in particular 4.2, this is the only working idiom.
Unfortunately other + expansions that, at a glance, look the same do indeed emit different behavior. Using :+ instead of + (:+" in the table), for example, does not work because :-expansion treats an array with a single empty element (('')) as "null" and thus doesn't (consistently) expand to the same result.
Quoting the full expansion instead of the nested array ("${arr[#]+${arr[#]}}", "+ in the table), which I would have expected to be roughly equivalent, is similarly unsafe in 4.2.
You can see the code that generated this data along with results for several additional version of bash in this gist.

#ikegami's accepted answer is subtly wrong! The correct incantation is ${arr[#]+"${arr[#]}"}:
$ countArgs () { echo "$#"; }
$ arr=('')
$ countArgs "${arr[#]:+${arr[#]}}"
0 # WRONG
$ countArgs ${arr[#]+"${arr[#]}"}
1 # RIGHT
$ arr=()
$ countArgs ${arr[#]+"${arr[#]}"}
0 # Let's make sure it still works for the other case...

Turns out array handling has been changed in recently released (2016/09/16) bash 4.4 (available in Debian stretch, for example).
$ bash --version | head -n1
bash --version | head -n1
GNU bash, version 4.4.0(1)-release (x86_64-pc-linux-gnu)
Now empty arrays expansion does not emits warning
$ set -u
$ arr=()
$ echo "${arr[#]}"
$ # everything is fine

this may be another option for those who prefer not to duplicate arr[#] and are okay to have an empty string
echo "foo: '${arr[#]:-}'"
to test:
set -u
arr=()
echo a "${arr[#]:-}" b # note two spaces between a and b
for f in a "${arr[#]:-}" b; do echo $f; done # note blank line between a and b
arr=(1 2)
echo a "${arr[#]:-}" b
for f in a "${arr[#]:-}" b; do echo $f; done

#ikegami's answer is correct, but I consider the syntax ${arr[#]+"${arr[#]}"} dreadful. If you use long array variable names, it starts to looks spaghetti-ish quicker than usual.
Try this instead:
$ set -u
$ count() { echo $# ; } ; count x y z
3
$ count() { echo $# ; } ; arr=() ; count "${arr[#]}"
-bash: abc[#]: unbound variable
$ count() { echo $# ; } ; arr=() ; count "${arr[#]:0}"
0
$ count() { echo $# ; } ; arr=(x y z) ; count "${arr[#]:0}"
3
It looks like the Bash array slice operator is very forgiving.
So why did Bash make handling the edge case of arrays so difficult? Sigh. I cannot guarantee you version will allow such abuse of the array slice operator, but it works dandy for me.
Caveat: I am using GNU bash, version 3.2.25(1)-release (x86_64-redhat-linux-gnu)
Your mileage may vary.

"Interesting" inconsistency indeed.
Furthermore,
$ set -u
$ echo $#
0
$ echo "$1"
bash: $1: unbound variable # makes sense (I didn't set any)
$ echo "$#" | cat -e
$ # blank line, no error
While I agree that the current behavior may not be a bug in the sense that #ikegami explains, IMO we could say the bug is in the definition (of "set") itself, and/or the fact that it's inconsistently applied. The preceding paragraph in the man page says
... ${name[#]} expands each element of name to a separate word. When there are no array members, ${name[#]} expands to nothing.
which is entirely consistent with what it says about the expansion of positional parameters in "$#". Not that there aren't other inconsistencies in the behaviors of arrays and positional parameters... but to me there's no hint that this detail should be inconsistent between the two.
Continuing,
$ arr=()
$ echo "${arr[#]}"
bash: arr[#]: unbound variable # as we've observed. BUT...
$ echo "${#arr[#]}"
0 # no error
$ echo "${!arr[#]}" | cat -e
$ # no error
So arr[] isn't so unbound that we can't get a count of its elements (0), or a (empty) list of its keys? To me these are sensible, and useful -- the only outlier seems to be the ${arr[#]} (and ${arr[*]}) expansion.

I am complementing on #ikegami's (accepted) and #kevinarpe's (also good) answers.
You can do "${arr[#]:+${arr[#]}}" to workaround the problem. The right-hand-side (i.e., after :+) provides an expression that will be used in case the left-hand-side is not defined/null.
The syntax is arcane. Note that the right hand side of the expression will undergo parameter expansion, so extra attention should be paid to having consistent quoting.
: example copy arr into arr_copy
arr=( "1 2" "3" )
arr_copy=( "${arr[#]:+${arr[#]}}" ) # good. same quoting.
# preserves spaces
arr_copy=( ${arr[#]:+"${arr[#]}"} ) # bad. quoting only on RHS.
# copy will have ["1","2","3"],
# instead of ["1 2", "3"]
Like #kevinarpe mentions, a less arcane syntax is to use the array slice notation ${arr[#]:0} (on Bash versions >= 4.4), which expands to all the parameters, starting from index 0. It also doesn't require as much repetition. This expansion works regardless of set -u, so you can use this at all times. The man page says (under Parameter Expansion):
${parameter:offset}
${parameter:offset:length}
...
If parameter is an indexed array name subscripted by # or *, the result is the length members of the array beginning with ${parameter[offset]}. A negative offset is taken relative to one
greater than the maximum index of the specified array. It is an
expansion error if length evaluates to a number less than zero.
This is the example provided by #kevinarpe, with alternate formatting to place the output in evidence:
set -u
function count() { echo $# ; };
(
count x y z
)
: prints "3"
(
arr=()
count "${arr[#]}"
)
: prints "-bash: arr[#]: unbound variable"
(
arr=()
count "${arr[#]:0}"
)
: prints "0"
(
arr=(x y z)
count "${arr[#]:0}"
)
: prints "3"
This behaviour varies with versions of Bash. You may also have noticed that the length operator ${#arr[#]} will always evaluate to 0 for empty arrays, regardless of set -u, without causing an 'unbound variable error'.

Here are a couple of ways to do something like this, one using sentinels
and another using conditional appends:
#!/bin/bash
set -o nounset -o errexit -o pipefail
countArgs () { echo "$#"; }
arrA=( sentinel )
arrB=( sentinel "{1..5}" "./*" "with spaces" )
arrC=( sentinel '$PWD' )
cmnd=( countArgs "${arrA[#]:1}" "${arrB[#]:1}" "${arrC[#]:1}" )
echo "${cmnd[#]}"
"${cmnd[#]}"
arrA=( )
arrB=( "{1..5}" "./*" "with spaces" )
arrC=( '$PWD' )
cmnd=( countArgs )
# Checks expansion of indices.
[[ ! ${!arrA[#]} ]] || cmnd+=( "${arrA[#]}" )
[[ ! ${!arrB[#]} ]] || cmnd+=( "${arrB[#]}" )
[[ ! ${!arrC[#]} ]] || cmnd+=( "${arrC[#]}" )
echo "${cmnd[#]}"
"${cmnd[#]}"

Now, as technically right the "${arr[#]+"${arr[#]}"}" version is, you never want to use this syntax for appending to an array, ever!
This is, as this syntax actually expands the array and then appends.
And that means that there is a lot going on computational- and memory-wise!
To show this, I made a simple comparison:
# cat array_perftest_expansion.sh
#! /usr/bin/bash
set -e
set -u
loops=$1
arr=()
i=0
while [ $i -lt $loops ] ; do
arr=( ${arr[#]+"${arr[#]}"} "${i}" )
#arr=arr[${#arr[#]}]="${i}"
i=$(( i + 1 ))
done
exit 0
And then:
# timex ./array_perftest_expansion.sh 1000
real 1.86
user 1.84
sys 0.01
But with the second line enabled instead, just setting the last entry directly:
arr=arr[${#arr[#]}]="${i}"
# timex ./array_perftest_last.sh 1000
real 0.03
user 0.02
sys 0.00
If that is not enough, things get much worse, when you try to add more entries!
When using 4000 instead of 1000 loops:
# timex ./array_perftest_expansion.sh 4000
real 33.13
user 32.90
sys 0.22
Just setting the last entry:
# timex ./array_perftest_last.sh 4000
real 0.10
user 0.09
sys 0.00
And this gets worse and worse ... I could not wait for the expansion version to finish a loop of 10000!
With the last element instead:
# timex ./array_perftest_last.sh 10000
real 0.26
user 0.25
sys 0.01
Never use such an array expansion for any reason.

Interesting inconsistency; this lets you define something which is "not considered set" yet shows up in the output of declare -p
arr=()
set -o nounset
echo ${arr[#]}
=> -bash: arr[#]: unbound variable
declare -p arr
=> declare -a arr='()'
UPDATE: as others mentioned, fixed in 4.4 released after this answer was posted.

The most simple and compatible way seems to be:
$ set -u
$ arr=()
$ echo "foo: '${arr[#]-}'"

Related

Reading a particular Digit from a given number in shell [duplicate]

I have a string in a Bash shell script that I want to split into an array of characters, not based on a delimiter but just one character per array index. How can I do this? Ideally it would not use any external programs. Let me rephrase that. My goal is portability, so things like sed that are likely to be on any POSIX compatible system are fine.
Try
echo "abcdefg" | fold -w1
Edit: Added a more elegant solution suggested in comments.
echo "abcdefg" | grep -o .
You can access each letter individually already without an array conversion:
$ foo="bar"
$ echo ${foo:0:1}
b
$ echo ${foo:1:1}
a
$ echo ${foo:2:1}
r
If that's not enough, you could use something like this:
$ bar=($(echo $foo|sed 's/\(.\)/\1 /g'))
$ echo ${bar[1]}
a
If you can't even use sed or something like that, you can use the first technique above combined with a while loop using the original string's length (${#foo}) to build the array.
Warning: the code below does not work if the string contains whitespace. I think Vaughn Cato's answer has a better chance at surviving with special chars.
thing=($(i=0; while [ $i -lt ${#foo} ] ; do echo ${foo:$i:1} ; i=$((i+1)) ; done))
As an alternative to iterating over 0 .. ${#string}-1 with a for/while loop, there are two other ways I can think of to do this with only bash: using =~ and using printf. (There's a third possibility using eval and a {..} sequence expression, but this lacks clarity.)
With the correct environment and NLS enabled in bash these will work with non-ASCII as hoped, removing potential sources of failure with older system tools such as sed, if that's a concern. These will work from bash-3.0 (released 2005).
Using =~ and regular expressions, converting a string to an array in a single expression:
string="wonkabars"
[[ "$string" =~ ${string//?/(.)} ]] # splits into array
printf "%s\n" "${BASH_REMATCH[#]:1}" # loop free: reuse fmtstr
declare -a arr=( "${BASH_REMATCH[#]:1}" ) # copy array for later
The way this works is to perform an expansion of string which substitutes each single character for (.), then match this generated regular expression with grouping to capture each individual character into BASH_REMATCH[]. Index 0 is set to the entire string, since that special array is read-only you cannot remove it, note the :1 when the array is expanded to skip over index 0, if needed.
Some quick testing for non-trivial strings (>64 chars) shows this method is substantially faster than one using bash string and array operations.
The above will work with strings containing newlines, =~ supports POSIX ERE where . matches anything except NUL by default, i.e. the regex is compiled without REG_NEWLINE. (The behaviour of POSIX text processing utilities is allowed to be different by default in this respect, and usually is.)
Second option, using printf:
string="wonkabars"
ii=0
while printf "%s%n" "${string:ii++:1}" xx; do
((xx)) && printf "\n" || break
done
This loop increments index ii to print one character at a time, and breaks out when there are no characters left. This would be even simpler if the bash printf returned the number of character printed (as in C) rather than an error status, instead the number of characters printed is captured in xx using %n. (This works at least back as far as bash-2.05b.)
With bash-3.1 and printf -v var you have slightly more flexibility, and can avoid falling off the end of the string should you be doing something other than printing the characters, e.g. to create an array:
declare -a arr
ii=0
while printf -v cc "%s%n" "${string:(ii++):1}" xx; do
((xx)) && arr+=("$cc") || break
done
If your string is stored in variable x, this produces an array y with the individual characters:
i=0
while [ $i -lt ${#x} ]; do y[$i]=${x:$i:1}; i=$((i+1));done
The most simple, complete and elegant solution:
$ read -a ARRAY <<< $(echo "abcdefg" | sed 's/./& /g')
and test
$ echo ${ARRAY[0]}
a
$ echo ${ARRAY[1]}
b
Explanation: read -a reads the stdin as an array and assigns it to the variable ARRAY treating spaces as delimiter for each array item.
The evaluation of echoing the string to sed just add needed spaces between each character.
We are using Here String (<<<) to feed the stdin of the read command.
I have found that the following works the best:
array=( `echo string | grep -o . ` )
(note the backticks)
then if you do: echo ${array[#]} ,
you get: s t r i n g
or: echo ${array[2]} ,
you get: r
Pure Bash solution with no loop:
#!/usr/bin/env bash
str='The quick brown fox jumps over a lazy dog.'
# Need extglob for the replacement pattern
shopt -s extglob
# Split string characters into array (skip first record)
# Character 037 is the octal representation of ASCII Record Separator
# so it can capture all other characters in the string, including spaces.
IFS= mapfile -s1 -t -d $'\37' array <<<"${str//?()/$'\37'}"
# Strip out captured trailing newline of here-string in last record
array[-1]="${array[-1]%?}"
# Debug print array
declare -p array
string=hello123
for i in $(seq 0 ${#string})
do array[$i]=${string:$i:1}
done
echo "zero element of array is [${array[0]}]"
echo "entire array is [${array[#]}]"
The zero element of array is [h]. The entire array is [h e l l o 1 2 3 ].
Yet another on :), the stated question simply says 'Split string into character array' and don't say much about the state of the receiving array, and don't say much about special chars like and control chars.
My assumption is that if I want to split a string into an array of chars I want the receiving array containing just that string and no left over from previous runs, yet preserve any special chars.
For instance the proposed solution family like
for (( i=0 ; i < ${#x} ; i++ )); do y[i]=${x:i:1}; done
Have left overs in the target array.
$ y=(1 2 3 4 5 6 7 8)
$ x=abc
$ for (( i=0 ; i < ${#x} ; i++ )); do y[i]=${x:i:1}; done
$ printf '%s ' "${y[#]}"
a b c 4 5 6 7 8
Beside writing the long line each time we want to split a problem, so why not hide all this into a function we can keep is a package source file, with a API like
s2a "Long string" ArrayName
I got this one that seems to do the job.
$ s2a()
> { [ "$2" ] && typeset -n __=$2 && unset $2;
> [ "$1" ] && __+=("${1:0:1}") && s2a "${1:1}"
> }
$ a=(1 2 3 4 5 6 7 8 9 0) ; printf '%s ' "${a[#]}"
1 2 3 4 5 6 7 8 9 0
$ s2a "Split It" a ; printf '%s ' "${a[#]}"
S p l i t I t
If the text can contain spaces:
eval a=( $(echo "this is a test" | sed "s/\(.\)/'\1' /g") )
$ echo hello | awk NF=NF FS=
h e l l o
Or
$ echo hello | awk '$0=RT' RS=[[:alnum:]]
h
e
l
l
o
I know this is a "bash" question, but please let me show you the perfect solution in zsh, a shell very popular these days:
string='this is a string'
string_array=(${(s::)string}) #Parameter expansion. And that's it!
print ${(t)string_array} -> type array
print $#string_array -> 16 items
This is an old post/thread but with a new feature of bash v5.2+ using the shell option patsub_replacement and the =~ operator for regex. More or less same with #mr.spuratic post/answer.
str='There can be only one, the Highlander.'
regexp="${str//?/(&)}"
[[ "$str" =~ $regexp ]] &&
printf '%s\n' "${BASH_REMATCH[#]:1}"
Or by just: (which includes the whole string at index 0)
declare -p BASH_REMATCH
If that is not desired, one can remove the value of the first index (index 0), with
unset -v 'BASH_REMATCH[0]'
instead of using printf or echo to print the value of the array BASH_REMATCH
One can check/see the value of the variable "$regexp" with either
declare -p regexp
Output
declare -- regexp="(T)(h)(e)(r)(e)( )(c)(a)(n)( )(b)(e)( )(o)(n)(l)(y)( )(o)(n)(e)(,)( )(t)(h)(e)( )(H)(i)(g)(h)(l)(a)(n)(d)(e)(r)(.)"
or
echo "$regexp"
Using it in a script, one might want to test if the shopt is enabled or not, although the manual says it is on/enabled by default.
Something like.
if ! shopt -q patsub_replacement; then
shopt -s patsub_replacement
fi
But yeah, check the bash version too! If you're not sure which version of bash is in use.
if ! ((BASH_VERSINFO[0] >= 5 && BASH_VERSINFO[1] >= 2)); then
printf 'No dice! bash version 5.2+ is required!\n' >&2
exit 1
fi
Space can be excluded from regexp variable, change it from
regexp="${str//?/(&)}"
To
regexp="${str//[! ]/(&)}"
and the output is:
declare -- regexp="(T)(h)(e)(r)(e) (c)(a)(n) (b)(e) (o)(n)(l)(y) (o)(n)(e) (t)(h)(e) (H)(i)(g)(h)(l)(a)(n)(d)(e)(r)(.)"
Maybe not as efficient as the other post/answer but it is still a solution/option.
If you want to store this in an array, you can do this:
string=foo
unset chars
declare -a chars
while read -N 1
do
chars[${#chars[#]}]="$REPLY"
done <<<"$string"x
unset chars[$((${#chars[#]} - 1))]
unset chars[$((${#chars[#]} - 1))]
echo "Array: ${chars[#]}"
Array: f o o
echo "Array length: ${#chars[#]}"
Array length: 3
The final x is necessary to handle the fact that a newline is appended after $string if it doesn't contain one.
If you want to use NUL-separated characters, you can try this:
echo -n "$string" | while read -N 1
do
printf %s "$REPLY"
printf '\0'
done
AWK is quite convenient:
a='123'; echo $a | awk 'BEGIN{FS="";OFS=" "} {print $1,$2,$3}'
where FS and OFS is delimiter for read-in and print-out
For those who landed here searching how to do this in fish:
We can use the builtin string command (since v2.3.0) for string manipulation.
↪ string split '' abc
a
b
c
The output is a list, so array operations will work.
↪ for c in (string split '' abc)
echo char is $c
end
char is a
char is b
char is c
Here's a more complex example iterating over the string with an index.
↪ set --local chars (string split '' abc)
for i in (seq (count $chars))
echo $i: $chars[$i]
end
1: a
2: b
3: c
zsh solution: To put the scalar string variable into arr, which will be an array:
arr=(${(ps::)string})
If you also need support for strings with newlines, you can do:
str2arr(){ local string="$1"; mapfile -d $'\0' Chars < <(for i in $(seq 0 $((${#string}-1))); do printf '%s\u0000' "${string:$i:1}"; done); printf '%s' "(${Chars[*]#Q})" ;}
string=$(printf '%b' "apa\nbepa")
declare -a MyString=$(str2arr "$string")
declare -p MyString
# prints declare -a MyString=([0]="a" [1]="p" [2]="a" [3]=$'\n' [4]="b" [5]="e" [6]="p" [7]="a")
As a response to Alexandro de Oliveira, I think the following is more elegant or at least more intuitive:
while read -r -n1 c ; do arr+=("$c") ; done <<<"hejsan"
declare -r some_string='abcdefghijklmnopqrstuvwxyz'
declare -a some_array
declare -i idx
for ((idx = 0; idx < ${#some_string}; ++idx)); do
some_array+=("${some_string:idx:1}")
done
for idx in "${!some_array[#]}"; do
echo "$((idx)): ${some_array[idx]}"
done
Pure bash, no loop.
Another solution, similar to/adapted from Léa Gris' solution, but using read -a instead of readarray/mapfile :
#!/usr/bin/env bash
str='azerty'
# Need extglob for the replacement pattern
shopt -s extglob
# Split string characters into array
# ${str//?()/$'\x1F'} replace each character "c" with "^_c".
# ^_ (Control-_, 0x1f) is Unit Separator (US), you can choose another
# character.
IFS=$'\x1F' read -ra array <<< "${str//?()/$'\x1F'}"
# now, array[0] contains an empty string and the rest of array (starting
# from index 1) contains the original string characters :
declare -p array
# Or, if you prefer to keep the array "clean", you can delete
# the first element and pack the array :
unset array[0]
array=("${array[#]}")
declare -p array
However, I prefer the shorter (and easier to understand for me), where we remove the initial 0x1f before assigning the array :
#!/usr/bin/env bash
str='azerty'
shopt -s extglob
tmp="${str//?()/$'\x1F'}" # same as code above
tmp=${tmp#$'\x1F'} # remove initial 0x1f
IFS=$'\x1F' read -ra array <<< "$tmp" # assign array
declare -p array # verification

BASH - never execute unless environment variable is defined and certain value [duplicate]

I've got a few Unix shell scripts where I need to check that certain environment variables are set before I start doing stuff, so I do this sort of thing:
if [ -z "$STATE" ]; then
echo "Need to set STATE"
exit 1
fi
if [ -z "$DEST" ]; then
echo "Need to set DEST"
exit 1
fi
which is a lot of typing. Is there a more elegant idiom for checking that a set of environment variables is set?
EDIT: I should mention that these variables have no meaningful default value - the script should error out if any are unset.
Parameter Expansion
The obvious answer is to use one of the special forms of parameter expansion:
: ${STATE?"Need to set STATE"}
: ${DEST:?"Need to set DEST non-empty"}
Or, better (see section on 'Position of double quotes' below):
: "${STATE?Need to set STATE}"
: "${DEST:?Need to set DEST non-empty}"
The first variant (using just ?) requires STATE to be set, but STATE="" (an empty string) is OK — not exactly what you want, but the alternative and older notation.
The second variant (using :?) requires DEST to be set and non-empty.
If you supply no message, the shell provides a default message.
The ${var?} construct is portable back to Version 7 UNIX and the Bourne Shell (1978 or thereabouts). The ${var:?} construct is slightly more recent: I think it was in System III UNIX circa 1981, but it may have been in PWB UNIX before that. It is therefore in the Korn Shell, and in the POSIX shells, including specifically Bash.
It is usually documented in the shell's man page in a section called Parameter Expansion. For example, the bash manual says:
${parameter:?word}
Display Error if Null or Unset. If parameter is null or unset, the expansion of word (or a message to that effect if word is not present) is written to the standard error and the shell, if it is not interactive, exits. Otherwise, the value of parameter is substituted.
The Colon Command
I should probably add that the colon command simply has its arguments evaluated and then succeeds. It is the original shell comment notation (before '#' to end of line). For a long time, Bourne shell scripts had a colon as the first character. The C Shell would read a script and use the first character to determine whether it was for the C Shell (a '#' hash) or the Bourne shell (a ':' colon). Then the kernel got in on the act and added support for '#!/path/to/program' and the Bourne shell got '#' comments, and the colon convention went by the wayside. But if you come across a script that starts with a colon, now you will know why.
Position of double quotes
blong asked in a comment:
Any thoughts on this discussion? https://github.com/koalaman/shellcheck/issues/380#issuecomment-145872749
The gist of the discussion is:
… However, when I shellcheck it (with version 0.4.1), I get this message:
In script.sh line 13:
: ${FOO:?"The environment variable 'FOO' must be set and non-empty"}
^-- SC2086: Double quote to prevent globbing and word splitting.
Any advice on what I should do in this case?
The short answer is "do as shellcheck suggests":
: "${STATE?Need to set STATE}"
: "${DEST:?Need to set DEST non-empty}"
To illustrate why, study the following. Note that the : command doesn't echo its arguments (but the shell does evaluate the arguments). We want to see the arguments, so the code below uses printf "%s\n" in place of :.
$ mkdir junk
$ cd junk
$ > abc
$ > def
$ > ghi
$
$ x="*"
$ printf "%s\n" ${x:?You must set x} # Careless; not recommended
abc
def
ghi
$ unset x
$ printf "%s\n" ${x:?You must set x} # Careless; not recommended
bash: x: You must set x
$ printf "%s\n" "${x:?You must set x}" # Careful: should be used
bash: x: You must set x
$ x="*"
$ printf "%s\n" "${x:?You must set x}" # Careful: should be used
*
$ printf "%s\n" ${x:?"You must set x"} # Not quite careful enough
abc
def
ghi
$ x=
$ printf "%s\n" ${x:?"You must set x"} # Not quite careful enough
bash: x: You must set x
$ unset x
$ printf "%s\n" ${x:?"You must set x"} # Not quite careful enough
bash: x: You must set x
$
Note how the value in $x is expanded to first * and then a list of file names when the overall expression is not in double quotes. This is what shellcheck is recommending should be fixed. I have not verified that it doesn't object to the form where the expression is enclosed in double quotes, but it is a reasonable assumption that it would be OK.
Try this:
[ -z "$STATE" ] && echo "Need to set STATE" && exit 1;
Your question is dependent on the shell that you are using.
Bourne shell leaves very little in the way of what you're after.
BUT...
It does work, just about everywhere.
Just try and stay away from csh. It was good for the bells and whistles it added, compared the Bourne shell, but it is really creaking now. If you don't believe me, just try and separate out STDERR in csh! (-:
There are two possibilities here. The example above, namely using:
${MyVariable:=SomeDefault}
for the first time you need to refer to $MyVariable. This takes the env. var MyVariable and, if it is currently not set, assigns the value of SomeDefault to the variable for later use.
You also have the possibility of:
${MyVariable:-SomeDefault}
which just substitutes SomeDefault for the variable where you are using this construct. It doesn't assign the value SomeDefault to the variable, and the value of MyVariable will still be null after this statement is encountered.
Surely the simplest approach is to add the -u switch to the shebang (the line at the top of your script), assuming you’re using bash:
#!/bin/sh -u
This will cause the script to exit if any unbound variables lurk within.
${MyVariable:=SomeDefault}
If MyVariable is set and not null, it will reset the variable value (= nothing happens).
Else, MyVariable is set to SomeDefault.
The above will attempt to execute ${MyVariable}, so if you just want to set the variable do:
MyVariable=${MyVariable:=SomeDefault}
In my opinion the simplest and most compatible check for #!/bin/sh is:
if [ "$MYVAR" = "" ]
then
echo "Does not exist"
else
echo "Exists"
fi
Again, this is for /bin/sh and is compatible also on old Solaris systems.
bash 4.2 introduced the -v operator which tests if a name is set to any value, even the empty string.
$ unset a
$ b=
$ c=
$ [[ -v a ]] && echo "a is set"
$ [[ -v b ]] && echo "b is set"
b is set
$ [[ -v c ]] && echo "c is set"
c is set
I always used:
if [ "x$STATE" == "x" ]; then echo "Need to set State"; exit 1; fi
Not that much more concise, I'm afraid.
Under CSH you have $?STATE.
For future people like me, I wanted to go a step forward and parameterize the var name, so I can loop over a variable sized list of variable names:
#!/bin/bash
declare -a vars=(NAME GITLAB_URL GITLAB_TOKEN)
for var_name in "${vars[#]}"
do
if [ -z "$(eval "echo \$$var_name")" ]; then
echo "Missing environment variable $var_name"
exit 1
fi
done
We can write a nice assertion to check a bunch of variables all at once:
#
# assert if variables are set (to a non-empty string)
# if any variable is not set, exit 1 (when -f option is set) or return 1 otherwise
#
# Usage: assert_var_not_null [-f] variable ...
#
function assert_var_not_null() {
local fatal var num_null=0
[[ "$1" = "-f" ]] && { shift; fatal=1; }
for var in "$#"; do
[[ -z "${!var}" ]] &&
printf '%s\n' "Variable '$var' not set" >&2 &&
((num_null++))
done
if ((num_null > 0)); then
[[ "$fatal" ]] && exit 1
return 1
fi
return 0
}
Sample invocation:
one=1 two=2
assert_var_not_null one two
echo test 1: return_code=$?
assert_var_not_null one two three
echo test 2: return_code=$?
assert_var_not_null -f one two three
echo test 3: return_code=$? # this code shouldn't execute
Output:
test 1: return_code=0
Variable 'three' not set
test 2: return_code=1
Variable 'three' not set
More such assertions here: https://github.com/codeforester/base/blob/master/lib/assertions.sh
This can be a way too:
if (set -u; : $HOME) 2> /dev/null
...
...
http://unstableme.blogspot.com/2007/02/checks-whether-envvar-is-set-or-not.html
None of the above solutions worked for my purposes, in part because I checking the environment for an open-ended list of variables that need to be set before starting a lengthy process. I ended up with this:
mapfile -t arr < variables.txt
EXITCODE=0
for i in "${arr[#]}"
do
ISSET=$(env | grep ^${i}= | wc -l)
if [ "${ISSET}" = "0" ];
then
EXITCODE=-1
echo "ENV variable $i is required."
fi
done
exit ${EXITCODE}
Rather than using external shell scripts I tend to load in functions in my login shell. I use something like this as a helper function to check for environment variables rather than any set variable:
is_this_an_env_variable ()
local var="$1"
if env |grep -q "^$var"; then
return 0
else
return 1
fi
}
The $? syntax is pretty neat:
if [ $?BLAH == 1 ]; then
echo "Exists";
else
echo "Does not exist";
fi

Getting the last argument passed to a shell script

$1 is the first argument.
$# is all of them.
How can I find the last argument passed to a shell
script?
This is Bash-only:
echo "${#: -1}"
This is a bit of a hack:
for last; do true; done
echo $last
This one is also pretty portable (again, should work with bash, ksh and sh) and it doesn't shift the arguments, which could be nice.
It uses the fact that for implicitly loops over the arguments if you don't tell it what to loop over, and the fact that for loop variables aren't scoped: they keep the last value they were set to.
$ set quick brown fox jumps
$ echo ${*: -1:1} # last argument
jumps
$ echo ${*: -1} # or simply
jumps
$ echo ${*: -2:1} # next to last
fox
The space is necessary so that it doesn't get interpreted as a default value.
Note that this is bash-only.
The simplest answer for bash 3.0 or greater is
_last=${!#} # *indirect reference* to the $# variable
# or
_last=$BASH_ARGV # official built-in (but takes more typing :)
That's it.
$ cat lastarg
#!/bin/bash
# echo the last arg given:
_last=${!#}
echo $_last
_last=$BASH_ARGV
echo $_last
for x; do
echo $x
done
Output is:
$ lastarg 1 2 3 4 "5 6 7"
5 6 7
5 6 7
1
2
3
4
5 6 7
The following will work for you.
# is for array of arguments.
: means at
$# is the length of the array of arguments.
So the result is the last element:
${#:$#}
Example:
function afunction{
echo ${#:$#}
}
afunction -d -o local 50
#Outputs 50
Note that this is bash-only.
Use indexing combined with length of:
echo ${#:${##}}
Note that this is bash-only.
Found this when looking to separate the last argument from all the previous one(s).
Whilst some of the answers do get the last argument, they're not much help if you need all the other args as well. This works much better:
heads=${#:1:$#-1}
tail=${#:$#}
Note that this is bash-only.
This works in all POSIX-compatible shells:
eval last=\${$#}
Source: http://www.faqs.org/faqs/unix-faq/faq/part2/section-12.html
Here is mine solution:
pretty portable (all POSIX sh, bash, ksh, zsh) should work
does not shift original arguments (shifts a copy).
does not use evil eval
does not iterate through the whole list
does not use external tools
Code:
ntharg() {
shift $1
printf '%s\n' "$1"
}
LAST_ARG=`ntharg $# "$#"`
From oldest to newer solutions:
The most portable solution, even older sh (works with spaces and glob characters) (no loop, faster):
eval printf "'%s\n'" "\"\${$#}\""
Since version 2.01 of bash
$ set -- The quick brown fox jumps over the lazy dog
$ printf '%s\n' "${!#} ${#:(-1)} ${#: -1} ${#:~0} ${!#}"
dog dog dog dog dog
For ksh, zsh and bash:
$ printf '%s\n' "${#: -1} ${#:~0}" # the space beetwen `:`
# and `-1` is a must.
dog dog
And for "next to last":
$ printf '%s\n' "${#:~1:1}"
lazy
Using printf to workaround any issues with arguments that start with a dash (like -n).
For all shells and for older sh (works with spaces and glob characters) is:
$ set -- The quick brown fox jumps over the lazy dog "the * last argument"
$ eval printf "'%s\n'" "\"\${$#}\""
The last * argument
Or, if you want to set a last var:
$ eval last=\${$#}; printf '%s\n' "$last"
The last * argument
And for "next to last":
$ eval printf "'%s\n'" "\"\${$(($#-1))}\""
dog
If you are using Bash >= 3.0
echo ${BASH_ARGV[0]}
For bash, this comment suggested the very elegant:
echo "${#:$#}"
To silence shellcheck, use:
echo ${*:$#}
As a bonus, both also work in zsh.
shift `expr $# - 1`
echo "$1"
This shifts the arguments by the number of arguments minus 1, and returns the first (and only) remaining argument, which will be the last one.
I only tested in bash, but it should work in sh and ksh as well.
I found #AgileZebra's answer (plus #starfry's comment) the most useful, but it sets heads to a scalar. An array is probably more useful:
heads=( "${#: 1: $# - 1}" )
tail=${#:${##}}
Note that this is bash-only.
Edit: Removed unnecessary $(( )) according to #f-hauri's comment.
A solution using eval:
last=$(eval "echo \$$#")
echo $last
If you want to do it in a non-destructive way, one way is to pass all the arguments to a function and return the last one:
#!/bin/bash
last() {
if [[ $# -ne 0 ]] ; then
shift $(expr $# - 1)
echo "$1"
#else
#do something when no arguments
fi
}
lastvar=$(last "$#")
echo $lastvar
echo "$#"
pax> ./qq.sh 1 2 3 a b
b
1 2 3 a b
If you don't actually care about keeping the other arguments, you don't need it in a function but I have a hard time thinking of a situation where you would never want to keep the other arguments unless they've already been processed, in which case I'd use the process/shift/process/shift/... method of sequentially processing them.
I'm assuming here that you want to keep them because you haven't followed the sequential method. This method also handles the case where there's no arguments, returning "". You could easily adjust that behavior by inserting the commented-out else clause.
For tcsh:
set X = `echo $* | awk -F " " '{print $NF}'`
somecommand "$X"
I'm quite sure this would be a portable solution, except for the assignment.
After reading the answers above I wrote a Q&D shell script (should work on sh and bash) to run g++ on PGM.cpp to produce executable image PGM. It assumes that the last argument on the command line is the file name (.cpp is optional) and all other arguments are options.
#!/bin/sh
if [ $# -lt 1 ]
then
echo "Usage: `basename $0` [opt] pgm runs g++ to compile pgm[.cpp] into pgm"
exit 2
fi
OPT=
PGM=
# PGM is the last argument, all others are considered options
for F; do OPT="$OPT $PGM"; PGM=$F; done
DIR=`dirname $PGM`
PGM=`basename $PGM .cpp`
# put -o first so it can be overridden by -o specified in OPT
set -x
g++ -o $DIR/$PGM $OPT $DIR/$PGM.cpp
The following will set LAST to last argument without changing current environment:
LAST=$({
shift $(($#-1))
echo $1
})
echo $LAST
If other arguments are no longer needed and can be shifted it can be simplified to:
shift $(($#-1))
echo $1
For portability reasons following:
shift $(($#-1));
can be replaced with:
shift `expr $# - 1`
Replacing also $() with backquotes we get:
LAST=`{
shift \`expr $# - 1\`
echo $1
}`
echo $LAST
echo $argv[$#argv]
Now I just need to add some text because my answer was too short to post. I need to add more text to edit.
This is part of my copy function:
eval echo $(echo '$'"$#")
To use in scripts, do this:
a=$(eval echo $(echo '$'"$#"))
Explanation (most nested first):
$(echo '$'"$#") returns $[nr] where [nr] is the number of parameters. E.g. the string $123 (unexpanded).
echo $123 returns the value of 123rd parameter, when evaluated.
eval just expands $123 to the value of the parameter, e.g. last_arg. This is interpreted as a string and returned.
Works with Bash as of mid 2015.
To return the last argument of the most recently used command use the special parameter:
$_
In this instance it will work if it is used within the script before another command has been invoked.
#! /bin/sh
next=$1
while [ -n "${next}" ] ; do
last=$next
shift
next=$1
done
echo $last
Try the below script to find last argument
# cat arguments.sh
#!/bin/bash
if [ $# -eq 0 ]
then
echo "No Arguments supplied"
else
echo $* > .ags
sed -e 's/ /\n/g' .ags | tac | head -n1 > .ga
echo "Last Argument is: `cat .ga`"
fi
Output:
# ./arguments.sh
No Arguments supplied
# ./arguments.sh testing for the last argument value
Last Argument is: value
Thanks.
There is a much more concise way to do this. Arguments to a bash script can be brought into an array, which makes dealing with the elements much simpler. The script below will always print the last argument passed to a script.
argArray=( "$#" ) # Add all script arguments to argArray
arrayLength=${#argArray[#]} # Get the length of the array
lastArg=$((arrayLength - 1)) # Arrays are zero based, so last arg is -1
echo ${argArray[$lastArg]}
Sample output
$ ./lastarg.sh 1 2 buckle my shoe
shoe
Using parameter expansion (delete matched beginning):
args="$#"
last=${args##* }
It's also easy to get all before last:
prelast=${args% *}
$ echo "${*: -1}"
That will print the last argument
With GNU bash version >= 3.0:
num=$# # get number of arguments
echo "${!num}" # print last argument
Just use !$.
$ mkdir folder
$ cd !$ # will run: cd folder

What's a concise way to check that environment variables are set in a Unix shell script?

I've got a few Unix shell scripts where I need to check that certain environment variables are set before I start doing stuff, so I do this sort of thing:
if [ -z "$STATE" ]; then
echo "Need to set STATE"
exit 1
fi
if [ -z "$DEST" ]; then
echo "Need to set DEST"
exit 1
fi
which is a lot of typing. Is there a more elegant idiom for checking that a set of environment variables is set?
EDIT: I should mention that these variables have no meaningful default value - the script should error out if any are unset.
Parameter Expansion
The obvious answer is to use one of the special forms of parameter expansion:
: ${STATE?"Need to set STATE"}
: ${DEST:?"Need to set DEST non-empty"}
Or, better (see section on 'Position of double quotes' below):
: "${STATE?Need to set STATE}"
: "${DEST:?Need to set DEST non-empty}"
The first variant (using just ?) requires STATE to be set, but STATE="" (an empty string) is OK — not exactly what you want, but the alternative and older notation.
The second variant (using :?) requires DEST to be set and non-empty.
If you supply no message, the shell provides a default message.
The ${var?} construct is portable back to Version 7 UNIX and the Bourne Shell (1978 or thereabouts). The ${var:?} construct is slightly more recent: I think it was in System III UNIX circa 1981, but it may have been in PWB UNIX before that. It is therefore in the Korn Shell, and in the POSIX shells, including specifically Bash.
It is usually documented in the shell's man page in a section called Parameter Expansion. For example, the bash manual says:
${parameter:?word}
Display Error if Null or Unset. If parameter is null or unset, the expansion of word (or a message to that effect if word is not present) is written to the standard error and the shell, if it is not interactive, exits. Otherwise, the value of parameter is substituted.
The Colon Command
I should probably add that the colon command simply has its arguments evaluated and then succeeds. It is the original shell comment notation (before '#' to end of line). For a long time, Bourne shell scripts had a colon as the first character. The C Shell would read a script and use the first character to determine whether it was for the C Shell (a '#' hash) or the Bourne shell (a ':' colon). Then the kernel got in on the act and added support for '#!/path/to/program' and the Bourne shell got '#' comments, and the colon convention went by the wayside. But if you come across a script that starts with a colon, now you will know why.
Position of double quotes
blong asked in a comment:
Any thoughts on this discussion? https://github.com/koalaman/shellcheck/issues/380#issuecomment-145872749
The gist of the discussion is:
… However, when I shellcheck it (with version 0.4.1), I get this message:
In script.sh line 13:
: ${FOO:?"The environment variable 'FOO' must be set and non-empty"}
^-- SC2086: Double quote to prevent globbing and word splitting.
Any advice on what I should do in this case?
The short answer is "do as shellcheck suggests":
: "${STATE?Need to set STATE}"
: "${DEST:?Need to set DEST non-empty}"
To illustrate why, study the following. Note that the : command doesn't echo its arguments (but the shell does evaluate the arguments). We want to see the arguments, so the code below uses printf "%s\n" in place of :.
$ mkdir junk
$ cd junk
$ > abc
$ > def
$ > ghi
$
$ x="*"
$ printf "%s\n" ${x:?You must set x} # Careless; not recommended
abc
def
ghi
$ unset x
$ printf "%s\n" ${x:?You must set x} # Careless; not recommended
bash: x: You must set x
$ printf "%s\n" "${x:?You must set x}" # Careful: should be used
bash: x: You must set x
$ x="*"
$ printf "%s\n" "${x:?You must set x}" # Careful: should be used
*
$ printf "%s\n" ${x:?"You must set x"} # Not quite careful enough
abc
def
ghi
$ x=
$ printf "%s\n" ${x:?"You must set x"} # Not quite careful enough
bash: x: You must set x
$ unset x
$ printf "%s\n" ${x:?"You must set x"} # Not quite careful enough
bash: x: You must set x
$
Note how the value in $x is expanded to first * and then a list of file names when the overall expression is not in double quotes. This is what shellcheck is recommending should be fixed. I have not verified that it doesn't object to the form where the expression is enclosed in double quotes, but it is a reasonable assumption that it would be OK.
Try this:
[ -z "$STATE" ] && echo "Need to set STATE" && exit 1;
Your question is dependent on the shell that you are using.
Bourne shell leaves very little in the way of what you're after.
BUT...
It does work, just about everywhere.
Just try and stay away from csh. It was good for the bells and whistles it added, compared the Bourne shell, but it is really creaking now. If you don't believe me, just try and separate out STDERR in csh! (-:
There are two possibilities here. The example above, namely using:
${MyVariable:=SomeDefault}
for the first time you need to refer to $MyVariable. This takes the env. var MyVariable and, if it is currently not set, assigns the value of SomeDefault to the variable for later use.
You also have the possibility of:
${MyVariable:-SomeDefault}
which just substitutes SomeDefault for the variable where you are using this construct. It doesn't assign the value SomeDefault to the variable, and the value of MyVariable will still be null after this statement is encountered.
Surely the simplest approach is to add the -u switch to the shebang (the line at the top of your script), assuming you’re using bash:
#!/bin/sh -u
This will cause the script to exit if any unbound variables lurk within.
${MyVariable:=SomeDefault}
If MyVariable is set and not null, it will reset the variable value (= nothing happens).
Else, MyVariable is set to SomeDefault.
The above will attempt to execute ${MyVariable}, so if you just want to set the variable do:
MyVariable=${MyVariable:=SomeDefault}
In my opinion the simplest and most compatible check for #!/bin/sh is:
if [ "$MYVAR" = "" ]
then
echo "Does not exist"
else
echo "Exists"
fi
Again, this is for /bin/sh and is compatible also on old Solaris systems.
bash 4.2 introduced the -v operator which tests if a name is set to any value, even the empty string.
$ unset a
$ b=
$ c=
$ [[ -v a ]] && echo "a is set"
$ [[ -v b ]] && echo "b is set"
b is set
$ [[ -v c ]] && echo "c is set"
c is set
I always used:
if [ "x$STATE" == "x" ]; then echo "Need to set State"; exit 1; fi
Not that much more concise, I'm afraid.
Under CSH you have $?STATE.
For future people like me, I wanted to go a step forward and parameterize the var name, so I can loop over a variable sized list of variable names:
#!/bin/bash
declare -a vars=(NAME GITLAB_URL GITLAB_TOKEN)
for var_name in "${vars[#]}"
do
if [ -z "$(eval "echo \$$var_name")" ]; then
echo "Missing environment variable $var_name"
exit 1
fi
done
We can write a nice assertion to check a bunch of variables all at once:
#
# assert if variables are set (to a non-empty string)
# if any variable is not set, exit 1 (when -f option is set) or return 1 otherwise
#
# Usage: assert_var_not_null [-f] variable ...
#
function assert_var_not_null() {
local fatal var num_null=0
[[ "$1" = "-f" ]] && { shift; fatal=1; }
for var in "$#"; do
[[ -z "${!var}" ]] &&
printf '%s\n' "Variable '$var' not set" >&2 &&
((num_null++))
done
if ((num_null > 0)); then
[[ "$fatal" ]] && exit 1
return 1
fi
return 0
}
Sample invocation:
one=1 two=2
assert_var_not_null one two
echo test 1: return_code=$?
assert_var_not_null one two three
echo test 2: return_code=$?
assert_var_not_null -f one two three
echo test 3: return_code=$? # this code shouldn't execute
Output:
test 1: return_code=0
Variable 'three' not set
test 2: return_code=1
Variable 'three' not set
More such assertions here: https://github.com/codeforester/base/blob/master/lib/assertions.sh
This can be a way too:
if (set -u; : $HOME) 2> /dev/null
...
...
http://unstableme.blogspot.com/2007/02/checks-whether-envvar-is-set-or-not.html
None of the above solutions worked for my purposes, in part because I checking the environment for an open-ended list of variables that need to be set before starting a lengthy process. I ended up with this:
mapfile -t arr < variables.txt
EXITCODE=0
for i in "${arr[#]}"
do
ISSET=$(env | grep ^${i}= | wc -l)
if [ "${ISSET}" = "0" ];
then
EXITCODE=-1
echo "ENV variable $i is required."
fi
done
exit ${EXITCODE}
Rather than using external shell scripts I tend to load in functions in my login shell. I use something like this as a helper function to check for environment variables rather than any set variable:
is_this_an_env_variable ()
local var="$1"
if env |grep -q "^$var"; then
return 0
else
return 1
fi
}
The $? syntax is pretty neat:
if [ $?BLAH == 1 ]; then
echo "Exists";
else
echo "Does not exist";
fi

number of tokens in bash variable

how can I know the number of tokens in a bash variable (whitespace-separated tokens) - or at least, wether it is one or there are more.
The $# expansion will tell you the number of elements in a variable / array. If you're working with a bash version greater than 2.05 or so you can:
VAR='some string with words'
VAR=( $VAR )
echo ${#VAR[#]}
This effectively splits the string into an array along whitespace (which is the default delimiter), and then counts the members of the array.
EDIT:
Of course, this recasts the variable as an array. If you don't want that, use a different variable name or recast the variable back into a string:
VAR="${VAR[*]}"
I can't understand why people are using those overcomplicated bashisms all the time. There's almost always a straight-forward, no-bashism solution.
howmany() { echo $#; }
myvar="I am your var"
howmany $myvar
This uses the tokenizer built-in to the shell, so there's no discrepancy.
Here's one related gotcha:
myvar='*'
echo $myvar
echo "$myvar"
set -f
echo $myvar
echo "$myvar"
Note that the solution from #guns using bash array has the same gotcha.
The following is a (supposedly) super-robust version to work around the gotcha:
howmany() ( set -f; set -- $1; echo $# )
If we want to avoid the subshell, things start to get ugly
howmany() {
case $- in *f*) set -- $1;; *) set -f; set -- $1; set +f;; esac
echo $#
}
These two must be used WITH quotes, e.g. howmany "one two three" returns 3
set VAR='hello world'
echo $VAR | wc -w
here is how you can check.
if [ `echo $VAR | wc -w` -gt 1 ]
then
echo "Hello"
fi
Simple method:
$ VAR="a b c d"
$ set $VAR
$ echo $#
4
To count:
sentence="This is a sentence, please count the words in me."
words="${sentence//[^\ ]} "
echo ${#words}
To check:
sentence1="Two words"
sentence2="One"
[[ "$sentence1" =~ [\ ] ]] && echo "sentence1 has more than one word"
[[ "$sentence2" =~ [\ ] ]] && echo "sentence2 has more than one word"
For a robust, portable sh solution, see #JoSo's functions using set -f.
(Simple bash-only solution for answering (only) the "Is there at least 1 whitespace?" question; note: will also match leading and trailing whitespace, unlike the awk solution below:
[[ $v =~ [[:space:]] ]] && echo "\$v has at least 1 whitespace char."
)
Here's a robust awk-based bash solution (less efficient due to invocation of an external utility, but probably won't matter in many real-world scenarios):
# Functions - pass in a quoted variable reference as the only argument.
# Takes advantage of `awk` splitting each input line into individual tokens by
# whitespace; `NF` represents the number of tokens.
# `-v RS=$'\3'` ensures that even multiline input is treated as a single input
# string.
countTokens() { awk -v RS=$'\3' '{print NF}' <<<"$1"; }
hasMultipleTokens() { awk -v RS=$'\3' '{if(NF>1) ec=0; else ec=1; exit ec}' <<<"$1"; }
# Example: Note the use of glob `*` to demonstrate that it is not
# accidentally expanded.
v='I am *'
echo "\$v has $(countTokens "$v") token(s)."
if hasMultipleTokens "$v"; then
echo "\$v has multiple tokens."
else
echo "\$v has just 1 token."
fi
Not sure if this is exactly what you meant but:
$# = Number of arguments passed to the bash script
Otherwise you might be looking for something like man wc

Resources