Number of tokens in a bash variable

How can I know the number of tokens in a bash variable (whitespace-separated tokens) - or at least, whether there is one or there are more?

The ${#VAR[@]} expansion will tell you the number of elements in an array. If you're working with a bash version greater than 2.05 or so you can:
VAR='some string with words'
VAR=( $VAR )
echo ${#VAR[@]}
This effectively splits the string into an array along whitespace (which is the default delimiter), and then counts the members of the array.
EDIT:
Of course, this recasts the variable as an array. If you don't want that, use a different variable name or recast the variable back into a string:
VAR="${VAR[*]}"

I can't understand why people are using those overcomplicated bashisms all the time. There's almost always a straight-forward, no-bashism solution.
howmany() { echo $#; }
myvar="I am your var"
howmany $myvar
This uses the tokenizer built into the shell, so there's no discrepancy.
Here's one related gotcha:
myvar='*'
echo $myvar
echo "$myvar"
set -f
echo $myvar
echo "$myvar"
Note that the solution from @guns using a bash array has the same gotcha.
The following is a (supposedly) super-robust version to work around the gotcha:
howmany() ( set -f; set -- $1; echo $# )
If we want to avoid the subshell, things start to get ugly:
howmany() {
case $- in *f*) set -- $1;; *) set -f; set -- $1; set +f;; esac
echo $#
}
These two must be used WITH quotes, e.g. howmany "one two three" returns 3
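A quick sanity check of the robust versions above (expected output, given those definitions):
$ howmany "one two three"
3
$ myvar='*'
$ howmany "$myvar"
1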

VAR='hello world'
echo $VAR | wc -w
Here is how you can check:
if [ `echo $VAR | wc -w` -gt 1 ]
then
echo "Hello"
fi

Simple method (note that set overwrites the positional parameters):
$ VAR="a b c d"
$ set $VAR
$ echo $#
4

To count:
sentence="This is a sentence, please count the words in me."
words="${sentence//[^\ ]} "
echo ${#words}
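To see why this works: the substitution deletes every character that is not a space, leaving one space per gap between words, and the assignment appends one more. A quick check (the sentence above has 10 words):
$ sentence="This is a sentence, please count the words in me."
$ words="${sentence//[^\ ]} "
$ echo "${#words}"
10
Note this assumes single spaces between words; runs of whitespace would inflate the count.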
To check:
sentence1="Two words"
sentence2="One"
[[ "$sentence1" =~ [\ ] ]] && echo "sentence1 has more than one word"
[[ "$sentence2" =~ [\ ] ]] && echo "sentence2 has more than one word"

For a robust, portable sh solution, see @JoSo's functions using set -f.
(Simple bash-only solution for answering (only) the "Is there at least 1 whitespace?" question; note: will also match leading and trailing whitespace, unlike the awk solution below:
[[ $v =~ [[:space:]] ]] && echo "\$v has at least 1 whitespace char."
)
Here's a robust awk-based bash solution (less efficient due to invocation of an external utility, but probably won't matter in many real-world scenarios):
# Functions - pass in a quoted variable reference as the only argument.
# Takes advantage of `awk` splitting each input line into individual tokens by
# whitespace; `NF` represents the number of tokens.
# `-v RS=$'\3'` ensures that even multiline input is treated as a single input
# string.
countTokens() { awk -v RS=$'\3' '{print NF}' <<<"$1"; }
hasMultipleTokens() { awk -v RS=$'\3' '{if(NF>1) ec=0; else ec=1; exit ec}' <<<"$1"; }
# Example: Note the use of glob `*` to demonstrate that it is not
# accidentally expanded.
v='I am *'
echo "\$v has $(countTokens "$v") token(s)."
if hasMultipleTokens "$v"; then
echo "\$v has multiple tokens."
else
echo "\$v has just 1 token."
fi

Not sure if this is exactly what you meant but:
$# = Number of arguments passed to the bash script
Otherwise you might be looking for something like man wc


How to parse $QUERY_STRING from a bash CGI script?

I have a bash script that is being used in a CGI. The CGI sets the $QUERY_STRING environment variable by reading everything after the ? in the URL. For example, http://example.com?a=123&b=456&c=ok sets QUERY_STRING=a=123&b=456&c=ok.
Somewhere I found the following ugliness:
b=$(echo "$QUERY_STRING" | sed -n 's/^.*b=\([^&]*\).*$/\1/p' | sed "s/%20/ /g")
which will set $b to whatever was found in $QUERY_STRING for b. However, my script has grown to have over ten input parameters. Is there an easier way to automatically convert the parameters in $QUERY_STRING into environment variables usable by bash?
Maybe I'll just use a for loop of some sort, but it'd be even better if the script was smart enough to automatically detect each parameter and maybe build an array that looks something like this:
${parm[a]}=123
${parm[b]}=456
${parm[c]}=ok
How could I write code to do that?
Try this:
saveIFS=$IFS
IFS='=&'
parm=($QUERY_STRING)
IFS=$saveIFS
Now you have this:
parm[0]=a
parm[1]=123
parm[2]=b
parm[3]=456
parm[4]=c
parm[5]=ok
In Bash 4, which has associative arrays, you can do this (using the array created above):
declare -A array
for ((i=0; i<${#parm[@]}; i+=2))
do
array[${parm[i]}]=${parm[i+1]}
done
which will give you this:
array[a]=123
array[b]=456
array[c]=ok
Edit:
To use indirection in Bash 2 and later (using the parm array created above):
for ((i=0; i<${#parm[@]}; i+=2))
do
declare var_${parm[i]}=${parm[i+1]}
done
Then you will have:
var_a=123
var_b=456
var_c=ok
You can access these directly:
echo $var_a
or indirectly:
for p in a b c
do
name="var$p"
echo ${!name}
done
If possible, it's better to avoid indirection since it can make code messy and be a source of bugs.
You can break $QUERY down using IFS. For example, setting it to &:
$ QUERY="a=123&b=456&c=ok"
$ echo $QUERY
a=123&b=456&c=ok
$ IFS="&"
$ set -- $QUERY
$ echo $1
a=123
$ echo $2
b=456
$ echo $3
c=ok
$ array=($@)
$ for i in "${array[@]}"; do IFS="=" ; set -- $i; echo $1 $2; done
a 123
b 456
c ok
And you can save to a hash/dictionary in Bash 4+
$ declare -A hash
$ for i in "${array[@]}"; do IFS="=" ; set -- $i; hash[$1]=$2; done
$ echo ${hash["b"]}
456
Please don't use the evil eval junk.
Here's how you can reliably parse the string and get an associative array:
declare -A param
while IFS='=' read -r -d '&' key value && [[ -n "$key" ]]; do
param["$key"]=$value
done <<<"${QUERY_STRING}&"
If you don't like the key check, you could do this instead:
declare -A param
while IFS='=' read -r -d '&' key value; do
param["$key"]=$value
done <<<"${QUERY_STRING:+"${QUERY_STRING}&"}"
Listing all the keys and values from the array:
for key in "${!param[@]}"; do
echo "$key: ${param[$key]}"
done
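With the query string from the question, that loop prints each pair (associative arrays are unordered, so the line order may vary):
$ QUERY_STRING='a=123&b=456&c=ok'
$ # ... run the while loop and the listing loop above ...
a: 123
b: 456
c: ok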
I packaged the sed command up into another script:
$ cat getvar.sh
s='s/^.*'${1}'=\([^&]*\).*$/\1/p'
echo $QUERY_STRING | sed -n $s | sed "s/%20/ /g"
and I call it from my main cgi as:
id=`./getvar.sh id`
ds=`./getvar.sh ds`
dt=`./getvar.sh dt`
...etc, etc - you get the idea.
works for me even with a very basic busybox appliance (my PVR in this case).
To convert the contents of QUERY_STRING into bash variables, use the following command:
eval $(echo ${QUERY_STRING//&/;})
The inner step, echo ${QUERY_STRING//&/;}, substitutes all ampersands with semicolons producing a=123;b=456;c=ok which the eval then evaluates into the current shell.
The result can then be used as bash variables.
echo $a
echo $b
echo $c
The assumptions are:
values will never contain '&'
values will never contain ';'
QUERY_STRING will never contain malicious code
While the accepted answer is probably the most beautiful one, there might be cases where security is super-important, and it needs to be clearly visible in your script as well.
In such a case, I wouldn't use bash for the task in the first place, but if it must be done for some reason, it might be better to avoid the newer array and dictionary features, because you can't be sure how exactly they are escaped.
In this case, the good old primitive solutions might work:
QS="${QUERY_STRING}"
while [ "${QS}" != "" ]
do
nameval="${QS%%&*}"
QS="${QS#$nameval}"
QS="${QS#&}"
name="${nameval%%=*}"
val="${nameval#$name}"
val="${nameval#=}"
# and here we have $name and $val as names and values
# ...
done
This iterates over the name-value pairs of the QUERY_STRING, and there is no way to circumvent it with any tricky escape sequence: the double quote is a very strong thing in bash. Apart from a single variable substitution, which is fully controlled by us, nothing can be tricked.
Furthermore, you can inject your own processing code into "# ...". This enables you to allow only your own, well-defined (and, ideally, short) list of the allowed variable names. Needless to say, LD_PRELOAD shouldn't be one of them. ;-)
Furthermore, no variable will be exported, and only QS, nameval, name and val are used.
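As a sketch of that whitelist idea (the parameter names id and page are hypothetical placeholders for your own list), the processing block at "# ..." could be:
case "$name" in
id) cgi_id="$val" ;;
page) cgi_page="$val" ;;
*) ;; # silently ignore anything not on the list, including LD_PRELOAD
esac
Each allowed parameter lands in a dedicated variable, and nothing else is ever assigned.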
Following the correct answer, I made some changes myself to support array variables, as in this other question. I also added a decode function, whose author I could not find to give credit.
The code appears somewhat messy, but it works. Changes and other recommendations would be greatly appreciated.
function cgi_decodevar() {
[ $# -ne 1 ] && return
local v t h
# replace all + with whitespace and append %%
t="${1//+/ }%%"
while [ ${#t} -gt 0 -a "${t}" != "%" ]; do
v="${v}${t%%\%*}" # digest up to the first %
t="${t#*%}" # remove digested part
# decode if there is anything to decode and if not at end of string
if [ ${#t} -gt 0 -a "${t}" != "%" ]; then
h=${t:0:2} # save first two chars
t="${t:2}" # remove these
v="${v}"`echo -e \\\\x${h}` # convert hex to special char
fi
done
# return decoded string
echo "${v}"
return
}
saveIFS=$IFS
IFS='=&'
VARS=($QUERY_STRING)
IFS=$saveIFS
for ((i=0; i<${#VARS[@]}; i+=2))
do
curr="$(cgi_decodevar ${VARS[i]})"
next="$(cgi_decodevar ${VARS[i+2]})"
prev="$(cgi_decodevar ${VARS[i-2]})"
value="$(cgi_decodevar ${VARS[i+1]})"
array=${curr%"[]"}
if [ "$curr" == "$next" ] && [ "$curr" != "$prev" ] ;then
j=0
declare var_${array}[$j]="$value"
elif [ $i -gt 1 ] && [ "$curr" == "$prev" ]; then
j=$((j + 1))
declare var_${array}[$j]="$value"
else
declare var_$curr="$value"
fi
done
I would simply replace the & with ;. It will become something like:
a=123;b=456;c=ok
So now you need just evaluate and read your vars:
eval `echo "${QUERY_STRING}"|tr '&' ';'`
echo $a
echo $b
echo $c
A nice way to handle CGI query strings is to use Haserl which acts as a wrapper around your Bash cgi script, and offers convenient and secure query string parsing.
To bring this up to date, if you have a recent Bash version then you can achieve this with regular expressions:
q="$QUERY_STRING"
re1='^(\w+=\w+)&?'
re2='^(\w+)=(\w+)$'
declare -A params
while [[ $q =~ $re1 ]]; do
q=${q##*${BASH_REMATCH[0]}}
[[ ${BASH_REMATCH[1]} =~ $re2 ]] && params+=([${BASH_REMATCH[1]}]=${BASH_REMATCH[2]})
done
If you don't want to use associative arrays then just change the penultimate line to do what you want. For each iteration of the loop the parameter is in ${BASH_REMATCH[1]} and its value is in ${BASH_REMATCH[2]}.
Here is the same thing as a function in a short test script that iterates over the array and outputs the query string's parameters and their values.
#!/bin/bash
QUERY_STRING='foo=hello&bar=there&baz=freddy'
get_query_string() {
local q="$QUERY_STRING"
local re1='^(\w+=\w+)&?'
local re2='^(\w+)=(\w+)$'
while [[ $q =~ $re1 ]]; do
q=${q##*${BASH_REMATCH[0]}}
[[ ${BASH_REMATCH[1]} =~ $re2 ]] && eval "$1+=([${BASH_REMATCH[1]}]=${BASH_REMATCH[2]})"
done
}
declare -A params
get_query_string params
for k in "${!params[@]}"
do
v="${params[$k]}"
echo "$k : $v"
done
Note the parameters end up in the array in reverse order (it's associative so that shouldn't matter).
Why not this?
$ echo "${QUERY_STRING}"
name=carlo&last=lanza&city=pfungen-CH
$ saveIFS=$IFS
$ IFS='&'
$ eval $QUERY_STRING
$ IFS=$saveIFS
now you have this
name = carlo
last = lanza
city = pfungen-CH
$ echo "name is ${name}"
name is carlo
$ echo "last is ${last}"
last is lanza
$ echo "city is ${city}"
city is pfungen-CH
@giacecco
To include a hyphen in the regex you could change the two lines in the answer from @starfry as follows.
Change these two lines:
local re1='^(\w+=\w+)&?'
local re2='^(\w+)=(\w+)$'
To these two lines:
local re1='^(\w+=(\w+|-|)+)&?'
local re2='^(\w+)=((\w+|-|)+)$'
For all those who couldn't get it working with the posted answers (like me),
this guy figured it out.
Can't upvote his post unfortunately...
Let me repost the code here real quick:
#!/bin/sh
if [ "$REQUEST_METHOD" = "POST" ]; then
if [ "$CONTENT_LENGTH" -gt 0 ]; then
read -n $CONTENT_LENGTH POST_DATA <&0
fi
fi
#echo "$POST_DATA" > data.bin
IFS='=&'
set -- $POST_DATA
#2- Value1
#4- Value2
#6- Value3
#8- Value4
echo $2 $4 $6 $8
echo "Content-type: text/html"
echo ""
echo "<html><head><title>Saved</title></head><body>"
echo "Data received: $POST_DATA"
echo "</body></html>"
Hope this is of help for anybody.
Cheers
Actually I liked bolt's answer, so I made a version which works with Busybox as well (ash in Busybox does not support here strings).
This code will accept key1 and key2 parameters; all others will be ignored.
while IFS= read -r -d '&' KEYVAL && [[ -n "$KEYVAL" ]]; do
case ${KEYVAL%=*} in
key1) KEY1=${KEYVAL#*=} ;;
key2) KEY2=${KEYVAL#*=} ;;
esac
done <<END
$(echo "${QUERY_STRING}&")
END
One can use bash-cgi.sh, which processes:
the query string into the $QUERY_STRING_GET key and value array;
the post request data (x-www-form-urlencoded) into the $QUERY_STRING_POST key and value array;
the cookies data into the $HTTP_COOKIES key and value array.
Requires bash version 4.0 or higher (to define the key and value arrays above).
All processing is done by bash alone (i.e. in one process), without any external dependencies or invoking additional processes.
It has:
the check for the max length of data which can be transferred to its input,
as well as processed as query string and cookies;
the redirect() procedure to produce a redirect to itself with the extension changed to .html (useful for one-page sites);
the http_header_tail() procedure to output the last two lines of the HTTP(S) response header;
the $REMOTE_ADDR value sanitizer against possible injections;
the parser and evaluator of escaped UTF-8 symbols embedded in the values passed to $QUERY_STRING_GET, $QUERY_STRING_POST and $HTTP_COOKIES;
the sanitizer of the $QUERY_STRING_GET, $QUERY_STRING_POST and $HTTP_COOKIES values against possible SQL injections (escaping like the mysql_real_escape_string php function does, plus the escaping of # and $).
It is available here:
https://github.com/VladimirBelousov/fancy_scripts
This works in dash, using a for-in loop:
IFS='&'
for f in $query_string; do
value=${f##*=}
key=${f%%=*}
# if you need environment variable -> eval "qs_$key=$value"
done

Getting the last argument passed to a shell script

$1 is the first argument.
$@ is all of them.
How can I find the last argument passed to a shell
script?
This is Bash-only:
echo "${#: -1}"
This is a bit of a hack:
for last; do true; done
echo $last
This one is also pretty portable (again, should work with bash, ksh and sh) and it doesn't shift the arguments, which could be nice.
It uses the fact that for implicitly loops over the arguments if you don't tell it what to loop over, and the fact that for loop variables aren't scoped: they keep the last value they were set to.
$ set quick brown fox jumps
$ echo ${*: -1:1} # last argument
jumps
$ echo ${*: -1} # or simply
jumps
$ echo ${*: -2:1} # next to last
fox
The space is necessary so that it doesn't get interpreted as a default value.
Note that this is bash-only.
The simplest answer for bash 3.0 or greater is
_last=${!#} # *indirect reference* to the $# variable
# or
_last=$BASH_ARGV # official built-in (but takes more typing :)
That's it.
$ cat lastarg
#!/bin/bash
# echo the last arg given:
_last=${!#}
echo $_last
_last=$BASH_ARGV
echo $_last
for x; do
echo $x
done
Output is:
$ lastarg 1 2 3 4 "5 6 7"
5 6 7
5 6 7
1
2
3
4
5 6 7
The following will work for you.
@ is for the array of arguments.
: means at
$# is the length of the array of arguments.
So the result is the last element:
${@:$#}
Example:
function afunction {
echo ${@:$#}
}
afunction -d -o local 50
#Outputs 50
Note that this is bash-only.
Use indexing combined with length of:
echo ${@:${#@}}
Note that this is bash-only.
Found this when looking to separate the last argument from all the previous one(s).
Whilst some of the answers do get the last argument, they're not much help if you need all the other args as well. This works much better:
heads=${@:1:$#-1}
tail=${@:$#}
Note that this is bash-only.
This works in all POSIX-compatible shells:
eval last=\${$#}
Source: http://www.faqs.org/faqs/unix-faq/faq/part2/section-12.html
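A quick demonstration (plain sh is enough):
$ set -- one two "three four"
$ eval last=\${$#}
$ echo "$last"
three four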
Here is my solution:
pretty portable (should work with all POSIX sh, bash, ksh, zsh)
does not shift original arguments (shifts a copy).
does not use evil eval
does not iterate through the whole list
does not use external tools
Code:
ntharg() {
shift $1
printf '%s\n' "$1"
}
LAST_ARG=`ntharg $# "$@"`
From oldest to newer solutions:
The most portable solution, even older sh (works with spaces and glob characters) (no loop, faster):
eval printf "'%s\n'" "\"\${$#}\""
Since version 2.01 of bash
$ set -- The quick brown fox jumps over the lazy dog
$ printf '%s\n' "${!#} ${@:(-1)} ${@: -1} ${@:~0} ${!#}"
dog dog dog dog dog
For ksh, zsh and bash:
$ printf '%s\n' "${@: -1} ${@:~0}" # the space between `:`
# and `-1` is a must.
dog dog
And for "next to last":
$ printf '%s\n' "${@:~1:1}"
lazy
Using printf to work around any issues with arguments that start with a dash (like -n).
For all shells, including older sh (works with spaces and glob characters):
$ set -- The quick brown fox jumps over the lazy dog "the * last argument"
$ eval printf "'%s\n'" "\"\${$#}\""
the * last argument
Or, if you want to set a last var:
$ eval last=\${$#}; printf '%s\n' "$last"
the * last argument
And for "next to last":
$ eval printf "'%s\n'" "\"\${$(($#-1))}\""
dog
If you are using Bash >= 3.0
echo ${BASH_ARGV[0]}
For bash, this comment suggested the very elegant:
echo "${#:$#}"
To silence shellcheck, use:
echo ${*:$#}
As a bonus, both also work in zsh.
shift `expr $# - 1`
echo "$1"
This shifts the arguments by the number of arguments minus 1, and returns the first (and only) remaining argument, which will be the last one.
I only tested in bash, but it should work in sh and ksh as well.
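For example:
$ set -- red green blue
$ shift `expr $# - 1`
$ echo "$1"
blue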
I found @AgileZebra's answer (plus @starfry's comment) the most useful, but it sets heads to a scalar. An array is probably more useful:
heads=( "${#: 1: $# - 1}" )
tail=${#:${##}}
Note that this is bash-only.
Edit: Removed unnecessary $(( )) according to @f-hauri's comment.
A solution using eval:
last=$(eval "echo \$$#")
echo $last
If you want to do it in a non-destructive way, one way is to pass all the arguments to a function and return the last one:
#!/bin/bash
last() {
if [[ $# -ne 0 ]] ; then
shift $(expr $# - 1)
echo "$1"
#else
#do something when no arguments
fi
}
lastvar=$(last "$#")
echo $lastvar
echo "$#"
pax> ./qq.sh 1 2 3 a b
b
1 2 3 a b
If you don't actually care about keeping the other arguments, you don't need it in a function. But I have a hard time thinking of a situation where you would never want to keep the other arguments unless they've already been processed, in which case I'd use the process/shift/process/shift/... method of sequentially processing them.
I'm assuming here that you want to keep them because you haven't followed the sequential method. This method also handles the case where there are no arguments, returning "". You could easily adjust that behavior by inserting the commented-out else clause.
For tcsh:
set X = `echo $* | awk -F " " '{print $NF}'`
somecommand "$X"
I'm quite sure this would be a portable solution, except for the assignment.
After reading the answers above I wrote a Q&D shell script (should work on sh and bash) to run g++ on PGM.cpp to produce executable image PGM. It assumes that the last argument on the command line is the file name (.cpp is optional) and all other arguments are options.
#!/bin/sh
if [ $# -lt 1 ]
then
echo "Usage: `basename $0` [opt] pgm runs g++ to compile pgm[.cpp] into pgm"
exit 2
fi
OPT=
PGM=
# PGM is the last argument, all others are considered options
for F; do OPT="$OPT $PGM"; PGM=$F; done
DIR=`dirname $PGM`
PGM=`basename $PGM .cpp`
# put -o first so it can be overridden by -o specified in OPT
set -x
g++ -o $DIR/$PGM $OPT $DIR/$PGM.cpp
The following will set LAST to the last argument without changing the current environment:
LAST=$({
shift $(($#-1))
echo $1
})
echo $LAST
If other arguments are no longer needed and can be shifted it can be simplified to:
shift $(($#-1))
echo $1
For portability reasons, the following:
shift $(($#-1));
can be replaced with:
shift `expr $# - 1`
Also replacing $() with backquotes, we get:
LAST=`{
shift \`expr $# - 1\`
echo $1
}`
echo $LAST
For csh:
echo $argv[$#argv]
This is part of my copy function:
eval echo $(echo '$'"$#")
To use in scripts, do this:
a=$(eval echo $(echo '$'"$#"))
Explanation (most nested first):
$(echo '$'"$#") returns $[nr] where [nr] is the number of parameters. E.g. the string $123 (unexpanded).
echo $123 returns the value of 123rd parameter, when evaluated.
eval just expands $123 to the value of the parameter, e.g. last_arg. This is interpreted as a string and returned.
Works with Bash as of mid 2015.
To return the last argument of the most recently used command use the special parameter:
$_
In this instance it will work if it is used within the script before another command has been invoked.
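For example, at an interactive prompt (read $_ immediately, before any other command runs):
$ echo first second third
first second third
$ echo "$_"
third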
#! /bin/sh
next=$1
while [ -n "${next}" ] ; do
last=$next
shift
next=$1
done
echo $last
Try the script below to find the last argument:
# cat arguments.sh
#!/bin/bash
if [ $# -eq 0 ]
then
echo "No Arguments supplied"
else
echo $* > .ags
sed -e 's/ /\n/g' .ags | tac | head -n1 > .ga
echo "Last Argument is: `cat .ga`"
fi
Output:
# ./arguments.sh
No Arguments supplied
# ./arguments.sh testing for the last argument value
Last Argument is: value
Thanks.
There is a much more concise way to do this. Arguments to a bash script can be brought into an array, which makes dealing with the elements much simpler. The script below will always print the last argument passed to a script.
argArray=( "$@" )            # Add all script arguments to argArray
arrayLength=${#argArray[@]}  # Get the length of the array
lastArg=$((arrayLength - 1)) # Arrays are zero-based, so the last index is length - 1
echo ${argArray[$lastArg]}
Sample output
$ ./lastarg.sh 1 2 buckle my shoe
shoe
Using parameter expansion (delete matched beginning):
args="$#"
last=${args##* }
It's also easy to get all before last:
prelast=${args% *}
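Keep in mind this operates on the arguments joined into one string, so it really returns the last space-separated word; an argument that itself contains a space gets split:
$ set -- alpha beta "gamma delta"
$ args="$@"
$ echo "${args##* }"
delta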
$ echo "${*: -1}"
That will print the last argument
With GNU bash version >= 3.0:
num=$# # get number of arguments
echo "${!num}" # print last argument
Just use !$.
$ mkdir folder
$ cd !$ # will run: cd folder

How to split one string into multiple strings separated by at least one space in bash shell?

I have a string containing many words with at least one space between each two. How can I split the string into individual words so I can loop through them?
The string is passed as an argument. E.g. ${2} == "cat cat file". How can I loop through it?
Also, how can I check if a string contains spaces?
I like the conversion to an array, to be able to access individual elements:
sentence="this is a story"
stringarray=($sentence)
now you can access individual elements directly (it starts with 0):
echo ${stringarray[0]}
or convert back to string in order to loop:
for i in "${stringarray[@]}"
do
:
# do whatever on $i
done
Of course looping through the string directly was answered before, but that answer has the disadvantage of not keeping track of the individual elements for later use:
for i in $sentence
do
:
# do whatever on $i
done
See also Bash Array Reference.
Did you try just passing the string variable to a for loop? Bash, for one, will split on whitespace automatically.
sentence="This is a sentence."
for word in $sentence
do
echo $word
done
This
is
a
sentence.
Probably the easiest and most secure way in BASH 3 and above is:
var="string to split"
read -ra arr <<<"$var"
(where arr is the array which takes the split parts of the string) or, if there might be newlines in the input and you want more than just the first line:
var="string to split"
read -ra arr -d '' <<<"$var"
(please note the space in -d ''; it cannot be omitted), but this might give you an unexpected newline from <<<"$var" (as this implicitly adds an LF at the end).
Example:
touch NOPE
var="* a *"
read -ra arr <<<"$var"
for a in "${arr[@]}"; do echo "[$a]"; done
Outputs the expected
[*]
[a]
[*]
as this solution (in contrast to all previous solutions here) is not prone to unexpected and often uncontrollable shell globbing.
Also this gives you the full power of IFS as you probably want:
Example:
IFS=: read -ra arr < <(grep "^$USER:" /etc/passwd)
for a in "${arr[@]}"; do echo "[$a]"; done
Outputs something like:
[tino]
[x]
[1000]
[1000]
[Valentin Hilbig]
[/home/tino]
[/bin/bash]
As you can see, spaces can be preserved this way, too:
IFS=: read -ra arr <<<' split : this '
for a in "${arr[@]}"; do echo "[$a]"; done
outputs
[ split ]
[ this ]
Please note that the handling of IFS in BASH is a subject on its own, so do your tests; some interesting topics on this:
unset IFS: Ignores runs of SPC, TAB, NL and on line starts and ends
IFS='': No field separation, just reads everything
IFS=' ': Runs of SPC (and SPC only)
Some last examples:
var=$'\n\nthis is\n\n\na test\n\n'
IFS=$'\n' read -ra arr -d '' <<<"$var"
i=0; for a in "${arr[@]}"; do let i++; echo "$i [$a]"; done
outputs
1 [this is]
2 [a test]
while
unset IFS
var=$'\n\nthis is\n\n\na test\n\n'
read -ra arr -d '' <<<"$var"
i=0; for a in "${arr[@]}"; do let i++; echo "$i [$a]"; done
outputs
1 [this]
2 [is]
3 [a]
4 [test]
BTW:
If you are not used to $'ANSI-ESCAPED-STRING', get used to it; it's a timesaver.
If you do not include -r (as in read -a arr <<<"$var"), then read processes backslash escapes. This is left as an exercise for the reader.
For the second question:
To test for something in a string I usually stick to case, as this can check for multiple cases at once (note: case only executes the first match, if you need fallthrough use multiple case statements), and this need is quite often the case (pun intended):
case "$var" in
'') empty_var;; # variable is empty
*' '*) have_space "$var";; # have SPC
*[[:space:]]*) have_whitespace "$var";; # have whitespaces like TAB
*[^-+.,A-Za-z0-9]*) have_nonalnum "$var";; # non-alphanum-chars found
*[-+.,]*) have_punctuation "$var";; # some punctuation chars found
*) default_case "$var";; # if all above does not match
esac
So you can set the return value to check for SPC like this:
case "$var" in (*' '*) true;; (*) false;; esac
Why case? Because it usually is a bit more readable than regex sequences, and thanks to Shell metacharacters it handles 99% of all needs very well.
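Wrapped in a small helper function (a minimal sketch using the same pattern), the check reads naturally in a conditional:
has_space() { case "$1" in (*' '*) return 0;; (*) return 1;; esac; }
if has_space "$var"; then
echo "\$var contains at least one space"
fi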
Just use the shell's "set" built-in. For example,
set $text
After that, individual words in $text will be in $1, $2, $3, etc. For robustness, one usually does
set -- junk $text
shift
to handle the case where $text is empty or starts with a dash. For example:
text="This is a test"
set -- junk $text
shift
for word; do
echo "[$word]"
done
This prints
[This]
[is]
[a]
[test]
$ echo "This is a sentence." | tr -s " " "\012"
This
is
a
sentence.
For checking for spaces, use grep:
$ echo "This is a sentence." | grep " " > /dev/null
$ echo $?
0
$ echo "Thisisasentence." | grep " " > /dev/null
$ echo $?
1
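In a script you would normally use grep -q and test the exit status directly instead of inspecting $?:
if echo "$sentence" | grep -q " "; then
echo "contains a space"
fi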
echo $WORDS | xargs -n1 echo
This outputs every word, you can process that list as you see fit afterwards.
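For example:
$ WORDS="foo bar baz"
$ echo $WORDS | xargs -n1 echo
foo
bar
baz
Be aware that xargs itself interprets quotes and backslashes in its input, so this is best suited to plain words.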
(A) To split a sentence into its words (space separated) you can simply use the default IFS by using
array=( $string )
Example running the following snippet
#!/bin/bash
sentence="this is the \"sentence\" 'you' want to split"
words=( $sentence )
len="${#words[@]}"
echo "words counted: $len"
printf "%s\n" "${words[@]}" ## print array
will output
words counted: 8
this
is
the
"sentence"
'you'
want
to
split
As you can see, you can use single or double quotes too without any problem.
Notes:
-- this is basically the same as mob's answer, but in this way you store the array for any further need. If you only need a single loop, you can use his answer, which is one line shorter :)
-- please refer to this question for alternate methods to split a string based on delimiter.
(B) To check for a character in a string you can also use a regular expression match.
Example to check for the presence of a space character you can use:
regex='\s{1,}'
if [[ "$sentence" =~ $regex ]]
then
echo "Space here!";
fi
For checking spaces just with bash:
[[ "$str" = "${str% *}" ]] && echo "no spaces" || echo "has spaces"
$ echo foo bar baz | sed 's/ /\n/g'
foo
bar
baz
For my use case, the best option was:
grep -oP '\w+' file
Basically, this is a regular expression that matches runs of word characters (letters, digits, and underscore). Whitespace of any type and amount never matches, so it acts as a separator. The -o parameter outputs each match on a separate line.
Another take on this (using Perl):
$ echo foo bar baz | perl -nE 'say for split /\s/'
foo
bar
baz

What's a concise way to check that environment variables are set in a Unix shell script?

I've got a few Unix shell scripts where I need to check that certain environment variables are set before I start doing stuff, so I do this sort of thing:
if [ -z "$STATE" ]; then
echo "Need to set STATE"
exit 1
fi
if [ -z "$DEST" ]; then
echo "Need to set DEST"
exit 1
fi
which is a lot of typing. Is there a more elegant idiom for checking that a set of environment variables is set?
EDIT: I should mention that these variables have no meaningful default value - the script should error out if any are unset.
Parameter Expansion
The obvious answer is to use one of the special forms of parameter expansion:
: ${STATE?"Need to set STATE"}
: ${DEST:?"Need to set DEST non-empty"}
Or, better (see section on 'Position of double quotes' below):
: "${STATE?Need to set STATE}"
: "${DEST:?Need to set DEST non-empty}"
The first variant (using just ?) requires STATE to be set, but STATE="" (an empty string) is OK — not exactly what you want, but the alternative and older notation.
The second variant (using :?) requires DEST to be set and non-empty.
If you supply no message, the shell provides a default message.
The ${var?} construct is portable back to Version 7 UNIX and the Bourne Shell (1978 or thereabouts). The ${var:?} construct is slightly more recent: I think it was in System III UNIX circa 1981, but it may have been in PWB UNIX before that. It is therefore in the Korn Shell, and in the POSIX shells, including specifically Bash.
It is usually documented in the shell's man page in a section called Parameter Expansion. For example, the bash manual says:
${parameter:?word}
Display Error if Null or Unset. If parameter is null or unset, the expansion of word (or a message to that effect if word is not present) is written to the standard error and the shell, if it is not interactive, exits. Otherwise, the value of parameter is substituted.
The Colon Command
I should probably add that the colon command simply has its arguments evaluated and then succeeds. It is the original shell comment notation (before '#' to end of line). For a long time, Bourne shell scripts had a colon as the first character. The C Shell would read a script and use the first character to determine whether it was for the C Shell (a '#' hash) or the Bourne shell (a ':' colon). Then the kernel got in on the act and added support for '#!/path/to/program' and the Bourne shell got '#' comments, and the colon convention went by the wayside. But if you come across a script that starts with a colon, now you will know why.
Position of double quotes
blong asked in a comment:
Any thoughts on this discussion? https://github.com/koalaman/shellcheck/issues/380#issuecomment-145872749
The gist of the discussion is:
… However, when I shellcheck it (with version 0.4.1), I get this message:
In script.sh line 13:
: ${FOO:?"The environment variable 'FOO' must be set and non-empty"}
^-- SC2086: Double quote to prevent globbing and word splitting.
Any advice on what I should do in this case?
The short answer is "do as shellcheck suggests":
: "${STATE?Need to set STATE}"
: "${DEST:?Need to set DEST non-empty}"
To illustrate why, study the following. Note that the : command doesn't echo its arguments (but the shell does evaluate the arguments). We want to see the arguments, so the code below uses printf "%s\n" in place of :.
$ mkdir junk
$ cd junk
$ > abc
$ > def
$ > ghi
$
$ x="*"
$ printf "%s\n" ${x:?You must set x} # Careless; not recommended
abc
def
ghi
$ unset x
$ printf "%s\n" ${x:?You must set x} # Careless; not recommended
bash: x: You must set x
$ printf "%s\n" "${x:?You must set x}" # Careful: should be used
bash: x: You must set x
$ x="*"
$ printf "%s\n" "${x:?You must set x}" # Careful: should be used
*
$ printf "%s\n" ${x:?"You must set x"} # Not quite careful enough
abc
def
ghi
$ x=
$ printf "%s\n" ${x:?"You must set x"} # Not quite careful enough
bash: x: You must set x
$ unset x
$ printf "%s\n" ${x:?"You must set x"} # Not quite careful enough
bash: x: You must set x
$
Note how the value in $x is expanded to first * and then a list of file names when the overall expression is not in double quotes. This is what shellcheck is recommending should be fixed. I have not verified that it doesn't object to the form where the expression is enclosed in double quotes, but it is a reasonable assumption that it would be OK.
Try this:
[ -z "$STATE" ] && echo "Need to set STATE" && exit 1;
Your question is dependent on the shell that you are using.
Bourne shell leaves very little in the way of what you're after.
BUT...
It does work, just about everywhere.
Just try and stay away from csh. It was good for the bells and whistles it added, compared to the Bourne shell, but it is really creaking now. If you don't believe me, just try and separate out STDERR in csh! (-:
There are two possibilities here. The first is the example above, namely using:
${MyVariable:=SomeDefault}
the first time you refer to $MyVariable. This takes the env. var MyVariable and, if it is currently not set, assigns the value of SomeDefault to the variable for later use.
You also have the possibility of:
${MyVariable:-SomeDefault}
which just substitutes SomeDefault for the variable where you are using this construct. It doesn't assign the value SomeDefault to the variable, and the value of MyVariable will still be null after this statement is encountered.
Surely the simplest approach is to add the -u switch to the shebang (the line at the top of your script), assuming you’re using bash:
#!/bin/sh -u
This will cause the script to exit if any unbound variables lurk within.
${MyVariable:=SomeDefault}
If MyVariable is set and not null, the variable keeps its value (i.e. nothing happens).
Else, MyVariable is set to SomeDefault.
The above will attempt to execute ${MyVariable}, so if you just want to set the variable do:
MyVariable=${MyVariable:=SomeDefault}
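Since := performs the assignment itself, the outer assignment is redundant; the usual idiom is to discard the expansion with the colon command (described in the Parameter Expansion answer above):
: "${MyVariable:=SomeDefault}"
echo "$MyVariable" # SomeDefault if it was previously unset or empty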
In my opinion the simplest and most compatible check for #!/bin/sh is:
if [ "$MYVAR" = "" ]
then
echo "Does not exist"
else
echo "Exists"
fi
Again, this is for /bin/sh and is compatible also on old Solaris systems.
bash 4.2 introduced the -v operator which tests if a name is set to any value, even the empty string.
$ unset a
$ b=
$ c=
$ [[ -v a ]] && echo "a is set"
$ [[ -v b ]] && echo "b is set"
b is set
$ [[ -v c ]] && echo "c is set"
c is set
I always used:
if [ "x$STATE" == "x" ]; then echo "Need to set State"; exit 1; fi
Not that much more concise, I'm afraid.
Under CSH you have $?STATE.
For future people like me, I wanted to go a step forward and parameterize the var name, so I can loop over a variable sized list of variable names:
#!/bin/bash
declare -a vars=(NAME GITLAB_URL GITLAB_TOKEN)
for var_name in "${vars[@]}"
do
if [ -z "$(eval "echo \$$var_name")" ]; then
echo "Missing environment variable $var_name"
exit 1
fi
done
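In bash you can avoid the eval entirely with indirect expansion, ${!var_name}; a minimal equivalent of the test above:
if [ -z "${!var_name}" ]; then
echo "Missing environment variable $var_name"
exit 1
fi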
We can write a nice assertion to check a bunch of variables all at once:
#
# assert if variables are set (to a non-empty string)
# if any variable is not set, exit 1 (when -f option is set) or return 1 otherwise
#
# Usage: assert_var_not_null [-f] variable ...
#
function assert_var_not_null() {
local fatal var num_null=0
[[ "$1" = "-f" ]] && { shift; fatal=1; }
for var in "$@"; do
[[ -z "${!var}" ]] &&
printf '%s\n' "Variable '$var' not set" >&2 &&
((num_null++))
done
if ((num_null > 0)); then
[[ "$fatal" ]] && exit 1
return 1
fi
return 0
}
Sample invocation:
one=1 two=2
assert_var_not_null one two
echo test 1: return_code=$?
assert_var_not_null one two three
echo test 2: return_code=$?
assert_var_not_null -f one two three
echo test 3: return_code=$? # this code shouldn't execute
Output:
test 1: return_code=0
Variable 'three' not set
test 2: return_code=1
Variable 'three' not set
More such assertions here: https://github.com/codeforester/base/blob/master/lib/assertions.sh
This can be a way to do it too:
if (set -u; : $HOME) 2> /dev/null
...
...
http://unstableme.blogspot.com/2007/02/checks-whether-envvar-is-set-or-not.html
None of the above solutions worked for my purposes, in part because I was checking the environment for an open-ended list of variables that need to be set before starting a lengthy process. I ended up with this:
mapfile -t arr < variables.txt
EXITCODE=0
for i in "${arr[@]}"
do
ISSET=$(env | grep ^${i}= | wc -l)
if [ "${ISSET}" = "0" ];
then
EXITCODE=-1
echo "ENV variable $i is required."
fi
done
exit ${EXITCODE}
Rather than using external shell scripts I tend to load in functions in my login shell. I use something like this as a helper function to check for environment variables rather than any set variable:
is_this_an_env_variable () {
local var="$1"
if env |grep -q "^$var"; then
return 0
else
return 1
fi
}
The $?VAR syntax in csh is pretty neat:
if [ $?BLAH == 1 ]; then
echo "Exists";
else
echo "Does not exist";
fi
