Why would I use declare / typeset in a shell script instead of just X=y? - bash

I've recently come across a shell script that uses
declare -- FOO="" which apparently is spelled typeset -- FOO="" in non-bash shells.
Why might I want to do that instead of plain FOO="" or export FOO?

The most important purpose of using declare is to control scope, or to use array types that aren't otherwise accessible.
Using Function-Local Variables
To give you an example:
print_dashes() { for (( i=0; i<10; i++ )); do printf '-'; done; echo; }
while read -p "Enter a number: " i; do
print_dashes
echo "You entered: $i"
done
You'd expect that to print the number the user entered, right? But instead, it'll always print the value of i that print_dashes leaves when it's complete.
Consider instead:
print_dashes() {
declare i # "local i" would also have the desired effect
for (( i=0; i<10; i++ )); do printf '-'; done; echo;
}
...now i is local, so the newly-assigned value doesn't last beyond its invocation.
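To see the difference without typing anything at a prompt, here is a minimal non-interactive sketch. The function names leaky_dashes and local_dashes are made up for this demo, and the loop is shortened to three dashes:

```shell
#!/usr/bin/env bash
# Hypothetical names for illustration only: one version leaks i, one keeps it local.
leaky_dashes() { for (( i=0; i<3; i++ )); do printf '-'; done; echo; }
local_dashes() { declare i; for (( i=0; i<3; i++ )); do printf '-'; done; echo; }

i=42
local_dashes >/dev/null
after_local=$i   # the caller's i is untouched

leaky_dashes >/dev/null
after_leaky=$i   # the loop counter overwrote the caller's i

echo "after local_dashes: $after_local"   # prints 'after local_dashes: 42'
echo "after leaky_dashes: $after_leaky"   # prints 'after leaky_dashes: 3'
```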
Declaring Explicitly Global Variables
Conversely, you sometimes want to declare a global variable and make it clear to your code's readers that you're doing so intentionally, or to do so while also declaring something as an array (or in any other case where declare inside a function would otherwise implicitly make the variable local). You can do that too:
myfunc() {
declare arg # make arg local
declare -g -A myfunc_args_seen # make myfunc_args_seen a global associative array
for arg; do
myfunc_args_seen["$arg"]=1
done
echo "Across all invocations of myfunc, we have seen the following arguments:"
printf ' - %q\n' "${!myfunc_args_seen[@]}"
}
Declaring Associative Arrays
Normal shell arrays can just be assigned: my_arr=( one two three )
However, that's not the case for associative arrays, which are keyed as strings. For those, you need to declare them:
declare -A my_arr=( ["one"]=1 ["two"]=2 ["three"]=3 )
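As a rough illustration of why the declaration matters, the sketch below compares a declared associative array with an undeclared one. The variable name plain is an assumption made up for this demo. Without declare -A, bash treats the subscript as an arithmetic expression, so string keys that are not set as variables all evaluate to index 0:

```shell
#!/usr/bin/env bash
declare -A my_arr=( ["one"]=1 ["two"]=2 ["three"]=3 )
echo "two -> ${my_arr[two]}"            # prints 'two -> 2'
echo "number of keys: ${#my_arr[@]}"    # prints 'number of keys: 3'

# 'plain' is a hypothetical, undeclared (indexed) array for comparison.
plain=()
plain[one]=1   # 'one' is evaluated arithmetically: unset variable, so index 0
plain[two]=2   # also index 0, overwriting the previous value
echo "plain has ${#plain[@]} element(s): ${plain[0]}"   # prints 'plain has 1 element(s): 2'
```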

declare -i cnt=0
declares an integer-only variable, which is faster for math and always evaluates in arithmetic context.
declare -l lower="$1"
declares a variable that automatically lowercases anything put in it, without any special syntax on access.
declare -r unchangeable="$constant"
declares a variable read-only.
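A small sketch of how these three attributes behave in practice. The values are invented for the demo, and the status variable is a hypothetical helper. The readonly assignment is attempted in a subshell so the error cannot affect the rest of the script:

```shell
#!/usr/bin/env bash
declare -i cnt=0
cnt+=5          # arithmetic addition, not string concatenation
cnt="cnt * 2"   # the right-hand side is evaluated as an arithmetic expression
echo "$cnt"     # prints '10'

declare -l lower
lower="MiXeD Case"
echo "$lower"   # prints 'mixed case'

declare -r unchangeable="fixed"
# Try to reassign inside a subshell; the assignment fails with a non-zero status.
if ( unchangeable="changed" ) 2>/dev/null; then
  status=changed
else
  status=rejected
fi
echo "$status"  # prints 'rejected'
```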
Take a look at https://unix.stackexchange.com/questions/254367/in-bash-scripting-whats-the-different-between-declare-and-a-normal-variable for some useful discussion - you might not need these things often, but if you don't know what's available you're likely to work harder than you should.

A great reason to use declare, typeset, and/or readonly is code compartmentalization and reuse (i.e. encapsulation). You can write code in one script that can be sourced by others.
(Note that the readonly attribute of constants, variables, and functions does not carry over to a child process, such as a script that is executed rather than sourced. It is retained when a child script sources the defining script, since sourcing loads a script into the current shell rather than a new process.)
Since sourcing loads code from the script into the current shell though, the namespaces will overlap. To prevent a variable in a child script from being overwritten by its parent (or vice-versa, depending on where the script is sourced and the variable used), you can declare a variable readonly so it won't get overwritten.
You have to be careful with this because once you declare something readonly, you cannot unset it, so you do not want to make readonly anything that might legitimately be redefined in another script. For example, if you're writing a general-purpose library that has logging functions, you might not want to use readonly -f on a function called warn, error, or info, since other scripts are likely to create logging functions of their own with those names. In this case, it is common practice to prefix the function, variable, and/or constant name with the name of the defining script and then make it readonly (e.g. my_script_warn, my_script_error, etc.). This preserves the functions, variables, and constants that the defining script's logic relies on, so sourcing scripts cannot overwrite them and cause accidental failures.
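A minimal sketch of that prefixing convention, assuming a hypothetical library whose functions are named my_script_warn and my_script_error and whose constant MY_SCRIPT_VERSION is invented for the demo. The redefinition attempt runs in a subshell purely so the failure can be observed without side effects:

```shell
#!/usr/bin/env bash
# Contents of a hypothetical library file meant to be sourced by other scripts.
my_script_warn()  { printf 'WARN: %s\n' "$*" >&2; }
my_script_error() { printf 'ERROR: %s\n' "$*" >&2; }
readonly -f my_script_warn my_script_error
readonly MY_SCRIPT_VERSION="1.0"   # assumed constant, for illustration

# A sourcing script that tries to redefine a protected function is refused.
if ( eval 'my_script_warn() { :; }' ) 2>/dev/null; then
  redefine=allowed
else
  redefine=blocked
fi
echo "redefinition was $redefine"   # prints 'redefinition was blocked'
```

Generic names such as warn or info are deliberately left alone here, so a sourcing script stays free to define its own.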

Related

Why is this Bash script not inheriting all environment variables?

I'm trying something very straightforward:
PEOPLE=(
"nick"
"bob"
)
export PEOPLE="$(IFS=, ; echo "${PEOPLE[*]}")"
echo "$PEOPLE" # prints 'nick,bob'
./process-people.sh
For some reason, process-people.sh isn't seeing $PEOPLE. As in, if I echo "$PEOPLE" from inside process-people.sh, it prints an empty line.
From what I understand, the child process created by invoking ./process-people.sh should inherit all the parent process's environment variables, including $PEOPLE.
Yet, I've tried this on both Bash 3.2.57(1)-release and 4.2.46(2)-release and it doesn't work.
What's going on here?
That's a neat solution you have there for joining the elements of a Bash array into a string. Did you know that in Bash you cannot export array variables to the environment? And if a variable is not in the environment, then the child process will not see it.
Ah. But you aren't exporting an array, are you. You're converting the array into a string and then exporting that. So it should work.
But this is Bash! Where various accidents of history conspire to give you the finger.
As @PesaThe and @chepner pointed out in the comments below, you cannot actually convert a Bash array variable to a string variable. According to the Bash reference on arrays:
Referencing an array variable without a subscript is equivalent to referencing with a subscript of 0.
So when you call export PEOPLE=... where PEOPLE was previously assigned an array value, what you're actually doing is PEOPLE[0]=.... Here's a fuller example:
PEOPLE=(
"nick"
"bob"
)
export PEOPLE="super"
echo "$PEOPLE" # masks the fact that PEOPLE is still an array and just prints 'super'
echo "${PEOPLE[*]}" # prints 'super bob'
It's unfortunate that the export silently fails to export the array to the environment (it returns 0), and it's confusing that Bash equates ARRAY_VARIABLE to ARRAY_VARIABLE[0] in certain situations. We'll just have to chalk that up to a combination of history and backwards compatibility.
Here's a working solution to your problem:
PEOPLE_ARRAY=(
"nick"
"bob"
)
export PEOPLE="$(IFS=, ; echo "${PEOPLE_ARRAY[*]}")"
echo "$PEOPLE" # prints 'nick,bob'
./process-people.sh
The key here is to assign the array and derived string to different variables. Since PEOPLE is a proper string variable, it will export just fine and process-people.sh will work as expected.
It's not possible to directly change a Bash array variable into a string variable. Once a variable is assigned an array value, it becomes an array variable. The only way to change it back to a string variable is to destroy it with unset and recreate it.
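If you really do want to reuse the same name, the unset-and-recreate route might look like the sketch below. The helper variables joined, broken, and child_sees are made up for the demo, and bash -c stands in for process-people.sh:

```shell
#!/usr/bin/env bash
PEOPLE=( "nick" "bob" )

# Exporting while PEOPLE is still an array only sets PEOPLE[0]; nothing reaches the child.
export PEOPLE="$(IFS=, ; echo "${PEOPLE[*]}")"
broken="$(bash -c 'printf %s "${PEOPLE-}"')"

# Join into a temporary, destroy the array, then recreate PEOPLE as a plain string.
PEOPLE=( "nick" "bob" )
joined="$(IFS=, ; echo "${PEOPLE[*]}")"
unset PEOPLE
export PEOPLE="$joined"
child_sees="$(bash -c 'printf %s "${PEOPLE-}"')"

echo "before unset, child saw: '$broken'"
echo "after unset, child saw:  '$child_sees'"
```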
Bash has a couple of handy commands for inspecting variables that are useful for investigating these kinds of issues:
printenv PEOPLE # prints 'nick,bob'
declare -p PEOPLE_ARRAY # prints: declare -a PEOPLE_ARRAY='([0]="nick" [1]="bob")' (newer Bash versions omit the outer single quotes)
printenv will only return a value for environment variables, vs. echo, which will print a result whether the variable has been properly exported or not.
declare -p will show the full value of a variable, without the gotchas related to including or leaving out array index references (e.g. ARRAY_VARIABLE[*]).

Deferred evaluation of bash variables

I need to define a string (options) which contains a variable (group) that is going to be available later in the script.
This is what I came up with, using a literal string that gets evaluated later.
#!/bin/bash
options='--group="$group"' #$group is not available at this point
#
# Some code...
#
group='trekkie'
eval echo "$options" # the result is used elsewhere
It works, however it makes use of eval which I would like to avoid if not absolutely necessary (I don't want to risk potential problems because of unpredictable data).
I've asked for help in multiple places and I've got a couple of answers that were directing me to use indirect variables.
The problem is I simply fail to see how indirect variables might help me with my problem. As far as I understand they only offer a way of indirectly referencing other variables like this:
options="--group="$group""
a=options
group='trekkies'
echo "${!a}" # spits out --group=
I would also like to avoid using functions if possible because I don't want to make things more complicated than they need to be.
More Idiomatic: Using Parameter Expansion
Don't attempt to define the --group="$group" argument up-front when you don't yet know the group name; instead, set a flag that indicates whether the argument is needed, and honor that flag when forming your final argument list.
With the approach below, you avoid any need for "deferred evaluation":
#!/bin/bash
# initialize your flag as unset
unset needs_group
# depending on your application logic, optionally set that flag
if [[ $application_logic_here ]]; then
needs_group=1
fi
# ...so, the actual group can be defined later, when it's known...
group=trekkies
# and then check the flag to determine whether to pass the argument:
yourcommand ${needs_group+--group="$group"}
If you don't need the flag to be separate from the group variable, this is even easier:
# pass --group="$group" only if "$group" is a defined shell variable
yourcommand ${group+--group="$group"}
The relevant syntax is a parameter expansion: ${var+value} expands to value only if var is defined; and unlike most parameter expansions, its value can parse to multiple words with quoting applied.
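To check that claim about word splitting, here is a small sketch. The function show_args is a hypothetical stand-in for yourcommand that reports how many arguments it received and what the first one was:

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for yourcommand: prints "<argument count>:<first argument>".
show_args() { echo "$#:${1-}"; }

unset group
without=$(show_args ${group+--group="$group"})

group="star trek fans"
with=$(show_args ${group+--group="$group"})

echo "$without"   # prints '0:'
echo "$with"      # prints '1:--group=star trek fans'
```

When group is unset, the expansion produces no word at all rather than an empty argument. When it is set, the inner quotes keep the value together as a single argument even though it contains spaces.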
Alternately: One-Liner Function Shims
Here, you really are defining --group="$group" before the group is known:
#!/bin/bash
if [[ $application_logic_here ]]; then
with_optional_group() { "$@" --group="$group"; }
else
with_optional_group() { "$@"; }
fi
group=trekkies
with_optional_group yourcommand

Bash typeset (declare) as integer without variable

Is it possible to typeset -i (synonymous with declare -i, see a manpage or a reference) in bash without assigning to a variable?
Consider the following example:
typeset -i a=42;
foo $a;
Is it possible to achieve the same functionality without using a helper variable?
Assume foo is not editable (for example, a binary) with reasonable ease.
Put the declaration of type inside the function's body. You can use either declare or (to be more explicit) local for this:
foo() {
local -i arg=$1
....
}
No other solution is possible without modifying the function's body (or adding a wrapper which performs typechecking before passing the arguments as untyped strings), as arguments to functions (and to external commands) are passed as strings, regardless of any type declarations which may have been made beforehand.
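The wrapper option mentioned above could be sketched roughly as follows. The name foo_int is invented for this example, and the foo function here is only a placeholder for the real, uneditable command:

```shell
#!/usr/bin/env bash
# Placeholder for the real foo, which is assumed to be an uneditable binary.
foo() { echo "foo got: $1"; }

# Hypothetical wrapper: check that the argument looks like an integer, then pass it on.
foo_int() {
  if [[ ${1-} =~ ^[+-]?[0-9]+$ ]]; then
    foo "$1"
  else
    echo "foo_int: not an integer: ${1-}" >&2
    return 1
  fi
}

foo_int 42                                   # prints 'foo got: 42'
foo_int abc 2>/dev/null || echo "rejected"   # prints 'rejected'
```

Note that the value still reaches foo as a string; the wrapper only validates it first.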

Scope of variables in KSH

I have written a sample KornShell function to split a String, put it in an array and then print out the values.
The code is as below
#!/usr/bin/ksh
splitString() {
string="abc@hotmail.com;xyz@gmail.com;uvw@yahoo.com"
oIFS="$IFS";
IFS=';'
set -A str $string
IFS="$oIFS"
}
splitString
echo "strings count = ${#str[@]}"
echo "first : ${str[0]}";
echo "second: ${str[1]}";
echo "third : ${str[2]}";
Now the echo does not print out the values of the array, so I assume it has something to do with the scope of the array defined.
I am new to Shell scripting, can anybody help me out with understanding the scope of variables in the example above?
The default scope of a variable is the whole script.
However, when you declare a variable inside a function, the variable becomes local to the function that declares it. Ksh has dynamic scoping, so the variable is also accessible in functions that are invoked by the function that declares the variable. This is tersely documented in the section on functions in the manual. Note that in AT&T ksh (as opposed to pdksh and derivatives, and the similar features of bash and zsh), this only applies to functions defined with the function keyword, not to functions defined with the traditional f () { … } syntax. In AT&T ksh93, all variables declared in functions defined with the traditional syntax are global.
The main way of declaring a variable is with the typeset builtin. It always makes a variable local (in AT&T ksh, only in functions declared with function). If you assign to a variable without having declared it with typeset, it's global.
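Dynamic scoping is easiest to see in a tiny example. The sketch below is written for bash rather than ksh (bash's typeset behaves similarly inside any function, whichever syntax defines it), and the function names outer and inner are made up for the demo:

```shell
#!/usr/bin/env bash
# bash sketch of dynamic scoping; in AT&T ksh93 this needs the "function name" syntax.
inner() { echo "inner sees: ${msg-unset}"; }
outer() { typeset msg="set in outer"; inner; }

from_outer=$(outer)   # inner is called by outer, so it sees outer's local msg
from_top=$(inner)     # called from the top level, there is no msg at all

echo "$from_outer"    # prints 'inner sees: set in outer'
echo "$from_top"      # prints 'inner sees: unset'
```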
The ksh documentation does not specify whether set -A makes a variable local or global, and different versions make it either. Under ksh 93u, pdksh or mksh, the variable is global and your script does print out the value. You appear to have ksh88 or an older version of ksh where the scope is local. I think that initializing str outside the function would create a global variable, but I'm not sure.
Note that you should use a local variable to override the value of IFS: saving to another variable is not only clumsy, it's also brittle because it doesn't restore IFS properly if it was unset. Furthermore, you should turn off globbing, because otherwise, if the string contains shell globbing characters ?*\[ and one of the words happens to match one or more files on your system, it will be expanded. For example, set -A str $string where string is a;* will result in str containing a followed by the list of file names in the current directory.
set -A str
function splitString {
typeset IFS=';' globbing=1
case $- in *f*) globbing=;; esac
set -f
set -A str $string
if [ -n "$globbing" ]; then set +f; fi
}
splitString "$string"
Variables are normally global to the shell they're defined in from the time they're defined.
The typeset command can make them local to the function they're defined in, or alternatively make them automatically exported (even when they're updated).
Read up on "typeset" and "integer" in the manpage, or Korn's book.

Associative arrays are local by default

Associative arrays seem to be local by default when declared inside a function body, where they should be global. The following code
#!/bin/bash
f() {
declare -A map
map[x]=a
map[y]=b
}
f
echo x: ${map[x]} y: ${map[y]}
produces the output:
x: y:
while this
#!/bin/bash
declare -A map
f() {
map[x]=a
map[y]=b
}
f
echo x: ${map[x]} y: ${map[y]}
produces the output:
x: a y: b
Is it possible to declare a global associative array within a function?
Or what work-around can be used?
From: Greg Wooledge
Sent: Tue, 23 Aug 2011 06:53:27 -0700
Subject: Re: YAQAGV (Yet Another Question About Global Variables)
bash 4.2 adds "declare -g" to create global variables from within a
function.
Thank you Greg! However Debian Squeeze still has Bash 4.1.5
Fine, 4.2 adds "declare -g" but it's buggy for associative arrays so it doesn't (yet) answer the question. Here's my bug report and Chet's confirmation that there's a fix scheduled for the next release.
http://lists.gnu.org/archive/html/bug-bash/2013-09/msg00025.html
But I've serendipitously found a workaround: instead of declaring the array and assigning an initial value to it at the same time, first declare the array and then do the assignment. That is, don't do this:
declare -gA a=([x]=1 [y]=2)
but this instead:
declare -gA a; a=([x]=1 [y]=2)
You have already answered your own question with declare -g. The workaround on bash versions < 4.2 is to declare the array outside of the function.
f() {
map[y]=foo
}
declare -A map
f
echo "${map[y]}"
This example declares a global associative array variable inside a function, in bash.
set -euf +x -o pipefail # There is no place for implicit errors in this script.
function init_arrays(){
# FYI. Multiple array declarations are not a problem. You can invoke it multiple times.
# The "-gA" switch is the trick for the global array declaration inside a function.
declare -gA my_var
}
function do_work(){
init_arrays
my_var[$1]=OPPA
}
do_work aa
echo ${my_var[aa]}
echo It is expected to get OPPA value printed above
Tested on GNU bash, version 4.4...
Important notes.
The declare -A command doesn't actually create an associative array immediately; it just sets an attribute on a variable name which allows you to assign to the name as an associative array. The array itself doesn't exist until the first assignment (!!!).
(I wanted to see a complete working example in this thread, sorry.)
For those who are stuck with a Bash version < 4.2 and are not comfortable with the proposed workarounds, I share my custom implementation of global associative arrays. It does not have the full power of Bash associative arrays, and you need to be careful about special characters in the array index, but it gets the job done.
get_array(){
local arr_name="$1"
local arr_key="$2"
arr_namekey_var="ASSOCARRAY__${arr_name}__${arr_key}"
echo "${!arr_namekey_var:=}"
}
set_array(){
local arr_name="$1"
local arr_key="$2"
local arr_value="$3"
arr_namekey_var="ASSOCARRAY__${arr_name}__${arr_key}"
if [[ -z "${arr_value}" ]]; then
eval ${arr_namekey_var}=
else
printf -v "${arr_namekey_var}" '%s' "${arr_value}" # '%s' keeps % and \ in the value from being treated as format directives
fi
}
A few notes:
Array name and array key could be combined into a single value, but the split proved convenient in practice.
__ as a separator can be hacked by malicious or careless use -- to be on the safe side use only single-underscore values in array name and key, on top of only using alphanumeric values. Of course the composition of the internal variable (separators, prefix, suffix...) can be adjusted to application and developer needs.
The default value expansion guarantees that an undefined array key (and also array name!) will expand to the null string.
Once you move to a version of bash where you are comfortable with builtin associative arrays, these two procedures can be used as wrappers for actual associative arrays without having to refactor the whole code base.
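As a rough idea of what that migration might look like, here is one possible sketch of the same two procedures backed by real associative arrays. It is not the original author's code: it assumes Bash 4.3 or later (for declare -g and namerefs), the nameref name arr_ref is made up, and the array name passed in must not itself be arr_ref:

```shell
#!/usr/bin/env bash
# Assumed rewrite: same call signatures as above, backed by builtin associative arrays.
set_array() {
  local arr_name="$1" arr_key="$2" arr_value="$3"
  declare -gA "$arr_name"          # make sure a global associative array of that name exists
  local -n arr_ref="$arr_name"     # nameref to that global array
  arr_ref["$arr_key"]="$arr_value"
}

get_array() {
  local arr_name="$1" arr_key="$2"
  declare -gA "$arr_name"
  local -n arr_ref="$arr_name"
  printf '%s\n' "${arr_ref[$arr_key]-}"   # missing keys expand to the null string
}

set_array colors sky "light blue"
get_array colors sky        # prints 'light blue'
get_array colors missing    # prints an empty line
```

Callers keep using get_array and set_array unchanged. Unlike the variable-name encoding above, keys may contain arbitrary characters, though builtin associative arrays do not accept an empty key.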

Resources