Scope of variables in KSH - shell

I have written a sample KornShell function to split a String, put it in an array and then print out the values.
The code is as below
#!/usr/bin/ksh
splitString() {
string="abc@hotmail.com;xyz@gmail.com;uvw@yahoo.com"
oIFS="$IFS";
IFS=';'
set -A str $string
IFS="$oIFS"
}
splitString
echo "strings count = ${#str[@]}"
echo "first : ${str[0]}";
echo "second: ${str[1]}";
echo "third : ${str[2]}";
Now the echo does not print out the values of the array, so I assume it has something to do with the scope of the array defined.
I am new to Shell scripting, can anybody help me out with understanding the scope of variables in the example above?

The default scope of a variable is the whole script.
However, when you declare a variable inside a function, the variable becomes local to the function that declares it. Ksh has dynamic scoping, so the variable is also accessible in functions that are invoked by the function that declares the variable. This is tersely documented in the section on functions in the manual. Note that in AT&T ksh (as opposed to pdksh and derivatives, and the similar features of bash and zsh), this only applies to functions defined with the function keyword, not to functions defined with the traditional f () { … } syntax. In AT&T ksh93, all variables declared in functions defined with the traditional syntax are global.
The main way of declaring a variable is with the typeset builtin. It always makes a variable local (in AT&T ksh, only in functions declared with function). If you assign to a variable without having declared it with typeset, it's global.
The ksh documentation does not specify whether set -A makes a variable local or global, and different versions behave differently. Under ksh 93u, pdksh or mksh, the variable is global and your script does print out the values. You appear to have ksh88 or an older version of ksh where the scope is local. I think that initializing str outside the function would create a global variable, but I'm not sure.
Note that you should use a local variable to override the value of IFS: saving to another variable is not only clumsy, it's also brittle because it doesn't restore IFS properly if it was unset. Furthermore, you should turn off globbing, because otherwise, if the string contains shell globbing characters ?*\[ and one of the words happens to match one or more files on your system, it will be expanded, e.g. set -A str $string where string is a;* will result in str containing a followed by the list of file names in the current directory.
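set -A str                          # create str outside the function
function splitString {
  typeset IFS=';' globbing=1        # both are local to the function
  case $- in *f*) globbing=;; esac  # globbing was already off: leave it off afterwards
  set -f
  set -A str $1                     # split the function's argument on ';'
  if [ -n "$globbing" ]; then set +f; fi
}
splitString "$string"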
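For readers on bash rather than ksh, the same two precautions (a function-local IFS and temporarily disabled globbing) can be sketched roughly as below. This is only an illustration: the function name split_on_semicolon and the global array parts are made up for the example, and bash's array assignment stands in for set -A.

```shell
#!/usr/bin/env bash
# Hypothetical bash counterpart of splitString; it fills the global array "parts".
split_on_semicolon() {
  local IFS=';'                          # local override, the caller's IFS is untouched
  local restore_glob=1
  case $- in *f*) restore_glob= ;; esac  # remember whether globbing was already off
  set -f                                 # no pathname expansion while splitting
  parts=( $1 )                           # unquoted on purpose: split on ';'
  if [ -n "$restore_glob" ]; then set +f; fi
}

split_on_semicolon 'a;*;c'
printf '%s\n' "count=${#parts[@]}" "second=${parts[1]}"
```

The second word stays a literal * instead of turning into a list of file names, which is the failure mode described above.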

Variables are normally global to the shell they're defined in, from the time they're defined.
The typeset command can make them local to the function they're defined in, or alternatively make them automatically exported (even when they're updated).
Read up on "typeset" and "integer" in the manpage, or in Korn's book.

Related

Why would I use declare / typeset in a shell script instead of just X=y?

I've recently come across a shell script that uses
declare -- FOO="" which apparently is spelled typeset -- FOO="" in non-bash shells.
Why might I want to do that instead of plain FOO="" or export FOO?
The most important purpose of using declare is to control scope, or to use array types that aren't otherwise accessible.
Using Function-Local Variables
To give you an example:
print_dashes() { for (( i=0; i<10; i++ )); do printf '-'; done; echo; }
while read -p "Enter a number: " i; do
print_dashes
echo "You entered: $i"
done
You'd expect that to print the number the user entered, right? But instead, it'll always print the value of i that print_dashes leaves when it's complete.
Consider instead:
print_dashes() {
declare i # ''local i'' would also have the desired effect
for (( i=0; i<10; i++ )); do printf '-'; done; echo;
}
...now i is local, so the newly-assigned value doesn't last beyond its invocation.
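A non-interactive way to see the difference, as a small sketch (the names count_leaky and count_local are invented for this example):

```shell
#!/usr/bin/env bash
# Both helpers loop three times; only the second declares its counter local.
count_leaky() { for (( i = 0; i < 3; i++ )); do :; done; }
count_local() { local i; for (( i = 0; i < 3; i++ )); do :; done; }

i=42
count_local; after_local=$i   # the caller's i is untouched
count_leaky; after_leaky=$i   # the caller's i was overwritten by the loop
echo "after count_local: $after_local, after count_leaky: $after_leaky"
```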
Declaring Explicitly Global Variables
Contrariwise, you sometimes want to declare a global variable and make it clear to your code's readers that you're doing so by intent, or to do so while also declaring something as an array (or in any other case where declare, used inside a function, would otherwise implicitly make the variable local). You can do that too:
myfunc() {
declare arg # make arg local
declare -g -A myfunc_args_seen # make myfunc_args_seen a global associative array
for arg; do
myfunc_args_seen["$arg"]=1
done
echo "Across all invocations of myfunc, we have seen the following arguments:"
printf ' - %q\n' "${!myfunc_args_seen[@]}"
}
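To check that the array really does survive across calls, here is a cut-down sketch of the same pattern. The names record_seen and seen_args are hypothetical, and declare -g needs bash 4.2 or later.

```shell
#!/usr/bin/env bash
# record_seen and seen_args are hypothetical names for this illustration.
record_seen() {
  local arg
  declare -g -A seen_args   # without -g this array would be local to each call
  for arg; do
    seen_args["$arg"]=1
  done
}

record_seen alpha beta
record_seen beta gamma
echo "distinct arguments seen: ${#seen_args[@]}"
```

Dropping the -g should leave seen_args unset after each call returns, since declare inside a function otherwise behaves like local.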
Declaring Associative Arrays
Normal shell arrays can just be assigned: my_arr=( one two three )
However, that's not the case for associative arrays, which are keyed as strings. For those, you need to declare them:
declare -A my_arr=( ["one"]=1 ["two"]=2 ["three"]=3 )
declare -i cnt=0
declares an integer-only variable, which is faster for math and always evaluates in arithmetic context.
declare -l lower="$1"
declares a variable that automatically lowercases anything put in it, without any special syntax on access.
declare -r unchangeable="$constant"
declares a variable read-only.
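A short sketch of how these three attributes behave in practice. The variable names are placeholders, and declare -l assumes bash 4.0 or later.

```shell
#!/usr/bin/env bash
declare -i cnt=0
cnt+=5            # arithmetic addition, not string concatenation
cnt="cnt * 2"     # the right-hand side is evaluated as an arithmetic expression

declare -l lower="MiXeD Case"   # stored in lowercase

declare -r fixed="constant"
attrs=$(declare -p fixed)       # shows the attributes bash recorded

echo "cnt=$cnt lower=$lower attrs=$attrs"
```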
Take a look at https://unix.stackexchange.com/questions/254367/in-bash-scripting-whats-the-different-between-declare-and-a-normal-variable for some useful discussion - you might not need these things often, but if you don't know what's available you're likely to work harder than you should.
A great reason to use declare, typeset, and/or readonly is code compartmentalization and reuse (i.e. encapsulation). You can write code in one script that can be sourced by others.
(Note that declared/typeset/readonly constants, variables, and functions lose their "readonly-ness" in a separately executed child script, because attributes are not passed through the environment, but they retain it when a child script sources their defining script, since sourcing loads a script into the current shell rather than into a new process.)
Since sourcing loads code from the script into the current shell though, the namespaces will overlap. To prevent a variable in a child script from being overwritten by its parent (or vice-versa, depending on where the script is sourced and the variable used), you can declare a variable readonly so it won't get overwritten.
You have to be careful with this because once you declare something readonly, you cannot unset it, so you do not want to declare something readonly that might naturally be redefined in another script. For example, if you're writing a library for general use that has logging functions, you might not want to use readonly -f on a function called warn, error, or info, since it is likely other scripts will create similar logging functions of their own with that name. In this case, it is actually standard practice to prefix the function, variable, and/or constant name with the name of the defining script and then make it readonly (e.g. my_script_warn, my_script_error, etc.). This preserves the functions, variables, and/or constants that the defining script's logic relies on, so sourcing scripts can't overwrite them and cause accidental failures.
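As a rough illustration of the prefix-then-readonly convention in bash: my_script is just a stand-in for the library's name, and the message text is made up.

```shell
#!/usr/bin/env bash
# "my_script" stands in for the name of the library being sourced (hypothetical).
my_script_warn() { printf 'my_script: warning: %s\n' "$*" >&2; }
readonly -f my_script_warn

# A later attempt to redefine the function is rejected by bash.
if ! eval 'my_script_warn() { :; }' 2>/dev/null; then
  redefinition_blocked=yes
fi

msg=$(my_script_warn 'disk almost full' 2>&1)
echo "$msg (redefinition blocked: ${redefinition_blocked:-no})"
```

A sourcing script remains free to define its own plain warn, because the protected name carries the prefix.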

Bash: Hide global variable using local variable with same name

I'd like to use a global variable in a function but don't want the change to go outside the function. So I defined a local variable initialized to the value of the global variable. The global variable has a great name, so I want to use the same name on the local variable. This seems doable in Bash, but I'm not sure if this is undefined behavior.
#!/bin/bash
a=3
echo $a
foo() {
local a=$a ## defined or undefined?
a=4
echo $a
}
foo
echo $a
Gives output:
3
4
3
Expansion happens before assignment (early on), as the documentation states:
Expansion is performed on the command line after it has been split into words.
So the behavior should be predictable (and defined). In local a=$a when expanding $a it's still the global one. The command execution (assignment/declaration) happens later (when $a has already been replaced by its value).
However, having what are essentially two different scope-dependent variables with the same name (so that they appear to be one and the same) could get confusing. So I'd rather question the wisdom of doing this on grounds of coding practice, readability, and ease of navigation.
There is a new shell option in Bash 5.0, localvar_inherit, to have local variables with the same name inherit the value of a variable with the same name in the preceding scope:
#!/usr/bin/env bash
shopt -s localvar_inherit
myfunc() {
local globalvar
echo "In call: $globalvar"
globalvar='local'
echo "In call, after setting: $globalvar"
}
globalvar='global'
echo "Before call: $globalvar"
myfunc
echo "After call: $globalvar"
with the following output:
Before call: global
In call: global
In call, after setting: local
After call: global
If you don't have Bash 5.0, you have to set the value in the function, as you did in your question, with the same result.

Deferred evaluation of bash variables

I need to define a string (options) which contains a variable (group) that is going to be available later in the script.
This is what I came up with, using a literal string that gets evaluated later.
#!/bin/bash
options='--group="$group"' #$group is not available at this point
#
# Some code...
#
group='trekkie'
eval echo "$options" # the result is used elsewhere
It works, however it makes use of eval which I would like to avoid if not absolutely necessary (I don't want to risk potential problems because of unpredictable data).
I've asked for help in multiple places and I've got a couple of answers that were directing me to use indirect variables.
The problem is I simply fail to see how indirect variables might help me with my problem. As far as I understand they only offer a way of indirectly referencing other variables like this:
options="--group="$group""
a=options
group='trekkies'
echo "${!a}" # spits out --group=
I would also like to avoid using functions if possible because I don't want to make things more complicated than they need to be.
More Idiomatic: Using Parameter Expansion
Don't attempt to define the --group="$group" argument up-front when you don't yet know the group name; instead, set a flag that indicates whether the argument is needed, and honor that flag when forming your final argument list.
By going the below approach, you avoid any need for "deferred evaluation":
#!/bin/bash
# initialize your flag as unset
unset needs_group
# depending on your application logic, optionally set that flag
if [[ $application_logic_here ]]; then
needs_group=1
fi
# ...so, the actual group can be defined later, when it's known...
group=trekkies
# and then check the flag to determine whether to pass the argument:
yourcommand ${needs_group+--group="$group"}
If you don't need the flag to be separate from the group variable, this is even easier:
# pass --group="$group" only if "$group" is a defined shell variable
yourcommand ${group+--group="$group"}
The relevant syntax is a parameter expansion: ${var+value} expands to value only if var is defined; and unlike most parameter expansions, its value can parse to multiple words with quoting applied.
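A quick way to convince yourself of that behaviour is to collect the expansion in an array and count the words. In this sketch build_args and args are hypothetical names used only for the demonstration.

```shell
#!/usr/bin/env bash
# build_args is a hypothetical helper that collects the optional argument in an array.
build_args() {
  args=( ${group+--group="$group"} )
}

unset group
build_args; count_unset=${#args[@]}                       # no words at all

group='star trek'
build_args; count_set=${#args[@]}; first_set=${args[0]}   # one word, space preserved

group=''
build_args; first_empty=${args[0]}                        # still set, so the option is passed empty

echo "unset: $count_unset words, set: $count_set word ($first_set), empty: $first_empty"
```

If an empty value should also suppress the option, ${group:+--group="$group"} tests for "set and non-empty" instead of just "set".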
Alternately: One-Liner Function Shims
Here, you really are defining --group="$group" before the group is known:
#!/bin/bash
if [[ $application_logic_here ]]; then
with_optional_group() { "$@" --group="$group"; }
else
with_optional_group() { "$@"; }
fi
group=trekkies
with_optional_group yourcommand

Use variable's value to get another variable [duplicate]

I am trying to create an environment variable in bash script, user will input the name of environment variable to be created and will input its value as well.
this is a hard coded way just to elaborate my question :
#!/bin/bash
echo Hello
export varName="nameX" #
echo $varName
export "$varName"="val" #here I am trying to create an environment
#variable whose name is nameX and assigning it value val
echo $nameX
it works fine
its output is:
Hello
nameX
val
But, I want a generic code. So I am trying to take input from user the name of variable and its value but I am having trouble in it. I don't know how to echo variable whose name is user-defined
echo "enter the environment variable name"
read varName
echo "enter the value to be assigned to env variable"
read value
export "$varName"=$value
Now, I don't know how to echo environment variable
if I do like this :
echo "$varName"
it outputs the name that user has given to environment variable not the value that is assigned to it. how to echo value in it?
Thanks
To get closure: the OP's question boils down to this:
How can I get the value of a variable whose name is stored in another variable in bash?
var='value' # the target variable
varName='var' # the variable storing $var's *name*
gniourf_gniourf provided the solution in a comment:
Use bash's indirection expansion feature:
echo "${!varName}" # -> 'value'
The ! preceding varName tells bash not to return the value of $varName, but the value of the variable whose name is the value of $varName.
The enclosing curly braces ({ and }) are required, unlike with direct variable references (typically).
See https://www.gnu.org/software/bash/manual/html_node/Shell-Parameter-Expansion.html
The page above also describes the forms ${!prefix@} and ${!prefix*}, which return a list of variable names that start with prefix.
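The two forms combine naturally: list the names with a prefix expansion, then read each value through indirection. A minimal sketch, where the demo_cfg_ prefix and both variables are invented for the example:

```shell
#!/usr/bin/env bash
# The demo_cfg_ prefix and both variables are made up for this illustration.
demo_cfg_host=localhost
demo_cfg_port=8080

names=( "${!demo_cfg_@}" )              # names of all variables starting with demo_cfg_
for name in "${names[@]}"; do
  printf '%s=%s\n' "$name" "${!name}"   # ${!name} looks up the variable called $name
done
```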
bash 4.3+ supports a more flexible mechanism: namerefs, via declare -n or, inside functions, local -n:
Note: For the specific use case at hand, indirect expansion is the simpler solution.
var='value'
declare -n varAlias='var' # $varAlias is now another name for $var
echo "$varAlias" # -> 'value' - same as $var
The advantage of this approach is that the nameref is effectively just another name for the original variable (storage location), so you can also assign to the nameref to update the original variable:
varAlias='new value' # assign a new value to the nameref
echo "$var" # -> 'new value' - the original variable has been updated
See https://www.gnu.org/software/bash/manual/html_node/Shell-Parameters.html
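Inside a function, the local -n form lets a helper write to whichever variable the caller names. The helper set_by_name below is a hypothetical example, not something from the manual, and it assumes bash 4.3 or later.

```shell
#!/usr/bin/env bash
# set_by_name is a hypothetical helper; it requires bash 4.3 or later.
set_by_name() {
  local -n target_ref=$1   # target_ref becomes an alias for the variable named in $1
  target_ref=$2            # assigning through the alias updates that variable
}

greeting='hello'
set_by_name greeting 'goodbye'
echo "$greeting"
```

One wrinkle to keep in mind: the caller's variable should not share the nameref's own name (target_ref here), or bash will complain about a circular name reference.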
Compatibility note:
Indirect expansion and namerefs are NOT POSIX-compliant; a strictly POSIX-compliant shell will have neither feature.
ksh and zsh have comparable features, but with different syntax.

Resources