Global variables: Arrays not behaving like other variables - debugging

I have a PowerShell script that is made up of a main PS1 file that then loads a number of Modules. In one of those modules I define a variable $global:locationsXml and then proceed to add to it without the global flag, and it works great. I can reference it without the global flag from any other module.
However, I also define a $global:loadedDefinitions = @() array and add to it. But I have to refer to this variable with the global flag when adding to it with +=. I can reference it in any other module without the global flag, but in the creating module I need it. And that module is the same one where the xml variable works differently/correctly.
I also have a Hash Table that I define without the global flag, but in the top level script that loads all the modules, and that I can reference without the global flag from anywhere. Additionally I have tried initializing the problem array in the parent script, like the Hash Table, but still the array requires the global flag in the module that populates it. But NOT in a different module that just reads it.
All of this is currently being tested in Windows 7 and PS 2.0.
So, before I go tearing things apart I wonder; is there a known bug, where global arrays behave differently from other global variables, specifically when being written to in a module?
I guess including the global flag for writing to the few arrays I need won't be a big deal, but I would like to understand what is going on, especially if it is somehow intended behavior rather than a bug.
Edit: To clarify, this works
Script:
Define Hash Table without global specifier;
Load Module;
Call Function in Module;
Read and write Hash Table without global specifier;
And this works
Script:
Load Module;
Call Function in Module;
Initialize Array with global specifier;
Append to Array with global specifier;
Reference Array from anywhere else WITHOUT global specifier;
This doesn't
Script:
Load Module;
Call Function in Module;
Initialize Array WITH global specifier;
Append to Array without global specifier;
Reference Array from anywhere fails;
This approach, of only initializing the variable with the global specifier and then referencing without it works for other variables, but not for arrays, "seems" to be the behavior/bug I am seeing. It is doubly odd that the global specifier only needs to be used in the module where the Array is initialized, not in any other module. I have yet to verify if it is also just in the function where it is initialized, and/or just writing to the array, not reading.

When you read from a variable without a scope specifier, PowerShell first looks for the variable in the current scope; if it finds nothing, it moves on to the parent scope, and so on, until it finds the variable or reaches the global scope. When you write to a variable without a scope specifier, PowerShell writes to that variable in the current scope only.
Set-StrictMode -Version Latest #To produce VariableIsUndefined error.
&{
$global:a=1
$global:a #1
$local:a # Error VariableIsUndefined.
$a #1 Refers to global $a, as there is no $a in the current scope.
$a=2 # Creates variable $a in the current scope.
$global:a #1 Global variable still has the old value.
$local:a #2 New local variable has the new value.
$a #2 Refers to local $a.
}
Calling an object's methods, or its property and indexer accessors (including set accessors), only reads from the variable. Writing to an object is different from writing to a variable.
Set-StrictMode -Version Latest #To produce VariableIsUndefined error.
&{
$global:a=1..3
$global:a-join',' #1,2,3
$local:a -join',' # Error VariableIsUndefined.
$a -join',' #1,2,3 Refers to global $a, as there is no $a in the current scope.
$a[0]=4; # Writes to the object (Array), not to the variable; the variable is only read here.
$global:a-join',' #4,2,3 Global variable has different content now.
$local:a -join',' # Still an error: there is no local variable yet.
$a -join',' #4,2,3 Refers to global $a, as there is no $a in the current scope.
$a+=5 # In PowerShell V2 this is equivalent to $a=$a+5.
# There are two references to $a here.
# The first refers to local $a, as it is a write to the variable.
# The second refers to global $a, as there is no $a in the current scope.
# The $a+5 expression creates a new object, and you assign it to the local variable.
$global:a-join',' #4,2,3 Global variable has the old value.
$local:a -join',' #4,2,3,5 But now you have a local variable with the new value.
$a -join',' #4,2,3,5 Refers to local $a.
}
So if you want to write to a global variable from a non-global scope, you have to use the global scope specifier. But if you only want to read from a global variable that is not hidden by a local variable with the same name, you may omit the global scope specifier.

Related

Bash local variable scope best practice

I've seen that some people, when writing bash scripts, declare local variables at the top of the function, before any if/else statement, as in example 1.
Example 1:
#!/bin/bash
function ok() {
local animal
if [ ${A} ]; then
animal="zebra"
fi
echo "$animal"
}
A=true
ok
Here is another example that does the same thing, with the declaration inside the if:
Example 2:
#!/bin/bash
function ok() {
if [ ${A} ]; then
local animal
animal="zebra"
fi
echo "$animal"
}
A=true
ok
So, both examples print the same result, but which one is the best practice to follow? I prefer example 2, but I've seen a lot of people declaring local variables at the top of a function, as in example 1. Would it be better to declare all local variables at the top, like below:
function ok() {
# all local variable declaration must be here
# Next statement
}
the best practice to follow
Check your scripts with https://shellcheck.net .
Quote variable expansions. Don't $var, do "$var". https://mywiki.wooledge.org/Quotes
For script local variables, prefer to use lowercase variable names. For exported variables, use upper case and unique variable names.
Do not use function name(). Use name(). https://wiki.bash-hackers.org/scripting/obsolete
Document the usage of global variables a=true. Or add local before using variables local a; then a=true. https://google.github.io/styleguide/shellguide.html#s4.2-function-comments
scope best practice
Generally, use the smallest scope possible. Keep stuff close to each other. Put local close to the variable usage. (This is like the rule from C or C++, to define a variable close to its usage, but unlike in C or C++, in shell declaration and assignment should be on separate lines).
Note that your examples are not the same. If variable A (or a) is an empty string, the first version will print an empty line (the local animal variable is empty), while the second version will print the value of the global variable animal (there was no local). Although the scope should be as small as possible, animal is used outside of the if, so local should also be outside.
The local command constrains the variables declared to the function scope.
With that said, you can deduce that doing so inside an if block will be the same as if you did outside of it, as long as it's inside of a function.
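The difference described above, where the if branch is skipped, can be sketched as follows. This is only an illustration: the function names local_at_top and local_in_if and the global value "global lion" are made up for the demo, and local animal= (explicit empty initialization) is used so the result does not depend on which shell runs it.

```shell
#!/bin/bash
# Hypothetical setup: a global "animal" exists and A is empty,
# so the if-branch in both functions is skipped.
animal="global lion"
A=""

local_at_top() {
    local animal=           # shadows the global for the whole function
    if [ -n "$A" ]; then
        animal="zebra"
    fi
    echo "top: '$animal'"
}

local_in_if() {
    if [ -n "$A" ]; then
        local animal        # only runs when the branch is taken
        animal="zebra"
    fi
    echo "in-if: '$animal'" # falls through to the global here
}

out_top=$(local_at_top)
out_if=$(local_in_if)
echo "$out_top"   # top: ''
echo "$out_if"    # in-if: 'global lion'
```

With A=true both functions print zebra, which is why the two forms look equivalent at first glance.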

The scope of local variables in sh

I've got quite a lot of headaches trying to debug my recursive function. It turns out that Dash interprets local variables strangely. Consider the following snippet:
iteration=0;
MyFunction()
{
local my_variable;
iteration=$(($iteration + 1));
if [ $iteration -lt 2 ]; then
my_variable="before recursion";
MyFunction
else
echo "The value of my_variable during recursion: '$my_variable'";
fi
}
MyFunction
In Bash, the result is:
The value of my_variable during recursion: ''
But in Dash, it is:
The value of my_variable during recursion: 'before recursion'
Looks like Dash makes the local variables available across the same function name. What is the point of this and how can I avoid issues when I don't know when and which recursive iteration changed the value of a variable?
local is not part of the POSIX specification, so bash and dash are free to implement it any way they like.
dash does not allow assignments with local, so the variable is unset unless it inherits a value from a surrounding scope. (In this case, the surrounding scope of the second iteration is the first iteration.)
bash does allow assignments (e.g., local x=3), and it always creates a variable with a default empty value unless an assignment is made.
This is a consequence of your attempt to read the variable in the inner-most invocation without having set it in there explicitly. In that case, the variable is indeed local to the function, but it inherits its initial value from the outer context (where you have it set to "before recursion").
The local marker on a variable thus only affects the value of the variable in the caller after the function invocation returned. If you set a local variable in a called function, its value will not affect the value of the same variable in the caller.
To quote the dash man page:
Variables may be declared to be local to a function by using a local command. This should appear as the first statement of a function, and the syntax is
local [variable | -] ...
Local is implemented as a builtin command.
When a variable is made local, it inherits the initial value and exported and readonly flags from the variable with the same name in the surrounding scope, if there is one. Otherwise, the variable is initially unset. The shell uses dynamic scoping, so that if you make the variable x local to function f, which then calls function g, references to the variable x made inside g will refer to the variable x declared inside f, not to the global variable named x.
The only special parameter that can be made local is “-”. Making “-” local causes any shell options that are changed via the set command inside the function to be restored to their original values when the function returns.
To be sure about the value of a variable in a specific context, make sure to always set it explicitly in that context. Else, you rely on "fallback" behavior of the various shells which might be different across shells.
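As a rough sketch of that advice, here is the recursive function from the question with one extra line that resets the variable explicitly on every invocation. The function is renamed my_function for the demo; everything else follows the question's snippet. With the explicit reset, the inherited value no longer matters, so the output should be the same whether the shell starts locals as empty or inherits them.

```shell
iteration=0

my_function() {
    local my_variable
    my_variable=""      # explicit reset: do not rely on the shell's default
    iteration=$((iteration + 1))
    if [ "$iteration" -lt 2 ]; then
        my_variable="before recursion"
        my_function
    else
        echo "The value of my_variable during recursion: '$my_variable'"
    fi
}

result=$(my_function)
echo "$result"   # The value of my_variable during recursion: ''
```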

Why is this Bash script not inheriting all environment variables?

I'm trying something very straightforward:
PEOPLE=(
"nick"
"bob"
)
export PEOPLE="$(IFS=, ; echo "${PEOPLE[*]}")"
echo "$PEOPLE" # prints 'nick,bob'
./process-people.sh
For some reason, process-people.sh isn't seeing $PEOPLE. As in, if I echo "$PEOPLE" from inside process-people.sh, it prints an empty line.
From what I understand, the child process created by invoking ./process-people.sh should inherit all the parent process's environment variables, including $PEOPLE.
Yet, I've tried this on both Bash 3.2.57(1)-release and 4.2.46(2)-release and it doesn't work.
What's going on here?
That's a neat solution you have there for joining the elements of a Bash array into a string. Did you know that in Bash you cannot export array variables to the environment? And if a variable is not in the environment, then the child process will not see it.
Ah. But you aren't exporting an array, are you. You're converting the array into a string and then exporting that. So it should work.
But this is Bash! Where various accidents of history conspire to give you the finger.
As @PesaThe and @chepner pointed out in the comments below, you cannot actually convert a Bash array variable to a string variable. According to the Bash reference on arrays:
Referencing an array variable without a subscript is equivalent to referencing with a subscript of 0.
So when you call export PEOPLE=... where PEOPLE was previously assigned an array value, what you're actually doing is PEOPLE[0]=.... Here's a fuller example:
PEOPLE=(
"nick"
"bob"
)
export PEOPLE="super"
echo "$PEOPLE" # masks the fact that PEOPLE is still an array and just prints 'super'
echo "${PEOPLE[*]}" # prints 'super bob'
It's unfortunate that the export silently fails to export the array to the environment (it returns 0), and it's confusing that Bash equates ARRAY_VARIABLE to ARRAY_VARIABLE[0] in certain situations. We'll just have to chalk that up to a combination of history and backwards compatibility.
Here's a working solution to your problem:
PEOPLE_ARRAY=(
"nick"
"bob"
)
export PEOPLE="$(IFS=, ; echo "${PEOPLE_ARRAY[*]}")"
echo "$PEOPLE" # prints 'nick,bob'
./process-people.sh
The key here is to assign the array and derived string to different variables. Since PEOPLE is a proper string variable, it will export just fine and process-people.sh will work as expected.
It's not possible to directly change a Bash array variable into a string variable. Once a variable is assigned an array value, it becomes an array variable. The only way to change it back to a string variable is to destroy it with unset and recreate it.
Bash has a couple of handy commands for inspecting variables that are useful for investigating these kinds of issues:
printenv PEOPLE # prints 'nick,bob'
declare -p PEOPLE_ARRAY # prints: declare -a PEOPLE_ARRAY='([0]="nick" [1]="bob")'
printenv will only return a value for environment variables, vs. echo, which will print a result whether the variable has been properly exported or not.
declare -p will show the full value of a variable, without the gotchas related to including or leaving out array index references (e.g. ARRAY_VARIABLE[*]).
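To check the whole round trip, a small sketch along these lines may help. It assumes bash is on the PATH, and the names people_array, PEOPLE_CSV and restored are invented for the illustration; the child process is simulated with bash -c rather than a real process-people.sh.

```shell
#!/bin/bash
people_array=("nick" "bob")

# Join in a subshell so the IFS change does not leak into the script.
PEOPLE_CSV="$(IFS=,; echo "${people_array[*]}")"
export PEOPLE_CSV

# A child process sees the exported string.
child_view=$(bash -c 'printf "%s" "$PEOPLE_CSV"')
echo "child sees: $child_view"

# The receiving side can split the string back into an array.
IFS=, read -r -a restored <<< "$child_view"
echo "restored ${#restored[@]} names: ${restored[0]} ${restored[1]}"
```

Note that this splitting approach only works when the names themselves contain no commas or newlines.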

Local variable declaration in /etc/init.d/functions

On RHEL, the daemon() function in /etc/init.d/functions is defined as follows:
daemon() {
# Test syntax.
local gotbase= force= nicelevel corelimit
local pid base= user= nice= bg= pid_file=
local cgroup=
nicelevel=0
... and so on ...
I'm trying to understand why some of the local variables are defined with an equals sign and some others not. What's happening here? Is this multiple declaration and assignment?
local varname
declares a local variable, but doesn't initialize it with any value.
local varname=value
declares a local variable, and also initializes it to value. You can initialize it to an empty string by providing an empty value, as in
local varname=
So in your example, pid is declared but not initialized, while base is declared and initialized to an empty string.
For most purposes there's not much difference between an unset variable and having an empty string as the value. But some of the parameter expansion operators can distinguish them. E.g.
${varname:-default}
will expand to default if varname is unset or empty, but
${varname-default}
will expand to default only if varname is unset. So if you use
${base-default}
it will expand to the empty string, not default.
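A quick way to see all four combinations side by side is a sketch like the one below. The function name show_expansions is made up, and pid and base mirror the two kinds of declaration from the daemon() excerpt; it assumes no variable named pid is set in the calling scope.

```shell
show_expansions() {
    local pid base=     # pid: declared only; base: set to the empty string
    # Order: ${pid-default} ${pid:-default} ${base-default} ${base:-default}
    printf '%s|%s|%s|%s\n' \
        "${pid-default}" "${pid:-default}" \
        "${base-default}" "${base:-default}"
}

expansions=$(show_expansions)
echo "$expansions"   # default|default||default
```

Only the third field is empty: base is set (to the empty string), so the operator without the colon leaves it alone.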

Assigning one variable the value of another in bash

I am using a switcher that exports environment variables based upon which machine is being used. One of the variables that gets exported is ems_1.
Now, in my start.bash script, I am trying to assign a variable called provider with the value ems_1 has.
export provider = ems_1
This doesn't work. Any suggestions?
export provider=$ems_1
You need to reference variables using the $ sign. Also, an assignment of the form
variable=value
cannot have spaces around the = sign.
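Put together, a minimal sketch looks like this. The value "provider-a" is a placeholder standing in for whatever the switcher actually exports as ems_1, and the sh -c call is only there to confirm that a child process sees the new variable.

```shell
# Placeholder: in the real setup the switcher exports ems_1.
ems_1="provider-a"

# No spaces around "=", and "$" to read the value of ems_1.
export provider="$ems_1"

# Confirm that a child process inherits the exported variable.
child=$(sh -c 'printf "%s" "$provider"')
echo "provider=$provider (child sees: $child)"
```

Quoting "$ems_1" keeps the value intact even if it contains spaces.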
