Bash functions returning values meanwhile altering global variables - bash

I'm struggling with bash functions: I'm trying to return a string value while also modifying a global variable inside the function. An example:
MyGlobal="some value"
function ReturnAndAlter () {
MyGlobal="changed global value"
echo "Returned string"
}
str=$(ReturnAndAlter)
echo $str # prints out 'Returned string' as expected
echo $MyGlobal # prints out the initial value, not changed
This is because $(...) (and also `...` if used instead) causes the function to run in its own environment (a subshell), so the global variable in the calling shell is never affected.
I found a very dirty workaround by returning the value into another global variable and calling the function only by its name, but I think there should be a cleaner way to do it.
My dirty solution:
MyGlobal="some value"
ret_val=""
function ReturnAndAlter () {
ret_val="Returned string"
MyGlobal="changed value"
}
ReturnAndAlter # call the bare function
str=$ret_val # and assign using the auxiliary global ret_val
echo $str
echo $MyGlobal # Here both global variables are updated.
Any new ideas? Some way of calling functions that I'm missing?

Setting global variables is the only way a function has of communicating directly with the shell that calls it. The practice of "returning" a value by capturing the standard output is a bit of a hack necessitated by the shell's semantics, which are geared towards making it easy to call other programs, not making it easy to do things in the shell itself.
So, don't worry; no, you aren't missing any cool tricks. You're doing what the shell allows you to do.

The $(…) (command substitution) runs in a sub-shell.
All changes made inside the sub-shell are lost when the sub-shell closes.
It is usually a bad idea for a function to both print a result and change a variable. Either return everything through variables or return just one printed string.
There is no other solution.
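A minimal sketch of the behavior described in these answers. The function name bump and the global counter are hypothetical and do not appear in the question. The same function is called once through command substitution and once directly:

```shell
counter=0

bump() {
    counter=$((counter + 1))   # modifies the global
    echo "bumped"              # "returns" a string on stdout
}

captured=$(bump)        # runs in a subshell; the increment is discarded
after_capture=$counter

bump >/dev/null         # runs in the current shell; the increment persists
after_direct=$counter

echo "captured=$captured after_capture=$after_capture after_direct=$after_direct"
# prints: captured=bumped after_capture=0 after_direct=1
```

Only the direct call changes counter in the calling shell, which is why the answers suggest choosing one channel (variables or printed output) per function.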

Related

Bash local variable scope best practice

I've seen that some people, when writing bash scripts, define local variables inside an if/else statement, as in example 1.
Example 1:
#!/bin/bash
function ok() {
local animal
if [ ${A} ]; then
animal="zebra"
fi
echo "$animal"
}
A=true
ok
For another example, this is the same:
Example 2:
#!/bin/bash
function ok() {
if [ ${A} ]; then
local animal
animal="zebra"
fi
echo "$animal"
}
A=true
ok
So, both examples print the same result, but which one is the best practice to follow? I prefer example 2, but I've seen a lot of people declaring local variables inside a function like example 1. Would it be better to declare all local variables at the top, like below:
function ok() {
# all local variable declaration must be here
# Next statement
}
the best practice to follow
Check your scripts with https://shellcheck.net .
Quote variable expansions. Don't $var, do "$var". https://mywiki.wooledge.org/Quotes
For script local variables, prefer to use lowercase variable names. For exported variables, use upper case and unique variable names.
Do not use function name(). Use name(). https://wiki.bash-hackers.org/scripting/obsolete
Document the usage of global variables (here, a=true). Or declare the variable with local before using it: local a; then a=true. https://google.github.io/styleguide/shellguide.html#s4.2-function-comments
scope best practice
Generally, use the smallest scope possible. Keep stuff close to each other. Put local close to the variable usage. (This is like the rule from C or C++, to define a variable close to its usage, but unlike in C or C++, in shell declaration and assignment should be on separate lines).
Note that your examples are not the same. If variable A (or a) is an empty string, the first version prints an empty line (the local animal variable is empty), while the second version prints the value of the global variable animal (no local was declared). Although the scope should be as small as possible, animal is used outside of the if, so local should also be outside.
The local command constrains the variables declared to the function scope.
With that said, you can deduce that doing so inside an if block will be the same as if you did outside of it, as long as it's inside of a function.
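The difference between the two placements can be sketched as follows. The function names declared_at_top and declared_in_if and the global value "global cat" are hypothetical, chosen only for illustration, and the condition is passed as an argument instead of the global A:

```shell
animal="global cat"   # hypothetical global with the same name

declared_at_top() {
    local animal
    animal=""                  # explicit empty start, independent of the global
    if [ -n "$1" ]; then
        animal="zebra"
    fi
    echo "$animal"
}

declared_in_if() {
    if [ -n "$1" ]; then
        local animal           # function-scoped, not block-scoped
        animal="zebra"
    fi
    echo "$animal"             # reads the global when the branch was skipped
}

top_empty=$(declared_at_top "")
if_empty=$(declared_in_if "")
if_set=$(declared_in_if yes)
echo "top_empty='$top_empty' if_empty='$if_empty' if_set='$if_set'"
# prints: top_empty='' if_empty='global cat' if_set='zebra'
```

When the branch is taken, both versions behave the same. They differ only when it is skipped, because then no local was ever declared in the second version.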

The scope of local variables in sh

I've got quite a lot of headaches trying to debug my recursive function. It turns out that Dash interprets local variables strangely. Consider the following snippet:
iteration=0;
MyFunction()
{
local my_variable;
iteration=$(($iteration + 1));
if [ $iteration -lt 2 ]; then
my_variable="before recursion";
MyFunction
else
echo "The value of my_variable during recursion: '$my_variable'";
fi
}
MyFunction
In Bash, the result is:
The value of my_variable during recursion: ''
But in Dash, it is:
The value of my_variable during recursion: 'before recursion'
Looks like Dash makes the local variables available across the same function name. What is the point of this and how can I avoid issues when I don't know when and which recursive iteration changed the value of a variable?
local is not part of the POSIX specification, so bash and dash are free to implement it any way they like.
dash does not allow assignments with local, so the variable is unset unless it inherits a value from a surrounding scope. (In this case, the surrounding scope of the second iteration is the first iteration.)
bash does allow assignments (e.g., local x=3), and it always creates a variable with a default empty value unless an assignment is made.
This is a consequence of your attempt to read the variable in the inner-most invocation without having set it in there explicitly. In that case, the variable is indeed local to the function, but it inherits its initial value from the outer context (where you have it set to "before recursion").
The local marker on a variable thus only affects the value of the variable in the caller after the function invocation returned. If you set a local variable in a called function, its value will not affect the value of the same variable in the caller.
To quote the dash man page:
Variables may be declared to be local to a function by using a local command. This should appear as the first statement of a function, and the syntax is
local [variable | -] ...
Local is implemented as a builtin command.
When a variable is made local, it inherits the initial value and exported and readonly flags from the variable with the same name in the surrounding scope, if there is one. Otherwise, the variable is initially unset. The shell uses dynamic scoping, so that if you make the variable x local to function f, which then calls function g, references to the variable x made inside g will refer to the variable x declared inside f, not to the global variable named x.
The only special parameter that can be made local is “-”. Making “-” local causes any shell options that are changed via the set command inside the function to be restored to their original values when the function returns.
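The dynamic scoping described in the man page excerpt can be sketched with the same names it uses (f, g, x). The helper variable seen_in_g is hypothetical and exists only to record what g observed:

```shell
x="global"
seen_in_g=""

g() {
    seen_in_g=$x          # g declares no x of its own
}

f() {
    local x
    x="local to f"
    g                     # g resolves x to f's local, not the global
}

f
echo "g saw: $seen_in_g; global x is still: $x"
# prints: g saw: local to f; global x is still: global
```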
To be sure about the value of a variable in a specific context, make sure to always set it explicitly in that context. Else, you rely on "fallback" behavior of the various shells which might be different across shells.
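A sketch of that advice applied to the snippet from the question, assuming the only change needed is an explicit assignment right after the local declaration. The global result is a hypothetical addition used to capture the message:

```shell
iteration=0
result=""

my_function() {
    local my_variable
    my_variable=""        # explicit start value, so no value is inherited from the caller
    iteration=$((iteration + 1))
    if [ "$iteration" -lt 2 ]; then
        my_variable="before recursion"
        my_function
    else
        result="The value of my_variable during recursion: '$my_variable'"
    fi
}

my_function
echo "$result"
# prints: The value of my_variable during recursion: ''
```

With the explicit assignment, the inner invocation starts from an empty value regardless of what the shell does with an uninitialized local.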

How do I pass a command parameter in a variable holding the command?

I want to produce the same output as this:
bash utilities.bash "is_net_connected"
But I don't know how to pass "is_net_connected" if command and file is stored in a variable like this:
T=$(bash utilities.bash)
I've tried these but it doesn't seem to work. It's not picking up ${1} in utilities.bash.
$(T) "is_net_connected"
$(T "is_net_connected")
Not the best way to import, but I'm trying to avoid cluttering my main script with function blocks.
T=$(bash utilities.bash) doesn't save the command; it runs the command and saves its output. You want to define a function instead.
T () {
bash utilities.bash "$@"
}
# Or on one line,
# T () { bash utilities.bash "$@"; }
Now
T "is_net_connected"
will run bash utilities.bash with whatever arguments were passed to T. In a case like this, an alias would work the same: alias T='bash utilities.bash'. However, any changes to what T should do will probably require switching from an alias to a function anyway, so you may as well use the function to start. (Plus, you would have to explicitly enable alias expansion in your script.)
You might be tempted to use
T="bash utilities.bash"
$T is_net_connected
Don't be. Unquoted parameter expansions are bad practice that only work in select situations, and you will get bitten eventually if you try to use them with more complicated commands. Use a function; that's why the language supports them.
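A small sketch of why the function form is safer. Here count_args is a hypothetical stand-in for bash utilities.bash that only reports how many arguments it received, and the argument contains spaces to expose the difference:

```shell
# Hypothetical stand-in for `bash utilities.bash`.
count_args() {
    echo "$#"
}

T() {
    count_args "$@"       # forwards every argument unchanged
}

via_function=$(T "is net connected")

cmd='count_args "is net connected"'
via_string=$($cmd)        # unquoted expansion: the quotes are literal and the words are split

echo "via_function=$via_function via_string=$via_string"
# prints: via_function=1 via_string=3
```

The function passes one argument through intact, while the string variable is split on whitespace into three separate words.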

Interpolation rules when defining a function

At a prompt, I can type:
$ e() { echo $1; }
and get a function which echoes its first argument. I do not understand why this works. Since $1 is undefined in the current environment, it seems that the above should be equivalent to:
$ e() { echo ; }
What is the relevant quoting/interpolation rule that allows this behavior? Note that this has nothing to do with $1 being special: if you use $FOO, you get a function that echoes the value of $FOO at the time the function is called rather than the value of $FOO when the function is defined.
Not sure how I missed this, since it's pretty clear in section 2.9.5:
When the function is declared, none of the expansions in Word Expansions shall be performed on the text in compound-command or io-redirect; all expansions shall be performed as normal each time the function is called. Similarly, the optional io-redirect redirections and any variable assignments within compound-command shall be performed during the execution of the function itself, not the function definition. See Consequences of Shell Errors for the consequences of failures of these operations on interactive and non-interactive shells.
Variables like $1 are special variables representing parameters passed from the command line. See the "Positional Parameters" section here: http://tldp.org/LDP/abs/html/internalvariables.html
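The rule can be checked with a short sketch. The function e comes from the question, while show_foo and the values assigned to FOO are hypothetical:

```shell
FOO="value at definition time"

show_foo() {
    echo "$FOO"           # expanded when show_foo runs, not when it is defined
}

e() {
    echo "$1"             # $1 refers to the argument of each call
}

FOO="value at call time"
foo_seen=$(show_foo)
arg_seen=$(e hello)

echo "foo_seen='$foo_seen' arg_seen='$arg_seen'"
# prints: foo_seen='value at call time' arg_seen='hello'
```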

Bash - problem with using unset variable in a script

I have following code:
VAR1=""
ANOTHER_VAR="$VAR1/path/to/file"
ANOTHER_VAR_2="$VAR1/path/to/another/file"
...
# getopts which reads params from command line and sets the VAR1
The problem is that setting VAR1 after the ANOTHER_VARs are set leaves their paths without the VAR1 part. I can't move the getopts above those because the script is long and there are many methods which depend on the variables and on other methods. Any ideas how to solve this?
I'd make ANOTHER_VAR and ANOTHER_VAR_2 into functions. The return value would depend on the current value of VAR1.
ANOTHER_VAR () { echo "$VAR1/path/to/file"; }
ANOTHER_VAR_2 () { echo "$VAR1/path/to/another/file"; }
Then, instead of $ANOTHER_VAR, you'd use $(ANOTHER_VAR)
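A sketch of this approach using the names from the question. The value /opt/app is hypothetical and stands in for whatever getopts would assign to VAR1:

```shell
VAR1=""

ANOTHER_VAR() { echo "$VAR1/path/to/file"; }

before=$(ANOTHER_VAR)     # VAR1 is still empty here
VAR1="/opt/app"           # hypothetical value, as if set by the getopts loop
after=$(ANOTHER_VAR)      # the path is rebuilt from the current VAR1

echo "before=$before after=$after"
# prints: before=/path/to/file after=/opt/app/path/to/file
```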
Is it possible to set the ANOTHER_VAR and ANOTHER_VAR_2 variables below where getopts is called? Also, how about setting the ANOTHER_VAR and ANOTHER_VAR_2 in a function, that's called after getopts?
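foobar() {
# runs after getopts, so VAR1 already has its final value
ANOTHER_VAR="$VAR1/path/to/file"
ANOTHER_VAR_2="$VAR1/path/to/another/file"
}
foobar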
Your 'many methods which depend on the variables' cannot be used before you set ANOTHER_VAR, so you can simply move the definitions to after the getopts loop.
One advantage of shell scripts is that variables do not have to be defined before the functions that use them are defined; the variables merely have to be defined at the time when the functions are used. (That said, it is not dreadfully good style to do this, but it will get you out of the scrape you are in. You should also be able to move the getopts loop up above the functions, which would be better. You have a lot of explaining to do before you can get away with "I can't move the getopts above those".)
So, your fixes are (in order of preference):
Move the getopts loop.
Move the two lines that set ANOTHER_VAR and ANOTHER_VAR_2 after the getopts loop and before you invoke any function that depends on them.
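A minimal sketch of the second fix. The -d option and the simulated command line are assumptions, since the question does not show its getopts loop:

```shell
set -- -d /opt/app        # simulate the command line (hypothetical -d option)

VAR1=""
OPTIND=1
while getopts "d:" opt; do
    case "$opt" in
        d) VAR1=$OPTARG ;;
        *) echo "usage: script [-d dir]" >&2 ;;
    esac
done
shift $((OPTIND - 1))

# The derived variables are assigned only after VAR1 has its final value.
ANOTHER_VAR="$VAR1/path/to/file"
ANOTHER_VAR_2="$VAR1/path/to/another/file"

echo "$ANOTHER_VAR"
echo "$ANOTHER_VAR_2"
```

Functions defined earlier in the script can still read ANOTHER_VAR and ANOTHER_VAR_2, provided they are called after these assignments.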
