Unset is not useful? (Korn shell) - shell

I'm reading this:
You can delete a variable with the command unset varname. Normally this is not useful, since all variables that don't exist are assumed to be null, i.e., equal to empty string "". But if you use the option nounset which causes the shell to indicate an error when it encounters an undefined variable, then you may be interested in unset.
My first question is: I cannot see why unset would not be useful; if I want to set my variable to null I can use it (or set variable="" or variable=). On the other hand, if I have a variable that doesn't exist, I don't see why I would need to use it.
My second question is: Why may I be interested in unset in that case?

There is a relevant difference between unset and empty variables.
When you can't tell in advance which variables will be used, you can process the output of set (examples: https://stackoverflow.com/a/43419722/3220113 and https://stackoverflow.com/a/28104421/3220113 ).
You might have a situation where you have sourced a read-only config file, but you do not want all of its settings in your environment. In that case you might want to unset the settings you do not need.
When you write a utility that uses some variables, you do not want to leave garbage in the environment. Besides using local variables, calling unset is another possibility.
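A minimal sketch of the last two points, assuming a POSIX-style shell. The config contents, APP_HOST, APP_DEBUG, greet and _greet_tmp are all made-up names for the illustration, and the temporary file only stands in for a config file you cannot edit.

```shell
#!/bin/sh
# Sketch only: APP_HOST, APP_DEBUG, greet and _greet_tmp are hypothetical names.

# Simulate a config file that we may source but not edit.
config=$(mktemp)
printf '%s\n' 'APP_HOST=example.org' 'APP_DEBUG=1' > "$config"

. "$config"        # defines both APP_HOST and APP_DEBUG
unset APP_DEBUG    # drop the setting we do not want to keep
rm -f "$config"

# A utility function that removes its helper variable before returning.
greet() {
    _greet_tmp="Hello, $1"
    echo "$_greet_tmp"
    unset _greet_tmp
}

greet world
echo "APP_HOST: ${APP_HOST-<unset>}"
echo "APP_DEBUG: ${APP_DEBUG-<unset>}"
echo "_greet_tmp: ${_greet_tmp-<unset>}"
```

After this runs, only APP_HOST is left behind in the current shell; the other two names no longer exist at all.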

I think I have found the answer to my question.
1) If you need to remove both the definition and the content of a variable, you can use the unset command. However, unless you turn on the nounset option, the Korn shell allows the use of variables that don't exist and treats the content of such a variable as an empty string. That's why you normally don't use unset: you normally leave the nounset option off and test variables via conditional logic. Hence in these cases, i.e. when the goal is to inhibit the use of a variable, it is not useful. (Obviously, it remains useful for deleting variables, as noted by @Walter A: "" is not unset, which is the complete removal of the variable.)
2) That said, it follows that if you use the nounset option, the unset command makes sense. Indeed, if you unset a variable, the shell will disallow using it.
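To make the difference concrete, here is a small sketch that should behave the same in ksh, bash and other POSIX-style shells. demo_var and the two helper functions are hypothetical names; the nounset check runs in a subshell so that the error does not stop the script itself.

```shell
#!/bin/sh
# Sketch only: demo_var, describe and expand_under_nounset are hypothetical names.

describe() {
    # ${demo_var+x} expands to "x" when demo_var is set, even if it is empty.
    if [ -n "${demo_var+x}" ]; then
        echo "set"
    else
        echo "unset"
    fi
}

# Try to expand demo_var with nounset active. The subshell keeps both the
# option and the resulting failure away from the main script.
expand_under_nounset() {
    if (set -o nounset; : "$demo_var") 2>/dev/null; then
        echo "ok"
    else
        echo "error"
    fi
}

demo_var=""
empty_report="$(describe), nounset: $(expand_under_nounset)"

unset demo_var
unset_report="$(describe), nounset: $(expand_under_nounset)"

echo "empty variable: $empty_report"
echo "unset variable: $unset_report"
```

An empty variable still expands fine under nounset; only the unset one triggers the error.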

Related

How does "FOO= myprogram" in bash make "if(getenv("FOO"))" return true in C?

I recently ran into a C program that makes use of an environmental variable as a flag to change the behavior of a certain part of the program:
if (getenv("FOO")) do_this_if_foo();
You'd then invoke the program by prepending the environment variable, but without actually setting it to anything:
FOO= mycommand myargs
Note that the intention of this was to trigger the flag - if you didn't want the added operation, you just wouldn't include the FOO=. However, I've never seen an environment variable set like this before. Every example I can find of prepended variables sets a value, FOO=bar mycommand myargs, rather than leaving it empty like that.
What exactly is happening here, that allows this flag to work without being set? And are there potential issues with implementing environmental variables like this?
The bash manual says:
A variable may be assigned to by a statement of the form
name=[value]
If value is not given, the variable is assigned the null string.
Note that "null" (in the sense of e.g. JavaScript null) is not a thing in the shell. When the bash manual says "null string", it means an empty string (i.e. a string whose length is zero).
Also:
When a simple command is executed, the shell performs the following expansions, assignments, and redirections, from left to right.
[...]
If no command name results, the variable assignments affect the current shell environment. Otherwise, the variables are added to the environment of the executed command and do not affect the current shell environment.
So all FOO= mycommand does is set the environment variable FOO to the empty string while executing mycommand. This satisfies if (getenv("FOO")) because it only checks for the presence of the variable, not whether it has a (non-empty) value.
Of course, any other value would work as well: FOO=1 mycommand, FOO=asdf mycommand, etc.
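A quick way to see this from the shell side, as a sketch: the probe string below is a hypothetical stand-in for the C program's getenv("FOO") check, run in a child sh process.

```shell
#!/bin/sh
# Sketch only: the probe stands in for the C program's getenv("FOO") test.
unset FOO

# Prints "present" when FOO exists in the child's environment, even if empty.
probe='if [ -n "${FOO+x}" ]; then echo present; else echo absent; fi'

without=$(sh -c "$probe")
with_empty=$(FOO= sh -c "$probe")
with_value=$(FOO=1 sh -c "$probe")

echo "no prefix: $without"
echo "FOO=     : $with_empty"
echo "FOO=1    : $with_value"

# The prefix assignment applied only to the child process.
echo "current shell: ${FOO-<unset>}"
```

The empty assignment is enough for the child to see the variable, and the current shell is left untouched.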
FOO= is just setting the variable to null (to be precise, it's setting the variable to a zero-byte string, so getenv returns a pointer to a NUL terminator - thanks @CharlesDuffy). Given the code you posted, it could be FOO='bananas' and produce the same behavior. It's very odd to write code that way, though. The common reason to set a variable on the command line is to pass a value for that variable into the script; setting debugging or logging level flags this way is extremely common, e.g. (pseudocode):
debug=1 logLevel=3 myscript
myscript() {
    if (debug == 1) {
        if (logLevel > 0) {
            printf "Entering myscript()\n" >> log
            if (logLevel > 1) {
                printf "Arguments: %s\n" "$*" >> log
            }
        }
    }
    do_stuff()
}
Having just a "variable exists" test is a bit harder to work with, because you then have to specifically unset the variable to clear the flag. With a value test, you just set FOO=1 when you want to do something, and otherwise your script doesn't care whether FOO is null, 0, unset, or anything else.
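A rough sketch of that difference, with two hypothetical helper functions: one treats any definition of FOO as "on", the other looks at the value.

```shell
#!/bin/sh
# Sketch only: FOO and both helper functions are hypothetical.

# Presence-based flag: any definition of FOO counts, whatever the value.
flag_by_presence() {
    if [ -n "${FOO+x}" ]; then echo on; else echo off; fi
}

# Value-based flag: only FOO=1 counts; unset, empty and 0 all mean "off".
flag_by_value() {
    if [ "${FOO:-0}" = 1 ]; then echo on; else echo off; fi
}

FOO=0
presence_zero=$(flag_by_presence)
value_zero=$(flag_by_value)

unset FOO
presence_unset=$(flag_by_presence)
value_unset=$(flag_by_value)

echo "FOO=0 : presence=$presence_zero value=$value_zero"
echo "unset : presence=$presence_unset value=$value_unset"
```

With the presence test, FOO=0 still switches the flag on, so unset is the only way to clear it.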

Deferred evaluation of bash variables

I need to define a string (options) which contains a variable (group) that is going to be available later in the script.
This is what I came up with, using a literal string that gets evaluated later.
#!/bin/bash
options='--group="$group"' #$group is not available at this point
#
# Some code...
#
group='trekkie'
eval echo "$options" # the result is used elsewhere
It works, however it makes use of eval which I would like to avoid if not absolutely necessary (I don't want to risk potential problems because of unpredictable data).
I've asked for help in multiple places and I've got a couple of answers that were directing me to use indirect variables.
The problem is I simply fail to see how indirect variables might help me with my problem. As far as I understand they only offer a way of indirectly referencing other variables like this:
options="--group="$group""
a=options
group='trekkies'
echo "${!a}" # spits out --group=
I would also like to avoid using functions if possible because I don't want to make things more complicated than they need to be.
More Idiomatic: Using Parameter Expansion
Don't attempt to define the --group="$group" argument up-front when you don't yet know the group name; instead, set a flag that indicates whether the argument is needed, and honor that flag when forming your final argument list.
By taking the approach below, you avoid any need for "deferred evaluation":
#!/bin/bash
# initialize your flag as unset
unset needs_group
# depending on your application logic, optionally set that flag
if [[ $application_logic_here ]]; then
needs_group=1
fi
# ...so, the actual group can be defined later, when it's known...
group=trekkies
# and then check the flag to determine whether to pass the argument:
yourcommand ${needs_group+--group="$group"}
If you don't need the flag to be separate from the group variable, this is even easier:
# pass --group="$group" only if "$group" is a defined shell variable
yourcommand ${group+--group="$group"}
The relevant syntax is a parameter expansion: ${var+value} expands to value only if var is defined; and unlike most parameter expansions, its value can parse to multiple words with quoting applied.
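To check what the command actually receives, here is a small sketch. show_args is a hypothetical stand-in for yourcommand, and the group name contains a space on purpose to show that the inner quotes still apply.

```shell
#!/bin/bash
# Sketch only: show_args is a hypothetical stand-in for "yourcommand".
show_args() {
    # Print the argument count, then each argument in brackets.
    printf '%d' "$#"
    for arg in "$@"; do printf ' [%s]' "$arg"; done
    echo
}

unset group
args_when_unset=$(show_args ${group+--group="$group"})

group='star trek'   # contains a space on purpose
args_when_set=$(show_args ${group+--group="$group"})

echo "unset: $args_when_unset"
echo "set:   $args_when_set"
```

When group is unset the expansion vanishes completely (zero arguments, not one empty argument), and when it is set the command gets exactly one argument.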
Alternately: One-Liner Function Shims
Here, you really are defining --group="$group" before the group is known:
#!/bin/bash
if [[ $application_logic_here ]]; then
  with_optional_group() { "$@" --group="$group"; }
else
  with_optional_group() { "$@"; }
fi
group=trekkies
with_optional_group yourcommand

What is the purpose of setting a variable default to empty in bash?

In general, this syntax is used to guarantee a value, potentially a default argument.
(from the Bash reference manual)
${parameter:-word}
If parameter is unset or null, the expansion of word is substituted.
Otherwise, the value of parameter is substituted.
What would be the purpose of defaulting a variable to empty if the substitution is only chosen when the variable is empty anyway?
For reference, I'm looking at /lib/lsb/init-functions.
"Null" means the variable has a value, and this value is an empty string. The shell knows the variable exists.
"Unset" means the variable has not been defined : it does not exist as far as the shell is concerned.
In its usual mode, the shell will expand null and unset variables to an empty string. But there is a mode (set -u) that makes the shell throw a runtime error if a variable is expanded when it is unset. It is good practice to enable this mode, because it is very easy to simply mis-type a variable name and get difficult-to-debug errors.
It can actually be useful from a computing perspective to differentiate between unset and empty variables, because you can assign separate semantics to each case. For instance, say you have a function that may receive an argument. You may want to use a (non-null) default value if the parameter is unset, or any value passed to the function (including an empty string) if the parameter is set. You would do something like:
my_function()
{
echo "${1-DEFAULT_VALUE}"
}
Then, the two commands below would provide different outputs:
my_function # Echoes DEFAULT_VALUE
my_function "" # Echoes an empty line
There is also a type of expansion that does not differentiate between null and not set:
"${VAR:-DEFAULT_VALUE}"
They are both useful depending on what you need.
The way to test if a variable is set or not (without running the risk of a runtime error) is the following type of expansion:
"${VAR+VALUE}"
This will expand to an empty string if VAR is unset, or to VALUE if it is set (empty or with a value). Very useful when you need it.
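Putting the three expansions side by side, as a sketch (v and show are hypothetical names; the brackets only make empty results visible):

```shell
#!/bin/sh
# Sketch only: v and show are hypothetical names.
show() {
    echo "dash=[${v-DEFAULT}] colon-dash=[${v:-DEFAULT}] plus=[${v+SET}]"
}

unset v
unset_row=$(show)

v=""
empty_row=$(show)

v="value"
value_row=$(show)

echo "unset: $unset_row"
echo "empty: $empty_row"
echo "value: $value_row"
```

The empty row is the interesting one: the plain dash form keeps the empty value, while the colon form replaces it with the default.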
Generally, it is helpful to:
Declare variables explicitly
set -u to prevent silent expansion failure
Explicitly handle unset variables through the appropriate expansion
This will make your scripts more reliable, and easier to debug.
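As a sketch of why an empty default is still useful: with set -u active, a plain expansion of an unset variable is an error, while the form with an empty default is accepted. run_strict, log_dir, MAYBE_SET and the deliberate typo log_dri are hypothetical names, and each snippet runs in a child sh so the error cannot stop the demo.

```shell
#!/bin/sh
# Sketch only: run_strict, log_dir, MAYBE_SET and the typo log_dri are hypothetical.

run_strict() {
    # Run a snippet in a child shell with nounset enabled and report the outcome.
    if sh -c "set -u; $1" 2>/dev/null; then echo ok; else echo failed; fi
}

# A mistyped variable name is caught instead of silently expanding to "".
typo_result=$(run_strict 'log_dir=/tmp; echo "$log_dri" >/dev/null')

# A plain expansion of an unset variable fails under set -u ...
plain_result=$(run_strict 'unset MAYBE_SET; echo "$MAYBE_SET" >/dev/null')

# ... but an explicit empty default says "unset is fine here".
default_result=$(run_strict 'unset MAYBE_SET; echo "${MAYBE_SET:-}" >/dev/null')

echo "typo:          $typo_result"
echo "plain:         $plain_result"
echo "empty default: $default_result"
```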

What does the following assignment of variable do

I am trying to understand a bash script. I couldn't understand a piece of code. I wasn't sure what to google for either. So I'm posting it here. What does it do?
VARIABLE=${VARIABLE:-foo}
It assigns to VARIABLE:
Whatever is in $VARIABLE if it is set and non-empty
foo otherwise
This is sometimes called a "default" parameter:
${parameter-default}, ${parameter:-default}
If parameter not set, use default.
If VARIABLE is not set, or is set to the empty string, then it sets VARIABLE to foo.
Otherwise, it effectively leaves VARIABLE alone, by setting it to its existing value.
The colon makes it treat the empty string as if VARIABLE is not set. If you say ${VARIABLE-foo}, it expands to $VARIABLE even if VARIABLE is set to the empty string. This version only expands to foo if VARIABLE is not set at all.
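A short sketch of the assignment idiom in each case (the from_* names are hypothetical and only capture the results):

```shell
#!/bin/sh
# Sketch only: the from_* variables are hypothetical and just record each result.

unset VARIABLE
VARIABLE=${VARIABLE:-foo}
from_unset=$VARIABLE

VARIABLE=""
VARIABLE=${VARIABLE:-foo}
from_empty_with_colon=$VARIABLE

VARIABLE=""
VARIABLE=${VARIABLE-foo}
from_empty_without_colon=$VARIABLE

VARIABLE=bar
VARIABLE=${VARIABLE:-foo}
from_value=$VARIABLE

echo "unset, with colon:    [$from_unset]"
echo "empty, with colon:    [$from_empty_with_colon]"
echo "empty, without colon: [$from_empty_without_colon]"
echo "bar, with colon:      [$from_value]"
```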

Defining recursively expanded variable with same name as environment variable

I'm trying to lazily evaluate a configuration option. I want to issue a Make error only if the variable is actually used (substituted).
Consider the following Makefile:
VAR = $(error "E")
NFS_DIR = NFS_DIR is $(VAR)
T = $(NFS_DIR) is 1
all:
	echo Test
If I run it with my environment (which has the /srv/nfs value), the output is the following:
➜ ~ make
echo Test
Makefile:3: *** "E". Stop.
So the recursive definition acts like simple definition.
If I clear the environment, it works as expected:
➜ ~ env -i make
echo Test
Test
I couldn't find any mention that a recursively expanded variable, when defined with the same name as an environment variable, will act like a simply expanded variable.
So the questions are:
Why the observed behavior?
How to implement the desired behavior?
Update: to clarify — I don't want to use ?= since the options are configuration options I want them to come strictly from Makefile and not from environment.
Any variable which is in the environment when make starts, will be exported to the environment of any command make runs (as part of a recipe, etc.) In order for make to send the value of the variable to the command, make first has to expand it. It's not acting like a simply-expanded variable. It's just that running a command forces the expansion.
This is kind of a nasty side-effect but I'm not sure there's anything that can be done directly: this behavior is traditional and mandated by POSIX and lots of makefiles would break if it were changed.
You have two choices I can think of. The first is to use the unexport make command to tell make to not send that variable in the command's environment.
The second is to change the name of the variable in make to something that is not a valid environment variable: make will only export variables whose names are legal shell variables (contain only alphanumeric plus _). So using a name like VAR-local instead of VAR would do it.
The question appears to be extremely clear in the title, but the actual request gets lost in the details, so the only other reply left it mostly unanswered. To answer the question in the title directly, which is very interesting: to define a variable in a Makefile with the same name as an environment variable, you can get its value with printenv:
PATH=${shell printenv PATH}:/opt/bin
echo:
	echo $(PATH)
Other techniques to achieve the same result without relying on evaluation with external commands are welcome.
