Bash: ensuring a variable is set without erasing any existing value

Let's say I'm running a bash script under set -u. Obviously, for any given variable, I need to ensure that it's set. Something like:
foo=
However, if I want to keep any pre-existing value that might be set by my caller, this would overwrite it. A simple solution to this problem is to do this instead:
: ${foo:=}
But I have some code that does this (more complicated) way:
foo=${foo+$foo}
Now, I know this second way works. My question is, is there any advantage to it over the first way? I am assuming there is but now can't remember what it was. Can anyone either think of an edge case (no matter how obscure) where these two constructs would behave differently, or provide a compelling explanation that they can't?

I can't think of any case where they would differ. They're just alternative logic for the same thing.
The meaning of the simple solution is: If foo is unset/empty, set it to the empty string.
The meaning of your code is: If foo is set, set it to itself, otherwise set it to an empty string.
Your code seems like more work -- why set something to itself? Just do nothing and it will keep its value. That's what the simpler version does.
You can also simplify the simple solution further by removing the : in the parameter expansion.
: ${foo=}
This makes it only test whether foo is unset. If it's set to the empty string, no default needs to be assigned.
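As a minimal sketch of what the colon changes, the snippet below runs each form in its own subshell under set -u. The helper names and the non-empty default "dflt" are illustrative assumptions, not part of the question; the non-empty default is used only because with an empty default the two forms cannot be told apart.

```shell
#!/usr/bin/env bash
set -u

# Report whether foo is set and what it holds.
# ${foo+set} expands to "set" whenever foo is set, even to the empty string.
show() { echo "${foo+set}:[${foo-}]"; }

# Each function body is a subshell, so foo's state never leaks between cases.
unset_colon_eq()  ( unset foo; : "${foo:=}";     show )
unset_eq()        ( unset foo; : "${foo=}";      show )
preset_colon_eq() ( foo=bar;   : "${foo:=}";     show )
empty_colon_eq()  ( foo=;      : "${foo:=dflt}"; show )
empty_eq()        ( foo=;      : "${foo=dflt}";  show )

unset_colon_eq    # set:[]      unset foo becomes the empty string
unset_eq          # set:[]      same result without the colon
preset_colon_eq   # set:[bar]   an existing value is kept
empty_colon_eq    # set:[dflt]  := treats an empty foo like an unset one
empty_eq          # set:[]      = leaves an empty foo alone
```

With the empty default from the question, both forms end in the same state, which matches the claim that they do not differ in practice.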

My question is, is there any advantage to it over the first way?
Maybe this is subjective, but one advantage is that it clearly looks like a variable assignment. Anyone who sees the command foo=${foo+$foo} will immediately understand that it sets the variable foo (even if they need to look up the ${parameter+word} notation to figure out what it sets it to); but someone who sees the command : ${foo:=} is likely to completely miss that it has the side-effect of modifying foo. (The use of : is definitely a hint that something might be happening, since : itself does nothing; but it's not as blatant.)
And of course, someone searching the script for foo= will find the former but not the latter.
That said, I would personally write this as either foo="${foo-}" or foo="${foo:-}", which still makes clear that it sets foo, but is a bit simpler than foo=${foo+$foo}. I also think that readers are more likely to be familiar with ${parameter-word} than ${parameter+word}, but I haven't asked around to check.
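A small sketch of the plain-assignment forms under set -u, assuming hypothetical helper names (normalise, from_caller, from_nothing, plus_form) that are not in the original post:

```shell
#!/usr/bin/env bash
set -u

# Normalise foo with the plain-assignment form, then report it.
normalise() {
    foo="${foo-}"        # keeps any existing value; otherwise assigns ""
    echo "foo=[$foo]"    # safe under set -u because foo is now set
}

# Subshell bodies keep the two scenarios isolated from each other.
from_caller()  ( foo="from caller"; normalise )
from_nothing() ( unset foo; normalise )

# The longer form from the question: $foo is only expanded when foo is set.
plus_form()    ( unset foo; foo=${foo+$foo}; echo "foo=[$foo]" )

from_caller     # foo=[from caller]
from_nothing    # foo=[]
plus_form       # foo=[]
```

Both forms show up when searching the script for foo=, which is the readability point made above.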


accelerate Tcl eval

I'm currently writing a Tcl-based tool for symbolic matrix manipulation, but the code is getting slow. I'm looking for ways to accelerate my Tcl code (Tcl version 8.6).
I have one suspicion. My code builds lists with a command name as the first element and command arguments as the following elements (this comes from emulating an object-oriented approach). I use eval to invoke these commands (and this is done often in the recursive processing). I read at https://wiki.tcl-lang.org/page/eval and https://wiki.tcl-lang.org/page/Tcl+Performance that eval may be slow.
I have three questions:
What would be the fastest way to invoke a command from a list with command name and parameters which is constructed just beforehand?
Would it accelerate the code to separate the command name myCmd and the parameter list myPar and invoke the command with [$myCmd {*}$myPar] instead (suggested at https://stackoverflow.com/a/27619692/3852630)?
Is the trick with if 1 instead of eval still promising in 8.6?
Thanks a lot for your help!
Above all, don't assume: time it to be sure. Be aware when timing things that repeatedly running a thing may change the time it takes to run it (as caches warm up). Think carefully about what you want to actually get the speed of.
The eval command is usually slow, but not in all cases. If you give it a list that you've constructed (e.g., with list or linsert or lappend or…) then it's fairly fast as it can avoid reparsing the input; it knows, but only in that case, that it can skip straight to dispatching to the command implementation. The other case that is fast is when you give it a value that was previously given to eval; the bytecode is already built and cached. These notes also apply with uplevel.
Doing $myCmd {*}$myParameters is fairly fast too; that's bytecoded into “assemble the words on the Tcl operand stack and do the right command dispatch” which is very close to what it would be for an arbitrary user command anyway (which very rarely have direct bytecode implementations).
I'd expect things with if 1 to be very quick in some cases and very slow in others; it forces full compilation, so if things can be cached well then that will be fast and if things can't it will be slow. And if you're just calling a command, it won't make much difference either way. The cases where it wins are when the thing being called is itself a bytecoded command and where you can cache things correctly.
If you're dealing with an ordinary command (e.g., a procedure, or one of Tcl's commands that touch the OS), I'd go with option 2: $myCmd {*}$myParameters or variants on it. It's about as fast as you're going to get. But I would not do:
set myParameters [linsert $myOriginalValues 0 "literal1" [cmdOutput2] $value3]
$myCmd {*}$myParameters
That's ridiculous. This is clearer and cleaner and faster:
$myCmd "literal1" [cmdOutput2] $value3 {*}$myOriginalValues
Part of the point of expansion syntax ({*}) is that you don't need to do complex argument marshalling, and that's good because complexity is hard to get right all the time.
A note about K and unsharing objects
Avoid copying data in memory. Change
set mylist [linsert $mylist 0 some new content]
to
set mylist [linsert $mylist[set mylist ""] 0 some new content]
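This reads the value of the variable and then sets the variable to the empty string, which drops the variable's reference to that value. With the value's reference count reduced, linsert can modify the list in place instead of copying it, provided nothing else still refers to it.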
See also https://stackoverflow.com/a/64117854/7552

What benefit does discriminating between local and global variables provide?

I'm wondering what benefit discriminating between local and global variables provides. It seems to me that if everything were made a global variable, there would be a lot less confusion.
Wouldn't declaring everything a global variable result in fewer errors because one wouldn't mistakenly call a local variable in a global instance, thereby encountering fewer errors?
Where is my logic wrong on this?
Some of this boils down to good coding practices. Keeping variables local also means it becomes simpler to share code from one application to another without having to worry about code conflicts. While it's simpler to make everything global, getting into the habit of only using global variables when you actually have to will force you to code more efficiently and will make your code more structured.
I think your key oversight is thinking that an error telling you a local variable doesn't exist is a bad thing. It isn't. You've made a mistake and Ruby is telling you so. This type of mistake is usually easy to fix: you've misspelled something or you're using something that you forgot to create.
Global variables everywhere might remove those errors, but they would replace them with a far harder set of errors to reason about: accidentally using a variable that another bit of code is using. Imagine if every time you called a function (one of your own, a standard library one, or one from a gem) you had to check which global variables it might change (and which functions it called, since they might also change global variables). If you make a mistake then you might get an error message (if the class of the object in the variable changes enough), but often you would just silently get incorrect results (if the value of a variable you were using changes unexpectedly).
In general global variables are much harder to work with and people avoid them when possible.
If all variables are global, every line of code in every program (including those which haven't been written yet) written by every programmer on the planet (including those who haven't been born yet or are already dead) must universally, uniquely agree on the names of variables. If you use a variable name that someone else on a different continent two years from now will also use, both of your programs will break, when used together.

Why are `?` and `!` not allowed in variable names while they are allowed in method names? [duplicate]

A method name can end with a question mark ?
def has_completed?
  return count > 10
end
but a variable name cannot.
What is the reason for that? Isn't it convenient to have variable names ending the same way too? Given that we usually can't tell whether foobar is a method or a variable just by looking at the name foobar anyway, why the exception for the ? case?
And how should I work with this? Should I always use has or is in the name?
if process_has_completed
  ...
end
if user_is_using_console
  ...
end
You'd have to ask Matz to get an authoritative answer. However,
Ruby is a dynamically typed programming language, and a variable like finished? would imply a specific type (boolean), which appears somewhat contradictory to me.
A question somewhat requires a receiver (who can answer the question). A method must have a receiver (the object the method is called on), so a question mark makes sense. A variable on the other hand has no receiver, it's a mere container.
Now this is just a thought, but I think methods with names like empty? suggest that some sort of check has to be made inside an object or a class (depending on the context). This check or evaluation means an action must be performed. Since we are asking (thus, ?) the object for some state, there is a possibility that the object's state could change throughout the application's lifecycle. A variable could be outdated, but a ?-method (check) is performed at that specific moment, thus providing up-to-date information on some state that can be presented in boolean form.
So I'd like to think that this is a design constraint provided by the architect (Matz) to enforce a more logical, close-to-real-life coding approach.

One liner to set environment variable if doesn't exist, else append

I am using bash.
There is an environment variable that I want to either append if it is already set like:
PATH=$PATH":/path/to/bin"
Or if it doesn't already exist I want to simply set it:
PATH="/path/to/bin"
Is there a one line statement to do this?
Obviously the PATH environment variable is pretty much always set but it was easiest to write this question with.
A little improvement on Michael Burr's answer. This works with set -u (set -o nounset) as well:
PATH=${PATH:+$PATH:}/path/to/bin
PATH=${PATH}${PATH:+:}/path/to/bin
${PATH} expands to nothing if PATH is unset or empty (under set -u, though, an unset PATH is an error here); otherwise it expands to the current path
${PATH:+:} expands to nothing if PATH is unset or empty; otherwise it expands to ":"
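The two append forms can be compared with a short sketch. MYPATH is a hypothetical stand-in for PATH (so the real PATH stays intact), and setup, append_first and append_second are illustrative helper names:

```shell
#!/usr/bin/env bash
set -u

# Prepare MYPATH from $1: the word "unset" leaves it unset,
# anything else (including "") becomes its value.
setup() { if [ "$1" = unset ]; then unset MYPATH; else MYPATH=$1; fi; }

# First form: the whole "$MYPATH:" prefix is conditional.
append_first()  ( setup "$1"; MYPATH=${MYPATH:+$MYPATH:}/path/to/bin; echo "$MYPATH" )

# Second form: only the ":" is conditional; the bare ${MYPATH} is not guarded.
append_second() ( setup "$1"; MYPATH=${MYPATH}${MYPATH:+:}/path/to/bin; echo "$MYPATH" )

append_first unset       # /path/to/bin
append_first ""          # /path/to/bin
append_first /usr/bin    # /usr/bin:/path/to/bin
append_second ""         # /path/to/bin
append_second /usr/bin   # /usr/bin:/path/to/bin

# With MYPATH unset, set -u makes the bare ${MYPATH} a fatal error,
# which ends the subshell with a non-zero status.
append_second unset 2>/dev/null || echo "second form failed: MYPATH is unset"
```

Without set -u the two forms behave the same; the difference only appears when the variable is truly unset and nounset is active.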
The answers from Michael Burr and user spbnick are already excellent and illustrate the principle. I just want to add two more details:
In their versions, the new path is added to the end of PATH. This is what the OP asked, but it is a less common practice. Adding to the end means that the commands will only be picked if no other commands match from earlier paths. More commonly, users will add to the front of PATH. This is not what the OP asked, but for other users coming here it may be closer to what they expect. Since the syntax is different I'm highlighting it here.
Also, in the previous versions, the PATH is not quoted. While it's unlikely on most Un*x-like operating systems to have spaces in PATH, it is still better practice to always quote.
My slightly improved version, for most typical use cases, is
PATH="/path/to/bin${PATH:+:$PATH}"
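A quick sketch of the prepend form, again using a hypothetical MYPATH variable and a hypothetical prepend_demo helper so that the real PATH is not modified:

```shell
#!/usr/bin/env bash
set -u

# MYPATH stands in for PATH; passing "unset" as $1 leaves it unset.
prepend_demo() (
    if [ "$1" = unset ]; then unset MYPATH; else MYPATH=$1; fi
    # ":$MYPATH" is appended only when MYPATH is set and non-empty.
    MYPATH="/path/to/bin${MYPATH:+:$MYPATH}"
    echo "$MYPATH"
)

prepend_demo unset              # /path/to/bin
prepend_demo ""                 # /path/to/bin
prepend_demo "/usr/bin:/bin"    # /path/to/bin:/usr/bin:/bin
```

The conditional colon matters for more than tidiness: an unconditional "/path/to/bin:$MYPATH" would leave a trailing colon when the variable is empty, and an empty element in PATH is treated as the current directory.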

How to make fmpp fail if a value cannot be found for a variable?

Is there a parameter to make FMPP fail when it is unable to find a value for a variable in the template? Right now it just leaves the text intact with ${} if it cannot resolve a variable.
Something strange is going on there, because it does fail and by default even aborts the whole batch processing if you refer to an undefined variable. Also, it doesn't leave ${}-s in the output, because all the ${ and } are "parsed away" before the template could do anything. So I suspect the value of those variables is indeed the string "${}", or you have some tricky #escape in the border/footer/header settings, or something tricky like that. (If you can provide a minimalistic example to reproduce this, I can certainly spot the reason.)
