Bash variable concatenation

Which form is most efficient?
1)
v=''
v+='a'
v+='b'
v+='c'
2)
v2='a'` `'b'` `'c'
Assuming readability were exactly the same to you, and that's a stretch, would 1) mean creating and throwing away a few string immutables (like in Python) or act as a Java "StringBuffer" with periodical expansion of the buffer capacity? How are string concatenations handled internally in Bash?
If 2) were just as readable to you as 1), would the backticks spawn subshells, and would that be more costly, even as a potential 'no-op', than what is done in 1)?

Well, the simplest and most efficient mechanism would be option 0:
v="abc"
The first mechanism involves four assignments.
The second mechanism is bizarre (and is definitely not readable). It (nominally) runs an empty command in two sub-shells (the two ` ` parts) and concatenates the outputs (an empty string) with the three constants. If the shell simply executes the back-tick commands without noting that they're empty (and it's not unreasonable that it won't notice; it is a weird thing to try — I don't recall seeing it done in my previous 30 years of shell scripting), this is definitely vastly slower.
So, given only options (1) and (2), use option (1), but in general, use option (0) shown above.
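A minimal sketch to check that the three forms end up with the same value. The names v0, v1 and v2 are only for illustration, and it assumes a shell that supports +=, such as bash. It says nothing about speed; it only shows that the backtick pairs in option (2) are command substitutions that expand to nothing.

```shell
#!/usr/bin/env bash
# Option 0: a single literal assignment.
v0="abc"

# Option 1: start empty and append with += (not available in every Bourne-style shell).
v1=''
v1+='a'
v1+='b'
v1+='c'

# Option 2: each backtick pair is a command substitution around a lone space,
# so it contributes an empty string between the quoted constants.
v2='a'` `'b'` `'c'

printf '%s\n' "$v0" "$v1" "$v2"
```

All three lines printed should read abc.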
Why would you be building up the string piecemeal like that? What's missing from your example that makes the original code sensible but the reduced code shown less sensible?
v=""
x=$(...)
v="$v$x"
y=$(...)
v="$v$y"
z=$(...)
v="$v$z"
This would make more sense, especially if you use each of $x, $y and $z later, and/or use intermediate values of $v (perhaps in the commands represented by triple dots). The concatenation notation used will work with any Bourne-shell derivative; the alternative += notation will work with fewer shells, but is probably slightly more efficient (with the emphasis on 'slightly').
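As a rough illustration of the two notations side by side, here is a sketch that builds the same string in a loop. The parts array and its contents are hypothetical stand-ins for the outputs of the elided $(...) commands, and the array itself is a bash feature used only to keep the example short.

```shell
#!/usr/bin/env bash
# Hypothetical pieces standing in for the outputs of the elided commands.
parts=(alpha beta gamma)

portable=""
appended=""
for p in "${parts[@]}"; do
    portable="$portable$p"   # concatenation notation: any Bourne-shell derivative
    appended+="$p"           # += notation: bash, ksh93, zsh, but not plain POSIX sh
done

printf '%s\n' "$portable" "$appended"
```

Both variables should hold alphabetagamma, so the choice between the two is mostly about portability.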

The portable and straightforward method would be to use double quotes and curly brackets for variables:
VARA="beginning text ${VARB} middle text ${VARC}..."
You can even set default values for unset or empty variables this way (note that the ${VARC:0:3} substring form is a bash/ksh/zsh extension, not POSIX):
VARA="${VARB:-default text} substring manipulation 1st 3 characters ${VARC:0:3}"
Using the curly brackets prevents situations where there is a $VARa and you want to write ${VAR}a but end up getting the contents of ${VARa}.
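A short sketch of the points above, with made-up values for VAR, VARa and VARC chosen only to make the difference visible:

```shell
#!/usr/bin/env bash
VAR="file"
VARa="something else"   # a different variable whose name happens to start with VAR

plain="$VARa"           # expands VARa, not VAR followed by a literal "a"
braced="${VAR}a"        # expands VAR, then appends a literal "a"

unset VARB
VARC="abcdef"
defaulted="${VARB:-default text}"   # VARB is unset, so the default is used
first3="${VARC:0:3}"                # substring expansion (bash/ksh/zsh, not POSIX)

printf '%s\n' "$plain" "$braced" "$defaulted" "$first3"
```

This should print "something else", "filea", "default text" and "abc", one per line.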

Related

How to paste literal words in Tcl

Is there any syntax trick / feature which would allow me to paste two literal words in Tcl, e.g. to concatenate a braced ({..}) word and a double-quoted ("...") word into a single one?
I'm not asking about set a {foo}; set b "bar\nquux"; set c $a$b or append a $b -- I know about them; but about something without intermediate variables or commands. Analogous to the {*}word (which turns a word into a list).
I guess that the answer is "no way", but my shallow knowledge of Tcl doesn't allow me to draw such a conclusion.
If you are using a recent Tcl version (8.6.2 or newer) you can use
set c [string cat {foo} "bar\nquux"]
For older versions, you can resort to
set c [format %s%s {foo} "bar\nquux"]
There's no way to do what you're asking for without a command, since the syntax of braced words doesn't permit anything before or afterwards, and once you have several words you need to join them with a command (because that's what commands do from the perspective of Tcl's language core; take some values and produce a value result). Not that having braces in the middle of a string is a syntax error (it isn't), but it does stop them being quote characters. To be clear:
puts a{b} prints a{b} because { is not special in that case and instead becomes part of the value.
puts {a}b is a syntax error. (The only exception to this is {*}, which started as {expand} but that was waaaay too wordy.)
Approaches that work:
Use string cat.
Use a concatenation procedure (e.g., proc strcat {a b} {return $a$b}).
Put both values inside the braces so it is a combined literal. Which only works if you have both parts being literals, of course.
Convert the braced part to non-braced (and non-double-quoted) form. This is always possible as every braced string has a non-braced equivalent, but can involve a lot of backslashes.
If your word is a valid list, you can do:
set orig {abc def}
set new [join $orig {}]

What does the "%" mean in tcl?

In a situation like this for example:
[% $create_port %]
or [list [% $RTL_LIST %]]
I realized it had to do with the brackets, but what confuses me is that sometimes it is used with the brackets and variable followed, and sometimes you have brackets with variables inside without the %.
So I'm not sure what it is used for.
Any help is appreciated.
% is not a metacharacter in the Tcl language core, but it still has a few meanings in Tcl. In particular, it's the modulus operator in expr and a substitution field specifier in format, scan, clock format and clock scan. (It's also the default prompt character, and I have a trivial pass-through % command in my ~/.tclshrc to make cut-n-pasting code easier, but nobody else in the world needs to follow my lead there!)
But the code you have written does not appear to be any of those (because it would be a syntax error in all of the commands I've mentioned). It looks like it is some sort of directive processing scheme (with the special sequences being [% and %], with the brackets) though not one I recognise such as doctools or rivet. Because a program that embeds a Tcl interpreter could do an arbitrary transformation to scripts before executing them, it's extremely difficult to guess what it might really be.

What are the differences between script blocks, subexpressions, and subshells?

On the surface, these constructs seem similar in PowerShell:
{} Script Block - from about_Script_Blocks:
In the Windows PowerShell programming language, a script block is a collection of statements or expressions that can be used as a single unit. A script block can accept arguments and return values
$() Subexpression Operator - from about_Operators:
Returns the result of one or more statements. For a single result, returns a scalar. For multiple results, returns an array.
In the (contrived) illustration below, the subexpression and invoked script block perform the same functionality, but I'm sure a more complex example can highlight the need to use one or the other. In which scenarios might we choose to use one over the other? How might the choice influence scope and execution performance?
$foo = $(Invoke-Bar; Invoke-Baz)
$foo = &{Invoke-Bar; Invoke-Baz}
Do I understand correctly that these expressions execute in the same PowerShell process (unlike many *nix shells that would fork a subshell process for similar command substitution $())?
Is there a way to—or even a reason to—run a portion of a script in a subshell/subprocess?

Why does bash indirect expansion have to use a temp variable?

From https://stackoverflow.com/a/10820494/1764881, I know that the standard way of doing it seems to be:
var="SAMPLE$i"
echo ${!var}
But I can't get either of the following forms to work. They both fail:
echo ${!SAMPLE$i}
echo ${!"SAMPLE$i"}
I read the bash man page, but I still couldn't understand. Is it true that the first form is the only form accepted?
Yes. The underlying logic is that all parameter expansions take a single, literal word as the name of the parameter to expand, and any additional operator does something to the result. ! is no exception; var is expanded as usual, but the result is expanded again.
(As an aside, even arrays follow this rule. It might seem that something like ${array[2]%foo} applies two operators to array, but really array[2] is treated as the name of a single parameter. There is a little difference, as the index is allowed to be an arbitrary arithmetic expression rather than a literal number.)
(And for completeness, I should mention the actual exceptions, ${!prefix*} and ${!name[*]}, which confusingly use the same operator ! for querying variables themselves. The first lists variable names starting with the same prefix; the second lists the keys of the named array.)
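To make the three uses of ! concrete, here is a small sketch. The SAMPLE1 and SAMPLE2 values and the arr array are invented for the example, and it assumes no other variables starting with SAMPLE are set in the environment.

```shell
#!/usr/bin/env bash
SAMPLE1="first"
SAMPLE2="second"
i=2

# The name has to be held in a variable; ${!var} then expands the variable it names.
var="SAMPLE$i"
indirect="${!var}"

# ${!prefix*} expands to the names of the variables that start with the prefix.
names="${!SAMPLE*}"

# ${!name[*]} expands to the keys (here, indices) of the named array.
arr=(x y z)
keys="${!arr[*]}"

printf '%s\n' "$indirect" "$names" "$keys"
```

Here indirect holds "second", names holds the two SAMPLE variable names, and keys holds "0 1 2".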

Bash string manipulation, extracting/removing parts

I'm modifying an old bash file and am having some trouble manipulating strings. The problem is that the strings can be anything random to the left of _<date>.<num>. For example, from ThisIsAString-Sub_tag_150827.1, I need to extract _150827.1. In bash, this seems very difficult to do. In any other language, I would split on _, and just grab the last element of the list. How do I do this in bash? I've tried a few different ways (including with awk), but cannot seem to get it right.
With bash's Parameter Expansion:
a="ThisIsAString-Sub_tag_150827.1"
echo "${a##*_}"
Output:
150827.1
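The question asks for the piece with its leading underscore, which the expansion above strips. A small sketch of the variations, where the names suffix, with_sep and head are just illustrative:

```shell
#!/usr/bin/env bash
a="ThisIsAString-Sub_tag_150827.1"

suffix="${a##*_}"       # remove the longest prefix matching *_  (everything up to the last _)
with_sep="_${a##*_}"    # put the underscore back if it is needed
head="${a%_*}"          # remove the shortest suffix matching _*  (the part before the last _)

printf '%s\n' "$suffix" "$with_sep" "$head"
```

This should print 150827.1, then _150827.1, then ThisIsAString-Sub_tag. The ## and % operators are part of POSIX parameter expansion, so this does not depend on bash specifically.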

Resources