How does bash know where my variable names end? - bash

Given:
myvar=Hello
echo $myvar -> Shows Hello (fine so far)
echo $myvar#world -> shows Hello#world (why? I thought it would complain that there is no such variable called myvar#world)
echo ${myvar#world} -> shows just Hello (again, why?)

The second case splits up into three parts:
[echo] [$myvar][#world]
1 2 3
Part 1 is the command, part 2 is a parameter, and part 3 is a string literal. The parameter stops on r since the # can't be part of the variable name (#'s are not allowed in variable names.)
The shell parser will recognise the start of a parameter name by the $, and the end by any character which cannot be part of the variable name. Normally only letters, numbers and underscores are allowed in a variable name, anything else will tell the shell that you're finished specifying the name of the variable.
All of these will print out $myvar followed by six literal characters:
echo $myvar world
echo $myvar?world
echo $myvar#world
If you want to put characters which can be part of a parameter directly after the parameter, you can include braces around the parameter name, like this:
myvar=hello
echo ${myvar}world
which prints out:
helloworld
Your third case is substring removal, except without a match. To get it to do something interesting, try this instead:
myvar="Hello World"
echo ${myvar#Hello }
which just prints World.

Variable names cannot contain a "#", so the shell knows it's not part of the variable name.
The construct ${myvar#world} is actually a string manipulation, explained below:
# is a string modifier that removes the shortest prefix of the value matching the pattern "world". Since the value of myvar does not start with "world", it just echoes back "Hello".
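A minimal sketch of the prefix-removal operators, assuming a value of "Hello World" (the variable names no_match, exact, shortest and longest are only illustrative):

```shell
#!/usr/bin/env bash
myvar="Hello World"

# Pattern does not match the start of the value (matching is case-sensitive),
# so the value comes back unchanged.
no_match="${myvar#world}"

# Pattern matches the start of the value, so that prefix is removed.
exact="${myvar#Hello }"

# '#' removes the shortest matching prefix, '##' the longest.
shortest="${myvar#*o}"
longest="${myvar##*o}"

printf '[%s]\n' "$no_match" "$exact" "$shortest" "$longest"
```

The pattern is a glob, not a regular expression, which is why *o can stand for "anything up to an o".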

Related

Variable Expansion when using variable within filename [duplicate]

When using Bash I want to WGET multiple files from a server so I write a script with a For-loop that increments a counter to match the numbering of the files.
But I want to include the title of the file AND the number indicating the order in which the file appears (the "ID" of the file). So if the file has a URI of "example.com/files/hello_world.txt", an ID of 42, and the title "Hello World", then when I wget it, the downloaded file should have the name "42_Hello_World.txt".
I tried the following code:
#! /bin/bash
# Init
index=42
title="Hello World"
# Replace blanks with underscore
title=${title/ /_}
# Concat fileName
fileName="$index_$title.txt"
echo $fileName
but the output is just "Hello_World.txt". When I change the order of $title and $index the output is "42.txt"
Can someone explain to me why this happens and how to solve it?
tl;dr
When using two or more variables in bash when evaluating a string only the last variable is "expanded". The first one is ignored. WHY???
_ is a valid character for an identifier, so $index_$title.txt is interpreted as the concatenation of two parameter expansions, $index_ and $title. To explicitly delimit the parameter name, use the full ${...} form:
fileName=${index}_$title.txt
The braces are not necessary for $title, because the following . cannot be interpreted as part of a parameter name (though the braces are certainly permitted: ${index}_${title}.txt).
Since index_ is not defined, $index_ expands to the empty string.
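As a rough illustration of the point above, the sketch below compares the two spellings and then uses the nounset option (set -u) in a subshell to turn the silent empty expansion into an error. It assumes no variable named index_ exists in the environment; the names broken, fixed and nounset_result are made up for the example.

```shell
#!/usr/bin/env bash
index=42
title="Hello_World"

# $index_ names an unset variable, so it expands to the empty string.
broken="$index_$title.txt"

# Braces end the name explicitly, so the underscore stays literal.
fixed="${index}_$title.txt"

# With nounset enabled, expanding the unset index_ is an error.
# The subshell keeps the option from leaking into the rest of the script.
if ( set -u; : "$index_$title.txt" ) 2>/dev/null; then
    nounset_result="no error"
else
    nounset_result="caught unset variable"
fi

echo "broken: $broken"
echo "fixed:  $fixed"
echo "nounset: $nounset_result"
```

Running scripts with set -u is one way to catch this class of typo early, though it also makes every other unset expansion fatal.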
Yes. The explanation is that the _ character is a valid character in a variable name, so that your expressions are expanding the (undefined) variables $index_ and $title_ as empty strings. (The . is not a valid name character, so it terminates the 2nd name automatically.) Do this instead:
$ fileName="${index}_$title.txt"
$ echo $fileName
42_Hello_World.txt
$ echo "${title}_$index.txt"
Hello_World_42.txt
Could you please try the following? This change should give you the expected results. Put simply, "$index_$title.txt" is treated as the concatenation of two variables (index_ and title), so it's better to quote the _ like --> "_" to tell the shell that it is a literal string.
index="42"
str="Hello World"
# Replace blanks with underscore
title=${str/ /_}
# Concat fileName
fileName=$index"_"$title".txt"
echo $fileName
In this nice url, you could see the last example of VALID variables(_ is there in the list):
https://bash.cyberciti.biz/guide/Rules_for_Naming_variable_name
The _ in the filename is not helping. _ is a valid variable name character, and bash thinks that you want a variable called $index_ followed by $title, which isn't what you want. You can either:
Change the underscore to a character that is not valid in a variable name
Change to filename=$title"_"$index".txt" or
put braces around index, as in ${index}
Hope this helps!
EDIT: You already have an answer here! How to echo "$x_$y" in Bash script?

bash variable substitution right next to another string [duplicate]

In shell scripts, when do we use {} when expanding variables?
For example, I have seen the following:
var=10 # Declare variable
echo "${var}" # One use of the variable
echo "$var" # Another use of the variable
Is there a significant difference, or is it just style? Is one preferred over the other?
In this particular example, it makes no difference. However, the {} in ${} are useful if you want to expand the variable foo in the string
"${foo}bar"
since "$foobar" would instead expand the variable identified by foobar.
Curly braces are also unconditionally required when:
expanding array elements, as in ${array[42]}
using parameter expansion operations, as in ${filename%.*} (remove extension)
expanding positional parameters beyond 9: "$8 $9 ${10} ${11}"
Doing this everywhere, instead of just in potentially ambiguous cases, can be considered good programming practice. This is both for consistency and to avoid surprises like $foo_$bar.jpg, where it's not visually obvious that the underscore becomes part of the variable name.
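The three cases where braces are required can be sketched as follows. The array contents, the file name and the helper function show_tenth are assumptions made up for the illustration.

```shell
#!/usr/bin/env bash
array=(zero one two)
filename="archive.tar.gz"

# Array element: braces are needed for the index to be honoured.
elem="${array[1]}"
# Without braces, $array is element 0 and [1] is literal text.
no_braces="$array[1]"

# Parameter expansion operation: remove the shortest suffix matching .*
stem="${filename%.*}"

# Positional parameters beyond 9 (hypothetical helper, called with 11 arguments).
show_tenth() {
    with_braces="${10}"     # the tenth argument
    without_braces="$10"    # $1 followed by a literal 0
}
show_tenth a b c d e f g h i j k

printf '%s\n' "$elem" "$no_braces" "$stem" "$with_braces" "$without_braces"
```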
Variables are declared and assigned without $ and without {}. You have to use
var=10
to assign. In order to read from the variable (in other words, 'expand' the variable), you must use $.
$var # use the variable
${var} # same as above
${var}bar # expand var, and append "bar" too
$varbar # same as ${varbar}, i.e expand a variable called varbar, if it exists.
This has confused me sometimes - in other languages we refer to the variable in the same way, regardless of whether it's on the left or right of an assignment. But shell-scripting is different, $var=10 doesn't do what you might think it does!
You use {} for grouping. The braces are required to dereference array elements. Example:
dir=(*) # store the contents of the directory into an array
echo "${dir[0]}" # get the first entry.
echo "$dir[0]" # incorrect
You are also able to do some text manipulation inside the braces:
STRING="./folder/subfolder/file.txt"
echo ${STRING} ${STRING%/*/*}
Result:
./folder/subfolder/file.txt ./folder
or
STRING="This is a string"
echo ${STRING// /_}
Result:
This_is_a_string
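For comparison, here is a small sketch of how the single and doubled forms of these operators differ (the variable names on the left are only illustrative):

```shell
#!/usr/bin/env bash
STRING="./folder/subfolder/file.txt"
TITLE="This is a string"

# % removes the shortest suffix matching the pattern, %% the longest.
short_suffix="${STRING%/*}"
long_suffix="${STRING%%/*}"

# / replaces only the first match, // replaces every match.
first_only="${TITLE/ /_}"
all_matches="${TITLE// /_}"

printf '%s\n' "$short_suffix" "$long_suffix" "$first_only" "$all_matches"
```

Note that the single-slash form replaces only the first blank, which matters for titles containing more than two words.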
You are right that for "regular variables" the braces are not needed... But they are helpful for debugging and for reading a script.
Curly braces are always needed for accessing array elements and carrying out brace expansion.
It's good not to be over-cautious by using {} for shell variable expansion even when there is no scope for ambiguity.
For example:
dir=log
prog=foo
path=/var/${dir}/${prog} # excessive use of {}, not needed since / can't be a part of a shell variable name
logfile=${path}/${prog}.log # same as above, . can't be a part of a shell variable name
path_copy=${path} # {} is totally unnecessary
archive=${logfile}_arch # {} is needed since _ can be a part of shell variable name
So, it is better to write the three lines as:
path=/var/$dir/$prog
logfile=$path/$prog.log
path_copy=$path
which is definitely more readable.
Since a variable name can't start with a digit, the shell doesn't need {} around numbered variables (like $1, $2 etc.) unless such an expansion is followed by a digit. That's too subtle, and it does make sense to explicitly use {} in such contexts:
set app # set $1 to app
fruit=$1le # sets fruit to apple, but confusing
fruit=${1}le # sets fruit to apple, makes the intention clear
See:
Allowed characters in Linux environment variable names
The end of the variable name is usually signified by a space or newline. But what if we don't want a space or newline after printing the variable value? The curly braces tell the shell interpreter where the end of the variable name is.
Classic Example 1) - shell variable without trailing whitespace
TIME=10
# WRONG: no such variable called 'TIMEsecs'
echo "Time taken = $TIMEsecs"
# What we want is $TIME followed by "secs" with no whitespace between the two.
echo "Time taken = ${TIME}secs"
Example 2) Java classpath with versioned jars
# WRONG - no such variable LATESTVERSION_src
CLASSPATH=hibernate-$LATESTVERSION_src.zip:hibernate_$LATEST_VERSION.jar
# RIGHT
CLASSPATH=hibernate-${LATESTVERSION}_src.zip:hibernate_$LATEST_VERSION.jar
(Fred's answer already states this but his example is a bit too abstract)
Following SierraX and Peter's suggestion about text manipulation, curly brackets {} can also be used to delimit a variable when passing it to a command, for instance:
Let's say you have a sposi.txt file containing the first line of a well-known Italian novel:
> sposi="somewhere/myfolder/sposi.txt"
> cat $sposi
Output: quel ramo del lago di como che volge a mezzogiorno
Now create two variables:
# Search the 2nd word found in the file that "sposi" variable points to
> word=$(cat $sposi | cut -d " " -f 2)
# This variable will replace the word
> new_word="filone"
Now replace the content of the word variable with that of new_word inside the sposi.txt file:
> sed -i "s/${word}/${new_word}/g" $sposi
> cat $sposi
Output: quel filone del lago di como che volge a mezzogiorno
The word "ramo" has been replaced.

How do you take a suffix of a string in bash using negative offsets?

I am trying to take the suffix of a string in Bash using the ${string:pos} substring syntax, but I cannot figure out why it won't work. I have managed to simplify my example code to this:
STRING="hello world"
POS=4
echo ${STRING:POS} # prints "o world"
echo ${STRING:4} # prints "o world"
POS=-4
echo ${STRING:POS} # prints "orld"
echo ${STRING:-4} # prints "hello world"
The first three lines work exactly as I would expect, but why does the final line print "hello world" instead of "orld"?
Because :- is parameter expansion syntax to "Use default values".
From the documentation:
When not performing substring expansion, using the form described
below (e.g., ‘:-’), Bash tests for a parameter that is unset or
null.
So by doing ${STRING:-4} you are actually asking bash to expand STRING and, if it is unset (has never been assigned) or null (an empty string, printed as ''), to substitute the expansion with 4. In your example, STRING is set and non-null, and thus it is expanded to its value.
As another answer states, you need to escape the expression so as not to trigger the default value behavior; the manual specifies it:
Note that a negative offset must be separated from the colon by at
least one space to avoid being confused with the :- expansion.
For example:
${STRING:(-4)}
${STRING: -4}
You need to "escape" offsets starting with a dash with parentheses or a space, otherwise bash will read :- as the "use default values" operator:
echo ${STRING:(-4)}
echo ${STRING: -4}
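The difference between the two readings of the dash can be sketched like this. NOT_SET is a hypothetical variable name, unset on purpose so the default-value form has something to do.

```shell
#!/usr/bin/env bash
STRING="hello world"
unset -v NOT_SET   # hypothetical variable, deliberately unset

# Substring expansion with a negative offset: last four characters.
tail_space="${STRING: -4}"
tail_paren="${STRING:(-4)}"

# Without the space, :- is the "use default values" operator.
default_when_set="${STRING:-4}"
default_when_unset="${NOT_SET:-4}"

printf '%s\n' "$tail_space" "$tail_paren" "$default_when_set" "$default_when_unset"
```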

Theory: who can explain the use of =

Can someone explain why this code
data=$(date +"%Y-%m-%dS%H:%M:%S")
name="/home/cft/"$data"_test.tar"
touch $name
works, creating a new .tar file, but this code doesn't work
data=$(date +"%Y-%m-%dS%H:%M:%S")
name= "/home/cft/"$data"_test.tar"
touch $name
and gives me this error: no such file or directory?
Why does the space between = and the inverted commas cause this error?
Shell allows you to provide per-command environment overrides by prefixing the command with one or more variable assignments.
name= "/home/cft/"$data"_test.tar"
asks the shell to run the program named /home/cft/2013-10-08S12:00:00_test.tar (for example) with the value of name set to the empty string in its environment.
(In your case, the error occurs because the named tar file either doesn't exist or, if it does, is not an executable file.)
A variable assignment is identified by having no whitespace after the equal sign.
(name = whatever, of course, is simply a command called name with two string arguments, = and whatever.)
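A small sketch of the per-command environment override described above, using sh -c as a stand-in child command (the variable names greeting, child_saw, parent_state and empty_seen are made up for the illustration):

```shell
#!/usr/bin/env bash
unset -v greeting

# The assignment prefix applies only to the environment of the child command.
child_saw="$(greeting=hello sh -c 'printf %s "$greeting"')"

# Afterwards, greeting is still unset in the calling shell.
parent_state="${greeting-unset}"

# 'name= command' runs the command with name set to the empty string.
empty_seen="$(name= sh -c 'printf "[%s]" "$name"')"

printf '%s\n' "$child_saw" "$parent_state" "$empty_seen"
```

This is why name= "/home/cft/..." is parsed as "run the program at that path with name empty" rather than as an assignment.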
You can't have whitespace between the equal sign and the definition.
http://www.tldp.org/LDP/abs/html/varassignment.html
There is no theory behind this. It's just a decision the language designers made, and which the parser enforces.
In BASH (and other Bourne type shells like zsh and Kornshell), the equal sign cannot have spaces around it when setting variables.
Good:
$ foo="bar"
Bad:
$ foo= "bar"
$ foo = "bar"
There's no real reason that would prevent spaces from being used. Other programming languages have no problems with this. It's just the syntax of the shell itself.
The reason might be related to the original Bourne shell parsing where the shell would break up a command line based upon whitespace. That would make foo=bar a single parameter instead of two or three (depending if you have white space on both sides or just one side of the equal sign). The shell could see the = sign, and know this parameter is an assignment.
The shell parameter parsing is very primitive in many ways. Whitespace is very important. The shell has to be small and fast in order to be responsive. That means stripping down unessential things like complex line parsing.
By inverted commas, I believe you mean quotation marks. Double quotes are used to prevent a value from being split into separate words at whitespace:
Bad:
$ foo=this is a test
bash: is: command not found
Good:
$ foo="this is a test"
Double quotes allow interpolation. Single quotes don't:
$ foo="bar"
$ echo "The value of foo is $foo"
The value of foo is bar
$ echo 'The value of foo is $foo'
The value of foo is $foo
If you start out with double quotes, you can put single quotes inside. If you start out with single quotes, you can put double quotes inside.
$ foo="bar"
$ echo "The value of foo is '$foo'"
The value of foo is 'bar'
$ echo 'The value of foo is "$foo"'
The value of foo is "$foo"
This means you didn't have to unquote $data. However, you would have to put curly braces around it because underscores are legal characters in variable names. Thus, you want to make sure that the shell understands that the variable is $data and not $data_test:
name="/home/cft/${data}_test.tar"

Bash bad substitution with subshell and substring

A contrived example... given
FOO="/foo/bar/baz"
this works (in bash)
BAR=$(basename $FOO) # result is BAR="baz"
BAZ=${BAR:0:1} # result is BAZ="b"
this doesn't
BAZ=${$(basename $FOO):0:1} # result is bad substitution
My question is which rule causes this [subshell substitution] to evaluate incorrectly? And what is the correct way, if any, to do this in 1 hop?
First off, note that when you say this:
BAR=$(basename $FOO) # result is BAR="baz"
BAZ=${BAR:0:1} # result is BAZ="b"
the first bit in the construct for BAZ is BAR and not the value that you want to take the first character of. So even if bash allowed variable names to contain arbitrary characters your result in the second expression wouldn't be what you want.
However, as to the rule that's preventing this, allow me to quote from the bash man page:
DEFINITIONS
The following definitions are used throughout the rest of this document.
blank  A space or tab.
word   A sequence of characters considered as a single unit by the shell. Also known as a token.
name   A word consisting only of alphanumeric characters and underscores, and beginning with an alphabetic character or an underscore. Also referred to as an identifier.
Then a bit later:
PARAMETERS
A parameter is an entity that stores values. It can be a name, a number, or one of the special characters listed below under Special Parameters. A variable is a parameter denoted by a name. A variable has a value and zero or more attributes. Attributes are assigned using the declare builtin command (see declare below in SHELL BUILTIN COMMANDS).
And later when it defines the syntax you're asking about:
${parameter:offset:length}
Substring Expansion. Expands to up to length characters of
parameter starting at the character specified by offset.
So the rules as articulated in the manpage say that the ${foo:x:y} construct must have a parameter as the first part, and that a parameter can only be a name, a number, or one of the few special parameter characters. $(basename $FOO) is not one of the allowed possibilities for a parameter.
As for a way to do this in one assignment, use a pipe to other commands as mentioned in other responses.
Modified forms of parameter substitution such as ${parameter#word} can only modify a parameter, not an arbitrary word.
In this case, you might pipe the output of basename to a dd command, like
BAR=$(basename -- "$FOO" | dd bs=1 count=1 2>/dev/null)
(If you want a higher count, increase count and not bs, otherwise you may get fewer bytes than requested.)
In the general case, there is no way to do things like this in one assignment.
It fails because ${BAR:0:1} is a variable expansion. Bash expects to see a variable name after ${, not a value.
I'm not aware of a way to do it in a single expression.
As others have said, the first parameter of ${} needs to be a variable name. But you can use another subshell to approximate what you're trying to do.
Instead of:
BAZ=${$(basename $FOO):0:1} # result is bad substitution
Use:
BAZ=$(_TMP=$(basename $FOO); echo ${_TMP:0:1}) # this works
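As an alternative sketch that avoids the external basename call, the same result can be approximated with two parameter expansions wrapped in a function. The function name first_char_of_basename is hypothetical, and unlike basename this does not handle a trailing slash.

```shell
#!/usr/bin/env bash
FOO="/foo/bar/baz"

# Hypothetical helper: print the first character of the last path component.
first_char_of_basename() {
    local base="${1##*/}"          # remove the longest prefix ending in a slash
    printf '%s\n' "${base:0:1}"    # substring: one character from offset 0
}

BAZ="$(first_char_of_basename "$FOO")"
echo "$BAZ"
```

Each expansion still operates on a named parameter (here $1 and base), which is what the ${...} syntax requires.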
A contrived solution for your contrived example:
BAZ=$(expr $(basename $FOO) : '\(.\)')
as in
$ FOO=/abc/def/ghi/jkl
$ BAZ=$(expr $(basename $FOO) : '\(.\)')
$ echo $BAZ
j
In ${string:0:1}, string must be a variable name.
For example:
FOO="/foo/bar/baz"
baz="foo"
BAZ=$(eval echo '${'"$(basename $FOO)"':0:1}')
echo $BAZ
The result is 'f', the first character of the value of the variable named baz (not of the string "baz").

Resources