What is the $ in bash? - bash

I've been using bash for about 3 months.
I'm learning the language step by step, but I have a question.
Does $ in bash really mean the same thing as in C?
I mean the $ not $1, $0, $# etc etc.
Only the $.

The $ is used to perform parameter expansion. For a variable named foo, the expression $foo expands to the value of the variable.
$ foo=3
$ echo "foo"
foo
$ echo "$foo"
3
$ is also used as the default/generic prompt, but there it is simply used as a distinctive character; it has no actual meaning, and could be replaced without causing any change in functionality.
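To make the expansion rule concrete, here is a small sketch. The variable names (plain, expanded, braced, literal) are made up for illustration; the point is only that $ triggers expansion, that braces mark where the name ends, and that single quotes switch expansion off.

```shell
#!/usr/bin/env bash
foo=3

plain='foo'            # no $: just the literal word
expanded="$foo"        # $foo expands to the variable's value
braced="${foo}rd"      # braces mark where the variable name ends
literal='$foo'         # single quotes suppress expansion

printf '%s\n' "$plain" "$expanded" "$braced" "$literal"
```

This should print foo, 3, 3rd and $foo on separate lines.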

Related

Could someone explain to me what the bash command "echo{,}" means?

If I do this:
echo{,}
The result is:
echo
I don't really understand the {,} at the end, or the result.
Thanks for clarifying this.
I would start with something simpler to see how {} works: As @anubhava linked, it generates strings. Essentially, it expands all the elements in it and combines them with whatever is before and after it (space is separator if you don't quote).
Example:
$ bash -xc 'echo before{1,2}after and_sth_else'
+ echo before1after before2after and_sth_else
before1after before2after and_sth_else
Note that there is a space between echo and the arguments. This is not the case in what you posted. So what happened there? Check the following:
$ bash -xc 'man{1,2}'
+ man1 man2
bash: man1: command not found
The result of the expansion is fed to bash and bash tries to execute it. In the above case, the command it is looking for is man1 (which does not exist).
Finally, apply the above to your question:
echo{,}
{,} expands to two empty elements/strings
These are then prefixed/concatenated with "echo" so we now have echo echo
Expansion finished and this is given to bash to execute
Command is echo and its first argument is "echo"... so it echoes echo!
echo{,}
prints just echo because it is equivalent to echo echo.
More examples to clarify:
bash -xc 'echo{,}'
+ echo echo
echo
echo foo{,}
foo foo
echo foo{,,}
foo foo foo
More about Brace Expansion
Brace expansion is a mechanism by which arbitrary strings may be generated. This mechanism is similar to pathname expansion, but the filenames generated
need not exist. Patterns to be brace expanded take the form of an optional preamble, followed by either a series of comma-separated strings or a sequence
expression between a pair of braces, followed by an optional postscript. The preamble is prefixed to each string contained within the braces, and the
postscript is then appended to each resulting string, expanding left to right.
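The manual's wording maps onto code fairly directly. The following is only a sketch with invented names (list, seq_files, doubled), showing a preamble and postscript around a comma list, a sequence expression, and the empty-element case from the question.

```shell
#!/usr/bin/env bash
# Comma list with preamble "img_" and postscript ".png".
list=( img_{a,b,c}.png )

# Sequence expression; none of these files need to exist.
seq_files=( chapter{1..3}.txt )

# Brace expansion is purely textual: an empty element is still an element.
doubled=( foo{,} )

printf '%s\n' "${list[*]}" "${seq_files[*]}" "${doubled[*]}"
```

The three lines printed should be img_a.png img_b.png img_c.png, then chapter1.txt chapter2.txt chapter3.txt, then foo foo.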
The {item1,item2,...} is a brace expansion.
So echo{,} is expanded as echo echo because {,} has two (empty) elements, then echo prints its argument.
Try this:
$ set -x
$ echo{,}
+ echo echo
echo
$ set +x
+ set +x
$
It's also handy to generate "cross products" without nested loops:
$ ary=( {1,2,3}{a,b,c} )
$ declare -p ary
declare -a ary=([0]="1a" [1]="1b" [2]="1c" [3]="2a" [4]="2b" [5]="2c" [6]="3a" [7]="3b" [8]="3c")

IFS=: "set $var" versus "set toto:fofo"

This is probably a stupid question, but I am trying to understand why version 1 of the code below works and version 2 doesn't:
version 1:
$ VAR=toto:fofo:bar
$ IFS=:
$ set $VAR
$ echo $1
toto
version 2:
$ IFS=:
$ set toto:fofo:bar
$ echo $1
toto fofo bar
I don't understand why the ':' characters are interpreted as separators in the first version but not at all in the second. Is it as if they are only interpreted when they come from a variable substitution?
You're right. Word splitting only applies to the result of unquoted parameter expansions and command substitutions. It does not affect shell parsing or grammar.
Here's man bash with emphasis:
IFS
The Internal Field Separator that is used for word splitting after expansion [...]
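A way to see the difference side by side, as a rough sketch: the function name split_count and the two counter variables are invented for this example, and IFS is made local so the change does not leak into the rest of the shell.

```shell
#!/usr/bin/env bash
var=toto:fofo:bar

split_count() {
  local IFS=:              # confine the IFS change to this function
  set -- $var              # unquoted expansion: the result is split on ':'
  n_from_var=$#
  set -- toto:fofo:bar     # literal word: parsed as one word, IFS is not consulted
  n_from_literal=$#
}
split_count

echo "$n_from_var $n_from_literal"
```

This should print 3 1: the expanded variable yields three positional parameters, the literal word only one.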

Quoted printing of shell variables

I would like to print the contents of a variable in such a way, that the printed output could be pasted directly into a shell to get the original content of the variable.
This is trivial if the content doesn't contain any special characters, esp no quotes. e.g.
$ x=foo
$ echo x=${x}
x=foo
In the above example I can take the output (x=foo) and paste it into a new terminal to assign foo to x.
If the variable content contains spaces, things get a bit trickier, but it's still easy:
$ x="foo bar"
$ echo x=\"${x}\"
x="foo bar"
Now trouble starts, if the variable is allowed to contain any character, e.g.:
$ x=foo\"bar\'baz
$ echo ${x}
foo"bar'baz
$ echo x=\"${x}\"
x="foo"bar'baz"
$ x="foo"bar'baz"
>
(and the terminal hangs, waiting for me to close the unbalanced ")
What I would have expected was an output like the following:
x=foo\"bar\'baz
How would I do that, preferably POSIX compliant (but if it cannot be helped, bash only)?
Use declare -p:
x=foo\"bar\'baz
declare -p x
declare -- x="foo\"bar'baz"
Even with an array:
arr=(a 'foo' 'foo bar' 123)
declare -p arr
declare -a arr='([0]="a" [1]="foo" [2]="foo bar" [3]="123")'
The -p option will display the attributes and values of each variable name that can be used to directly set the variable again.
As per comment, doing this in POSIX without help of declare -p would be:
set | grep '^x='
You could also use printf with its %q format specifier, which prints the corresponding argument shell-quoted:
x=foo\"bar\'baz
printf 'x=%q' "$x"
x=foo\"bar\'baz
Again, though, %q is not POSIX; it is an extension provided by bash's printf. If this is what you were looking for, see the question "POSIX sh equivalent for Bash's printf %q".
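If you want to check that the quoted form really round-trips, something like the following sketch may help. It is bash-only (printf -v and %q are bash features), and the names assignment and y are just placeholders for this example.

```shell
#!/usr/bin/env bash
x=foo\"bar\'baz

# printf -v stores the formatted text in a variable instead of printing it.
printf -v assignment 'y=%q' "$x"
echo "$assignment"

# Evaluating the generated text recreates the original value in y.
eval "$assignment"
[ "$x" = "$y" ] && echo "round trip ok"
```

The eval is only reasonable here because the text was produced by %q; evaluating unquoted, untrusted text would be a different matter.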

A semantics for Bash scripts?

More than any other language I know, I've "learned" Bash by Googling every time I need some little thing. Consequently, I can patchwork together little scripts that appear to work. However, I don't really know what's going on, and I was hoping for a more formal introduction to Bash as a programming language. For example: What is the evaluation order? what are the scoping rules? What is the typing discipline, e.g. is everything a string? What is the state of the program -- is it a key-value assignment of strings to variable names; is there more than that, e.g. the stack? Is there a heap? And so on.
I thought to consult the GNU Bash manual for this kind of insight, but it doesn't seem to be what I want; it's more of a laundry list of syntactic sugar rather than an explanation of the core semantic model. The million-and-one "bash tutorials" online are only worse. Perhaps I should first study sh, and understand Bash as a syntactic sugar on top of this? I don't know if this is an accurate model, though.
Any suggestions?
EDIT: I've been asked to provide examples of what ideally I'm looking for. A rather extreme example of what I would consider a "formal semantics" is this paper on "the essence of JavaScript". Perhaps a slightly less formal example is the Haskell 2010 report.
A shell is an interface for the operating system. It is usually a more-or-less robust programming language in its own right, but with features designed to make it easy to interact specifically with the operating system and filesystem. The POSIX shell's (hereafter referred to just as "the shell") semantics are a bit of a mutt, combining some features of LISP (s-expressions have a lot in common with shell word splitting) and C (much of the shell's arithmetic syntax semantics comes from C).
The other root of the shell's syntax comes from its upbringing as a mishmash of individual UNIX utilities. Most of what are often builtins in the shell can actually be implemented as external commands. It throws many shell neophytes for a loop when they realize that /bin/[ exists on many systems.
$ if '/bin/[' -f '/bin/['; then echo t; fi # Tested as-is on OS X, without the `]`
t
wat?
This makes a lot more sense if you look at how a shell is implemented. Here's an implementation I did as an exercise. It's in Python, but I hope that's not a hangup for anyone. It's not terribly robust, but it is instructive:
#!/usr/bin/env python
from __future__ import print_function
import os, sys
'''Hacky barebones shell.'''
try:
    input=raw_input
except NameError:
    pass
def main():
    while True:
        cmd = input('prompt> ')
        args = cmd.split()
        if not args:
            continue
        cpid = os.fork()
        if cpid == 0:
            # We're in a child process
            os.execl(args[0], *args)
        else:
            os.waitpid(cpid, 0)
if __name__ == '__main__':
    main()
I hope the above makes it clear that the execution model of a shell is pretty much:
1. Expand words.
2. Assume the first word is a command.
3. Execute that command with the following words as arguments.
Expansion, command resolution, execution. All of the shell's semantics are bound up in one of these three things, although they're far richer than the implementation I wrote above.
Not all commands fork. In fact, there are a handful of commands that don't make a ton of sense implemented as externals (such that they would have to fork), but even those are often available as externals for strict POSIX compliance.
Bash builds upon this base by adding new features and keywords to enhance the POSIX shell. It is nearly compatible with sh, and bash is so ubiquitous that some script authors go years without realizing that a script may not actually work on a POSIXly strict system. (I also wonder how people can care so much about the semantics and style of one programming language, and so little for the semantics and style of the shell, but I digress.)
Order of evaluation
This is a bit of a trick question: Bash interprets expressions in its primary syntax from left to right, but in its arithmetic syntax it follows C precedence. Expressions differ from expansions, though. From the EXPANSION section of the bash manual:
The order of expansions is: brace expansion; tilde expansion, parameter
and variable expansion, arithmetic expansion, and command substitution
(done in a left-to-right fashion); word splitting; and pathname expansion.
If you understand wordsplitting, pathname expansion and parameter expansion, you are well on your way to understanding most of what bash does. Note that pathname expansion coming after wordsplitting is critical, because it ensures that a file with whitespace in its name can still be matched by a glob. This is why good use of glob expansions is better than parsing commands, in general.
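One place where the expansion order becomes visible is brace expansion running before parameter expansion. The sketch below uses made-up names (n, early, late); the eval line is just one way to force a second pass and is not something the manual prescribes.

```shell
#!/usr/bin/env bash
n=3

# Brace expansion runs first, while $n is still the literal text "$n",
# so {1..$n} is not a valid sequence expression and is left alone.
# Parameter expansion then turns it into the single word {1..3}.
early=( {1..$n} )

# eval adds a second pass: $n is expanded first, then the braces.
eval "late=( {1..$n} )"

echo "${#early[@]} element(s): ${early[*]}"
echo "${#late[@]} element(s): ${late[*]}"
```

This should print 1 element(s): {1..3} followed by 3 element(s): 1 2 3.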
Scope
Function scope
Much like old ECMAscript, the shell has dynamic scope unless you explicitly declare names within a function.
$ foo() { echo $x; }
$ bar() { local x; echo $x; }
$ foo
$ bar
$ x=123
$ foo
123
$ bar
$ …
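Dynamic scope also means a function sees the locals of whoever called it. A small sketch, with invented function names (show, with_local) and a helper variable seen to carry the result out:

```shell
#!/usr/bin/env bash
x=global

show() { seen="$x"; }                    # reads whichever x is visible at call time

with_local() { local x=local; show; }    # the callee sees the caller's local x

show;       from_top="$seen"
with_local; from_func="$seen"

echo "$from_top $from_func $x"
```

This should print global local global: show picks up the caller's local x, and the global x is untouched afterwards.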
Environment and process "scope"
Subshells inherit the variables of their parent shells, but other kinds of processes don't inherit unexported names.
$ x=123
$ ( echo $x )
123
$ bash -c 'echo $x'
$ export x
$ bash -c 'echo $x'
123
$ y=123 bash -c 'echo $y' # another way to transiently export a name
123
You can combine these scoping rules:
$ foo() {
> local -x bar=123 # Export bar, but only in this scope
> bash -c 'echo $bar'
> }
$ foo
123
$ echo $bar
$
Typing discipline
Um, types. Yeah. Bash really doesn't have types, and everything expands to a string (or perhaps a word would be more appropriate.) But let's examine the different types of expansions.
Strings
Pretty much anything can be treated as a string. Barewords in bash are strings whose meaning depends entirely on the expansion applied to them.
No expansion
It may be worthwhile to demonstrate that a bare word really is just a word, and that quotes change nothing about that.
$ echo foo
foo
$ 'echo' foo
foo
$ "echo" foo
foo
Substring expansion
$ fail='echoes'
$ set -x # So we can see what's going on
$ "${fail:0:-2}" Hello World
+ echo Hello World
Hello World
For more on expansions, read the Parameter Expansion section of the manual. It's quite powerful.
Integers and arithmetic expressions
You can imbue names with the integer attribute to tell the shell to treat the right hand side of assignment expressions as arithmetic. Then, when the parameter expands it will be evaluated as integer math before expanding to … a string.
$ foo=10+10
$ echo $foo
10+10
$ declare -i foo
$ foo=$foo # Must re-evaluate the assignment
$ echo $foo
20
$ echo "${foo:0:1}" # Still just a string
2
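The integer attribute also changes what += does, which is an easy way to see that the attribute affects assignment rather than the stored value. A sketch with arbitrary variable names (s, n, m):

```shell
#!/usr/bin/env bash
s=10
s+=5              # plain variable: += appends text

declare -i n=10
n+=5              # integer attribute: += adds

declare -i m
m='2 * (3 + 4)'   # the right-hand side is evaluated as arithmetic

echo "$s $n $m"
```

This should print 105 15 14.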
Arrays
Arguments and Positional Parameters
Before talking about arrays it might be worth discussing positional parameters. The arguments to a shell script can be accessed using numbered parameters, $1, $2, $3, etc. You can access all these parameters at once using "$@", whose expansion has many things in common with arrays. You can set and change the positional parameters using the set or shift builtins, or simply by invoking the shell or a shell function with these parameters:
$ bash -c 'for ((i=1;i<=$#;i++)); do
> printf "\$%d => %s\n" "$i" "${@:i:1}"
> done' -- foo bar baz
$1 => foo
$2 => bar
$3 => baz
$ showpp() {
> local i
> for ((i=1;i<=$#;i++)); do
> printf '$%d => %s\n' "$i" "${@:i:1}"
> done
> }
$ showpp foo bar baz
$1 => foo
$2 => bar
$3 => baz
$ showshift() {
> shift 3
> showpp "$@"
> }
$ showshift foo bar baz biz quux xyzzy
$1 => biz
$2 => quux
$3 => xyzzy
The bash manual also sometimes refers to $0 as a positional parameter. I find this confusing, because it doesn't include it in the argument count $#, but it is a numbered parameter, so meh. $0 is the name of the shell or the current shell script.
Arrays
The syntax of arrays is modeled after positional parameters, so it's mostly healthy to think of arrays as a named kind of "external positional parameters", if you like. Arrays can be declared using the following approaches:
$ foo=( element0 element1 element2 )
$ bar[3]=element3
$ baz=( [12]=element12 [0]=element0 )
You can access array elements by index:
$ echo "${foo[1]}"
element1
You can slice arrays:
$ printf '"%s"\n' "${foo[@]:1}"
"element1"
"element2"
If you treat an array as a normal parameter, you'll get the zeroth index.
$ echo "$baz"
element0
$ echo "$bar" # Even if the zeroth index isn't set
$ …
If you use quotes or backslashes to prevent wordsplitting, the array will maintain the specified wordsplitting:
$ foo=( 'elementa b c' 'd e f' )
$ echo "${#foo[@]}"
2
The main differences between arrays and positional parameters are:
Positional parameters are not sparse. If $12 is set, you can be sure $11 is set, too. (It could be set to the empty string, but $# will not be smaller than 12.) If "${arr[12]}" is set, there's no guarantee that "${arr[11]}" is set, and the length of the array could be as small as 1.
The zeroth element of an array is unambiguously the zeroth element of that array. In positional parameters, the zeroth element is not the first argument, but the name of the shell or shell script.
To shift an array, you have to slice and reassign it, like arr=( "${arr[@]:1}" ). You could also do unset arr[0], but that would make the first element at index 1.
Arrays can be shared implicitly between shell functions as globals, but you have to explicitly pass positional parameters to a shell function for it to see those.
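The sparseness and the slice-to-shift points can be checked with a short sketch. The array contents and the helper names (count, indices, shifted_indices) are invented for illustration.

```shell
#!/usr/bin/env bash
arr=( [0]=zero [5]=five [12]=twelve )

count=${#arr[@]}          # number of set elements, not highest index + 1
indices="${!arr[*]}"      # the indices that are actually set

# "Shift" by slicing and reassigning; this also renumbers from 0.
arr=( "${arr[@]:1}" )
shifted_indices="${!arr[*]}"

echo "$count | $indices | $shifted_indices | ${arr[0]}"
```

This should print 3 | 0 5 12 | 0 1 | five. Note that reassigning the slice compacts the array, so the gaps between indices are gone afterwards.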
It's often convenient to use pathname expansions to create arrays of filenames:
$ dirs=( */ )
Commands
Commands are key, but they're also covered in better depth than I can by the manual. Read the SHELL GRAMMAR section. The different kinds of commands are:
Simple Commands (e.g. $ startx)
Pipelines (e.g. $ yes | make config) (lol)
Lists (e.g. $ grep -qF foo file && sed 's/foo/bar/' file > newfile)
Compound Commands (e.g. $ ( cd -P /var/www/webroot && echo "webroot is $PWD" ))
Coprocesses (Complex, no example)
Functions (A named compound command that can be treated as a simple command)
Execution Model
The execution model of course involves both a heap and a stack. This is endemic to all UNIX programs. Bash also has a call stack for shell functions, visible via nested use of the caller builtin.
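Besides the caller builtin, bash exposes the function call stack through the FUNCNAME array, which is easier to inspect in a script. A sketch, with made-up function names (inner, middle, outer):

```shell
#!/usr/bin/env bash
inner() {
  # FUNCNAME[0] is the current function, FUNCNAME[1] its caller, and so on.
  immediate_caller="${FUNCNAME[1]}"
  chain="${FUNCNAME[*]:0:3}"
}
middle() { inner; }
outer()  { middle; }

outer
echo "$immediate_caller"
echo "$chain"
```

This should print middle and then inner middle outer. Only the top three frames are shown because what sits below them depends on how the code was started.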
References:
The SHELL GRAMMAR section of the bash manual
The XCU Shell Command Language documentation
The Bash Guide on Greycat's wiki.
Advanced Programming in the UNIX Environment
Please make comments if you want me to expand further in a specific direction.
The answer to your question "What is the typing discipline, e.g. is everything a string"
Bash variables are character strings, but Bash permits arithmetic operations and comparisons on variables whose values are integers. The exception to the rule that Bash variables are character strings is when they are typeset or declared otherwise.
$ A=10/2
$ echo "A = $A" # Variable A acting like a String.
A = 10/2
$ B=1
$ let B="$B+1" # Let is internal to bash.
$ echo "B = $B" # One was added to B, which behaved as an integer.
B = 2
$ A=1024 # A Defaults to string
$ B=${A/24/STRING01} # Substitute "24" with "STRING01".
$ echo "B = $B" # B is a string
B = 10STRING01
$ B=${A/24/STRING01} # Substitute "24" with "STRING01".
$ declare -i B
$ echo "B = $B" # Declaring a variable with non-integers in it doesn't change the contents.
B = 10STRING01
$ B=${B/STRING01/24} # Substitute "STRING01" with "24".
$ echo "B = $B"
B = 1024
$ declare -i B=10/2 # Declare B and assigning it an integer value
$ echo "B = $B" # Variable B behaving as an Integer
B = 5
Declare option meanings:
-a Variable is an array.
-f Use function names only.
-i The variable is to be treated as an integer; arithmetic evaluation is performed when the variable is assigned a value.
-p Display the attributes and values of each variable. When -p is used, additional options are ignored.
-r Make variables read-only. These variables cannot then be assigned values by subsequent assignment statements, nor can they be unset.
-t Give each variable the trace attribute.
-x Mark each variable for export to subsequent commands via the environment.
The bash manpage has quite a bit more info than most manpages, and includes some of what you're asking for. My assumption after more than a decade of scripting bash is that, due to its history as an extension of sh, it has some funky syntax (to maintain backward compatibility with sh).
FWIW, my experience has been like yours; although the various books (e.g., O'Reilly "Learning the Bash Shell" and similar) do help with the syntax, there are lots of strange ways of solving various problems, and some of them are not in the book and must be googled.

Why does $@ work differently from most other variables in bash?

The $@ variable seems to maintain quoting around its arguments so that, for example:
$ function foo { for i in "$@"; do echo $i; done }
$ foo herp "hello world" derp
herp
hello world
derp
I am also aware that bash arrays work the same way:
$ a=(herp "hello world" derp)
$ for i in "${a[@]}"; do echo $i; done
herp
hello world
derp
What is actually going on with variables like this? Particularly when I add something to the quote like "duck ${a[@]} goose". If it's not space separated, what is it?
Usually, double quotation marks in Bash mean "make everything between the quotation marks one word, even if it has separators in it." But as you've noticed, $@ behaves differently when it's within double quotes. This is actually a parsing hack that dates back to Bash's predecessor, the Bourne shell, and this special behavior applies only to this particular variable.
Without this hack (I use the term because it seems inconsistent from a language perspective, although it's very useful), it would be difficult for a shell script to pass along its array of arguments to some other command that wants the same arguments. Some of those arguments might have spaces in them, but how would it pass them to another command without the shell either lumping them together as one big word or reparsing the list and splitting the arguments that have whitespace?
Well, you could pass an array of arguments, and the Bourne shell really only has one array, represented by $* or $@, whose number of elements is $# and whose elements are $1, $2, etc, the so-called positional parameters.
An example. Suppose you have three files in the current directory, named aaa, bbb, and cc c (the third file has a space in the name). You can initialize the array (that is, you can set the positional parameters) to be the names of the files in the current directory like this:
set -- *
Now the array of positional parameters holds the names of the files. $#, the number of elements, is three:
$ echo $#
3
And we can iterate over the positional parameters in a few different ways.
1) We can use $*:
$ for file in $*; do
> echo "$file"
> done
but that re-separates the arguments on whitespace and calls echo four times:
aaa
bbb
cc
c
2) Or we could put quotation marks around $*:
$ for file in "$*"; do
> echo "$file"
> done
but that groups the whole array into one argument and calls echo just once:
aaa bbb cc c
3) Or we could use $@ which represents the same array but behaves differently in double quotes:
$ for file in "$@"; do
> echo "$file"
> done
will produce
aaa
bbb
cc c
because $1 = "aaa", $2 = "bbb", and $3 = "cc c" and "$@" leaves the elements intact. If you leave off the quotation marks around $@, the shell will flatten and re-parse the array, echo will be called four times, and you'll get the same thing you got with a bare $*.
This is especially useful in a shell script, where the positional parameters are the arguments that were passed to your script. To pass those same arguments to some other command -- without the shell resplitting them on whitespace -- use "$@".
# Truncate the files specified by the args
rm "$@"
touch "$@"
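The three forwarding styles can be compared by counting how many arguments arrive on the other side. This is a sketch; count_args and the three forward_* functions are hypothetical helpers written for this comparison.

```shell
#!/usr/bin/env bash
count_args() { echo "$#"; }

forward_quoted_at()   { count_args "$@"; }   # one word per original argument
forward_quoted_star() { count_args "$*"; }   # everything joined into one word
forward_bare()        { count_args $@; }     # re-split on whitespace

a=$(forward_quoted_at   aaa bbb "cc c")
b=$(forward_quoted_star aaa bbb "cc c")
c=$(forward_bare        aaa bbb "cc c")

echo "$a $b $c"
```

This should print 3 1 4, matching the three loops above.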
In Bourne, this behavior only applies to the positional parameters because it's really the only array supported by the language. But you can create other arrays in Bash, and you can even apply the old parsing hack to those arrays using the special "${ARRAYNAME[@]}" syntax, whose at-sign feels almost like a wink to Mr. Bourne:
$ declare -a myarray
$ myarray[0]=alpha
$ myarray[1]=bravo
$ myarray[2]="char lie"
$ for file in "${myarray[@]}"; do echo "$file"; done
alpha
bravo
char lie
Oh, and about your last example, what should the shell do with "pre $@ post" where you have $@ within double quotes but you have other stuff in there, too? Recent versions of Bash preserve the array, prepend the text before the $@ to the first array element, and append the text after the $@ to the last element:
pre aaa
bbb
cc c post
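To see where the word boundaries end up, you can capture the result in an array and print each element in brackets. A sketch, assuming the same three arguments as above; demo and words are names invented here.

```shell
#!/usr/bin/env bash
demo() {
  # The surrounding text sticks to the first and last arguments only.
  words=( "pre $@ post" )
}
demo aaa bbb "cc c"

echo "${#words[@]}"
printf '<%s>\n' "${words[@]}"
```

This should print 3, then <pre aaa>, <bbb> and <cc c post>.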
