How does : <<'END' work in bash to create a multi-line comment block? - bash

I found a great answer for how to comment in bash script (by #sunny256):
#!/bin/bash
echo before comment
: <<'END'
bla bla
blurfl
END
echo after comment
The ' and ' around the END delimiter are important, otherwise things inside the block like for example $(command) will be parsed and executed.
This may be ugly, but it works and I'm keen to know what it means. Can anybody explain it simply? I did already find an explanation for : that it is no-op or true. But it does not make sense to me to call no-op or true anyway....

I'm afraid this explanation is less "simple" and more "thorough", but here we go.
The goal of a comment is to be text that is not interpreted or executed as code.
Originally, the UNIX shell did not have a comment syntax per se. It did, however, have the null command : (once an actual binary program on disk, /bin/:), which ignores its arguments and does nothing but indicate successful execution to the calling shell. Effectively, it's a synonym for true that looks like punctuation instead of a word, so you could put a line like this in your script:
: This is a comment
It's not quite a traditional comment; it's still an actual command that the shell executes. But since the command doesn't do anything, surely it's close enough: mission accomplished! Right?
The problem is that the line is still treated as a command beyond simply being run as one. Most importantly, lexical analysis - parameter substitution, word splitting, and such - still takes place on those destined-to-be-ignored arguments. Such processing means you run the risk of a syntax error in a "comment" crashing your whole script:
: Now let's see what happens next
echo "Hello, world!"
#=> hello.sh: line 1: unexpected EOF while looking for matching `''
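A syntax error is not the only risk: expansions in the arguments of : really are carried out before the command throws the result away. A minimal sketch of that, using variable names (count, greeting) that I made up for the illustration:

```bash
#!/bin/bash
# Hypothetical demo: ':' ignores its arguments, but the shell expands them first.
count=0
: "$((count = count + 1))"   # the arithmetic expansion still runs and changes count
: "${greeting:=hello}"       # a common idiom that relies on exactly this side effect
echo "count=$count greeting=$greeting"
# prints: count=1 greeting=hello
```

So text placed after : is only harmless as long as it contains nothing the shell wants to expand.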
That problem led to the introduction of a genuine comment syntax: the now-familiar # (which was first introduced in the C shell created at Berkeley). Everything from # to the end of the line is completely ignored by the shell, so you can put anything you like there without worrying about syntactic validity:
# Now let's see what happens next
echo "Hello, world!"
#=> Hello, world!
And that's How The Shell Got Its Comment Syntax.
However, you were looking for a multi-line (block) comment, of the sort introduced by /* (and terminated by */) in C or Java. Unfortunately, the shell simply does not have such a syntax. The normal way to comment out a block of consecutive lines - and the one I recommend - is simply to put a # in front of each one. But that is admittedly not a particularly "multi-line" approach.
Since the shell supports multi-line string-literals, you could just use : with such a string as an argument:
: 'So
this is all
a "comment"
'
But that has all the same problems as single-line :. You could also use backslashes at the end of each line to build a long command line with multiple arguments instead of one long string, but that's even more annoying than putting a # at the front, and more fragile since trailing whitespace breaks the line-continuation.
The solution you found uses what is called a here-document. The syntax some-command <<whatever causes the following lines of text - from the line immediately after the command, up to but not including the next line containing only the text whatever - to be read and fed as standard input to some-command. Here's an alternate shell implementation of "Hello, world" which takes advantage of this feature:
cat <<EOF
Hello, world
EOF
If you replace cat with our old friend :, you'll find that it ignores not only its arguments but also its input: you can feed whatever you want to it, and it will still do nothing (and still indicate that it did that nothing successfully).
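A quick way to convince yourself of this is to check the exit status right after feeding : a here-document. This is only a sketch; the variable name status is my own:

```bash
#!/bin/bash
# ':' never reads its standard input and always reports success.
: <<EOF
This text is handed to ':' on standard input and silently discarded.
EOF
status=$?   # capture the exit status of ':' immediately
echo "exit status of ':' was $status"
# prints: exit status of ':' was 0
```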
However, the contents of a here-document do undergo string processing. So just as with the single-line : comment, the here-document version runs the risk of syntax errors inside what is not meant to be executable code:
#!/bin/sh -e
: <<EOF
(This is a backtick: `)
EOF
echo 'In modern shells, $(...) is preferred over backticks.'
#=> ./demo.sh: line 2: bad substitution: no closing "`" in `
The solution, as seen in the code you found, is to quote the end-of-document "sentinel" (the EOF or END or whatever) on the line introducing the here document (e.g. <<'EOF'). Doing this causes the entire body of the here-document to be treated as literal text - no parameter expansion or other processing occurs. Instead, the text is fed to the command unchanged, just as if it were being read from a file. So, other than a line consisting of nothing but the sentinel, the here-document can contain any characters at all:
#!/bin/sh -e
: <<'EOF'
(This is a backtick: `)
EOF
echo 'In modern shells, $(...) is preferred over backticks.'
#=> In modern shells, $(...) is preferred over backticks.
(It is worth noting that the way you quote the sentinel doesn't matter - you can use <<'EOF', <<E"OF", or even <<EO\F; all have the same result. This is different from the way here-documents work in some other languages, such as Perl and Ruby, where the content is treated differently depending on the way the sentinel is quoted.)
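To see the difference side by side, here is a small sketch that captures the same body through cat with an unquoted sentinel and with each of the three quoting styles mentioned above. The variable names are made up for the demo:

```bash
#!/bin/bash
name=world

# Unquoted sentinel: the body undergoes parameter expansion.
expanded=$(cat <<EOF
Hello, $name
EOF
)

# Any quoting of the sentinel makes the body literal.
literal_a=$(cat <<'EOF'
Hello, $name
EOF
)
literal_b=$(cat <<E"OF"
Hello, $name
EOF
)
literal_c=$(cat <<EO\F
Hello, $name
EOF
)

printf '%s\n' "$expanded" "$literal_a" "$literal_b" "$literal_c"
# prints "Hello, world" once, then "Hello, $name" three times
```

Note that the closing line is always the bare sentinel (EOF here), whichever way it was quoted on the opening line.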
Notwithstanding any of the above, I strongly recommend that you instead just put a # at the front of each line you want to comment out. Any decent code editor will make that operation easy - even plain old vi - and the benefit is that nobody reading your code will have to spend energy figuring out what's going on with something that is, after all, intended to be documentation for their benefit.

It is called a Here Document. It is a block of text that the shell feeds to a command or program as its standard input.
The string following the << is the marker determining the end of the block. If you send that text to the no-op command, nothing happens, which is why you can use it as a comment block.

That's heredoc syntax. It's a way of defining multi-line string literals.
As the answer at your link explains, the single quotes around the END disable interpolation, similar to the way single-quoted strings disable interpolation in regular bash strings.

Related

Replace a variable every where in a shell script

I need to write a shell script that reads a .sh script, finds a particular variable (for example, Variable_Name="variable1") and takes out its value (variable1).
In another shell script, wherever Variable_Name is used, I need to replace it with its value (variable1).
A simple approach, to build on, might be:
assignment=$(echo 'Variable_Name="variable1"' | sed -r 's/Variable_Name=(.*)/\1/')
echo $assignment
"variable1"
Depending on the variable type, the value might be quoted or not, with single apostrophes or double quotes. That might be necessary (a string with or without blanks) or superfluous. Behind the assignment there might be further code:
pi=3.14;v=42;
or a comment:
user=janis # Janis Joplin
it might be complicated:
expr="foobar; O'Reilly " # trailing blank important
But only you can know how complicated it might get. Maybe the simple case is already sufficient. If the new script looks similar, it might work, or not:
targetV=INSERT_HERE; secondV=23
# oops: secondV accidentally hidden:
targetV="foobar; O'Reilly " # trailing blank important; secondV=23
If the second script is under your control, you can prevent such problems easily. If source and target language are identical, what worked here should work there too.
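For the simple case only, the two steps (extract the value, then substitute it where the variable is used) could be sketched like this. The sample lines in source_line and target_line are invented for the illustration, and the sketch assumes the value contains no /, & or newline, since those would need escaping before being handed to sed:

```bash
#!/bin/bash
# Hypothetical input: one assignment line taken from the first script.
source_line='Variable_Name="variable1"'

# Step 1: pull out the value, dropping optional surrounding double quotes.
value=$(printf '%s\n' "$source_line" | sed -E 's/^Variable_Name="?([^"]*)"?$/\1/')

# Hypothetical input: one line from the second script that uses the variable.
target_line='echo "Using $Variable_Name here"'

# Step 2: replace every literal $Variable_Name with the extracted value.
replaced=$(printf '%s\n' "$target_line" | sed "s/\\\$Variable_Name/$value/g")

printf '%s\n' "$value" "$replaced"
# prints:
# variable1
# echo "Using variable1 here"
```

Forms such as ${Variable_Name} or values with trailing comments are not handled here; they would need extra patterns along the lines discussed above.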

Way to create multiline comments in Bash?

I have recently started studying shell scripting and I'd like to be able to comment out a set of lines in a shell script, the way it can be done in C/Java:
/* comment1
comment2
comment3
*/
How could I do that?
Use : ' to open and ' to close.
For example:
: '
This is a
very neat comment
in bash
'
Multiline comment in bash
: <<'END_COMMENT'
This is a heredoc (<<) redirected to a NOP command (:).
The single quotes around END_COMMENT are important,
because it disables variable resolving and command resolving
within these lines. Without the single-quotes around END_COMMENT,
the following two $() `` commands would get executed:
$(gibberish command)
`rm -fr mydir`
comment1
comment2
comment3
END_COMMENT
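If you want to verify the claim about the single quotes without running anything dangerous, a harmless side effect makes it visible. This is only a sketch; the temporary directory and the file names are made up for the test:

```bash
#!/bin/bash
# Hypothetical check: does a command inside the "comment" actually run?
workdir=$(mktemp -d)

# Unquoted marker: the command substitution in the body is executed.
: <<END_COMMENT
$(touch "$workdir/unquoted-ran")
END_COMMENT

# Quoted marker: the body is literal text, nothing is executed.
: <<'END_COMMENT'
$(touch "$workdir/quoted-ran")
END_COMMENT

if [ -e "$workdir/unquoted-ran" ]; then unquoted_ran=yes; else unquoted_ran=no; fi
if [ -e "$workdir/quoted-ran" ]; then quoted_ran=yes; else quoted_ran=no; fi
echo "unquoted: $unquoted_ran, quoted: $quoted_ran"
# prints: unquoted: yes, quoted: no

rm -rf "$workdir"   # clean up the scratch directory
```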
Note: I updated this answer based on comments and other answers, so comments prior to May 22nd 2020 may no longer apply. Also I noticed today that some IDEs like VS Code and PyCharm do not recognize a HEREDOC marker that contains spaces, whereas bash has no problem with it, so I'm updating this answer again.
Bash does not provide a builtin syntax for multi-line comment but there are hacks using existing bash syntax that "happen to work now".
Personally I think the simplest (ie least noisy, least weird, easiest to type, most explicit) is to use a quoted HEREDOC, but make it obvious what you are doing, and use the same HEREDOC marker everywhere:
<<'###BLOCK-COMMENT'
line 1
line 2
line 3
line 4
###BLOCK-COMMENT
Single-quoting the HEREDOC marker avoids some shell parsing side-effects, such as weird substitutions that would cause a crash or unwanted output, and even parsing of the marker itself. So the single-quotes give you more freedom on the open-close comment marker.
For example the following uses a triple hash which kind of suggests multi-line comment in bash. This would crash the script if the single quotes were absent. Even if you remove ###, the FOO{} would crash the script (or cause bad substitution to be printed if no set -e) if it weren't for the single quotes:
set -e
<<'###BLOCK-COMMENT'
something something ${FOO{}} something
more comment
###BLOCK-COMMENT
ls
You could of course just use
set -e
<<'###'
something something ${FOO{}} something
more comment
###
ls
but the intent of this is definitely less clear to a reader unfamiliar with this trickery.
Note my original answer used '### BLOCK COMMENT', which is fine if you use vanilla vi/vim but today I noticed that PyCharm and VS Code don't recognize the closing marker if it has spaces.
Nowadays any good editor allows you to press ctrl-/ or similar, to un/comment the selection. Everyone definitely understands this:
# something something ${FOO{}} something
# more comment
# yet another line of comment
although admittedly, this is not nearly as convenient as the block comment above if you want to re-fill your paragraphs.
There are surely other techniques, but there doesn't seem to be a "conventional" way to do it. It would be nice if ###> and ###< could be added to bash to indicate start and end of comment block, seems like it could be pretty straightforward.
After reading the other answers here I came up with the below, which IMHO makes it really clear it's a comment. Especially suitable for in-script usage info:
<< ////
Usage:
This script launches a spaceship to the moon. It's doing so by
leveraging the power of the Fifth Element, AKA Leeloo.
Will only work if you're Bruce Willis or a relative of Milla Jovovich.
////
As a programmer, the sequence of slashes immediately registers in my brain as a comment (even though slashes are normally used for line comments).
Of course, "////" is just a string; the number of slashes in the prefix and the suffix must be equal.
I tried the chosen answer, but found that when I ran a shell script containing it, the whole thing was getting printed to the screen (similar to how jupyter notebooks print out everything in '''xx''' quotes) and there was an error message at the end. It wasn't doing anything, but: scary. Then I realised while editing it that single-quotes can span multiple lines. So let's just assign the block to a variable.
x='
echo "these lines will all become comments."
echo "just make sure you don_t use single-quotes!"
ls -l
date
'
what's your opinion on this one?
function giveitauniquename()
{
so this is a comment
echo "there's no need to further escape apostrophes/etc if you are commenting your code this way"
the drawback is it will be stored in memory as a function as long as your script runs unless you explicitly unset it
only valid-ish bash allowed inside for instance these would not work without the "pound" signs:
1, for #((
2, this #wouldn't work either
function giveitadifferentuniquename()
{
echo nestable
}
}
Here's how I do multiline comments in bash.
This mechanism has two advantages that I appreciate. One is that comments can be nested. The other is that blocks can be enabled by simply commenting out the initiating line.
#!/bin/bash
# : <<'####.block.A'
echo "foo {" 1>&2
fn data1
echo "foo }" 1>&2
: <<'####.block.B'
fn data2 || exit
exit 1
####.block.B
echo "can't happen" 1>&2
####.block.A
In the example above the "B" block is commented out, but the parts of the "A" block that are not the "B" block are not commented out.
Running that example will produce this output:
foo {
./example: line 5: fn: command not found
foo }
can't happen
A simple solution, not very smart:
Temporarily block a part of a script:
if false; then
while you respect syntax a bit, please
do write here (almost) whatever you want.
but when you are
done # write
fi
A slightly more sophisticated version:
time_of_debug=false # Let's set this variable at the beginning of a script
if $time_of_debug; then # in a middle of the script
echo I keep this code aside until there is the time of debug!
fi
In plain bash, to comment out a block of code I do:
:||{
block
of code
}

Echoing an environment variable, keeping newlines intact? [duplicate]

I want to create some scripts for filling some templates and inserting them into my project folder. I want to use a shell script for this, and the templates are very small so I want to embed them in the shell script. The problem is that echo seems to ignore the line breaks in my string. Either that, or the string doesn't contain line breaks to begin with. Here is an example:
MY_STRING="
Hello, world! This
Is
A
Multi lined
String."
echo -e $MY_STRING
This outputs:
Hello, world! This Is A Multi lined String.
I'm assuming echo is the culprit here. How can I get it to acknowledge the line breaks?
You need double quotes around the variable interpolation.
echo -e "$MY_STRING"
This is an all-too common error. You should get into the habit of always quoting strings, unless you specifically need to split into whitespace-separated tokens or have wildcards expanded.
So to be explicit, the shell performs word splitting on an unquoted expansion: it breaks the value apart at whitespace (newlines included) and passes each word as a separate argument. You can see this if you write a simple C program which prints out its argv array (argv[0] being the program name itself).
argv[1]='Hello,'
argv[2]='world!'
argv[3]='This'
argv[4]='Is'
argv[5]='A'
argv[6]='Multi'
argv[7]='lined'
argv[8]='String.'
By contrast, with quoting, the whole string is in argv[1], newlines and all.
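You don't actually need a C program to observe this; a tiny shell function that reports how many arguments it received shows the same thing. A sketch, where count_args is a helper name I made up:

```bash
#!/bin/bash
# Hypothetical helper: print the number of arguments received.
count_args() { echo "$#"; }

MY_STRING="
Hello, world! This
Is
A
Multi lined
String."

unquoted_count=$(count_args $MY_STRING)    # word splitting: one argument per word
quoted_count=$(count_args "$MY_STRING")    # quoted: a single argument, newlines kept

echo "unquoted: $unquoted_count arguments, quoted: $quoted_count argument"
# prints: unquoted: 8 arguments, quoted: 1 argument
```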
For what it's worth, also consider here documents (with cat, not echo):
cat <<"HERE"
foo
Bar
HERE
You can also interpolate a variable in a here document.
cat <<HERE
$MY_STRING
HERE
... although in this particular case, it's hardly what you want.
echo is so nineties. The new (POSIX) kid on the block is printf.
printf '%s\n' "$MY_STRING"
No -e or SYSV vs BSD echo madness and full control over what gets printed where and how wide, escape sequences like in C. Everybody please start using printf now and never look back.
Try this :
echo "$MY_STRING"

How Does Bash Tokenize Scripts?

Coming from C++, it always seems like magic to me that some whitespace has an effect on the validity or semantics of the script. Here's an example:
echo a 2 > &1
bash: syntax error near unexpected token `&'
echo a 2 >&1
a 2
echo a 2>&1
a
echo a 2>& 1
a
Looking at this didn't help much. My main problem is that it does not feel consistent; and I am in a state of confusion.
I'm trying to find out how bash tokenizes its scripts. A general description thereof to clear up any confusion would be appreciated.
Edit:
I am NOT looking for redirections specifically. They just came up as example. Other examples:
A="something"
A = "something"
if [$x = $y];
if [ $x = $y ];
Why isn't there a space necessary between ] and ;? Why does assignment require an immediate equal sign? ...
2>&1 is a single operator token, so any whitespace that breaks it up will change the meaning of the command. It just happens to be a parameterized token, which means the shell will further tokenize it to determine what exactly the operator does. The general form is n>&m, where n is the file descriptor you are redirecting, and m is the descriptor whose current target is duplicated onto n. In this case, you are saying that the standard error (2) of the command should go to whatever standard output (1) is currently open on.
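The "currently open on" part is easiest to see by capturing output. In this sketch, emit is a made-up function that writes one line to each stream; the three captures differ only in their redirections:

```bash
#!/bin/bash
# Hypothetical helper: one line to stdout, one line to stderr.
emit() { echo "to stdout"; echo "to stderr" >&2; }

only_stdout=$(emit 2>/dev/null)        # stderr discarded
both=$(emit 2>&1)                      # stderr copied to where stdout points (the capture)
only_stderr=$(emit 2>&1 >/dev/null)    # stderr gets the capture, THEN stdout is sent away

printf '[%s]\n' "$only_stdout" "$both" "$only_stderr"
```

The last line shows that redirections are processed left to right: 2>&1 copies the target stdout has at that moment, so moving stdout afterwards does not drag stderr along.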
The examples you gave have the behavior they do for good reason.
Redirection sources default to FD 1. Thus, >&1 is legitimate syntax on its own -- it redirects FD 1 to FD 1 -- meaning allowing whitespace before the > would result in an ambiguous syntax: The parser couldn't tell if the preceding token was its own word or a redirection source.
Nothing other than a FD number is valid under >&, unless you're in a very new bash which allows a variable to be dereferenced to retrieve a FD number. In any event, anything immediately following >& is known to be a file descriptor, so allowing optional whitespace creates no ambiguity there.
a = 1 is parsed as a legitimate command, not a syntax error: It runs the command a with the first argument = and the second argument 1. Disallowing whitespace within assignments eliminates this ambiguity. Similarly, a= foo has a separate and distinct meaning: It exports an environment variable a with an empty value while running the command foo. Relaxing the whitespace rules would disallow both of these legitimate commands.
[ is a command, not special syntax known to the parser; thus, [foo tries to find a command (named, say, /usr/bin/[foo), requiring whitespace.
; takes precedence in the parser as a statement separator, rather than being treated as part of a word, unless quoted or escaped. The same is true of & (another separator), or a newline.
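A few of the points above can be checked directly. This is only a sketch; the function a and the variable names are invented for the demonstration:

```bash
#!/bin/bash
# 'a = 1' is a command invocation, not an assignment.
a() { echo "a was called with $# arguments: $*"; }
spaced=$(a = 1)

# 'greeting= cmd' runs cmd with greeting set to the empty string in its environment.
greeting=hello
inside=$(greeting= sh -c 'echo "[$greeting]"')

# '[' is a command, so it needs spaces around its arguments; ';' ends the command.
x=1
y=1
if [ $x = $y ]; then result=equal; else result=different; fi

echo "$spaced"
echo "inside=$inside result=$result"
# prints:
# a was called with 2 arguments: = 1
# inside=[] result=equal
```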
The thing is, there's no single general rule which will explain all this; you need to read and learn the language syntax. Fortunately, there's not very much syntax: Almost all commands are "simple commands", which follow very simple and clear rules. You're asking about, and we're explaining, some of the exceptions to that; there are other exceptions, such as [[ ]] in bash, but they're small enough in total that they can be learned.
Other suggested resources:
http://aosabook.org/en/bash.html (The Architecture of Open Source Applications; chapter on bash)
http://mywiki.wooledge.org/BashParser (Wooledge wiki high-level description of the parser -- though this focuses more on expansion rules than tokenization)
http://mywiki.wooledge.org/BashGuide (an introductory guide to bash syntax in general, written with more of a focus on accuracy and best practices than some competing materials).

Bash command line parsing containing whitespace

I have to parse command-line arguments in a shell script as follows:
cmd --a=hello world good bye --b=this is bash script
I need to parse the arguments of "a", i.e. "hello world ...", which are separated by whitespace, into an array.
That is, the a_input array should contain "hello", "world", "good" and "bye".
Similarly for "b" arguments as well.
I tried it as follows:
--a=*)
a_input={1:4}
a_input=$#
for var in $a_input
#keep parsing until next --b or other argument is seen
done
But the above method is crude. Is there any other workaround? I cannot use getopts.
The simplest solution is to get your users to quote the arguments correctly in the first place.
Barring that you can manually loop until you get to the end of the arguments or hit the next --argument (but that means you can't include a word that starts with -- in your argument value... unless you also do valid-option testing on those in which you limit slightly fewer -- words).
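A rough sketch of that manual loop, assuming bash arrays are available and that only --a= and --b= are valid options. The function name parse_args and the array b_input are my own; a_input comes from the question:

```bash
#!/bin/bash
# Hypothetical parser: collect the words following --a=... and --b=... into arrays.
parse_args() {
  a_input=()
  b_input=()
  local current="" arg
  for arg in "$@"; do
    case $arg in
      --a=*) current=a; a_input+=("${arg#--a=}") ;;   # first word is glued to the option
      --b=*) current=b; b_input+=("${arg#--b=}") ;;
      --*)   echo "unknown option: $arg" >&2; return 1 ;;
      *)     # a bare word belongs to whichever option was seen last
        case $current in
          a) a_input+=("$arg") ;;
          b) b_input+=("$arg") ;;
          *) echo "stray argument: $arg" >&2; return 1 ;;
        esac ;;
    esac
  done
}

parse_args --a=hello world good bye --b=this is bash script
echo "a has ${#a_input[@]} words: ${a_input[*]}"
echo "b has ${#b_input[@]} words: ${b_input[*]}"
# prints:
# a has 4 words: hello world good bye
# b has 4 words: this is bash script
```

As noted, a value word that itself starts with -- is rejected here, because the loop cannot tell it apart from an option.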
Adding to Etan Reisner's answer, which is absolutely correct:
I personally find bash a bit cumbersome, when array/string processing gets more complex, and if you really have the strange requirement, that the caller should not be required to use quotes, I would here write an intermediate script in, say, Ruby or Perl, which just collects the parameters in a proper way, wraps quoting around them, and passes them on to the script, which originally was supposed to be called - even if this costs an additional process.
For example, a Ruby One-Liner such as
system("your_bash_script here.sh '" + ARGV.join(' ').split(/ (?=--)/).select { |s| s.size > 0 }.join("' '") + "'")
would do this sanitizing and then invoke your script.

Resources