Shell script multi-line comment - shell

I have a large shell script. While modifying it, I sometimes want to comment out part of it, but commenting out lines as shown in the example below gives me an error.
Script:
#!/bin/bash
<<COMMENT1
read build_label
read build_branch_tag
build_date_tag=$(echo $build_label | sed "s/$build_branch_tag//g")
echo $build_path
COMMENT1
echo "HELLO WORLD"
Error Message:
sed: first RE may not be empty
I just want to understand what's wrong with the above script and why the comment section is not working properly.

First, using here docs to comment code is really dirty! Use the # instead. If you want to comment multiple lines, use your editor. In vim (commenting lines from 10 to 15 for example):
:10,15s/^/#
However, to solve your current problem you need to enclose the starting here-doc delimiter in single quotes, like this:
<<'COMMENT'
...
COMMENT
Using single quotes you tell bash that it should not attempt to expand variables or expressions inside the here-doc body.
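The difference between the two delimiter forms can be observed directly. The following is a minimal sketch, not part of the original answer: the marker file path and the names COMMENT1/COMMENT2 are made up for the demonstration, and the here-docs are attached to the no-op command `:` so that nothing reads them.

```shell
#!/bin/bash
# Hypothetical marker file, used only to observe whether the body was evaluated.
marker="${TMPDIR:-/tmp}/heredoc_comment_demo.$$"
rm -f "$marker"

# Unquoted delimiter: the body is expanded, so the command substitution runs.
: <<COMMENT1
$(touch "$marker")
COMMENT1
if [ -e "$marker" ]; then unquoted_ran=yes; else unquoted_ran=no; fi
rm -f "$marker"

# Quoted delimiter: the body is taken literally and nothing is executed.
: <<'COMMENT2'
$(touch "$marker")
COMMENT2
if [ -e "$marker" ]; then quoted_ran=yes; else quoted_ran=no; fi

echo "unquoted: $unquoted_ran, quoted: $quoted_ran"
```

With the unquoted delimiter the `touch` runs even though the text is meant as a comment; with the quoted delimiter it does not.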

Traditional UNIX shell doesn't have multi-line comment support. What you're doing here is using a so-called "HERE document" without using its value, a common hack to get multi-line-comment-like behaviour.
However, the body of the HERE document is still subject to expansion, which means that your $(…) is executed. Since build_branch_tag has not been defined at that point, it expands to an empty string, and the shell therefore runs sed "s///g", which fails because the search pattern is empty.
You can use a different hack:
: '
Bla bla, no $expansion is taking place here.
'
What this is doing: the : is a no-op command, it simply does nothing. And you're passing it an argument which is a string '…'. Inside the single quotes, no expansion/evaluation is taking place. Beware of ' inside the "commented out" region, though.

You can turn parameter substitution off inside a here document like this:
<<"Endofmessage"
or
<<'Endofmessage'
or
<<\Endofmessage
Here Documents
This type of redirection instructs the shell to read input from the
current source until a line containing only delimiter (with no
trailing blanks) is seen. All of the lines read up to that point are
then used as the standard input for a command. The format of
here-documents is:
<<[-]word
here-document
delimiter
No parameter expansion, command substitution, arithmetic expansion, or
pathname expansion is performed on word. If any characters in word are
quoted, the delimiter is the result of quote removal on word, and the
lines in the here-document are not expanded. If word is unquoted, all
lines of the here-document are subjected to parameter expansion, command
substitution, and arithmetic expansion. In the latter case, the
character sequence \<newline> is ignored, and \ must be used to quote
the characters \, $, and `.
If the redirection operator is <<-, then all leading tab characters are
stripped from input lines and the line containing delimiter. This
allows here-documents within shell scripts to be indented in a natural
fashion.
As an aside that you may also find useful: I prefer to do multi-line comments in my bash scripts with the Notepad++ shortcut Ctrl+Q (toggle comment).
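The three quoted delimiter forms listed in this answer behave the same way. A small sketch for comparison, where the variable `name` and the delimiter `END` are arbitrary choices for the example:

```shell
#!/bin/bash
name=world

# Any quoting of the delimiter word suppresses expansion of the body.
double_quoted=$(cat <<"END"
hello $name
END
)
single_quoted=$(cat <<'END'
hello $name
END
)
backslashed=$(cat <<\END
hello $name
END
)

# Unquoted delimiter: the body is expanded.
unquoted=$(cat <<END
hello $name
END
)

printf '%s\n' "$double_quoted" "$single_quoted" "$backslashed" "$unquoted"
```

The first three variables hold the literal text `hello $name`; only the last one holds `hello world`.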

If the code you want to disable contains no syntax error (such as an unclosed string), wrap it in an if false block:
#!/bin/bash
if false;then
read build_label
read build_branch_tag
build_date_tag=$(echo $build_label | sed "s/$build_branch_tag//g")
echo $build_path
fi
echo "HELLO WORLD"
If it does contain a syntax error, or you are hunting for a failure by deactivating parts of the code, comment out each line instead:
#!/bin/bash
#read build_label
#read build_branch_tag
#build_date_tag=$(echo $build_label | sed "s/$build_branch_tag//g")
#echo $build_path
echo "HELLO WORLD"
To do this you can use:
- an editor that supports find/replace with regular expressions, such as vi(m)
- sed (sed '14,45 s/^/#/' YourFile > YourFile.Debug, where 14 and 45 are the first and last lines to comment)
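The sed variant can be tried on a throwaway file first. This is only a sketch: the four-line sample script and the range 2,3 are invented for the demonstration.

```shell
#!/bin/bash
# Hypothetical sample script written to a temporary file.
tmp=$(mktemp)
printf '%s\n' 'echo one' 'echo two' 'echo three' 'echo four' > "$tmp"

# Prefix lines 2 through 3 with '#'; the original file is left untouched.
commented=$(sed '2,3 s/^/#/' "$tmp")
printf '%s\n' "$commented"

# Running the commented copy executes only lines 1 and 4.
result=$(printf '%s\n' "$commented" | bash)
printf '%s\n' "$result"
rm -f "$tmp"
```

Writing the result to a separate file, as the answer does with YourFile.Debug, keeps the original script intact while debugging.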

Using here docs to comment code is safe and elegant like this:
: <<'EOT'
Example usage of the null command ':' and the here-document syntax for a
multi-line comment. If the delimiter word ('EOT' here) is quoted, the
here-document will not be expanded in any way. This is important, as
an unquoted delimiter will result in problems with unintended potential
expansions. All of this here-doc text is redirected to the standard input
of :, which does nothing but return true.
EOT

Related

How to use the value of a variable in a heredoc function in BASH to run a julia command

I am new to BASH and to Julia, and I am trying to do the following steps:
Save a filepath in a variable
Use this variable in a julia command, which I managed to execute through bash by using a here-document
Hard-coded command that works:
cat << "EOF" | julia --project=.
module test
ARGS=["/path/To/My/Directory"]
include("nameOfMyProject.jl")
end
EOF
The problem is, that I want to exchange the hardcoded path with the value of a variable.
I tried:
path="/path/To/My/Directory"
cat << "EOF" | julia --project=.
module test
ARGS=[$path]
include("nameOfMyProject.jl")
end
EOF
However, that does not work. I read that I should try <<EOF instead of << "EOF", but then I receive this error: syntax: "/" is not a unary operator
I am not sure how to interpret this. It seems to access the variable and find the "/", but the hard-coded version, which also contains "/", works. So why not here?
I am not sure if my problem lies in the heredoc function or in something with this variable, but I would be grateful for any help!
Solution for this problem:
I am adding this for reference, if someone else has the same problem. The format that does the trick nicely is:
path="/path/To/My/Directory"
julia --project=. <<EOF
module test
ARGS=["$path"]
include("nameOfMyProject.jl")
end
EOF
Thanks to Jetchisel and Jens
From the bash manual (emphasis mine):
Here Documents
This type of redirection instructs the shell to read input from the
current source until a line containing only delimiter (with no trailing
blanks) is seen. All of the lines read up to that point are then used
as the standard input (or file descriptor n if n is specified) for a
command.
The format of here-documents is:
[n]<<[-]word
here-document
delimiter
No parameter and variable expansion, command substitution, arithmetic
expansion, or pathname expansion is performed on word. If any part of
word is quoted, the delimiter is the result of quote removal on word,
and the lines in the here-document are not expanded. If word is
unquoted, all lines of the here-document are subjected to parameter
expansion, command substitution, and arithmetic expansion, the
character sequence \<newline> is ignored, and \ must be used to quote
the characters \, $, and `.
So all you need is
julia --project=. << EOF
...$FOO...
EOF
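To see what Julia actually receives in each case, the here-documents can be sent to cat instead. A minimal sketch, assuming the same path as in the question; the variable names holding the results are made up:

```shell
#!/bin/bash
path="/path/To/My/Directory"

# Quoted delimiter: $path is passed through literally.
literal=$(cat <<"EOF"
ARGS=["$path"]
EOF
)

# Unquoted delimiter, with double quotes in the body: the quotes are
# ordinary text, so the program sees a quoted string literal.
expanded_quoted=$(cat <<EOF
ARGS=["$path"]
EOF
)

# Unquoted delimiter, no quotes in the body: the program sees a bare path.
expanded_bare=$(cat <<EOF
ARGS=[$path]
EOF
)

printf '%s\n' "$literal" "$expanded_quoted" "$expanded_bare"
```

The last form produces `ARGS=[/path/To/My/Directory]`, which is not a string literal in Julia and is the likely source of the "/" error in the question.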

bash print words on multiple lines in a single line

I am writing a shell script that prints a header with 30 (and growing) column names. Right now, I have an echo statement that works and looks like this:
echo "Column_Name1, Column_Name2,Column_Name30"
While this works, it is hard to read. If I want to add a column, it is difficult to tell at a glance whether it is already in there, so I end up searching for it. Is it possible to do something like this with echo or printf and still get the CSV on one line?
echo " Column_Name1,
Column_Name2,
Column_Name30"
and get the output as
Column_Name1,Column_Name2,Column_Name30
You can add backslash as the line continuation:
echo "Column_Name1,"\
"Column_Name2,"\
"Column_Name30"
From the bash manual:
The backslash character ‘\’ may be used to remove any special meaning
for the next character read and for line continuation.
Decouple the definition of the header and printing it, and use an array to store the column names.
headers=(
Column_Name1
Column_Name2
Column_Name30
)
(IFS=","; printf '%s\n' "${headers[*]}")
The elements of the array are joined by the first character of IFS when ${headers[*]} is expanded. The subshell is used so you don't have to worry about restoring the previous value of IFS.
Convenience solution, using paste:
If you don't mind the (probably negligible) overhead of invoking an external utility (paste) to build your string, you can combine it with a (literal, in this case) here-doc:
paste -s -d, - <<'EOF'
Column_Name1
Column_Name2
Column_Name30
EOF
yields
Column_Name1,Column_Name2,Column_Name30
The above acts like a single-quoted string, due to the opening delimiter, 'EOF', being quoted.
Omit the enclosing '...' to treat the string like a double-quoted string, i.e., with expansions being performed (allowing the inclusion of variable references, command substitutions, and arithmetic expansions).
If you take care to use actual leading tabs (\t) in your here-doc (multiple spaces do not work), you can even introduce indentation, by prepending - to the opening delimiter:
# !! Only works with actual *tabs* as the leading whitespace.
paste -s -d, - <<-'EOF'
Column_Name1
Column_Name2
Column_Name30
EOF
More efficient solution, using line continuation:
POSIX-compatible shells support line continuation even inside double-quoted strings, "..." (but not inside single-quoted ones, '...').
That means that any \<newline> sequence inside a double-quoted string is removed:
echo "\
Column_Name1,\
Column_Name2,\
Column_Name30\
"
Given that a here-document with an unquoted opening delimiter is treated like a double-quoted string, you can do the following:
cat <<EOF
Column_Name1,\
Column_Name2,\
Column_Name30
EOF
Note:
Using <<-EOF with to-be-stripped leading tabs (\t) for readability is not an option here, because the line continuations will still include them.
To take advantage of line continuation, it is invariably the interpolating (expanding) here-doc variety that must be used; therefore, you may need to \-escape $ instances to ensure their literal use.
Both commands again yield the desired single-line string:
Column_Name1,Column_Name2,Column_Name30
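As a quick cross-check, the array, paste, and line-continuation approaches can be run side by side. This is a sketch only; the variable names are invented for the comparison.

```shell
#!/bin/bash
headers=(
  Column_Name1
  Column_Name2
  Column_Name30
)

# Join the array on the first character of IFS; the command substitution
# runs in a subshell, so the caller's IFS is not modified.
from_array=$(IFS=","; printf '%s' "${headers[*]}")

# Let paste join the lines of a literal here-doc.
from_paste=$(paste -s -d, - <<'EOF'
Column_Name1
Column_Name2
Column_Name30
EOF
)

# Backslash-newline pairs inside double quotes are removed by the shell.
from_continuation="\
Column_Name1,\
Column_Name2,\
Column_Name30"

printf '%s\n' "$from_array" "$from_paste" "$from_continuation"
```

All three variables end up holding the same single-line string.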
echo "foo bar" | (IFS=" "; xargs -n 1 echo)
yields
foo
bar

Input syntax for heredoc in Bash

I'm currently learning Unix bash, through the terminal in Ubuntu 16. I was writing a simple script, so I can practice writing code for Unix. Here is a sample of the code:
report_uptime(){
cat << _EOF_
<H2>System Uptime</H2>
<PRE>$(uptime)</PRE>
_EOF_
return
}
This code doesn't work, apparently because I'm supposed to use <<- instead of << after cat. Yet sometimes << works. So when am I supposed to use << and when am I supposed to use <<-?
This works:
report_uptime(){
cat <<- _EOF_
<H2>System Uptime</H2>
<PRE>$(uptime)</PRE>
_EOF_
return
}
This isn't cat syntax per se; it's one of the redirection operators supported by your shell:
https://www.gnu.org/software/bash/manual/bashref.html#Here-Documents
3.6.6 Here Documents
This type of redirection instructs the shell to read input from the current source until a line containing only word (with no trailing blanks) is seen. All of the lines read up to that point are then used as the standard input for a command.
The format of here-documents is:
<<[-]word
here-document
delimiter
No parameter and variable expansion, command substitution, arithmetic expansion, or filename expansion is performed on word. If any characters in word are quoted, the delimiter is the result of quote removal on word, and the lines in the here-document are not expanded. If word is unquoted, all lines of the here-document are subjected to parameter expansion, command substitution, and arithmetic expansion, the character sequence \newline is ignored, and ‘\’ must be used to quote the characters ‘\’, ‘$’, and ‘`’.
If the redirection operator is ‘<<-’, then all leading tab characters are stripped from input lines and the line containing delimiter. This allows here-documents within shell scripts to be indented in a natural fashion.
This is not strictly speaking about the syntax of the cat utility, but of the syntax of the shell itself, i.e. of bash.
The construct that you're using is called a "here-document". A here-document feeds its contents into the standard input of whatever command you put before <<.
The syntax is this:
command <<word
...
contents
...
END_TAG
The word here is either exactly the same as the END_TAG, or it may be 'END_TAG', -END_TAG, or -'END_TAG'.
END_TAG: Without single quotes, the contents of the here-document will undergo substitution. This means that any variable, or simply put, "anything that contains a $" will be replaced with its value.
$ tr 'a-z' 'A-Z' <<TR_END
> This is my $HOME
> TR_END
THIS IS MY /USERS/KK
(The >, a greater-than sign and a space, is what's called the secondary prompt. I get that because I'm typing this directly into the shell, and it needs more lines of input before it can execute the whole command. It is not typed by me.)
'END_TAG': With single quotes, the contents of the here-document will not undergo substitution. This means, for example, that if you write $HOME in the here-document, it will be fed into the command just like that, not like /home/myname (or whatever your home directory may be).
$ tr 'a-z' 'A-Z' <<'TR_END'
> This is my $HOME
> TR_END
THIS IS MY $HOME
With a leading dash (-), the shell will strip off all tabs (but not spaces) at the start of each line of the here-document, including the line with the END_TAG at the end.
$ tr 'a-z' 'A-Z' <<-TR_END
> This line has a tab.
> This one does not.
> TR_END
THIS LINE HAS A TAB.
THIS ONE DOES NOT.
Without a leading dash, the shell will not strip off tabs. The END_TAG needs to be the first (and only) thing on the line ending the here-document.
$ tr 'a-z' 'A-Z' <<TR_END
> This line has a tab.
> This one does not.
> TR_END
THIS LINE HAS A TAB.
THIS ONE DOES NOT.
The bash shell also has something called "here-strings". It works in a similar fashion, but you're only feeding a single line into the command:
command <<<word
For example:
$ tr 'a-z' 'A-Z' <<<"hello world!"
HELLO WORLD!
Generally, the end marker should not be indented. Move it to column 1. Also remove the return statement, and it works for me in bash.
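Because tabs are easily lost when copying examples, the following sketch generates the snippets with printf, so that every \t is unambiguous, and pipes them to a child bash. The text inside the here-documents and the variable names are made up for the demonstration.

```shell
#!/bin/bash
# With <<-, leading tabs are stripped from the body and from the delimiter
# line; leading spaces are kept.
with_dash=$(printf 'cat <<-EOF\n\ttab-indented\n    space-indented\n\tEOF\n' | bash)

# With <<, nothing is stripped, and the delimiter must start in column 1.
without_dash=$(printf 'cat <<EOF\n\ttab-indented\nEOF\n' | bash)

printf '[%s]\n' "$with_dash" "$without_dash"
```

This matches the question's situation: an indented closing marker is only recognized when the indentation consists of tabs and the operator is <<-.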

bash variable isn't working with ex in script

I have a little problem with my bash script.
#!/bin/bash
ex xxx.html << "HERE"
1,$s/\(foo\)/$1\1/
wq
HERE
This is just a small piece of my script. When I run it, this is the output:
$1foo
Any way to fix this so the $1 will be the argument given to the script?
Thanks!
Try replacing "HERE" with HERE (unquoted). Also 1,$s becomes 1,\$s.
Here Documents
This type of redirection instructs the shell to read input from the
current source until a line containing only delimiter (with no trailing
blanks) is seen. All of the lines read up to that point are then used
as the standard input for a command.
The format of here-documents is:
<<[-]word
here-document
delimiter
No parameter expansion, command substitution, arithmetic expansion, or
pathname expansion is performed on word. If any characters in word are
quoted, the delimiter is the result of quote removal on word, and the
lines in the here-document are not expanded. If word is unquoted, all
lines of the here-document are subjected to parameter expansion, command
substitution, and arithmetic expansion. In the latter case, the
character sequence \<newline> is ignored, and \ must be used to quote
the characters \, $, and `.
If the redirection operator is <<-, then all leading tab characters are
stripped from input lines and the line containing delimiter. This
allows here-documents within shell scripts to be indented in a natural
fashion.
Bash manual.
Replace "HERE" with HERE (without quotes) and replace 1,$ with 1,\$ or %
You could write your script as follows:
#!/bin/bash
ex xxx.html <<-HERE
%s/foo/$1&/
x
HERE
Although you could also build a smaller script:
#!/bin/bash
sed -i "s/foo/$1&/g" xxx.html
Try this.
#!/bin/bash
(echo '1,$s/\(foo\)/'"$1"'\1/'; echo 'wq') | ex xxx.html
That's 1,$s/\(foo\)/ in single quotes, adjacent to $1 in double quotes (so the shell substitutes the parameter), adjacent to \1/ in single quotes.
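To check what ex would receive from an unquoted here-document, the same text can be sent to cat. A minimal sketch: the function name `show_commands` and the argument `PREFIX_` are hypothetical and stand in for the script and its first argument.

```shell
#!/bin/bash
# Inside the function, $1 is the function's first argument, which plays
# the role of the script argument here.
show_commands() {
cat <<HERE
1,\$s/\(foo\)/$1\1/
wq
HERE
}

commands=$(show_commands "PREFIX_")
printf '%s\n' "$commands"
```

In the unquoted here-document, \$ yields a literal $, $1 is expanded, and the backslashes in \( \) and \1 pass through unchanged, so the first line that reaches the editor is `1,$s/\(foo\)/PREFIX_\1/`.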

Delete all comments in a file using sed

How would you use sed to delete all comments (introduced by #) from a file, while leaving alone any # that appears inside a string?
This helped out a lot except for the string portion.
If # always means comment, and can appear anywhere on a line (like after some code):
sed 's:#.*$::g' <file-name>
If you want to change it in place, add the -i switch:
sed -i 's:#.*$::g' <file-name>
This will delete from any # to the end of the line, ignoring any context. If you use # anywhere where it's not a comment (like in a string), it will delete that too.
If comments can only start at the beginning of a line, do something like this:
sed 's:^#.*$::g' <file-name>
If they may be preceded by whitespace, but nothing else, do:
sed 's:^\s*#.*$::g' <file-name>
These two will be a little safer because they likely won't delete valid usage of # in your code, such as in strings.
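The trade-off can be seen on a two-line sample. This is a sketch with invented input; it uses [[:space:]] as a portable stand-in for the GNU-specific \s used above.

```shell
#!/bin/bash
# Hypothetical input: a string containing '#', then a real comment line.
sample=$(printf '%s\n' 'x="a#b"' '# note')

# Delete from any '#' to end of line: the string is damaged too.
anywhere=$(printf '%s\n' "$sample" | sed 's:#.*$::')

# Only touch lines whose first non-blank character is '#'.
line_start=$(printf '%s\n' "$sample" | sed 's:^[[:space:]]*#.*$::')

printf 'anywhere:   %s\n' "$anywhere"
printf 'line start: %s\n' "$line_start"
```

Note that both commands leave an empty line where the comment was; the command substitution here merely drops it because it is trailing.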
Edit:
There's not really a nice way of detecting whether something is in a string. I'd use the last two if that would satisfy the constraints of your language.
The problem with detecting whether you're in a string is that regular expressions can't do everything. There are a few problems:
Strings can likely span lines
A regular expression can't tell the difference between apostrophes and single quotes
A regular expression can't match nested quotes (these cases will confuse the regex):
# "hello there"
# hello there"
"# hello there"
If double quotes are the only way strings are defined, double quotes will never appear in a comment, and strings cannot span multiple lines, try something like this:
sed 's:#[^"]*$::g' <file-name>
That's a lot of pre-conditions, but if they all hold, you're in business. Otherwise, I'm afraid you're SOL, and you'd be better off writing it in something like Python, where you can do more advanced logic.
This might work for you (GNU sed):
sed '/#/!b;s/^/\n/;ta;:a;s/\n$//;t;s/\n\(\("[^"]*"\)\|\('\''[^'\'']*'\''\)\)/\1\n/;ta;s/\n\([^#]\)/\1\n/;ta;s/\n.*//' file
/#/!b if the line does not contain a # bail out
s/^/\n/ insert a unique marker (\n)
ta;:a jump to a loop label (resets the substitute true/false flag)
s/\n$//;t if marker at the end of the line, remove and bail out
s/\n\(\("[^"]*"\)\|\('\''[^'\'']*'\''\)\)/\1\n/;ta if the string following the marker is a quoted one, bump the marker forward of it and loop.
s/\n\([^#]\)/\1\n/;ta if the character following the marker is not a #, bump the marker forward of it and loop.
s/\n.*// the remainder of the line is comment, remove the marker and the rest of line.
Since the asker provided no sample input, I will assume a couple of cases, and I will assume the input file is a Bash script because the question is tagged bash.
Case 1: entire line is the comment
The following should be sufficient in most cases:
sed '/^\s*#/d' file
It matches any line that has zero or more leading white-space characters (space, tab, or a few others; see man isspace) followed by a #, and the d command then deletes that line.
Any lines like:
# comment started from beginning.
# any number of white-space character before
# or 'quote' in "here"
They will be deleted.
But
a="foobar in #comment"
will not be deleted, which is the desired result.
Case 2: comment after actual code
For example:
if [[ $foo == "#bar" ]]; then # comment here
The comment part can be removed by
sed "s/\s*#[^\"']*$//" file
[^\"'] is used to prevent confusion with quoted strings; however, it also means that comments containing a ' or " will not be removed.
Final sed
sed "/^\s*#/d;s/\s*#[^\"']*$//" file
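The combined command can be exercised on the cases discussed above. A sketch only: the helper name `clean` and the fourth sample line are additions for the demonstration, and [[:space:]] replaces \s so that it does not depend on GNU sed.

```shell
#!/bin/bash
# Same program as the final sed above, with \s written as [[:space:]].
clean() {
  sed "/^[[:space:]]*#/d;s/[[:space:]]*#[^\"']*$//"
}

out=$(printf '%s\n' \
  '# whole line' \
  'if [[ $foo == "#bar" ]]; then # comment here' \
  'a="foobar in #comment"' \
  "echo hi # it's kept" | clean)

printf '%s\n' "$out"
```

The whole-line comment is deleted and the trailing comment on the if line is stripped, while the # inside the strings survives. The last line shows the stated limitation: a comment containing a quote character is left in place.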
To remove comment lines (lines whose first non-whitespace character is #) but not shebang lines (lines whose first characters are #!):
sed '/^[[:space:]]*#[^!]/d; /#$/d' file
The first argument to sed is a string containing a sed program consisting of two delete-line commands of the form /regex/d. Commands are separated by ;. The first command deletes comment lines but not shebang lines. The second command deletes any remaining empty comment lines; note that it also deletes any other line that ends in #. It does not handle trailing comments.
The last argument to sed is a file to use as input. In Bash, you can also operate on a string variable like this:
sed '/^[[:space:]]*#[^!]/d; /#$/d' <<< "${MYSTRING}"
Example:
# test.sh
S0=$(cat << HERE
#!/usr/bin/env bash
# comment
# indented comment
echo 'FOO' # trailing comment
# last line is an empty, indented comment
#
HERE
)
printf "\nBEFORE removal:\n\n${S0}\n\n"
S1=$(sed '/^[[:space:]]*#[^!]/d; /#$/d' <<< "${S0}")
printf "\nAFTER removal:\n\n${S1}\n\n"
Output:
$ bash test.sh
BEFORE removal:
#!/usr/bin/env bash
# comment
# indented comment
echo 'FOO' # trailing comment
# last line is an empty, indented comment
#
AFTER removal:
#!/usr/bin/env bash
echo 'FOO' # trailing comment
Supposing "being in a string" means "occurs between a pair of quotes, either single or double", the question can be rephrased as "remove everything after the first unquoted #". You can define the quoted strings, in turn, as anything between two quotes, excepting backslashed quotes. As a minor refinement, replace the entire line with everything up through just before the first unquoted #.
So we get something like [^\"'#] for the trivial case -- a piece of string which is neither a comment sign, nor a backslash, nor an opening quote. Then we can accept a backslash followed by anything: \\. -- that's not a literal dot, that's a literal backslash, followed by a dot metacharacter which matches any character.
Then we can allow zero or more repetitions of a quoted string. In order to accept either single or double quotes, allow zero or more of each. A quoted string shall be defined as an opening quote, followed by zero or more of either a backslashed arbitrary character, or any character except the closing quote: "\(\\.\|[^\"]\)*" or similarly for single-quoted strings '\(\\.\|[^\']\)*'.
Piecing all of this together, your sed script could look something like this:
s/^\(\([^\"'#]*\|\\.\|"\(\\.\|[^\"]\)*"\|'\(\\.\|[^\']\)*'\)*\)#.*/\1/
But because it needs to be quoted, and both single and double quotes are included in the string, we need one more complication. Recall that the shell lets you glue strings together: "foo"'bar' becomes foobar -- foo in double quotes, and bar in single quotes. Thus you can include single quotes by putting them in double quotes adjacent to your single-quoted string -- '"foo"'"'" is "foo" in single quotes next to ' in double quotes, thus "foo"'; and "' can be expressed as '"' adjacent to "'". And so a string containing both kinds of quotes, foo"'bar, can be quoted as 'foo"' adjacent to "'bar" or, perhaps more realistically for this case, 'foo"' adjacent to "'" adjacent to another single-quoted string 'bar', yielding 'foo"'"'"'bar'.
sed 's/^\(\(\\.\|[^\#"'"'"']*\|"\(\\.\|[^\"]\)*"\|'"'"'\(\\.\|[^\'"'"']\)*'"'"'\)*\)#.*/\1/' file
This was tested on Linux; on other platforms, the sed dialect may be slightly different. For example, you may need to omit the backslashes before the grouping and alternation operators.
Alas, if you may have multi-line quoted strings, this will not work; sed, by design, only examines one input line at a time. You could build a complex script which collects multiple lines into memory, but by then, switching to e.g. Perl starts to make a lot of sense.
As you have pointed out, sed won't work well if any parts of a script look like comments but actually aren't. For example, you could find a # inside a string, or the rather common $# and ${#param}.
I wrote a shell formatter called shfmt, which has a feature to minify code. That includes removing comments, among other things:
$ cat foo.sh
echo $# # inline comment
# lone comment
echo '# this is not a comment'
[mvdan#carbon:12] [0] [/home/mvdan]
$ shfmt -mn foo.sh
echo $#
echo '# this is not a comment'
The parser and printer are Go packages, so if you'd like a custom solution, it should be fairly easy to write a 20-line Go program to remove comments in the exact way that you want.
sed 's:^#\(.*\)$:\1:g' filename
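Assuming each comment line starts with a single #, the above command strips that leading # and keeps the rest of the line, so it uncomments the lines rather than deleting them. To delete such lines entirely, use sed '/^#/d' filename.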

Resources