Kornshell variable definition: What is ?FOO? - shell

I found a line of code in a kornshell script:
foo=`basename ?BAR?`
What does the question marks mean?
Thank you

touch BAR ABAR ABARZ
ls ?BAR?
ABARZ
? is normally a shell wildcard char that matches 1 character, and that 1 character position must be in use, as shown in the example above. It's like a 1-char version of '*', match 1 char (that must be there). Notice that if you change to
ls ?BAR*
You get output like
ABAR ABARZ
Your code shows the same behavior
foo=$(basename ?BAR?)
echo $foo
ABARZ
Does that make sense? Not really, but given the small context you have given the other possible interpretation is that the original script writer is using ?BAR? as a place-holder and telling you "change this to a real/meaningful value".
Other may have other ideas.
IHTH

Related

Way to create multiline comments in Bash?

I have recently started studying shell script and I'd like to be able to comment out a set of lines in a shell script. I mean like it is in case of C/Java :
/* comment1
comment2
comment3
*/`
How could I do that?
Use : ' to open and ' to close.
For example:
: '
This is a
very neat comment
in bash
'
Multiline comment in bash
: <<'END_COMMENT'
This is a heredoc (<<) redirected to a NOP command (:).
The single quotes around END_COMMENT are important,
because it disables variable resolving and command resolving
within these lines. Without the single-quotes around END_COMMENT,
the following two $() `` commands would get executed:
$(gibberish command)
`rm -fr mydir`
comment1
comment2
comment3
END_COMMENT
Note: I updated this answer based on comments and other answers, so comments prior to May 22nd 2020 may no longer apply. Also I noticed today that some IDE's like VS Code and PyCharm do not recognize a HEREDOC marker that contains spaces, whereas bash has no problem with it, so I'm updating this answer again.
Bash does not provide a builtin syntax for multi-line comment but there are hacks using existing bash syntax that "happen to work now".
Personally I think the simplest (ie least noisy, least weird, easiest to type, most explicit) is to use a quoted HEREDOC, but make it obvious what you are doing, and use the same HEREDOC marker everywhere:
<<'###BLOCK-COMMENT'
line 1
line 2
line 3
line 4
###BLOCK-COMMENT
Single-quoting the HEREDOC marker avoids some shell parsing side-effects, such as weird subsitutions that would cause crash or output, and even parsing of the marker itself. So the single-quotes give you more freedom on the open-close comment marker.
For example the following uses a triple hash which kind of suggests multi-line comment in bash. This would crash the script if the single quotes were absent. Even if you remove ###, the FOO{} would crash the script (or cause bad substitution to be printed if no set -e) if it weren't for the single quotes:
set -e
<<'###BLOCK-COMMENT'
something something ${FOO{}} something
more comment
###BLOCK-COMMENT
ls
You could of course just use
set -e
<<'###'
something something ${FOO{}} something
more comment
###
ls
but the intent of this is definitely less clear to a reader unfamiliar with this trickery.
Note my original answer used '### BLOCK COMMENT', which is fine if you use vanilla vi/vim but today I noticed that PyCharm and VS Code don't recognize the closing marker if it has spaces.
Nowadays any good editor allows you to press ctrl-/ or similar, to un/comment the selection. Everyone definitely understands this:
# something something ${FOO{}} something
# more comment
# yet another line of comment
although admittedly, this is not nearly as convenient as the block comment above if you want to re-fill your paragraphs.
There are surely other techniques, but there doesn't seem to be a "conventional" way to do it. It would be nice if ###> and ###< could be added to bash to indicate start and end of comment block, seems like it could be pretty straightforward.
After reading the other answers here I came up with the below, which IMHO makes it really clear it's a comment. Especially suitable for in-script usage info:
<< ////
Usage:
This script launches a spaceship to the moon. It's doing so by
leveraging the power of the Fifth Element, AKA Leeloo.
Will only work if you're Bruce Willis or a relative of Milla Jovovich.
////
As a programmer, the sequence of slashes immediately registers in my brain as a comment (even though slashes are normally used for line comments).
Of course, "////" is just a string; the number of slashes in the prefix and the suffix must be equal.
I tried the chosen answer, but found when I ran a shell script having it, the whole thing was getting printed to screen (similar to how jupyter notebooks print out everything in '''xx''' quotes) and there was an error message at end. It wasn't doing anything, but: scary. Then I realised while editing it that single-quotes can span multiple lines. So.. lets just assign the block to a variable.
x='
echo "these lines will all become comments."
echo "just make sure you don_t use single-quotes!"
ls -l
date
'
what's your opinion on this one?
function giveitauniquename()
{
so this is a comment
echo "there's no need to further escape apostrophes/etc if you are commenting your code this way"
the drawback is it will be stored in memory as a function as long as your script runs unless you explicitly unset it
only valid-ish bash allowed inside for instance these would not work without the "pound" signs:
1, for #((
2, this #wouldn't work either
function giveitadifferentuniquename()
{
echo nestable
}
}
Here's how I do multiline comments in bash.
This mechanism has two advantages that I appreciate. One is that comments can be nested. The other is that blocks can be enabled by simply commenting out the initiating line.
#!/bin/bash
# : <<'####.block.A'
echo "foo {" 1>&2
fn data1
echo "foo }" 1>&2
: <<'####.block.B'
fn data2 || exit
exit 1
####.block.B
echo "can't happen" 1>&2
####.block.A
In the example above the "B" block is commented out, but the parts of the "A" block that are not the "B" block are not commented out.
Running that example will produce this output:
foo {
./example: line 5: fn: command not found
foo }
can't happen
Simple solution, not much smart:
Temporarily block a part of a script:
if false; then
while you respect syntax a bit, please
do write here (almost) whatever you want.
but when you are
done # write
fi
A bit sophisticated version:
time_of_debug=false # Let's set this variable at the beginning of a script
if $time_of_debug; then # in a middle of the script
echo I keep this code aside until there is the time of debug!
fi
in plain bash
to comment out
a block of code
i do
:||{
block
of code
}

Substitution of substring doesn't work in bash (tried sed, ${a/b/c/})

Before to write, of course I read many other similar cases. Example I used #!/bin/bash instead of #!/bin/sh
I have a very simple script that reads lines from a template file and wants to replace some keywords with real data. Example the string <NAME> will be replaced with a real name. In the example I want to replace it with the word Giuseppe. I tried 2 solutions but they don't work.
#!/bin/bash
#read the template and change variable information
while read LINE
do
sed 'LINE/<NAME>/Giuseppe' #error: sed: -e expression #1, char 2: extra characters after command
${LINE/<NAME>/Giuseppe} #error: WORD(*) command not found
done < template_mail.txt
(*) WORD is the first word found in the line
I am sorry if the question is too basic, but I cannot see the error and the error message is not helping.
EDIT1:
The input file should not be changed, i want to use it for every mail. Every time i read it, i will change with a different name according to the receiver.
EDIT2:
Thanks your answers i am closer to the solution. My example was a simplified case, but i want to change also other data. I want to do multiple substitutions to the same string, but BASH allows me only to make one substitution. In all programming languages i used, i was able to substitute from a string, but BASH makes this very difficult for me. The following lines don't work:
CUSTOM_MAIL=$(sed 's/<NAME>/Giuseppe/' template_mail.txt) # from file it's ok
CUSTOM_MAIL=$(sed 's/<VALUE>/30/' CUSTOM_MAIL) # from variable doesn't work
I want to modify CUSTOM_MAIL a few times in order to include a few real informations.
CUSTOM_MAIL=$(sed 's/<VALUE1>/value1/' template_mail.txt)
${CUSTOM_MAIL/'<VALUE2>'/'value2'}
${CUSTOM_MAIL/'<VALUE3>'/'value3'}
${CUSTOM_MAIL/'<VALUE4>'/'value4'}
What's the way?
No need to do the loop manually. sed command itself runs the expression on each line of provided file:
sed 's/<NAME>/Giuseppe/' template_mail.txt > output_file.txt
You might need g modifier if there are more appearances of the <NAME> string on one line: s/<NAME>/Giuseppe/g

How Does Bash Tokenize Scripts?

Coming from a C++: it always seems like magic to me that some whitespace has an effect on the validity or semantics of the script. Here's an example:
echo a 2 > &1
bash: syntax error near unexpected token `&'
echo a 2 >&1
a 2
echo a 2>&1
a
echo a 2>& 1
a
Looking at this didn't help much. My main problem is that it does not feel consistent; and I am in a state of confusion.
I'm trying to find out how bash tokenizes its scripts. A general description thereof to clear up any confusion would be appreciated.
Edit:
I am NOT looking for redirections specifically. They just came up as example. Other examples:
A="something"
A = "something"
if [$x = $y];
if [ $x = $y ];
Why isn't there a space necessary between ] and ;? Why does assignment require an immediate equal sign? ...
2>&1 is a single operator token, so any whitespace that breaks it up will change the meaning of the command. It just happens to be a parameterized token, which means the shell will further tokenize it to determine what exactly the operator does. The general form is n>&m, where n is the file descriptor you are redirecting, and m is the descriptor you are copying to. In this case, you are saying that the standard error (2) of the command should be copied to whatever standard output (1) is currently open on.
The examples you gave have the behavior they do for good reason.
Redirection sources default to FD 1. Thus, >&1 is legitimate syntax on its own -- it redirects FD 1 to FD 1 -- meaning allowing whitespace before the > would result in an ambiguous syntax: The parser couldn't tell if the preceding token was its own word or a redirection source.
Nothing other than a FD number is valid under >&, unless you're in a very new bash which allows a variable to be dereferenced to retrieve a FD number. In any event, anything immediately following >& is known to be a file descriptor, so allowing optional whitespace creates no ambiguity there.
a = 1 is parsed as a legitimate command, not a syntax error: It runs the command a with the first argument = and the second argument 1. Disallowing whitespace within assignments eliminates this ambiguity. Similarly, a= foo has a separate and distinct meaning: It exports an environment variable a with an empty value while running the command foo. Relaxing the whitespace rules would disallow both of these legitimate commands.
[ is a command, not special syntax known to the parser; thus, [foo tries to find a command (named, say, /usr/bin/[foo), requiring whitespace.
; takes precedence in the parser as a statement separator, rather than being treated as part of a word, unless quoted or escaped. The same is true of & (another separator), or a newline.
The thing is, there's no single general rule which will explain all this; you need to read and learn the language syntax. Fortunately, there's not very much syntax: Almost all commands are "simple commands", which follow very simple and clear rules. You're asking about, and we're explaining, some of the exceptions to that; there are other exceptions, such as [[ ]] in bash, but they're small enough in total that they can be learned.
Other suggested resources:
http://aosabook.org/en/bash.html (The Architecture of Open Source Applications; chapter on bash)
http://mywiki.wooledge.org/BashParser (Wooledge wiki high-level description of the parser -- though this focuses more on expansion rules than tokenization)
http://mywiki.wooledge.org/BashGuide (an introductory guide to bash syntax in general, written with more of a focus on accuracy and best practices than some competing materials).

What does this variable assignment do?

I'm having to code a subversion hook script, and I found a few examples online, mostly python and perl. I found one or two shell scripts (bash) as well. I am confused by a line and am sorry this is so basic a question.
FILTER=".(sh|SH|exe|EXE|bat|BAT)$"
The script later uses this to perform a test, such as (assume EXT=ex):
if [[ "$FILTER" == *"$EXT"* ]]; then blah
My problem is the above test is true. However, I'm not asking you to assist in writing the script, just explaining the initial assignment of FILTER. I don't understand that line.
Editing in a closer example FILTER line. Of course the script, as written does not work, because 'ex' returns true, and not just 'exe'. My problem here is only, however, that I don't understant the layout of the variable assignment itself.
Why is there a period at the beginning? ".(sh..."
Why is there a dollar sign at the end? "...BAT)$"
Why are there pipes between each pattern? "sh|SH|exe"
You probably looking for something as next:
FILTER="\.(sh|SH|exe|EXE|bat|BAT)$"
for EXT
do
if [[ "$EXT" =~ $FILTER ]];
then
echo $EXT extension disallowed
else
echo $EXT is allowed
fi
done
save it to myscript.sh and run it as
myscript.sh bash ba.sh
and will get
bash is allowed
ba.sh extension disallowed
If you don't escape the "dot", e.g. with the FILTER=".(sh|SH|exe|EXE|bat|BAT)$" you will get
bash extension disallowed
ba.sh extension disallowed
What is (of course) wrong.
For the questions:
Why is there a period at the beginning? ".(sh..."
Because you want match .sh (as extension) and not for example bash (without the dot). And therefore the . must be escaped, like \. because the . in regex mean "any character.
Why is there a dollar sign at the end? "...BAT)$"
The $ mean = end of string. You want match file.sh and not file.sh.jpg. The .sh should be at the end of string.
Why are there pipes between each pattern? "sh|SH|exe"
In the rexex, the (...|...|...) construction delimites the "alternatives". As you sure quessed.
You really need read some "regex tutorial" - it is more complicated - and can't be explained in one answer.
Ps: NEVER use UPPERCASE variable names, they can collide with environment variables.
This just assigns a string to FILTER; the contents of that string have no special meaning. When you try to match it against the pattern *ex*, the result is true assuming that the value of $FILTER consists the string ex surrounded by anything on either side. This is true; ex is a substring of exe.
FILTER=".(sh|SH|exe|EXE|bat|BAT)$"
^^
|
+---- here is the "ex" from the pattern.
As I can this is similar to regular expression pattern:
In regular expressions the string start with can be show with ^, similarly in this case . represent seems doing that.
In the bracket you have exact string, which represents what the exact file extensions would be matched, they are 'Or' by using the '|'.
And at the end the expression should only pick the string will '$' or end point and not more than.
I would say that way original author might have looked at it and implemented it.

11*(...) as a bash parameter without quotation marks

I'm trying to write a small piece of code that passes a small formula to another program, however i've found that something strange happens when the formula starts with 11*(:
$ echo 11*15
Neatly prints '11*15'
$ echo 21*(15)
Neatly prints '21*(15)', while
echo 11*(15)
Only gives '11'. As far as I've found this only happens with '11*('. I know that this can be solved by using proper quotation marks, but I'm still curious as to why this happens.
Does anyone know?
How is your program coded? If its coded to take in parameters, then pass your formula like
./myprogram "11*15"
or
echo '11*15' | myprogram
If you do echo just like that on the command line, you may inadvertently display files that has 11 in its file name
11*(15) uses a Bash-specific extended glob syntax. You've stumbled across it accidentally, emphasizing why quotation marks are a good idea. (I also learned a lot tracking down why it was working differently for me; thanks for that.)
The behavior of
echo 11*(15)
in bash is going to vary depending on whether extglob is enabled. If it's enabled *(PATTERN-LIST) matches zero or more occurrences of the patterns. If it's disabled, it doesn't, and the resulting ( is likely to cause a syntax error.
For example:
$ ls
11 115 1155 11555 115555
$ shopt -u extglob
$ echo 11*(55)
bash: syntax error near unexpected token `('
$ shopt -s extglob
$ echo 11*(55)
11 1155 115555
$
(This explains the odd behavior I discussed in comments.)
Quoting from the bash 4.2.8 documentation (info bash):
If the `extglob' shell option is enabled using the `shopt' builtin,
several extended pattern matching operators are recognized. In the
following description, a PATTERN-LIST is a list of one or more
patterns separated by a `|'. Composite patterns may be formed using
one or more of the following sub-patterns:
`?(PATTERN-LIST)'
Matches zero or one occurrence of the given patterns.
`*(PATTERN-LIST)'
Matches zero or more occurrences of the given patterns.
`+(PATTERN-LIST)'
Matches one or more occurrences of the given patterns.
`#(PATTERN-LIST)'
Matches one of the given patterns.
`!(PATTERN-LIST)'
Matches anything except one of the given patterns.

Resources