While working on a project written in bash by my former colleague, I noticed that all .sh files that contain nothing but function definitions start with #!/bin/false, which is, as I understand it, a safety mechanism for preventing execution of include-only files.
Example:
my_foo.sh
#!/bin/false
function foo(){
echo foontastic
}
my_script.sh
#!/bin/bash
./my_foo.sh # does nothing
foo # error, no command named "foo"
. ./my_foo.sh
foo # prints "foontastic"
However, when I don't use #!/bin/false, the effects of both proper and improper use are exactly the same:
Example:
my_bar.sh
function bar(){
echo barvelous
}
my_script.sh
#!/bin/bash
./my_bar.sh # spawns a subshell, defines bar and exits, effectively doing nothing
bar # error, no command named "bar"
. ./my_bar.sh
bar # prints "barvelous"
Since properly using those scripts by including them with source works as expected in both cases, and executing them in both cases does nothing from the perspective of the parent shell and generates no error message about the invalid use, what exactly is the purpose of #!/bin/false in those scripts?
In general, let’s consider a file testcode with bash code in it
#!/bin/bash
if [ "$0" = "${BASH_SOURCE[0]}" ]; then
echo "You are executing ${BASH_SOURCE[0]}"
else
echo "You are sourcing ${BASH_SOURCE[0]}"
fi
you can do three different things with it:
$ ./testcode
You are executing ./testcode
This works if testcode has the right permissions and the right shebang. With a shebang of #!/bin/false, this outputs nothing and returns a code of 1 (false).
$ bash ./testcode
You are executing ./testcode
This completely disregards the shebang (which can even be missing) and it only requires read permission, not executable permission. This is the way to call bash scripts from a CMD command line in Windows (if you have bash.exe in your PATH...), since there the shebang mechanism doesn’t work.
$ . ./testcode
You are sourcing ./testcode
This also completely disregards the shebang, as above, but it is a completely different matter, because sourcing a script means having the current shell execute it, while executing a script means invoking a new shell to execute it. For instance, if you put an exit command in a sourced script, you exit from the current shell, which is rarely what you want. Therefore, sourcing is often used to load function definitions or constants, in a way somewhat resembling the import statement of other programming languages, and various programmers develop different habits to differentiate between scripts meant to be executed and include files to be sourced. I usually don’t use any extension for the former (others use .sh), but I use an extension of .shinc for the latter. Your former colleague used a shebang of #!/bin/false and one can only ask them why they preferred this to a zillion other possibilities. One reason that comes to my mind is that you can use file to tell these files apart:
$ file testcode testcode2
testcode: Bourne-Again shell script, ASCII text executable
testcode2: a /bin/false script, ASCII text executable
Of course, if these include files contain only function definitions, it’s harmless to execute them, so I don’t think your colleague did it to prevent execution.
Another habit of mine, inspired by the Python world, is to place some regression tests at the end of my .shinc files (at least while developing)
... function definitions here ...
[ "$0" != "${BASH_SOURCE[0]}" ] && return
... regression tests here ...
Since return generates an error in executed scripts but is OK in sourced scripts, a more cryptic way to get the same result is
... function definitions here ...
return 2>/dev/null || :
... regression tests here ...
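To make the guard above concrete, here is a minimal sketch. The file name greet.shinc and the function greet are made up for the demo, and bash is invoked explicitly so the result does not depend on your login shell. It shows that the self-test only runs when the file is executed, not when it is sourced.

```shell
# Hypothetical include file, created in a temp directory for the demo.
tmpdir=$(mktemp -d)
cat > "$tmpdir/greet.shinc" <<'EOF'
greet() { echo "hello $1"; }
# Stop here when sourced; fall through to the self-test when executed.
return 2>/dev/null || :
[ "$(greet world)" = "hello world" ] && echo "self-test passed"
EOF

# Executing the file: return fails quietly, so the self-test runs.
executed_output=$(bash "$tmpdir/greet.shinc")
# Sourcing the file: return succeeds, so only greet gets defined.
sourced_output=$(bash -c '. "$1"; greet reader' _ "$tmpdir/greet.shinc")

echo "executed: $executed_output"
echo "sourced:  $sourced_output"
rm -rf "$tmpdir"
```

The cryptic form has one side effect worth knowing: it also hides any other error message that return might print, which is usually acceptable for this purpose.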
The difference in using #!/bin/false or not from the point of view of the parent shell is in the return code.
/bin/false always returns a failing return code (in my case 1, but I am not sure whether that is standard).
Try this:
./my_foo.sh # does nothing
echo $? # shows "1", a.k.a. failing
./my_bar.sh # does nothing
echo $? # shows "0", a.k.a. everything went right
So, using #!/bin/false not only documents the fact that the script is not intended to be executed, but also produces an error return code when doing so.
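Here is a small sketch of how a caller could make use of that return code. It recreates the two files from the question in a temp directory (using the portable name() { ...; } function syntax so the shebang-less file runs under any sh) and assumes the temp directory allows executing files. On systems where /bin/false does not exist at that path the status will still be non-zero, just not necessarily 1.

```shell
tmpdir=$(mktemp -d)
# One include file guarded with #!/bin/false, one without any shebang.
printf '%s\n' '#!/bin/false' 'foo() { echo foontastic; }' > "$tmpdir/my_foo.sh"
printf '%s\n' 'bar() { echo barvelous; }' > "$tmpdir/my_bar.sh"
chmod +x "$tmpdir/my_foo.sh" "$tmpdir/my_bar.sh"

# Execute both (the improper use) and record the exit status of each.
status_foo=0; "$tmpdir/my_foo.sh" || status_foo=$?
status_bar=0; "$tmpdir/my_bar.sh" || status_bar=$?
echo "my_foo.sh exited with $status_foo"
echo "my_bar.sh exited with $status_bar"

# A caller that checks the status notices the mistake only for my_foo.sh.
if [ "$status_foo" -ne 0 ]; then
  echo "my_foo.sh is an include file; source it instead" >&2
fi
rm -rf "$tmpdir"
```

So the benefit only materializes if the caller checks the status, for example through set -e or an explicit test like the one above.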
Related
I noticed that my script was ignoring my positional arguments in old terminal tabs, but working on recently created ones, so I decided to reduce it to the following:
TAG=test
while getopts 't:' c
do
case $c in
t)
TAG=$OPTARG
;;
esac
done
echo $TAG
And running the script I have:
~ source my_script
test
~ source my_script -t "test2"
test2
~ source my_script -t "test2"
test
I thought it could be that c was a special variable used elsewhere, but after changing it to other names I had the exact same problem. I also tried adding a .sh extension to the file to see if that was the problem, but nothing worked.
Am I doing something wrong ? And why does it work the first time, but not the subsequent attempts ?
I am on MacOS and I use zsh.
Thank you very much.
The problem is that you're using source to run the script (the . command does the same thing). This makes it run in your current (interactive) shell (rather than a subprocess, like scripts normally do). This means it uses the same variables as the current shell, which is necessary if you want it to change those variables, but it can also have weird effects if you're not careful.
In this case, the problem is that getopts uses the variable OPTIND to keep track of where it is in the argument list (so it doesn't process the same argument twice). The first time you run the script with -t test2, getopts processes those arguments, and leaves OPTIND set to 3 (meaning that it's already done the first two arguments, "-t" and "test2"). The second time you run it with options, it sees that OPTIND is set to 3, so it thinks it's already processed both arguments and just exits the loop.
One option is to add unset OPTIND before the while getopts loop, to reset the count and make it start from the beginning each time.
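The effect can be reproduced with the sketch below. It sources the same option loop twice inside one bash process, once without and once with a reset. Note that it resets with OPTIND=1 rather than unset OPTIND: assigning 1 is the reset that POSIX describes, while the result of unsetting OPTIND is left unspecified there. The file names no_reset and with_reset and the lower-case variable names are made up for the demo.

```shell
tmpdir=$(mktemp -d)
# The option loop from the question, without any reset.
cat > "$tmpdir/no_reset" <<'EOF'
tag=default
while getopts 't:' opt; do
  case $opt in
    t) tag=$OPTARG ;;
  esac
done
echo "$tag"
EOF
# The same loop with OPTIND reset to 1 on the first line.
{ echo 'OPTIND=1'; cat "$tmpdir/no_reset"; } > "$tmpdir/with_reset"

# Source each variant twice in a single bash process, like repeated
# "source my_script -t ..." calls in one terminal tab.
no_reset_output=$(bash -c '. "$1" -t first; . "$1" -t second' _ "$tmpdir/no_reset")
with_reset_output=$(bash -c '. "$1" -t first; . "$1" -t second' _ "$tmpdir/with_reset")

echo "without reset:"; echo "$no_reset_output"
echo "with reset:";    echo "$with_reset_output"
rm -rf "$tmpdir"
```

Without the reset the second call falls back to the default; with it, both calls see their own option.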
But unless there's some reason for this script to run in the current shell, it'd be better to make it a standard shell script and have it run as a subprocess. To do this:
Add a "shebang" line as the first line of the script. To make the script run in bash, that'd be either #!/bin/bash or #!/usr/bin/env bash. For zsh, use #!/bin/zsh or #!/usr/bin/env zsh. Since the script runs in a separate shell process, you can run bash scripts from zsh or zsh scripts from bash, or whatever.
Add execute permission to the script file with chmod +x my_script (or whatever the file's actual name is).
Run the script with ./my_script (note the lack of a space between . and /), or by giving the full path to the script, or by putting the script in some directory in your PATH (the directories that're automatically searched for commands) and just running my_script. Do NOT run it with the bash, sh, zsh etc commands; these override the shebang and therefore can cause confusion.
Note: adding ".sh" to the filename is not recommended; it does nothing useful, and makes the script less convenient to run since you have to type in the extension every time you run it.
Also, a couple of recommendations: there are a bunch of all-caps variable names with special meanings (like PATH and OPTIND), so unless you want one of those special meanings, it's best to use lower- or mixed-case variable names (e.g. tag instead of TAG). Also, double-quoting variable references (e.g. echo "$tag" instead of echo $tag) avoids a lot of weird parsing headaches. Run your scripts through shellcheck.net; it's good at spotting common mistakes like this.
How to correctly pass to the script and substitute a variable that is already defined there?
My script test.sh:
#!/bin/bash
TARGETARCH=amd64
echo $1
When I enter:
bash test.sh https://example/$TARGETARCH
I want to see
https://example/amd64
but I actually see
https://example/
What am I doing wrong?
The first problem with the original approach is that the $TARGETARCH is removed by your calling shell before your script is ever invoked. To prevent that, you need to use quotes:
./yourscript 'https://example.com/$TARGETARCH'
The second problem is that parameter expansions only happen in code, not in data. This is, from a security perspective, a Very Good Thing -- if data were silently treated as code it would be impossible to write secure scripts handling untrusted data -- but it does mean you need to do some more work. The easy thing, in this case, is to export your variable and use GNU envsubst, as long as your operating system provides it:
#!/bin/bash
export TARGETARCH=amd64
substitutedValue=$(envsubst <<<"$1")
echo "Original value was: $1"
echo "Substituted value is: $substitutedValue"
See the above running in an online sandbox at https://replit.com/#CharlesDuffy2/EcstaticAfraidComputeranimation#replit.nix
Note the use of yourscript instead of test.sh here -- using .sh file extensions, especially for bash scripts as opposed to sh scripts, is an antipattern; the essay at https://www.talisman.org/~erlkonig/documents/commandname-extensions-considered-harmful/ has been linked by the #bash IRC channel on this topic for over a decade.
For similar reasons, changing bash yourscript to ./yourscript lets the #!/usr/bin/env bash line select an interpreter, so you aren't repeating the "bash" name in multiple places, leading to the risk of those places getting out of sync with each other.
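If envsubst is not available, one possible alternative (a sketch, not the approach shown above) is bash's pattern substitution, which replaces only the literal text $TARGETARCH and treats the rest of the argument as plain data. The variable name placeholder is made up for this example, and the script is written to a temp file and run with bash explicitly.

```shell
tmpdir=$(mktemp -d)
cat > "$tmpdir/yourscript" <<'EOF'
#!/usr/bin/env bash
TARGETARCH=amd64
# Literal text to look for in the argument (hypothetical helper variable).
placeholder='$TARGETARCH'
# Quoting "$placeholder" makes bash match it as plain text, not as a glob.
echo "${1//"$placeholder"/$TARGETARCH}"
EOF

# Single quotes keep the calling shell from expanding $TARGETARCH itself.
substituted=$(bash "$tmpdir/yourscript" 'https://example.com/$TARGETARCH')
echo "$substituted"
rm -rf "$tmpdir"
```

Unlike envsubst, this handles exactly one known placeholder, which may be all you need and avoids exporting anything.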
Original Title: Indirect parameter substitution breaks when the script is sourced (zsh)
zsh 5.7.1 (x86_64-apple-darwin19.0)
GNU bash, version 4.4.20(1)-release (x86_64-pc-linux-gnu)
I’m developing a shell script on a Mac and I’m trying to keep it portable between bash & zsh, so array indexing is a consideration. I know that I can set KSH_ARRAYS to get indexing to start at 0, but I decided to query the OS for the shell that’s in use and set the start index accordingly, which led to the issue described below.
It made sense (to me anyway!) to use indirect expansion, which is what led to the problem. Consider the script indirect.sh:
#! /bin/bash
declare -r ARRAY_START_BASH=0
declare -r ARRAY_START_ZSH=1
declare -r SHELL_BASH=0
declare -r SHELL_ZSH=1
# Indirect expansion is used to reference the values of the variables declared
# in this case statement e.g. ${!ARRAY_START}
case $(basename $SHELL) in
"bash" )
declare -r SHELL_ID=SHELL_BASH
declare -r ARRAY_START=ARRAY_START_BASH
;;
"zsh" )
declare -r SHELL_ID=SHELL_ZSH
declare -r ARRAY_START=ARRAY_START_ZSH
;;
* )
return 1
;;
esac
echo "Shell ID: ${!SHELL_ID} Index arrays from: ${!ARRAY_START}"
It works fine when run from the command line while in the same directory:
<my home> ~ % echo "$(./indirect.sh)"
Shell ID: 1 Index arrays from: 1
Problems arise when I source the script:
<my home> ~ % echo "$(. ~/indirect.sh)"
/Users/<me>/indirect.sh:28: bad substitution
I don’t understand why sourcing the script changes the behavior of the parameter expansion.
Is this expected behavior? If so, I’d be grateful if someone could explain it and hopefully, offer a work around.
The problem described in the original post has nothing to do with indirect expansion. The difference in behavior is a result of different shells being invoked depending on whether the script is “executed” or “sourced”. These differences reveal the basic flaw in deriving the shell from the $SHELL variable that underpins the script's design. If the shell defined in $SHELL does not match the shebang, the script will fail either when sourced or executed. An explanation follows.
Indirect expansion doesn’t offer value in the given scenario because values could just as easily be assigned directly. They’ll have to be assigned that way regardless given the different syntax used for indirect expansion between shells. In fact, other syntax differences between shells make the entire premise for detecting the shell moot! However, putting that aside, the difference in behavior is a result of different shells being invoked based on whether the script is “executed” or “sourced”. The behavior of sourcing is well documented with numerous explanations on the web, but for context here’s how it works:
Executing a Script
Use the “./“ syntax to execute a script.
When run this way, the script executes in a sub-shell. Any changes the
script makes to its shell are applied to the sub-shell, not the shell
in which the script was launched, so those changes are lost when the
shell exits because the sub-shell in which it executed is destroyed as
well. For example, if the script changes the working directory, it
does so in the sub-shell. The working directory of the main shell that
launched the script is unchanged when the script terminates. If you
want to make changes to the shell in which the script was launched, it
must be sourced.
Sourcing a Script
Use the “source “ syntax to source a
script. When run this way, the script essentially becomes an argument
for the source command, which handles invoking the appropriate
execution. Some shells (e.g. ksh) use a single period “.” instead of
“source”.
When a script is executed with the “./“ syntax, the shebang at the top of the file is used to determine which shell to use. When a script is sourced, the shebang is ignored and the shell in which the script is launched is used instead. Also note that the period that appears in the “./“ command syntax used to execute a script, is not related to the period that’s occasionally used as an alias for the source command.
The script in the post uses bash in the shebang statement, so it works when executed because it’s run using bash. When it’s sourced from zsh, it encounters the incorrect indirect expansion syntax:
"${!A_VAR}"
The correct syntax is:
"${(P)A_VAR}"
However, correcting the syntax won’t help because it will then fail when executed. The shebang will invoke bash and the syntax will be wrong again. That renders indirection useless for accessing a variable designed to indicate the shell in use. More importantly, a design based on querying an environment variable for the shell is flawed due to differences in the shell that’s ultimately used depending on whether the script is executed or sourced.
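If you still need to know which shell is interpreting the file, one commonly used alternative (not part of the script in the question) is to ask the interpreter itself through the version variables that bash and zsh set, instead of reading $SHELL. The sketch below assumes default zsh settings (no KSH_ARRAYS); the names detect.shinc, current_shell and array_start are made up. It sources the check from bash while SHELL deliberately claims zsh.

```shell
tmpdir=$(mktemp -d)
cat > "$tmpdir/detect.shinc" <<'EOF'
# Each shell sets its own version variable, regardless of $SHELL.
if [ -n "${ZSH_VERSION:-}" ]; then
  current_shell=zsh array_start=1
elif [ -n "${BASH_VERSION:-}" ]; then
  current_shell=bash array_start=0
else
  current_shell=unknown array_start=0
fi
EOF

# SHELL says zsh, but the process sourcing the file is bash.
detected=$(SHELL=/bin/zsh bash -c '. "$1"; echo "$current_shell $array_start"' _ "$tmpdir/detect.shinc")
echo "$detected"
rm -rf "$tmpdir"
```

This gives the same answer whether the file is sourced or executed, because it never consults the login shell recorded in $SHELL.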
To add to your answer (what I'm going to say is too long for a comment), I cannot think of any application in which your script would be useful if not sourced. Actually, I came across the need for such a script myself on exactly one occasion:
I use not only zsh but sometimes also bash as my interactive shell, so I have written my .zshrc and .bashrc to set up everything (including defining variables and shell functions for interactive use). To save work,
I try to put code which works under both bash and zsh into a single file (say: .commonrc), and my .zshrc and .bashrc have inside them a
source .commonrc
While many things are so different in bash and zsh that I can't put them into .commonrc, some can go there, provided I do some tweaking. One source of headaches is obviously the different indexing of arrays, which you seemingly try to solve. So I also have a similar feature. However, I don't need a case construct for this. Instead, my .bashrc looks like this (using your naming of the variables):
...
declare -r ARRAY_START=0
source .commonrc
...
and my .zshrc looks like this:
...
declare -r ARRAY_START=1
source .commonrc
...
Since it does not happen that the .bashrc is run from a zsh and vice versa, I don't need to query what kind of shell I have.
I am working with someone on a data analysis project and we frequently document the steps we perform by putting them into small shell scripts. The problem is that I use bash and the other person uses csh. The other person has a habit of using source to run these scripts instead of executing them directly (this habit probably dates back to times when spawning an extra shell was an extravagant waste of resources, so it's probably too entrenched to change), and I want to have my scripts (which are, of course, bash scripts) simply stop with a message reminding the user to run them with bash instead of csh when this person sources them from within csh. At the same time, I would like them to continue to function as bash scripts.
So is there some code I can put at the beginning of my scripts that is a no-op in bash but will signal an error and cancel the execution of the rest of the file (but not kill the shell itself) when sourced from cshell?
This is harder than I thought due to csh's ancient variable substitution flavor. However, $?BASH_VERSION expands to 0 (not set) in csh and to 0BASH_VERSION (or whatever the last command's return value was, followed by BASH_VERSION) in bash. So,
test "$?BASH_VERSION" = 0 && exit 1
should do the trick.
This is not easy, as you cannot assign variables the same way or run if statements the same way.
You can use csh's meagre string parsing skills against itself. The following executes cleanly in all shells, including KSH, BASH, ZSH, CSH and SH on all platforms that I tested it on (Linux, AIX, HP-UX, Solaris):
test '\\' = "\\" && echo "CSH detected"
The idea used here is that backslashes are not special in double-quoted strings in CSH, whereas all other shells treat a backslash inside double quotes as an escape character, so the two strings differ.
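The Bourne-shell half of that comparison can be checked with the short sketch below (the variable names are made up). In sh, bash, ksh and zsh the single-quoted string keeps both backslashes while the double-quoted one collapses to a single backslash, so the test is false there. The csh half is not exercised here.

```shell
single='\\'   # single quotes: both backslashes are kept
double="\\"   # double quotes: backslash-backslash becomes one backslash
echo "single-quoted length: ${#single}"
echo "double-quoted length: ${#double}"

# Same comparison as in the answer, with a message for each outcome.
if test '\\' = "\\"; then
  verdict="csh-style quoting"
else
  verdict="Bourne-style quoting"
fi
echo "$verdict"
```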
However, that is only half an answer as what do you want to do if you don't want to simply exit the script if the 'wrong' shell is detected? Well, you may want to have a Bournish sh part to your script and a csh part.
If you can keep the csh code limited to code that does not use single quotes, the following will work everywhere:
test '\\' = "\\" && goto csh
# Just skip the block containing the csh code. Again we use csh's meagre string parsing capabilities against it.
false || csh_code_block='
csh:
... csh code goes here ...
exit 0
'
... sh code goes here ...
If you are not worried about HP-UX's csh (which seems a little better than others in parsing) you could replace the multi-line single quoted command with a 'HERE' document (<<CSH_BLOCK ... CSH_BLOCK). You can't just reverse the order either, as the 'goto' statement doesn't like all syntax that it skips over.
Have people noticed that if you modify the source of a shell script, any instances that are currently running are liable to fail?
This in my opinion is very bad; it means that I have to make sure all instances of a script are stopped before I make changes. My preferred behavior would be that existing scripts continue running with old source code and that new instances use the new code (e.g. what happens for perl and python programs).
Do folks have any good workarounds for this behavior, other than pre-copying the shell script to a tempfile and running from that?
Thanks,
/YGA
Very slight addition to the other answers:
#!/bin/sh
{
# Your stuff goes here
exit
}
The exit at the end is important. Otherwise, the script file might still be accessed at the end to see if there are any more lines to interpret.
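As a rough illustration, the sketch below builds a throwaway script (the name job is made up) that appends a line to its own file while it runs, standing in for someone editing the script mid-run. Because the whole body sits inside braces and ends with exit, the shell has already parsed everything it will execute and never goes back to the file for the appended line.

```shell
tmpdir=$(mktemp -d)
cat > "$tmpdir/job" <<'EOF'
#!/bin/sh
{
  # Simulate an edit made while the script is running.
  echo 'echo "this line was added during the run"' >> "$0"
  echo "original code finished"
  exit
}
EOF

output=$(sh "$tmpdir/job")
echo "$output"
rm -rf "$tmpdir"
```

This only protects the script file itself; anything it sources or executes later is still read from disk at that moment.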
This question was later reposted here: Can a shell script indicate that its lines be loaded into memory initially?
Make sure the shell has to parse the whole file before executing any of it:
#!/bin/ksh
{
all the original script here
}
That does the trick.
Incidentally, with Perl (and I assume Python), the program parses the entire file before executing any of it, exactly as recommended here. Which is why you don't usually run into the problem with Perl or Python.
The desired behavior may not be possible, depending on complexity of the shell scripts that are involved.
If the full shell script is contained in a single source file, and that file is fully parsed before execution, then the shell script is generally safe from modifications to the copy on the disc during execution. Wrapping all the executable statements into a function (or series of functions) will generally achieve the goal you are after.
#!/bin/sh
doit()
{
# Stuff goes here
}
# Main
doit
The difficulty comes when the shell script "includes" other shell scripts (e.g. ".", or "source"). If these includes are wrapped in a function, they are not parsed until that statement is reached in the flow of execution. This makes the shell script vulnerable to changes to that external code.
In addition, if the shell script runs any external program (e.g. shell script, compiled program, etc), that result is not captured until that point in the execution is reached (if ever).
#!/bin/sh
doit()
{
if some_condition ; then
resultone=$(external_program)
fi
}
# Main
doit
This answer contains a robust and self-contained way to make a script resistant to this problem: have the script copy and re-execute itself like this:
#!/bin/bash
if [[ $0 != /tmp/copy-* ]] ; then
rm -f /tmp/copy-$$
cp "$0" /tmp/copy-$$
# "$@" passes the original arguments on to the copy
exec /tmp/copy-$$ "$@"
echo "error copying and execing script"
exit 1
fi
rm "$0"
# rest of script...
(This will not work if the path of the original script begins with the characters /tmp/copy-.)
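A variation on the same idea, offered only as a sketch: let mktemp pick the name of the copy and use an environment variable as the marker instead of the path prefix, which avoids the restriction on the script's name. RUNNING_FROM_COPY and the file name job are made up for this example, and the copy is run through bash explicitly so it does not need execute permission.

```shell
tmpdir=$(mktemp -d)
cat > "$tmpdir/job" <<'EOF'
#!/bin/bash
# RUNNING_FROM_COPY is a hypothetical marker variable for this sketch.
if [ -z "${RUNNING_FROM_COPY:-}" ]; then
  copy=$(mktemp) || exit 1
  cp -- "$0" "$copy" || exit 1
  # Re-execute from the private copy, passing the arguments along.
  RUNNING_FROM_COPY=$copy exec bash "$copy" "$@"
fi
# From here on we are running from the copy; remove it when done.
trap 'rm -f -- "$RUNNING_FROM_COPY"' EXIT
echo "args: $*"
EOF

output=$(bash "$tmpdir/job" one two)
echo "$output"
rm -rf "$tmpdir"
```

One trade-off: the marker variable is inherited by child processes, so a script that launches other scripts using the same trick should unset it first.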