Is a pipeline guaranteed to create a subshell in any POSIX shell?

This shell script behaves as expected.
trap 'echo exit' EXIT
foo()
{
exit
}
echo begin
foo
echo end
Here is the output.
$ sh foo.sh
begin
exit
This shows that the script exits while executing foo.
Now see the following script.
trap 'echo exit' EXIT
foo()
{
exit
}
echo begin
foo | cat
echo end
The only difference here is that the output of foo is being piped into cat. Now the output looks like the following.
begin
end
exit
This shows that the script does not exit while executing foo because end is printed.
I believe this happens because in bash a pipeline causes a subshell to be opened, so foo | cat is equivalent to (foo) | cat.
Is this behaviour guaranteed in any POSIX shell? I could not find anything in the POSIX standard at http://pubs.opengroup.org/onlinepubs/9699919799/ that implies that a pipeline must lead to a subshell. Can someone confirm if this behaviour can be relied upon?

In 2.12 Shell Execution Environment you find this quote:
A subshell environment shall be created as a duplicate of the shell environment, except that signal traps that are not being ignored shall be set to the default action. Changes made to the subshell environment shall not affect the shell environment. Command substitution, commands that are grouped with parentheses, and asynchronous lists shall be executed in a subshell environment. Additionally, each command of a multi-command pipeline is in a subshell environment; as an extension, however, any or all commands in a pipeline may be executed in the current environment. All other commands shall be executed in the current shell environment.
Where the key sentence for this question is
Additionally, each command of a multi-command pipeline is in a subshell environment; as an extension, however, any or all commands in a pipeline may be executed in the current environment
So without the extension, it looks like you can assume there will be a subshell for each part of the pipeline, but the exception means you can't quite count on that. bash uses the extension for things like lastpipe. I thought it also applied to the first element of a pipeline, but apparently it does not, or at least not always.
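If a script must not depend on whether the shell uses that extension, one option is to ask for the subshell explicitly with parentheses, which the quoted paragraph does guarantee. A minimal sketch, reusing foo from the question (the exit status 7 and the after_pipeline variable are illustrative assumptions, not part of the original):

```shell
#!/bin/sh
foo() {
    exit 7
}

# Parentheses force a subshell in every POSIX shell, so the exit
# inside foo ends only that subshell, never the calling script.
(foo) | cat

# Reached regardless of how the shell implements pipelines.
after_pipeline="reached"
echo "$after_pipeline"
```

The same idea works the other way round: code that needs a side effect to survive should keep that command out of the pipeline altogether rather than hope the shell runs it in the current environment.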

Related

Trap bash errors from child script

I am calling a bash script (say child.sh) from my bash script (say parent.sh), and I would like to have the errors in the child script (child.sh) trapped in my parent script (parent.sh).
I read through the Medium article and the Stack Exchange post. Based on those, I thought I should use set -E in my parent script so that the traps are inherited by subshells. Accordingly, my code is as follows:
parent.sh
#!/bin/bash
set -E
error() {
echo -e "$0: \e[0;33mERROR: The Zero Touch Provisioning script failed while running the command $BASH_COMMAND at line $BASH_LINENO.\e[0m" >&2
exit 1
}
trap error ERR
./child.sh
child.sh
#!/bin/bash
ls -al > /dev/null
cd non_exisiting_dir #To simulate error
echo "$0: I am still continuing after error"
Output
./child.sh: line 5: cd: non_exisiting_dir: No such file or directory
./child.sh: I am still continuing after error
Can you please let me know what I am missing so that the traps defined in the parent script are inherited?
./child.sh does not run in a "subshell".
A subshell is not simply a child process of your shell that happens to be a shell too. It is a special environment in which the commands inside (...), $(...) or ...|... are run, and it is usually implemented by forking the current shell without executing another shell.
If you want to run child.sh in a subshell, then source that script from a subshell you can create with (...):
(. ./child.sh)
which will inherit your ERR trap because of set -E.
Notes:
There are other places where bash runs the commands in a subshell: process substitutions (<(...)), coprocesses, the command_not_found_handle function, etc.
In some shells (but not in bash) the leftmost command from a pipeline is not run in a subshell. For instance ksh -c ':|a=2; echo $a' will print 2. Also, not all shells implement subshells by forking a separate process.
Even if bash infamously allows functions to be exported to other bash scripts via the environment (with export -f funcname), that's AFAIK not possible with traps ;-)
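The difference can be tried out without touching the real scripts. The following is only a sketch: it writes a hypothetical child.sh into a temporary directory, and the trap body (echo parent-trap-fired) and the directory name are made up for illustration. The parent logic is run through bash -c because set -E and the ERR trap are bash features.

```shell
#!/bin/sh
tmpdir=$(mktemp -d)

# Hypothetical child script with a failing command in the middle.
cat >"$tmpdir/child.sh" <<'EOF'
cd /nonexistent_dir_for_demo
echo "child: still continuing after error"
EOF

# Parent logic: with set -E the ERR trap is inherited by the (...)
# subshell, and sourcing child.sh runs its commands in that subshell.
result=$(bash -c '
    set -E
    trap "echo parent-trap-fired" ERR
    (. "$1/child.sh")
' _ "$tmpdir" 2>/dev/null)

rm -rf "$tmpdir"
echo "$result"
```

Because this trap only prints a message, the child carries on after the failing cd. A trap that calls exit, like the error function in the question, would end the subshell at that point instead.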

How to run a time-limited background command and read its output (without timeout command)

I'm looking at https://stackoverflow.com/a/10225050/1737158
In the same question there is an answer that uses the timeout command, but it is not available on all OSes, so I want to avoid it.
What I try to do is:
demo="$(top)" &
TASK_PID=$!
sleep 3
echo "TASK_PID: $TASK_PID"
echo "demo: $demo"
I expect to have nothing in the $demo variable, since the top command never ends.
I do get an empty result, which is "acceptable". But when I reuse the same approach with a command that should return a value, I still get an empty result, which is not OK. E.g.:
demo="$(uptime)" &
TASK_PID=$!
sleep 3
echo "TASK_PID: $TASK_PID"
echo "demo: $demo"
This should return the uptime result but it doesn't. I also tried to kill the process by TASK_PID but I always get. If a command fails, I expect to have stderr captured somehow. It can be in a different variable, but it has to be captured and not leaked out.
What happens when you execute var=$(cmd) &
Let's start by noting that the simple command in bash has the form:
[variable assignments] [command] [redirections]
for example
$ demo=$(echo 313) declare -p demo
declare -x demo="313"
According to the manual:
[..] the text after the = in each variable assignment undergoes tilde expansion, parameter expansion, command substitution, arithmetic expansion, and quote removal before being assigned to the variable.
Also, after the [command] above is expanded, the first word is taken to be the name of the command, but:
If no command name results, the variable assignments affect the current shell environment. Otherwise, the variables are added to the environment of the executed command and do not affect the current shell environment.
So, as expected, when demo=$(cmd) is run, the result of $(..) command substitution is assigned to the demo variable in the current shell.
Another point to note is related to the background operator &. It operates on so-called lists, which are sequences of one or more pipelines. Also:
If a command is terminated by the control operator &, the shell executes the command asynchronously in a subshell. This is known as executing the command in the background.
Finally, when you say:
$ demo=$(top) &
# ^^^^^^^^^^^ simple command, consisting ONLY of variable assignment
that simple command is executed in a subshell (call it s1), inside which $(top) is executed in another subshell (call it s2). The result of this command substitution is assigned to the variable demo inside the shell s1. Since no command follows the variable assignment, s1 then terminates, and the parent shell never receives the variables set in the child (s1).
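This can be checked with a command that does terminate. The sketch below first repeats the background assignment and then, as one possible workaround, lets the background job write to a temporary file that the parent reads after wait. The variable parent_sees and the temporary file are assumptions for illustration only.

```shell
#!/bin/sh
unset demo

# The assignment happens in the background subshell (s1) and is lost.
demo=$(echo 313) &
wait
parent_sees=${demo-unset}
echo "after background assignment: $parent_sees"

# Workaround: run only the command in the background, send its output
# to a file, and do the assignment in the parent shell afterwards.
tmp=$(mktemp)
echo 313 >"$tmp" &
wait
demo=$(cat "$tmp")
rm -f "$tmp"
echo "after reading the file: $demo"
```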
Communicating with a background process
If you're looking for a reliable way to communicate with the process run asynchronously, you might consider coprocesses in bash, or named pipes (FIFO) in other POSIX environments.
Coprocess setup is simpler, since coproc will set up the pipes for you, but note that you might not be able to read them reliably if the process terminates before writing any output.
#!/bin/bash
coproc top -b -n3
cat <&${COPROC[0]}
FIFO setup would look something like this:
#!/bin/bash
# fifo setup/clean-up
tmp=$(mktemp -td)
mkfifo "$tmp/out"
trap 'rm -rf "$tmp"' EXIT
# bg job, terminates after 3s
top -b >"$tmp/out" -n3 &
# read the output
cat "$tmp/out"
but note, if a FIFO is opened in blocking mode, the writer won't be able to write to it until someone opens it for reading (and starts reading).
Killing after timeout
How you'll kill the background process depends on what setup you've used, but for a simple coproc case above:
#!/bin/bash
coproc top -b
sleep 3
kill -INT "$COPROC_PID"
cat <&${COPROC[0]}
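For shells without coproc, a similar effect can be approximated with temporary files and a watchdog subshell. This is a sketch rather than a drop-in replacement for timeout: the task (sh -c 'echo started; exec sleep 30'), the 1-second limit and all variable names are assumptions chosen for the demonstration. stdout and stderr go to separate files, so neither leaks to the terminal.

```shell
#!/bin/sh
out=$(mktemp)
err=$(mktemp)

# Hypothetical long-running task: prints one line, then blocks.
# exec makes sleep replace the sh process, so kill reaches it directly.
sh -c 'echo started; exec sleep 30' >"$out" 2>"$err" &
task_pid=$!

# Watchdog: terminate the task if it is still alive after 1 second.
( sleep 1; kill "$task_pid" 2>/dev/null ) &
watchdog_pid=$!

# wait returns the task's exit status (above 128 if a signal ended it).
status=0
wait "$task_pid" || status=$?

# Stop the watchdog in case the task finished on its own.
kill "$watchdog_pid" 2>/dev/null || true

demo=$(cat "$out")
errors=$(cat "$err")
rm -f "$out" "$err"

echo "status: $status"
echo "demo: $demo"
echo "errors: $errors"
```

Whatever the task wrote before it was killed is still in the files, so partial output from a command that never ends (such as top -b) can be collected the same way.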

What does "builtin commands" mean in bash?

I read part of the bash manual, the section "COMMAND EXECUTION ENVIRONMENT". It says,
Builtin commands that are invoked as part of a pipeline are also executed in a subshell environment. Changes made to the subshell environment cannot affect the shell's execution environment.
I suppose it means that a value changed in a pipeline is local, because each command in the pipeline runs in its own subshell, as in the following:
value='1'
echo "Before pipe, ${value}"
value='2' | echo "${value}" | value='3' | echo "In another pipe, ${value}"
echo "After pipe, ${value}"
Before pipe, 1
In another pipe, 1
After pipe, 1
I read "SHELL BUILTIN COMMANDS" in the bash manual, but I could not find "=" listed as a builtin command. What does "builtin commands" mean here? And are there "non-builtin commands" whose changes take effect globally even in a pipeline?
And if you don't mind, please let me know where else a new subshell is created, apart from:
(...)
pipeline |
I think that the manual is basically saying that built-in commands, such as echo, printf, read, etc., don't get any special treatment and still run within their own subshell, even though in principle it would be possible for the shell to determine that all of the commands in the pipeline could be run natively in the same shell.
If you ask to pipe one command into another, then sub-shells are created, no matter what is on either side of the pipe.
For example:
echo string | read foo
uses the two built-ins, echo and read but the variable $foo ceases to exist after the pipeline finishes.
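A small sketch of that example, followed by one common way to keep the value: feeding read from a here-document instead of a pipe, so that read runs in the current shell environment. The echo lines are only there to make the difference visible.

```shell
#!/bin/sh
unset foo
echo string | read foo
# In bash (by default) and dash, foo is still unset here; shells that
# run the last pipeline command in the current environment keep it.
echo "after pipeline: ${foo-<unset>}"

# Portable alternative: no pipeline, so no subshell for read.
read -r foo <<EOF
string
EOF
echo "after here-document: $foo"
```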

Exiting a shell script with an error

I have written a shell script for a homework assignment that works fine; however, I am having issues with exiting. Essentially, the script reads numbers from the user until it reads a negative number and then produces some output. I have the script set to exit and output an error code when it receives anything but a number, and that's where the issue is.
The code is as follows:
# test -eq fails with an error for non-integers; discard the message
if test "$number" -eq "$number" >/dev/null 2>&1
then
    : # do stuff
else
    echo "There was an error"
    exit 1
fi
The problem is that we have to turn in our programs as text files using script and whenever I try to script my program and test the error cases it exits out of script as well. Is there a better way to do this?
The script is being run with the following command in the terminal
script "insert name of program here"
Thanks
If the program you're testing is invoked as a subprocess, then any exit command will only exit the command itself. The fact that you're seeing contrary behavior means you must be invoking it differently.
When invoking your script from the parent testing program, use:
# this runs "yourscript" as its own, external process.
./yourscript
...to invoke it as a subprocess, not
# this is POSIX-compliant syntax to run the commands in "yourscript" in the current shell.
. yourscript
...or...
# this is bash-extended syntax to run the commands in "yourscript" in the current shell.
source yourscript
...as either of the latter will run all the commands -- including exit -- inside your current shell, modifying its state or, in the case of exit, exec or similar, telling it to cease execution.
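The two behaviours can be compared side by side. In this sketch a temporary file stands in for yourscript and contains nothing but exit 3 (both are assumptions for the demonstration). To keep the demo itself alive, the sourcing is done inside (...), which confines exit to that subshell.

```shell
#!/bin/sh
script=$(mktemp)
printf 'exit 3\n' >"$script"

# Run as a separate process: exit ends only that child process.
child_status=0
sh "$script" || child_status=$?

# Source it inside an explicit subshell: exit ends only the subshell.
# Without the parentheses, this line would terminate the caller.
sourced_status=0
( . "$script" ) || sourced_status=$?

rm -f "$script"
echo "child: $child_status, sourced in subshell: $sourced_status, caller still running"
```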

Execute bash commands from a Rakefile

I would like to execute a number of bash commands from a Rakefile.
I have tried the following in my Rakefile
task :hello do
%{echo "World!"}
end
but upon executing rake hello there is no output?
How do I execute bash commands from a Rakefile?
NOTE: This is not a duplicate, as it's specifically asking how to execute bash commands from a Rakefile.
I think the way rake wants this to happen is with: http://rubydoc.info/gems/rake/FileUtils#sh-instance_method
Example:
task :test do
sh "ls"
end
The built-in rake function sh takes care of the return value of the command (the task fails if the command has a return value other than 0), and in addition it also prints the command's output.
There are several ways to execute shell commands in ruby. A simple one (and probably the most common) is to use backticks:
task :hello do
`echo "World!"`
end
Backticks have a nice effect where the standard output of the shell command becomes the return value. So, for example, you can get the output of ls by doing
shell_dir_listing = `ls`
But there are many other ways to call shell commands, and they all have benefits and drawbacks and work differently. This article explains the choices in detail, but here's a quick summary of the possibilities:
stdout = %x{cmd} - Alternate syntax for backticks; behind the scenes it's doing the same thing
exec(cmd) - Completely replace the running process with a new cmd process
success = system(cmd) - Run a subprocess and return true/false on success/failure (based on cmd exit status)
IO.popen(cmd) { |io| } - Run a subprocess and connect its stdout (and its stdin, in a writable mode such as "r+") to io
stdin, stdout, stderr = Open3.popen3(cmd) - Run a subprocess and connect to all pipes (in, out, err)
Given that the consensus seems to prefer rake's #sh method, but OP explicitly requests bash, this answer may have some use.
This is relevant since Rake#sh uses the Kernel#system call to run shell commands. Ruby hardcodes that to /bin/sh, ignoring the user's configured shell or $SHELL in the environment.
Here's a workaround which invokes bash from /bin/sh, allowing you to still use the sh method:
task :hello_world do
  sh <<-EOS.strip_heredoc, {verbose: false}
    /bin/bash -xeu <<'BASH'
    echo "Hello, world!"
    BASH
  EOS
end

class String
  def strip_heredoc
    gsub(/^#{scan(/^[ \t]*(?=\S)/).min}/, ''.freeze)
  end
end
#strip_heredoc is borrowed from rails:
https://github.com/rails/rails/blob/master/activesupport/lib/active_support/core_ext/string/strip.rb
You could probably get it by requiring active_support, or maybe it's autoloaded when you're in a rails project, but I was using this outside rails and so had to def it myself.
There are two heredocs, an outer one with the markers EOS and an inner one with the markers BASH.
The way this works is by feeding the inside heredoc between the BASH markers to bash's stdin. Note that it is running within the context of /bin/sh, so it's a posix heredoc, not a ruby one. Normally that requires the end marker to be in column 1, which isn't the case here because of the indenting.
However, because it's wrapped within a ruby heredoc, the strip_heredoc method applied there de-indents it, placing the entirety of the left side of the inner heredoc in column 1 prior to /bin/sh seeing it.
/bin/sh also would normally expand variables within the heredoc, which could interfere with the script. The single quotes around the start marker, 'BASH', tell /bin/sh not to expand anything inside the heredoc before it is passed to bash.
However, the outer Ruby heredoc (<<-EOS) still processes backslash escapes before the string ever reaches /bin/sh. That means backslash escapes have to be doubled to make it through to bash, i.e. \ becomes \\.
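The effect of quoting the heredoc start marker can be seen directly in a POSIX shell, independent of rake. A minimal sketch, where the variable name and the EOF marker are arbitrary choices for illustration:

```shell
#!/bin/sh
name=world

# Unquoted delimiter: the shell expands $name inside the here-document.
expanded=$(cat <<EOF
hello $name
EOF
)

# Quoted delimiter: the body is passed through literally.
literal=$(cat <<'EOF'
hello $name
EOF
)

echo "$expanded"
echo "$literal"
```

The first echo prints the expanded text, while the second prints the line with $name left untouched. That is the behaviour the workaround relies on to hand the script to bash unmodified.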
The bash options -xeu are optional.
The -eu arguments tell bash to run in strict mode, which stops execution upon any failure or reference to an undefined variable. This will return an error to rake, which will stop the rake task. Usually, this is what you want. The arguments can be dropped if you want normal bash behavior.
The -x option to bash and {verbose: false} argument to #sh work in concert so that rake only prints the bash commands which are actually executed. This is useful if your bash script isn't meant to run in its entirety, for example, if it has a test which allows it to exit gracefully early in the script.
Be careful to not set an exit code other than 0 if you don't want the rake task to fail. Usually, that means you don't want to use any || exit constructs without setting the exit code explicitly, i.e. || exit 0.
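The exit statuses involved can be checked from a plain shell. This sketch uses sh -eu rather than /bin/bash -xeu, on the assumption that -e and -u behave the same way for these simple cases; the variable names are illustrative.

```shell
#!/bin/sh
# -e stops at the first failing command, so echo never runs and the
# failure status is returned to the caller (rake, in the answer above).
strict_status=0
strict_out=$(sh -eu -c 'false; echo "not reached"') || strict_status=$?

# A bare "exit" reuses the status of the command that just failed...
bare_status=0
sh -c 'false || exit' || bare_status=$?

# ...while "exit 0" reports success, so the rake task would not fail.
explicit_status=0
sh -c 'false || exit 0' || explicit_status=$?

echo "strict=$strict_status bare=$bare_status explicit=$explicit_status"
```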
%{echo "World!"} defines a String. I expect you wanted %x{echo "World!"}.
%x{echo "World!"} executes the command and returns the output (stdout). You will not see the result. But you may do:
puts %x{echo "World!"}
There are more ways to call a system command:
Backticks: `
system( cmd )
popen
Open3#popen3
There are two ways:
sh " expr "
or
%x( expr )
Mind that ( expr ) can be { expr } , | expr | or ` expr `
The difference is that sh "expr" is a method provided by Rake, while %x( expr ) is built-in Ruby syntax. Their results and behaviour differ. Here is an example:
task :default do
value = sh "echo hello"
puts value
value = %x(echo world)
puts value
end
get:
hello # from sh "echo hello"
true # from puts value
world # from puts value
You can see that %x( expr ) runs the shell expression but does not show its stdout on the screen. So you'd better use %x( expr ) when you need the command's result.
But if you just want to run a shell command, I recommend sh "expr", because sh "irb" will drop you into the irb shell, while %x(irb) will hang.

Resources