When does a subshell inherit its parent shell's environment? - bash

Under what circumstances is the environment of the shell passed to the sub-shell?

A subshell always gets all variables from the parent shell.
man bash describes all the circumstances in which a subshell is used, mainly:
command &
command | command
( command )
The so-called environment includes only environment variables (those marked with export) and is passed on to every subprocess, even when invoking bash -c command, which is not a subshell but a completely new bash instance.
In both cases changed values are not passed back to the parent process.
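A minimal sketch of the distinction above, assuming bash is on the PATH. The variable names (plain_var, exported_var) are made up for illustration.

```shell
#!/usr/bin/env bash
# Illustrative names only; plain_var is deliberately NOT exported.
plain_var="shell only"
export exported_var="in environment"

# A subshell is a copy of the current shell, so it sees both variables.
subshell_view=$( ( echo "${plain_var:-unset} / ${exported_var:-unset}" ) )

# bash -c starts a completely new bash; only the exported variable arrives.
new_shell_view=$(bash -c 'echo "${plain_var:-unset} / ${exported_var:-unset}"')

# A change made inside a subshell is not passed back to the parent.
( plain_var="changed in subshell" )

echo "subshell : $subshell_view"
echo "bash -c  : $new_shell_view"
echo "parent   : $plain_var"
```

The subshell reports both values, the new bash instance reports plain_var as unset, and the parent still holds its original value afterwards.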

Related

Do bash scripts execute in new shells or subshells?

I am running a bash script from my bash interactive shell as:
./shell.sh
The confusion I am having is, will this script run inside a new shell instance or a subshell of my current bash instance?
I assume that all shell scripts invoked from a shell run inside a new shell therefore they aren't able to read the local shell variables of the invoking shell.
Also, if I put "echo $BASH_SUBSHELL" in my invoked script it returns me a value of "0" showing that it isn't a subshell. But according to some articles they say that a shell script when executed from a shell invokes a subshell. Please help.
You're correct; when you run a script with ./shell.sh, it runs in a new shell, not a subshell of the current shell.
It does run in a subprocess, which is a shell, so it's a tempting and common mistake to say "subprocess+shell=subshell, so it must be a subshell!" But that's incorrect. The shell running the script won't inherit shell variables from the parent shell process (it'll inherit environment variables, i.e. exported variables, but that's true of any subprocess), it won't inherit shell modes (e.g. set -e) or other shell state, and it won't even necessarily be running the same shell (if you're running bash and the script has a #!/bin/zsh shebang, it'll run in zsh). So it's logically a different shell that just happens to be running as a subprocess of the shell that launched it.
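One way to see the difference is to run the same lines once as a script and once sourced inside a subshell. This is only a sketch: the temporary file stands in for shell.sh, local_var is a made-up name, and the script is started with bash "$script", which, like ./shell.sh with a bash shebang, launches a separate bash process.

```shell
#!/usr/bin/env bash
# Hypothetical throwaway script standing in for shell.sh.
script=$(mktemp)
cat > "$script" <<'EOF'
#!/usr/bin/env bash
case $- in (*f*) mode=noglob-on ;; (*) mode=noglob-off ;; esac
echo "${BASH_SUBSHELL} ${local_var:-unset} ${mode}"
EOF

local_var=parent_only   # shell variable, not exported
set -f                  # a shell mode (noglob) set in the parent

in_script=$(bash "$script")    # new shell: fresh state, BASH_SUBSHELL is 0
in_subshell=$(. "$script")     # subshell: a copy of the parent's state

set +f
echo "script   : $in_script"
echo "subshell : $in_subshell"
rm -f "$script"
```

The script run reports 0, an unset variable, and noglob off. The subshell run reports a non-zero BASH_SUBSHELL together with the parent's variable and mode.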

When are bash variables exported to subshells and/or accessible by scripts?

I'm confused over whether bash variables are exported to subshells and when they are accessible by scripts. My experience so far led me to believe that bash variables are automatically available to subshells. E.g.:
> FOO=bar
> echo $FOO
bar
> (echo $FOO)
bar
The above appears to demonstrate that bash variables are accessible in subshells.
Given this script:
#! /usr/bin/bash
# c.sh
func()
{
echo before
echo ${FOO}
echo after
}
func
I understand that calling the script in the current shell context gives it access to the current shell's variables:
> . ./c.sh
before
bar
after
If I were to call the script without the "dot space" prefix...
> ./c.sh
before
after
...isn't it the case that the script is called in a subshell? If so, and it's also true that the current shell's variables are available to subshells (as I inferred from the first code block), why is $FOO not available to c.sh when run this way?
Similarly, why is $FOO also unavailable when c.sh is run within parentheses - which I understood to mean running the expression in a subshell:
> (./c.sh)
before
after
(If this doesn't muddy this post with too many questions: if "./c.sh" and "(./c.sh)" both run the script in a subshell of the current shell, what's the difference between the two ways of calling?)
(...) runs ... in a separate environment, something most easily achieved (and implemented in bash, dash, and most other POSIX-y shells) using a subshell -- which is to say, a child created by fork()ing the old shell, but not calling any execv-family function. Thus, the entire in-memory state of the parent is duplicated, including non-exported shell variables. And for a subshell, this is precisely what you typically want: just a copy of the parent shell's process image, not replaced with a new executable image and thus keeping all its state in place.
Consider (. shell-library.bash; function-from-that-library "$preexisting_non_exported_variable") as an example: Because of the parens it fork()s a subshell, but it then sources the contents of shell-library.bash directly inside that shell, without replacing the shell interpreter created by that fork() with a separate executable. This means that function-from-that-library can see non-exported functions and variables from the parent shell (which it couldn't if it were execve()'d), and is a bit faster to start up (since it doesn't need to link, load, and otherwise initialize a new shell interpreter as happens during execve() operation); but also that changes it makes to in-memory state, shell configuration, and process attributes like working directory won't modify the parent interpreter that called it (as would be the case if there were no subshell and it weren't fork()'d), so the parent shell is protected from having configuration changes made by the library that could modify its later operation.
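The library example can be sketched as follows. The file contents, the function name report_and_meddle, and the variable parent_only are all hypothetical stand-ins for shell-library.bash and its callers.

```shell
#!/usr/bin/env bash
tmp=$(mktemp -d)

# Hypothetical library standing in for shell-library.bash.
cat > "$tmp/lib.bash" <<'EOF'
report_and_meddle() {
  echo "saw: $1 / ${parent_only:-unset}"
  cd /                      # would move the caller without the subshell
  parent_only="clobbered"   # would overwrite the caller's variable
}
EOF

parent_only="untouched"     # not exported
start_dir=$PWD

# Parentheses fork a subshell; the library is sourced inside that copy only.
( . "$tmp/lib.bash"; report_and_meddle "arg" ) > "$tmp/out"
result=$(cat "$tmp/out")

# Check what leaked back into the parent shell.
if type report_and_meddle >/dev/null 2>&1; then leaked=yes; else leaked=no; fi
same_dir=no; [ "$PWD" = "$start_dir" ] && same_dir=yes

echo "$result"
echo "parent_only=$parent_only leaked=$leaked same_dir=$same_dir"
rm -rf "$tmp"
```

The function sees the non-exported variable, yet its cd, its assignment, and the function definition itself all stay inside the subshell.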
./other-script, by contrast, runs other-script as a completely separate executable; it does not retain non-exported variables after the child shell (which is not a subshell!) has been invoked. This works as follows:
The shell calls fork() to create a child. At this point in time, the child still has even non-exported variable state copied.
The child honors any redirections (if it was ./other-script >>log.out, the child would open("log.out", O_WRONLY|O_APPEND|O_CREAT, 0666) and then dup2() the descriptor over to 1, overwriting stdout).
The child calls execv("./other-script", {"./other-script", NULL}), instructing the operating system to replace it with a new instance of other-script. After this call succeeds, the process running under the child's PID is an entirely new program, and only exported variables survive.
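The three steps can be approximated from the shell itself, as a rough sketch: parentheses supply the fork(), a redirection on the command supplies the open/dup2 step, and the exec builtin supplies the execv(). The variable names and the temporary log file are made up for illustration.

```shell
#!/usr/bin/env bash
hidden=shell_only        # not exported
export visible=exported

log=$(mktemp)

# Step 1 only: after the fork, the child still has the non-exported variable.
before_exec=$( ( echo "${hidden:-unset} ${visible:-unset}" ) )

# Steps 1-3: fork, redirect stdout to the log, then replace the child with a
# new program. Only exported variables survive the exec.
( exec bash -c 'echo "${hidden:-unset} ${visible:-unset}"' >>"$log" )
after_exec=$(cat "$log")

echo "before exec: $before_exec"
echo "after exec : $after_exec"
rm -f "$log"
```

Before the exec both values are visible; after it, hidden is reported as unset.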

Why can you set environment variables in Bash functions but not in the script itself

Why does this work:
# a.sh
setEnv() {
export TEST_A='Set'
}
when this doesn't:
# b.sh
export TEST_B='Set'
Ex:
> source a.sh
> setEnv
> env | grep TEST_A
TEST_A=Set
> b.sh
> env | grep TEST_B
I understand why running the script doesn't work and what to do to make it work (source b.sh etc), but I'm curious to why the function works.
This is on OS X if that matters.
You need to understand the difference between sourcing and executing a script.
Sourcing runs the script in the shell from which it is invoked; any variables it sets remain until that shell terminates (the terminal is closed) or until they are reset or unset.
Executing starts a new shell as a child process of the invoking shell; variables the script sets, including exported ones, exist only in that child's environment and are destroyed when the script terminates.
In other words, when you source a script, no separate child environment is created to hold the variables: they are simply added to the parent's environment, which lasts as long as the session is open. Executing a script is, as a rough analogy, like calling a function whose variables are stored on the stack and go out of scope when the call returns. Likewise, the child shell's environment goes out of scope when it terminates.
So it comes down to this, even if you have a function to export your variable, if you don't source it to the current shell and just plainly execute it, the variable is not retained; i.e.
# a.sh
setEnv() {
export TEST_A='Set'
}
and if you run it in the shell as
bash a.sh # executed, NOT sourced: setEnv is defined only in the child shell
env | grep TEST_A
# empty
Executing a function does not, in and of itself, start a new process like b.sh does.
From the man page (emphasis on the last sentence):
FUNCTIONS
A shell function, defined as described above under SHELL GRAMMAR,
stores a series of commands for later execution. When the name of a
shell function is used as a simple command name, the list of commands
associated with that function name is executed. **Functions are executed
in the context of the current shell; no new process is created to
interpret them (contrast this with the execution of a shell script).**
Quoting the question: "I understand why running the script doesn't work and what to do to make it work (source b.sh etc)"
So you already understand the fact that executing b.sh directly -- in a child process, whose changes to the environment fundamentally won't be visible to the current process (shell) -- will not define TEST_B in the current (shell) process, so we can take this scenario out of the picture.
Quoting the question: "I'm curious why the function works."
When you source a script, you execute it in the context of the current shell - loosely speaking, it is as if you had typed the contents of the script directly at the prompt: any changes to the environment, including shell-specific elements such as shell variables, aliases, functions, become visible to the current shell.
Therefore, after executing source a.sh, function setEnv is now available in the current shell, and invoking it executes export TEST_A='Set', which defines environment variable TEST_A in the current shell (and subsequently created child processes would see it).
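A minimal sketch of that sequence, using a temporary file in place of a.sh (the path is an assumption; setEnv and TEST_A are from the question):

```shell
#!/usr/bin/env bash
# Temporary file standing in for a.sh.
lib=$(mktemp)
cat > "$lib" <<'EOF'
setEnv() {
  export TEST_A='Set'
}
EOF

unset TEST_A

bash "$lib"                       # executed: setEnv exists only in the child
after_execute=${TEST_A:-unset}

source "$lib"                     # sourced: setEnv is now defined here
after_source=${TEST_A:-unset}     # still unset, the function has not run yet

setEnv                            # runs in the current shell, no new process
after_call=${TEST_A:-unset}
seen_by_child=$(bash -c 'echo "${TEST_A:-unset}"')

echo "$after_execute $after_source $after_call $seen_by_child"
rm -f "$lib"
```

Only calling the function in the current shell defines TEST_A there, and from then on child processes inherit it.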
Perhaps your misconception is around what chepner's helpful answer addresses: in POSIX-like shells, functions run in the current shell - in contrast with scripts (when run without source), for which a child process is created.
Quoting the question: "This is on OS X if that matters."
Not in this case, because only functionality built into bash itself is used.

What is the difference between an inline variable assignment and a regular one in Bash?

What is the difference between:
prompt$ TSAN_OPTIONS="suppressions=/somewhere/file" ./myprogram
and
prompt$ TSAN_OPTIONS="suppressions=/somewhere/file"
prompt$ ./myprogram
The thread-sanitizer library gives the first case as how to get their library (used within myprogram) to read the file given in options. I read it, and assumed it was supposed to be two separate lines, so ran it as the second case.
The library doesn't use the file in the second case, where the environment variable and the program execution are on separate lines.
What's the difference?
Bonus question: How does the first case even run without error? Shouldn't there have to be a ; or && between them? The answer to this question likely answers my first...
The format VAR=value command sets the variable VAR to have the value value in the environment of the command command. The relevant section of the POSIX spec is Simple Commands. Specifically:
Otherwise, the variable assignments shall be exported for the execution environment of the command and shall not affect the current execution environment except as a side-effect of the expansions performed in step 4.
The format VAR=value; command sets the shell variable VAR in the current shell and then runs command as a child process. Unless VAR has already been exported, the child process knows nothing about it.
The mechanism by which a process exports (hint hint) variables to be seen by child processes is setting them in its environment before running the child process. The shell built-in which does this is export. This is why you often see export VAR=value and VAR=value; export VAR.
The syntax you are discussing is a short-form for something akin to:
VAR=value
export VAR
command
unset -v VAR
only without using the current process environment at all.
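The three forms can be compared side by side. In this sketch, DEMO_OPT is a made-up variable and bash -c stands in for ./myprogram.

```shell
#!/usr/bin/env bash
unset DEMO_OPT
out=$(mktemp)

# Prefix form: DEMO_OPT is placed in the environment of this one command only.
DEMO_OPT=on bash -c 'echo "${DEMO_OPT:-unset}"' > "$out"
prefix_child=$(cat "$out")
after_prefix=${DEMO_OPT:-unset}   # the current shell was not affected

# Separate line: a plain shell variable, which the child never sees.
DEMO_OPT=on
plain_child=$(bash -c 'echo "${DEMO_OPT:-unset}"')

# Exported: now part of the environment of every later child.
export DEMO_OPT
exported_child=$(bash -c 'echo "${DEMO_OPT:-unset}"')

echo "$prefix_child $after_prefix $plain_child $exported_child"
rm -f "$out"
```

The child sees the value in the prefix and exported cases, but not after a plain assignment on its own line.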
To complement Etan Reisner's helpful answer:
It's important to distinguish between shell variables and environment variables:
Note: The following applies to all POSIX-compatible shells; bash-specific extensions are marked as such.
A shell variable is a shell-specific construct that is limited to the shell that defines it (with the exception of subshells, which get their own copies of the current shell's variables),
whereas an environment variable is inherited by any child process created by the current process (shell), whether that child process is itself a shell or not.
Note that all-uppercase variable names should only be used for environment variables.
Either way, a child process only ever inherits copies of variables, whose modification (by the child) does not affect the parent.
All environment variables are also shell variables (the shell ensures that),
but the inverse is NOT true: shell variables are NOT environment variables, unless explicitly designated or inherited as such - this designation is called exporting.
note that the off-by-default -a shell option (set with set -a, or passed to the shell itself as a command-line option) can be used to auto-export all shell variables.
Thus,
any variables you create implicitly by assignment - e.g., TSAN_OPTIONS="suppressions=/somewhere/file" - are ONLY shell variables, but NOT ALSO environment variables,
EXCEPT - perhaps confusingly - when prepended directly to a command - e.g. TSAN_OPTIONS="suppressions=/somewhere/file" ./myprogram - in which case they are ONLY environment variables, only in effect for THAT COMMAND.
This is what Etan's answer describes.
Shell variables become environment variables as well under the following circumstances:
based on environment variables that the shell itself inherited, such as $HOME
shell variables created explicitly with export varName[=value] or, in bash, also with declare -x varName[=value]
by contrast, in bash, using declare without -x, or using local in a function, creates mere shell variables
shell variables created implicitly while the off-by-default -a shell option is in effect (with limited exceptions)
Once a shell variable is marked as exported - i.e., marked as an environment variable - any subsequent changes to the shell variable update the environment variable as well; e.g.:
export TSAN_OPTIONS # creates shell variable *and* corresponding environment variable
# ...
TSAN_OPTIONS="suppressions=/somewhere/file" # updates *both* the shell and env. var.
export -p prints all environment variables
unset [-v] MYVAR undefines shell variable $MYVAR and also removes it as an environment variable, if applicable.
in bash:
You can "unexport" a given variable without also undefining it as a shell variable with export -n MYVAR - this removes MYVAR from the environment, but retains its current value as a shell variable.
declare -p MYVAR prints variable $MYVAR's current value along with its attributes; if the output starts with declare -x, $MYVAR is exported (is an environment variable)
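The life cycle described in this list can be traced with a short sketch. It is bash-specific, since it uses export -n and declare -p, and child_sees is a hypothetical helper that asks a new bash process what it inherited.

```shell
#!/usr/bin/env bash
unset MYVAR

# Hypothetical helper: report MYVAR as seen by a separate bash process.
child_sees() { bash -c 'echo "${MYVAR:-unset}"'; }

MYVAR=one;        as_shell_var=$(child_sees)    # plain shell variable
export MYVAR;     as_exported=$(child_sees)     # now also an environment variable
MYVAR=two;        after_update=$(child_sees)    # the update reaches the environment
attrs_exported=$(declare -p MYVAR)

export -n MYVAR;  after_unexport=$(child_sees)  # removed from the environment
attrs_plain=$(declare -p MYVAR)                 # value kept as a shell variable

echo "$as_shell_var $as_exported $after_update $after_unexport"
echo "$attrs_exported"
echo "$attrs_plain"
```

While the variable is exported, the declare -p output starts with declare -x; after export -n the -x attribute is gone but the value remains.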

Why are bash script variables not saving?

I have a simple bash script:
#!/bin/bash
JAVA_HOME=/usr
EC2_HOME=~/ec2-api
echo $EC2_HOME
export PATH=$PATH:$EC2_HOME/bin
I run the script like so
$ ./ec2
/Users/user/ec2-api
The script runs and produces the correct output.
However, when I now try to access the EC2_HOME variable, I get nothing out:
$ echo $EC2_HOME
I get a blank string back. What am I doing wrong?
Do either of the following instead:
source ec2
or
. ec2
(note that . is the POSIX form of the command, and source is bash's synonym for it)
Explanation:
This is because ./ec2 actually spawns a new child shell process from your current shell to execute the script (often loosely called a subshell), and a child process cannot affect the environment of the parent shell that spawned it.
Thus, EC2_HOME does get set to /Users/user/ec2-api correctly in the child shell (and similarly the PATH environment variable is updated and exported correctly there), but those changes won't propagate back to your parent shell.
Using source runs the script directly in the current shell without spawning a child process, so the changes made will persist.
(A note on export: export makes a variable part of the environment passed to child processes spawned from the current shell, whether or not they are shells. Variables you only use in the current shell need not be exported.)
A shell script can never modify the environment of its parent.
To fix your problem, you can use the dot (.) command:
$ . ./ec2
and that should work. In csh, it would be
% source ./ec2
To learn more about shells and scripts, my best resource by far is Unix Power Tools.

Resources