What is the purpose of these file descriptor closes? - bash

I'm following this example from the Advanced Bash-Scripting Guide, I/O redirection, that shows 'Redirecting only stderr to a pipe'. I understood how it works except for the part where it closes fd 3.
Why does it need to close fd 3 in each command when the last command closes it globally?
exec 3>&1
ls -l 2>&1 >&3 3>&- | grep bad 3>&-
exec 3>&-

In the shell, initially you have 1:terminal, 2:terminal. Then 1 is duplicated so that 3:terminal.
When ls is executed, the file descriptors are inherited, but the pipe replaces fd 1, so that you have 1:pipe, 2:terminal, 3:terminal; then the redirects make it 1:terminal, 2:pipe, (3:closed). Meanwhile, grep has 0 connected to the pipe, and inherits 1:terminal, 2:terminal, 3:terminal, but the redirect turns it into 1:terminal, 2:terminal, (3:closed).
Finally, back in the shell, 3 is closed, returning to the initial state of 1:terminal, 2:terminal.
The thing to understand is that file descriptors are inherited when a process is forked, but become independent from then on, so each process's descriptor 3 must be closed separately. In this case, there would probably be no harm in leaving it open for ls and grep, but for tidiness it's closed anyway.
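To make the walkthrough concrete, here is the same snippet again with each redirection annotated (a sketch; the comments assume, as above, that the shell's stdout is a terminal):

exec 3>&1                            # fd 3 := copy of the shell's stdout (the terminal)
ls -l 2>&1 >&3 3>&- | grep bad 3>&-  # in ls:   stderr -> pipe, stdout -> terminal, fd 3 closed
                                     # in grep: stdin <- pipe, fd 3 closed (it was never used)
exec 3>&-                            # close fd 3 in the shell itself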

Related

Background process, with stdin a pipe connected to fd 3 in the parent shell script

Is it possible, in portable shell, without using a named pipe, to start a background process and have that process's stdin be a pipe which is open on file descriptor 3 (or some other number greater than 2) in the parent shell? Another way to put it is that I want to do in portable shell what popen("some-program", "w") does in C.
Concretely, do what this snippet does, without using mkfifo:
mkfifo fifo
some-program < fifo &
exec 3> fifo
rm fifo
Note: "bash" tag added primarily for visibility. I understand that this is possible using the "coprocess" extension in bash, ksh, or zsh; I am looking for a technique that works in an environment where only the facilities of XPG Issue 6 (POSIX.1-2001) Shell and Utilities are available. In addition, answers that use a named pipe (via mkfifo most obviously) will not be accepted, as will answers that require root privileges.
Answers that use any of the following are discouraged, but I'll listen to an argument that it's not possible to do it without one or more of them:
anything that creates a machine-language executable (e.g. c99)
anything intended primarily for interactive use (e.g. vi)
anything that's not part of the "Base" spec in Issue 6, or is obsolescent in either Issue 6 or any later revision of POSIX (e.g. uucp)
I'll also listen to an answer that makes a convincing argument that this is impossible within the restrictions above.
(Why no named pipes? Because they don't behave quite the same as normal pipes, particularly with the older, buggier OSes where "just use #!/bin/bash" isn't an option. Also, exposing the pipe in the file system, however briefly, means you cannot completely exclude the possibility of some unrelated process opening the pipe.)
The only standard shell feature that provides for creating pipes is the | operator. If we assume no extensions are in play, then the spec says that the commands in a (multi-command) pipeline execute in subshell environments. Such an environment cannot modify the parent shell so as to make a file descriptor for the write end of the pipe available there, so the closest we can come is to make it available within the scope of a { compound-command } on the write end of the pipe. Example:
#!/bin/sh
exec 4>&1
{
  # Copy the write end of the pipe to FD 3, restore the parent
  # shell's original stdout, and close excess FD 4
  exec 3>&1 1>&4 4>&-
  use-fd-3-here
} | something-goes-here
exec 4>&-
Now, although it can be redirected, the initial standard input to an asynchronous command is (an equivalent of) /dev/null, so it won't work to put just some-command & on the read end of the pipe. But we can put another compound command around some-command, to give us a place to make a copy of the standard input file descriptor from the pipe. Example:
{
  some-command 0>&4 4>&-
} 4>&0 &
Those work together just fine. For example:
#!/bin/sh
# Copy the standard output FD to preserve access to it within processes in the
# pipeline
exec 4>&1
{
  # Rotate file descriptors:
  # - the write end of the pipe becomes 3
  # - the parent shell's standard output becomes 1
  # - excess FD 4 is closed for tidiness and safety
  exec 3>&1 1>&4 4>&-
  # This is piped into the standard input of a command running asynchronously.
  # In this example, that process will substitute a "!" for the "?" and output
  # the result
  echo 'piped?' 1>&3
  # This does not go to the async process
  echo 'not piped?'
} | {
  # Redirect the read end of the pipe to this process's standard input and
  # clean up the extra file descriptor.
  sed 's/?/!/' 0>&4 4>&-
} 4>&0 &
exec 4>&-
# wait for the async process to terminate
wait
Output:
not piped?
piped!
And if you were concerned about the relative order of the 4>&0 redirection and redirecting /dev/null to the standard input of the async process, then a little more compound-command declaration could make that unambiguous:
| {
  {
    sed 's/?/!/' 0>&4 4>&-
  } &
} 4>&0
Does it have to be backgrounded per se, and can you tolerate a different parent process?
#! /bin/sh
if [ -z "$REINVOKED" ]; then
  # First time we're running.
  # Save our stdout and launch some-program on a pipe connected to
  # our second invocation. (A new shell will replace this one.)
  exec 4>&1
  REINVOKED=true
  export REINVOKED
  exec /bin/sh -c "$0 | some-program" # <- WARNING
  exit 1
fi
# Second invocation.
#
exec 3>&1 # dup the pipe (our stdout) to fd3
exec 1>&4 # restore stdout from inherited fd4
exec 4>&- # close fd4
echo "READY" # goes to stdout
echo "READY" >&3 # goes to some-program
...
That dodgy environment variable usage to detect a second invocation of the script could be more robustly reimplemented as separate scripts. The WARNING line is where you will need to watch future revisions for command injection (CWE-78).
The process ancestry will look like this:
/bin/sh -c "/rather/dodgy.sh | some-program"
|
+- /bin/sh -c /rather/dodgy.sh
|
+- some-program
All in the same process group, and presumably spawned from your interactive shell (-bash ?).
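For illustration, the "separate scripts" variant suggested above might look roughly like this (a sketch; launcher.sh and payload.sh are hypothetical names, and some-program stands in for the real consumer):

launcher.sh:
#!/bin/sh
# Save the original stdout on fd 4, then replace this shell with a pipeline
# feeding the payload script into some-program.
exec 4>&1
exec /bin/sh -c './payload.sh | some-program'

payload.sh:
#!/bin/sh
exec 3>&1          # fd 3 := the pipe into some-program
exec 1>&4          # restore the launcher's original stdout
exec 4>&-          # close the now-redundant fd 4
echo "READY"       # goes to the original stdout
echo "READY" >&3   # goes to some-program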
Here's a way to popen("some-program", "w") and send data to it through fd3 with a POSIX shell:
#!/bin/sh
{
  {
    # MAIN PROGRAM (basically, the whole script goes here)
    echo "normal output that won't be processed by some-program"
    echo 'some input for some-program' 1>&3
  } 3>&1 1>&4 |
  some-program
} 4>&1
Regarding the background part of the question: with the above construct, some-program is running in a sub-process; so, unless you need to detach some-program from its parent process group (i.e. setpgid(0, 0)), there's no need to use any & in the shell.

Shell script exec with less than sign

Can someone explain to me what this line would do in a shell script?
exec 3<&0 </dev/null
I tried googling, but couldn't hone in on the details. I believe 3 is a new file descriptor, 0 is STDIN? and am not sure what the last /dev/null does, or the purpose of exec or the "<" signs.
exec without a command argument changes the I/O redirection for the rest of the script.
3<&0 duplicates the current stdin descriptor to file descriptor 3.
</dev/null redirects stdin to /dev/null, which is a special device that contains nothing (reading it returns EOF immediately, writing to it discards the data).
The purpose of all this is to redirect standard input to the null device, but save it on FD 3 so that it can be reverted later. So somewhere later in the script you should see:
exec <&3 3<&-
This duplicates FD 3 back to stdin, and then closes FD 3.
Redirection syntax is described in the Redirections section of the Bash Manual.
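Putting the two halves together, a minimal self-contained sketch of the save/restore pattern described above:

#!/bin/sh
exec 3<&0 </dev/null   # save the original stdin on fd 3, then point stdin at /dev/null
head -n 1              # reads nothing: stdin is /dev/null, so EOF comes immediately
exec <&3 3<&-          # restore stdin from fd 3, then close fd 3
head -n 1              # reads from the original stdin again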

Redirecting output within shell scripts called by cron [duplicate]

I'm playing with i/o shell redirection. The commands I've tried (in bash):
ls -al *.xyz 2>&1 1> files.lst
and
ls -al *.xyz 1> files.lst 2>&1
There is no *.xyz file in the current folder.
These commands give me different results. The first command shows an error message ls: *.xyz: No such file or directory on the screen. But the second one prints this error message to the file. Why did the first command fail to write the error output to the file?
The Bash manual has a clear example (similar to yours) to show that the order matters and also explains the difference. Here's the relevant part excerpted (emphasis mine):
Note that the order of redirections is significant. For example, the
command
ls > dirlist 2>&1
directs both standard output (file descriptor 1) and standard error
(file descriptor 2) to the file dirlist, while the command
ls 2>&1 > dirlist
directs only the standard output to file dirlist, because the standard
error was made a copy of the standard output before the standard
output was redirected to dirlist.
This post explains it from the POSIX viewpoint.
Confusion happens due to a key difference: 2>&1 redirects not by making the left operand (stderr) point to the right operand (stdout), but by making a copy of the right operand's descriptor and assigning it to the left. Conceptually, it is assignment by copy, not by reference.
So reading from left-to-right which is how this is interpreted by Bash: ls > dirlist 2>&1 means redirect stdout to the file dirlist, followed by redirection of stderr to whatever stdout is currently (which is already the file dirlist). However, ls 2>&1 > dirlist would redirect stderr to whatever stdout is currently (which is the screen/terminal) and then redirect stdout to dirlist.
Redirections are:
processed from left to right.
interpreted iteratively:
an earlier redirection can affect a later one:
if an earlier redirection has redirected a given stream (identified by a file descriptor number, such as 1 for stdout (the default), and 2 for stderr), later redirections targeting that stream refer to the already-redirected version.
but not vice versa - a later redirection has no retroactive effect on the target of an earlier redirection:
e.g., if you specify file descriptor 1 as the target in an earlier redirection, what 1 means at that time is locked in, even if 1 is redirected later.
Note, however, that output isn't actually sent until all redirections are in place, and that any redirection-target output files are created or truncated before command execution begins (this is the reason why you can't read from and redirect output to the same file with a single command).
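A quick demonstration of that create/truncate behaviour (a sketch; data.txt is just a scratch file name):

printf 'one\ntwo\n' > data.txt
grep one data.txt > data.txt   # the shell truncates data.txt before grep ever reads it
cat data.txt                   # prints nothing: the original contents are gone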
Applied to the example from the question:
>file 2>&1:
>file first redirects stdout (file descriptor 1, implied by not prefixing > with a file descriptor number) to output file file
2>&1 then redirects stderr (2) to the already redirected stdout (1).
The net effect is that both original streams end up in file.
2>&1 >file:
2>&1 first redirects stderr to the then-original stdout; since file descriptor 2 participates in no further redirections, stderr output will therefore go to whatever stdout was defined as at that point - i.e., the original stdout, given that this is the first redirection.
Technically, the original stdout file descriptor is duplicated, and that duplicate is what stderr then refers to, which explains why it isn't affected by a later redirection of stdout.
>file then redirects the original stdout to file - but that has no effect anymore on the already locked-in redirection of stderr.
The net effect is that only sent-directly-to-stdout output is captured in file, while sent-to-stderr output is output to (the original, unredirected) stdout.
This error:
ls: *.xyz: No such file or directory
is written to stderr by the ls binary.
However in this command:
ls -al *.xyz 2>&1 1> files.lst
You're first redirecting stderr to stdout, which by default goes to the tty (terminal).
Then you're redirecting stdout to the file files.lst; however, stderr is not redirected to the file, since the stderr-to-stdout redirection happened before the stdout-to-file redirection. Your stderr still gets written to the tty in this case.
In the 2nd case, however, you change the order of redirections (first stdout to the file and then stderr to stdout), and that correctly redirects stderr to the same file that stdout is using.
Because order does matter.
In the first case, you first redirect stderr (2) to stdout (1).
Then you redirect (1) to a file. But stderr (2) still points to the original stdout of the shell running the command. Pointing (1) to a file at that point doesn't change the output device that (2) is directed at, so it still goes to the terminal.
In the second case, you redirect stdout (1) to a file. Then you point stderr (2) to the same place 1 is pointed, which is the file, so the error message goes to the file.
Redirections are processed from left to right
These two are the same:
my_cmd 1>a_file
my_cmd >a_file
My method to remember the whole thing:
Let's imagine we are playing a game.
There is a bridge with 2 broken parts.
If we first place a block named 2>&1 to repair the first broken part, the ball named stderr can reach the place named stdout, but since the second part is still broken, stderr falls down into a river named tty (your screen). Then we place a block named 1>a_file to repair the second broken part, and the ball named stdout can reach the place named a_file.
If we first place a block named 1>a_file to repair the second broken part and then repair the first broken part with 2>&1, the ball stderr will not fall down into the river tty.
First
$ ls *.xyz 2>&1 >files.lst
ls: cannot access '*.xyz': No such file or directory
$ cat files.lst
This writes the ls error message to the terminal, because when 2>&1 makes stderr a copy of stdout, both of them still point to the terminal. files.lst is empty because stderr is not pointing to this file but to the terminal.
Next
$ ls *.xyz >files.lst 2>&1
$ cat files.lst
ls: cannot access '*.xyz': No such file or directory
Here >files.lst redirects stdout to the file files.lst. Then 2>&1 redirects stderr to stdout which now points to files.lst.
So when you cat the file it now contains the ls error message.
ls -al *.xyz 2>&1 1> files.lst
In the first example, the 2>&1 redirection just points stderr at the screen (where stdout is pointing at that moment).
ls -al *.xyz 1> files.lst 2>&1
Here, though, this has the same effect as ls -al *.xyz &>files.lst. Both stdout and stderr go to the files.lst file.
The difference is the order of redirections.
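For reference, these two forms behave the same in bash (a sketch; &> is a bash extension, not POSIX sh syntax):

ls -al *.xyz >files.lst 2>&1   # portable form
ls -al *.xyz &>files.lst       # bash shorthand for the same thing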

What does minus mean in "exec 3>&-" and how do I use it?

I often have trouble figuring out certain language constructs because they won't register when googling or duckduckgoing them. With a bit of experimenting, it's often simple to figure it out, but I don't get this one.
I often see stuff like 2>&1 or 3>&- in bash scripts. I know this is some kind of redirection. 1 is stdout and 2 is stderr. 3 is probably custom. But what is the minus?
Also, I have a script whose output I want to log, but also want to see on screen. I use exec > >(tee $LOGFILE); exec 2>&1 for that. It works. But sometimes when I bashtrap this script, I cannot type at the prompt anymore. Output is hidden after Ctrl+C. Can I use a custom channel and the minus sign to fix this, or is it unrelated?
2>&1 means that stderr is redirected to stdout.
3>&- means that file descriptor 3, opened for writing (like stdout), is closed.
You can see more examples of redirection here
As for questions number 3, I think this is a good link.
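A minimal sketch of opening a descriptor, writing through it, and closing it with >&- (out.log is just an example file name):

exec 3>out.log     # open out.log for writing on file descriptor 3
echo "hello" >&3   # write to out.log through fd 3
exec 3>&-          # close fd 3; further writes to it would fail with "Bad file descriptor"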
The 3>&- closes file descriptor number 3 (it has probably been opened before with 3>filename).
The 2>&1 redirects the output of file descriptor 2 (stderr) to the same destination as file descriptor 1 (stdout). This is done with the dup2() syscall.
For more information about redirecting file descriptors, please consult the bash man page (man bash). It is dense but great.
For your script, I would do it like that:
#!/bin/bash
if [[ -z $recursive_call ]]; then
  recursive_call=1
  export recursive_call
  "$0" "$@" | tee filename
  exit
fi
# rest of the script goes there
It loses the exit code from the script, though. There is a way in bash to get it, I guess, but I can't remember it now.
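In bash, the exit status of the first element of a pipeline is available in the PIPESTATUS array, so the pipeline line above could be followed by an explicit exit (a sketch, not part of the original answer):

"$0" "$@" | tee filename
exit "${PIPESTATUS[0]}"   # exit with the script's own status rather than tee's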

How do these stream redirections work?

From this perldoc page,
To capture a command's STDERR and STDOUT together:
$output = `cmd 2>&1`;
To capture a command's STDOUT but discard its STDERR:
$output = `cmd 2>/dev/null`;
To capture a command's STDERR but discard its STDOUT (ordering is important here):
$output = `cmd 2>&1 1>/dev/null`;
To exchange a command's STDOUT and STDERR in order to capture the STDERR but leave its STDOUT to come out the old STDERR:
$output = `cmd 3>&1 1>&2 2>&3 3>&-`;
I do not understand how 3 and 4 work, and I am not too sure what I understand about 1 and 2 is right. Below is what I understand. Please correct me where I am wrong.
I know that 0, 1 and 2 symbolize STDIN, STDOUT and STDERR.
1. Redirect 2 to 1, so that both of them use the same stream now (the & escaping the 1, making sure that STDERR does not get redirected to a file named 1 instead).
2. Redirect 2 (STDERR) to the null stream, so that it gets discarded.
3. I do not understand this one. Shouldn't it be just
$output = `cmd 1>/dev/null`;
Also, if the aim is to get the STDERR messages at STDOUT, won't 1>/dev/null redirect everything to /dev/null?
4. What is happening here? What is stream 3? Is it like a temporary variable?
Really, none of this is Perl -- all of this is handled by the shell that you're invoking by using the backticks operator. So your best reading is man sh, or the Shell chapter of the Unix standard.
In short, though, for #4:
3>&1: Open FD 3 to point to where stdout currently points.
1>&2: Reopen stdout to point to where stderr currently points.
2>&3: Reopen stderr to point to where FD 3 currently points, which is where stdout pointed before the previous step was completed. Now stdout and stderr have been successfully swapped.
3>&-: Close FD 3 because it's not needed anymore.
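To see the swap from a shell prompt, here is a small sketch (swap_demo is a hypothetical function; command substitution captures only what ends up on stdout):

swap_demo() { echo "to stdout"; echo "to stderr" >&2; }
captured=$(swap_demo 3>&1 1>&2 2>&3 3>&-)   # the swap sends the stderr line into $captured
echo "captured: $captured"                  # prints: captured: to stderr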
Though documented in the perldocs, the redirection is all standard shell redirection. You understand 1 and 2 correctly.
3) Only STDOUT is normally captured by the backticks (just as a basic > redirect only captures STDOUT), so STDERR must be sent to STDOUT, and the original STDOUT must be discarded.
4) cmd 3>&1 1>&2 2>&3 3>&- is equivalent to
var tmp = STDOUT;
STDOUT = STDERR;
STDERR = tmp;
delete tmp;
Normally we have this:
1-->STDOUT
2-->STDERR
2>&1 redirects file descriptor fd2 to fd1
1 --> STDOUT
     /
2 --'
2>/dev/null redirects fd2 to /dev/null.
1 --> STDOUT
2 --> /dev/null
2>&1 1>/dev/null redirects fd2 to fd1, and then redirects fd1 to /dev/null
      /dev/null
     /
1 --'    STDOUT
        /
2 -----'
3>&1 1>&2 2>&3 3>&-
first directs a new fd 3 to wherever fd 1 is currently pointing (STDOUT),
then redirects fd 1 to wherever fd 2 is currently pointing (STDERR),
then redirects fd 2 to wherever fd 3 is currently pointing (the original STDOUT),
then closes fd 3 (3>&- means close file descriptor 3).
The whole thing effectively swaps fd1 and fd2. fd3 acted as a temporary variable.
1 --.   .--> STDOUT
     \ /
      X
     / \
2 --'   '--> STDERR
See the docs for more information on IO redirection.
3. Nope. The ordering matters: 2>&1 first points stderr at the original stdout (so it can be captured), and only then is stdout itself redirected to /dev/null and discarded.
4. FD 3 is just another file descriptor, like the first three (0, 1, 2). Most systems allow a process to have at least 256 open file descriptors.
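The actual per-process limit varies by system; in bash it can be checked with ulimit (a sketch):

ulimit -n   # prints the soft limit on open file descriptors, commonly 256 or 1024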

Resources