Shell IO redirection order, pipe version - bash

I have seen this question:
Shell redirection i/o order.
But I have another question. If this line fails to redirect stderr to file:
ls -xy 2>&1 1>file
Then why this line can redirect stderr to grep?
ls -xy 2>&1 | grep ls
I want to know how it is actually being run underneath.
It is said that 2>&1 redirects stderr to a copy of stdout. What does "a copy of stdout" mean? What is actually being copied?
The terminal registers itself (through the OS) for sending and receiving through the standard streams of the processes it creates, right? Does the other redirections go through the OS as well (I don't think so, as the terminal can handle this itself)?

The pipe redirection (connecting standard output of one command to the stdin of the next) happens before the redirection performed by the command.
That means by the time 2>&1 happens, the stdout of ls is already setup to connect to stdin of grep.
See the man page of bash:
Pipelines
The standard output of command is connected via a pipe to
thestandard input of command2. This connection is performed before
anyredirections specified by the command (see REDIRECTION below). If
|&is used, command's standard error, in addition to its
standardoutput, is connected to command2's standard input through the
pipe;it is shorthand for 2>&1 |. This implicit redirection of
thestandard error to the standard output is performed after
anyredirections specified by the command.
(emphasis mine).
Whereas in the former case (ls -xy 2>&1 1>file), nothing like that happens i.e. when 2>&1 is performed the stdout of ls is still connected to the terminal (and hasn't yet been redirected to the file).

That answers my first question. What about the others?
Well, your second question has already been answered in the comments. (What is being duplicated is a file descriptor).
As to your last question(s),
The terminal registers itself (through the OS) for sending and receiving through the standard streams of the processes it creates, right? Does the other redirections go through the OS as well (I don't think so, as the terminal can handle this itself)?
it is the shell which attaches the standard streams of the processes it creates (pipes first, then <>’s, as you have just learned). In the default case, it attaches them to its own streams, which might be attached to a tty, with which you can interact in a number of ways, usually a terminal emulation window, or a serial console, whatever. Terminal is a very ambiguous word.

Related

How to get error text in the iperf message? [duplicate]

I am rather confused with the purpose of these three files. If my understanding is correct, stdin is the file in which a program writes into its requests to run a task in the process, stdout is the file into which the kernel writes its output and the process requesting it accesses the information from, and stderr is the file into which all the exceptions are entered. On opening these files to check whether these actually do occur, I found nothing seem to suggest so!
What I would want to know is what exactly is the purpose of these files, absolutely dumbed down answer with very little tech jargon!
Standard input - this is the file handle that your process reads to get information from you.
Standard output - your process writes conventional output to this file handle.
Standard error - your process writes diagnostic output to this file handle.
That's about as dumbed-down as I can make it :-)
Of course, that's mostly by convention. There's nothing stopping you from writing your diagnostic information to standard output if you wish. You can even close the three file handles totally and open your own files for I/O.
When your process starts, it should already have these handles open and it can just read from and/or write to them.
By default, they're probably connected to your terminal device (e.g., /dev/tty) but shells will allow you to set up connections between these handles and specific files and/or devices (or even pipelines to other processes) before your process starts (some of the manipulations possible are rather clever).
An example being:
my_prog <inputfile 2>errorfile | grep XYZ
which will:
create a process for my_prog.
open inputfile as your standard input (file handle 0).
open errorfile as your standard error (file handle 2).
create another process for grep.
attach the standard output of my_prog to the standard input of grep.
Re your comment:
When I open these files in /dev folder, how come I never get to see the output of a process running?
It's because they're not normal files. While UNIX presents everything as a file in a file system somewhere, that doesn't make it so at the lowest levels. Most files in the /dev hierarchy are either character or block devices, effectively a device driver. They don't have a size but they do have a major and minor device number.
When you open them, you're connected to the device driver rather than a physical file, and the device driver is smart enough to know that separate processes should be handled separately.
The same is true for the Linux /proc filesystem. Those aren't real files, just tightly controlled gateways to kernel information.
It would be more correct to say that stdin, stdout, and stderr are "I/O streams" rather
than files. As you've noticed, these entities do not live in the filesystem. But the
Unix philosophy, as far as I/O is concerned, is "everything is a file". In practice,
that really means that you can use the same library functions and interfaces (printf,
scanf, read, write, select, etc.) without worrying about whether the I/O stream
is connected to a keyboard, a disk file, a socket, a pipe, or some other I/O abstraction.
Most programs need to read input, write output, and log errors, so stdin, stdout,
and stderr are predefined for you, as a programming convenience. This is only
a convention, and is not enforced by the operating system.
As a complement of the answers above, here is a sum up about Redirections:
EDIT: This graphic is not entirely correct.
The first example does not use stdin at all, it's passing "hello" as an argument to the echo command.
The graphic also says 2>&1 has the same effect as &> however
ls Documents ABC > dirlist 2>&1
#does not give the same output as
ls Documents ABC > dirlist &>
This is because &> requires a file to redirect to, and 2>&1 is simply sending stderr into stdout
I'm afraid your understanding is completely backwards. :)
Think of "standard in", "standard out", and "standard error" from the program's perspective, not from the kernel's perspective.
When a program needs to print output, it normally prints to "standard out". A program typically prints output to standard out with printf, which prints ONLY to standard out.
When a program needs to print error information (not necessarily exceptions, those are a programming-language construct, imposed at a much higher level), it normally prints to "standard error". It normally does so with fprintf, which accepts a file stream to use when printing. The file stream could be any file opened for writing: standard out, standard error, or any other file that has been opened with fopen or fdopen.
"standard in" is used when the file needs to read input, using fread or fgets, or getchar.
Any of these files can be easily redirected from the shell, like this:
cat /etc/passwd > /tmp/out # redirect cat's standard out to /tmp/foo
cat /nonexistant 2> /tmp/err # redirect cat's standard error to /tmp/error
cat < /etc/passwd # redirect cat's standard input to /etc/passwd
Or, the whole enchilada:
cat < /etc/passwd > /tmp/out 2> /tmp/err
There are two important caveats: First, "standard in", "standard out", and "standard error" are just a convention. They are a very strong convention, but it's all just an agreement that it is very nice to be able to run programs like this: grep echo /etc/services | awk '{print $2;}' | sort and have the standard outputs of each program hooked into the standard input of the next program in the pipeline.
Second, I've given the standard ISO C functions for working with file streams (FILE * objects) -- at the kernel level, it is all file descriptors (int references to the file table) and much lower-level operations like read and write, which do not do the happy buffering of the ISO C functions. I figured to keep it simple and use the easier functions, but I thought all the same you should know the alternatives. :)
I think people saying stderr should be used only for error messages is misleading.
It should also be used for informative messages that are meant for the user running the command and not for any potential downstream consumers of the data (i.e. if you run a shell pipe chaining several commands you do not want informative messages like "getting item 30 of 42424" to appear on stdout as they will confuse the consumer, but you might still want the user to see them.
See this for historical rationale:
"All programs placed diagnostics on the standard output. This had
always caused trouble when the output was redirected into a file, but
became intolerable when the output was sent to an unsuspecting
process. Nevertheless, unwilling to violate the simplicity of the
standard-input-standard-output model, people tolerated this state of
affairs through v6. Shortly thereafter Dennis Ritchie cut the Gordian
knot by introducing the standard error file. That was not quite enough.
With pipelines diagnostics could come from any of several programs
running simultaneously. Diagnostics needed to identify themselves."
stdin
Reads input through the console (e.g. Keyboard input).
Used in C with scanf
scanf(<formatstring>,<pointer to storage> ...);
stdout
Produces output to the console.
Used in C with printf
printf(<string>, <values to print> ...);
stderr
Produces 'error' output to the console.
Used in C with fprintf
fprintf(stderr, <string>, <values to print> ...);
Redirection
The source for stdin can be redirected. For example, instead of coming from keyboard input, it can come from a file (echo < file.txt ), or another program ( ps | grep <userid>).
The destinations for stdout, stderr can also be redirected. For example stdout can be redirected to a file: ls . > ls-output.txt, in this case the output is written to the file ls-output.txt. Stderr can be redirected with 2>.
Using ps -aux reveals current processes, all of which are listed in /proc/ as /proc/(pid)/, by calling cat /proc/(pid)/fd/0 it prints anything that is found in the standard output of that process I think. So perhaps,
/proc/(pid)/fd/0 - Standard Output File
/proc/(pid)/fd/1 - Standard Input File
/proc/(pid)/fd/2 - Standard Error File
for example
But only worked this well for /bin/bash other processes generally had nothing in 0 but many had errors written in 2
For authoritative information about these files, check out the man pages, run the command on your terminal.
$ man stdout
But for a simple answer, each file is for:
stdout for a stream out
stdin for a stream input
stderr for printing errors or log messages.
Each unix program has each one of those streams.
stderr will not do IO Cache buffering so if our application need to print critical message info (some errors ,exceptions) to console or to file use it where as use stdout to print general log info as it use IO Cache buffering there is a chance that before writing our messages to file application may close ,leaving debugging complex
A file with associated buffering is called a stream and is declared to be a pointer to a defined type FILE. The fopen() function creates certain descriptive data for a stream and returns a pointer to designate the stream in all further transactions. Normally there are three open streams with constant pointers declared in the header and associated with the standard open files.
At program startup three streams are predefined and need not be opened explicitly: standard input (for reading conventional input), standard output (for writing conventional output), and standard error (for writing diagnostic output). When opened the standard error stream is not fully buffered; the standard input and standard output streams are fully buffered if and only if the stream can be determined not to refer to an interactive device
https://www.mkssoftware.com/docs/man5/stdio.5.asp
Here is a lengthy article on stdin, stdout and stderr:
What Are stdin, stdout, and stderr on Linux?
To summarize:
Streams Are Handled Like Files
Streams in Linux—like almost everything else—are treated as though
they were files. You can read text from a file, and you can write text
into a file. Both of these actions involve a stream of data. So the
concept of handling a stream of data as a file isn’t that much of a
stretch.
Each file associated with a process is allocated a unique number to
identify it. This is known as the file descriptor. Whenever an action
is required to be performed on a file, the file descriptor is used to
identify the file.
These values are always used for stdin, stdout, and stderr:
0: stdin
1: stdout
2: stderr
Ironically I found this question on stack overflow and the article above because I was searching for information on abnormal / non-standard streams. So my search continues.

Can the pipe operator be used with the stdout redirection operator?

We know that:
The pipe operator | is used to take the standard output of left side command as the standard input for the right side process.
The stdout redirection operator > is used to redirect the stdout to a file
And the question is, why cannot ls -la | > file redirect the output of ls -la to file? (I tried, and the file is empty)
Is it because that the stdout redirection operator > is not a process?
Is it because that the stdout redirection operator > is not a process?
In short, yes.
In a bit more detail, stdout, stderr and stdin are special file descriptors (FDs), but these remarks work for every FD: each FD refers to exactly one resource. It can be a file, a directory, a pipe, a device (such as terminal, a hard drive etc) and more. One FD, one resource. It is not possible for stdout to output to both a pipe and a file at the same time. What tee does is takes stdin (typically from a pipe, but not necessarily), opens a new FD associated with the filename provided as its argument, and writes whatever it gets from stdin to both stdout and the new FD. This copying of content from one to two FDs is not available from bash directly.
EDIT: I tried answering the question as originally posted. As it stands now, DevSolar's comment is actually more on point: why does > file, without a command, make an empty file in bash?
The answer is in Shell Command Language specification, under 2.9.1 Simple commands. In the first step, the redirection is detected. In the second step, no fields remain, so there is no command to be executed. In step 3, redirections are performed in a subshell; however, since there is no command, standard input is simply discarded, and the empty standard output of no-command is used to (try to) create a new file.

Redirect entire bash session to log file

I am familiar with the ability to pipe and redirect the IO of individual processes when running them in bash. However, is there a way to redirect stdio for an entire bash session?
Ideally, I would like to transparently pipe all stdout and stderr of all processes spawned by bash into tee to duplicate into a file the printed output displayed to the user. No matter what processes are run within that bash session, I could then go back later and look over the output.
Even more ideally, this should be the case for simple interactive programs that take options from stdin, but not for heavily interactive programs like vim.
The best I've found so far is: whenever the user opens a new terminal, run the command:
bash --login -i > >(tee ~/bash_$$.log) 2>&1
This will immediately start an interactive child shell in that new shell, and tee all stdin and stderr to a logfile named with the new parent shell's PID (to avoid overwriting).
This works, but vim fails to start with Vim: Warning: Output is not to a terminal. Are there any known solutions, up to and including patching the shell, to do this?
Background: vim is failing because isatty() is returning false when given the file descriptor for stdout; this is a safeguard to prevent uses such as vim >file that generally don't make sense. (Also, there are operating system calls available for interacting with PTYs that are useful to graphical, cursor-oriented programs that aren't available with a simple FIFO; this is why tools like ssh go to the trouble to provide a pseudoterminal during interactive use).
What's important for your purposes is that vim is directly inspecting the file descriptor it's passed as stdout. The shell is not a party to this -- it's literally vim running a standard-C-library call that gets some details about an open file descriptor -- so it's nothing that patching or reconfiguring the shell can fix.
To avoid this, then, you need to find a different way to redirect your output for logging such that stdout and stderr are still pointed at PTYs.
That said, for your real goal (logging all activity, vs redirecting stdout in-place), what you want is probably script:
if [ -z "$redirection_done" ]; then
redirection_done=1 exec script shell.log bash --login -i
fi
Using logging support from another tool which simulates a TTY, such as screen or tmux, will likewise suffice. (unbuffer, from the expect toolkit, can be used with similar effect).
Back to your literal question... (since while it may not be what you want to know, it is what you asked):
In all POSIX shells, including bash,
exec >wherever
...will immediately redirect stdout for the current shell to wherever. This can be a process substitution in bash, as anywhere else; thus, in an already-running shell, you can execute
exec > >(tee shell.log) 2>&1

Does stdout get stored somewhere in the filesystem or in memory? [duplicate]

This question already has answers here:
Send output of last command to a file automatically in bash?
(3 answers)
Closed 8 years ago.
I know I can save the result of a command to a variable using last_output=$(my_cmd) but what I'd really want is for $last_output to get updated every time I run a command. Is there a variable, zsh module, or plugin that I could install?
I guess my question is does stdout get permanently written somewhere (at least before the next command)? That way I could manipulate the results of the previous command without having to re-run it. This would be really useful for commands that take a long time to run
If you run the following:
exec > >(tee save.txt)
# ... stuff here...
exec >/dev/tty
...then your stdout for everything run between the two commands will go both to stdout, and to save.txt.
You could, of course, write a shell function which does this for you:
with_saved_output() {
"$#" \
2> >(tee "$HOME/.last-command.err >&2) \
| tee "$HOME/.last-command.out"
}
...and then use it at will:
with_saved_output some-command-here
...and zsh almost certainly will provide a mechanism to wrap interactively-entered commands. (In bash, which I can speak to more directly, you could do the same thing with a DEBUG trap).
However, even though you can, you shouldn't do this: When you split stdout and stderr into two streams, information about the exact ordering of writes is lost, even if those streams are recombined later.
Thus, the output
O: this is written to stdout first
E: this is written to stderr second
could become:
E: this is written to stderr second
O: this is written to stdout first
when these streams are individually passed through tee subprocesses to have copies written to disk. There are also buffering concerns created, and differences in behavior caused by software which checks whether it's outputting to a TTY and changes its behavior (for instance, software which turns color-coded output on when writing directly to console, and off when writing to a file or pipeline).
stdout is just a file handle that by default is connected to the console, but could be redirected.
yourcommand > save.txt
If you want to display the output to the console and save it to a file at the same time yout could pipe the output to tee, a command that writes everything it receives on stdin to stdout and to a file of your choice:
yourcommand | tee save.txt

Redirection standard error & ouput streams is postponed

I need to redirect output & error streams from one Windows process (GNU make.exe executing armcc toolchain) to some filter written on perl. The command I am running is:
Make Release 2>&1 | c:\cygwin\bin\perl ../tools/armfilt.pl
The compilation process throws out some prints which should be put then to STDOUT after some modifications. But I encountered a problem: all prints generated by the make are actually postponed till end of the make's process and only then are shown to a user. So, my questions are:
Why has it happen? I have tried to change the second process (perl.exe) priority from "Normal" to "Above normal" but it didn't help...
How to overcome this problem?
I think that one of possible workarounds may be to send only STDERR prints to the perl (that is what I actually need), not STDOUT+STDERR. But I don't know how to do it in Windows.
The Microsoft explanation concerning pipe operator usage says:
The pipe operator (|) takes the output (by default, STDOUT) of one
command and directs it into the input (by default, STDIN) of another
command.
But how to change this default STDOUT piping is not explained. Is it possible at all?

Resources