Bash redirection: named pipes and EOF - bash

Take the following code:
rm -f pipe
mkfifo pipe
foo () {
echo 1
sleep 1
echo 2
}
#1
exec 3< <(foo &)
cat <&3 # works
#2
foo >pipe &
cat <pipe # works
#3
exec 3<>pipe
foo >&3 &
cat <&3 # hangs
#4 -- update: this is the correct approach for what I want to do
foo >pipe &
exec 3<pipe
rm pipe
cat <&3 # works
Why does approach #3 hang, while others do not? Is there a way to make approach #3 not hang?
Rationale: I wish to use quasi-unnamed pipes to connect several asynchronously running subprocesses, for this I need to delete the pipe after making a file descriptor point to it:
mkfifo pipe
exec {fd}<>pipe
rm pipe
# use &$fd only

The problem in approach 3 is that the FIFO pipe then has 2 writers: The bash script (because you have opened it read/write by using exec 3<>) and the sub-shell running foo. You'll read EOF when all writers have closed the file descriptor. One writer (the sub-shell running foo) will exit fairly quickly (after roughly 1s) and therefore close the file descriptor. The other writer however (the main shell) only closes the file descriptor when it'd exit as there's no closes of file descriptor 3 anywhere. But it can't exit because it waits for the cat to exit first. That's a deadlock:
cat is waiting for an EOF
the EOF only appears when the main shell closes the fd (or exits)
the main shell is waiting for cat to terminate
Therefore you'll never exit.
Case 2 works because the pipe only ever has one writer (the sub-shell running foo) which exits very quickly, therefore an EOF will be read. In case 1, there's also only ever one writer because you open fd 3 read-only (exec 3<).
EDIT: Remove nonsense about case 4 not being correct (see comments). It's correct because the writer can't exit before the reader connects because it'll also be blocked when opening the file when the reader isn't opening yet. The newly added case 4 is unfortunately incorrect. It's racy and only works iff foo doesn't terminate (or close the pipe) before exec 3<pipe runs.
Also check the fifo(7) man page:
The kernel maintains exactly one pipe object for each FIFO special file that is opened by at least one process. The FIFO must be opened on both ends (reading and writing) before data can be passed. Normally, opening the FIFO blocks until the other end is opened also.

Related

bash hangs when exec > > is called and an additional bash script is executed with output to stdin [duplicate]

I have a shell script which writes all output to logfile
and terminal, this part works fine, but if I execute the script
a new shell prompt only appear if I press enter. Why is that and how do I fix it?
#!/bin/bash
exec > >(tee logfile)
echo "output"
First, when I'm testing this, there always is a new shell prompt, it's just that sometimes the string output comes after it, so the prompt isn't last. Did you happen to overlook it? If so, there seems to be a race where the shell prints the prompt before the tee in the background completes.
Unfortunately, that cannot fixed by waiting in the shell for tee, see this question on unix.stackexchange. Fragile workarounds aside, the easiest way to solve this that I see is to put your whole script inside a list:
{
your-code-here
} | tee logfile
If I run the following script (suppressing the newline from the echo), I see the prompt, but not "output". The string is still written to the file.
#!/bin/bash
exec > >(tee logfile)
echo -n "output"
What I suspect is this: you have three different file descriptors trying to write to the same file (that is, the terminal): standard output of the shell, standard error of the shell, and the standard output of tee. The shell writes synchronously: first the echo to standard output, then the prompt to standard error, so the terminal is able to sequence them correctly. However, the third file descriptor is written to asynchronously by tee, so there is a race condition. I don't quite understand how my modification affects the race, but it appears to upset some balance, allowing the prompt to be written at a different time and appear on the screen. (I expect output buffering to play a part in this).
You might also try running your script after running the script command, which will log everything written to the terminal; if you wade through all the control characters in the file, you may notice the prompt in the file just prior to the output written by tee. In support of my race condition theory, I'll note that after running the script a few times, it was no longer displaying "abnormal" behavior; my shell prompt was displayed as expected after the string "output", so there is definitely some non-deterministic element to this situation.
#chepner's answer provides great background information.
Here's a workaround - works on Ubuntu 12.04 (Linux 3.2.0) and on OS X 10.9.1:
#!/bin/bash
exec > >(tee logfile)
echo "output"
# WORKAROUND - place LAST in your script.
# Execute an executable (as opposed to a builtin) that outputs *something*
# to make the prompt reappear normally.
# In this case we use the printf *executable* to output an *empty string*.
# Use of `$ec` is to ensure that the script's actual exit code is passed through.
ec=$?; $(which printf) ''; exit $ec
Alternatives:
#user2719058's answer shows a simple alternative: wrapping the entire script body in a group command ({ ... }) and piping it to tee logfile.
An external solution, as #chepner has already hinted at, is to use the script utility to create a "transcript" of your script's output in addition to displaying it:
script -qc yourScript /dev/null > logfile # Linux syntax
This, however, will also capture stderr output; if you wanted to avoid that, use:
script -qc 'yourScript 2>/dev/null' /dev/null > logfile
Note, however, that this will suppress stderr output altogether.
As others have noted, it's not that there's no prompt printed -- it's that the last of the output written by tee can come after the prompt, making the prompt no longer visible.
If you have bash 4.4 or newer, you can wait for your tee process to exit, like so:
#!/usr/bin/env bash
case $BASH_VERSION in ''|[0-3].*|4.[0-3]) echo "ERROR: Bash 4.4+ needed" >&2; exit 1;; esac
exec {orig_stdout}>&1 {orig_stderr}>&2 # make a backup of original stdout
exec > >(tee -a "_install_log"); tee_pid=$! # track PID of tee after starting it
cleanup() { # define a function we'll call during shutdown
retval=$?
exec >&$orig_stdout # Copy your original stdout back to FD 1, overwriting the pipe to tee
exec 2>&$orig_stderr # If something overwrites stderr to also go through tee, fix that too
wait "$tee_pid" # Now, wait until tee exits
exit "$retval" # and complete exit with our original exit status
}
trap cleanup EXIT # configure the function above to be called during cleanup
echo "Writing something to stdout here"

Why bash is closed while writing to named pipe?

In bash 1:
$ mkfifo /tmp/pipe
$ echo 'something' > /tmp/pipe
Now it hangs and waits that data to be read.
In bash 2:
$ </tmp/pipe
Now shell 1 goes away, it is closed, my terminal is gone.
Why is this happening?
In bash manual there is written
The command substitution $(cat file) can be replaced by the
equivalent but faster $(< file).
So I was experimenting if plain "< file" works in a similar way to cat file content to terminal.
$ bash --version | head -1
GNU bash, version 4.3.11(1)-release (x86_64-pc-linux-gnu)
$ cat /proc/version
Linux version 3.16.0-71-generic (buildd#lgw01-46) (gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1) ) #92~14.04.1-Ubuntu SMP Thu May 12 23:31:46 UTC 2016
Edit
After seeing initial comments and answers I will add a bit of clarification.
I'm not concerned about different command line syntaxes.
But what I was really after was that in reader shell $ < /tmp/pipe scenario writer shell exits, but with $ cat /tmp/pipe in reader shell the writer shell does not exit. Why?
I see that I really did not phrase that in question and in body and should probably initiate another question?
From the pipe(7) manual page:
If all file descriptors referring to the read end of a pipe have been closed, then a write(2) will cause a SIGPIPE signal to be generated for the calling process.
What happens is that when the reading shell has finished reading and closes its end of the pipe, the writing shell will receive the SIGPIPE signal, and if it doesn't catch it then the shell will be terminated.
In manual sign $ is connected with variable not command prompt.
Try the following scripts:
1)
#!/bin/bash
echo $(< /tmp/pipe);
2)
#!/bin/bash
echo $(cat /tmp/pipe);
Both works correctly.
When you type < /tmp/pipe, you connect the standard input of the current shell to the named pipe instead. bash works by continuously reading from its input and executing what it reads as a command.
In shell 1, echo something > /tmp/pipe opens the pipe for writing, writes the string, then blocks until something reads it. As soon as echo completes, it will close its end of the pipe.
< /tmp/pipe opens the pipe for reading, and connects it to shell 2's standard input.
Shell 2 reads from the pipe (and tries to execute a command).
Back in shell 1, the echo, having unblocked after the 2nd shell read from the pipe, completes. The write end of the pipe closes.
With the write-end of the pipe closed, shell 2 will get a SIGPIPE when it tries to read another command, then exit.
(An alternate possibility is that shell 2 exits if the command it reads from the pipe and tries to execute causes an error.)
$(< file), on the other hand, is a special case of command substitution. When bash sees that, it simply reads from file itself, rather than spawning a cat process and capturing its output.

Background process appears to hang

Editor's note: The OP is ultimately looking to package the code from this answer
as a script. Said code creates a stay-open FIFO from which a background command reads data to process as it arrives.
It works if I type it in the terminal, but it won't work if I enter those commands in a script file and run it.
#!/bin/bash
cat >a&
pid=$!
it seems that the program is stuck at cat>a&
$pid has no value after running the script, but the cat process seems to exist.
cdarke's answer contains the crucial pointer: your script mustn't run in a child process, so you have to source it.
Based on the question you linked to, it sounds like you're trying to do the following:
Open a FIFO (named pipe).
Keep that FIFO open indefinitely.
Make a background command read from that FIFO whenever new data is sent to it.
See bottom for a working solution.
As for an explanation of your symptoms:
Running your script NOT sourced (NOT with .) means that the script runs in a child process, which has the following implications:
Variables defined in the script are only visible inside that script, and the variables cease to exist altogether when the script finishes running.
That's why you didn't see the script's $myPid variable after running the script.
When the script finishes running, its background tasks (cat >a&) are killed (as cdarke explains, the SIGHUP signal is sent to them; any process that doesn't explicitly trap that signal is terminated).
This contradicts your claim that the cat process continues to exist, but my guess is that you mistook an interactively started cat process for one started by a script.
By contrast, any FIFO created by your script (with mkfifo) does persist after the script exits (a FIFO behaves like a file - it persists until you explicitly delete it).
However, when you write to that FIFO without another process reading from it, the writing command will block and thus appear to hang (the writing process blocks until another process reads the data from the FIFO).
That's probably what happened in your case: because the script's background processes were killed, no one was reading from the FIFO, causing an attempt to write to it to block. You incorrectly surmised that it was the cat >a& command that was getting "stuck".
The following script, when sourced, adds functions to the current shell for setting up and cleaning up a stay-open FIFO with a background command that processes data as it arrives. Save it as file bgfifo_funcs:
#!/usr/bin/env bash
[[ $0 != "$BASH_SOURCE" ]] || { echo "ERROR: This script must be SOURCED." >&2; exit 2; }
# Set up a background FIFO with a command listening for input.
# E.g.:
# bgfifo_setup bgfifo "sed 's/^/# /'"
# echo 'hi' > bgfifo # -> '# hi'
# bgfifo_cleanup
bgfifo_setup() {
(( $# == 2 )) || { echo "ERROR: usage: bgfifo_setup <fifo-file> <command>" >&2; return 2; }
local fifoFile=$1 cmd=$2
# Create the FIFO file.
mkfifo "$fifoFile" || return
# Use a dummy background command that keeps the FIFO *open*.
# Without this, it would be closed after the first time you write to it.
# NOTE: This call inevitably outputs a job control message that looks
# something like this:
# [1]+ Stopped cat > ...
{ cat > "$fifoFile" & } 2>/dev/null
# Note: The keep-the-FIFO-open `cat` PID is the only one we need to save for
# later cleanup.
# The background processing command launched below will terminate
# automatically then FIFO is closed when the `cat` process is killed.
__bgfifo_pid=$!
# Now launch the actual background command that should read from the FIFO
# whenever data is sent.
{ eval "$cmd" < "$fifoFile" & } 2>/dev/null || return
# Save the *full* path of the FIFO file in a global variable for reliable
# cleanup later.
__bgfifo_file=$fifoFile
[[ $__bgfifo_file == /* ]] || __bgfifo_file="$PWD/$__bgfifo_file"
echo "FIFO '$fifoFile' set up, awaiting input for: $cmd"
echo "(Ignore the '[1]+ Stopped ...' message below.)"
}
# Cleanup function that you must call when done, to remove
# the FIFO file and kill the background commands.
bgfifo_cleanup() {
[[ -n $__bgfifo_file ]] || { echo "(Nothing to clean up.)"; return 0; }
echo "Removing FIFO '$__bgfifo_file' and terminating associated background processes..."
rm "$__bgfifo_file"
kill $__bgfifo_pid # Note: We let the job control messages display.
unset __bgfifo_file __bgfifo_pid
return 0
}
Then, source script bgfifo_funcs, using the . shell builtin:
. bgfifo_funcs
Sourcing executes the script in the current shell (rather than in a child process that terminates after the script has run), and thus makes the script's functions and variables available to the current shell. Functions by definition run in the current shell, so any background commands started from functions stay alive.
Now you can set up a stay-open FIFO with a background process that processes input as it arrives as follows:
# Set up FIFO 'bgfifo in the current dir. and process lines sent to it
# with a sample Sed command that simply prepends '# ' to every line.
$ bgfifo_setup bgfifo "sed 's/^/# /'"
# Send sample data to the FIFO.
$ echo 'Hi.' > bgfifo
# Hi.
# ...
$ echo 'Hi again.' > bgfifo
# Hi again.
# ...
# Clean up when done.
$ bgfifo_cleanup
The reason that cat >a "hangs" is because it is reading from the standard input stream (stdin, file descriptor zero), which defaults to the keyboard.
Adding the & causes it to run in background, which disconnects from the keyboard. Normally that would leave a suspended job in background, but, since you exit your script, its background tasks are killed (sends a SIGHUP signal).
EDIT: although I followed the link in the question, it was not stated originally that the OP was actually using a FIFO at that stage. So thanks to #mklement0.
I don't understand what you are trying to do here, but I suspect you need to run it as a "sourced" file, as follows:
. gash.sh
Where gash.sh is the name of your script. Note the preceding .
You need to specify a file with "cat":
#!/bin/bash
cat SOMEFILE >a &
pid=$!
echo PID $pid
Although that seems a bit silly - why not just "cp" the file (cp SOMEFILE a)?
Q: What exactly are you trying to accomplish?

How can I have output from one named pipe fed back into another named pipe?

I'm adding some custom logging functionality to a bash script, and can't figure out why it won't take the output from one named pipe and feed it back into another named pipe.
Here is a basic version of the script (http://pastebin.com/RMt1FYPc):
#!/bin/bash
PROGNAME=$(basename $(readlink -f $0))
LOG="$PROGNAME.log"
PIPE_LOG="$PROGNAME-$$-log"
PIPE_ECHO="$PROGNAME-$$-echo"
# program output to log file and optionally echo to screen (if $1 is "-e")
log () {
if [ "$1" = '-e' ]; then
shift
$# > $PIPE_ECHO 2>&1
else
$# > $PIPE_LOG 2>&1
fi
}
# create named pipes if not exist
if [[ ! -p $PIPE_LOG ]]; then
mkfifo -m 600 $PIPE_LOG
fi
if [[ ! -p $PIPE_ECHO ]]; then
mkfifo -m 600 $PIPE_ECHO
fi
# cat pipe data to log file
while read data; do
echo -e "$PROGNAME: $data" >> $LOG
done < $PIPE_LOG &
# cat pipe data to log file & echo output to screen
while read data; do
echo -e "$PROGNAME: $data"
log echo $data # this doesn't work
echo -e $data > $PIPE_LOG 2>&1 # and neither does this
echo -e "$PROGNAME: $data" >> $LOG # so I have to do this
done < $PIPE_ECHO &
# clean up temp files & pipes
clean_up () {
# remove named pipes
rm -f $PIPE_LOG
rm -f $PIPE_ECHO
}
#execute "clean_up" on exit
trap "clean_up" EXIT
log echo "Log File Only"
log -e echo "Echo & Log File"
I thought the commands on line 34 & 35 would take the $data from $PIPE_ECHO and output it to the $PIPE_LOG. But, it doesn't work. Instead I have to send that output directly to the log file, without going through the $PIPE_LOG.
Why is this not working as I expect?
EDIT: I changed the shebang to "bash". The problem is the same, though.
SOLUTION: A.H.'s answer helped me understand that I wasn't using named pipes correctly. I have since solved my problem by not even using named pipes. That solution is here: http://pastebin.com/VFLjZpC3
it seems to me, you do not understand what a named pipe really is. A named pipe is not one stream like normal pipes. It is a series of normal pipes, because a named pipe can be closed and a close on the producer side is might be shown as a close on the consumer side.
The might be part is this: The consumer will read data until there is no more data. No more data means, that at the time of the read call no producer has the named pipe open. This means that multiple producer can feed one consumer only when there is no point in time without at least one producer. Think of it of door which closes automatically: If there is a steady stream of people keeping the door always open either by handing the doorknob to the next one or by squeezing multiple people through it at the same time, the door is open. But once the door is closed it stays closed.
A little demonstration should make the difference a little clearer:
Open three shells. First shell:
1> mkfifo xxx
1> cat xxx
no output is shown because cat has opened the named pipe and is waiting for data.
Second shell:
2> cat > xxx
no output, because this cat is a producer which keeps the named pipe open until we tell him to close it explicitly.
Third shell:
3> echo Hello > xxx
3>
This producer immediately returns.
First shell:
Hello
The consumer received data, wrote it and - since one more consumer keeps the door open, continues to wait.
Third shell
3> echo World > xxx
3>
First shell:
World
The consumer received data, wrote it and - since one more consumer keeps the door open, continues to wait.
Second Shell: write into the cat > xxx window:
And good bye!
(control-d key)
2>
First shell
And good bye!
1>
The ^D key closed the last producer, the cat > xxx, and hence the consumer exits also.
In your case which means:
Your log function will try to open and close the pipes multiple times. Not a good idea.
Both your while loops exit earlier than you think. (check this with (while ... done < $PIPE_X; echo FINISHED; ) &
Depending on the scheduling of your various producers and consumers the door might by slam shut sometimes and sometimes not - you have a race condition built in. (For testing you can add a sleep 1 at the end of the log function.)
You "testcases" only tries each possibility once - try to use them multiple times (you will block, especially with the sleeps ), because your producer might not find any consumer.
So I can explain the problems in your code but I cannot tell you a solution because it is unclear what the edges of your requirements are.
It seems the problem is in the "cat pipe data to log file" part.
Let's see: you use a "&" to put the loop in the background, I guess you mean it must run in parallel with the second loop.
But the problem is you don't even need the "&", because as soon as no more data is available in the fifo, the while..read stops. (still you've got to have some at first for the first read to work). The next read doesn't hang if no more data is available (which would pose another problem: how does your program stops ?).
I guess the while read checks if more data is available in the file before doing the read and stops if it's not the case.
You can check with this sample:
mkfifo foo
while read data; do echo $data; done < foo
This script will hang, until you write anything from another shell (or bg the first one). But it ends as soon as a read works.
Edit:
I've tested on RHEL 6.2 and it works as you say (eg : bad!).
The problem is that, after running the script (let's say script "a"), you've got an "a" process remaining. So, yes, in some way the script hangs as I wrote before (not that stupid answer as I thought then :) ). Except if you write only one log (be it log file only or echo,in this case it works).
(It's the read loop from PIPE_ECHO that hangs when writing to PIPE_LOG and leaves a process running each time).
I've added a few debug messages, and here is what I see:
only one line is read from PIPE_LOG and after that, the loop ends
then a second message is sent to the PIPE_LOG (after been received from the PIPE_ECHO), but the process no longer reads from PIPE_LOG => the write hangs.
When you ls -l /proc/[pid]/fd, you can see that the fifo is still open (but deleted).
If fact, the script exits and removes the fifos, but there is still one process using it.
If you don't remove the log fifo at the cleanup and cat it, it will free the hanging process.
Hope it will help...

Preventing lock propagation

A simple and seemingly reliable way to do locking under bash is:
exec 9>>lockfile
flock 9
However, bash notoriously propagates such a fd lock to all forked stuff including executed programs etc.
Is there any way to tell bash not to duplicate the fd? It's great that the lock is attached to a fd which gets removed when the program terminates, no matter how it gets terminated.
I know I can do stuff like:
run_some_prog 9>&-
But that is quite tedious.
Is there any better solution?
You can use the -o command line option to flock(1) (long option --close, which might be better for writing in scripts for the self-documenting nature) to specify that the file descriptor should be closed before executing commands via flock(1):
-o, --close
Close the file descriptor on which the lock is held
before executing command. This is useful if command
spawns a child process which should not be holding
the lock.
Apparently flock -o FD does not fix the problem. A trick to get rid of the superfluous FD for later commands in the same shell script is to wrap the remaining part into a section which closes the FD, like this:
var=outside
exec 9>>lockfile
flock -n 9 || exit
{
: commands which do not see FD9
var=exported
# exit would exit script
# see CLUMSY below outside this code snippet
} 9<&-
# $var is "exported"
# drop lock closing the FD
exec 9<&-
: remaining commands without lock
This is a bit CLUMSY, because the close of the FD is so far separated from the lock.
You can refactor this loosing the "natural" command flow but keeping things together which belong together:
functions_running_with_lock()
{
: commands which do not see FD9
var=exported
# exit would exit script
}
var=outside
exec 9>>lockfile
flock -n 9 || exit
functions_running_with_lock 9<&-
# $var is "exported"
# drop lock closing the FD
exec 9<&-
: remaining commands without lock
A little nicer writing which keeps the natural command flow at the expense of another fork plus an additional process and a bit different workflow, which often comes handy. But this does not allow to set variables in the outer shell:
var=outside
exec 9>>lockfile
flock -n 9 || exit
(
exec 9<&-
: commands which do not see FD9
var=exported
# exit does not interrupt the whole script
exit
var=neverreached
)
# optionally test the ret if the parentheses using $?
# $var is "outside" again
# drop lock closing the FD
exec 9<&-
: remaining commands without lock
BTW, if you really want to be sure that bash does not introduce additional file descriptors (to "hide" the closed FD and skip a real fork), for example if you execute some deamon which then would hold the lock forever, the latter variant is recommended, just to be sure. lsof -nP and strace your_script are your friends.
There is no way to mark a FD as close-on-exec within bash, so no, there is no better solution.
-o doesn't work with file descriptors, it only works with files. You have to use -u to unlock the file descriptor.
What I do is this:
# start of the lock sections
LOCKFILE=/tmp/somelockfile
exec 8>"$LOCKFILE"
if ! flock -n 8;then
echo Rejected # for testing, remove this later
exit # exit, since we don't have lock
fi
# some code which shouldn't run in parallel
# End of lock section
flock -u 8
rm -f "$LOCKFILE"
This way the file descriptor will be closed by the process that made the lock, and since every other process will exit, that means only the process holding the lock will unlock the file descriptor and remove the lock file.

Resources