Exit from a bash infinite loop in a pipeline

I encountered somewhat strange behavior of bash infinite loops whose output is piped to another process. Namely, I ran these two commands:
(while true; do echo xxx; done) | head -n 1
(while true; do date; done) | head -n 1
The first one exits instantly while the second one does not (and I assume it would run forever without being killed). I also tried an implicit infinite loop:
yes | head -n 1
and it also exits by itself. An appropriate line of output is immediately printed on the screen in each case. I am just curious what determines whether such a command will finish.

When head exits, the standard output of the parenthesized expression is closed. If an external command, like date, is used, the loop hangs. If a bash builtin, like echo, is used, the loop exits. For proof, use
(while true; do /bin/echo xxx; done) | head -n 1
and it will hang. If you use
(while true; do date; echo $? 1>&2; sleep 1; done) | head -n 1
you will see that on the second round, the date command returns an error exit code, i.e. something other than zero. Bash evidently does not take this as seriously as when a builtin command runs into the same problem (when a builtin fails to write to the broken pipe, the shell process itself receives the SIGPIPE and the subshell exits). I wonder whether this is intended or rather a bug in bash.
To make sure the loop exits, this seems to work:
(set -e; while true; do date ; done) | head -n 1
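Alternatively, a minimal sketch based on the observation above that date starts returning a nonzero status once the pipe is closed: check the command's exit status explicitly rather than relying on set -e.
(while true; do date || break; done) | head -n 1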

Related

While loop that exits when the condition is met

I need to write a while loop in a bash script that exits when a process has ended successfully. What I have tried so far is:
VAR=`ps -ef |grep -i tail* |grep -v grep |wc -l`
while true; do
{
    if [ $VAR = 0 ]
    then
        echo "Sending MAils ...."
        exit
    fi
}
done
Use break instead of exit to continue the execution of your script.
Also, there is no need for the { ... } braces.
Your script has numerous errors. Probably try https://shellcheck.net/ before asking for human assistance.
You need to update the value of the variable inside the loop.
You seem to be reinventing pgrep, poorly.
(The regular expression tail* looks for tai, tail, taill, tailll ... What do you actually hope this should do?)
To break out of a loop and continue outside, use break.
The braces around your loop are superfluous. This is shell script, not C or Perl.
You are probably looking for something like
while true; do
    if ! pgrep tail; then
        echo "Sending mails ...."
        break
    fi
done
This avoids the use of a variable entirely; if you do need a variable, don't use upper case for your private variables.
Based on information in comments, if you have a number of processes like
tail -2000f /var/log/log.{1..10}
and no longer have a way to check their PIDs, you might want to use fuser to tell you when none of them are still running:
while true; do
    fuser /var/log/log.{1..10} || break
    sleep 60
done
echo All processes are gone now.
Unfortunately, fuser does not reliably set its exit code everywhere. Test on the command line: run tail -f $HOME/.bash_profile in one window and then fuser $HOME/.bash_profile && echo yes in another; then quit the tail and run the fuser command again. If it still prints yes, you need something more.
On MacOS, I found that fuser -u will print parentheses when the files are still open, and nothing when they are not:
while true; do
    fuser -u /var/log/log.{1..10} 2>&1 | grep -q '[()]' || break
    sleep 60
done
On Debian, fuser is in the package psmisc, and does set its exit code properly. You will probably also want to use the -s option to make it run quietly.
Notice also the addition of a sleep to avoid checking hundreds or thousands of times per second. You would probably want to make the same change to the original solution. How long to wait between iterations depends on how urgently you need the notification, and how heavy the operation to check the condition is. Even sleep 0.1 would be a significant improvement over the sleepless spin lock.
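For completeness, a minimal sketch of the earlier pgrep-based loop with the same polling delay added (discarding pgrep's output and the 60-second interval are illustrative choices):
while true; do
    if ! pgrep tail >/dev/null; then
        echo "Sending mails ...."
        break
    fi
    sleep 60
done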

Bash script stops without error, but works if copied into the terminal

I am trying to write a script to slice a 13 GB file into smaller parts to launch a split computation on a cluster. What I wrote so far works in the terminal if I copy and paste it, but stops after the first iteration of the for loop.
set -ueo pipefail
NODES=8
READS=0days_rep2.fasta
Ntot=$(cat $READS | grep 'read' | wc -l)
Ndiv=$(($Ntot/$NODES))
for i in $(seq 0 $NODES)
do
    echo $i
    start_read=$(cat $READS | grep 'read' | head -n $(($Ndiv*${i}+1)) | tail -n 1)
    echo ${start_read}
    end_read=$(cat $READS | grep 'read' | head -n $(($Ndiv*${i}+$Ndiv)) | tail -n 1)
    echo ${end_read}
done
If I run the script:
(base) [andrea#andrea-xps data]$ bash cluster.sh
0
>baa12ba1-4dc2-4fae-a989-c5817d5e487a runid=314af0bb142c280148f1ff034cc5b458c7575ff1 sampleid=0days_rep2 read=280855 ch=289 start_time=2019-10-26T02:42:02Z
(base) [andrea#andrea-xps data]$
it seems to stop abruptly after the command "echo ${start_read}" without raising any sort of error. If I copy and paste the script into the terminal, it runs without problems.
I am using Manjaro Linux.
Andrea
The problem:
The problem here (as @Jens suggested in a comment) has to do with the use of the -e and pipefail options; -e makes the shell exit immediately if any simple command gets an error, and pipefail makes a pipeline fail if any command in it fails.
But what's failing? Take a look at the command here:
start_read=$(cat $READS | grep 'read' | head -n $(($Ndiv*${i}+1)) | tail -n 1)
Which, clearly, runs the cat, grep, head, and tail commands in a pipeline (which runs in a subshell so the output can be captured and put in the start_read variable). So cat starts up, and starts reading from the file and shoving it down the pipe to grep. grep reads that, picks out the lines containing 'read', and feeds them on toward head. head reads the first line of that (note that on the first pass, Ndiv is 0, so it's running head -n 1) from its input, feeds that on toward the tail command, and then exits. tail passes on the one line it got, then exits as well.
The problem is that when head exited, it hadn't read everything grep had to give it; that left grep trying to shove data into a pipe with nothing on the other end, so the system sent it a SIGPIPE signal to tell it that wasn't going to work, and that caused grep to exit with an error status. And since grep then exited, cat was similarly trying to stuff data into an orphaned pipe, so it got a SIGPIPE as well and also exited with an error status.
Since both cat and grep exited with errors, and pipefail is set, that subshell will also exit with an error status, and that means the parent shell considers the whole assignment command to have failed, and abort the script on the spot.
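A quick way to see this effect in isolation (a sketch; seq merely stands in for any writer that produces more than a pipe buffer's worth of output):
set -o pipefail
seq 1000000 | head -n 1   # head exits after one line; seq gets SIGPIPE
echo $?                   # typically 141 (128 + SIGPIPE) instead of head's 0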
Solutions:
So, one possible solution is to remove the -e option from the set command. -e is kind of janky in what it considers an exit-worthy error and what it doesn't, so I don't generally like it anyway (see BashFAQ #105 for details).
Another problem with -e is that (as we've seen here) it doesn't give much of any indication of what went wrong, or even that something went wrong! Error checking is important, but so's error reporting.
(Note: the danger in removing -e is that your script might get a serious error partway through... and then blindly keep running, in a situation that doesn't make sense, possibly damaging things in the process. So you should think about what might go wrong as the script runs, and add manual error checking as needed. I'll add some examples to my script suggestion below.)
Anyway, just removing -e would be papering over the fact that this isn't a really good approach to the problem. You're reading (or trying to read) the entire file multiple times, and processing it through multiple commands each time. You really should only be reading through the thing twice: once to figure out how many reads there are, and once to break it into chunks. You might be able to write a program to do the splitting in awk, but most unix-like systems already have a program specifically for this task: split. There's also no need for cat everywhere, since the other commands are perfectly capable of reading directly from files (again, @Jens pointed this out in a comment).
So I think something like this would work:
#!/bin/bash
set -uo pipefail # I removed the -e 'cause I don't trust it
nodes=8 # Note: lower- or mixed-case variables are safer to avoid conflicts
reads=0days_rep2.fasta
splitprefix=0days_split_
Ntot=$(grep -c 'read' "$reads") || { # grep can both read & count in a single step
    # The || means this'll run if there was an error in that command.
    # A normal thing to do is print an error message to stderr
    # (with >&2), then exit the script with a nonzero (error) status
    echo "$0: Error counting reads in $reads" >&2
    exit 1
}
Ndiv=$((($Ntot+$nodes-1)/$nodes)) # Force it to round *up*, not down
grep 'read' "$reads" | split -l $Ndiv -a1 - "$splitprefix" || {
    echo "$0: Error splitting fasta file" >&2
    exit 1
}
This'll create files named "0days_split_a" through "0days_split_h". If you have the GNU version of split, you could add its -d option (use numeric suffixes instead of letters) and/or --additional-suffix=.fasta (to add the .fasta extension to the split files).
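A sketch of that GNU-only variant (both -d and --additional-suffix are GNU coreutils extensions; the rest of the command is unchanged):
grep 'read' "$reads" | split -d -l $Ndiv -a1 --additional-suffix=.fasta - "$splitprefix"
With eight chunks this would produce "0days_split_0.fasta" through "0days_split_7.fasta".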
Another note: if only a little bit of that big file is read lines, it might be faster to run grep 'read' "$reads" >sometempfile first, and then run the rest of the script on the temp file, so you don't have to read & thin it twice. But if most of the file is read lines, this won't help much.
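A sketch of that variant, under the assumption that a temporary file is acceptable (mktemp and the intermediate steps here are illustrative, not part of the answer's script):
tmpfile=$(mktemp) || exit 1
grep 'read' "$reads" > "$tmpfile"      # read and thin the big file only once
Ntot=$(wc -l < "$tmpfile")             # count the kept lines
Ndiv=$((($Ntot+$nodes-1)/$nodes))      # round up, as above
split -l $Ndiv -a1 "$tmpfile" "$splitprefix"
rm -f "$tmpfile"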
Alright, we have found the troublemaker: set -e in combination with set -o pipefail.
Gordon Davisson's answer provides all the details. I provide this answer for the sole purpose of reaping an upvote for my debugging efforts in the comments to your answer :-)

Cancel "tail -f" if the file hasn't changed for N seconds?

I am debugging an application that has a long running time and produces only a logfile as its reliably complete output, so usually I use tail -f to monitor the output. Since it also requires some specific setup of the environment variables, I have wrapped the whole invocation, including tail -f LOGFILE &, into a bash script.
However, this creates a tail process that won't be terminated automatically and will remain running. Cleanup with trap leads to complicated code once there is more than a single cleanup task, and there is no obvious way to account for all the ways the script may be terminated.
Using the timeout command, I could limit tail -f to terminate after a fixed total time, but that would break cases where it is SUPPOSED to run longer.
So I was wondering if there is a way to limit tail -f such that it terminates if the followed file doesn't change for a specified amount of time.
Update: The script below worked for me when executed on its own, but in some instances the tail process would not terminate regardless. It isn't entirely clear whether tail -f detects that the process it is piping to has terminated.
Lacking a builtin solution, a stdout-based timeout can be produced in bash, exploiting the fact that tail will terminate if its stdout is closed.
# Usage: withTimeout TIMEOUT COMMAND [ARGS ...]
# Execute COMMAND. Terminate, if it hasn't produced new output in TIMEOUT seconds.
# Depending on the platform, TIMEOUT may be fractional. See `help read`.
withTimeout () {
    local timeout="$1"; shift 1
    "$@" | while IFS= read -r -t "${timeout}" line || return 0; do
        printf "%s\n" "$line"
    done
}
withTimeout 2 tail -f LOGFILE &
Note that tail may fall back to polling the file once per second if it cannot use inotify. If faster output is needed, the -s option can be supplied.
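For example (a sketch; -s sets GNU tail's polling interval, and 0.1 seconds is an arbitrary illustration):
withTimeout 2 tail -s 0.1 -f LOGFILE &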

lazy (non-buffered) processing of shell pipeline

I'm trying to figure out how to perform the laziest possible processing of a standard UNIX shell pipeline. For example, let's say I have a command which does some calculations and produces output along the way, but the calculations get more and more expensive, so that the first few lines of output arrive quickly but subsequent lines get slower. If I'm only interested in the first few lines, then I want to obtain those via lazy evaluation, terminating the calculations ASAP before they get too expensive.
This can be achieved with a straightforward shell pipeline, e.g.:
./expensive | head -n 2
However this does not work optimally. Let's simulate the calculations with a script which gets exponentially slower:
#!/bin/sh
i=1
while true; do
    echo line $i
    sleep $(( i ** 4 ))
    i=$(( i+1 ))
done
Now when I pipe this script through head -n 2, I observe the following:
line 1 is output.
After sleeping one second, line 2 is output.
Despite head -n 2 having already received two (\n-terminated) lines and exiting, expensive carries on running and now waits a further 16 seconds (2 ** 4) before completing, at which point the pipeline also completes.
Obviously this is not as lazy as desired, because ideally expensive would terminate as soon as the head process has received its two lines. However, this does not happen; IIUC it actually terminates after trying to write its third line, because at that point it tries to write to its STDOUT, which is connected through a pipe to the STDIN of the head process, which has already exited and is therefore no longer reading input from the pipe. This causes expensive to receive a SIGPIPE, which causes the bash interpreter running the script to invoke its SIGPIPE handler, which by default terminates the script (although this can be changed via the trap command).
So the question is, how can I make it so that expensive quits immediately when head quits, not just when expensive tries to write its third line to a pipe which no longer has a listener at the other end? Since the pipeline is constructed and managed by the interactive shell process I typed the ./expensive | head -n 2 command into, presumably that interactive shell is the place where any solution for this problem would lie, rather than in any modification of expensive or head? Is there any native trick or extra utility which can construct pipelines with the behaviour I want? Or maybe it's simply impossible to achieve what I want in bash or zsh, and the only way would be to write my own pipeline manager (e.g. in Ruby or Python) which spots when the reader terminates and immediately terminates the writer?
If all you care about is foreground control, you can run expensive in a process substitution; it still blocks until it next tries to write, but head exits immediately (and your script's flow control can continue) after it has received its input:
head -n 2 < <(exec ./expensive)
# expensive still runs 16 seconds in the background, but doesn't block your program
In bash 4.4 and newer, process substitutions store their PIDs in $! and allow process management in the same manner as other background processes.
# REQUIRES BASH 4.4 OR NEWER
exec {expensive_fd}< <(exec ./expensive); expensive_pid=$!
head -n 2 <&"$expensive_fd" # read the content we want
exec {expensive_fd}<&- # close the descriptor
kill "$expensive_pid" # and kill the process
Another approach is a coprocess, which has the advantage of only requiring bash 4.0:
# magic: store stdin and stdout FDs in an array named "expensive", and PID in expensive_PID
coproc expensive { exec ./expensive; }
# read two lines from input FD...
head -n 2 <&"${expensive[0]}"
# ...and kill the process.
kill "$expensive_PID"
I'll answer with a POSIX shell in mind.
What you can do is use a fifo instead of a pipe and kill the first link the moment the second finishes.
If the expensive process is a leaf process or if it takes care of killing its children, you can use a simple kill. If it's a process-spawning shell script, you should run it in a process group (doable with set -m) and kill it with a process-group kill.
Example code:
#!/bin/sh -e
expensive()
{
    i=1
    while true; do
        echo line $i
        sleep 0.$i # sped it up a little
        echo >&2 slept
        i=$(( i+1 ))
    done
}
echo >&2 NORMAL
expensive | head -n2
#line 1
#slept
#line 2
#slept
echo >&2 SPED-UP
mkfifo pipe
exec 3<>pipe
rm pipe
set -m; expensive >&3 & set +m
<&3 head -n 2
kill -- -$!
#line 1
#slept
#line 2
If you run this, the second run should not have the second slept line, meaning the first link was killed the moment head finished, not when the first link tried to output after head was finished.

Bash function hangs once conditions are met

All,
I am trying to run a bash script that kicks off several sub-processes. The processes redirect to their own log files and I must kick them off in parallel. To do this I have written a check_procs function that monitors the number of processes with the same parent PID. Once the number reaches 1 again, the script should continue. However, it seems to just hang. I am not sure why, but the code is below:
check_procs() {
    while true; do
        mypid=$$
        backup_procs=`ps -eo ppid | grep -w $mypid | wc -w`
        until [ $backup_procs == 1 ]; do
            echo $backup_procs
            sleep 5
            backup_procs=`ps -eo ppid | grep -w $mypid | wc -w`
        done
    done
}
This function is called after the processes are kicked off, and I can see it echoing out the number of processes, but then the echoing stops (suggesting the function has completed, since the process count is now 1). But then nothing happens, and I can see the script is still in the process list on the server; I have to kill it off manually. The part where the function is called is below:
for ((i=1; i <= $threads; i++)); do
    <Some trickery here to generate $cmdfile and $logfile>
    nohup rman target / cmdfile=$cmdfile log=$logfile &
    x=$(($x+1))
done
check_procs
$threads is a command line parameter passed to the script, and is a small number like 4 or 6. These are kicked off using nohup, as shown. When the IF in check_procs is satisfied, everything hangs instead of executing the remainder of the script. What's wrong with my function?
Maybe I'm mistaken, but isn't that expected? Your outer loop runs forever; there is no exit point. Unless the process count increases again, the outer loop spins infinitely (and without any delay, which is not recommended).
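A minimal sketch of the fix implied by that observation, keeping the original detection command: drop the outer while true so the function returns once the count reaches 1 (the until loop already does the waiting, with a delay):
check_procs() {
    mypid=$$
    backup_procs=`ps -eo ppid | grep -w $mypid | wc -w`
    until [ $backup_procs == 1 ]; do
        echo $backup_procs
        sleep 5
        backup_procs=`ps -eo ppid | grep -w $mypid | wc -w`
    done
}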

Resources