Kill a child-script when it emits a certain string - bash

I have the following setup:
#!/bin/bash
# call.sh
voip_binary $1
Now when a certain event happens (when I hang up), voip_binary emits a certain string: ... [DISCONNCTD] ...
However, the script does not stop, but continues running. As soon as DISCONNCTD appears in the output, I want to kill the script.
So I could do the following, to get the relevant output:
voip_binary $1 | grep DISCONNCTD
Even though I now only get the relevant output from the binary, I still don't know how to kill it as soon as a line is emitted.

You can try this:
voip_binary $1 | grep 'DISCONNCTD' | while read line; do
    pkill 'voip_binary'
done
Assuming voip_binary outputs your string without buffering.
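If buffering does bite, a line-buffered variant of the same idea may help. This is only a sketch: stdbuf (GNU coreutils) and grep's --line-buffered are GNU extensions, so their availability is an assumption about your system.
stdbuf -oL voip_binary "$1" | grep --line-buffered 'DISCONNCTD' | while read -r line; do
    pkill voip_binary    # kill the binary as soon as the first matching line arrives
    break
done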

Related

BASH stops without error, but works if copied in terminal

I am trying to write a script to slice a 13 Gb file in smaller parts to launch a split computation on a cluster. What I wrote so far works on terminal if I copy and paste it, but stops at the first cycle of the for loop.
set -ueo pipefail
NODES=8
READS=0days_rep2.fasta
Ntot=$(cat $READS | grep 'read' | wc -l)
Ndiv=$(($Ntot/$NODES))
for i in $(seq 0 $NODES)
do
    echo $i
    start_read=$(cat $READS | grep 'read' | head -n $(($Ndiv*${i}+1)) | tail -n 1)
    echo ${start_read}
    end_read=$(cat $READS | grep 'read' | head -n $(($Ndiv*${i}+$Ndiv)) | tail -n 1)
    echo ${end_read}
done
If I run the script:
(base) [andrea@andrea-xps data]$ bash cluster.sh
0
>baa12ba1-4dc2-4fae-a989-c5817d5e487a runid=314af0bb142c280148f1ff034cc5b458c7575ff1 sampleid=0days_rep2 read=280855 ch=289 start_time=2019-10-26T02:42:02Z
(base) [andrea@andrea-xps data]$
it seems to stop abruptly after the command "echo ${start_read}" without raising any sort of error. If I copy and paste the script into the terminal, it runs without problems.
I am using Manjaro Linux.
Andrea
The problem:
The problem here (as @Jens suggested in a comment) has to do with the use of the -e and pipefail options; -e makes the shell exit immediately if any simple command gets an error, and pipefail makes a pipeline fail if any command in it fails.
But what's failing? Take a look at the command here:
start_read=$(cat $READS | grep 'read' | head -n $(($Ndiv*${i}+1)) | tail -n 1)
Which, clearly, runs the cat, grep, head, and tail commands in a pipeline (which runs in a subshell so the output can be captured and put in the start_read variable). So cat starts up, and starts reading from the file and shoving it down the pipe to grep. grep reads that, picks out the lines containing 'read', and feeds them on toward head. head reads the first line of that (note that on the first pass, i is 0, so it's running head -n 1) from its input, feeds that on toward the tail command, and then exits. tail passes on the one line it got, then exits as well.
The problem is that when head exited, it hadn't read everything grep had to give it; that left grep trying to shove data into a pipe with nothing on the other end, so the system sent it a SIGPIPE signal to tell it that wasn't going to work, and that caused grep to exit with an error status. And since grep exited, cat was similarly trying to stuff data into an orphaned pipe, so it got a SIGPIPE as well and also exited with an error status.
Since both cat and grep exited with errors, and pipefail is set, that subshell will also exit with an error status, which means the parent shell considers the whole assignment command to have failed, and aborts the script on the spot.
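You can watch this happen in isolation. A minimal reproduction, assuming bash and coreutils:
set -o pipefail
seq 1 1000000 | head -n 1 >/dev/null
echo $?    # typically prints 141 (128 + SIGPIPE): seq was killed writing to the closed pipe
With -e also in effect, a script would abort right at that pipeline, which is exactly what happened here.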
Solutions:
So, one possible solution is to remove the -e option from the set command. -e is kind of janky in what it considers an exit-worthy error and what it doesn't, so I don't generally like it anyway (see BashFAQ #105 for details).
Another problem with -e is that (as we've seen here) it doesn't give much of any indication of what went wrong, or even that something went wrong! Error checking is important, but so's error reporting.
(Note: the danger in removing -e is that your script might get a serious error partway through... and then blindly keep running, in a situation that doesn't make sense, possibly damaging things in the process. So you should think about what might go wrong as the script runs, and add manual error checking as needed. I'll add some examples to my script suggestion below.)
Anyway, removing -e just papers over the fact that this isn't a really good approach to the problem. You're reading (or trying to read) the entire file multiple times, and processing it through multiple commands each time. You really should only be reading through the thing twice: once to figure out how many reads there are, and once to break it into chunks. You might be able to write a program to do the splitting in awk, but most unix-like systems already have a program specifically for this task: split. There's also no need for cat everywhere, since the other commands are perfectly capable of reading directly from files (again, @Jens pointed this out in a comment).
So I think something like this would work:
#!/bin/bash
set -uo pipefail    # I removed the -e 'cause I don't trust it
nodes=8             # Note: lower- or mixed-case variables are safer to avoid conflicts
reads=0days_rep2.fasta
splitprefix=0days_split_
Ntot=$(grep -c 'read' "$reads") || {    # grep can both read & count in a single step
    # The || means this'll run if there was an error in that command.
    # A normal thing to do is print an error message to stderr
    # (with >&2), then exit the script with a nonzero (error) status
    echo "$0: Error counting reads in $reads" >&2
    exit 1
}
Ndiv=$(( ($Ntot+$nodes-1) / $nodes ))   # Force it to round *up*, not down
grep 'read' "$reads" | split -l $Ndiv -a1 - "$splitprefix" || {
    echo "$0: Error splitting fasta file" >&2
    exit 1
}
This'll create files named "0days_split_a" through "0days_split_h". If you have the GNU version of split, you could add its -d option (use numeric suffixes instead of letters) and/or --additional-suffix=.fasta (to add the .fasta extension to the split files).
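With those GNU options the split line might look like this (a sketch, assuming GNU coreutils):
grep 'read' "$reads" | split -l $Ndiv -a1 -d --additional-suffix=.fasta - "$splitprefix"
which would produce "0days_split_0.fasta" through "0days_split_7.fasta" instead.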
Another note: if only a small fraction of that big file consists of 'read' lines, it might be faster to run grep 'read' "$reads" >sometempfile first, and then run the rest of the script on the temp file, so you don't have to read & filter it twice. But if most of the file is 'read' lines, this won't help much.
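That temp-file variant could look like this (a sketch; mktemp is an assumption about your system):
tmpfile=$(mktemp) || exit 1
grep 'read' "$reads" > "$tmpfile"      # read & filter the big file only once
Ntot=$(wc -l < "$tmpfile")             # counting the already-filtered file is cheap
Ndiv=$(( ($Ntot+$nodes-1) / $nodes ))
split -l $Ndiv -a1 "$tmpfile" "$splitprefix"
rm -f "$tmpfile"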
Alright, we have found the troublemaker: set -e in combination with set -o pipefail.
Gordon Davisson's answer provides all the details. I provide this answer for the sole purpose of reaping an upvote for my debugging efforts in the comments to your answer :-)

pipe operator between two blocks

I've found an interesting bash script that with some modifications would likely solve my use case. But I'm unsure if I understand how it works, in particular the pipe between the blocks.
How do these two blocks work together, and what is the behaviour of the pipe that separates them?
function isTomcatUp {
    # Use FIFO pipeline to check catalina.out for server startup notification rather than
    # ping with an HTTP request. This was recommended by ForgeRock (Zoltan).
    FIFO=/tmp/notifytomcatfifo
    mkfifo "${FIFO}" || exit 1
    {
        # run tail in the background so that the shell can
        # kill tail when notified that grep has exited
        tail -f $CATALINA_HOME/logs/catalina.out &
        # remember tail's PID
        TAILPID=$!
        # wait for notification that grep has exited
        read foo <${FIFO}
        # grep has exited, time to go
        kill "${TAILPID}"
    } | {
        grep -m 1 "INFO: Server startup"
        # notify the first pipeline stage that grep is done
        echo >${FIFO}
    }
    # clean up
    rm "${FIFO}"
}
Code Source: https://www.manthanhd.com/2016/01/15/waiting-for-tomcat-to-start-up-in-a-script/
bash has a whole set of compound commands, which work much like simple commands. Most relevant here is that each compound command has its own standard input and standard output.
{ ... } is one such compound command. Each command inside the group inherits its standard input and output from the group, so the effect is that the standard output of a group is the concatenation of its children's standard output. Likewise, each command inside reads in turn from the group's standard input. In your example, nothing interesting happens, because grep consumes all of the standard input and no other command tries to read from it. But consider this example:
$ cat tmp.txt
foo
bar
$ { read a; read b; echo "$b then $a"; } < tmp.txt
bar then foo
The first read gets a single line from standard input, and the second read gets the second. Importantly, the first read consumes a line of input before the second read can see it. Contrast this with
$ read a < tmp.txt
$ read b < tmp.txt
where a and b will both contain foo, because each read command opens tmp.txt anew and both will read the first line.
The { …; } operation groups the commands so that the I/O redirections apply to all the commands within it. The { must stand alone as if it were a command name; the } must be preceded by either a semicolon or a newline and must also stand alone. The commands are not executed in a sub-shell, unlike ( … ), which also has some syntactic differences.
In your script, you have two such groupings connected by a pipe. Each group runs in a sub-shell because of the pipe, not because of the braces.
The first group runs tail -f on a file in the background, and then waits for something to arrive on a FIFO so it can kill the tail -f. The second part looks for the first occurrence of some specific information and, when it finds it, stops reading and writes to the FIFO to free everything up.
As with any pipeline, the exit status is the status of the last group — which is likely to be 0 because the echo succeeds.
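A quick way to convince yourself of that:
{ false; } | { true; }
echo $?    # 0: only the last group's status counts (unless pipefail is set)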

Stdout race condition between script and subscript

I'm trying to call a script deepScript and process its output within another script shallowScript ; it looks schematically like the following pieces of code:
shallowScript.sh
#!/bin/zsh
exec 1> >( tr "[a-z]" "[A-Z]" )
print "Hello - this is shallowScript"
. ./deepScript.sh
deepScript.sh
#!/bin/zsh
print "Hello - this is deepScript"
Now, when I run ./shallowScript.sh, the outcome is erratic: either it works as expected (very rarely), or it prints an empty line followed by the two expected lines (sometimes), or it prints the two lines and then hangs until I hit return and give it a newline (most of the time).
So far, I found out the following:
it is probably a race condition, as the two "print"s try to output to stdout at the same time; inserting "sleep 1" before the call to ". ./deepScript.sh" corrects the problem consistently
the problem comes from the process substitution "exec 1> >(tr ...)"; commenting it out also corrects the problem consistently
I've browsed so many forums and posts about process substitution and redirection, but could not find out how to guarantee that my script calls commands synchronously. Ideas?
zsh --version
zsh 5.0.5 (x86_64-apple-darwin14.0)
[EDIT]
As it seems that this strategy is bound to fail or lead to horrible workaround syntax, here is another strategy that seems to work with a bearable syntax: I removed all the redirects from shallowScript.sh and created a third script where the output processing happens in a function:
shallowScript.sh
#!/bin/zsh
print "Hello - this is shallowScript"
. ./deepScript.sh
thirdScript.sh
#!/bin/zsh
function _process {
    while read input; do
        echo $input | tr "[a-z]" "[A-Z]"
    done
}
. ./shallowScript.sh | _process
I suppose the problem is that you don't see the prompt after executing the script:
$ ./shallowScript.sh
$ HELLO - THIS IS SHALLOWSCRIPT
HELLO - THIS IS DEEPSCRIPT
(nothing here)
and think it hangs there waiting for a newline. Actually it does not, and the behavior is entirely expected.
Instead of a newline you can enter any shell command, e.g. ls, and it will be executed.
$ ./shallowScript.sh
$ HELLO - THIS IS SHALLOWSCRIPT <--- note the prompt in this line
HELLO - THIS IS DEEPSCRIPT
echo test <--- my input
test <--- its result
$
What happens here is: the first shell (the one which is running shallowScript.sh) creates a pipe, executes a dup2 call to forward its stdout (fd 1) to the write end of the created pipe and then forks a new process (tr) so that everything the parent prints to stdout is sent to the stdin of tr.
What happens next is that the main shell (the one where you type the initial command ./shallowScript.sh) has no idea that it should delay printing the next command prompt until the end of the tr process. It knows nothing about tr, so it just waits for shallowScript.sh to finish, then prints a prompt. The tr is still running at that time, which is why its output (two lines) comes after the prompt is printed, and you think the shell is waiting for a newline. Actually it is ready for the next command. You can see the printed prompt (the $ character or whatever) before, inside, or after the output of the script, depending on how fast the tr process finishes.
You see such behavior every time your process forks and the child continues to write to its stdout when the parent is already dead.
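A tiny demonstration of that general effect (a sketch; the script name is made up):
#!/bin/zsh
# orphan.sh: the parent exits immediately, the backgrounded child prints later
( sleep 1; echo "late output from the child" ) &
Run ./orphan.sh and the line appears after your next prompt has already been printed.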
Long story short, try this:
$ ./shallowScript.sh | cat
HELLO - THIS IS SHALLOWSCRIPT
HELLO - THIS IS DEEPSCRIPT
$
Here the shell will wait for the cat process to finish before printing a next prompt, and the cat will finish only when all its input (e.g. the output from tr) is processed, just as you expect.
Update: found a relevant quote in zsh docs here: http://zsh.sourceforge.net/Doc/Release/Expansion.html#Process-Substitution
There is an additional problem with >(process); when this is attached to an external command, the parent shell does not wait for process to finish and hence an immediately following command cannot rely on the results being complete. The problem and solution are the same as described in the section MULTIOS in Redirection. Hence in a simplified version of the example above:
paste <(cut -f1 file1) <(cut -f3 file2) > >(process)
(note that no MULTIOS are involved), process will be run asynchronously as far as the parent shell is concerned. The workaround is:
{ paste <(cut -f1 file1) <(cut -f3 file2) } > >(process)
In your case it will give something like this:
{
    print "Hello - this is shallowScript"
    . ./deepScript.sh
} 1> >( tr "[a-z]" "[A-Z]" )
which of course works but looks worse than the original.

How to add a filter to `tail -f` output that would issue an audible alarm given matching input?

I need to analyze a log for specific output and if a certain keyword is matched in the output, I need to issue an echo -e "\a".
How would I filter a script so that this would occur? I pass it through ack[-grep] as well, so I'd probably like to put the alarm notification in before the colorization.
tail -f file | while read line; do
    echo $line | grep -qF keyword && echo -e \\a >&2
    echo $line
done
This spawns more processes than sed, but puts the control character on the terminal via stderr so that it is less likely to be modified by downstream processes. Also, it is entirely possible for an implementation of sed to buffer its input and not generate any output until tail generates 8K or so of output, so you might get the alert hours or days after the data occurs.
I would use sed to replace the keyword with the keyword followed by \a. That should work as long as your downstream steps don't mangle the \a.
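That sed approach might look like this (a sketch; it assumes GNU sed, where -u disables output buffering and \a in the replacement produces the BEL character):
tail -f file | sed -u 's/keyword/&\a/'    # & stands for the matched text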

Shell Script Help--Accept Input and Run in BackGround?

I have a shell script in which in the first line I ask the user to input how many minutes they want the script to run for:
#!/usr/bin/ksh
echo "How long do you want the script to run for in minutes?:\c"
read scriptduration
loopcnt=0
interval=1
date2=$(date +%H:%M:%S)
(( intervalsec = $interval * 1 ))
totalmin=${1:-$scriptduration}
(( loopmax = ${totalmin} * 60 ))
ofile=/home2/s499929/test.log
echo "$date2 total runtime is $totalmin minutes at 2 sec intervals"
while (( $loopmax > $loopcnt ))
do
    date1=$(date +%H:%M:%S)
    pid=`/usr/local/bin/lsof | grep 16752 | grep LISTEN | awk '{print $2}'` > /dev/null 2>&1
    count=$(netstat -an | grep 16752 | grep ESTABLISHED | wc -l | sed "s/ //g")
    process=$(ps -ef | grep $pid | wc -l | sed "s/ //g")
    port=$(netstat -an | grep 16752 | grep LISTEN | wc -l | sed "s/ //g")
    echo "$date1 activeTCPcount:$count activePID:$pid activePIDcount=$process listen=$port" >> ${ofile}
    sleep $intervalsec
    (( loopcnt = loopcnt + 1 ))
done
It works great if I kick it off and input the values manually. But if I want to run this for 3 hours I need to kick off the script to run in the background.
I have tried just running ./scriptname & and I get this:
$ How long do you want the test to run for in minutes:360
ksh: 360: not found.
[2] + Stopped (SIGTTIN) ./test.sh &
And the script dies. Is this possible? Any suggestions on how I can accept this one input and then run in the background? Thanks!!!
You could do something like this:
test.sh arg1 arg2 &
Just refer to arg1 and arg2 as $1 and $2, respectively, in the bash script. ($0 is the name of the script)
So,
test.sh 360 &
will pass 360 as the first argument to the bash or ksh script which can be referred to as $1 in the script.
So the first few lines of your script would now be:
#!/usr/bin/ksh
scriptduration=$1
loopcnt=0
...
...
With bash you can start the script in the foreground and, after you have finished with the user input, suspend it by hitting Ctrl-Z.
Then type
$ bg %
and the script will continue to run in the background.
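An illustrative session (the exact job-control messages vary by shell and version):
$ ./test.sh
How long do you want the script to run for in minutes?:360
^Z
[1] + Stopped    ./test.sh
$ bg %
[1] ./test.sh &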
Why You're Getting What You're Getting
When you run the script in the background, it can't take any user input. In fact, the program will freeze if it expects user input until it's put back in the foreground. However, output has to go somewhere. Thus, the output goes to the screen even though the program is running in the background, and that's why you see the prompt.
The prompt you see your program displaying is meaningless because you can't type anything into it. Instead, you type in 360 and your shell interprets it as a command, because you're not feeding it to the program, you're typing it at the command prompt.
You want your program to be in the foreground for the input, but run in the background. You can't do both at once.
Solutions To Your Dilemma
You can have two programs. The first takes the input, and the second runs the actual program in the background.
Something like this:
#! /bin/ksh
read time?"How long in seconds do you want to run the job? "
my_actual_job.ksh $time &
In fact, you could even have a mechanism to run the job in the background if the time is over a certain limit, but otherwise run the job in the foreground.
#! /bin/ksh
readonly MAX_FOREGROUND_TIME=30
read time?"How long in seconds do you want to run the job? "
if [ $time -gt $MAX_FOREGROUND_TIME ]
then
    my_actual_job.ksh $time &
else
    my_actual_job.ksh $time
fi
Also remember that if your job is in the background, its output still goes to the terminal unless you redirect it elsewhere; if you don't, it'll print to the screen at inopportune times. For example, you could be in VI editing a file, and suddenly have the output appear smack in the middle of your VI session.
I believe there's an easy way to tell if your job is in the background, but I can't remember it offhand. You could find your current process ID by looking at $$, then looking at the output of jobs -p and see if that process ID is in the list. However, I'm sure someone will come up with an easy way to tell.
It is also possible that a program could throw itself into the background via the bg $$ command.
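One way to make that test (a sketch that assumes a ps supporting the tpgid column, as on Linux): a process is in the foreground exactly when its process group matches the terminal's foreground process group.
if [ $(ps -o pgid= -p $$) != $(ps -o tpgid= -p $$) ]; then
    # unquoted substitutions so the shell strips ps's column padding
    echo "running in the background"
fi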
Some Hints
If you're running Kornshell, you might consider taking advantage of many of Kornshell's special features:
print: The print command is more flexible and robust than echo. Take a look at the manpage for Kornshell and see all of its features.
read: Notice that you can use the read var?"prompt" form of the read command.
readonly: Use readonly to declare constants. That way, you don't accidentally change the value of that variable later. Besides, it's good programming technique.
typeset: Take a look at typeset in the ksh manpage. The typeset command can help you declare particular variables as floating point vs. integer, and can automatically do things like zero fill, right or left justify, etc.
Some things not specific to Kornshell:
The awk and sed commands can also do what grep does, so there's no reason to filter something through grep and then through awk or sed.
You can give grep several patterns with the -e parameter: grep -e foo -e bar matches lines containing either foo or bar. (Note that this is an or, so it is not the same as grep foo | grep bar, which keeps only lines matching both.)
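For instance, the lsof pipeline from the question could collapse into a single awk (a sketch):
pid=$(/usr/local/bin/lsof | awk '/16752/ && /LISTEN/ {print $2}')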
Hope this helps.
I've tested this with ksh and it worked. The trick is to let the script call itself with the time to wait as parameter:
if [ -z "$1" ]; then
echo "How long do you want the test to run for in minutes:\c"
read scriptduration
echo "running task in background"
$0 $scriptduration &
exit 0
else
scriptduration=$1
fi
loopcnt=0
interval=1
# ... and so on
So are you using bash or ksh? In bash, you can do this:
{ echo 360 | ./test.sh ; } &
It could work for ksh also.
