Bash script: `exit 0` fails to exit

So I have this Bash script:
#!/bin/bash
PID=`ps -u ...`
if [ "$PID" = "" ]; then
    echo $(date) Server off: not backing up
    exit
else
    echo "say Server backup in 10 seconds..." >> fifo
    sleep 10
    STARTTIME="$(date +%s)"
    echo nosave >> fifo
    echo savenow >> fifo
    tail -n 3 -f server.log | while read line
    do
        if echo $line | grep -q 'save complete'; then
            echo $(date) Backing up...
            OF="./backups/backup $(date +%Y-%m-%d\ %H:%M:%S).tar.gz"
            tar -czhf "$OF" data
            echo autosave >> fifo
            echo "$(date) Backup complete, resuming..."
            echo "done"
            exit 0
            echo "done2"
        fi
        TIMEDIFF="$(($(date +%s)-STARTTIME))"
        if ((TIMEDIFF > 70)); then
            echo "Save took too long, canceling backup."
            exit 1
        fi
    done
fi
Basically, the server takes input from a fifo and writes its output to server.log. The fifo is used to send stop/start commands to the server for autosaves. At the end, once the script sees the server's message that a save has completed, it tars the data directory and re-enables autosaves.
It's at the exit 0 line that I'm having trouble. Everything executes fine, but I get this output:
srv:scripts $ ./backup.sh
Sun Nov 24 22:42:09 EST 2013 Backing up...
Sun Nov 24 22:42:10 EST 2013 Backup complete, resuming...
done
But it hangs there. Notice how "done" echoes but "done2" fails. Something is causing it to hang on exit 0.
ADDENDUM: Just to avoid confusion for people looking at this in the future, it hangs at the exit line and never returns to the command prompt. Not sure if I was clear enough in my original description.
Any thoughts? This is the entire script, there's nothing else going on and I'm calling it direct from bash.

Here's a smaller, self-contained example that exhibits the same behavior:
echo foo > file
tail -f file | while read; do exit; done
The problem is that since each part of the pipeline runs in a subshell, exit only exits the while read loop, not the entire script.
It will then hang until tail finds a new line, tries to write it, and discovers that the pipe is broken.
To fix it, you can replace
tail -n 3 -f server.log | while read line
do
    ...
done
with
while read line
do
    ...
done < <(tail -n 3 -f server.log)
By redirecting from a process substitution instead, the loop no longer has to wait for tail to finish the way it would in a pipeline, and it doesn't run in a subshell, so exit actually exits the entire script.
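Applied to the smaller self-contained example above, a minimal sketch of the fix (same throwaway file as before):

echo foo > file
# The loop now runs in the current shell, so exit ends the whole script at once.
while read; do
    exit
done < <(tail -f file)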

But it hangs there. Notice how "done" echoes but "done2" fails.
done2 won't be printed at all since exit 0 has already ended your script with return code 0.

I don't know the details of bash subshells inside loops, but normally the appropriate way to leave a loop is the break command. In some cases that's not enough (you really do need to exit the program), but refactoring the program may be the easiest (safest, most portable) way to solve it. It may also improve readability, because people don't expect programs to exit in the middle of a loop.
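As a hedged sketch of that refactoring (the flag name found is my own, not from the question): break out of the loop, then decide the exit status afterwards:

found=0
while read -r line; do
    if [[ $line == *'save complete'* ]]; then
        found=1
        break          # leaves only the loop; the exit decision happens below
    fi
done < <(tail -n 3 -f server.log)

if ((found)); then
    echo "$(date) save complete seen; safe to back up"
    exit 0
fi
exit 1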

Related

Kill bash command when line is found

I want to kill a bash command when I found some string in the output.
To clarify, I want the solution to be similar to a timeout command:
timeout 10s looping_program.sh
Which will execute the script looping_program.sh and kill it after 10 seconds of execution.
Instead I want something like:
regexout "^Success$" looping_program.sh
Which will execute the script until it matches a line that just says Success in the stdout of the program.
Note that I'm assuming that this looping_program.sh does not exit at the same time it outputs Success for whatever reason, so simply waiting for the program to exit would waste time if I don't care about what happens after that.
So something like:
bash -e looping_program.sh > /tmp/output &
PID="$(ps aux | grep looping_program.sh | head -1 | tr -s ' ' | cut -f 2 -d ' ')"
echo $PID
while :; do
    echo "$(tail -1 /tmp/output)"
    if [[ "$(tail -1 /tmp/output)" == "Success" ]]; then
        kill $PID
        exit 0
    fi
    sleep 1
done
Where looping_program.sh is something like:
echo "Fail"
sleep 1;
echo "Fail"
sleep 1;
echo "Fail"
sleep 1;
echo "Success"
sleep 1;
echo "Fail"
sleep 1;
echo "Fail"
sleep 1;
echo "Fail"
sleep 1;
But that is not very robust (uses a single tmp file... might kill other programs...) and I want it to just be one command. Does something like this exist? I may just write a c program to do it if not.
P.S.: I provided my code as an example of what I wanted the program to do. It does not use good programming practices. Notes from other commenters:
@KamilCuk Do not use a temporary file. Use a fifo.
@pjh Note that any approach that involves using kill with a PID in shell code runs the risk of killing the wrong process. Use kill in shell programs only when it is absolutely necessary.
There are more suggestions below from other users; I just wanted to make sure no one comes across this and thinks it would be good to model their code after it.
looping_program() {
    for i in 1 2 3; do echo $i; sleep 1; done
    echo Success
    yes
}
coproc looping_program
while IFS= read -r line; do
    if [[ "$line" =~ Success ]]; then
        break
    fi
done <&${COPROC[0]}
exec {COPROC[0]}>&- {COPROC[1]}>&-
kill ${COPROC_PID}
wait ${COPROC_PID}
Notes:
- Do not use a temporary file. Use a fifo.
- Do not use tail -n1 to read the last line. Read from the stream in a loop.
- Do not repeat tail -1 twice. Cache the result.
- Wait for the pid after killing it, to synchronize.
- When you're using a coprocess, use COPROC_PID to get the PID.
- When you're not using a coprocess, use $! to get the PID of a background process started from the current shell (see the sketch below).
- When you can't use $! (because the process you're trying to get a PID of was not spawned in the background as a direct child of the current shell), do not use ps aux | grep to get the pid. Use pgrep.
- Do not use echo $(stuff). Just run the stuff, no echo.
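Following those notes, a hedged sketch of the non-coprocess variant, replacing the temporary file with a fifo and the ps aux | grep with $! (the fifo path handling here is illustrative):

fifo=$(mktemp -u)     # illustrative: an unused throwaway path for the fifo
mkfifo "$fifo"
./looping_program.sh > "$fifo" &
pid=$!                # $! is the PID of the background child we just started

while IFS= read -r line; do
    if [[ "$line" == "Success" ]]; then
        break
    fi
done < "$fifo"

kill "$pid"
wait "$pid"           # synchronize after killing
rm -f "$fifo"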
With expect
#!/usr/bin/env -S expect -f
set timeout -1
spawn ./looping_program.sh
expect "Success"
send -- "\x03"
expect eof
Call it looping_killer:
$ ./looping_killer
spawn ./looping_program.sh
Fail
Fail
Fail
Success
^C
To pass the program and pattern:
./looping_killer some_program "some pattern"
You'd change the expect script to
#!/usr/bin/env -S expect -f
set timeout -1
spawn [lindex $argv 0]
expect -- [lindex $argv 1]
send -- "\x03"
expect eof
Assuming that your looping program exits when it tries to write to a broken pipe, this will print all output up to and including the 'Success' line and then exit:
./looping_program | sed '/^Success$/q'
You may need to disable buffering of the looping program output. See Force line-buffering of stdout in a pipeline and How to make output of any shell command unbuffered? for ways to do it.
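For instance, with GNU coreutils, stdbuf can force line buffering (a sketch; whether it's needed depends on how looping_program buffers its stdout):

# -oL makes the program's stdout line-buffered, so sed sees each line as it is written.
stdbuf -oL ./looping_program | sed '/^Success$/q'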
See Should I save my scripts with the .sh extension? and Erlkonig: Commandname Extensions Considered Harmful for reasons why I dropped the '.sh' suffix.
Note that any approach that involves using kill with a PID in shell code runs the risk of killing the wrong process. Use kill in shell programs only when it is absolutely necessary.

When using exec with &, the final command does not run

It seems the code after if/fi is not running. Here is what I have:
I have a script, /my/scripts/dir/directoryPercentFull.sh:
directoryPercentFull="$(df | grep '/aDir/anotherDir' | grep -o '...%' | sed 's/%//g' | sed 's/ //g')"
if [ $directoryPercentFull -gt 90 ]
then
    echo $directoryPercentFull
    exec /someDir/someOtherDir/test01.sh &
    exec /someDir/someOtherOtherDir/test02.sh &
    exec /someDir/yetAnotherDir/test03.sh
fi
echo "Processing Done"
echo "Processing Done"
The scripts being called are:
/someDir/someOtherDir/test01.sh
#!/usr/bin/env bash
echo "inside test01.sh"
sleep 5
echo "leaving test01.sh"
/someDir/someOtherOtherDir/test02.sh
#!/usr/bin/env bash
echo "inside test02.sh"
sleep 5
echo "leaving test02.sh"
/someDir/yetAnotherDir/test03.sh
#!/usr/bin/env bash
echo "inside test03.sh"
sleep 5
echo "leaving test03.sh"
Running the script by cd-ing to /my/scripts/dir and then doing ./directoryPercentFull.sh gives:
OUTPUT:
93
inside test03.sh
inside test02.sh
inside test01.sh
leaving test03.sh
leaving test01.sh
leaving test02.sh
OUTPUT EXPECTED:
93
inside test01.sh
inside test02.sh
inside test03.sh
leaving test01.sh
leaving test02.sh
leaving test03.sh
Processing Done
The order of the echo commands is not that big of a deal, though if someone knows why they go 3,2,1, then 3,1,2, I wouldn't hate an explanation.
However, I am not getting that final Processing Done. Anyone have any clue why the final echo back in /my/scripts/dir/directoryPercentFull.sh never happens? I have purposefully not placed an & after the last exec statement, as I don't want what comes after the if/fi to run until all of it has finished processing.
/someDir/someOtherDir/test01.sh &
/someDir/someOtherOtherDir/test02.sh &
/someDir/yetAnotherDir/test03.sh
Get rid of all the execs. exec causes the shell process to be replaced by the given command, meaning the shell does not continue executing further commands.
The order of the echo commands are not that big of a deal, though if someone knows why they go 3,2,1, then 3,1,2, I wouldn't hate an explanation.
The printouts could come in any order. The three scripts are run in parallel processes so there's no telling which order they echo their printouts.
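If the final echo should wait for all three scripts rather than just test03, one hedged variation is to background all three and wait:

/someDir/someOtherDir/test01.sh &
/someDir/someOtherOtherDir/test02.sh &
/someDir/yetAnotherDir/test03.sh &
wait    # block until all three background jobs have finished
echo "Processing Done"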

"allowed" operations in bash read while loop

I have a file file.txt which contains two lines.
first line
second line
I am trying to loop over it in bash using the following loop:
while read -r LINE || [[ -n "$LINE" ]]; do
    # sed -i 'some command' somefile
    echo "echo something"
    echo "$LINE"
    sh call_other_script.sh
    if ! sh some_complex_script.sh ; then
        echo "operation failed"
    fi
done <file.txt
When some_complex_script.sh is called, only the first line is processed; when it is commented out, both lines are processed.
some_complex_script.sh does all kind of stuff, like starting processes, sqlplus, starting WildFly etc.
./bin/call_some_script.sh | tee $SOME_LOGFILE &
wait
...
sqlplus $ORACLE_USER/$ORACLE_PWD@$DB<<EOF
whenever sqlerror exit 1;
whenever oserror exit 2;
INSERT INTO TABLE ....
COMMIT;
quit;
EOF
...
nohup $SERVER_DIR/bin/standalone.sh -c $WILDFLY_PROFILE -u 230.0.0.4 >/dev/null 2>&1 &
My question is whether there are operations that shouldn't be called from some_complex_script.sh inside such a loop (it may well take 10 minutes to finish; is this a good idea at all?) and that may break the loop.
The script is called using Jenkins and the Publish over SSH Plugin. When some_complex_script.sh is called on its own, there are no problems.
You should close or redirect stdin for the other commands you run, to stop them reading from the file, e.g.:
sh call_other_script.sh </dev/null
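The reason: every command in the loop body inherits the loop's stdin, so sqlplus (or anything else that reads stdin) can swallow the remaining lines of file.txt. An alternative sketch is to feed the loop through a separate file descriptor, leaving the body's stdin alone (fd 3 is an arbitrary choice):

while read -r LINE <&3 || [[ -n "$LINE" ]]; do
    echo "$LINE"
    sh call_other_script.sh            # still sees the original stdin, not file.txt
    if ! sh some_complex_script.sh; then
        echo "operation failed"
    fi
done 3<file.txt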

Ending Timestamp not printing on Shell Script: Using trap

I have a shell script I use for deployments. Since I want to capture the output of the entire process, I've wrapped it in a subshell and tail that out:
#! /usr/bin/env ksh
# deploy.sh
########################################################################
(yadda, yadda, yadda)
########################################################################
# LOGGING WRAPPER
#
dateFormat=$(date +"%Y.%m.%d-%H.%M.%S")
(
    print -n "EXECUTING: $0 $*: "
    date
    #
    ####################################################################
    (yadda, yadda, yadda)
    #
    # Tail Startup
    #
    trap 'printf "Stopping Script: "; date; exit 0' INT
    print "TAILING LOG: YOU MAY STOP THIS WITH A CTRL-C WHEN YOU SEE THAT SERVER HAS STARTED"
    sleep 2
    ./tailLog.sh
) 2>&1 | tee "deployment.$dateFormat.log"
#
########################################################################
Before I employed the subshell, the trap command worked. When you pressed Ctrl-C, the program would print Stopping Script: and the date.
However, I wanted to make sure that no one forgets to save the output of this script, so I employed the subshell to automatically save the output. And, now trap doesn't seem to be working.
What am I doing wrong?
NEW INFORMATION
A little more playing around. I now see the issue isn't the shell or subshell. It's the damn pipe!
If I don't pipe the output to tee, the trap works fine. If I pipe the output to tee, the trap doesn't work.
So, the real question is how do I tee the output and still be able to use trap?
TEST PROGRAM
Before you answer, please, please, try these test programs:
#! /bin/ksh
dateFormat=$(date +"%Y.%m.%d-%H:%M:%S")
(
    trap 'printf "The script was killed at: %s\n" "$(date)"' SIGINT
    echo "$0 $*"
    while sleep 2
    do
        print -n "The time is now "
        date
    done
) | tee somefile
And
#! /bin/ksh
dateFormat=$(date +"%Y.%m.%d-%H:%M:%S")
(
    trap 'printf "The script was killed at: %s\n" "$(date)"' SIGINT
    echo "$0 $*"
    while sleep 2
    do
        print -n "The time is now "
        date
    done
)
The top one pipes to somefile; the bottom one doesn't. In the bottom one, the trap works; in the top one, it doesn't. See if you can get the pipe to work and the "The script was killed at" line to print into the teed-out file.
The pipe does work. The trap doesn't, but only when I have the pipe. You can move the trap statement all around and put in layers and layers of subshells. There's some minor thing I am doing wrong, and I have no idea what it is.
Since the trap stops the running process (tailLog.sh), I think the pipe doesn't get executed at all. You can't do it this way.
One solution could be editing tailLog.sh to write line by line to your log file. Maybe you could post it and we can discuss how you manage it.
OK, now I've got it. Ctrl-C sends SIGINT to every process in the foreground pipeline, tee included, so tee dies along with the subshell and the trap's output has nowhere to go. You have to use tee with -i to ignore interrupt signals.
#! /bin/ksh
dateFormat=$(date +"%Y.%m.%d-%H:%M:%S")
(
    trap 'printf "The script was killed at: %s\n" "$(date)"' SIGINT
    echo "$0 $*"
    while sleep 2
    do
        print -n "The time is now "
        date
    done
) | tee -i somefile
This one works fine!

Bash script not exiting immediately when `exit` is called

I have the following bash script:
tail -F -n0 /private/var/log/system.log | while read line
do
    if [ ! `echo $line | grep -c 'launchd'` -eq 0 ]; then
        echo 'launchd message'
        exit 0
    fi
done
For some reason, it is echoing launchd message, waiting for a full 5 seconds, and then exiting.
Why is this happening and how do I make it exit immediately after it echos launchd message?
Since you're using a pipe, the while loop is being run in a subshell. Run it in the main shell instead.
#!/bin/bash
while ...
do
    ...
done < <(tail ...)
As indicated by Ignacio, your tail | while creates a subshell. The delay is because it's waiting for the next line to be written to the log file before everything closes.
You can add this line immediately before your exit command if you'd prefer not using process substitution:
kill -SIGPIPE $$
Unfortunately, I don't know of any way to control the exit code using this method. It will be 141 which is 128 + 13 (the signal number of SIGPIPE).
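A quick sketch to see that status (assuming bash's default SIGPIPE disposition):

bash -c 'kill -SIGPIPE $$'; echo $?    # prints 141 = 128 + 13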
If you're trying to make the startup of a daemon dependent on another one having started, there's probably a better way to do that.
By the way, if you're really writing a Bash script (which you'd have to be to use <() process substitution), you can write your if like this: if [[ $line == *launchd* ]].
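Putting the two suggestions together, a sketch of the script rewritten with process substitution and the pattern test (behavior otherwise as in the question):

#!/bin/bash
while read -r line
do
    if [[ $line == *launchd* ]]; then
        echo 'launchd message'
        exit 0    # the loop runs in the main shell, so this ends the script
    fi
done < <(tail -F -n0 /private/var/log/system.log)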
You can also exit the subshell with a tell-tale exit code and then test the value of "$?" to get the same effect you're looking for:
tail -F -n0 /private/var/log/system.log | while read line
do
    if [ ! `echo $line | grep -c 'launchd'` -eq 0 ]; then
        echo 'launchd message'
        exit 10
    fi
done
if [ $? -eq 10 ]; then exit 0; fi
