How can I get both the process id and the exit code from a bash script? - bash

I need a bash script that does the following:
Starts a background process with all output directed to a file
Writes the process's exit code to a file
Returns the process's pid (right away, not when process exits).
The script must exit
I can get the pid but not the exit code:
$ executable >>$log 2>&1 &
pid=`jobs -p`
Or, I can capture the exit code but not the pid:
$ executable >>$log;
# blocked on previous line until process exits
echo $0 >>$log;
How can I do all of these at the same time?

The pid is in $!, no need to run jobs. And the return status is returned by wait:
$executable >> $log 2>&1 &
pid=$!
wait $!
echo $? # return status of $executable
EDIT 1
If I understand the additional requirement as stated in a comment, and you want the script to return immediately (without waiting for the command to finish), then it will not be possible to have the initial script write the exit status of the command. But it is easy enough to have an intermediary write the exit status as soon as the child finishes. Something like:
sh -c "$executable"' & echo pid=$! > pidfile; wait $!; echo $? > exit-status' &
should work.
EDIT 2
As pointed out in the comments, that solution has a race condition: the main script terminates before the pidfile is written. The OP solves this by doing a polling sleep loop, which is an abomination and I fear I will have trouble sleeping at night knowing that I may have motivated such a travesty. IMO, the correct thing to do is to wait until the child is done. Since that is unacceptable, here is a solution that blocks on a read until the pid file exists instead of doing the looping sleep:
{ sh -c "$executable > $log 2>&1 &"'
echo $! > pidfile
echo # Alert parent that the pidfile has been written
wait $!
echo $? > exit-status
' & } | read

Related

how do I watch for a process to have died in shell script?

I'm running a shell test program that I can view a progress bar but when I run it I keep getting a unary error . Is kill -0 a way to kill a subprocess in shell ?
Or is there another method to test if my process has died?
heres my code to run a progress bar until my command ends:
#!/bin/sh
# test my progress bar
spin[0]="-"
spin[1]="\\"
spin[2]="|"
spin[3]="/"
sleep 10 2>/dev/null & # run as background process
pid=$! # grab process id
echo -n "[sleeping] ${spin[0]}"
while [ kill -0 $pid ] # wait for process to end
do
for i in "${spin[#]}"
do
echo -ne "\b$i"
sleep 0.1
done
done
enter code here
1. Is kill -0 a way to kill a subprocess in shell ?
On Linux OS, kill -0 is just a way to try to kill a process and see what happens, '0' is not a POSIX signal, it does nothing at all.
If the process is running, kill will return 0, if not, it will return 1.
ps $pid >/dev/null 2>&1 could do the same job.
To kill a process, one generally use the SIGQUIT/3 (quit program) or SIGKILL/9 (terminate program) ; the process could trap the signal and make a clean exit, or it could ignore the signal so the OS has to terminate it 'quick and dirty'.
2. test and '['
The square bracket '[' is an utility ( /bin/[ ), and expect something you didn't provide correctly.
The syntax of while is while list; do list; done where list will return an exit code, so you don't have to use something else.
3. how do I watch for a process to have died in shell script?
Like you did, the code below will do the job:
#!/bin/bash
spin[0]="-"
spin[1]="\\"
spin[2]="|"
spin[3]="/"
sleep 10 2>/dev/null & # run as background process
pid=$! # grab process id
echo -n "[sleeping] ${spin[0]}"
#while ps -p $pid >/dev/null 2>&1 # using ps
while kill -0 $pid >/dev/null 2>&1 # using kill
do
for i in "${spin[#]}"
do
echo -ne "\b$i"
sleep 0.5
done
done
CAVEATS
I use /bin/bash as interpreter, as some of the Bourne Shell (sh) could not support the use of an array (ie spin[n]).
It's probably cleaner to run the spinner in the background and kill it when the process (running in the foreground) terminates. Or, you could open another file descriptor and write something into it after the background process terminates, and have the main process block on a read. eg:
#!/bin/bash
# test my progress bar
spin[0]='-'
spin[1]='\'
spin[2]='|'
spin[3]='/'
{ { { sleep 10 2>/dev/null; echo >&5; } & # run as background process
} 5>&1 1>&3 | { # wait for process to end
while ! read -t 1; do
printf "\r[sleeping] ${spin[ $(( i = ++i % 4 )) ]}"
done
}
} 3>&1

chain starting of programs in bash

i want to start programs in a chain, like start a bash script startScript.sh and then in the startScript.sh chain load certain programs.
startScript.sh
./program1 &
#PID=(echo $!)
PID=$!
echo "Wait for 10 seconds here"
if ps -p $PID > /dev/null; then echo "continue"; ./program2; else echo "PID is not running"; exit; fi
echo "Wait for 10 seconds here"
#if program1 is running and program2 is running, then ./program3, else exit.
I tried to find the running pid of program1 with $!, but the problem is that program1 is itself a shell script and they invokes further shell scripts in which I am not interested. Hence $! never gives me the pid of ./program1 but something irrelevant. HOw can I get the PID of program1?
Also how can I get the PID of program1 as well as program2 to see if they are running and then start program3.
The special parameter $! in shell (from man bash):
Expands to the process ID of the job most recently placed into the background, whether executed as an asynchronous command or using the bg builtin (see JOB CONTROL below).
It's important to notice that job control is shell-local (in your case local to the bash interpreter executing your startScript.sh).
That means the pid from below:
#!/bin/bash
./program.sh
pid=$!
will always contain the PID of the program.sh script (i.e. the process executing it, defined with a hashbang, in your case probably another bash process), regardless of the background processes spawned by that child process (remember that $! is local to your script; the $! in the child will be undefined, until a background process is spawned there).
What maybe sidetracked you initially is the line:
PID=(echo $!)
that didn't properly set the PID. The PID was set to an array containing two words (elements): echo and <pid>. You want this:
pid=$!
(and maybe to use lowercase name for the non-global variable pid).
A simple demo:
$ cat script1.sh
#!/bin/bash
./script2.sh &
echo "in script1: $!"
$ cat script2.sh
#!/bin/bash
echo "in script2: $!"
sleep 5 &
echo "in script2 after sleep: $!"
$ ./script1.sh
in script1: 19537
in script2:
in script2 after sleep: 19541
Notice how the $! in initially undefined in script2.sh - if $! were to return the last PID globally, it would be set to PID of script2.sh.

shell script - how to stop "watch" command in the shell script [duplicate]

I have a bash script that launches a child process that crashes (actually, hangs) from time to time and with no apparent reason (closed source, so there isn't much I can do about it). As a result, I would like to be able to launch this process for a given amount of time, and kill it if it did not return successfully after a given amount of time.
Is there a simple and robust way to achieve that using bash?
P.S.: tell me if this question is better suited to serverfault or superuser.
(As seen in:
BASH FAQ entry #68: "How do I run a command, and have it abort (timeout) after N seconds?")
If you don't mind downloading something, use timeout (sudo apt-get install timeout) and use it like: (most Systems have it already installed otherwise use sudo apt-get install coreutils)
timeout 10 ping www.goooooogle.com
If you don't want to download something, do what timeout does internally:
( cmdpid=$BASHPID; (sleep 10; kill $cmdpid) & exec ping www.goooooogle.com )
In case that you want to do a timeout for longer bash code, use the second option as such:
( cmdpid=$BASHPID;
(sleep 10; kill $cmdpid) \
& while ! ping -w 1 www.goooooogle.com
do
echo crap;
done )
# Spawn a child process:
(dosmth) & pid=$!
# in the background, sleep for 10 secs then kill that process
(sleep 10 && kill -9 $pid) &
or to get the exit codes as well:
# Spawn a child process:
(dosmth) & pid=$!
# in the background, sleep for 10 secs then kill that process
(sleep 10 && kill -9 $pid) & waiter=$!
# wait on our worker process and return the exitcode
exitcode=$(wait $pid && echo $?)
# kill the waiter subshell, if it still runs
kill -9 $waiter 2>/dev/null
# 0 if we killed the waiter, cause that means the process finished before the waiter
finished_gracefully=$?
sleep 999&
t=$!
sleep 10
kill $t
I also had this question and found two more things very useful:
The SECONDS variable in bash.
The command "pgrep".
So I use something like this on the command line (OSX 10.9):
ping www.goooooogle.com & PING_PID=$(pgrep 'ping'); SECONDS=0; while pgrep -q 'ping'; do sleep 0.2; if [ $SECONDS = 10 ]; then kill $PING_PID; fi; done
As this is a loop I included a "sleep 0.2" to keep the CPU cool. ;-)
(BTW: ping is a bad example anyway, you just would use the built-in "-t" (timeout) option.)
Assuming you have (or can easily make) a pid file for tracking the child's pid, you could then create a script that checks the modtime of the pid file and kills/respawns the process as needed. Then just put the script in crontab to run at approximately the period you need.
Let me know if you need more details. If that doesn't sound like it'd suit your needs, what about upstart?
One way is to run the program in a subshell, and communicate with the subshell through a named pipe with the read command. This way you can check the exit status of the process being run and communicate this back through the pipe.
Here's an example of timing out the yes command after 3 seconds. It gets the PID of the process using pgrep (possibly only works on Linux). There is also some problem with using a pipe in that a process opening a pipe for read will hang until it is also opened for write, and vice versa. So to prevent the read command hanging, I've "wedged" open the pipe for read with a background subshell. (Another way to prevent a freeze to open the pipe read-write, i.e. read -t 5 <>finished.pipe - however, that also may not work except with Linux.)
rm -f finished.pipe
mkfifo finished.pipe
{ yes >/dev/null; echo finished >finished.pipe ; } &
SUBSHELL=$!
# Get command PID
while : ; do
PID=$( pgrep -P $SUBSHELL yes )
test "$PID" = "" || break
sleep 1
done
# Open pipe for writing
{ exec 4>finished.pipe ; while : ; do sleep 1000; done } &
read -t 3 FINISHED <finished.pipe
if [ "$FINISHED" = finished ] ; then
echo 'Subprocess finished'
else
echo 'Subprocess timed out'
kill $PID
fi
rm finished.pipe
Here's an attempt which tries to avoid killing a process after it has already exited, which reduces the chance of killing another process with the same process ID (although it's probably impossible to avoid this kind of error completely).
run_with_timeout ()
{
t=$1
shift
echo "running \"$*\" with timeout $t"
(
# first, run process in background
(exec sh -c "$*") &
pid=$!
echo $pid
# the timeout shell
(sleep $t ; echo timeout) &
waiter=$!
echo $waiter
# finally, allow process to end naturally
wait $pid
echo $?
) \
| (read pid
read waiter
if test $waiter != timeout ; then
read status
else
status=timeout
fi
# if we timed out, kill the process
if test $status = timeout ; then
kill $pid
exit 99
else
# if the program exited normally, kill the waiting shell
kill $waiter
exit $status
fi
)
}
Use like run_with_timeout 3 sleep 10000, which runs sleep 10000 but ends it after 3 seconds.
This is like other answers which use a background timeout process to kill the child process after a delay. I think this is almost the same as Dan's extended answer (https://stackoverflow.com/a/5161274/1351983), except the timeout shell will not be killed if it has already ended.
After this program has ended, there will still be a few lingering "sleep" processes running, but they should be harmless.
This may be a better solution than my other answer because it does not use the non-portable shell feature read -t and does not use pgrep.
Here's the third answer I've submitted here. This one handles signal interrupts and cleans up background processes when SIGINT is received. It uses the $BASHPID and exec trick used in the top answer to get the PID of a process (in this case $$ in a sh invocation). It uses a FIFO to communicate with a subshell that is responsible for killing and cleanup. (This is like the pipe in my second answer, but having a named pipe means that the signal handler can write into it too.)
run_with_timeout ()
{
t=$1 ; shift
trap cleanup 2
F=$$.fifo ; rm -f $F ; mkfifo $F
# first, run main process in background
"$#" & pid=$!
# sleeper process to time out
( sh -c "echo \$\$ >$F ; exec sleep $t" ; echo timeout >$F ) &
read sleeper <$F
# control shell. read from fifo.
# final input is "finished". after that
# we clean up. we can get a timeout or a
# signal first.
( exec 0<$F
while : ; do
read input
case $input in
finished)
test $sleeper != 0 && kill $sleeper
rm -f $F
exit 0
;;
timeout)
test $pid != 0 && kill $pid
sleeper=0
;;
signal)
test $pid != 0 && kill $pid
;;
esac
done
) &
# wait for process to end
wait $pid
status=$?
echo finished >$F
return $status
}
cleanup ()
{
echo signal >$$.fifo
}
I've tried to avoid race conditions as far as I can. However, one source of error I couldn't remove is when the process ends near the same time as the timeout. For example, run_with_timeout 2 sleep 2 or run_with_timeout 0 sleep 0. For me, the latter gives an error:
timeout.sh: line 250: kill: (23248) - No such process
as it is trying to kill a process that has already exited by itself.
#Kill command after 10 seconds
timeout 10 command
#If you don't have timeout installed, this is almost the same:
sh -c '(sleep 10; kill "$$") & command'
#The same as above, with muted duplicate messages:
sh -c '(sleep 10; kill "$$" 2>/dev/null) & command'

How can I make an external program interruptible in this trap-captured bash script?

I am writing a script which will run an external program (arecord) and do some cleanup if it's interrupted by either a POSIX signal or input on a named pipe. Here's the draft in full
#!/bin/bash
X=`date '+%Y-%m-%d_%H.%M.%S'`
F=/tmp/$X.wav
P=/tmp/$X.$$.fifo
mkfifo $P
trap "echo interrupted && (rm $P || echo 'couldnt delete $P') && echo 'removed fifo' && exit" INT
# this forked process will wait for input on the fifo
(echo 'waiting for fifo' && cat $P >/dev/null && echo 'fifo hit' && kill -s SIGINT $$)&
while true
do
echo waiting...
sleep 1
done
#arecord $F
This works perfectly as it is: the script ends when a signal arrives and a signal is generated if the fifo is written-to.
But instead of the while true loop I want the now-commented-out arecord command, but if I run that program instead of the loop the SIGINT doesn't get caught in the trap and arecord keeps running.
What should I do?
It sounds like you really need this to work more like an init script. So, start arecord in the background and put the pid in a file. Then use the trap to kill the arecord process based on the pidfile.
#!/bin/bash
PIDFILE=/var/run/arecord-runner.pid #Just somewhere to store the pid
LOGFILE=/var/log/arecord-runner.log
#Just one option for how to format your trap call
#Note that this does not use &&, so one failed function will not
# prevent other items in the trap from running
trapFunc() {
echo interrupted
(rm $P || echo 'couldnt delete $P')
echo 'removed fifo'
kill $(cat $PIDFILE)
exit 0
}
X=`date '+%Y-%m-%d_%H.%M.%S'`
F=/tmp/$X.wav
P=/tmp/$X.$$.fifo
mkfifo $P
trap "trapFunc" INT
# this forked process will wait for input on the fifo
(echo 'waiting for fifo' && cat $P >/dev/null && echo 'fifo hit' && kill -s SIGINT $$)&
arecord $F 1>$LOGFILE 2>&1 & #Run in the background, sending logs to file
echo $! > $PIDFILE #Save pid of the last background process to file
while true
do
echo waiting...
sleep 1
done
Also... you may have your trap written with '&&' clauses for a reason, but as an alternative, you can give a function name as I did above, or a sort of anonymous function like this:
trap "{ command1; command2 args; command3; exit 0; }"
Just make sure that each command is followed by a semicolon and there are spaces between the braces and the commands. The risk of using && in the trap is that your script will continue to run past the interrupt if one of the commands before the exit fails to execute (but maybe you want that?).

How to suppress Terminated message after killing in bash?

How can you suppress the Terminated message that comes up after you kill a
process in a bash script?
I tried set +bm, but that doesn't work.
I know another solution involves calling exec 2> /dev/null, but is that
reliable? How do I reset it back so that I can continue to see stderr?
In order to silence the message, you must be redirecting stderr at the time the message is generated. Because the kill command sends a signal and doesn't wait for the target process to respond, redirecting stderr of the kill command does you no good. The bash builtin wait was made specifically for this purpose.
Here is very simple example that kills the most recent background command. (Learn more about $! here.)
kill $!
wait $! 2>/dev/null
Because both kill and wait accept multiple pids, you can also do batch kills. Here is an example that kills all background processes (of the current process/script of course).
kill $(jobs -rp)
wait $(jobs -rp) 2>/dev/null
I was led here from bash: silently kill background function process.
The short answer is that you can't. Bash always prints the status of foreground jobs. The monitoring flag only applies for background jobs, and only for interactive shells, not scripts.
see notify_of_job_status() in jobs.c.
As you say, you can redirect so standard error is pointing to /dev/null but then you miss any other error messages. You can make it temporary by doing the redirection in a subshell which runs the script. This leaves the original environment alone.
(script 2> /dev/null)
which will lose all error messages, but just from that script, not from anything else run in that shell.
You can save and restore standard error, by redirecting a new filedescriptor to point there:
exec 3>&2 # 3 is now a copy of 2
exec 2> /dev/null # 2 now points to /dev/null
script # run script with redirected stderr
exec 2>&3 # restore stderr to saved
exec 3>&- # close saved version
But I wouldn't recommend this -- the only upside from the first one is that it saves a sub-shell invocation, while being more complicated and, possibly even altering the behavior of the script, if the script alters file descriptors.
EDIT:
For more appropriate answer check answer given by Mark Edgar
Solution: use SIGINT (works only in non-interactive shells)
Demo:
cat > silent.sh <<"EOF"
sleep 100 &
kill -INT $!
sleep 1
EOF
sh silent.sh
http://thread.gmane.org/gmane.comp.shells.bash.bugs/15798
Maybe detach the process from the current shell process by calling disown?
The Terminated is logged by the default signal handler of bash 3.x and 4.x. Just trap the TERM signal at the very first of child process:
#!/bin/sh
## assume script name is test.sh
foo() {
trap 'exit 0' TERM ## here is the key
while true; do sleep 1; done
}
echo before child
ps aux | grep 'test\.s[h]\|slee[p]'
foo &
pid=$!
sleep 1 # wait trap is done
echo before kill
ps aux | grep 'test\.s[h]\|slee[p]'
kill $pid ## no need to redirect stdin/stderr
sleep 1 # wait kill is done
echo after kill
ps aux | grep 'test\.s[h]\|slee[p]'
Is this what we are all looking for?
Not wanted:
$ sleep 3 &
[1] 234
<pressing enter a few times....>
$
$
[1]+ Done sleep 3
$
Wanted:
$ (set +m; sleep 3 &)
<again, pressing enter several times....>
$
$
$
$
$
As you can see, no job end message. Works for me in bash scripts as well, also for killed background processes.
'set +m' disables job control (see 'help set') for the current shell. So if you enter your command in a subshell (as done here in brackets) you will not influence the job control settings of the current shell. Only disadvantage is that you need to get the pid of your background process back to the current shell if you want to check whether it has terminated, or evaluate the return code.
This also works for killall (for those who prefer it):
killall -s SIGINT (yourprogram)
suppresses the message... I was running mpg123 in background mode.
It could only silently be killed by sending a ctrl-c (SIGINT) instead of a SIGTERM (default).
disown did exactly the right thing for me -- the exec 3>&2 is risky for a lot of reasons -- set +bm didn't seem to work inside a script, only at the command prompt
Had success with adding 'jobs 2>&1 >/dev/null' to the script, not certain if it will help anyone else's script, but here is a sample.
while true; do echo $RANDOM; done | while read line
do
echo Random is $line the last jobid is $(jobs -lp)
jobs 2>&1 >/dev/null
sleep 3
done
Another way to disable job notifications is to place your command to be backgrounded in a sh -c 'cmd &' construct.
#!/bin/bash
# ...
pid="`sh -c 'sleep 30 & echo ${!}' | head -1`"
kill "$pid"
# ...
# or put several cmds in sh -c '...' construct
sh -c '
sleep 30 &
pid="${!}"
sleep 5
kill "${pid}"
'
I found that putting the kill command in a function and then backgrounding the function suppresses the termination output
function killCmd() {
kill $1
}
killCmd $somePID &
Simple:
{ kill $! } 2>/dev/null
Advantage? can use any signal
ex:
{ kill -9 $PID } 2>/dev/null

Resources