Using ksh93, I'm attempting to wait for a background process, run_cataloguer(), to finish from within a separate background process, send_mail(), using the script below:
#!/usr/bin/env ksh
function run_cataloguer
{
echo "In run_cataloguer()"
sleep 2
echo "leaving run_cataloguer()"
}
function send_mail
{
echo "In send_mail()"
#jobs
wait_for_cataloguer
sleep 1
echo "Leaving send_mail() "
}
function wait_for_cataloguer
{
echo "In wait_for_cataloguer() PID_CAT = $PID_CAT"
wait $PID_CAT
waitRet=$?
echo "waitRet = $waitRet"
}
run_cataloguer &
PID_CAT=$!
echo "PID_CAT = $PID_CAT"
send_mail &
wait # Wait for all
echo "Finished main"
The following output is seen:
PID_CAT = 1265
In run_cataloguer()
In send_mail()
In wait_for_cataloguer() PID_CAT = 1265
waitRet = 127 # THIS SHOULD be 0
Leaving send_mail()
leaving run_cataloguer()
Finished main
The problem is
waitRet = 127
which means the wait command can't see $PID_CAT, so it doesn't wait for run_cataloguer() to finish and
"leaving send_mail()"
is printed before
"leaving run_cataloguer()"
If I run send_mail in the foreground then waitRet = 0, which is correct.
So it appears that you cannot wait for a background process from within a separate background process.
Also, if I uncomment the jobs command, nothing is printed, which appears to confirm the previous statement.
If anyone has a solution, apart from using flag files :), it would be much appreciated.
It looks like this cannot be done. The solution I used was from Parvinder here:
wait child process but get error: 'pid is not a child of this shell'
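For readers who cannot follow the link, one common workaround is to poll for the PID instead of calling wait, since wait only works on children of the calling shell. The following is a minimal sketch written as a bash script, not necessarily the linked solution; the helper name wait_for_pid and the temporary output file are assumptions added for illustration.

```shell
#!/usr/bin/env bash
# Sketch: a background function waits for a sibling background job by
# polling, because `wait` only accepts children of the calling shell.

run_cataloguer() {
    sleep 1
    echo "leaving run_cataloguer()"
}

# Hypothetical helper: block until the given PID no longer exists.
wait_for_pid() {
    # kill -0 sends no signal; it only tests whether the PID is alive.
    while kill -0 "$1" 2>/dev/null; do
        sleep 0.2
    done
}

send_mail() {
    wait_for_pid "$PID_CAT"
    echo "leaving send_mail()"
}

out_file=$(mktemp)            # collects output so the order can be checked
run_cataloguer >>"$out_file" &
PID_CAT=$!
send_mail >>"$out_file" &
wait                          # the main shell reaps both jobs
order=$(cat "$out_file")
rm -f "$out_file"
echo "$order"
```

Polling cannot report the exit status of the other job, and in principle a PID can be reused by an unrelated process, so this is only suitable when "has it finished yet?" is all that is needed.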
It seems that bash's wait doesn't honor set -e as I would expect. Or it somehow loses track of the child process exiting with an error. Consider the example.
set -e # exit immediately on error
function child()
{
if [ $1 -eq 3 ]; then
echo "child $1 performing error"
# exit 1 ## I also tried this
false
else
echo "child $1 performing successful"
true
fi
echo "child $1 exiting normally"
}
# parent
child 1 & # succeeds
child 2 & # succeeds
child 3 & # fails (the function errors out when $1 is 3)
wait # why doesn't wait indicate an error?
echo "Launch nukes!" # don't want this to execute if a child failed
I want the set -e semantics, but wait doesn't seem to honor them.
A parent launches three children. One of them chokes and exits with an error (honoring the set -e). The problem is that the parent process blunders on silently, as if nothing bad had happened. That is, I want the error to propagate.
Is there a way to enable this behavior? I.e. have wait return non-zero if any child exits non-zero.
If you are using bash 4.3, you can use wait -n in a loop to wait for each child in turn. You won't know which child failed, but whenever one does fail, the exit status of wait will be non-zero.
child 1 & # succeeds
child 2 & # succeeds
child 3 & # fails
for i in 1 2 3; do
wait -n
done
According to the Man page, that's what should happen (emphasis mine):
If n is not given, all currently active child processes are waited
for, and the return status is zero.
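You want the second form, where you specify wait {pid1} {pid2} {...}. Note that with several PIDs the return status is that of the last one waited for, so wait on each PID separately if any single failure must be detected.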
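A minimal sketch of waiting on each PID in turn, so that no child's failure is lost. The child function here is a hypothetical stand-in for the one in the question, and the pids array and failed counter are names chosen for illustration.

```shell
#!/usr/bin/env bash
# Sketch: wait for each child by PID and count the failures.

child() {
    # Hypothetical worker: child 3 fails, the others succeed.
    [ "$1" -ne 3 ]
}

pids=()
for n in 1 2 3; do
    child "$n" &
    pids+=("$!")
done

failed=0
for pid in "${pids[@]}"; do
    # `wait PID` returns the exit status of that particular child.
    if ! wait "$pid"; then
        failed=$((failed + 1))
    fi
done

echo "failed children: $failed"
```

Because each wait sits inside an if condition, set -e would not abort the script here; the caller decides what to do, for example exit 1 when failed is non-zero.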
I am experiencing some strange return values from system() when a child process receives a SIGINT from the terminal. To explain: from a Perl script parent.pl I used system() to run another Perl script as a child process, but I also needed to run the child through the shell, so I used the system 'sh', '-c', ... form. The parent of the child thus became the sh process, and the parent of the sh process became parent.pl. Also, to avoid having the sh process receive the SIGINT signal, I trapped it.
For example, parent.pl:
use feature qw(say);
use strict;
use warnings;
for (1..3) {
my $res = system 'sh', '-c', "trap '' INT; child$_.pl";
say "Parent received return value: " . ($res >> 8);
}
where child1.pl:
local $SIG{INT} = "DEFAULT";
sleep 10;
say "Child timed out..";
exit 1;
child2.pl:
local $SIG{INT} = sub { die };
sleep 10;
say "Child timed out..";
exit 1;
and child3.pl is:
eval {
local $SIG{INT} = sub { die };
sleep 10;
};
if ( $@ ) {
print $@;
exit 2;
}
say "Child timed out..";
exit 0;
If I run parent.pl (from the command line) and press CTRL-C to abort each child process, the output is:
^CParent received return value: 130
^CDied at ./child2.pl line 7.
Parent received return value: 4
^CDied at ./child3.pl line 8.
Parent received return value: 2
Now, I would like to know why I get a return value of 130 for case 1, and a return value of 4 for case 2.
Also, it would be nice to know exactly what the "DEFAULT" signal handler does in this case.
Note: the same values are returned if I replace sh with bash ( and trap SIGINT instead of INT in bash ).
See also:
Propagation of signal to parent when using system
perlipc
Chapter 15, in Programming Perl, 4th Edition
This question is very similar to Propagation of signal to parent when using system that you asked earlier.
From my bash docs:
When a command terminates on a fatal signal N, bash uses the value of 128+N as the exit status.
SIGINT is typically 2, so 128 + 2 gives you 130.
Perl's die figures out its exit code by inspecting $! or $? for an uncaught exception (so, not the case where you use eval):
exit $! if $!; # errno
exit $? >> 8 if $? >> 8; # child exit status
exit 255; # last resort
Notice that in this case, Perl exits with the value as is, not shifted up 8 bits.
The errno value happens to be 4 (see errno.h). The $! variable is a dualvar with different string and numeric values. Use it numerically (like adding zero) to get the number side:
use v5.10;
local $SIG{INT}=sub{
say "numeric errno is ", $!+0;
die
};
sleep 10;
print q(timed out);
exit 1;
This prints:
$ bash -c "perl errno.pl"
^Cnumeric errno is 4
Died at errno.pl line 6.
$ echo $?
4
Taking your questions out of order:
Also, it would be nice to know exactly what the "DEFAULT" signal handler does in this case.
Setting the handler for a given signal to "DEFAULT" affirms or restores the default signal handler for the given signal, whose action depends on the signal. Details are available from the signal(7) manual page. The default handler for SIGINT terminates the process.
Now, I would like to know why I get a return value of 130 for case 1, and a return value of 4 for case 2.
Your child1 explicitly sets the default handler for SIGINT, so that signal causes it to terminate abnormally. Such a process has no exit code in the conventional sense. The shell also receives the SIGINT, but it traps and ignores it. The exit status it reports for the child process (and therefore for itself) reflects the signal (number 2) that killed the child.
Your other two child processes, on the other hand, catch SIGINT and terminate normally in response. These do produce exit codes, which the shell passes on to you (after trapping and ignoring the SIGINT). The documentation for die() describes how the exit code is determined in this case, but the bottom line is that if you want to exit with a specific code then you should use exit instead of die.
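The 128+N convention can be observed directly from a shell. This is a small sketch, assuming bash and assuming SIGINT is not already ignored in the environment that runs it; the variable names are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: observe the 128+N exit status of a child killed by a signal.

# The inner shell sends SIGINT to itself and is terminated by it.
bash -c 'kill -INT $$'
status=$?

echo "exit status: $status"

sig_name=""
if [ "$status" -gt 128 ]; then
    # Subtracting 128 recovers the signal number; kill -l names it.
    sig_name=$(kill -l "$((status - 128))")
    echo "killed by SIG$sig_name"
fi
```

A child that catches the signal and then exits normally, like child2.pl and child3.pl, would instead produce whatever exit code it chose, so the 128+N check only identifies uncaught signals.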
I'd like to return the results of a script that also kicks off a background task. The command substitution operator waits for the background task, making the call slow. I created the following example to illustrate the problem:
function answer {
sleep 5 &
echo string
}
echo $(answer)
Is there a way to call a command without waiting on any background jobs it creates?
Thanks,
Mark
The problem is that sleep inherits stdout and keeps it open. You can simply redirect stdout:
answer() {
sleep 5 > /dev/null &
echo "string"
}
echo "$(answer)"
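To see the effect, the call can be timed. This sketch assumes bash (it relies on the SECONDS variable) and uses a shorter sleep than the question; the names start, result and elapsed are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: command substitution returns once every writer has closed
# the pipe, so the background job must not hold stdout open.

answer() {
    # With stdout redirected, sleep no longer keeps the pipe open.
    sleep 2 >/dev/null &
    echo "string"
}

start=$SECONDS
result=$(answer)
elapsed=$((SECONDS - start))

echo "got '$result' after ${elapsed}s"
```

Removing the redirection on the sleep line should make the substitution take the full sleep time again, which is the behaviour described in the question.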
If you intend for the program to continue merrily along in the meantime while the function works, you can just run the function in the background.
function answer {
sleep 5
echo Second
}
echo $(answer) &
echo First
The output of which will be
First
Second
In shell script I am trying to wait for non-child process. I got reference on how to do it from:
WAIT for "any process" to finish
My shell script structure is:
Main.sh
func1(){
return 1
}
func2(){
# Wait for func1 to finish
while kill -0 "$pid_func1"; do
sleep 0.5
done
}
# Call function 1 in background
func1 &
pid_func1=$!
func2 &
In this case how do I receive the return value of func1 inside function func2?
You generally cannot capture the exit status of non-child processes. You may be able to work something out by logging the exit codes to status files and then reading the values, but otherwise you're not going to be able to capture them.
I used another shell variable to store the return status in this case and checked the value of this shell variable wherever required. Find a sample shell script below to simulate the scenario.
#!/bin/bash
func1(){
retvalue=23 # return value which needs to be returned
status_func1=$retvalue # store this value in shell variable
echo "func1 executing"
return $retvalue
}
func2(){
# Not possible to use wait command for pid of func1 as it is not a child of func2
#wait $pid_func1
#ret_func1=$?
while kill -0 "$pid_func1" 2>/dev/null; do
echo "func1 is still executing"
sleep 0.5
done
echo "func2 executing"
#echo "func1 ret: $ret_func1"
echo "func1 ret: $status_func1"
}
# Main shell script starts here
func1 &
pid_func1=$!
func2 &
Hope it's useful for others who are facing the same issue.
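One caveat: func1 & runs in a subshell, so a variable assigned inside it (such as status_func1) is not visible to func2, which is a separate subshell. A status file, as suggested in the earlier answer, does carry the value across. The following is a minimal sketch of that idea in bash; the file names and the wrapper around func1 are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Sketch: hand func1's exit status to a sibling background job through
# a file, since variables set in `func1 &` stay in that subshell.

status_file=$(mktemp)   # hypothetical location for the status
out_file=$(mktemp)      # captures func2's output for inspection

func1() {
    return 23
}

func2() {
    # Poll until the wrapper below has written a non-empty status file.
    while [ ! -s "$status_file" ]; do
        sleep 0.2
    done
    echo "func1 ret: $(cat "$status_file")"
}

# The wrapper runs func1 and records its exit status in the file.
{ func1; echo "$?" >"$status_file"; } &
func2 >"$out_file" &
wait

result=$(cat "$out_file")
rm -f "$status_file" "$out_file"
echo "$result"
```

Since the file only becomes non-empty after func1 has returned, func2 no longer needs func1's PID at all.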
Alright, I am trying to figure this problem out. I have a class that loops indefinitely until I either restart it manually or it runs out of available RAM. I've written the code to work under both CLI and normal web-based execution. The only difference is that under web-based execution the script lasts about 12 hours before it crashes due to memory issues, while under the CLI it runs far longer (on average 4-5 days before a memory-related crash).
The script is an IRC bot that is heavily customized for what I need it to do. I don't know enough C++, Ruby, Python or other languages to make something that is cross-platform. My dev machine is Windows and my production server is Ubuntu. Right now I have the script successfully forking off and detaching from the terminal window, so I can close the terminal without ending the script.
But what I am trying to figure out is how to catch errors and restart the script automatically since it tends to fail at random times and not always when I am at the IRC channel to catch the failure. One last positive would be a way to catch if I requested a restart from the channel and have the bot restart as I am constantly adding in new code functions or just general bug fixes.
Here is my CLI start php script
#!/usr/bin/php
<?php
include_once ("./config/base_conf.php");
include_once ("./libs/irc_base.php");
if ($config ['database'] == true) {
include_once ("./config/db_conf.php");
}
$server = getopt ( 's', array ("server::" ) );
if (! $server) {
$SER = 'default_server';
} elseif ($server ['server'] == 'raelgun') {
$SER = 'server_a';
} else {
$SER = 'default_server';
}
declare ( ticks = 1 );
$pid = pcntl_fork ();
if ($pid == - 1) {
die ( "could not fork" );
} else if ($pid) {
exit (); // we are the parent
} else {
// we are the child
}
// detach from the controlling terminal
if (posix_setsid () == - 1) {
die ( "could not detach from terminal" );
}
$posid = posix_getpid ();
$PID_FILE = "/var/run/bot_process_".$SER.".pid";
$fp = fopen ($PID_FILE , "w" ) or die("File Exists Process Running");
fwrite ( $fp, $posid );
fclose ( $fp );
// setup signal handlers
pcntl_signal ( SIGTERM, "sig_handler" );
pcntl_signal ( SIGHUP, "sig_handler" );
// loop forever performing tasks
$bot = new IRC_BOT ( $config, $SER );
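function sig_handler($signo) {
// $bot and $PID_FILE live in the global scope, so import them here
global $bot, $PID_FILE;
switch ($signo) {
case SIGTERM :
// shut the bot down, remove the PID file and stop
$bot->machineKill();
unlink($PID_FILE);
exit ();
break;
case SIGHUP :
// shut the bot down and remove the PID file, but do not exit
$bot->machineKill();
unlink($PID_FILE);
break;
default :
// handle all other signals
}
}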
The bot connects to a maximum of 2 servers. Depending on which server I want to connect to, I run the following in the terminal to get the script running:
php bot_start_shell.php --server="servernamehere" > /dev/null
So what I am trying to do is get a shell file coded correctly to monitor that script, and if it exits due to error or requested restart to restart the script.
I've used this technique for a while, where a shell script runs a PHP script, monitors the exit value and restarts.
Here's a test script that uses exit() to return a value to the shell script. The values 95, 96 and 100 are treated as other 'unplanned restarts', handled at the bottom of the shell script.
#!/usr/bin/php
<?php
// cli-script.php
// for testing of the BASH script
exit (rand(95, 100));
/* normally we would return one of
# 97 - planned pause/restart
# 98 - planned restart
# 99 - planned stop, exit.
# anything else is an unplanned restart
*/
I prefer to wait a few seconds before I restart the script, to avoid wasting CPU if the script being called instantly fails, and so would be immediately restarted.
#!/bin/bash
# runPHP-Worker.sh
# a shell script that keeps looping until an exit code is given
# if it does an exit(0), restart after a second - or if it's a declared error
# if we've restarted in a planned fashion, we don't bother with any pause
# and for one particular code, we can exit the script entirely.
# The numbers 97, 98, 99 must match what is returned from the PHP script
nice php -q -f ./cli-script.php -- "$@"
ERR=$?
## Possibilities
# 97 - planned pause/restart
# 98 - planned restart
# 99 - planned stop, exit.
# 0 - unplanned restart (as returned by "exit;")
# - Anything else is also unplanned paused/restart
if [ $ERR -eq 97 ]
then
# a planned pause, then restart
echo "97: PLANNED_PAUSE - wait 1";
sleep 1;
exec "$0" "$@";
fi
if [ $ERR -eq 98 ]
then
# a planned restart - instantly
echo "98: PLANNED_RESTART, no pause";
exec "$0" "$@";
fi
if [ $ERR -eq 99 ]
then
# planned complete exit
echo "99: PLANNED_SHUTDOWN";
exit 0;
fi
# unplanned exit, pause, and then restart
echo "unplanned restart: err:" $ERR;
echo "sleeping for 1 sec"
sleep 1
exec "$0" "$@"
If you don't want to do different things for each value, it really just comes down to
#!/bin/bash
php -q -f ./cli-script.php -- "$@"
exec "$0" "$@";
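The same exit-code protocol can also be written as a loop instead of re-executing the script. This is only a sketch under that assumption: supervise is a hypothetical function name, and fake_worker is a stand-in for the PHP script so the example can run on its own.

```shell
#!/usr/bin/env bash
# Sketch: restart a command in a loop until it exits with the
# "planned stop" code (99, matching the scripts above).

supervise() {
    local code
    while true; do
        "$@"
        code=$?
        if [ "$code" -eq 99 ]; then
            echo "99: planned shutdown"
            return 0
        fi
        echo "exit code $code, restarting"
        # Pause before restarting unless the restart was planned (98).
        [ "$code" -eq 98 ] || sleep 1
    done
}

# Hypothetical stand-in for the PHP worker: exits 98 on the first run
# and 99 on the second.
runs=0
fake_worker() {
    runs=$((runs + 1))
    [ "$runs" -lt 2 ] && return 98
    return 99
}

log=$(supervise fake_worker)
echo "$log"
```

In real use the call would look something like supervise php -q -f ./cli-script.php -- "$@". The exec form shown earlier has the advantage of picking up changes to the wrapper script itself on every restart, which the loop does not.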